Tuesday, January 8, 2008

Lecture 1 - Ch 2 - Presentational Statistics

There are a number of different ways to represent data graphically. Most of them will be familiar to most people, but a few may be new.

Familiar examples:
Summary table - categorize data and present in a table, often with percentages
Bar chart
Pie chart

This was new to me:
Stem-and-leaf diagram: It's used with numerical data. The leading digit(s) of each value form the stem, which is written once per row. The last digit is the leaf, and one leaf is written for every data point, so the same leaf can repeat within a row. In addition, a column of leaf counts is added to the left (in Minitab's output these counts are cumulative, running from each end of the data toward the middle). It's kinda hard to describe the algorithm for constructing this diagram and it's best done with a demo.

The one thing that was unclear was why one of the counts was in parentheses. I looked this up in the Minitab help file and found: "If the median value for the sample is included in a row, the count for that row is enclosed in parentheses." The row with the median is not necessarily the row with the most leaves.
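
Since this is easier to show than to describe, here is a minimal Python sketch of the construction. It is not how Minitab does it; the function names and the sample data are made up for illustration. It assumes non-negative integers with a one-digit leaf, and as a simplification the left column is just the number of leaves in each row, which may differ from what Minitab prints. The row whose stem matches the median gets its count in parentheses.

```python
from collections import defaultdict
from statistics import median

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digit).

    Assumes a non-empty list of non-negative integers.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    # Keep empty stems too, so gaps in the data stay visible.
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}

def format_plot(values):
    """Return the diagram as text lines: count, stem, then the leaves."""
    med_stem = int(median(values) // 10)  # stem of the row holding the median
    lines = []
    for stem, leaves in stem_and_leaf(values).items():
        n = len(leaves)
        count = f"({n})" if stem == med_stem else str(n)
        lines.append(f"{count:>4} {stem:>2} | {''.join(map(str, leaves))}")
    return lines

# Hypothetical sample data, not from the lecture.
data = [12, 15, 21, 23, 23, 27, 30, 34, 35, 35, 38, 41, 49]
for line in format_plot(data):
    print(line)
```

With this sample the median is 30, so the row for stem 3 is the one marked with parentheses.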

Also somewhat new, but pretty much intuitively obvious:
Frequency Distribution and Histogram:
1. Sort your data
2. Determine the range (largest value minus smallest value)
3. Select the number of equal-width classes into which you want to divide the range
4. Compute the class interval (width) = range / number of classes (rounded up)
5. Determine the class boundaries based on the interval
6. Count the number of data points that fall in each class

Report the frequency and relative frequency (= frequency / total number of data points) of each class in a table.
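
The steps above can be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the function name and sample data are hypothetical, the data are integers, each class includes its lower boundary and excludes its upper one, and the last class also includes its upper boundary so the maximum is never dropped.

```python
import math

def frequency_distribution(data, num_classes):
    """Return a list of (lower, upper, frequency, relative frequency) rows."""
    values = sorted(data)                        # step 1: sort
    lo, hi = values[0], values[-1]
    data_range = hi - lo                         # step 2: range
    # steps 3-4: class width, rounded up (at least 1 to avoid zero-width classes)
    width = max(1, math.ceil(data_range / num_classes))
    table = []
    for i in range(num_classes):
        lower = lo + i * width                   # step 5: class boundaries
        upper = lower + width
        is_last = i == num_classes - 1
        # step 6: count the points in [lower, upper); last class also takes upper
        freq = sum(1 for v in values
                   if lower <= v < upper or (is_last and v == upper))
        table.append((lower, upper, freq, freq / len(values)))
    return table

# Hypothetical sample data, not from the lecture.
sample = [12, 15, 21, 23, 23, 27, 30, 34, 35, 35, 38, 41, 49]
for lower, upper, freq, rel in frequency_distribution(sample, 4):
    print(f"{lower:>3} to <{upper:<3} {freq:>3} {rel:6.1%}")
```

Here the range is 37, so with 4 classes the interval rounds up to 10 and the classes start at 12, 22, 32 and 42.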

A Histogram is a bar chart (usually vertical) representation of the frequency of each class.

Some graphical ways to represent two variables:

Scatter Diagram - basically just an x-y plot of two variables

Time-Series Plot - plot of a single variable over time

