Frequency distributions summarize and compress data by grouping it into classes and recording how many data points fall into each class. That is, they show how many observations on a given variable have a particular attribute. For example, suppose 50 people are surveyed about their favorite color. The frequency distribution might indicate that 15 people selected green, 12 blue, 6 red, 7 yellow, and 10 purple. Converting these raw numbers into percentages would then provide an even more useful description of the data.

The frequency distribution is the foundation of descriptive statistics. It is a prerequisite for both the various graphs used to display data and the basic statistics used to describe a data set, such as the mean, median, mode, variance, and standard deviation. Note that frequency distributions are generally used to describe nominal and interval data, though they can also describe ordinal data.
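The favorite-color example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the list of raw responses is hypothetical and is built only to match the counts given in the text, and the variable names are assumptions.

```python
from collections import Counter

# Hypothetical raw responses, constructed to match the counts in the text.
responses = (["green"] * 15 + ["blue"] * 12 + ["red"] * 6
             + ["yellow"] * 7 + ["purple"] * 10)

counts = Counter(responses)   # raw number of people choosing each color
n = len(responses)            # sample size (50 people)

# Convert each raw count into a percentage of the sample.
percentages = {color: 100 * count / n for color, count in counts.items()}

# Print the classes from most to least frequent.
for color, count in counts.most_common():
    print(f"{color:<7} {count:>3} {percentages[color]:5.1f}%")
```

Expressed this way, green accounts for 30 percent of responses and red for 12 percent, which is easier to compare across surveys of different sizes than the raw counts alone.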

WHEN TO USE IT A frequency distribution should be constructed for virtually all data sets. It is especially useful whenever a broad, easily understood description of data concentration and spread is needed. Most data provided by third parties are grouped into a frequency distribution.

PREPARATION The steps in preparing frequency distributions manually are as follows:

• Collect raw data from entity records, interviews, surveys, etc.

• If data are nominal, list the classes into which a data point might fall. If data are interval, select an appropriate number of data classes.

• Calculate the absolute frequency of each class, i.e., the raw number of data points in each class. Note that the sum of all absolute frequencies must equal the sample size.

• Calculate the relative frequency by dividing the absolute frequency by the sample size. This reveals the proportion or percent of data points in each class. Note that the sum of all relative frequencies must be 1.

• Calculate the cumulative frequency for each class by adding the number or proportion/percentage of data points in that class to similar quantities for all preceding classes. If you accumulate the number of data points, the last number should equal the sample size. If you accumulate the proportions/percentages, the last number should be 1 or 100 percent.
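The preparation steps can be sketched in Python for interval data. This is only an illustration under stated assumptions: the 20 ages are made-up values, and the choice of five 10-year classes is arbitrary and not taken from the text.

```python
# Hypothetical raw data: ages of 20 survey respondents (illustrative only).
ages = [23, 31, 45, 52, 38, 27, 64, 41, 35, 29,
        58, 33, 47, 22, 36, 49, 61, 30, 44, 39]

# Assumed classes: five 10-year intervals, each including its lower
# bound and excluding its upper bound.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]

n = len(ages)    # sample size
rows = []
running = 0      # running total of absolute frequencies
for lower, upper in classes:
    absolute = sum(lower <= age < upper for age in ages)  # absolute frequency
    relative = absolute / n                               # relative frequency
    running += absolute                                   # cumulative frequency
    rows.append((f"{lower}-{upper - 1}", absolute, relative, running, running / n))

# Consistency checks described in the steps.
assert sum(row[1] for row in rows) == n               # absolute frequencies sum to n
assert abs(sum(row[2] for row in rows) - 1) < 1e-9    # relative frequencies sum to 1
assert rows[-1][3] == n and rows[-1][4] == 1.0        # last cumulative values

print(f"{'class':<7}{'abs':>5}{'rel':>7}{'cum':>5}{'cum rel':>9}")
for label, absolute, relative, cum, cum_rel in rows:
    print(f"{label:<7}{absolute:>5}{relative:>7.2f}{cum:>5}{cum_rel:>9.2f}")
```

The three checks mirror the notes in the steps: the absolute frequencies add up to the sample size, the relative frequencies add up to 1, and the last cumulative value equals the sample size (or 1 when proportions are accumulated).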

ADVANTAGES Frequency distributions can:

• condense and summarize large amounts of data in a useful format

• describe all variable types

• facilitate graphic presentation of data

• begin to identify population characteristics

• permit cautious comparison of data sets

DISADVANTAGES Frequency distributions can:

• reveal little about the actual distribution, skew, and kurtosis of data

• be easily manipulated to yield misleading results

• de-emphasize ranges and extreme values, particularly when open classes are used (e.g., “over 65,” “under $15,000,” etc.)
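The effect of open classes on extreme values can be illustrated with a short Python sketch. The ages and the class boundaries below are hypothetical and chosen only to make the point visible.

```python
# Hypothetical ages; 98 is a deliberately extreme value (illustrative only).
ages = [34, 41, 52, 58, 63, 66, 67, 70, 71, 98]

# Assumed classes: two ordinary classes and one open class, "over 65".
table = {
    "under 45": sum(age < 45 for age in ages),
    "45-65": sum(45 <= age <= 65 for age in ages),
    "over 65": sum(age > 65 for age in ages),
}

print(table)
print("actual range:", min(ages), "to", max(ages))
```

The grouped table reports only that five people fall in the "over 65" class. A reader who sees the table alone cannot tell whether the oldest respondent is 66 or 98, so the range of the data is lost.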