beginner 15 minutes

Summarizing Data with Tables

Learn to organize and summarize data using frequency tables, relative frequency distributions, and cumulative frequency tables.

Why Summarize Data in Tables?

Raw data is often overwhelming. Consider a dataset with 10,000 observations—you can’t understand it by staring at 10,000 numbers! Tables help us:

  • Organize data into manageable form
  • Identify patterns and common values
  • Communicate findings clearly
  • Prepare for visualization and further analysis

Frequency Tables for Categorical Data

A frequency table shows how often each category or value occurs.

Basic Frequency Table

Favorite Color Survey

A survey asked 50 people their favorite color. Raw data (abbreviated):

Blue, Red, Green, Blue, Blue, Red, Yellow, Green, Blue, …

Frequency Table:

| Color | Frequency |
|---|---|
| Blue | 18 |
| Red | 14 |
| Green | 10 |
| Yellow | 5 |
| Purple | 3 |
| Total | 50 |

Now we can immediately see that Blue is most popular, followed by Red.
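As a minimal sketch, the same counting can be done in Python with `collections.Counter`. The full list of 50 responses is not shown above, so the `responses` list here is rebuilt from the table's counts purely for illustration; the variable names are assumptions, not part of the survey.

```python
from collections import Counter

# Rebuilt from the frequency table above; the original response order is not given.
table_counts = {"Blue": 18, "Red": 14, "Green": 10, "Yellow": 5, "Purple": 3}
responses = [color for color, n in table_counts.items() for _ in range(n)]

# Count how often each category occurs.
frequency = Counter(responses)

# most_common() lists categories from most to least frequent.
for color, count in frequency.most_common():
    print(f"{color:<8}{count:>4}")
print(f"{'Total':<8}{sum(frequency.values()):>4}")
```

With real survey data, `responses` would simply be the list of raw answers.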

Adding Relative Frequency

Relative frequency expresses each count as a proportion or percentage of the total.

Relative Frequency

$$\text{Relative Frequency} = \frac{\text{Frequency}}{\text{Total Count}}$$

Adding Relative Frequency
| Color | Frequency | Relative Frequency | Percentage |
|---|---|---|---|
| Blue | 18 | 18/50 = 0.36 | 36% |
| Red | 14 | 14/50 = 0.28 | 28% |
| Green | 10 | 10/50 = 0.20 | 20% |
| Yellow | 5 | 5/50 = 0.10 | 10% |
| Purple | 3 | 3/50 = 0.06 | 6% |
| Total | 50 | 1.00 | 100% |

Interpretation: 36% of respondents prefer blue—more than a third!
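The relative frequency column is one division per row. A short Python sketch, using the counts from the table above (the names `counts` and `relative` are illustrative):

```python
counts = {"Blue": 18, "Red": 14, "Green": 10, "Yellow": 5, "Purple": 3}
total = sum(counts.values())

# Relative frequency = frequency / total count.
relative = {color: n / total for color, n in counts.items()}

# Print count, proportion, and percentage for each category.
for color, n in counts.items():
    print(f"{color:<8}{n:>4}{relative[color]:>8.2f}{relative[color]:>8.0%}")
```

The proportions should always add up to 1 (allowing for rounding), which is a quick check that no category was missed.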

Frequency Tables for Numerical Data

For numerical data, we often group values into classes (intervals or bins).

Creating Classes

Guidelines for choosing classes:

  1. Use 5-15 classes (fewer for small datasets)
  2. Make all classes the same width
  3. Make sure classes don’t overlap
  4. Include all data values

Class Width

$$\text{Class Width} = \frac{\text{Maximum} - \text{Minimum}}{\text{Number of Classes}}$$

Round up to a convenient number.

Exam Scores Frequency Table

Data: 25 exam scores ranging from 52 to 98.

Step 1: Choose number of classes: 5 classes

Step 2: Calculate width: (98 - 52) / 5 = 9.2, rounded up to 10

Step 3: Create classes starting at 50:

| Class | Frequency | Relative Frequency |
|---|---|---|
| 50-59 | 2 | 0.08 |
| 60-69 | 4 | 0.16 |
| 70-79 | 8 | 0.32 |
| 80-89 | 7 | 0.28 |
| 90-99 | 4 | 0.16 |
| Total | 25 | 1.00 |

Interpretation: The largest group of students (32%) scored in the 70s. The distribution is roughly symmetric around that class, with somewhat more scores above it (11) than below it (6).
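Steps 1 to 3 can be sketched in Python as follows. The helper `class_limits` is a hypothetical name introduced here, and it assumes whole-number data so that each class ends one unit before the next begins. Here `math.ceil` rounds 9.2 up to 10, which happens to be a convenient number already; in general you may want to round further to a tidy value.

```python
import math

def class_limits(start, width, n_classes):
    """Return (lower, upper) limits for equal-width classes of whole-number data."""
    return [(start + i * width, start + (i + 1) * width - 1) for i in range(n_classes)]

minimum, maximum, n_classes = 52, 98, 5

raw_width = (maximum - minimum) / n_classes  # (98 - 52) / 5
width = math.ceil(raw_width)                 # round up to the next whole number

# Start at 50 (a convenient value at or below the minimum), as in Step 3.
classes = class_limits(start=50, width=width, n_classes=n_classes)
print(classes)  # [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
```

It is worth confirming that the first class starts at or below the minimum and the last class ends at or above the maximum, so that every value is included.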

Class Boundaries and Midpoints

Class boundaries eliminate gaps between classes. For the class 50-59:

  • Lower boundary: 49.5
  • Upper boundary: 59.5

Class midpoint is the center of each class:

Class Midpoint

$$\text{Midpoint} = \frac{\text{Lower Limit} + \text{Upper Limit}}{2}$$

Class Midpoints
| Class | Midpoint |
|---|---|
| 50-59 | (50 + 59)/2 = 54.5 |
| 60-69 | (60 + 69)/2 = 64.5 |
| 70-79 | (70 + 79)/2 = 74.5 |
| 80-89 | (80 + 89)/2 = 84.5 |
| 90-99 | (90 + 99)/2 = 94.5 |

Midpoints are useful for estimating the mean from grouped data.
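To illustrate, the sketch below estimates the mean exam score by treating every score in a class as if it sat at the class midpoint. It uses the frequencies from the exam scores table; the result is an approximation, since the 25 individual scores are not listed.

```python
classes = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
frequencies = [2, 4, 8, 7, 4]

# Midpoint = (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Grouped mean = sum of (frequency * midpoint) / total count
n = sum(frequencies)
grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

print(midpoints)               # [54.5, 64.5, 74.5, 84.5, 94.5]
print(f"{grouped_mean:.1f}")   # 77.3
```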

Cumulative Frequency Tables

A cumulative frequency table shows the running total of frequencies.

Cumulative Frequency

The cumulative frequency of a class is the number of observations that fall at or below that class's upper limit.

Cumulative Frequency Table
| Class | Frequency | Cumulative Frequency |
|---|---|---|
| 50-59 | 2 | 2 |
| 60-69 | 4 | 2 + 4 = 6 |
| 70-79 | 8 | 6 + 8 = 14 |
| 80-89 | 7 | 14 + 7 = 21 |
| 90-99 | 4 | 21 + 4 = 25 |

Interpretation:

  • 6 students scored 69 or below
  • 14 students scored 79 or below
  • 21 students scored 89 or below
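A running total like this is what `itertools.accumulate` computes in Python. A brief sketch using the exam score frequencies:

```python
from itertools import accumulate

classes = ["50-59", "60-69", "70-79", "80-89", "90-99"]
frequencies = [2, 4, 8, 7, 4]

# Each entry is the sum of all frequencies up to and including that class.
cumulative = list(accumulate(frequencies))

for label, f, c in zip(classes, frequencies, cumulative):
    print(f"{label:<7}{f:>3}{c:>4}")
```

The last cumulative value should equal the total number of observations (25 here).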

Cumulative Relative Frequency

Express cumulative frequencies as proportions.

Complete Frequency Table
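| Class | Freq | Rel. Freq | Cum. Freq | Cum. Rel. Freq |
|---|---|---|---|---|
| 50-59 | 2 | 0.08 | 2 | 0.08 |
| 60-69 | 4 | 0.16 | 6 | 0.24 |
| 70-79 | 8 | 0.32 | 14 | 0.56 |
| 80-89 | 7 | 0.28 | 21 | 0.84 |
| 90-99 | 4 | 0.16 | 25 | 1.00 |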

Interpretation:

  • 24% of students scored 69 or below (roughly the bottom quarter)
  • 56% scored 79 or below (just over half)
  • 84% scored 89 or below
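Cumulative relative frequencies also show which class contains a given percentile. The sketch below is one simple way to look that up; `class_containing` is a hypothetical helper, and it identifies only the class, not an exact score.

```python
from itertools import accumulate

classes = ["50-59", "60-69", "70-79", "80-89", "90-99"]
frequencies = [2, 4, 8, 7, 4]

total = sum(frequencies)
cumulative = list(accumulate(frequencies))
cum_relative = [c / total for c in cumulative]

def class_containing(p):
    """Return the first class whose cumulative count reaches the fraction p (0 < p <= 1)."""
    for label, c in zip(classes, cumulative):
        if c >= p * total:
            return label
    raise ValueError("p must be between 0 and 1")

print([round(r, 2) for r in cum_relative])  # [0.08, 0.24, 0.56, 0.84, 1.0]
print(class_containing(0.5))                # 70-79 (the class containing the median)
```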

Two-Way Tables (Contingency Tables)

Two-way tables show the relationship between two categorical variables.

Two-Way Table: Gender and Preference

A survey asked 200 people about their preferred coffee type.

| | Espresso | Latte | Drip | Row Total |
|---|---|---|---|---|
| Male | 35 | 25 | 40 | 100 |
| Female | 20 | 45 | 35 | 100 |
| Column Total | 55 | 70 | 75 | 200 |

Reading the table:

  • 35 males prefer espresso
  • 45 females prefer latte
  • 70 people total prefer latte

Joint, Marginal, and Conditional Distributions

Distributions from Two-Way Tables

Using the coffee preference table:

Joint Distribution (proportion in each cell):

| | Espresso | Latte | Drip |
|---|---|---|---|
| Male | 35/200 = 0.175 | 0.125 | 0.200 |
| Female | 0.100 | 0.225 | 0.175 |

Marginal Distribution (row or column totals):

  • Male: 100/200 = 50%
  • Female: 100/200 = 50%
  • Espresso: 55/200 = 27.5%

Conditional Distribution (given one variable):

  • Among males, what % prefer latte? 25/100 = 25%
  • Among females, what % prefer latte? 45/100 = 45%

Insight: In this sample, female respondents are much more likely than male respondents to prefer lattes (45% vs 25%)!
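The three distributions differ only in what each count is divided by: the grand total (joint and marginal) or a row total (conditional). A minimal Python sketch using the coffee table, with a nested dictionary as one possible representation:

```python
table = {
    "Male":   {"Espresso": 35, "Latte": 25, "Drip": 40},
    "Female": {"Espresso": 20, "Latte": 45, "Drip": 35},
}
columns = ["Espresso", "Latte", "Drip"]
grand_total = sum(sum(row.values()) for row in table.values())

# Joint: each cell divided by the grand total.
joint = {(g, c): n / grand_total for g, row in table.items() for c, n in row.items()}

# Marginal: row or column totals divided by the grand total.
row_marginal = {g: sum(row.values()) / grand_total for g, row in table.items()}
col_marginal = {c: sum(row[c] for row in table.values()) / grand_total for c in columns}

# Conditional (given gender): each cell divided by its row total.
conditional = {
    g: {c: n / sum(row.values()) for c, n in row.items()}
    for g, row in table.items()
}

print(f"{joint[('Male', 'Espresso')]:.3f}")     # 0.175
print(f"{col_marginal['Espresso']:.3f}")        # 0.275
print(f"{conditional['Female']['Latte']:.2f}")  # 0.45
```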

Common Mistakes to Avoid

1. Overlapping Classes

Wrong: 50-60, 60-70, 70-80 (where does 60 go?)

Correct: 50-59, 60-69, 70-79, or use “under” notation such as 50 to under 60, 60 to under 70, so that every value belongs to exactly one class.
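The "to under" notation corresponds to half-open intervals, where the lower edge is included and the upper edge is not. One way to sketch that rule in Python is with `bisect_right` from the standard library; the `edges` list and `class_index` helper are illustrative names.

```python
from bisect import bisect_right

edges = [50, 60, 70, 80]  # classes: 50 to under 60, 60 to under 70, 70 to under 80
labels = [f"{lo} to under {hi}" for lo, hi in zip(edges, edges[1:])]

def class_index(value):
    """Return the index of the class containing value, or None if it is out of range."""
    if not edges[0] <= value < edges[-1]:
        return None
    # bisect_right counts the edges that are <= value; subtract 1 to get the class index.
    return bisect_right(edges, value) - 1

print(labels[class_index(60)])    # 60 to under 70
print(labels[class_index(59.9)])  # 50 to under 60
```

A boundary value such as 60 lands in exactly one class, which is the property the overlapping version lacks.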

2. Unequal Class Widths

Unless there’s a specific reason, keep class widths equal for easier interpretation.

3. Too Many or Too Few Classes

  • Too few: Lose important patterns
  • Too many: Table becomes as confusing as raw data

4. Missing the “Other” Category

For categorical data, include an “Other” category if responses don’t fit existing categories.

5. Not Including Totals

Always include row and column totals—they’re essential for calculations.

When to Use Each Table Type

| Table Type | Use When |
|---|---|
| Simple frequency | Summarizing one categorical variable |
| Grouped frequency | Summarizing numerical data with many values |
| Relative frequency | Comparing groups of different sizes |
| Cumulative frequency | Finding percentiles, proportion above/below a value |
| Two-way table | Exploring relationship between two categorical variables |

Real-World Application

Hospital Emergency Room Data

An ER wants to understand patient arrival patterns.

Data: 500 patient arrivals over one week

Frequency Table by Time of Day:

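| Time Period | Patients | Rel. Freq | Cum. Freq |
|---|---|---|---|
| 12am-4am | 45 | 9.0% | 45 |
| 4am-8am | 55 | 11.0% | 100 |
| 8am-12pm | 120 | 24.0% | 220 |
| 12pm-4pm | 95 | 19.0% | 315 |
| 4pm-8pm | 110 | 22.0% | 425 |
| 8pm-12am | 75 | 15.0% | 500 |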

Insights for staffing decisions:

  • Peak hours: 8am-12pm (24%) and 4pm-8pm (22%)
  • Lowest: 12am-4am (9%)
  • 63% of arrivals (315 of 500) occur before 4pm

Two-Way Table by Severity:

| | Minor | Moderate | Severe | Total |
|---|---|---|---|---|
| Weekday | 180 | 100 | 45 | 325 |
| Weekend | 95 | 55 | 25 | 175 |
| Total | 275 | 155 | 70 | 500 |

Insight: Weekdays and weekends have a similar severity mix: 54.3% of weekend arrivals are minor (95/175), compared with 55.4% of weekday arrivals (180/325).
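Because weekdays and weekends have different totals (325 vs 175), the comparison has to use row percentages rather than raw counts. A short sketch of that calculation for all three severity levels, using the counts from the table above:

```python
severity = {
    "Weekday": {"Minor": 180, "Moderate": 100, "Severe": 45},
    "Weekend": {"Minor": 95, "Moderate": 55, "Severe": 25},
}

# Divide each cell by its own row total to get the severity mix per day type.
row_percent = {
    day: {level: 100 * n / sum(row.values()) for level, n in row.items()}
    for day, row in severity.items()
}

for day, row in row_percent.items():
    cells = "  ".join(f"{level}: {pct:.1f}%" for level, pct in row.items())
    print(f"{day:<8} {cells}")
```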

Summary

In this lesson, you learned:

  • Frequency tables organize data by counting occurrences
  • Relative frequency expresses counts as proportions (sum to 1.00)
  • Grouped frequency tables use classes for numerical data
  • Class width = Range / Number of Classes
  • Cumulative frequency shows running totals (useful for percentiles)
  • Two-way tables show relationships between categorical variables
  • Joint, marginal, and conditional distributions extract different information

Practice Problems

1. Create a grouped frequency table for these ages (in years): 23, 45, 31, 28, 52, 38, 41, 29, 35, 47, 33, 26, 44, 39, 31, 48, 27, 36, 42, 34

Use 5 classes starting at 20.

2. Using your table from Problem 1, add relative frequency and cumulative frequency columns.

3. In a two-way table, 120 out of 200 urban residents and 60 out of 150 rural residents support a policy. Calculate:

  • Joint distribution of “Urban + Support”
  • Conditional distribution: % support among urban residents
  • Marginal distribution: % who support overall

Answers

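1. Range = 52 - 23 = 29, so width = 29/5 = 5.8, rounded up to 6. Starting at 20 with a width of 6, five classes only reach 49, so a sixth class (50-55) is needed to include the maximum of 52.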

| Class | Frequency |
|---|---|
| 20-25 | 1 |
| 26-31 | 6 |
| 32-37 | 4 |
| 38-43 | 4 |
| 44-49 | 4 |
| 50-55 | 1 |
| Total | 20 |
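One way to check these counts is a short Python sketch. It assumes whole-number ages and uses integer division to find each age's class; the variable names are illustrative.

```python
from collections import Counter

ages = [23, 45, 31, 28, 52, 38, 41, 29, 35, 47,
        33, 26, 44, 39, 31, 48, 27, 36, 42, 34]
start, width = 20, 6

# Integer division maps each age to its class index (0 for 20-25, 1 for 26-31, ...).
index_counts = Counter((age - start) // width for age in ages)

# Turn class indices back into readable labels, in ascending order.
table = {}
for i in sorted(index_counts):
    lower = start + i * width
    table[f"{lower}-{lower + width - 1}"] = index_counts[i]

for label, count in table.items():
    print(f"{label:<7}{count:>3}")
print(f"{'Total':<7}{sum(table.values()):>3}")
```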

2.

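| Class | Freq | Rel. Freq | Cum. Freq | Cum. Rel. Freq |
|---|---|---|---|---|
| 20-25 | 1 | 0.05 | 1 | 0.05 |
| 26-31 | 6 | 0.30 | 7 | 0.35 |
| 32-37 | 4 | 0.20 | 11 | 0.55 |
| 38-43 | 4 | 0.20 | 15 | 0.75 |
| 44-49 | 4 | 0.20 | 19 | 0.95 |
| 50-55 | 1 | 0.05 | 20 | 1.00 |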

3.

  • Total = 200 + 150 = 350
  • Total support = 120 + 60 = 180
  • Joint (Urban + Support) = 120/350 = 34.3%
  • Conditional (Support | Urban) = 120/200 = 60%
  • Marginal (Support) = 180/350 = 51.4%
