Expected frequencies: n×p or (row×col)/total? And what do I do with decimals?

I’m cramming for a stats test and expected frequencies keep tripping me up. I get the basic idea, but when I see an actual question I start second-guessing which version to use and what to do with ugly decimals.

Example 1 (goodness-of-fit): Say a spinner has colors in the ratio 2:3:1:4 and I spin it 200 times. I figure the expected counts are 200×(2/10), 200×(3/10), etc. That seems fine. But if the problem gives percentages like 33%, 27%, 12%, 28%, or a set that doesn’t sum to exactly 100 because of rounding, my expected counts usually come out as non-integers. Do I leave them as decimals for the chi-square calc, or round them? If I’m meant to “not round until the end,” what exactly counts as “the end” here?

Example 2 (independence table):

          Passed   Failed   Total
Male        ?        ?       60
Female      ?        ?       40
Total       70       30      100

I used E = (row total × column total) / grand total, so Male–Pass = 60×70/100 = 42, etc. Is that the right move every time? If the question only gives P(Male)=0.6, P(Pass)=0.7 and n=100, is expected Male–Pass just 100×0.6×0.7? Same thing, right? And what if n isn’t given: do I just keep it in proportions and not worry about counts? Also, when n is something awkward like 97, the expected counts are messy decimals again. Round or keep as-is?

Edge-case stuff I’m fuzzy on:
– If an expected frequency is under 5 (or like 0.4), am I supposed to merge categories before doing the test? How do you decide which ones to merge without wrecking the setup?
– If there’s a “5+” bin and I’m told the data follow a Poisson with λ = 2.3, is the expected frequency for 5+ just n × P(X ≥ 5)? I tried summing the tail and got something like n×0.1 for that bin, but I’m not super confident I did the tail properly.

Rounding headaches: When I round expected counts, the totals sometimes miss by 1. Do I fix that by nudging the cell with the biggest decimal part, or is there a cleaner trick that won’t get me marked down? I keep fudging one cell and it feels… dodgy.

Ratios: If observed counts are given, and the claim is a ratio like 2:3:5, I’m scaling by n×(2/10), n×(3/10), etc. Is that what examiners expect, or do they want me to force the ratio into whole numbers that add to n somehow?

I think these methods are right, but I’m not 100% which rule applies where, and how strict to be about rounding. A simple checklist or quick mental rules would help a lot.

Any help appreciated!

3 Responses

  1. Use n×p for expected counts. In goodness‑of‑fit, multiply n by the stated proportions (ratios → weight/total; renormalize if the given percentages don’t sum to 100%). In independence tables, use E = (row total × column total)/grand total = n×P(A)×P(B); if n isn’t given, leave it in proportions. Compute the chi‑square with unrounded expected counts and round only the reported χ² and p‑value at the end.
    If some expected counts are small, aim for all ≥5 (or at least most ≥5 and none <1) by merging adjacent or logically similar categories, or use an exact/alternative test. For a 5+ bin use n×P(X≥5). Never force expected counts to whole numbers or to sum exactly; any rounding is for display only, not for the test.
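To make the two recipes concrete, here is a minimal Python sketch using the numbers from the question (the 2:3:1:4 spinner with n = 200, the 60/40 by 70/30 table, and the 2:3:5 ratio with an awkward n = 97). The function names are made up for illustration; nothing here comes from a stats library.

```python
def expected_from_weights(n, weights):
    """Goodness-of-fit: E_i = n * p_i, where p_i = weight_i / sum(weights)."""
    total = sum(weights)
    return [n * w / total for w in weights]


def expected_from_margins(row_totals, col_totals):
    """Independence: E_ij = (row total * column total) / grand total."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Spinner in the ratio 2:3:1:4, spun 200 times.
spinner = expected_from_weights(200, [2, 3, 1, 4])

# Male/Female by Passed/Failed, using only the margins from the table.
table = expected_from_margins([60, 40], [70, 30])

# Ratio 2:3:5 with n = 97: the decimals are kept exactly as they fall out.
awkward = expected_from_weights(97, [2, 3, 5])

print(spinner)  # 40, 60, 20, 80
print(table)    # Male row: 42, 18; Female row: 28, 12
print(awkward)  # 19.4, 29.1, 48.5
```

Because weights are divided by their own total, the same function also handles percentages that are slightly off 100: passing them in as weights renormalizes them automatically.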

  2. Two mantras cover almost everything: for goodness-of-fit, E = n × p; for independence, E = (row total × column total) / grand total, which is the same as n × p(row) × p(col). Get p from the null model: normalize ratios like 2:3:1:4 into probabilities, tidy given percentages by rescaling if they don’t sum to 100%, or use model probabilities (e.g., Poisson). For a grouped tail like “5+,” yes, E(5+) = n × P(X ≥ 5). Decimals are your friends here: expected counts do not need to be integers, and you should not round them before the chi-square sum. “Don’t round until the end” means keep full precision while you compute the test statistic (and p-value). If all you’re given are probabilities but no n, you can only state expected proportions (or counts as symbols like 0.42n); you can’t do the chi-square test without n. Ratios like 2:3:5? Convert to probabilities 2/10, 3/10, 5/10 and multiply by n. Don’t force exact integer splits.

    The practical guardrails: aim for no expected cell < 1 and at most about 20% of cells < 5. If that’s violated, merge logically adjacent/natural categories (often in the tails) under the null model, not by peeking at the observed bumps; then redo expectations and adjust degrees of freedom (GOF: k − 1 − number of parameters you estimated, like λ; independence: (r − 1)(c − 1)). When n is awkward (say 97), keep the messy decimals as-is. If you must print a rounded table and the totals miss by 1, it’s fine to nudge the cell with the largest fractional part so the margins match, but do the chi-square using the unrounded expectations you computed earlier. That’s the clean, test-proof workflow.
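For the Poisson “5+” bin and the small-cell guardrails, a rough standard-library sketch might look like this. λ = 2.3 is from the question; n = 50 is a hypothetical sample size picked only so that one cell dips below 5, and the helper names are invented for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson(lam) variable."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_bin_probs(lam, top):
    """P(X=0), ..., P(X=top-1), then P(X >= top) as the final grouped bin."""
    probs = [poisson_pmf(k, lam) for k in range(top)]
    probs.append(1.0 - sum(probs))  # tail = 1 minus everything below it
    return probs


def cells_ok(expected):
    """Rule of thumb: no expected count < 1 and at most 20% of cells < 5."""
    below_5 = sum(1 for e in expected if e < 5)
    return min(expected) >= 1 and below_5 <= 0.2 * len(expected)


probs = poisson_bin_probs(2.3, 5)   # bins 0, 1, 2, 3, 4, 5+
tail = probs[-1]                    # P(X >= 5), roughly 0.084

n = 50                              # hypothetical n, not from the question
expected = [n * p for p in probs]   # unrounded expected counts

# lambda was given rather than estimated, so no extra parameter is subtracted.
df = len(expected) - 1 - 0

print(round(tail, 4), cells_ok(expected), df)
```

So the tail comes out a little under the “about 0.1” guess in the question. With this hypothetical n, only the 5+ cell is below 5 (one cell out of six), so the guardrail still passes and no merging is needed.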

  3. Expected frequencies really come from two recipes that are secretly the same: for goodness-of-fit, E_i = n × p_i (ratios like 2:3:1:4 just mean p = 2/10, 3/10, 1/10, 4/10); for independence tables, E_{ij} = (row total × column total)/n = n × P(row i) × P(col j). If you’re handed percentages that don’t sum to 100 because of rounding, just renormalize them (divide each by the total of the percentages) and use E = n × p. Keep the expected counts as decimals, do not round them, and compute χ² with those exact values. “Don’t round until the end” means only round the final χ² (and p-value), not the cell expectations or residuals. If n is unknown, you can talk in expected proportions (p_i or p_i p_j), but you can’t do the χ² calc without n. Awkward n like 97? Still keep the decimals.

    Small-cell rule of thumb: no expected count below 1, and at most 20% below 5; if violated, merge sensible, pre-planned neighbors (e.g., tail bins or rare categories), then recompute p and E. For a grouped Poisson like “5+,” yes, E(5+) = n × P(X ≥ 5) using the Poisson tail. Rounding headaches: don’t force expected counts to integers for the test; if you must show whole numbers that sum to n, use the “largest remainder” tweak for display only and say you computed with unrounded values. Ratios like 2:3:5? Yep, use p = (2,3,5)/10 and E = n×p; never try to force expected counts to be exact integer multiples. I once spent an afternoon nudging cells by ±1 and feeling guilty. Turns out the clean answer was just “stop rounding the expectations.”
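If you do need a whole-number table for display, the “largest remainder” tweak can be sketched like this. It is one plain way to do it, not an official procedure, and the function name is invented; the example reuses the 2:3:5 ratio with n = 97 from the question.

```python
import math


def largest_remainder_round(values, total):
    """Round values to integers that sum to total (for display only)."""
    floors = [math.floor(v) for v in values]
    shortfall = total - sum(floors)  # how many cells need a +1
    # Indices ordered by fractional part, largest first.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] - floors[i],
                   reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return floors


exact = [97 * w / 10 for w in (2, 3, 5)]       # 19.4, 29.1, 48.5
display = largest_remainder_round(exact, 97)   # whole numbers for the table

print(display, sum(display))  # [19, 29, 49] 97
```

The χ² sum should still use `exact`, never `display`. Note that Python's built-in `round` sends 48.5 to 48, so rounding each cell on its own would give a total of 96 here, which is exactly the off-by-one problem described in the question.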
