P-Hat Sample Proportion Calculator

Compute sample proportion (p̂), standard error, margin of error, and confidence interval from successes and sample size.

P-Hat Calculator

p̂ = x ÷ n · CI = p̂ ± z · √(p̂q̂/n)

Instructions — P-Hat Sample Proportion Calculator

1. Enter successes and sample size

Successes (x) is the number of yes/positive/converted/defective outcomes. Sample size (n) is the total number of trials or observations. x must be a non-negative integer and n a positive integer, with x ≤ n.

2. Pick confidence level

95% is the science standard. Polls use 95%. A/B tests often use 90%. Pharmaceuticals lean toward 99%. Higher confidence widens the interval.

3. Read p̂ and its interval

The headline is p̂ = x/n, your point estimate. Below it: the confidence interval (where the true proportion likely lies), the margin of error, and the standard error. The normality check tells you whether the Wald approximation is valid.

Rule: need n·p̂ ≥ 5 and n·(1−p̂) ≥ 5 for the normal approximation. Below that, use Wilson or Clopper–Pearson.
Sample size for ±3 pp: n ≈ 1067 at 95% confidence with p̂ = 0.5. That's where "n = 1000" national polls come from.

Formulas

The sample proportion is a simple ratio, but every quantity around it has a precise definition. The same formulas underpin polls, A/B tests, and quality-control charts.

Sample Proportion
$$ \hat{p} = \frac{x}{n} $$
Successes divided by sample size. The number is bounded between 0 and 1.
Standard Error
$$ SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} $$
Measures how much p̂ would vary across repeated samples. Maximum at p̂ = 0.5, dropping to zero at p̂ = 0 or 1.
Confidence Interval (Wald)
$$ \hat{p} \pm z_{\alpha/2} \cdot SE(\hat{p}) $$
The standard normal approximation. For 95% CI use z = 1.96; for 99% use z = 2.576.
Margin of Error
$$ ME = z_{\alpha/2} \cdot SE(\hat{p}) $$
Half the width of the confidence interval. Pollsters quote this number when reporting "±3 percentage points".
Normality Condition
$$ n \cdot \hat{p} \geq 5 \;\;\text{and}\;\; n \cdot (1 - \hat{p}) \geq 5 $$
Both products must be at least 5 for the Wald CI to be accurate. Small samples or extreme proportions need an exact binomial method.
Sample Size for ME
$$ n = \left(\frac{z_{\alpha/2}}{ME}\right)^2 \cdot \hat{p}(1-\hat{p}) $$
For 95% CI with ME ≤ 0.03 and worst-case p̂ = 0.5: n = (1.96/0.03)² × 0.25 ≈ 1067.
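
The formulas above can be strung together in a few lines. The following is a minimal Python sketch, not the calculator's actual implementation; the function name `wald_summary` and the dictionary keys are illustrative assumptions.

```python
from math import sqrt


def wald_summary(x: int, n: int, z: float = 1.96) -> dict:
    """Sample proportion, standard error, margin of error, and Wald CI.

    Hypothetical helper: the name and return layout are assumptions.
    """
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n
    q_hat = 1 - p_hat
    se = sqrt(p_hat * q_hat / n)   # SE = sqrt(p_hat * q_hat / n)
    me = z * se                    # ME = z * SE
    return {
        "p_hat": p_hat,
        "se": se,
        "me": me,
        "ci": (p_hat - me, p_hat + me),
        # n * p_hat equals x and n * q_hat equals n - x, so the normality
        # condition is checked on the integer counts to avoid float round-off.
        "wald_ok": x >= 5 and (n - x) >= 5,
    }


result = wald_summary(32, 200)
print(f"p_hat = {result['p_hat']:.3f}, SE = {result['se']:.4f}, ME = {result['me']:.4f}")
print(f"95% CI = [{result['ci'][0]:.1%}, {result['ci'][1]:.1%}], Wald OK: {result['wald_ok']}")
```

Note that the Wald bounds are returned unclipped, so a small sample such as 1 / 50 yields a negative lower bound along with `wald_ok` set to `False`.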

Reference

Z critical values (two-tailed)

| Confidence | α | z (critical) |
|---|---|---|
| 80% | 0.20 | 1.282 |
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.960 |
| 99% | 0.01 | 2.576 |
| 99.9% | 0.001 | 3.291 |
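
For confidence levels not listed, the critical value can be computed rather than looked up. This sketch uses `NormalDist.inv_cdf` from Python's standard `statistics` module (Python 3.8+); the function name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-tailed critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    # The upper alpha/2 tail is cut off, so take the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.80, 0.90, 0.95, 0.99, 0.999):
    print(f"{level:.1%}: z = {z_critical(level):.3f}")
```

Rounded to three decimals, these values should agree with the table.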

Typical p̂ scenarios

Example results across common applications. Margins assume 95% confidence.

Polls and surveys

| x / n | p̂ | 95% CI |
|---|---|---|
| 520 / 1000 | 0.520 | [48.9%, 55.1%] |
| 200 / 400 | 0.500 | [45.1%, 54.9%] |
| 48 / 100 | 0.480 | [38.2%, 57.8%] |
| 340 / 500 | 0.680 | [63.9%, 72.1%] |
| 2500 / 5000 | 0.500 | [48.6%, 51.4%] |

A/B and quality

| x / n | p̂ | 95% CI |
|---|---|---|
| 32 / 200 | 0.160 | [10.9%, 21.1%] |
| 120 / 800 | 0.150 | [12.5%, 17.5%] |
| 6 / 300 | 0.020 | [0.4%, 3.6%] |
| 1 / 50 | 0.020 | Wald invalid |
| 0 / 100 | 0.000 | Use Wilson/CP |

Note: when the Wald approximation fails (small samples, extreme proportions), use the Wilson score interval (1927) or the Clopper–Pearson exact binomial interval (1934). Both stay within [0, 1] regardless of n and p̂.
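
For the rows where Wald is flagged as invalid, the Wilson score interval can be sketched as follows. This uses the standard closed-form Wilson formula without a continuity correction; the function name `wilson_interval` is an illustrative assumption, and the calculator itself reports Wald.

```python
from math import sqrt


def wilson_interval(x: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a proportion (no continuity correction)."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n
    denom = 1 + z * z / n
    # The center is pulled from p_hat toward 0.5 by the z^2 / (2n) term.
    center = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    # Clamp only to absorb floating-point round-off at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


for x, n in ((1, 50), (0, 100)):
    lo, hi = wilson_interval(x, n)
    print(f"{x} / {n}: Wilson 95% CI = [{lo:.1%}, {hi:.1%}]")
```

Unlike Wald, the interval for 0 / 100 has a positive upper bound instead of collapsing to a single point at zero.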

Article — P-Hat Sample Proportion Calculator

P-hat sample proportion calculator

P-hat (p̂) is the sample proportion: successes divided by sample size. For x = 520 yes-votes in a poll of n = 1000, p̂ = 0.520 = 52.0%. It is the best point estimate of the unknown population proportion p, and it comes with a standard error of √(p̂(1−p̂)/n) — which sets the width of the confidence interval around it.

The math is simple. The interpretation is not. P-hat estimates p but is not equal to p; the confidence interval describes how far apart they probably are. This calculator computes p̂, the standard error, the margin of error, and the confidence interval in one pass, with a normality check for the underlying normal approximation.

What is p-hat?

P-hat is the fraction of sample observations that fall into a category of interest. In a survey of 200 people, if 80 are vegetarian, the p-hat for vegetarians is 80/200 = 0.40. It is bounded between 0 and 1, and it is the simplest point estimate of the corresponding population proportion.

The hat in p̂ is the statistical convention for an estimator: a quantity computed from data that approximates an unknown parameter. The plain p (no hat) denotes the true population proportion. The two are different objects: p is fixed but unknown; p̂ is observed but varies with the sample.

Did you know

The hat notation for sample estimates dates to early-1900s English statistics. R.A. Fisher used a tilde (p̃) in his 1925 textbook, but the hat (p̂) spread faster because it was easier to typeset on mechanical typewriters with a hat symbol available in standard sets.

The p-hat formula

Every quantity you need for a confidence interval on a proportion follows from three numbers: x (successes), n (sample size), and the chosen confidence level.

P-hat formulas
p̂ = x ÷ n
SE = √(p̂q̂ / n), where q̂ = 1 − p̂
ME = z · SE
CI = p̂ ± ME
z₉₅ = 1.96, z₉₉ = 2.576
Wald valid if n·p̂ ≥ 5 and n·q̂ ≥ 5

Plug your sample numbers in and out come the four quantities. For x = 32, n = 200 (a 16% conversion rate from a 200-visitor A/B test): p̂ = 0.16, q̂ = 0.84, SE = √(0.16 × 0.84 / 200) = 0.0259. For 95% CI, z = 1.96, ME = 0.0508 = 5.08 percentage points. CI = [10.9%, 21.1%].

P-hat versus population proportion p

This is the single most common point of confusion. The population proportion p is what you actually want to know — the true fraction of voters, customers, or defective parts in the population. P-hat is what you can compute from a finite sample. They are different in general, and their difference is exactly what the confidence interval bounds.

A census of the entire population would give p̂ = p exactly. With a sample, p̂ carries sampling error that shrinks as n grows. Suppose the true p is 0.5: poll 10 voters and your p̂ might be 0.4; poll 1000 and you'll likely land within a few points of 0.5; poll 10 million and p̂ essentially equals p.
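
A small simulation makes the shrinking error visible. The sketch below assumes a true proportion of p = 0.5 and draws independent yes/no responses with Python's `random` module; the function name `simulate_p_hat` and the seed are arbitrary choices for illustration.

```python
import random


def simulate_p_hat(p: float, n: int, rng: random.Random) -> float:
    """Draw n independent yes/no responses with success probability p."""
    successes = sum(rng.random() < p for _ in range(n))
    return successes / n


rng = random.Random(42)  # fixed seed so repeated runs give the same draws
for n in (10, 1_000, 100_000):
    p_hat = simulate_p_hat(0.5, n, rng)
    print(f"n = {n:>7}: p_hat = {p_hat:.4f}, error = {abs(p_hat - 0.5):.4f}")
```

The exact values depend on the seed, but the error at n = 100,000 should typically be a small fraction of a percentage point.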

P-hat standard error and confidence interval

The standard error of p-hat captures how much p̂ varies across repeated samples from the same population. It is largest at p̂ = 0.5 (maximum uncertainty when outcomes are evenly split) and drops to zero as p̂ approaches 0 or 1. For 95% confidence, the margin of error is 1.96 × SE, and the CI is p̂ plus or minus that margin.

An example brings the numbers to life. A national poll surveys 1067 voters and finds 53% favor a policy. SE = √(0.53 × 0.47 / 1067) = 0.0153. ME = 1.96 × 0.0153 = 0.030 = ±3.0 percentage points. CI = [50.0%, 56.0%]. Notice how the conventional "n = 1000, ±3 points" rule of thumb falls out naturally from the formula.

  • Small poll (n = 100): ±9.8 pp, CI wider than candidate lead
  • National poll (n = 1000): ±3.1 pp, standard polling target
  • Mega-poll (n = 10000): ±1.0 pp, election forecasting grade
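
These three margins can be reproduced from ME = z · SE at the worst case p̂ = 0.5 and 95% confidence. The helper name `margin_of_error` below is an assumption for illustration.

```python
from math import sqrt


def margin_of_error(n: int, p_hat: float = 0.5, z: float = 1.96) -> float:
    """Wald margin of error z * sqrt(p_hat * (1 - p_hat) / n)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: ±{100 * margin_of_error(n):.1f} pp")
# n =    100: ±9.8 pp
# n =   1000: ±3.1 pp
# n =  10000: ±1.0 pp
```

Because n sits under a square root, each hundredfold increase in sample size buys only a tenfold reduction in the margin.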

P-hat sample size and accuracy

Margin of error scales as 1/√n. Quadruple the sample to halve the CI width. This is the iron law of polling and survey research. Going from n = 250 to n = 1000 cuts ME from ±6.2 points to ±3.1 points at 95% confidence — meaningful improvement. Going from n = 1000 to n = 4000 cuts it from ±3.1 to ±1.5 points — diminishing returns.

The required sample for a target margin of error is n = (z/ME)² × p̂(1−p̂), worst case at p̂ = 0.5. For ±3 points at 95%: n = (1.96/0.03)² × 0.25 ≈ 1067. For ±1 point: n = (1.96/0.01)² × 0.25 = 9604. For ±0.5 points: 38,416. The election-eve mega-polls run by aggregators like 538 and RealClearPolitics combine multiple polls to reach effective sample sizes in the tens of thousands.

  • n = 100: ±9.8 pp at 95% — too rough for political polling
  • n = 384: ±5.0 pp at 95% — small academic survey
  • n = 1067: ±3.0 pp at 95% — national poll standard
  • n = 2401: ±2.0 pp at 95% — large national poll
  • n = 9604: ±1.0 pp at 95% — exit-poll precision
  • n = 38,416: ±0.5 pp at 95% — research-quality only
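
The sample sizes in this list follow from inverting the margin-of-error formula. The sketch below assumes the worst case p̂ = 0.5 and z = 1.96; the function name `required_n` is illustrative. It returns the unrounded value. The list rounds to the nearest integer, whereas rounding up (for example with `math.ceil`) would give 1068 and 385 for the ±3 pp and ±5 pp targets if the margin must not be exceeded.

```python
def required_n(me: float, p_hat: float = 0.5, z: float = 1.96) -> float:
    """Unrounded sample size for a target margin of error: (z / ME)^2 * p(1 - p)."""
    if me <= 0:
        raise ValueError("margin of error must be positive")
    return (z / me) ** 2 * p_hat * (1 - p_hat)


for me in (0.05, 0.03, 0.02, 0.01, 0.005):
    print(f"±{100 * me:.1f} pp at 95%: n ≈ {required_n(me):.0f}")
```

Passing a smaller `p_hat`, such as an expected 10% conversion rate, lowers the required n because p̂(1 − p̂) is then below its maximum of 0.25.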

Common p-hat mistakes

The first error is confusing p̂ with p. Saying "p = 0.48" when you mean "p̂ = 0.48" elides the entire uncertainty. The hat is doing work — it signals that you have an estimate, not a fact. Statisticians notice. Reporters miss it constantly.

The second is using the Wald CI for small samples. When n·p̂ or n·(1−p̂) falls below 5, the normal approximation breaks. The CI can extend below 0 or above 1, which is nonsensical for a proportion. Wilson's score interval (1927) and the Clopper–Pearson exact binomial interval (1934) handle small samples and extreme proportions cleanly. The calculator flags Wald-invalid cases.
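
For small samples, the Clopper–Pearson bounds can be found by solving the binomial tail equations numerically. The following is a standard-library sketch using bisection, not the method any particular package uses; the names `binom_cdf`, `_bisect`, and `clopper_pearson` are assumptions, and the direct summation is only suitable for modest n.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), by direct summation (modest n only)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact binomial (Clopper-Pearson) interval at confidence 1 - alpha."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    # Lower bound: the p at which P(X >= x) rises to alpha/2 (0 when x = 0).
    lower = 0.0 if x == 0 else _bisect(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= x) falls to alpha/2 (1 when x = n).
    upper = 1.0 if x == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


lo, hi = clopper_pearson(1, 50)
print(f"1 / 50: exact 95% CI = [{lo:.2%}, {hi:.2%}]")
```

For 1 / 50 the Wald lower bound is negative, while the exact interval stays inside [0, 1] by construction.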

The third mistake is reporting p̂ ± SE rather than p̂ ± z·SE. The standard error alone is a one-standard-deviation band, about 68% confidence. The convention multiplies by z = 1.96 (for 95% CI).
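
The 68% figure can be checked directly from the standard normal distribution. This sketch uses `NormalDist.cdf` from Python's `statistics` module; the function name `two_sided_coverage` is an illustrative assumption.

```python
from statistics import NormalDist


def two_sided_coverage(z: float) -> float:
    """Probability that a standard normal variable lies within ±z."""
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


print(f"±1.00 SE covers {two_sided_coverage(1.0):.3f}")   # 0.683
print(f"±1.96 SE covers {two_sided_coverage(1.96):.3f}")  # 0.950
```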

Random sampling matters

The p-hat CI assumes your sample is a random draw from the population. Convenience samples, online opt-in panels, and self-selected respondents can produce tight CIs that systematically miss the true p by far more than the margin of error suggests. Polling failure in 2016 wasn't a CI problem — it was a sampling-frame problem.

A short history of proportion inference

Wilson published his score-based interval in 1927. Clopper and Egon Pearson followed in 1934 with the exact interval that bears their names, still the gold standard for small samples. Wilson's interval was largely forgotten for about 70 years before Agresti and Coull rediscovered it in 1998 and showed it outperformed the Wald approximation almost everywhere.

The Wald CI, the simple p̂ ± z · √(p̂q̂/n), remained dominant in textbooks because it's easy to teach. Much modern statistics software offers Wilson or Clopper–Pearson intervals for proportions, and they are widely used in epidemiology and clinical trials where small samples and rare events are common. This calculator uses Wald with a normality warning, the choice most users expect.

FAQ

What is p-hat?
P-hat (p̂) is the sample proportion: the number of successes divided by sample size. If 480 of 1000 polled voters favor a candidate, p̂ = 0.48. It is the best point estimate of the true population proportion p, which is unknown.

How do you calculate p-hat?
p̂ = x / n, where x is the count of successes and n is the sample size. Example: 32 conversions out of 200 visitors gives p̂ = 32/200 = 0.16 = 16%. The number is always between 0 and 1.

What is the difference between p and p̂?
p is the true population proportion: unknown, and what you want to estimate. p̂ is the sample estimate: what you calculate from data. p̂ is a random variable that varies between samples; p is a fixed (but unknown) value. The confidence interval expresses how far p̂ might be from p.

What is the standard error of p-hat?
SE(p̂) = √(p̂(1−p̂)/n). For p̂ = 0.5 and n = 1000, SE = √(0.25/1000) = 0.0158. The SE shrinks as n grows; it is largest when p̂ = 0.5 and smaller toward 0 or 1.

How do you compute the confidence interval?
CI = p̂ ± z · SE. For 95% confidence, z = 1.96. With p̂ = 0.48 and n = 1000, CI = 0.48 ± 1.96 × 0.0158 = 0.48 ± 0.031 = [44.9%, 51.1%]. A higher confidence level gives a wider interval.

Why does the Wald interval need n·p̂ ≥ 5 and n·(1−p̂) ≥ 5?
The normal approximation to the binomial fails for small expected counts. The Wald CI assumes p̂ is approximately normally distributed, which requires enough successes (n·p̂ ≥ 5) and failures (n·(1−p̂) ≥ 5). With fewer, the CI can extend below 0 or above 1, or be inaccurate. Use Wilson or exact methods instead.

What sample size do you need for a given margin of error?
For a ±3 pp margin at 95% confidence, you need n ≈ 1067. The formula is n = (z/ME)² × p̂(1−p̂), worst case at p̂ = 0.5. Roughly quadrupling the sample to 4000 cuts the margin to about ±1.5 pp. The famous "national poll with n ≈ 1000" sits at about that ±3 pp accuracy.

What is the Wilson score interval?
Wilson's score interval (1927) is a better small-sample CI for proportions. It stays within [0, 1] and is more accurate than Wald for small n or extreme p̂. Use Wilson when p̂ is close to 0 or 1, when n is below 30, or when n·p̂ or n·(1−p̂) falls below 5.