Sample Size Calculator & Statistical Power Calculator (2024)

Use this advanced sample size calculator to calculate the sample size required for a one-sample statistic, or for differences between two proportions or means (two independent samples). More than two groups are supported for binomial data. Calculate power given sample size, alpha, and the minimum detectable effect (MDE, minimum effect of interest).

Quick navigation:

  • Parameters for sample size and power calculations
  • Calculator output
  • Why is sample size determination important?
  • What is statistical power?
    • Post-hoc power (Observed power)
  • Sample size formula
  • Types of null and alternative hypotheses in significance tests
  • Absolute versus relative difference and why it matters for sample size determination
  • Using the power & sample size calculator

    This calculator allows the evaluation of different statistical designs when planning an experiment (trial, test) which utilizes a Null-Hypothesis Statistical Test to make inferences. It can be used both as a sample size calculator and as a statistical power calculator. Usually one would determine the sample size required given a particular power requirement, but in cases where there is a predetermined sample size one can instead calculate the power for a given effect size of interest.

    Parameters for sample size and power calculations

    1. Number of test groups. The sample size calculator supports experiments in which one is gathering data on a single sample in order to compare it to a general population or known reference value (one-sample), as well as ones where a control group is compared to one or more treatment groups (two-sample, k-sample) in order to detect differences between them. For comparing more than one treatment group to a control group, sample size adjustments based on Dunnett's correction are applied. These are only approximately accurate, subject to the assumption of roughly equal effect sizes in all k groups, and only support equal sample sizes in all groups and the control. Power calculations are not currently supported for more than one treatment group due to their complexity.

    2. Type of outcome. The outcome of interest can be the absolute difference of two proportions (binomial data, e.g. conversion rate or event rate), the absolute difference of two means (continuous data, e.g. height, weight, speed, time, revenue, etc.), or the relative difference between two proportions or two means (percent difference, percent change, etc.). See Absolute versus relative difference for additional information. One can also calculate power and sample size for the mean of just a single group. The sample size and power calculator uses the Z-distribution (normal distribution).

    3. Baseline. The baseline mean (mean under H0) is the number one would expect to see if all experiment participants were assigned to the control group. It is the mean one expects to observe if the treatment has no effect whatsoever.

    4. Minimum Detectable Effect. The minimum effect of interest, often called the minimum detectable effect (MDE, but more accurately: MRDE, minimum reliably detectable effect), should be a difference one would not want to miss if it existed. It can be entered as a proportion (e.g. 0.10) or as a percentage (e.g. 10%). It is always relative to the mean/proportion under H0 ± the superiority/non-inferiority or equivalence margin. For example, if the baseline mean is 10 and there is a superiority alternative hypothesis with a superiority margin of 1 and the minimum effect of interest relative to the baseline is 3, then enter an MDE of 2, since the MDE plus the superiority margin will equal exactly 3. In this case the MDE (MRDE) is calculated relative to the baseline plus the superiority margin, as it is usually more intuitive to be interested in that value.

    If entering means data, one needs to specify the mean under the null hypothesis (worst-case scenario for a composite null) and the standard deviation of the data (for a known population or estimated from a sample).

    5. Type of alternative hypothesis. The calculator supports superiority, non-inferiority and equivalence alternative hypotheses. When the superiority or non-inferiority margin is zero, it becomes a classical left- or right-sided hypothesis; if it is larger than zero, it becomes a true superiority / non-inferiority design. The equivalence margin cannot be zero. See Types of null and alternative hypotheses below for an in-depth explanation.

    6. Acceptable error rates. The type I error rate, α, should always be provided. Power, calculated as 1 - β, where β is the type II error rate, is only required when determining sample size. For an in-depth explanation of power see What is statistical power below. The type I error rate is equivalent to the significance threshold if one is doing p-value calculations, and to one minus the confidence level if using confidence intervals.

    Calculator output

    The sample size calculator will output the sample size of the single group or of all groups, as well as the total sample size required. If used to solve for power it will output the power as a proportion and as a percentage.

    Why is sample size determination important?

    While this online software provides the means to determine the sample size of a test, it is of great importance to understand the context of the question, the "why" of it all.

    Estimating the required sample size before running an experiment that will be judged by a statistical test (a test of significance, confidence interval, etc.) allows one to:

    • determine the sample size needed to detect an effect of a given size with a given probability
    • be aware of the magnitude of the effect that can be detected with a certain sample size and power
    • calculate the power for a given sample size and effect size of interest

    This is crucial information with regards to making the test cost-efficient. Having a proper sample size can even mean the difference between conducting the experiment and postponing it for when one can afford a sample size large enough to ensure a high probability of detecting an effect of practical significance.

    For example, if a medical trial has low power, say less than 80% (β = 0.2) for a given minimum effect of interest, then it might be unethical to conduct it due to its low probability of rejecting the null hypothesis and establishing the effectiveness of the treatment. The same holds for experiments in physics, psychology, economics, marketing, conversion rate optimization, etc. Balancing the risks and rewards and assuring the cost-effectiveness of an experiment is a task that requires juggling the interests of many stakeholders, which is well beyond the scope of this text.

    What is statistical power?

    Statistical power is the probability of rejecting a false null hypothesis with a given level of statistical significance, against a particular alternative hypothesis. Alternatively, it can be said to be the probability of detecting, with a given level of significance, a true effect of a certain magnitude. This is what one gets when using the tool in "power calculator" mode. Power is closely related to the type II error rate β, and is always equal to 1 - β. In probability notation, the type II error for a given point alternative can be expressed as [1]:

    β(Tα; μ1) = P(d(X) ≤ cα; μ = μ1)

    It should be understood that the type II error rate is calculated at a given point, signified by the presence of a parameter (μ1) in the expression for β. Similarly, such a parameter is present in the expression for power, since POW = 1 - β [1]:

    POW(Tα; μ1) = P(d(X) > cα; μ = μ1)

    In the equations above cα represents the critical value for rejecting the null (significance threshold), d(X) is a statistical function of the parameter of interest - usually a transformation to a standardized score, and μ1 is a specific value from the space of the alternative hypothesis.
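
    To make the notation above concrete, here is a minimal sketch (not the calculator's internal routine) of how the power of a one-sided z-test can be computed under a normal approximation. The function name and the example numbers are purely illustrative.

```python
from statistics import NormalDist

def one_sided_z_power(delta, sd, n, alpha=0.05):
    """Approximate power of a one-sided z-test.

    delta : true difference from the value under H0 (the point alternative mu_1)
    sd    : known or assumed standard deviation of individual observations
    n     : sample size
    alpha : type I error rate (significance threshold)
    """
    z = NormalDist()                      # standard normal distribution
    c_alpha = z.inv_cdf(1 - alpha)        # critical value for rejecting H0
    shift = delta / (sd / n ** 0.5)       # standardized effect at the alternative
    # POW(T_alpha; mu_1) = P(d(X) > c_alpha; mu = mu_1)
    return 1 - z.cdf(c_alpha - shift)

# Example: true difference of 2 units, sd = 10, n = 200, alpha = 0.05
print(round(one_sided_z_power(delta=2, sd=10, n=200), 3))   # ~0.88
```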

    One can also calculate and plot the whole power function, getting an estimate of the power for many different alternative hypotheses. Due to the S-shape of the function, power quickly rises to nearly 100% for larger effect sizes, while it decreases more gradually to zero for smaller effect sizes. Such a power function plot is not yet supported by our statistical software, but one can calculate the power at a few key points (e.g. 10%, 20% ... 90%, 100%) and connect them for a rough approximation.

    Statistical power is directly and inversely related to the significance threshold. At the zero effect point for a simple superiority alternative hypothesis power is exactly 1 - α as can be easily demonstrated with our power calculator. At the same time power is positively related to the number of observations, so increasing the sample size will increase the power for a given effect size, assuming all other parameters remain the same.

    Post-hoc power (Observed power)

    Power calculations can be useful even after a test has been completed since failing to reject the null can be used as an argument for the null and against particular alternative hypotheses to the extent to which the test had power to reject them. This is more explicitly defined in the severe testing concept proposed by Mayo & Spanos (2006).

    Computing observed power is only useful if there was no rejection of the null hypothesis and one is interested in estimating how probative the test was towards the null. It is absolutely useless to compute post-hoc power for a test which resulted in a statistically significant effect being found [5]. If the effect is significant, then the test had enough power to detect it. In fact, there is a 1 to 1 inverse relationship between observed power and statistical significance, so one gains nothing from calculating post-hoc power, e.g. a test planned for α = 0.05 that passed with a p-value of just 0.0499 will have exactly 50% observed power (observed β = 0.5).

    I strongly encourage using this power and sample size calculator to compute observed power in the former case, and strongly discourage it in the latter.
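
    As an illustration of the one-to-one relationship described above, the following sketch (an approximation for a one-sided z-test, not the calculator's own method; the function name is hypothetical) plugs the observed standardized effect back into the power function. A result with a p-value just under the α = 0.05 threshold yields observed power of about 50%.

```python
from statistics import NormalDist

def observed_power_one_sided(p_value, alpha=0.05):
    """Post-hoc ('observed') power of a one-sided z-test, computed by treating
    the observed standardized effect as if it were the true effect."""
    z = NormalDist()
    z_obs = z.inv_cdf(1 - p_value)        # observed test statistic implied by the p-value
    c_alpha = z.inv_cdf(1 - alpha)        # critical value at the chosen alpha
    return 1 - z.cdf(c_alpha - z_obs)

# A result just past the threshold (p = 0.0499 with alpha = 0.05) has ~50% observed power
print(round(observed_power_one_sided(0.0499), 3))   # ~0.5
```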

    Sample size formula

    The formula for calculating the sample size of a test group in a one-sided test of absolute difference is:

    n = ((Z1-α + Z1-β) · σ / δ)²

    where Z1-α is the Z-score corresponding to the selected statistical significance threshold α, Z1-β is the Z-score corresponding to the selected statistical power 1-β, σ is the known or estimated standard deviation, and δ is the minimum effect size of interest. The standard deviation is estimated analytically in calculations for proportions, and empirically from the raw data for other types of means.

    The formula applies to single sample tests as well as to tests of absolute difference between two samples. A proprietary modification is employed when calculating the required sample size in a test of relative difference. This modification has been extensively tested under a variety of scenarios through simulations.
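
    For illustration only, the formula above can be evaluated as follows. This is a simplified sketch, not the calculator's exact routine (and not the proprietary adjustment for relative differences); the function name is hypothetical, and the analytic standard deviation shown for the binomial case, sqrt(p0 · (1 - p0)), is just one common simple choice.

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_n(sigma, delta, alpha=0.05, power=0.80):
    """Sample size from the formula above:
    n = ((Z_{1-alpha} + Z_{1-beta}) * sigma / delta)^2

    sigma : known or estimated standard deviation (as defined in the text)
    delta : minimum effect size of interest (absolute difference)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)            # Z_{1-alpha}
    z_beta = z.inv_cdf(power)                 # Z_{1-beta}
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Continuous outcome: sd = 10, MDE = 2, alpha = 0.05, power = 80%
print(required_n(sigma=10, delta=2))                       # ~155

# Binomial outcome: for a baseline proportion p0, a simple analytic choice
# for the standard deviation is sqrt(p0 * (1 - p0))
p0 = 0.10
print(required_n(sigma=sqrt(p0 * (1 - p0)), delta=0.02))   # ~1392
```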

    Types of null and alternative hypotheses in significance tests

    When doing sample size calculations, it is important that the null hypothesis (H0, the hypothesis being tested) and the alternative hypothesis (H1) are well thought out. The test can reject the null or it can fail to reject it. Strictly logically speaking, it cannot lead to acceptance of the null or to acceptance of the alternative hypothesis. A null hypothesis can be a point one - hypothesizing that the true value is an exact point from the possible values - or a composite one: covering many possible values, usually from -∞ to some value or from some value to +∞. The alternative hypothesis can also be a point one or a composite one.

    In a Neyman-Pearson framework of NHST (Null-Hypothesis Statistical Test) the alternative should exhaust all values that do not belong to the null, so it is usually composite. Below is an illustration of some possible combinations of null and alternative statistical hypotheses: superiority, non-inferiority, strong superiority (margin > 0), equivalence.

    [Figure: possible combinations of null and alternative hypotheses - superiority, non-inferiority, strong superiority (margin > 0), equivalence]

    All of these are supported in our power and sample size calculator.

    Careful consideration has to be made when deciding on a non-inferiority margin, superiority margin or an equivalence margin. Equivalence designs are sometimes used in clinical trials where a drug may perform equally well (within some bounds) as an existing drug but can still be preferred due to fewer or less severe side effects, cheaper manufacturing, or other benefits; however, non-inferiority designs are more common. Similar cases exist in disciplines such as conversion rate optimization [2] and other business applications where benefits not measured by the primary outcome of interest can influence the adoption of a given solution. For equivalence tests it is assumed that they will be evaluated using two one-sided t-tests (TOST) or z-tests, or confidence intervals.
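
    As a rough sketch of how an equivalence outcome might be evaluated with two one-sided z-tests (TOST), assuming a normal approximation and a known standard error of the difference (the function and the numbers below are illustrative, not the calculator's implementation):

```python
from statistics import NormalDist

def tost_z_equivalence(diff, se, margin, alpha=0.05):
    """Two one-sided z-tests (TOST) for equivalence within +/- margin.

    diff   : observed difference between the two groups
    se     : standard error of that difference
    margin : equivalence margin (must be > 0)
    Returns True if both one-sided nulls are rejected at level alpha,
    i.e. the difference is declared to lie within (-margin, +margin).
    """
    z = NormalDist()
    p_lower = 1 - z.cdf((diff + margin) / se)   # H0: diff <= -margin
    p_upper = z.cdf((diff - margin) / se)       # H0: diff >= +margin
    return max(p_lower, p_upper) < alpha

# Example: observed difference 0.5, standard error 0.4, equivalence margin 1.5
print(tost_z_equivalence(diff=0.5, se=0.4, margin=1.5, alpha=0.05))   # True
```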

    Note that our calculator does not support the schoolbook case of a point null and a point alternative, nor a point null and an alternative that covers all the remaining values. This is because such cases are practically non-existent in experimental practice [3][4]. The only two-sided calculation is for the equivalence alternative hypothesis; all other calculations are one-sided (one-tailed).

    Absolute versus relative difference and why it matters for sample size determination

    When using a sample size calculator it is important to know what kind of inference one is looking to make: about the absolute or about the relative difference, often called percent effect, percentage effect, relative change, percent lift, etc. Where the first is μ1 - μ, the second is (μ1 - μ) / μ, or (μ1 - μ) / μ × 100 (%). The division by μ is what adds more variance to such an estimate, since μ is just another variable with random error, therefore a test for relative difference will require a larger sample size than a test for absolute difference. Consequently, if the sample size is fixed, there will be less power for the relative change equivalent to any given absolute change.

    For the above reason it is important to know and state beforehand whether percentage change or absolute change is of primary interest. Then it is just a matter of flipping a radio button.
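
    A quick illustration of the two ways of expressing the same change (hypothetical numbers):

```python
baseline = 0.10          # mu: baseline conversion rate
treatment = 0.12         # mu_1: conversion rate under treatment

absolute_diff = treatment - baseline                  # mu_1 - mu
relative_diff = (treatment - baseline) / baseline     # (mu_1 - mu) / mu

print(f"absolute difference: {absolute_diff:.3f}")    # 0.020 (2 percentage points)
print(f"relative difference: {relative_diff:.0%}")    # 20% lift
```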

    References

    [1] Mayo D.G., Spanos A. (2010) "Error Statistics", in P. S. Bandyopadhyay & M. R. Forster (Eds.), Philosophy of Statistics (7, 152-198). Handbook of the Philosophy of Science. The Netherlands: Elsevier.

    [2] Georgiev G.Z. (2017) "The Case for Non-Inferiority A/B Tests", [online] https://blog.analytics-toolkit.com/2017/case-non-inferiority-designs-ab-testing/ (accessed May 7, 2018)

    [3] Georgiev G.Z. (2017) "One-tailed vs Two-tailed Tests of Significance in A/B Testing", [online] https://blog.analytics-toolkit.com/2017/one-tailed-two-tailed-tests-significance-ab-testing/ (accessed May 7, 2018)

    [4] Cho H-C., Abe S. (2013) "Is two-tailed testing for directional research hypotheses tests legitimate?", Journal of Business Research 66:1261-1266

    [5] Lakens D. (2014) "Observed power, and what to do if your editor asks for post-hoc power analyses", [online] http://daniellakens.blogspot.bg/2014/12/observed-power-and-what-to-do-if-your.html (accessed May 7, 2018)

    Our statistical calculators have been featured in scientific papers and articles published in high-profile science journals.


    FAQs

    How do you find the sample size for statistical power?

    The formula for determining the sample size that ensures the test has a specified power is n = ((Z1-α/2 + Z1-β) · σ / δ)², where α is the selected level of significance and Z1-α/2 is the value from the standard normal distribution holding 1-α/2 below it. For example, if α = 0.05, then 1-α/2 = 0.975 and Z = 1.960.

    What is the relationship between sample size and statistical power?

    The concept of statistical power is closely tied to sample size: the power of a study increases with an increase in sample size. Ideally, the minimum power required of a study is 80%.

    How do you calculate statistically significant sample size?

    Five steps to finding your sample size
    1. Define population size or number of people.
    2. Designate your margin of error.
    3. Determine your confidence level.
    4. Predict expected variance.
    5. Finalize your sample size.

    What is PS power and sample size?

    PS is an interactive program for performing power and sample size calculations. It may be run as a web app at https://vbiostatps.app.vumc.org/ or downloaded for free. It can be used for studies with dichotomous or continuous response measures.

    How to calculate statistical power by hand?

    Hand Calculation.

    Power = 1 - Φ[1.96 - (105 - 100)/(10/√n)] + Φ[-1.96 - (105 - 100)/(10/√n)], which simplifies to Power = 1 - Φ[1.96 - √n/2] + Φ[-1.96 - √n/2]. That function shows the relationship between power and sample size: for each sample size there is a corresponding power.
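
    Evaluating that power function at a few sample sizes makes the relationship visible; the snippet below simply tabulates the expression above (two-sided test of H0: μ = 100 with σ = 10 and α = 0.05, true mean 105) and is meant only as an illustration:

```python
from statistics import NormalDist

phi = NormalDist().cdf   # standard normal CDF, i.e. Φ

def power_at(n):
    # Power of the two-sided z-test of H0: mu = 100 (sigma = 10, alpha = 0.05)
    # evaluated against a true mean of 105, as in the example above
    shift = (105 - 100) / (10 / n ** 0.5)          # equals sqrt(n) / 2
    return 1 - phi(1.96 - shift) + phi(-1.96 - shift)

for n in (10, 20, 30, 40):
    print(n, round(power_at(n), 3))   # power grows as the sample size grows
```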

    What is the formula for calculating sample size?

    Sample Size = N / (1 + N·e²)

    Note that this is the least accurate formula and, as such, the least ideal. You should only use this if circumstances prevent you from determining an appropriate standard deviation and/or confidence level (thereby preventing you from determining your z-score, as well).
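
    For illustration, the formula quoted above (often attributed to Slovin/Yamane) can be evaluated as follows; the numbers are hypothetical:

```python
from math import ceil

def slovin_sample_size(population, margin_of_error):
    """Sample Size = N / (1 + N * e^2), as quoted in the answer above."""
    return ceil(population / (1 + population * margin_of_error ** 2))

# Population of 10,000 and a 5% margin of error
print(slovin_sample_size(population=10_000, margin_of_error=0.05))   # ~385
```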

    Does small sample size reduce statistical power?

    Statistically significant findings are harder to detect with small sample sizes. This means that small sample sizes decrease statistical power. Finding non-significant differences with small sample sizes often leads to committing Type II errors.

    What is a good sample size for a study?

    For populations under 1,000, a minimum ratio of 30 percent (300 individuals) is advisable to ensure representativeness of the sample. For larger populations, such as a population of 10,000, a comparatively small minimum ratio of 10 percent (1,000) of individuals is required to ensure representativeness of the sample.

    What is the minimum sample size for statistical significance?

    Most statisticians agree that the minimum sample size to get any kind of meaningful result is 100. If your population is less than 100 then you really need to survey all of them.

    What is the rule of thumb for sample size?

    Summary: The rule of thumb: Sample size should be such that there are at least 5 observations per estimated parameter in a factor analysis and other covariance structure analyses. The kernel of truth: This oversimplified guideline seems appropriate in the presence of multivariate normality.

    What is the rule for sample size in statistics?

    The number 30 is often used as a rule of thumb for a minimum sample size in statistics because it is the point at which the central limit theorem begins to apply.

    What is Fisher's formula for sample size?

    The sample size can be estimated using Fisher's formula: n = z²pq / e², where n is the desired sample size, z is the standard normal deviate at the desired level of confidence (1.96 for 95% confidence), p is the estimated proportion of the population with the attribute of interest, q = 1 - p, and e is the desired margin of error.
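
    A small sketch of this formula with illustrative values (the same expression also covers the unknown-population case discussed further down):

```python
from math import ceil
from statistics import NormalDist

def fisher_sample_size(p, margin_of_error, confidence=0.95):
    """n = z^2 * p * q / e^2 with q = 1 - p (Fisher's / Cochran's formula)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # 1.96 for 95% confidence
    q = 1 - p
    return ceil(z ** 2 * p * q / margin_of_error ** 2)

# Worst-case proportion p = 0.5, 5% margin of error, 95% confidence
print(fisher_sample_size(p=0.5, margin_of_error=0.05))   # ~385
```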

    What is the program for power and sample size calculations?

    The PS program can produce graphs to explore the relationships between power, sample size and detectable alternative hypotheses. It is often helpful to hold one of these variables constant and plot the other two against each other.

    What is the difference between sample size and power of study?

    Strictly speaking “power” refers to the probability of avoiding a type II error in a comparative study. Sample size estimation is a more encompassing term that looks at more than just the type II error and is applicable to all types of studies. In common parlance the terms are used interchangeably.

    What does P stand for in sample size?

    p̂ is the sample proportion (an estimate of the population proportion p). n and n′ are sample sizes. N is the population size. Within statistics, a population is a set of events or elements that have some relevance regarding a given question or experiment.

    What is the sample size for statistical theory?

    A good maximum sample size is usually around 10% of the population, as long as this does not exceed 1000. For example, in a population of 5000, 10% would be 500. In a population of 200,000, 10% would be 20,000. This exceeds 1000, so in this case the maximum would be 1000.

    How to calculate sample size for an unknown population?

    For instance, for a 95% confidence level, Z would be 1.96. The Z-score corresponds to the degree of confidence you want to have in your estimate. For sample size calculation with an unknown population size, you can use the formula n = z²·p·(1 - p) / e².

    How do you determine how many participants you need for a study?

    How many people should I ask to take my survey?
    1. Respondents needed. Take the number of people you need to answer your survey.
    2. ÷ Response rate. Divide by the expected response rate (use 25 for 25%, if you use .25 skip step 3)
    3. x 100. Then multiply by 100 (to account for using a percentage versus a decimal in step 2)
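
    The three steps above as plain arithmetic (hypothetical numbers):

```python
respondents_needed = 100      # completed responses you need
response_rate_pct = 25        # expected response rate, as a percentage

# people to ask = needed responses / response rate * 100
invitations = respondents_needed / response_rate_pct * 100
print(int(invitations))       # 400 people should be asked to take the survey
```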
