AP Statistics Flashcards

Free flashcards to ace your AP Statistics exam

Learn faster with 48 AP flashcards. One-click export to Notion.

Learn fast, memorize everything, and ace your AP exam. No credit card required.

Want to create flashcards from your own textbooks and notes?

Let AI automatically create flashcards from your own textbooks and notes. Upload your PDF, select the pages you want to memorize fast, and let AI do the rest. One-click export to Notion.

Create Flashcards from my PDFs

AP Statistics

48 flashcards

The null hypothesis states that there is no effect or no difference between the variables being studied; it is the default claim that a test looks for evidence against.
Descriptive statistics involves organizing, summarizing, and presenting data in a meaningful way, such as through graphs, charts, and summary measures.
Inferential statistics involves using sample data to make estimates, decisions, predictions, or other generalizations about a larger population.
A population is the entire group that you want to study and make conclusions about.
A sample is a subset of individuals from the population that is studied.
A parameter is a numerical value that describes a population characteristic. A statistic is a numerical value that describes a sample characteristic.
The mean is the sum of all values divided by the number of values. It measures the center or average of a data set.
The median is the middle value in an ordered data set, with half the values greater and half the values less than the median.
The mode is the value or values that occur most frequently in a data set.
The range is the difference between the highest and lowest values in a data set.
The standard deviation measures the spread or dispersion of a data set from its mean.
A z-score represents how many standard deviations a data point is away from the mean. It allows comparison across different data sets.
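
These measures of center and spread are easy to compute with Python's standard statistics module; the sample data below is invented for illustration.

    import statistics

    data = [4, 8, 6, 5, 3, 7, 8, 9]        # hypothetical sample data
    mean = statistics.mean(data)            # sum of values / number of values
    median = statistics.median(data)        # middle value of the ordered data
    mode = statistics.mode(data)            # most frequent value
    data_range = max(data) - min(data)      # highest minus lowest
    stdev = statistics.stdev(data)          # sample standard deviation

    # z-score: how many standard deviations a point lies from the mean
    z = (9 - mean) / stdev
    print(mean, median, mode, data_range, round(stdev, 2), round(z, 2))
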
Statistical significance means a result is unlikely to have occurred by chance alone if the null hypothesis were true.
A p-value is the probability of obtaining a result at least as extreme as the observed data, assuming the null hypothesis is true.
The alternative hypothesis states that there is an effect or relationship between the variables being studied.
A Type I error occurs when the null hypothesis is true, but is rejected based on the sample evidence.
A Type II error occurs when the null hypothesis is false, but is not rejected based on the sample evidence.
The significance level (alpha) is the probability of making a Type I error, commonly set at 0.05 or 0.01.
A confidence interval is a range of values that is likely to contain the true population parameter with a certain level of confidence.
A point estimate is a single value that estimates the parameter of interest in a population.
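
As a rough sketch of how these testing and estimation ideas fit together, here is a one-sample t-test and a t-based confidence interval; the data and the hypothesized mean of 100 are made up, and scipy is assumed to be available.

    import statistics
    from scipy import stats

    sample = [102, 98, 110, 105, 95, 108, 101, 99, 104, 107]  # hypothetical data

    # H0: population mean = 100  vs.  Ha: population mean != 100
    t_stat, p_value = stats.ttest_1samp(sample, popmean=100)

    alpha = 0.05                    # significance level = P(Type I error)
    if p_value < alpha:
        print(f"p = {p_value:.3f} < {alpha}: reject H0")
    else:
        print(f"p = {p_value:.3f} >= {alpha}: fail to reject H0")

    # 95% confidence interval: point estimate +/- critical value * standard error
    n = len(sample)
    x_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    margin = stats.t.ppf(0.975, df=n - 1) * se
    print(f"95% CI: ({x_bar - margin:.1f}, {x_bar + margin:.1f})")
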
The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the population's distribution.
A sampling distribution is the distribution of a statistic over many samples from the same population.
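
A small simulation makes the Central Limit Theorem concrete: sample means drawn from a strongly right-skewed population still pile up in a roughly bell shape around the population mean. The seed and sizes below are arbitrary choices.

    import random
    import statistics

    random.seed(1)

    # 2000 samples of size 50 from an exponential population (mean 1, sd 1)
    sample_means = [
        statistics.mean(random.expovariate(1.0) for _ in range(50))
        for _ in range(2000)
    ]

    # CLT predicts the sample means cluster near 1 with spread
    # sigma / sqrt(n) = 1 / sqrt(50), roughly 0.141
    print(round(statistics.mean(sample_means), 3))
    print(round(statistics.stdev(sample_means), 3))
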
A simple random sample is a sample where every member of the population has an equal chance of being selected.
Stratified sampling divides the population into homogeneous subgroups, then takes a simple random sample from each subgroup.
Cluster sampling divides the population into clusters, then randomly selects some of those clusters to include in the sample.
Systematic sampling selects members of the population at a fixed periodic interval, such as every 10th person in a line.
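
For illustration, simple random and systematic sampling can be sketched with Python's random module; the population of 100 IDs and the sample sizes are arbitrary.

    import random

    random.seed(42)
    population = list(range(1, 101))   # IDs 1..100

    # Simple random sample: every member equally likely to be chosen
    srs = random.sample(population, k=10)

    # Systematic sample: random start, then every 10th member
    start = random.randrange(10)
    systematic = population[start::10]

    print(srs)
    print(systematic)
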
An experiment involves imposing some treatment on subjects, while an observational study simply observes and records data.
Confounding variables are extraneous variables related to both the explanatory and response variables, making it impossible to tell which variable is responsible for an observed effect.
Randomization is the random assignment of treatments or conditions to experimental units to account for potential bias or confounding.
Replication means applying each treatment to multiple experimental units, or repeating a study, so that chance variation can be assessed and the results are more trustworthy.
A correlation coefficient measures the strength and direction of the linear relationship between two variables.
A scatterplot displays the relationship between two quantitative variables by plotting points for each pair of variable values.
A residual is the difference between an observed value and the value predicted by a statistical model.
A regression line represents the line of best fit that minimizes the sum of squared residuals in a linear regression model.
The least squares method finds the line of best fit by minimizing the sum of squared residuals between observed and predicted values.
The coefficient of determination (r²) represents the proportion of variation in the response variable that is explained by the explanatory variable.
The assumptions of linear regression include linearity, normality of residuals, homoscedasticity (constant spread of residuals), and independence of errors.
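
A minimal least-squares fit computed by hand from the formulas above; the (x, y) pairs are fabricated for the example.

    import statistics

    x = [1, 2, 3, 4, 5, 6]
    y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]   # hypothetical pairs

    x_bar, y_bar = statistics.mean(x), statistics.mean(y)

    # slope b1 = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
        / sum((xi - x_bar) ** 2 for xi in x)
    b0 = y_bar - b1 * x_bar    # intercept: the line passes through (x_bar, y_bar)

    # r^2 = proportion of variation in y explained by x (Python 3.10+)
    r = statistics.correlation(x, y)
    print(f"y-hat = {b0:.2f} + {b1:.2f}x, r^2 = {r ** 2:.3f}")

    # residual = observed - predicted, e.g. for the first point
    print(f"residual at x=1: {y[0] - (b0 + b1 * 1):.2f}")
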
A contingency table summarizes the frequencies of two or more categorical variables to study their relationship.
The chi-square test determines whether there is a significant difference between observed and expected frequencies in a contingency table.
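
A quick sketch of a chi-square test of independence on a small contingency table; the counts are invented and scipy is assumed to be available.

    from scipy.stats import chi2_contingency

    # Rows: two groups; columns: two response categories (made-up counts)
    observed = [[30, 20],
                [25, 35]]

    chi2, p, dof, expected = chi2_contingency(observed)
    print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
    print("expected counts:", expected.round(1))
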
ANOVA stands for Analysis of Variance, a method used to compare means across more than two groups or levels.
The one-way ANOVA tests the equality of means across multiple groups based on a single factor or independent variable.
The two-way ANOVA tests the effects of two independent variables, as well as their interaction, on a dependent variable.
A blocking factor is a known source of variation that is controlled by grouping similar experimental units together, increasing the precision of an experiment.
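
A one-way ANOVA comparing three hypothetical treatment groups with scipy; all of the scores below are fabricated.

    from scipy.stats import f_oneway

    group_a = [85, 90, 88, 92, 87]
    group_b = [78, 82, 80, 85, 79]
    group_c = [90, 95, 93, 89, 94]

    # H0: all three group means are equal
    f_stat, p_value = f_oneway(group_a, group_b, group_c)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
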
A time series is a sequence of data points collected over successive time intervals or periods.
The components of a time series include the trend, seasonal variation, cyclical variation, and irregular or random variation.
A probability distribution describes all the possible values of a random variable and the likelihood of each occurring.
The normal distribution is symmetric and bell-shaped, with most values clustered around the mean and tapering off equally in both directions.
The Empirical Rule states that approximately 68% of data falls within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations for a normal distribution.
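
The Empirical Rule percentages can be checked directly against the standard normal distribution, for example with scipy.

    from scipy.stats import norm

    # P(mu - k*sigma < X < mu + k*sigma) for a normal distribution
    for k in (1, 2, 3):
        prob = norm.cdf(k) - norm.cdf(-k)
        print(f"within {k} SD: {prob:.1%}")
    # prints roughly 68.3%, 95.4%, 99.7%
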