 STAT 250: Introduction to Biostatistics
Dr. Kari Lock Morgan
Spring 2015


LEARNING OBJECTIVES

By the end of the course, you should be able to...

Data Collection

1. Identify cases and variables in a dataset, and classify variables as categorical or quantitative.
2. Recognize that data and knowledge of statistics allow you to investigate a wide variety of interesting phenomena.
3. Distinguish between a sample and a population.
4. Recognize when it is, and is not, appropriate to use sample data to infer information about a population.
5. Recognize that not every association implies causation.
6. Identify potential confounding variables in an observational study.
7. Distinguish between an observational study and a randomized experiment.
8. Recognize that only randomized experiments can lead to claims of causation and explain why randomization is important for causality.
9. Explain how and why placebos and blinding are used in experiments.
10. Distinguish between a completely randomized experiment and a matched pairs experiment.
11. Design and implement a basic randomized experiment.
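
As an illustration of the random assignment step in a completely randomized experiment, here is a minimal Python sketch. The function name, the participant IDs, and the seed are hypothetical and are not taken from the course materials.

```python
import random

def randomize_two_groups(units, seed=None):
    """Randomly split units into treatment and control groups of (nearly) equal size."""
    rng = random.Random(seed)   # seeded generator so the assignment can be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)       # random order, so chance alone decides group membership
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs 1 through 10.
treatment, control = randomize_two_groups(range(1, 11), seed=250)
print("Treatment:", sorted(treatment))
print("Control:  ", sorted(control))
```

Because assignment depends only on the shuffle, potential confounding variables should tend to balance out between the two groups.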

Exploratory Data Analysis

1. Create (with technology) and interpret a dotplot, boxplot, or histogram, and side-by-side dotplots, boxplots, or histograms.
2. Calculate (with technology) and interpret summary statistics for a quantitative variable, including mean, median, standard deviation, five-number summary, range, and IQR, and be able to calculate and compare these within groups.
3. Compute and interpret a z-score for an individual value.
4. Interpret percentiles.
5. Create (with technology) a scatterplot between two quantitative variables, and use the plot to describe the association.
6. Explain what a positive or negative association means between two quantitative variables.
7. Calculate (with technology) and interpret a correlation.
8. Identify outliers (informally or formally) and explain how they affect different statistics.
9. Realize that it is important to plot your data if any variables are quantitative.
10. Create (with technology) bar graphs and side-by-side or segmented bar graphs for categorical variables.
11. Create a frequency, relative frequency, or two-way table to summarize categorical variables.
12. Use a frequency, relative frequency, or two-way table to calculate proportions, difference in proportions, odds, and odds ratios.
13. Calculate and interpret conditional probabilities for categorical variables.
14. Determine an appropriate numerical summary statistic(s) and visualization for any one or two variables being analyzed.
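
Two of the calculations listed above, the z-score and the comparison of proportions from a two-way table, can be sketched in Python as follows. This is only a rough illustration: the function names and all data values are made up, and the z-score here uses the sample standard deviation.

```python
import statistics

def z_score(value, data):
    """Number of sample standard deviations that value lies from the mean of data."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    return (value - mean) / sd

def compare_proportions(yes_1, no_1, yes_2, no_2):
    """Difference in proportions and odds ratio from the counts of a 2x2 table."""
    p1 = yes_1 / (yes_1 + no_1)   # proportion of "yes" in group 1
    p2 = yes_2 / (yes_2 + no_2)   # proportion of "yes" in group 2
    odds_1 = yes_1 / no_1         # odds of "yes" in group 1
    odds_2 = yes_2 / no_2         # odds of "yes" in group 2
    return {
        "difference_in_proportions": p1 - p2,
        "odds_ratio": odds_1 / odds_2,
    }

# Hypothetical data.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, data))
print(compare_proportions(30, 70, 10, 90))
```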

Estimation

1. Distinguish between a population parameter and a sample statistic, recognizing that a parameter is fixed while a statistic varies from sample to sample.
2. Determine and define an appropriate parameter of interest, based on a question.
3. Compute a point estimate for a parameter using an appropriate statistic from a sample.
4. Recognize that a sampling distribution shows how sample statistics tend to vary, but that in reality a sampling distribution can never be obtained in situations where estimation is needed.
5. Recognize that statistics from random samples tend to be centered at the population parameter.
6. Explain how to generate a bootstrap distribution for a given sample and statistic.
7. Use technology to generate a bootstrap distribution, and recognize that it will be centered around the sample statistic.
8. Demonstrate an understanding of standard error as the standard deviation of the statistic.
9. Calculate a standard error from a bootstrap distribution (using technology), and from a formula for means, difference in means, proportions, and difference in proportions.
10. Recognize that a confidence interval will capture the true parameter for the specified percentage of all random samples.
11. Use a bootstrap distribution to construct a 95% confidence interval using the formula statistic ± 2 × SE.
12. Use a bootstrap distribution to construct a confidence interval using percentiles of the bootstrap distribution.
13. Use the normal or t-distribution to construct a confidence interval for a mean, proportion, difference in means, difference in proportions, or correlation using technology.
14. Use the normal or t-distribution and the standard error formulas to construct a confidence interval using the formula statistic ± z* × SE for proportions and difference in proportions or statistic ± t* × SE for means, difference in means, and slope.
15. Interpret a confidence interval in context.
16. Explain how sample size affects standard error and the width of a confidence interval.
17. Demonstrate an understanding of the central limit theorem.
18. Determine whether the conditions are met for the chosen method to be valid.
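
The bootstrap objectives above can be sketched in Python roughly as follows, using the sample mean as the statistic. The function names, the sample values, and the simple index-based percentile rule are assumptions made for illustration, not the course's own procedure.

```python
import random
import statistics

def bootstrap_means(sample, n_boot=2000, seed=None):
    """Resample with replacement (same size as the sample) and record each mean."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.mean(rng.choices(sample, k=n)) for _ in range(n_boot)]

def se_interval(sample, boot_stats):
    """Approximate 95% interval: statistic +/- 2 x SE, with SE from the bootstrap."""
    stat = statistics.mean(sample)
    se = statistics.stdev(boot_stats)  # standard error = SD of the bootstrap statistics
    return stat - 2 * se, stat + 2 * se

def percentile_interval(boot_stats, level=0.95):
    """Interval from the middle `level` share of the sorted bootstrap statistics."""
    ordered = sorted(boot_stats)
    tail = (1 - level) / 2
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1 - tail) * len(ordered)) - 1]
    return lower, upper

# Hypothetical sample of 10 measurements.
sample = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
boot = bootstrap_means(sample, seed=250)
print(se_interval(sample, boot))
print(percentile_interval(boot))
```

When the bootstrap distribution is roughly symmetric and bell-shaped, the two intervals should come out similar.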

Testing

1. Recognize when and why statistical tests are needed.
2. Specify null and alternative hypotheses based on a question of interest, defining relevant parameters.
3. Demonstrate an understanding of the concept of statistical significance.
4. Recognize that the strength of evidence against the null hypothesis depends on how unlikely it would be to get a statistic as extreme just by random chance, if the null hypothesis were true.
5. Use technology to generate a randomization distribution, and realize that it will be centered around the null parameter value.
6. For a given sample and null hypothesis, describe the process of creating a randomization distribution.
7. Use a randomization distribution to calculate a p-value.
8. Connect the definition of a p-value to the motivation behind a randomization distribution.
9. Distinguish between one- and two-tailed tests in stating the alternative hypothesis and calculating the p-value.
10. Interpret a p-value.
11. Make a formal decision in a hypothesis test by comparing the p-value to the significance level.
12. State the conclusion to a hypothesis test in context.
13. Recognize that two types of errors can occur, and interpret false positives (Type I) and false negatives (Type II) in context.
14. Recognize a significance level as the tolerable chance of getting a false positive (making a Type I error).
15. Explain the problem of multiple testing and publication bias.
16. Recognize that statistical significance is not always the same as practical significance.
17. Make a less formal statement about the strength of evidence in a p-value.
18. Determine the decision for a two-tailed hypothesis test from the corresponding confidence interval.
19. Use technology and the normal or t-distribution to calculate a p-value for tests for means, difference in means, proportions, difference in proportions, correlation, and slope.
20. Use the normal or t-distribution, the standard error formulas, and the formula (statistic - null value)/SE to calculate a p-value for tests for means, difference in means, proportions, difference in proportions, correlation, and slope.
21. Determine whether a chi-square goodness-of-fit test or a chi-square test for association is appropriate to answer a question of interest.
22. State hypotheses for a chi-square goodness-of-fit test for one categorical variable and for a chi-square test for association for two categorical variables.
23. Calculate the test statistic for a chi-square goodness-of-fit test and a chi-square test for association both with and without technology.
24. Use a randomization distribution or a chi-square distribution to calculate a p-value for a chi-square test.
25. State the conclusion in context for a chi-square goodness-of-fit test and a chi-square test for association.
26. Determine whether the conditions are met to use a normal, t, or chi-square distribution for inference.
27. Conduct a hypothesis test from start to finish for a variety of different situations.
28. Determine whether a confidence interval, a hypothesis test, both, or neither is most appropriate for answering a question of interest.
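
Two of the testing calculations above can be sketched in Python as follows: a two-tailed randomization p-value for a difference in means, obtained by reshuffling group labels, and the chi-square goodness-of-fit statistic computed as the sum of (observed - expected)² / expected. The function names and data are hypothetical, and the p-value is taken as the plain proportion of simulated statistics at least as extreme as the observed one.

```python
import random
import statistics

def randomization_p_value(group_a, group_b, n_sim=1000, seed=None):
    """Two-tailed p-value for a difference in means, by reshuffling group labels."""
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)  # under the null hypothesis, group labels are interchangeable
        simulated = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(simulated) >= abs(observed):
            extreme += 1
    return extreme / n_sim

def chi_square_statistic(observed_counts, null_proportions):
    """Goodness-of-fit statistic: sum of (observed - expected)^2 / expected."""
    n = sum(observed_counts)
    expected = [n * p for p in null_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))

# Hypothetical data.
print(randomization_p_value([10, 11, 12, 13, 14, 15], [1, 2, 3, 4, 5, 6], seed=250))
print(chi_square_statistic([20, 30, 50], [1 / 3, 1 / 3, 1 / 3]))
```

Reshuffling the pooled values simulates what would happen if group membership had no effect, which is why the resulting distribution is centered near a difference of zero.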

Modeling

1. Use technology to find the regression line for two quantitative variables, giving the equation and plotting the line on a scatterplot.
2. Calculate predicted values from a regression equation.
3. Interpret the slope (and intercept, when appropriate) of a regression line in context.
4. Calculate residuals and visualize residuals on a scatterplot.
5. Beware of extrapolating when making predictions, fitting a line to nonlinear data, and the effect of outliers.
6. Recognize the importance of plotting your data.
7. Check a scatterplot for obvious violations of the assumptions of simple linear regression.
8. Construct a confidence interval and test a hypothesis about the slope in a linear regression model.
9. Compute (with technology) and interpret R² in a regression model.
10. Use technology to fit a multiple regression model.
11. Interpret coefficients in a multiple regression model, recognizing that care should be taken when interpreting coefficients of predictors that are strongly associated with each other.
12. Use a multiple regression model to make predictions.
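
For simple linear regression, the regression line, predicted values, residuals, and R² can be sketched in Python as below. This assumes the usual least-squares formulas (slope = Sxy / Sxx, intercept = ȳ - slope × x̄); the function names and the data are invented for illustration.

```python
import statistics

def fit_line(x, y):
    """Least-squares intercept and slope for predicting y from x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def residuals(x, y, intercept, slope):
    """Observed minus predicted value for each case."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def r_squared(x, y, intercept, slope):
    """Proportion of the variability in y explained by the line."""
    y_bar = statistics.mean(y)
    sse = sum(r ** 2 for r in residuals(x, y, intercept, slope))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = fit_line(x, y)
print(intercept, slope)
print(residuals(x, y, intercept, slope))
print(r_squared(x, y, intercept, slope))
```

Predictions from the fitted line are only trustworthy within the range of x values in the data, in keeping with the caution about extrapolation above.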