Lectures

TuTh 12:30-2pm, 60 Evans

Labs

Thursday 2-4pm, 332 Evans
Thursday 4-6pm, 332 Evans

Instructor | Office | Email | Office Hours
Will Fithian | 301 Evans | wfithian@berkeley.edu | Tu 2-3pm, W 11am-12pm in Evans 301

 

GSI | Email | Office Hours
Kevin Attiyeh | kattiyeh@berkeley.edu | M 8-10am, Tu 9-11am in Evans 428

 

There is no book for this class. Instead, we have provided extensive lecture notes. If you would like some additional optional reading, you can try the following books.

  • The Statistical Sleuth: A Course in Methods of Data Analysis by Ramsey and Schafer
  • Introductory Statistics with R by Peter Dalgaard

Neither of these books covers all of the topics we will cover, and neither shares the perspective and focus of this class; in particular, they make little use of bootstrapping and resampling methods. But for students who want some additional structure or help with R, these books may be useful and should be at the right level for this class.

Online R Resources

Downloading RStudio onto your own computer and editing files locally is the recommended way to do assignments, if your computer is capable of doing this.

If you have a Chromebook or no laptop, you can instead use mybinder, an online environment for running RStudio. Be sure to download any file you are editing before you quit the session!

In addition, the following references may be helpful:

There are many more options! If you find a good one that isn’t on the list, let me know.

Syllabus

Note: links will only work once the material is posted (i.e., as the semester progresses). To get the reading material in advance, visit the sites from previous semesters (there is only small variation from year to year).

Week | Description | Chapter | Lab Link | Assignment Due
01 | Boxplots, discrete distributions, intro to continuous distributions | 01 | Lab 1 |
02 | Continuous distributions, density curves, density estimation | 01 | Lab 2 |
03 | Permutation test, t-test and assumptions | 02 | Lab 3 | Hw1 Due (F)
04 | More on assumptions, type I error, multiple testing, Bonferroni corrections | 02 | Lab 4 |
05 | Confidence intervals, Bonferroni corrections, review simple regression | 02, 03 | Lab 5 | Hw2 Due (W)
06 | Polynomial regression, loess curves | 03 | Lab 6 |
07 | Finish loess curves, smooth density plots, pairs plots, alluvial plots, mosaic plots | 03, 04 | Lab 7 | Hw3 Due (M)
08 | Heatmaps, hierarchical clustering, PCA | 04 | Midterm Review | Hw4 Due (W)
09 | Midterm (M); finish PCA, start multiple regression | 04, 05 | Lab 8 | Project 1 Due (F)
10 | Multiple linear regression, fitting and interpretation, fitted values, residuals, Multiple R-squared, Residual degrees of freedom and residual standard error | 05 | Lab 9 |
 | Spring Break | | |
11 | Multiple regression with categorical explanatory variables and interactions, Inference in multiple regression: F-tests via the anova function | 05 | Lab 10 |
12 | Inference in multiple regression: t-tests, standard errors, confidence intervals and prediction intervals. Variable selection in linear regression. Regression diagnostics | 05 | Lab 11 | Hw5 Due (M)
13 | The Classification problem and logistic regression, interpretation in terms of odds, binary predictions via confusion matrices, precision and recall, deviance, variable selection via AIC | 06 | Lab 12 | Hw6 Due (W)
14 | Regression trees, classification trees and Random Forests | 07 | Lab 13 | Project 2 Due (F)
15 | Reading and recitation week: no class | | | Project 3 Due (F)