In Spring 2017, this course was offered under the number ‘STAT 28’.

Lectures

MWF 11-12:00pm, Barrows 20

Labs

Thursday 2-4pm, 342 Evans
Thursday 4-6pm, 342 Evans

Instructors Office Email Office Hours
Adityanand Guntuboyina 423 Evans Hall agun@berkeley.edu W 2-4
Elizabeth Purdom 433 Evans Hall epurdom@stat.berkeley.edu W 2-4

GSI Email Office Hours
Boying Gong jorothy_gong@berkeley.edu M 4-6, 342 Evans and F 9-11am, 444 Evans

There is no book for this class. Instead, we have provided extensive lecture notes. If you would like some additional optional reading, you can try the following books.

  • The Statistical Sleuth: A Course in Methods of Data Analysis by Ramsey and Schafer
  • Introductory Statistics with R by Peter Dalgaard

Neither of these books covers all of the topics we will cover, nor do they share the perspective and focus of this class; in particular, they make little use of bootstrapping and resampling methods. But for students who want some additional structure or help with R, these books may be helpful and should be at the right level for this class.

Syllabus

Week Description Chapter Lab Link Assignment Due
01 Boxplots, discrete distributions, intro to continuous distributions 01 Lab 1  
02 Continuous distributions, density curves, density estimation 01 Lab 2  
03 Permutation test, t-test and assumptions 02 Lab 3 Hw1 Due (F)
04 More on assumptions, type I error, multiple testing, Bonferroni corrections 02 Lab 4  
05 Confidence intervals, Bonferroni corrections, review simple regression 02, 03 Lab 5 Hw2 Due (W)
06 Polynomial regression, loess curves 03 Lab 6  
07 Finish loess curves, smooth density plots, pairs plots, alluvial plots, mosaic plots 03, 04 Lab 7 Hw3 Due (M)
08 Heatmaps, hierarchical clustering, PCA 04 Midterm Review Hw4 Due (W)
09 Midterm (M). Finish PCA, start multiple regression 04, 05 Lab 8 Project 1 Due (F)
10 Multiple linear regression, fitting and interpretation, fitted values, residuals, Multiple R-squared, Residual degrees of freedom and residual standard error 05 Lab 9  
  Spring Break      
11 Multiple regression with categorical explanatory variables and interactions, Inference in multiple regression: F-tests via the anova function 05 Lab 10  
12 Inference in multiple regression: t-tests, standard errors, confidence intervals and prediction intervals. Variable selection in linear regression. Regression diagnostics 05 Lab 11 Hw5 Due (M)
13 The Classification problem and logistic regression, interpretation in terms of odds, binary predictions via confusion matrices, precision and recall, deviance, variable selection via AIC 06 Lab 12 Hw6 Due (W)
14 Regression trees, classification trees and Random Forests 07 Lab 13 Project 2 Due (F)
15 Reading and recitation week: no class     Project 3 Due (F)