What are the prerequisites?
Why can I not enroll in Stat 131A?
Who should take Stat 131A?
I want to learn R, is Stat 131A a good course for that?
Does Stat 131A satisfy requirements of a major or minor?
How does Stat 131A compare to the historical versions of 131A (pre Fall 2019)?
How does Stat 131A compare to other courses offered by the Statistics department?

What are the prerequisites?

All students are required to have taken

  • STAT C8/CS C8 (i.e. DATA 8) or STAT 20
  • One semester of calculus (Math 1A/10A/16A)

These are all “real” requirements, in the sense that we will at times depend on material from these courses. In particular, in addition to the statistics content taught in DATA 8/STAT 20, we assume students have had the introduction to programming that is taught in these courses (in either Python or R), so that we do not need to reteach basic programming ideas. If you have not taken these courses, you must speak to the instructor to determine whether you can take this course.

Recommended courses: Students with little programming experience beyond these courses will probably find it helpful to have taken STAT 33 or Stat 133 (or to be taking one of them currently), so as to have better fluency in programming and R. The Stat 131A course will introduce the necessary concepts and does not assume this background, so these courses are not necessary. But students with limited programming background will find the programming aspects of the course much easier if they have taken Stat 33 or Stat 133.

Note that we are assuming the version of STAT 20 taught since Fall of 2018, in which working with data in R is taught. If you took STAT 20 before that time, the enrollment system will allow you to enroll, but you will be lacking critical background unless you have taken STAT 33 or STAT 133 or some other introduction to R programming.

If you cannot enroll in Stat 131A

If the pre-requisite courses described above do not appear on your Berkeley transcript, the enrollment system will not allow you to enroll.

If you believe you meet these pre-requisites through other coursework or experience (e.g. you are a graduate student who has taken similar courses in another university), you should email the instructor to see if the instructor will provide a waiver to allow you to enroll in advance of the first day of class. You MUST put 131A in the subject of your email.

If you have not met these pre-requisites but still wish to take the course, you should speak with the instructor on the first day of class, after hearing more about the pre-requisites and the policies of the instructor. The instructor will not provide waivers before the first day of class for students who have not satisfied these pre-requisites.

If you are a student from outside of Berkeley (e.g. from Berkeley Extension), you enroll through a different enrollment system (these are “concurrent enrollments”). All such students have to be approved by the instructor regardless of pre-requisites, and by Berkeley policy this can only be done once all Berkeley students have been given a chance to enroll. Please attend the first class to learn more about the class and how the instructor will handle concurrent enrollments. The instructor will not address any of these requests until after the first day of class.

Who should take Stat 131A?

The audience for this class is envisioned to be

  1. Students wishing to get a data science minor
  2. Pre-majors in data science related fields who want to get more experience with data analysis or
  3. Majors outside of statistics/computer science who would like to gain greater statistical skills for data analysis

The course is intended to be a next step for students who have taken an introductory statistics or data science course, but are not ready (or do not ever plan) to take the foundational courses needed for the more advanced courses in the fields of statistics or data science. The course exposes students to a wide range of statistical methods, including topics that are often only seen in more advanced courses. We focus on imparting a firm understanding of the context in which methods should be used and of how to appropriately interpret their results, rather than an in-depth treatment of the underpinnings of the methods. The goal is that students walk away with an understanding of a broad array of the tools commonly seen in practice.

I want to learn R, is Stat 131A a good course for that?

Students will use R in Stat 131A and learn the relevant commands for carrying out the statistical methods we cover. However, there is no intensive instruction on the mechanics of R. Stat 133 (or Stat 33) is the right course to take to learn R. Indeed, having already taken one of those courses will be helpful for this course, but is not necessary.

In particular, students who have taken a wide range of statistics courses but are looking for more experience in R should NOT take this course. This is a course for students with only an introductory statistics background.

Does Stat 131A satisfy requirements of a major or minor?

Stat 131A does not satisfy any requirements for the majors in Statistics or Data Science.

Stat 131A does satisfy the requirements of a Data Science minor, specifically the 2-course core pathway track; see https://data.berkeley.edu/academics/undergraduate-programs/data-science-minor for more information.

131A also satisfies quantitative or statistical requirements in some other majors, but you would need to check with your major of interest to determine this.

How does it compare to the historical versions of 131A (pre Fall 2019)?

STAT 131A was historically offered every semester by the Statistics department under the title “Statistics for Life Sciences”. This previous iteration of 131A was basically an introductory statistics course, similar in content to Stat 20.

Starting in the Fall of 2019, the STAT 131A course has been dramatically modified and is now entitled “Statistical Methods for Data Science”. Unlike the previous version, STAT 131A is now intended to be taken after an introductory course in statistics (one that involves some introduction to programming). As a result, it covers much more advanced topics, and is appropriate for a student who wants to go beyond just introductory statistics material.

Why the change? (and what happened to STAT 28?)

The previous version of STAT 131A was overly duplicative of introductory material already offered in lower-level classes. By changing this course, we are now able to offer an upper-division course that covers more advanced material without the extensive background required of our other upper-division courses (courses intended for majors in Statistics or Data Science).

The current version of STAT 131A uses the materials that were developed for the course STAT 28. The material taught in STAT 28 is now covered in STAT 131A. The change was made because the material previously taught in STAT 28 was judged to be too advanced and fast paced for a lower-level course. In addition, STAT 131A now has a pre-requisite of calculus that was not required for STAT 28.

How does it compare to other courses offered by the Statistics department?

Here is a brief description of other courses offered in the statistics department that could be taken after Data 8/STAT 20 and how they compare to STAT 131A.

  • STAT 28: STAT 28 is the previous version of 131A and is no longer offered by the department. The material taught in STAT 28 is now covered in STAT 131A. If you took STAT 28 when it was offered, you should not take STAT 131A.

  • STAT 88: STAT 88 is a 2-unit connector course for DATA 8. It teaches ideas of probability and estimation beyond those taught in DATA 8, and requires calculus background. It can be taken concurrently with DATA 8. It has little overlap with the material in STAT 131A, and in the few places where the two overlap, the treatment in STAT 131A is more applied and much less mathematical.

  • STAT C100/CS C100/DATA 100: DS 100 is an upper-division course cross-listed between CS and STAT and required for the Data Science Major. This intermediate level class bridges between DATA 8 and upper division courses in computer science and statistics as well as methods courses in other fields. In this class, students will master the data science life-cycle and learn many of the basic principles and techniques of data science spanning algorithms, statistics, machine learning, visualization, and data systems.

    The prerequisites for DS100 are DATA 8, Math 54 (linear algebra), and CS 61A or CS 88 (programming). It is intended for advanced sophomores, juniors, and seniors. DS100 will use Python as its programming language. Here is a link to DS 100

    There are some similarities between the topics in STAT 131A and those of DS100. In particular, both treat more advanced methods of data analysis, such as inference, regression, logistic regression, and data visualization. Both treat resampling ideas (bootstrap, permutation tests) for inference.

    STAT 131A does not require as much math or programming background as DS100 and is accessible for students from STAT 20 as well as DATA 8. More generally, its topics reflect this difference:

    • STAT 131A has less emphasis on computing, programming concepts, and mathematics, and does not cover SQL, optimization, regularized regression, or Big Data/Spark.
    • STAT 131A has more emphasis on inference (i.e. p-values and confidence intervals) and traditional parametric models commonly used in domain applications.
  • STAT 133, STAT 134, STAT 135: These three courses are the core courses required for all majors in statistics. These courses cover, respectively, statistical computing (including programming in R), probability, and the theory of statistics. These courses have a much more in-depth treatment of these topics than STAT 131A. STAT 133/134/135 also provide the foundation to take upper-division courses numbered greater than 150 in statistics. These 150 and above upper-division courses offered in the statistics department cover many of the methods described in STAT 131A in greater detail and with greater emphasis on the mathematical understanding of the methods. STAT 131A, on the other hand, is a one-semester survey of some of the important methods among these.

    • STAT 133: The main focus of STAT 133 is statistical computing. STAT 133 teaches R, but also teaches other computational skills, such as the Unix environment, shell scripting, and/or SQL (depending on the instructor), and goes far beyond the computing instruction in STAT 131A. The examples and motivation in STAT 133 are from statistical data analysis, so STAT 133 involves a great deal of analyzing data, and statistical methods to do so are also taught. But the goal of STAT 133 is not a wide-ranging survey of statistical methods; STAT 133 is a deep dive into computing tools in the context of data analysis. The focus of STAT 131A, on the other hand, is understanding the tools for data analysis; the computation is important for being able to do data analysis but, unlike in STAT 133, is not the main topic of lectures, exams, etc. So both courses involve data analysis and computation in R, but with different focuses and depth. STAT 131A and STAT 133 are the two required core courses for students choosing to take the 2-course core pathway for a Data Science minor, and the two courses complement each other.
    • STAT 134 teaches probability and uses calculus. It is in the format of a traditional math class (proofs, exercises, etc.).
    • STAT 135 is an introduction to the mathematical underpinnings of statistical estimation and hypothesis testing. STAT 135 is focused on teaching these ideas in the context of data and real-life situations, and so emphasizes data analysis to complement the mathematical ideas. But STAT 135 remains much more mathematical than either DATA 8 or STAT 131A. STAT 134 is a pre-requisite for STAT 135.
  • STAT 140: STAT 140 is a probability class, like STAT 134. Unlike STAT 134, it requires DATA 8 as a pre-requisite, concentrates on probability techniques particularly applicable to data science, and makes use of simulation (and thus programming knowledge) to teach the topics. Here is a link to STAT 140