Donghui Yan, PhD

Associate Professor




Liberal Arts 394F


University of California, Berkeley, PhD in Statistics


  • Statistics
  • Machine Learning
  • Data Science





Foundational topics in data science. Students will learn a broad range of data science skills applicable across different domains, including social sciences, finance, crime and justice, social networks, and engineering. Students will develop statistical and computational thinking skills, and they will apply these skills to real-world datasets. Specific topics include applied data problems, statistical software, data frames, descriptive statistics, natural language processing, data storage, data merging, linear regression, and data mining. The core skills developed in this course lay a foundation for more advanced coursework in data management, visualization, exploratory data analysis, and machine learning. No prior knowledge of programming or statistics is required.

Application of knowledge discovery and data mining tools and techniques to large data repositories or data streams. This project-based capstone course provides a framework in which students gain both understanding of and insight into the application of knowledge discovery tools and principles to data within their cognate area. This course is intended for data science majors only.

A calculus-based introduction to statistics. This course covers probability and combinatorial problems, discrete and continuous random variables, and various distributions including the binomial, Poisson, hypergeometric, normal, gamma, and chi-square. Moment generating functions, transformations, and sampling distributions are also studied.

A special course to meet the needs of students for material not encountered in other courses. Topics dealt with require the approval of the departmental chairperson.

Continuation of MTH 332. Covers advanced topics in mathematical statistics, including detailed hypothesis testing, linear models, and regression analysis. This course also covers concepts and selected algorithms in machine learning.


Research Interests

  • Statistics
  • Machine Learning
  • Data Mining
  • Data Science

