Donghui Yan, PhD

Associate Professor




Liberal Arts 394F


PhD in Statistics, University of California, Berkeley


  • Statistics
  • Machine Learning
  • Data Science





Foundational topics in data science. Students will learn a broad range of data science skills applicable across different domains, including social sciences, finance, crime and justice, social networks, and engineering. Students will develop statistical and computational thinking skills, and they will apply these skills to real-world datasets. Specific topics include applied data problems, statistical software, data frames, descriptive statistics, natural language processing, data storage, data merging, linear regression, and data mining. The core skills developed in this course lay a foundation for more advanced coursework in data management, visualization, exploratory data analysis, and machine learning. No prior knowledge of programming or statistics is required.

Application of knowledge discovery and data mining tools and techniques to large data repositories or data streams. This project-based capstone course gives students a framework for gaining both understanding and insight into the application of knowledge discovery tools and principles to data within their cognate area. This course is intended for data science majors only.


Research Interests

  • Statistics
  • Machine Learning
  • Data Mining
  • Data Science
