Introduction to Machine Learning and Computational Statistics – DSC6135
University of Rwanda
This course introduces machine learning, probabilistic modeling, and data science. We will cover two major areas of machine learning: supervised learning and unsupervised learning.
Our learning approach will mix conceptual, theoretical, and practical work. We will discuss the motivations behind common probabilistic models, and the properties that determine whether or not such models will work well for a particular task. You will derive the mathematical underpinnings of many common ML approaches and apply those techniques to model real data.
After this course, you will be able to:
- think in principled ways about modeling (understand the whys)
- understand deeply how and why machine learning works
- regularize models
- optimize objective functions
- validate models
- deal with data that is computationally large or small, and statistically small
Prerequisites
- Python programming
- Background in statistics and probability (we will review concepts)
- Some linear algebra and multivariate calculus
Students are expected to write non-trivial programs. Code will be provided in Python.
This is an intensive course of 4 hours per day for 10 days in total. Each day will consist approximately of the following modules:
- Quiz on concepts/correction of homeworks + summary of previous day and questions (30 min)
- Lecture (1h)
- Practical (30 min)
- Break (15 min)
- Lecture (1h)
- Practical (30 min)
- Introduction to Homework (15 min)
Homeworks will be released/described at the end of class, every two days. There will be short quizzes/review of concepts at the beginning of each day.
Requirements and Grading
- (50 pts) 5 mandatory homeworks (every 2 days)
- (30 pts) quizzes
- (15 pts) paper presentation
- (5 pts) participation
Your main deliverable will be homework reports. You’ll be assessed on effort, the sophistication of your technical approach, the clarity of your explanations, the evidence that you present to support your evaluative claims, and the performance of your implementation. A high-performing approach with little explanation will receive little credit, while a careful set of experiments that illuminates why a particular direction turned out to be a dead end may receive close to full credit.
The goal of this course is to instill a strong technical background for you to responsibly apply machine learning in the world. Thus, in addition to the derivations and the practicals, each class will include a story about real-world applications of machine learning. We will also talk about ethical implications of machine learning.
Relatedly, we expect all participants in this course (instructors, staff, and students) to be committed to an open, professional, and inclusive environment. Just like the maths, these qualities take cultivation and effort. We will start with the premise that we are all open-minded people trying our best, and we encourage constructive feedback to improve the course environment.
Many slides and homeworks are attributable to/inspired by:
- Mike Hughes (Tufts University)
- Finale Doshi-Velez (Harvard University)
- Erik Sudderth (University of California)
- James, Witten, Hastie, Tibshirani, Bishop (ISL/ESL books)
For any questions or concerns, please email any of us:
Javier Zazo (jzazo - at - g.harvard.edu)
Weiwei Pan (weiweipan - at - g.harvard.edu)
Melanie F. Pradier (melanie - at - seas.harvard.edu)