MATH 251: Statistical and Machine Learning Classification
Fall 2018, San Jose State University
Course description [Syllabus]
This is an advanced topics course in the machine learning field of classification, with the goals of introducing
- Dimensionality Reduction
- Instance-based Methods
- Discriminant Analysis
- Logistic Regression
- Support Vector Machine
- Kernel Methods
- Ensemble Methods
- Neural Networks
all based on the benchmark dataset of MNIST Handwritten Digits. This teaching strategy was partly inspired by Michael Nielsen's free online book, Neural Networks and Deep Learning, which notes explicitly that this dataset hits a "sweet spot": it is challenging, but "not so difficult as to require an extremely complicated solution, or tremendous computational power". In addition, the digit recognition problem is very easy to understand, yet practically important.
Course progress
| Date | Slides | Further Reading |
|---|---|---|
| 8/22 | Review | |
| 8/27 | Introduction | Final project instructions |
| 9/5 | Instance-based classification | Chapter 2 of textbook 1 |
| 9/12 | PCA [Matrix algebra] | Section 10.2 of textbook 1 |
| 9/24 | LDA (for dimensionality reduction) | Prof. Olga Veksler’s lecture |
| 10/8 | Bayes classifiers | Section 4.4 of textbook 1 |
| 10/15 | Midterm | Midterm solution |
| 10/22 | Logistic regression | Section 4.3 of textbook 1 |
| 10/29 | Support vector machines [Lagrange Dual] | Chapter 9 of textbook 1 |
| 11/19 | Ensemble learning | [Trevor Hastie's slides] [Adele Cutler's lecture] [Chapter 8 of textbook] |
| 11/26 | Neural networks | [Michael Nielsen’s book] [Olga Veksler’s lecture] [Perceptron] |
| 12/5 | Course summary and project information | |
| 12/10 | Final project presentations | |
| 12/12 | Final project presentations (cont'd) | |
More learning resources
Programming languages
- MATLAB:
  - Common MATLAB commands;
  - Online tutorials (see here for a simple one);
  - Statistics and Machine Learning Toolbox Documentation;
- Python:
- R:
Useful course websites
- Prof. Veksler's CS9840a Learning and Computer Vision at University of Western Ontario
- Andrew Ng's CS 229 Machine Learning at Stanford University
- Manik's CSL 864 - Special Topics in AI: Classification at Microsoft
Data sets
- USPS Zip Code Data
- UCI Machine Learning Repository
- LibSVM data sets
- Extended Yale Face Database B
- Oxford Flowers Category Datasets