MATH 253: Mathematical Methods for Data Visualization
San Jose State University, Spring 2020
Course description [Syllabus]
This is a graduate course on dimension reduction for the purpose of data visualization. The course is 70% theory (linear algebra) and 30% programming (for matrix computing and data plotting).
- Programming basics and high quality data plotting in 3D
- Advanced linear algebra
- Dimension reduction techniques
- Principal component analysis
- Multidimensional scaling
- Isomap
- Laplacian eigenmaps
- Fisher linear discriminant
- Introduction to clustering
Textbook
"Foundations of Data Science" [Unofficial version], by Avrim Blum, John Hopcroft, and Ravi Kannan, Cambridge University Press (March 12, 2020). We will use Chapter 3 and Appendix 12.8 of the book.
Additionally, the course will rely on the following papers:
Course progress
Date | Lecture Slides | Further Reading |
---|---|---|
1/23 | Course introduction and overview [slides] | Course syllabus [MATLAB Onramp] |
1/28 | Review of linear algebra and multivariable calculus [slides] | Appendix 12.8 of the textbook (p. 437) |
2/4 | Matrix computing in MATLAB [slides] | [Matrices and Arrays] [Mathworks linear algebra documentation] |
2/11 | High quality data plotting in MATLAB [slides] | |
2/20 | Rayleigh quotient [slides] | |
2/25 | Singular value decomposition of matrices [slides] | Chapter 3 of textbook |
2/27 | Generalized inverse and pseudoinverse [slides] | [Chapter 3 of textbook] [Prof. Sawyer's notes] [Prof. Laub's notes] |
3/3 | Matrix norm and low-rank approximation [slides] | Chapter 3 of textbook |
3/16 | Principal Component Analysis (PCA) [slides] | Tutorial by J. Shlens |
3/26 | Multidimensional Scaling (MDS) [slides] | A book chapter on MDS |
4/7 | | Original paper (Science, 2000) |
4/14 | Linear Discriminant Analysis (LDA) [slides] | Prof. Olga Veksler’s lecture |
4/30 | Laplacian Eigenmaps [slides] | Original paper (Neural Computation, 2003) |
More learning resources
Programming languages
- MATLAB:
- Python:
- Python for data science - learn in 3 days
- Python Introduction and Linear Algebra Review (Stanford lecture)
- NumPy Tutorial with Exercises
- NumPy for Matlab users
- Best Pandas Tutorial | Learn Pandas with 50 Examples
- Sample codes by D. Sarkar for data visualization
- A Guide to Pandas and Matplotlib for Data Exploration
Useful course websites
- Prof. Veksler's CS9840a Learning and Computer Vision at University of Western Ontario
- Andrew Ng's CS 229 Machine Learning at Stanford University
- Manik's CSL 864 - Special Topics in AI: Classification at Microsoft
Data sets
- 20 Newsgroups Data [data] [website]
- MNIST Handwritten Digits [data] [website]
- Fashion-MNIST
- USPS Zip Code Data
- Wine Quality Data Set
- UCI Machine Learning Repository
- Extended Yale Face Database B
- Oxford Flowers Category Datasets