Math 203 Course Page

MATH 203: Applied Mathematics, Computing & Statistics Projects (CAMCOS)

Spring 2017, San Jose State University

Final product

Slides, Report

Toy data

The 20 newsgroups data set [Processed version] (use X_100)

References

Overview of document clustering

  • A Survey of Text Clustering Algorithms [Link]

Dimensionality reduction of document data

  • Latent Semantic Indexing (LSI) [paper]
  • Locality Preserving Indexing (LPI) [Link]

Spectral clustering

Landmark-based spectral clustering (LSC)