Clustering analysis of a subset of the adult dataset.
First, I plot the clustering loss for different numbers of clusters to figure out how many clusters I want.
Using the elbow method I have two choices. I went with 4.
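As a rough illustration of the loss-versus-clusters curve behind the elbow method, here is a minimal sketch. It assumes a k-modes style cost (the number of mismatched categorical values between each row and its cluster mode, in the spirit of the Huang reference); the toy rows and the names `mismatches` and `k_modes_cost` are hypothetical and not taken from the original analysis.

```python
import random
from collections import Counter

# Hypothetical toy rows standing in for categorical columns of the Adult dataset.
rows = [
    ("Private", "Bachelors", "Married"),
    ("Private", "Bachelors", "Married"),
    ("Private", "HS-grad", "Never-married"),
    ("Self-emp", "HS-grad", "Never-married"),
    ("Self-emp", "HS-grad", "Never-married"),
    ("Gov", "Masters", "Married"),
]


def mismatches(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes_cost(rows, k, seed=0, max_iter=20):
    """Run a basic k-modes loop and return the final total cost.

    Assumes k is at most the number of distinct rows.
    """
    rng = random.Random(seed)
    # Initialise the modes from k distinct rows.
    modes = rng.sample(sorted(set(rows)), k)
    labels = None
    for _ in range(max_iter):
        # Assign each row to the nearest mode.
        new_labels = [
            min(range(k), key=lambda j: mismatches(r, modes[j])) for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each mode as the most frequent value per column.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old mode if a cluster is empty
                modes[j] = tuple(
                    Counter(col).most_common(1)[0][0] for col in zip(*members)
                )
    return sum(mismatches(r, modes[lab]) for r, lab in zip(rows, labels))


# Cost for each candidate number of clusters; the "elbow" is where the
# decrease in cost starts to flatten out.
costs = {k: k_modes_cost(rows, k) for k in range(1, 5)}
for k, cost in costs.items():
    print(k, cost)
```

Plotting `costs` against `k` gives the curve used to pick the number of clusters. With real data, an existing k-modes or k-prototypes implementation would likely be a better choice than this hand-rolled loop.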
These are our 4 clusters (stereotypical types of people). Only the last one earns over 50k, which is what the dataset was originally built to predict.
Finally, here is a t-SNE visualization.
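One way such a 2-D embedding could be computed is sketched below. It assumes the categorical columns are one-hot encoded before being passed to scikit-learn's `TSNE`; the randomly generated rows, the category lists, and the perplexity value are illustrative assumptions, not the settings used for the original figure.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical data: 40 random rows over three columns.
rng = np.random.default_rng(0)
workclass = rng.choice(["Private", "Self-emp", "Gov"], size=40)
education = rng.choice(["HS-grad", "Bachelors", "Masters"], size=40)
marital = rng.choice(["Married", "Never-married"], size=40)
rows = np.column_stack([workclass, education, marital])

# One-hot encode so that t-SNE can work with numeric distances.
features = OneHotEncoder().fit_transform(rows).toarray()

# Perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=5, init="random", random_state=0)
embedding = tsne.fit_transform(features)

print(embedding.shape)
```

Each row of `embedding` is a 2-D point; a scatter plot of these points coloured by cluster label gives a picture of how well the clusters separate.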
Z. Huang. Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 2(3):283–304, 1998


