# 6.338 Progress Presentation

Published on December 28, 2007

## The Matrix: Using Intermediate Features to Classify and Predict Friends in a Social Network

Michael Matczynski
6.338 Status Report
April 14, 2006

## Vision

To classify users in a social network such as facebook.com successfully, we should leverage intermediate features.

## Steps

1. Gather profile, friend, and group data from all MIT users on facebook.com
2. Build graph
3. Develop PageRank algorithm to determine profile popularity
4. Generate intermediate features from profiles
5. Develop algorithm to identify similarities between all users
6. Develop online interface for users

## 1. Gather Data

- Gathered data from 11,744 MIT profiles
- Profile data (major, living group, etc.)
- Friend information (to build the graph)

## 2. Build Graph

- Due to privacy settings, not all friend information is available
- Nonetheless, because a friendship link is undirected, the friends of users with strict privacy settings can mostly be deduced

## 3. PageRank Algorithm

- Google’s PageRank algorithm determines important nodes of a graph by using each link as a vote for that particular node
- Run time: <1 sec / iteration
- PageRank converges within 20 iterations
- Results: due to the undirected nature of social networks, PageRank is highly correlated with number of friends
  - Not that useful

## 4. Generate Intermediate Features from Profiles

## 5. Identify Similar Users

- Modified PageRank algorithm
  - One network for each attribute (e.g. music)
  - Resulting PageRank would indicate clusters of similar interest
- Neural networks
  - Train neural network with known friends and learn about similarities / classifications

## 6. Online Interface

- If interesting results emerge, develop an online interface so members of the MIT community can learn about themselves

## Next Steps

- Generate intermediate features
- Determine classification algorithm
- Parallel computation
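Steps 2 and 3 above (building the graph and running PageRank) can be illustrated with a minimal sketch. This is not the code behind the presentation. The function names `build_undirected_graph` and `pagerank`, the damping factor of 0.85, the convergence tolerance, and the four-user toy data are all assumptions made for illustration. The sketch shows how a hidden friend list can be recovered from the profiles that mention that user, and why PageRank on an undirected friendship graph tends to track the number of friends.

```python
from collections import defaultdict


def build_undirected_graph(visible_friends):
    """Symmetrize partially visible friend lists into an undirected graph.

    visible_friends maps a user to the friends shown on that user's profile.
    A user with strict privacy settings may have an empty set; that user's
    links are recovered from the profiles that list them as a friend.
    """
    graph = defaultdict(set)
    for user, friends in visible_friends.items():
        graph[user]  # touch the key so isolated users still become nodes
        for friend in friends:
            if friend == user:
                continue  # ignore self-links
            graph[user].add(friend)
            graph[friend].add(user)  # friendship is mutual
    return dict(graph)


def pagerank(graph, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank on an undirected adjacency dict.

    damping=0.85 is a commonly used value, assumed here rather than
    taken from the presentation.
    """
    nodes = sorted(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by friendless nodes is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {}
        for node in nodes:
            # Each friend splits its rank equally among its own friends.
            votes = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1.0 - damping) / n + damping * (votes + dangling / n)
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy data: dave hides his friend list, but alice lists him.
visible = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": set(),
}

graph = build_undirected_graph(visible)
ranks = pagerank(graph)

for user in sorted(graph, key=ranks.get, reverse=True):
    print(f"{user}: friends={len(graph[user])} pagerank={ranks[user]:.3f}")
```

In this toy graph the PageRank ordering matches the ordering by friend count, which is consistent with the observation in step 3 that PageRank adds little beyond the number of friends when every link is undirected.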

