In an earlier post I mentioned that I have been playing around with the YouTube API to find out how videos might be connected.
I have been able to find music videos related to each other within 5 or 6 levels of suggested videos. Strangely, my own videos on my YouTube channel don't show up as being related.
The first simple, obvious optimisation that I included in the YouTube network path search was to skip looking up suggestions for videos that had already been checked. Out of interest I kept a record of these duplicate suggestions and noticed that the percentage seemed quite high. Now I am a little curious about how common duplicates are in other types of networks.
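To make the optimisation concrete, here is a minimal sketch of a breadth-first path search that skips videos it has already seen and counts how many suggestions turn out to be duplicates. It is not the code from my search: the function name find_path, the get_suggestions callable and the toy graph are all made up for illustration, and a real version would call the API instead of reading from a dictionary.

```python
from collections import deque


def find_path(start, target, get_suggestions, max_depth=6):
    """Breadth-first search from start to target.

    get_suggestions is a hypothetical callable that returns the list of
    suggested video ids for a given video id. Returns a tuple of
    (path or None, duplicate suggestions seen, total suggestions seen).
    """
    seen = {start}                      # videos already checked or queued
    queue = deque([(start, [start])])   # (video id, path taken to reach it)
    duplicates = 0
    total = 0

    while queue:
        video, path = queue.popleft()
        if video == target:
            return path, duplicates, total
        if len(path) - 1 >= max_depth:
            continue                    # do not expand beyond max_depth levels
        for suggestion in get_suggestions(video):
            total += 1
            if suggestion in seen:
                duplicates += 1         # already known, so skip the lookup
                continue
            seen.add(suggestion)
            queue.append((suggestion, path + [suggestion]))

    return None, duplicates, total


# Toy data standing in for API responses (assumption, not real video ids).
toy_graph = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a", "d"],
    "d": ["e"],
    "e": [],
}

path, duplicates, total = find_path("a", "e", lambda v: toy_graph.get(v, []))
print(path, f"{duplicates} of {total} suggestions were duplicates")
```

Even in this tiny made-up graph, 3 of the 7 suggestions examined are duplicates, which gives a feel for how quickly they can pile up.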
Twitter is the obvious candidate for exploring social networks, as most things are public and there is a simple RESTful API to navigate.
I may need to set up some rules to recognise someone as a celebrity or promotional / marketing account if they have a very high number of followers. Likewise, I could treat someone as a spammer if they follow an exorbitant number of accounts.
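Rules like these could start out as simple thresholds. The sketch below is only an illustration: the function name classify_account, the labels and both cut-off numbers are arbitrary assumptions, not values taken from Twitter or from any data I have collected.

```python
# Hypothetical cut-offs, chosen purely for illustration.
CELEBRITY_FOLLOWERS = 100_000   # "very high number of followers"
SPAMMER_FOLLOWING = 5_000       # "exorbitant number of accounts" followed


def classify_account(followers, following):
    """Label an account from its follower and following counts."""
    if followers >= CELEBRITY_FOLLOWERS:
        return "celebrity_or_promo"
    if following >= SPAMMER_FOLLOWING:
        return "spammer"
    return "regular"


print(classify_account(2_000_000, 150))   # celebrity_or_promo
print(classify_account(40, 12_000))       # spammer
print(classify_account(300, 280))         # regular
```

The follower check runs first, so an account that is both heavily followed and follows a huge number of others is treated as a celebrity or promotional account rather than a spammer. That ordering is a judgment call that real data might overturn.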
If I'm feeling particularly motivated to apply what I've been learning about recently, I could even import the data into a MongoDB database, set up some indexes and run some queries. (The prevalence of JSON formatting in these platforms makes this easier than you might think.)