Wednesday, 9 May 2012

Analysing social networks and recommendations

In an earlier post I mentioned that I have been playing around with the YouTube API to work out how videos might be connected.

I have been able to find music videos related to each other within 5 or 6 levels of suggested videos.  Strangely, my own videos on my YouTube channel don't show up as being related.

The first simple, obvious optimisation that I included in the YouTube network path search was to skip looking up suggestions for videos that had already been checked.  Out of interest I kept a record of these duplicate suggestions and noticed that the percentage seemed quite high.  Now I am a little curious about how common duplicates are in other types of networks.
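As a rough illustration of that optimisation, here is a minimal Python sketch of a breadth-first path search that skips videos it has already seen and counts how often that happens. It is not the code from the original search: the function name find_path, the get_suggestions callable (standing in for whatever API call returns suggested video IDs) and the toy graph are all assumptions made for the example.

```python
from collections import deque


def find_path(start, target, get_suggestions, max_depth=6):
    """Breadth-first search from start to target over suggested videos.

    get_suggestions is a hypothetical callable that takes a video ID and
    returns a list of suggested video IDs. Returns (path or None, stats).
    """
    visited = {start}                 # videos already queued or checked
    queue = deque([(start, [start])])
    total = 0                         # every suggestion returned
    duplicates = 0                    # suggestions that were already seen

    while queue:
        video, path = queue.popleft()
        if video == target:
            return path, {"total": total, "duplicates": duplicates}
        if len(path) - 1 >= max_depth:
            continue                  # do not expand beyond max_depth levels
        for suggestion in get_suggestions(video):
            total += 1
            if suggestion in visited:
                duplicates += 1       # record the duplicate, skip the lookup
                continue
            visited.add(suggestion)
            queue.append((suggestion, path + [suggestion]))

    return None, {"total": total, "duplicates": duplicates}


# Toy in-memory graph standing in for real API responses (made-up IDs).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["b", "d"],
    "d": ["e"],
    "e": [],
}

path, stats = find_path("a", "e", lambda video: toy_graph.get(video, []))
duplicate_rate = stats["duplicates"] / stats["total"]
```

Even in this tiny made-up graph half of the suggestions point back at videos that were already seen, which is the kind of figure the duplicate record makes visible.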

Twitter is the obvious candidate for exploring social networks, as most things are public and there is a simple RESTful API to navigate.

I may need to set up some rules to recognise someone as a celebrity or promotional / marketing account if they have a very high number of followers.  Likewise, I could treat someone as a spammer if they follow an exorbitant number of accounts.
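Rules like these could start out as simple thresholds. The sketch below is only an illustration: the function name classify_account, the label strings and both threshold values are placeholders I have assumed, not figures taken from Twitter or from any measurements, and checking the follower rule first is an arbitrary choice.

```python
def classify_account(followers, following,
                     celebrity_min_followers=100_000,
                     spammer_min_following=5_000):
    """Label an account using two follower-count rules.

    Both thresholds are arbitrary placeholders and would need tuning
    against real data.
    """
    if followers >= celebrity_min_followers:
        return "celebrity_or_promotional"
    if following >= spammer_min_following:
        return "possible_spammer"
    return "ordinary"


labels = [
    classify_account(followers=2_500_000, following=300),
    classify_account(followers=40, following=9_000),
    classify_account(followers=150, following=200),
]
```

Accounts labelled as celebrity or spammer could then be excluded from the network walk, or simply not expanded any further.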

If I'm feeling particularly motivated to apply what I've been learning about recently, I could even import the data into a MongoDB database, set up some indexes and run some queries.  (The prevalence of JSON formatting in these platforms makes this easier than you might think.)
