Feeds That Matter: A Study of Bloglines Subscriptions
Akshay Java, Pranam Kolari, Tim Finin, Anupam Joshi, Tim Oates
Outline
• Background and Motivation
• Bloglines General Statistics
• Grouping Related Topics
• Applications
• Conclusion
Bloglines Feed Reader
Folders
• Use the folder label as an approximation for the topic.
• Group similar folders together.
• Rank feeds under a “topic”.
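The ranking step can be illustrated with a minimal sketch. It assumes subscriptions are available as (user, folder label, feed) triples and that a feed's rank under a topic is simply the number of distinct users who filed it in that folder. The sample data, the `rank_feeds_by_topic` helper, and the lowercase normalization are hypothetical and not taken from the study.

```python
from collections import Counter, defaultdict

# Hypothetical input: (user, folder_label, feed_url) triples from public subscription lists.
subscriptions = [
    ("u1", "Politics", "http://example.org/a"),
    ("u2", "politics", "http://example.org/a"),
    ("u2", "politics", "http://example.org/b"),
    ("u3", "Knitting", "http://example.org/c"),
]

def rank_feeds_by_topic(triples):
    """For each normalized folder label, rank feeds by how many distinct users filed them there."""
    seen = set()
    counts = defaultdict(Counter)
    for user, folder, feed in triples:
        topic = folder.strip().lower()  # assumed normalization: trim and lowercase
        key = (user, topic, feed)
        if key in seen:  # count each user at most once per (topic, feed)
            continue
        seen.add(key)
        counts[topic][feed] += 1
    # most_common() sorts feeds by descending subscriber count.
    return {topic: c.most_common() for topic, c in counts.items()}

ranking = rank_feeds_by_topic(subscriptions)
```

Here `ranking["politics"]` lists the feed filed by two users ahead of the feed filed by one.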
Motivation
• Study user-generated tags in feed reader subscriptions
• Find relevant blogs about a topic
• Need labeled training data for building text classifiers for different topics
Tag Cloud generated by using folder names and merging related folders
Bloglines General Statistics
• 83K publicly listed subscribers
• 2.8M subscribed feeds, of which 500K are unique
• 26K users (35%) use folders to organize subscriptions
• Data collected in May 2006
Although there may be 50M+ blogs, only a small fraction receive continued user attention in the form of subscriptions.
Users also subscribe to Web 2.0 content such as Flickr, Delicious, Technorati, and Google searches.
Bloglines General Statistics
Feed subscriptions follow a power-law distribution.
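A common quick check for power-law behavior is to plot the number of feeds against subscriber count on log-log axes and look at the slope of the resulting line. The sketch below fits that slope by ordinary least squares on synthetic data; it is only a rough diagnostic, not a rigorous fit, and neither the data nor the `loglog_slope` helper comes from the study.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    var = sum((a - mx) ** 2 for a in lx)
    return cov / var

# Synthetic data (not the Bloglines data): number of feeds with exactly k subscribers,
# generated from count = 10000 * k**-2 for k = 1..50.
ks = list(range(1, 51))
feed_counts = [10000.0 * k ** -2 for k in ks]

# For data that follows count ~ k**(-alpha), the slope estimates -alpha.
slope = loglog_slope(ks, feed_counts)
```

On this synthetic input the fitted slope recovers the exponent used to generate it; real subscription counts would be noisier, especially in the tail.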
Bloglines General Statistics
• Most users subscribe to a modest number of feeds
• Most users have only a few folders
• User attention is limited
Bloglines General Statistics
As subscriptions increase, users tend to organize them into folders.
Bloglines General Statistics
Figure labels (folder names): technologica, Musica, Weather, Foreign Language, Email, Mailing List, Tracking
A folksonomy emerges from the folder names. Many users use popular folder names to classify feeds.
Tag Cloud Before Merge
Tag Cloud After Merge
Folder names are used as topics. A lower-ranked folder is merged into a higher-ranked folder if the two overlap and have a high cosine similarity.
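The merge rule can be sketched as follows. This is a minimal illustration under several assumptions not stated on the slides: each folder is represented as a vector of feed counts, "overlap" means the folders share at least one feed, folders are ranked by a popularity score, and the 0.5 threshold is arbitrary. The `merge_folders` helper and the sample data are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def merge_folders(folder_feeds, folder_rank, threshold=0.5):
    """Map each folder to the folder it is merged into (itself if it is kept)."""
    # Visit folders from highest to lowest rank so targets are always higher ranked.
    ordered = sorted(folder_feeds, key=lambda f: folder_rank[f], reverse=True)
    merged_into = {}
    kept = []
    for folder in ordered:
        target, best = folder, threshold
        for candidate in kept:
            # A positive cosine already implies the folders share at least one feed.
            sim = cosine(folder_feeds[folder], folder_feeds[candidate])
            if sim >= best:
                target, best = candidate, sim
        merged_into[folder] = target
        if target == folder:
            kept.append(folder)
    return merged_into

# Hypothetical data: feed counts per folder and a popularity score per folder.
folder_feeds = {
    "politics": Counter({"feedA": 40, "feedB": 30, "feedC": 5}),
    "political": Counter({"feedA": 8, "feedB": 6}),
    "knitting": Counter({"feedX": 20, "feedY": 10}),
}
folder_rank = {"politics": 100, "political": 20, "knitting": 30}
merged = merge_folders(folder_feeds, folder_rank)
```

In this example "political" is folded into "politics", while "knitting" shares no feeds with either and stays separate.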
Merging Tags
Interesting Cases:
• Music vs. Musica: English and Spanish music sites
• Podcasting vs. Podcasts: one refers to the tools for podcasting, while the other refers to feeds containing podcasts
• Regional Interests: China, Japan, India, etc.
• Foreign Language: Spanish, German
Feeds That Matter
Top Feeds for “Politics”
Merged folders: “political”, “political blogs”
• Talking Points Memo: by Joshua Micah Marshall
• Daily Kos: State of the Nation
• Eschaton
• The Washington Monthly
• Wonkette, Politics for People with Dirty Minds
• http://instapundit.com/
• Informed Comment
• Power Line
• AMERICAblog: Because a great nation deserves the truth
• Crooks and Liars
Top Feeds for “Knitting”
Merged folders: “knitting blogs”
• Yarn Harlot
• Wendy Knits!
• See Eunny Knit!
• the blue blog
• Grumperina goes to local yarn shops and Home Depot
• You Knit What??
• Mason-Dixon Knitting
• knit and tonic
• Crazy Aunt Purl
• http://www.lollygirl.com/blog/
Most Subscribed Feeds, Top Folders
Top Feeds
1. Bloglines
2. Wired
3. Slashdot
4. Boing Boing
5. Dilbert
6. Gizmodo
7. Engadget
8. Official Google Blog
9. A List Apart
10. News: CNN, Reuters, Moreover
Top Folders
1. News
2. Blogs
3. Tech
4. Comics
5. Politics
6. Podcasts
7. Design
8. Sports
9. Science
10. Business
FTM! Site
• Explore popular topics
• Subscribe to interesting feeds
If you like X you will like…
http://ftm.umbc.edu
Feed Recommendation (Method 1)
• Two feeds are similar if they are categorized under similar folders
Figure panels: Technology, Business, Politics, Knitting
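One simple way to make "categorized under similar folders" concrete is to compare the sets of folder labels under which each feed has been filed. The sketch below uses Jaccard similarity over those sets; this measure is an assumption chosen for illustration, not necessarily the one used by FTM!, and the sample data and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (folder_label, feed) pairs aggregated over all users.
filed = [
    ("technology", "feedA"), ("gadgets", "feedA"),
    ("technology", "feedB"), ("gadgets", "feedB"), ("business", "feedB"),
    ("knitting", "feedC"),
]

def folders_by_feed(pairs):
    """Index each feed by the set of folder labels it appears under."""
    index = defaultdict(set)
    for folder, feed in pairs:
        index[feed].add(folder)
    return index

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(feed, index, top_n=3):
    """Return up to top_n other feeds that share folders with `feed`, most similar first."""
    own = index.get(feed, set())
    scores = [(other, jaccard(own, folders)) for other, folders in index.items() if other != feed]
    scores = [s for s in scores if s[1] > 0]  # drop feeds with no shared folder
    return sorted(scores, key=lambda s: s[1], reverse=True)[:top_n]

index = folders_by_feed(filed)
suggestions = recommend("feedA", index)
```

For "feedA" the only suggestion is "feedB", which shares two of its three folders; "feedC" shares none and is dropped.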
Feed Recommendation (Method 2)
• Start with a seed set from FTM!
• Using the graph from the WWE dataset, find nodes influenced by the seed set
• Find other blogs frequently co-cited by the followers
Blogs influenced by seed set
Feed Recommendation Using Co-citation
Figure panels: Politics, Knitting
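The co-citation idea can be sketched on a toy link graph. The sketch assumes the graph is available as a mapping from each blog to the set of blogs it cites, and it approximates "influenced by the seed set" as "links directly to at least one seed", which is a simplification of whatever influence model was applied to the WWE data. The graph and the `cocited_with` helper are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: blog -> set of blogs it cites.
links = {
    "f1": {"seed1", "x", "y"},
    "f2": {"seed1", "seed2", "x"},
    "f3": {"seed2", "x", "z"},
    "other": {"y", "z"},
}

def cocited_with(seeds, links, top_n=5):
    """Count how often non-seed blogs are cited by blogs that also cite a seed."""
    seeds = set(seeds)
    # Followers: blogs outside the seed set that cite at least one seed (assumed proxy for influence).
    followers = [b for b, cited in links.items() if cited & seeds and b not in seeds]
    counts = Counter()
    for b in followers:
        counts.update(links[b] - seeds)  # everything else this follower cites
    return counts.most_common(top_n)

recs = cocited_with({"seed1", "seed2"}, links)
```

Blog "x" is cited by all three followers of the seeds and so ranks first, while "other" contributes nothing because it cites no seed.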
Conclusions
• Folder labels can be used to produce an intuitive set of topics for feeds or blogs.
• Subscription information combined with simple techniques can be quite effective in ranking blogs for a topic.
• Many useful applications, such as feed recommendation and meme trackers, can benefit from this data.
“Want to find a few good feeds? Try Feeds That Matter, an interesting grouping of publicly listed feeds at Bloglines” - delicious user skyamese
It brings you popular feeds from Bloglines in different categories and I found almost all the popular feeds in appropriate categories out there. Worth paying a visit – netgautam blogger
University Study Reveals Rich Data on Bloglines Feeds Feeds That Matter is a fascinating new analysis project out of UMBC and a terrific way to find new RSS feeds to subscribe to.. - Steve Rubel, Micropersuasion blog
Provides a "swarm" with keywords on subjects which will take you to a list of blogs/sites relating to that keyword. All are rss feeds - delicious user damenjoe
Find how to classify your feeds and find new feeds based on tags - delicious user inf
Nothing better to read online? Feeds that matters gives you loads of highly rated feeds in all category
Thanks!
Links to loads of good RSS feeds. - hmspolio
…great source for some quality content for a blog or just for browsing. - Blendedblog blogger
…find information and resources that have already been filtered by like minded people – Tryangulation blog
…it's a great example of a technique for extracting useful metadata from the world - JD on EP blog
kind of a meta blog - delicious user frontporsche
600+ bookmarks on delicious & more…
Easy way to find good blogs - delicious user kc144
Backup
Feed Recommendation (Method 2)
• Starting with a seed set from FTM!, find other influential feeds in the Blogpulse data using co-citations.
www.dailykos.com
Blogs influenced by seed set