I have been using Google Reader on and off since it was created. I would use it for a month, get flooded with more content than I could sort through, and simply stop reading, letting my queue build into the thousands. I would then declare reader bankruptcy, prune my feeds, and start over, only to give up again a month later. Eventually I gave up entirely, and now I rely on Twitter to filter in meaningful content for me. I follow a good mix of generalists and specialists who keep me apprised of both popular and esoteric content across most of my interests. But it is still lacking: the signal-to-noise ratio is low. Popular content quickly saturates content streams while good but esoteric content is drowned out. I am already at the upper limit of how many people I can pay attention to, yet I follow few urban planners or architects, anthropologists or sociologists. So even though I am deeply interested in those fields, I do not read much content from them.
Others, such as Adam Gurri, say that feeds are not for following every interesting content source but for tracking the important ones[1]. And when it comes to what we want to make sure we read, feed readers work well. But they only collate the sources we have already curated for ourselves; they do not help us analyze and sift what we have, or discover new content. There is too much good content for any single person to handle meaningfully. Short of hiring full-time researchers to sift through it all and find the most apt articles and research studies, there is no good way to track everything that is meaningful to my current work, let alone archive the things that are not but could be useful in the future.
Personal archives are not enough
Search engines such as Google are getting better at helping us find the things we know we want, but they are far weaker at helping us find the things we don’t know we need (this is why Google is investing so much energy into Google+: social relationships are core to how we find new interests and obsessions). Generalized search engines are also still terrible at digging through our personal history (I’m not just talking about browser history) to find the things we vaguely remember: the blogs we bookmarked for future reading but never returned to, the ideas we have been sleeping on and thinking about for weeks, months, or years.
The things we write, and the things everyone else sees, hears, and reads, are just the tip of the iceberg of our thought. Alfred North Whitehead once said that philosophy was all a footnote to Plato. While that may not be true, we are certainly products of our culture and personal histories. Owning our own data is not enough. If we want to take full advantage of what we have, we need tools to help us connect our ideas with the ideas we’ve encountered, and to make connections between the ideas we’ve encountered and the ideas we haven’t.
We need digital anti-libraries
The writer Umberto Eco belongs to that small class of scholars who are encyclopedic, insightful, and nondull. He is the owner of a large personal library (containing thirty thousand books), and separates visitors into two categories: those who react with “Wow! Signore professore dottore Eco, what a library you have! How many of these books have you read?” and the others - a very small minority - who get the point that a private library is not an ego-boosting appendage but a research tool. Read books are far less valuable than unread ones. The library should contain as much of what you do not know as your financial means, mortgage rates, and the currently tight real-estate market allow you to put there. You will accumulate more knowledge and more books as you grow older, and the growing number of unread books on the shelves will look at you menacingly. Indeed, the more you know, the larger the rows of unread books. Let us call this collection of unread books an antilibrary[2].
Personal libraries are not just for the books we have read; they should also hold what we haven’t read yet. We already have a few tools, such as Readability, Instapaper, and Pocket, but they are not enough: they are built for delaying reading, not archiving for future reference. People are also starting to build tools for archiving pages for future reference (such as Pinboard), but we have to tell those tools about each individual piece we want to save; we cannot hand them a source to keep tracking and saving on our behalf. That is exactly what we need to do with our RSS feeds.
Tools such as news.google.com, Prismatic, and Flipboard help us sort what is meaningful to us, discover new avenues, and broaden our scope, but they emphasize the new. They are poor at bringing old material to light. A digital anti-library of articles we have not read is like Eco’s library: a research tool that helps us map the unknown. It helps give us a foothold into exploring the things we know may be important to us, but have not yet studied. We can keep track of the topics that only tangentially matter to our primary projects, or quickly update ourselves on topics we only deal with occasionally.
Digital catalogs: Beyond Putnam and Dewey
To merely archive and store our content sources is not enough. We need to organize them usefully and build new ways to browse and explore what we have. Anyone who has done extensive research in an academic library knows the value of the Library of Congress Classification system (I lived in BD220 and JC578 my senior year of college). In a physical library, a good cataloging system can be the difference between scanning up and down a bookshelf and running up and down several flights of stairs. In digital collections, searching is much easier, but finding can be much harder. For example, while it is easy to find the book on typography that Robert Bringhurst wrote (The Elements of Typographic Style), it is much harder to find the books by that famous infographics guy who helped create graphics for the White House (Edward Tufte). In a research library using LCC, I could simply go to Z246 (typography) or QA273.3 (data visualization) and browse.
The Internet is not well-cataloged. But it is organized…sort of. At its core, the Internet is a giant network of ideas. Words, ideas, pictures, videos: they all link to one another in some way. There is no need to catalog an interdisciplinary article into a single field. We do not need to decide whether a blog post on the social effects of starchitecture should fit under architecture, urban design, or sociology; we can catalog and tag it in as many fields and subject areas as it reasonably fits. Through links, related words and phrases (n-grams), images, linkbacks, comments, and metadata, we can generate a mass of relationship data among much of what is published. Even the simplest relationships, such as shared keywords and tags, can offer a wealth of pertinent and likely related information. But we should not stop there. With full-text analysis, we can see whether articles and blog posts link to one another, and offer first-degree relationships. We can check citations within journal articles to find even more. Ditto author relationships. Ditto image comparisons. We can find second- and third-degree relationships by looking for new material that links to our first sets of results[3]. With a little manual work and writing in the margins (so to speak) to tell our tools that certain ideas are related (e.g. what I try to do with the Connections series on Normative Connections), we can seed our research tools with networks of ideas to build off of.
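To make the idea concrete, here is a minimal sketch of how two of the simplest relationship measures could work: ranking articles by shared tags, and walking the link graph to find first- and second-degree relationships. The mini-corpus below is entirely invented for illustration; real tools would, of course, build these structures from feeds, full text, and metadata.

```python
from collections import deque

# Hypothetical mini-corpus: each article has a set of tags and a set
# of outgoing links. All titles, tags, and links here are invented.
articles = {
    "starchitecture-effects": {
        "tags": {"architecture", "urban-design", "sociology"},
        "links": {"bilbao-effect"},
    },
    "bilbao-effect": {
        "tags": {"architecture", "economics"},
        "links": set(),
    },
    "walkable-cities": {
        "tags": {"urban-design", "public-health"},
        "links": {"starchitecture-effects"},
    },
}

def related_by_tags(article):
    """Crudest possible relatedness: rank every other article by the
    number of tags it shares with this one (zero-overlap dropped)."""
    scores = (
        (len(articles[article]["tags"] & articles[other]["tags"]), other)
        for other in articles if other != article
    )
    return sorted((s for s in scores if s[0] > 0), reverse=True)

def related_by_links(article, max_degree=2):
    """Breadth-first walk over the link graph, treated as undirected,
    returning each reachable article's degree of separation."""
    neighbors = {a: set(articles[a]["links"]) for a in articles}
    for a in articles:                      # make links symmetric
        for b in articles[a]["links"]:
            neighbors.setdefault(b, set()).add(a)
    seen = {article: 0}
    queue = deque([article])
    while queue:
        current = queue.popleft()
        if seen[current] == max_degree:
            continue
        for nxt in neighbors.get(current, set()):
            if nxt not in seen:
                seen[nxt] = seen[current] + 1
                queue.append(nxt)
    del seen[article]
    return seen  # e.g. {"some-article": 1, "another": 2}

print(related_by_tags("starchitecture-effects"))
print(related_by_links("bilbao-effect"))
```

Full-text links, citations, shared authorship, and image similarity would all just add more edges (perhaps weighted differently) to the same graph, and the same breadth-first walk would then surface second- and third-degree relationships across all of them at once.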
That, to me, is the true potential of RSS and Atom feeds: building personal digital libraries. Cataloging potentially useful information streams for future use. Drawing connections between my ideas and the ideas of others. Finding new (and old) avenues to explore.