diversity within the real time stream

I got a call on Friday from a journalist at the Financial Times who was writing on the Twitter ecosystem. We had an interesting conversation, and he ran his piece over the weekend: “Twitter branches out as London’s ‘ecosystem’ flies.”

As the title suggests, the focus was on the Twitter ecosystem in London. Our conversation also touched on the overall size and health of the real-time ecosystem, a topic that didn’t make it into the article. It’s hard to gauge the health of a business ecosystem that is still very much under development and has yet to mature into one that produces meaningful revenues. Yet the question got me thinking, and it also reminded me that it has been a while since I posted here. It was one busy summer. I have a couple of long posts I’m working on, but for now I want to do this quick post on the real-time ecosystem and offer up some metrics on its health.

Back in June I did a presentation at Jeff Pulver’s 140conf, the topic of which was the real-time / Twitter ecosystem. Since then, I have been thinking about the diversity of data sources, notably the question of where people are publishing and consuming real-time data streams. At betaworks we are fairly deep into the real-time / Twitter ecosystem. In fact, every company at betaworks is a participant, in one manner or another, in this ecosystem, and that’s a feature, not a bug! Of the 20 or so companies in the betaworks network, there is a subset that we operate; one of those is bit.ly.

In an attempt to answer this question about the diversity of the ecosystem, let me run through some internal data from bit.ly. bit.ly is a URL shortener that offers, among other things, real-time tracking of the clicks on each link (add “+” to any bit.ly URL to see this data stream). With a billion bit.ly links clicked on in August (300m last week), bit.ly has become almost part of the infrastructure of the real-time cloud. Given its scale, bit.ly’s data is a fair proxy for the activity of the real-time stream, at least of the links in the stream.

On Friday of this week (yesterday) there were 20,924,833 bit.ly links created across the web (we call these “encodes”). These 20.9m encodes are not unique URLs, since one popular URL might have been shortened by multiple people. But each encode represents intentionality of some form. bit.ly in turn retains a parent : child mapping, so that you can see what your sharing of a link generates vs. the population. For example, I shared a video on Twitter the other day; my specific bit.ly link got 88 clicks, out of a total of 250 clicks on any bit.ly link to that same video (see http://bit.ly/Rmi25+).
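To make the parent : child idea concrete, here is a minimal Python sketch of how such a mapping could be modelled. It is not how bit.ly actually stores its data; the `LinkIndex` class, its method names, and the hash strings are all hypothetical, and the 162 clicks on other links is simply 250 minus my 88.

```python
from collections import defaultdict


class LinkIndex:
    """Toy parent : child index (hypothetical, not bit.ly's real schema).

    Parent = the long URL; children = every short link (encode) pointing at it.
    """

    def __init__(self):
        # long_url -> {short_hash: click_count}
        self._children = defaultdict(dict)

    def encode(self, long_url, short_hash):
        # Register a new short link for a long URL, starting at zero clicks.
        self._children[long_url].setdefault(short_hash, 0)

    def click(self, long_url, short_hash, n=1):
        # Record n clicks on one specific short link.
        links = self._children[long_url]
        links[short_hash] = links.get(short_hash, 0) + n

    def total_clicks(self, long_url):
        # Clicks on any short link to the same long URL (the "population").
        return sum(self._children[long_url].values())

    def share_of_clicks(self, long_url, short_hash):
        # Fraction of the population's clicks that came via one short link.
        total = self.total_clicks(long_url)
        return self._children[long_url].get(short_hash, 0) / total if total else 0.0


index = LinkIndex()
index.encode("the-video", "my-link")        # my encode (placeholder names)
index.encode("the-video", "everyone-else")  # all other encodes, lumped together
index.click("the-video", "my-link", 88)
index.click("the-video", "everyone-else", 162)  # 250 total minus my 88

total = index.total_clicks("the-video")
share = index.share_of_clicks("the-video", "my-link")
print(f"{share:.1%} of {total} clicks")
```

In this toy model my link accounts for roughly a third of the clicks on that video.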

So where were these 20.9m encodes created? Approximately half of the encodes took place within the Twitter ecosystem. No surprise here: Twitter is clearly the leading public, real-time stream, and about 20% of the updates on Twitter contain at least one link, approximately half of which are bit.ly links. But here is something surprising: less than 5% of the 20.9m came from Twitter.com (i.e., from Twitter’s use of bit.ly as the default URL shortener). Over 45% of the total encodes came from other services associated in some way with Twitter, i.e. the Twitter ecosystem: a long and diverse list of services and companies that use bit.ly.
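For a rough sense of scale, the percentages above can be turned into approximate counts. This is back-of-the-envelope arithmetic only: the post gives bounds ("less than 5%", "over 45%") and approximations ("about 20%", "approximately half"), so the outputs are bounds and estimates, not exact figures, and the variable names are mine.

```python
TOTAL_ENCODES = 20_924_833  # Friday's encodes, from the post

# Bounds stated in the post, as whole percentages.
TWITTER_COM_MAX_PCT = 5   # "less than 5%" came from Twitter.com
ECOSYSTEM_MIN_PCT = 45    # "over 45%" came from other Twitter-related services

# Integer arithmetic keeps the results exact (rounded down).
twitter_com_upper = TOTAL_ENCODES * TWITTER_COM_MAX_PCT // 100
ecosystem_lower = TOTAL_ENCODES * ECOSYSTEM_MIN_PCT // 100

# "About 20%" of updates contain a link; "approximately half" of those are bit.ly.
updates_with_link = 0.20
bitly_share_of_links = 0.5
updates_with_bitly_link = updates_with_link * bitly_share_of_links

print(f"Twitter.com: fewer than ~{twitter_com_upper:,} encodes")
print(f"Rest of the Twitter ecosystem: more than ~{ecosystem_lower:,} encodes")
print(f"Roughly {updates_with_bitly_link:.0%} of Twitter updates carry a bit.ly link")
```

In other words, Twitter.com itself accounts for about a million encodes at most, while the services built around Twitter account for more than nine million.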

The balance of the encodes came from other areas of the real time web, outside of Twitter. Google Reader incorporated bit.ly this summer, as did Nokia, CBS, Dropbox, and some tools within Facebook. And then of course people use the bit.ly web site — which has healthy growth — to create links and then share them via instant-messaging services, MySpace, email, and countless other communications tools.

The bit.ly links that are created are also very diverse. It’s harder to summarise this without offering a list of 100,000 URLs, but suffice it to say that there are a lot of pages from the major web publishers, lots of YouTube links, lots of Amazon and eBay product pages, and lots of maps. And then there is a long, long tail of other URLs. When a pile-up happens in the social web, it is invariably triggered by link-sharing, and so bit.ly usually sees it in the seconds before it happens.

This data says to me that the ecosystem as a whole is becoming fairly diverse. Lots of end points are publishing (i.e. creating encodes) and then many end points are offering ways to use the data streams.

In turn, this diversity of the emerging ecosystem is, I believe, an indicator of its health. Monocultures aren’t very resilient to change; ecosystems tend to be more resilient and adaptable. For me, these few data points suggest that the real-time stream is becoming more and more interesting and more and more diverse.

  • http://kortina.net kortina

    I am glad to see you mention Google Reader in this conversation; it's an easily overlooked member of the stream category. (I actually read your post on Google Reader mobile in the BlackBerry browser because I don't have a good Twitter client, and browsing any stream is the way I spend downtime on my phone. These streams are definitely of the same category, behaviorally, for me.)

    I was thinking about this after reading your post, and I believe that Twitter has actually taught me how to use Google Reader. I used to think of Google Reader like my email inbox, because there are read/unread statuses with each article. The overwhelming number of messages on Twitter and the lack of read/unread status, however, made me realize that you cannot consume entire streams like you would do with email. You just jump in, browse around the recent stuff, and jump back out. Streams are reminding us just how much data is flowing through our lives every day and that attention is a choice given only to some bits of data.

    I imagine we'll become acclimated, again, to deferring more attention choices to editors / curators of content as we become dissatisfied with just turning on the stream and seeing what's at the top–we're going to need some way to filter/reduce the flow of these streams to something that fits the finite bits of time we devote to these floods of data passing by us.

    • http://leftovertakeout.com gbattle

      You are right, Kortina.

      The largest publisher/subscriber broadcast streaming ecosystem is television. USENET, feed readers, and Twitter all follow the beauty of television's asymmetrical sharing of information, where on-demand consumption values urgency over comprehensiveness.

    • Johnborthwick

      Kortina I agree. Most feed readers used the metaphor of email inboxes — messages that you were meant to read, vs. streams to browse. My guess is that some feed readers are going to get a whole new UX in the coming year.

      • coloncleanse34

        Yea good points john

  • http://leftovertakeout.com gbattle

    John,

    I too am glad to see you write about diversity within the real-time stream. However, you've approached the conversation with a huge assumption regarding bit.ly, namely that all of the encodes/decodes represent real-time activity, which I believe is false. A bit.ly encode doesn't define a real-time action or even real-time intentionality; the channels where the shortened URLs are shared and clicked do. I would imagine that a growing number of bit.ly encodes are merely for URL tracking through traditional channels or just for the condensed URL footprint; hence, we should be very careful about what bit.ly data approximates.

    - Is there a more accurate consumption breakdown of the channels today?
    - How has channel consumption changed over time? (growth, momentum, moving average, etc.)
    - What channels outside of real-time channels have seen growth? (Email, RSS, forums, blogs, etc.)
    - What type of directionality is there for cross-channel sharing? (Virality from channel to channel)

    I believe that a little more transparency regarding diversity would not only provide people with insight into channel growth and usage, but also inspiration for valuable new businesses to innovate upon the bit.ly stack.

    • Johnborthwick

      Greg, to your first point: you are correct that some encodes and decodes aren't in the real-time stream, but most of them are. I would love to see the data you ask for in the dash points; most of it isn't available. The API at bit.ly is simple and it doesn't require a service provider to register, so it's only a subset of partners who register. Similarly, if you shorten something on bit.ly's web site and then cut / copy / paste it elsewhere (i.e. email), we can track where the encode was created but not where it was shared. I will share more data as we figure out how to collect it.

      • http://leftovertakeout.com gbattle

        Thanks JB. Looking forward to more data.

  • http://twitter.com/DAMONVICKERS Damon Vickers

    John!!!!

    haven't seen you in ages…
    great to see you on CNBC-

    damon

  • erandomthree

    Good points JB, and it's nice to see you're back!

  • Pingback: THINK / Musings» Blog Archive » charting the real time web OR the curious tale of how TechCrunch traffic inexplicably fell off a cliff in December

  • Pingback: THINK / Musings» Blog Archive » Ongoing tracking of the real time web …
