diversity within the real time stream

I got a call on Friday from a journalist at the Financial Times who was writing on the Twitter ecosystem. We had an interesting conversation, and he ran his piece over the weekend: “Twitter branches out as London’s ‘ecosystem’ flies”.

As the title suggests, the focus was on the Twitter ecosystem in London. Our conversation also touched on the overall size and health of the real-time ecosystem, a topic that didn’t make it into the article. It’s hard to gauge the health of a business ecosystem that is still very much under development and has yet to mature into one that produces meaningful revenues. Yet the question got me thinking, and it also reminded me that it has been a while since I posted here. It was one busy summer. I have a couple of long posts I’m working on, but for now I want to do this quick post on the real-time ecosystem and in it offer up some metrics on its health.

Back in June I did a presentation at Jeff Pulver’s 140conf, the topic of which was the real-time / Twitter ecosystem. Since then, I have been thinking about the diversity of data sources, notably the question of where people are publishing and consuming real-time data streams. At betaworks we are fairly deep into the real-time / Twitter ecosystem. In fact, every company at betaworks is a participant, in one manner or another, in this ecosystem, and that’s a feature, not a bug! Of the 20 or so companies in the betaworks network, there is a subset that we operate; one of those is bit.ly.

In an attempt to answer this question about the diversity of the ecosystem, let me run through some internal data from bit.ly. bit.ly is a URL shortener that offers, among other things, real-time tracking of the clicks on each link (add “+” to any bit.ly URL to see this data stream). With a billion bit.ly links clicked on in August, and 300m last week, bit.ly has become almost part of the infrastructure of the real-time cloud. Given its scale, bit.ly’s data is a fair proxy for the activity of the real-time stream, at least of the links in the stream.

On Friday of this week (yesterday) there were 20,924,833 bit.ly links created across the web (we call these “encodes”). These 20.9m encodes are not unique URLs, since one popular URL might have been shortened by multiple people. But each encode represents intentionality of some form. bit.ly in turn retains a parent : child mapping, so that you can see what your sharing of a link generates vs. the population. For example, I shared a video on Twitter the other day; my specific bit.ly link got 88 clicks, out of a total of 250 clicks on any bit.ly link to that same video (see http://bit.ly/Rmi25+).
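To make the parent : child idea concrete, here is a minimal sketch in Python. It is not how bit.ly actually stores this data; the class name `LinkStats`, the encode ids, and the split of the other 162 clicks are all made up for illustration. Only the 88 and 250 come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class LinkStats:
    """Click counts for every short link (encode) that points at one long URL.

    Hypothetical structure: the long URL is the parent, each encode is a child.
    """
    clicks_by_encode: dict = field(default_factory=dict)

    def record_clicks(self, encode_id: str, count: int = 1) -> None:
        # Add clicks to one child encode, creating it on first sight.
        self.clicks_by_encode[encode_id] = self.clicks_by_encode.get(encode_id, 0) + count

    def total_clicks(self) -> int:
        # Clicks across every encode of the same long URL (the "population").
        return sum(self.clicks_by_encode.values())

    def share(self, encode_id: str) -> float:
        # Fraction of the population's clicks that came through one encode.
        total = self.total_clicks()
        return self.clicks_by_encode.get(encode_id, 0) / total if total else 0.0


video = LinkStats()
video.record_clicks("my-link", 88)            # from the example above
video.record_clicks("someone-else-1", 100)    # assumed split of the remaining clicks
video.record_clicks("someone-else-2", 62)     # assumed split of the remaining clicks

print(video.total_clicks())
print(f"{video.share('my-link'):.1%}")
```

With these numbers, my link accounts for a little over a third of all the clicks on that video.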

So where were these 20.9m encodes created? Approximately half of the encodes took place within the Twitter ecosystem. No surprise here: Twitter is clearly the leading public, real-time stream, and about 20% of the updates on Twitter contain at least one link, approximately half of which are bit.ly links. But here is something surprising: less than 5% of the 20.9m came from Twitter.com (i.e., from Twitter’s use of bit.ly as the default URL shortener). Over 45% of the total encodes came from other services associated in some way with Twitter, i.e. the Twitter ecosystem: a long and diverse list of services and companies that use bit.ly.
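For a sense of scale, the percentages above can be turned into rough counts. This is only back-of-the-envelope arithmetic on the figures quoted in this post; the helper `pct_of` is made up for the sketch, and because the shares are stated as "less than", "over" and "approximately", the results are bounds and estimates, not exact numbers.

```python
TOTAL_ENCODES = 20_924_833  # Friday's encode count quoted above


def pct_of(total: int, percent: int) -> int:
    """Whole-number count for a percentage of the total, rounded down."""
    return total * percent // 100


twitter_com_max = pct_of(TOTAL_ENCODES, 5)    # "less than 5%": an upper bound
ecosystem_min = pct_of(TOTAL_ENCODES, 45)     # "over 45%": a lower bound
outside_twitter = pct_of(TOTAL_ENCODES, 50)   # "approximately half": a rough estimate

print(f"Twitter.com:              fewer than {twitter_com_max:,}")
print(f"Rest of Twitter ecosystem: more than {ecosystem_min:,}")
print(f"Outside Twitter:           roughly {outside_twitter:,}")
```

In other words: about a million encodes at most from Twitter.com itself, more than nine million from the services around it, and something like ten and a half million from everywhere else.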

The balance of the encodes came from other areas of the real time web, outside of Twitter. Google Reader incorporated bit.ly this summer, as did Nokia, CBS, Dropbox, and some tools within Facebook. And then of course people use the bit.ly web site — which has healthy growth — to create links and then share them via instant-messaging services, MySpace, email, and countless other communications tools.

The bit.ly links that are created are also very diverse. It’s harder to summarise this without offering a list of 100,000 URLs, but suffice it to say that there are a lot of pages from the major web publishers, lots of YouTube links, lots of Amazon and eBay product pages, and lots of maps. And then there is a long, long tail of other URLs. When a pile-up happens in the social web it is invariably triggered by link-sharing, and so bit.ly usually sees it in the seconds before it happens.

This data says to me that the ecosystem as a whole is becoming fairly diverse. Lots of end points are publishing (i.e. creating encodes) and then many end points are offering ways to use the data streams.

In turn, this diversity of the emerging ecosystem is, I believe, an indicator of its health. Monocultures aren’t very resilient to change; ecosystems tend to be more resilient and adaptable. For me, these few data points suggest that the real-time stream is becoming more and more interesting and more and more diverse.
