
Month January 2010

Ongoing tracking of the real time web …

My last post about real time web data mixed the data with commentary and a fake headline about how data on the real time web is sometimes misunderstood. This post repeats some of that data, but here the data is the focus. I will update the post periodically with relevant data that we see at betaworks or that others share with us. To that end, the post runs in reverse order, with the newest data on top.

Tracking the real time web data

The measurement tools we have still only sometimes work for counting traffic to web pages, and they certainly don’t track or measure traffic in streams, let alone aggregate up the underlying ecosystems that are emerging around these new markets. At betaworks we spend a lot of time looking at and tracking this underlying data set. It’s our business and it’s fascinating. Like many companies, each of the individual businesses at betaworks has fragments of data sets, but because betaworks acts as an ecosystem of companies we can mix and match the data to get results that are more interesting and, hopefully, offer greater insight.

——————————-

(i) tumblr growth for the last half of 2009

Another data point on the growth of the real time web, covering the second half of last year through Jan 18th of this year. tumblr continues to kill it. I read an interesting post yesterday about how tumblr is leading its category through innovation and simple, effective product design. The Compete numbers quoted in that post are less impressive than these directly measured Quantcast numbers.


(h) Twitter vs. the Twitter Ecosystem

Fred Wilson’s post adds some solid directional data on the question of the size of the ecosystem.   “You can talk about Twitter.com and then you can talk about the Twitter ecosystem. One is a web site. The other is a fundamental part of the Internet infrastructure. And the latter is 3-5x bigger than the former and that delta is likely to grow even larger.”

(g) Some early 2010 data points re: the Real Time Web

  • Twitter: Jan 11th was the highest usage day ever (source: @ev via techcrunch)
  • Tweetdeck: did 4,143,687 updates on Jan 8, yep 4m. Or, 48 per second (source: Iain Dodsworth / tweetdeck internal data)
  • Foursquare: Jan 9th biggest day ever.    1 update or check-in per second (source: twitter and techcrunch)
  • Daily Booth: in past 30 days more than 10mm uniques (source: dailybooth internal data)
  • bit.ly: last week was the largest week ever for clicks on bit.ly links. 564m were clicked on in total. On Jan 6th there was a record 98m decodes, or roughly 1,100 clicks every second.
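The per-second figures in this list are daily totals divided by the 86,400 seconds in a day. A minimal sketch of that arithmetic, using only the totals quoted above (the function and variable names are just for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average number of events per second for a given daily total."""
    return events_per_day / SECONDS_PER_DAY


tweetdeck_updates = 4_143_687  # Tweetdeck updates on Jan 8, as quoted above
bitly_decodes = 98_000_000     # bit.ly decodes on Jan 6, rounded as quoted above

print(round(per_second(tweetdeck_updates)))  # 48
print(round(per_second(bitly_decodes)))      # 1134, i.e. roughly 1,100 per second
```

Read the other way, a steady 1 check-in per second works out to 86,400 check-ins over a full day.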

(f) Comparing the real time web vs. Google for the second half of 2009

Andrew Parker commented on the last post that the chart displaying the growth trends was hard to decipher and that it may be simpler to show month over month trending. It turns out that month over month is also hard to decipher. What is easier to read is this summary chart. It shows the average month over month growth rates for the RT web sites (the average from Chart A). Note that 27.33% is the average month over month growth rate for the real time web companies in 2009, which is astounding. The comparable number for the second half of 2009 was 10.5% a month: significantly lower, but still a very big number for m/m growth.
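For anyone who wants to reproduce this kind of summary number, here is a rough sketch of how an average month over month growth rate can be computed, and what a monthly rate implies if it compounds. The monthly series below is made up for illustration (it is not betaworks data), and treating the quoted averages as constant compounding rates is an assumption, so read the multiples as ballpark figures only.

```python
def month_over_month(values):
    """Growth rate between each pair of consecutive months."""
    return [(curr - prev) / prev for prev, curr in zip(values, values[1:])]


def average(rates):
    """Plain arithmetic mean of a list of rates."""
    return sum(rates) / len(rates)


def compounded(rate, months):
    """Total growth multiple if a monthly rate is sustained for `months` months."""
    return (1 + rate) ** months


# Hypothetical monthly uniques, for illustration only.
monthly = [100, 110, 121, 133.1]
rates = month_over_month(monthly)
print(round(average(rates), 3))        # 0.1, i.e. 10% a month

# What the averages quoted above would imply if sustained (an assumption):
print(round(compounded(0.105, 6), 2))  # 1.82x over six months
print(compounded(0.2733, 12))          # a bit over 18x over twelve months
```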

(e) Ongoing growth of the real time stream in the second half of 2009

This is a question people have asked me repeatedly in the past few weeks: did the real time stream grow in Q4 2009? It did. Not at the pace that it grew during Q1-Q3, but our data at betaworks confirms continued growth. One of the best proxies we use for directional trending in the real time web is bit.ly decodes: the raw number of bit.ly links that are clicked on across the web. Many of these clicks occur within the Twitter ecosystem, but a large number are outside of Twitter, by people and by machines. There is a surprising amount of diversity within the real time stream, as I posted about a while back.

Two charts are displayed below. On the bottom are bit.ly decodes (blue) and encodes (red) running through the second half of last year. On the top is a different but related metric. Another betaworks company is Twitterfeed, the leading platform enabling publishers to post from their sites into Twitter and Facebook. This chart graphs the total number of feeds processed (blue) and the total number of publishers using Twitterfeed, again through the second half of the year (note: if the inline charts are too small to read you can click through and see full size versions). As you can see, similar to the bit.ly chart, growth at Twitterfeed was strong for the entire second half of 2009.

Both these charts illustrate the ongoing shift that is taking place in how people use the real time web for navigation, search and discovery. My preference is to look at real user interactions as strong indicators of user behavior. For example, I often find Google Trends more useful than comScore, Compete or the other “page” based measurement services. As interactions online shift to streams we are going to have to figure out how measurement works. I feel like today we are back in the early days of the web, when people talked about “hits”: it’s hard to parse the relevant data from the noise. The indicators we see suggest that the speed at which this shift to the real time web is taking place is astounding. Yet it is happening in a fashion that I have seen a couple of times before.

(d) An illustration of the step nature of social growth. bit.ly weekly decodes for the second half of 2009.

Most social networks I have worked with have grown in a step function manner. You see this clearly when you zoom into the bit.ly data set and look at weekly decodes, illustrated above. You often have to zoom in and out of the data set to find the steps, but they are usually there. Sometimes they run for months, either up or sideways. You can see the steps in Facebook’s growth in 2009. I saw this effect up close with ICQ, AIM, Fotolog, Summize and now with bit.ly. Someone smarter than me has surely figured out why these steps occur. My hypothesis is that as social networks grow they jump in a sporadic fashion from one dense cluster of relationships to a new one. The upward trajectory is the adoption cycle of that new, dense cluster, and the flat part of the step is the period before the jump to the next cluster. Blended in here there are clearly issues of engagement vs. trial, but it’s hard to weed those out from this data set. As someone mentioned to me in regards to the last post, this is a property of scale-free networks.

(c) Google and Amazon in 2009

Google and Amazon — this is what it looked like in 2009:

It’s basically flat. Pretty much every user in the domestic US is on Google for search and navigation and on Amazon for commerce: impressive baseline numbers, but flat for the year (source: Quantcast). So then let’s turn to Twitter.

(b) Twitter – an estimate of Twitter.com and the Twitter ecosystem

Much ink has been spilt over Twitter.com’s growth in the second half of the year. During the first half of the year Twitter experienced hypergrowth and unprecedented media attention. In the second half of the year the media attention waned and the service went through what I suspect was a digestion phase. That step again? Steps aside (because I don’t in any way seek to represent Twitter Inc.), there are two questions that in my mind haven’t been answered fully:

(i) What about international growth in the second half of 2009? That was clearly a driver for Facebook in ’09. Recent data suggests growth continued to be strong.

(ii) What about the ecosystem?

Unsurprisingly, it’s the second question that interests me the most. So what about that ecosystem? We know that approx 50% of the interactions with the Twitter API occur outside of Twitter.com, but many of those aren’t end user interactions. We also know that as people adopt and build a following on Twitter they often move up to one of the client or vertical-specific applications to suit their “power” needs. At TweetDeck we did a survey of our users this past summer. The data we got suggested that 92% of them use TweetDeck every day, and 51% use Twitter more frequently since they started using TweetDeck. So we know there is a very engaged audience on the clients. We also know that most of the clients aren’t web pages: they are Flash, AIR, Cocoa, iPhone apps etc., all things that the traditional measurement companies don’t track.

What I did to estimate the relative growth of the Twitter ecosystem is the following.   I used Google Trends and compiled data for Twitter and the key clients.    I then scaled that chart over the Twitter.com traffic.   Is it correct? — no.   Is it made up? — no.   It’s a proxy and this is what it looks like (again, you can click the chart to see a larger version).
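For the curious, the scaling step can be sketched roughly as follows. This is one plausible way to do it, not necessarily the exact method used for the chart: it assumes the trend indices are unitless, picks a scale factor so that the Twitter index matches measured Twitter.com traffic in one anchor month, and applies the same factor to the combined Twitter plus clients index. All numbers and names here are hypothetical.

```python
def estimate_ecosystem(twitter_trend, client_trend, twitter_traffic, anchor=0):
    """Scale unitless trend indices onto measured traffic.

    The factor is chosen so the Twitter trend index equals measured
    Twitter.com traffic in the anchor month; the same factor is then applied
    to the combined Twitter + clients index to get an ecosystem proxy.
    """
    factor = twitter_traffic[anchor] / twitter_trend[anchor]
    return [round((t + c) * factor, 1) for t, c in zip(twitter_trend, client_trend)]


# Hypothetical monthly values, for illustration only.
twitter_trend = [50, 60, 60, 70]             # search-interest index for Twitter
client_trend = [10, 15, 20, 30]              # combined index for the key clients
twitter_traffic = [20.0, 24.0, 23.0, 22.0]   # measured Twitter.com uniques (millions)

print(estimate_ecosystem(twitter_trend, client_trend, twitter_traffic))
# [24.0, 30.0, 32.0, 40.0]
```

The output is only as good as the assumption that search interest tracks usage, which is why it is a proxy and not a measurement.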

Similar to the Twitter.com traffic, you see the ecosystem flatten out in the summer. But you see growth in the fourth quarter that returns to the summertime levels. I suspect that if you could zoom in and out of this the way I did above you would see those steps again.

(a) The Real Time Web in 2009

Add in Facebook (blue) and Meebo (green) both steaming ahead — Meebo had a very strong end of year.    And then tile on top the bit.ly data and the Twitterfeed numbers (bit.ly on the right hand scale) and you have an overall picture of growth of the real time web vs. Google and Amazon.   As t

charting the real time web
OR
the curious tale of how TechCrunch traffic inexplicably fell off a cliff in December

For a while now I have been thinking about doing a post about some of the data we track at betaworks. Over the past few months people have written about Twitter’s traffic being up, down or sideways. The core question that people are asking is: is the real time web growing or not, is this hype or substance? Great questions, and the answer to all of the above, from the data set I see, is yes. Adoption and growth are happening pretty much across the board, and in some areas at an astounding pace. But tracking this is hard. It’s hard to measure something that is still emerging. The measurement tools we have still only sometimes work for counting traffic to web pages, and they certainly don’t track or measure traffic in streams, let alone aggregate up the underlying ecosystems that are emerging around these new markets. At betaworks we spend a lot of time looking at and tracking this underlying data set. It’s our business and it’s fascinating.

I was inspired to finally write something by first a good experience and then a bad one.    First the good one.    Earlier this week I saw a Tweet from Marshall Kirkpatrick about Gary Hayes’s social media counter.    It’s  very nicely done — and an embed is available.     This is what it looks like (note the three buttons on top are hot, you can see the social web, mobile and gaming):

The second thing was less fun, but I’m sure it has happened to many an entrepreneur. I was emailed earlier this week by a reporter asking about some data. I didn’t spend the time to weed through the analysis, and the reporter published data that was misleading. More on this incident later.

Let’s dig into some data. First, addressing the question people have asked me repeatedly in the past few weeks: did the real time stream grow in Q4 2009? It did. Not at the pace that it grew during Q1-Q3, but our data confirms continued growth. One of the best proxies we use for directional trending in the real time web is bit.ly decodes: the raw number of bit.ly links that are clicked on across the web. Many of these clicks occur within the Twitter ecosystem, but a large number are outside of Twitter, by people and by machines. There is a surprising amount of diversity within the real time stream, as I posted about a while back.

Two charts are displayed below. On the left are bit.ly decodes (blue) and encodes (red) running through the second half of last year. On the right is a different but related metric. Another betaworks company is Twitterfeed, the leading platform enabling publishers to post from their sites into Twitter and Facebook. This chart graphs the total number of feeds processed (blue) and the total number of publishers using Twitterfeed, again through the second half of the year (note: if the inline charts are too small to read you can click through and see full size versions). As you can see, similar to the left hand chart, growth at Twitterfeed was strong for the entire second half of 2009.

Both these charts illustrate the ongoing shift that is taking place in how people use the real time web for navigation, search and discovery. My preference is to look at real user interactions as strong indicators of user behavior. For example, I often find Google Trends more useful than comScore, Compete or the other “page” based measurement services. As interactions online shift to streams we are going to have to figure out how measurement works. I feel like today we are back in the early days of the web, when people talked about “hits”: it’s hard to parse the relevant data from the noise. The indicators we see suggest that the speed at which this shift to the real time web is taking place is astounding. Yet it is happening in a fashion that I have seen a couple of times before.

Most social networks I have worked with have grown in a step function manner. You see this clearly when you zoom into the bit.ly data set and look at weekly decodes. It is less clear but also visible when you look at daily trending data (on the right), and if you add a 3 week moving average on top of that you can once again see the steps. You often have to zoom in and out of the data set to find the steps, but they are usually there. Sometimes they run for months, either up or sideways. I saw this with ICQ, AIM, Fotolog, Summize and now bit.ly. Someone smarter than me has surely figured out why these steps occur. My hypothesis is that as social networks grow they jump in a sporadic fashion to a new dense cluster or network of relationships. The upward trajectory is the adoption cycle of that new, dense cluster, and the flat part of the step is the period before the jump to the next cluster. Blended in here there are clearly issues of engagement vs. trial, but it’s hard to weed those out from this data set. I learnt a lot of this from Yossi Vardi and Adam Seifer, two people I had the privilege of working with over the years and whose DNA is wired right into this stuff. At Fotolog, Adam could take the historical data set and illustrate how these clusters moved, in steps, from geography to geography. It’s fascinating.
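If you want to try the moving average trick on your own daily numbers, here is a minimal sketch. It uses a trailing 21 day window (3 weeks), so the weekday/weekend cycle averages out and only the underlying level remains. The daily series is synthetic: a weekly cycle around 100, then a step up to 150. None of it is bit.ly data.

```python
def moving_average(values, window):
    """Trailing moving average; the first window-1 points have no value yet."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def weekly_cycle(level, days):
    """Synthetic daily series: weekdays above the level, weekends below it."""
    return [level + (10 if d % 7 < 5 else -25) for d in range(days)]


# Six weeks around 100 a day, then six weeks around 150 a day (hypothetical).
daily = weekly_cycle(100, 42) + weekly_cycle(150, 42)
smoothed = moving_average(daily, 21)

print(smoothed[20])  # 100.0, the first full window sits on the lower step
print(smoothed[-1])  # 150.0, the last window sits on the upper step
```

Because 21 is a multiple of 7, every window holds exactly three of each weekday, which is why the weekly wobble disappears and the step shows through.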

TechCrunch falls off a cliff

Ok, I’m sure there are some people reading who are thinking: well, this is interesting, but I actually want to read about TechCrunch falling off a traffic cliff. I’m sorry, I actually don’t have any data to suggest that happened. After noting yesterday that a provocative headline is sometimes a substitute for data, I thought: heck, I can do this too! This section of the post is more of a cautionary tale. If you are confused by this twist, let me back up to where I started. I mentioned that there were two motivations for me sitting down and writing this post. The second one was that earlier this week a TechCrunch story ran saying that bit.ly market share had shifted dramatically. It hasn’t. The data was just misunderstood by the reporter. The tale (I did promise a tale) began last August when TechCrunch ran the following chart about the market share of URL shorteners.

The pie chart showed the top 5 URL shorteners and then calculated the market share each had  — what percent each was *of* the top five.     The  data looks like this:

bit.ly 79.61%
TinyURL 13.75%
is.gd 2.47%
ow.ly 2.26%
ff.im 1.92%
(79.61 + 13.75 + 2.47 + 2.26 + 1.92 = 100.01, i.e. 100 after rounding)

The comparable data from yesterday is:

bit.ly = 75%
TinyURL = 10%
ow.ly = 6%
is.gd = 4%
tumblr = 4%
(again this adds up to 100%, give or take a point of rounding)

Not much news in those numbers, especially when you consider they come from the Twitter “garden hose” (a subset of all tweets) and swing by as much as +/- 5% daily. The tumblr growth into the top 5 and the ow.ly bump up is a nice shift for them, but not really a story. The hitch was that the reporter didn’t consider that there are other URLs in the Twitter stream aside from these five. Some are short URLs and some aren’t. So this metric doesn’t accurately reflect overall short URL market share; it shows the shuffling of market share amongst the top five. But media will be media. I saw a Tweet this week about how effective Twitter is at disseminating information, true and false. Despite all the shifts that are going on, headlines in a sense carry even more weight than in the “read all about it” days.
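To make the denominator issue concrete, here is a small sketch that computes share two ways from the same sample: once among the top five only, and once against every URL seen in the sample. The counts are invented for illustration (the top-five ones were picked to come close to yesterday’s percentages); they are not the real garden hose numbers.

```python
def shares(counts, total=None):
    """Percentage share of each entry, against `total` (default: their own sum)."""
    if total is None:
        total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Hypothetical occurrence counts from one sample of tweets.
top_five = {"bit.ly": 750, "TinyURL": 100, "ow.ly": 60, "is.gd": 45, "tumblr": 45}
other_urls = 500  # every other URL in the same sample (assumed figure)

among_top_five = shares(top_five)
of_all_urls = shares(top_five, sum(top_five.values()) + other_urls)

print(among_top_five["bit.ly"])  # 75.0
print(of_all_urls["bit.ly"])     # 50.0
```

Same sample, same counts, and the headline number moves from 75% to 50% purely because the “other” bucket was added to the denominator. Comparing a top-five share from one date with an all-URLs share from another will always look like a dramatic shift.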

The lesson here for me was the importance of helping reporters and analysts get access to the underlying data, data they can use effectively. We sent the reporter the data, but he saw a summary data set that included the other URLs and didn’t understand that back in August there were also “other” URLs. After the fact we worked to sort this out and he put a correction in his post. But the headline was off and running, irrespective of how dirty or clean the data was. Basic mistake, my mistake, and this was with a reporter who knows this stuff well. Given the paucity of data out there and the emergent state of the real time web, this stuff is bound to happen.

Ironically, yesterday bit.ly hit an all time high in terms of decodes: over 90m. But back to the original question. There is a valid question the reporter was seeking to understand, namely: what is the market share of dem short thingies? We track this metric, using the Twitter garden hose and identifying most of the short URLs to produce a ranking (note it’s a sample, so the occurrences are a fraction of the actuals). And it’s a rolling 24 hr view, so it moves around quite a bit, but nonetheless it’s informative. This is what it looked like yesterday:

Over time this data set is going to become harder to use for this purpose. At bit.ly we kicked off our white label service before the holidays. Despite months of preparation we weren’t expecting the demand. As we provision and set up the thousands of publishers, bloggers and brands who want white label services, it’s going to result in a much more diverse stream of data in the garden hose.

Real Time Web Data

Finally, I thought it would be interesting to try to get a perspective on the emergence of the real time web in 2009: how did its growth compare and contrast with the incumbent web category leaders? Let me try to frame up some data around this. Hang in there, some of the things I’m going to do are hacks (at best). As I said, I was inspired! Let’s start with the user growth in the US among the current web leaders, Google and Amazon. This is what it looked like in 2009:

It’s basically flat. Pretty much every user in the domestic US is on Google for search and navigation and on Amazon for commerce: impressive baseline numbers, but flat for the year (source: Quantcast). So then let’s turn to Twitter. Much ink has been spilt over Twitter.com’s growth in the second half of the year. During the first half of the year Twitter’s growth, I suspect, was driven to a great extent by the unprecedented media attention it received; media and celebrities were all over it. Yet in the second half of the year that waned and the traffic numbers to the Twitter.com web site were flat. That step issue again?

Placing steps aside (because I don’t in any way seek to represent Twitter Inc.), there are two questions that haven’t been answered: (a) what about international growth? That was clearly a driver for Facebook in ’09, so where was Twitter internationally? (b) What about the ecosystem? Unsurprisingly, it’s the second question that interests me the most. So what about that ecosystem?

We know that approx 50% of the interactions with the Twitter API occur outside of Twitter.com, but many of those aren’t end user interactions. We also know that as people adopt and build a following on Twitter they often move up to one of the client or vertical-specific applications to suit their “power” needs. At TweetDeck we did a survey of our users this past summer. The data we got suggested that 92% of them use TweetDeck every day, and 51% use Twitter more frequently since they started using TweetDeck. So we know there is a very engaged audience on the clients. We also know that most of the clients aren’t web pages: they are Flash, AIR, Cocoa, iPhone apps etc., all things that the traditional measurement companies don’t track.

What I did to estimate the relative growth of the Twitter ecosystem is the following.   I used Google Trends and compiled data for Twitter and the key clients.    I then scaled that chart over the Twitter.com traffic.   Is it correct? — no.   Is it made up? — no.   It’s a proxy and this is what it looks like (again, you can click the chart to see a larger version):

Similar to the Twitter.com traffic, you see the flattening out in the summer. But similar to the data sets referenced above, you see growth in the fourth quarter. I suspect that if you could zoom in and out of this the way I did above you would see those steps again. So let’s put it all together! It’s one heck of a busy chart. Add in Facebook (blue) and Meebo (green), both steaming ahead (Meebo had a very strong end of year). And then tile on top the bit.ly data and the Twitterfeed numbers (both on different scales) and you have an overall picture of growth of the real time web vs. Google and Amazon.

Ok. One last snapshot, then I’m wrapping up. Chartbeat (yep, another betaworks company) had one of its best weeks ever this past week, in no small part thanks to Jason Calacanis’s New Year post about his Top 10 favorite web products of 2009. To finish up, here is a video of the live traffic flow coming into Fred Wilson’s blog at AVC.com on the announcement of the Google Nexus One phone. Steve Gilmore mentioned the other week how sometimes interactions in the real time web just amaze one. Watching people swarm to a site is a pretty enthralling experience. We have much work to do in 2010. Some of it will be about figuring out how to measure the real time web. Much of it will be continuing to build out the real time web and learning about this fascinating shift taking place right under our feet.

random footnote:

An interesting data point Iain sent me this morning that didn’t seem to fit in anywhere: Asian Twitter clients were yesterday over 5% of the requests visible in the garden hose.