Larrying Wikipedia

If Wikipedia were a for-profit company, what would it be worth? It’s a question I’ve been thinking about – and, apparently, others have, too. HipMojo did some back-of-the-envelope calculations last Thursday and came up with $580 million as an estimated hypothetical valuation. That got Jason Calacanis drooling, and he upped HipMojo’s valuation by a factor of ten, writing that the online encyclopedia would be worth a cool $5 billion.

Calacanis’s number is insane, but there’s no doubt that, in today’s market, a for-profit, ad-running Wikipedia would be worth a whole lot – and then some.

Just consider the latest Comscore report on the world’s top ten web sites. Wikipedia comes in at #6, having attracted 155 million unique visitors during the month of September. More striking still is that its traffic increased by a whopping 12% over August’s total – the same rate of increase posted by YouTube, the #14 site. But Wikipedia’s growth is much more impressive than YouTube’s because Wikipedia was starting from a much larger base – twice as large, in fact.

Beyond its huge popularity, Wikipedia is by its nature a search-advertiser’s dream. Visitors to the site, after all, are seeking information about particular topics, many of which – from diseases to porn stars – are also highly prized targets for search advertisers. (Compare Wikipedia to, say, a site like Digg, where users are not seeking out information on a particular topic, and individual pages present a hodge-podge of information that frustrates ad-placement algorithms.) If you do a Google search on a high-priced AdWords keyword like “asbestos,” Wikipedia comes up #3 and it’s the first nongovernmental result. It’s also #3 for “lasik,” #3 for “lawyer,” #3 for “Viagra,” #4 for “mesothelioma,” #6 for “poker,” and #2 for “sex.” And, of course, it’s #1 for hundreds of other terms, including “lawsuit” and “personal computer.”

For a search company like Google, in other words, Wikipedia would almost certainly be the most valuable single property for displaying search ads on the entire internet (with the exception, of course, of its own search site). Because of Wikipedia’s increasing dominance over search results for common terms, moreover, its lack of ads turns it into a vast black hole for Google and other search-ad syndicators. It sucks in huge numbers of web surfers without spitting out any ad revenue whatsoever. It’s not hard to see, therefore, how valuable a property it would become if it began to run ads.

I trust the Wikipedians will defend the company’s nonprofit status, and I hope they resist Calacanis’s fatuous suggestion that it’s “unconscionable to not monetize the Wikipedia” by sticking ads on the site and giving the money to charity. By that logic, you’d put billboards along the sides of the Grand Canyon – and on Calacanis’s forehead. There’s nothing immoral about choosing not to commercialize something.

But the huge theoretical value of Wikipedia’s site does make me wonder why no devious entrepreneurs have made a concerted effort to take Wikipedia’s content, which is of course free to be reused in any way, and reformat and rebrand it as an attractive commercial site. Why, in other words, hasn’t anyone done to Wikipedia what Larry Ellison last week did to RedHat?

It’s not like Wikipedia’s site is perfect. The design’s mediocre, the user interface is often confusing, and the search tool is just plain awful. While it’s true that some sites, like Answers.com, already syndicate Wikipedia’s content, they mix it up with a whole bunch of other stuff and don’t really add any value beyond the existing Wikipedia site. I have to believe that a couple of talented coders could pretty easily hack together an improved version of Wikipedia, with a better design, a better interface, and a much better search engine. They could also strip out the site’s endless supply of arcana – the history and talk pages, for instance, and all those “warning” boxes – which are beloved by the Wikipedians but are just distracting screen junk for pretty much everyone else on earth. That would simplify the site and also reduce storage costs. Then, once the site was up, you’d advertise the hell out of it, positioning it as a superior version of the free encyclopedia.

Even if you siphoned off just a fraction of Wikipedia’s traffic, it would still be a pretty darn lucrative site – and, if it really was easier to use than Wikipedia, the traffic would tend to grow naturally. Now, granted, such a move would be a pretty slimy thing to do – and would no doubt earn the entrepreneurs some really bad karma. But, hey, it’s not like the market’s invisible hand is known for the gentleness of its caress. So tell me: What am I missing? Why hasn’t anyone made a real effort to do a Larry on Wikipedia?

15 thoughts on “Larrying Wikipedia”

  1. Greg L

    It has been monetized (via copying the content) though not comprehensively. Take a snippet from any Wikipedia entry and run it on Google.

  2. JasonCalacanis

    Actually, Answers.com is a fork of the wikipedia and it comes up right behind it in Google. They make money off of Wikipedia and in return Wikipedia gets some link love and branding. Answers.com has the problem of their pages being out of date really quick because there is no API for Wikipedia for people to pull live data from. I think Wikipedia will start charging folks like Answers for professional services/servers to ping for changes/updates.

    Also, I shouldn’t have used the word unconscionable in my post–I was talking about from MY perspective, but that is a harsh word. I could have said it better.

    Anyway, I think the real solution is to let USERS of the wikipedia decide for themselves. Put a leaderboard up and let folks click a link right under it to “Turn off Advertising.”

    I just wrote a post about this at http://www.calacanis.com

  3. Raphaël Labbé

    It will be a difficult win because Wikipedia has won the battle using the best search engine optimization ever: content!

    All the lexical terms around one word are generally inside the article.

    One could win the battle using a different approach; I would bet on mobile.

    The cell phone is the best place because it’s easier to use a bookmark than to search there, and it’s hell for presentation.

    Calacanis is perfectly right about future monetization. I would be glad to pay for easy access to content for my company U.[lik]. I am not hoping to compete directly as the content distributor, but rather see Wikipedia as a huge + and put a service on top.

    By the way, if you come to Paris I’ll be glad to buy you a drink and have a cool chat with one of the best disruptive voices on the internet.

    Leafar

  4. Michael Heraghty

    Good question. I guess, as Raphaël seemed to be suggesting (?), it would be hard to out-SEO Wikipedia.

    SEO folklore tells us that the Google algorithm filters out duplicate content. Where close similarities are found, whoever published first (according to Google’s index) “wins”.

    Any site trying to leverage Wikipedia’s content would probably have to significantly alter it (or, more to the point, have users significantly alter it) before it could compete in the search listings.

    It would also need to get zillions of relevant links, and to build quality traffic by this and other means.

    There’s also the Wikipedia “brand” to contend with.

    Those are the hurdles. But, if a daring person had enough resources and conviction to get behind such a project and push hard, who knows…?

  5. Greg L

    Would it surprise or dismay anyone if Google actually explicitly tweaked its algorithm to favor Wikipedia as a source of information? I realize the point above that this isn’t in Google’s financial interests vis-a-vis AdSense revenues, but I think it is entirely consistent with earlier expressions of Google’s philosophy. I find the high prominence of Wikipedia in many query results hard to explain without the assumption that the entire domain is garnering some kind of special weight in the rankings.

  6. Anonymous

    Great point, Nick. It seems like someone could do a mashup of Wikipedia, a dictionary, and a little other online content and have a pretty popular site. That is not far from what Answers.com is after all.

    Another thing that surprises me is that there aren’t more attempts to syphon traffic from Wikipedia by editing Wikipedia content. For example, I could see someone using a robot to add subtle spammy links to millions of pages across Wikipedia. I doubt the review system could handle a massive number of manipulative edits from many IP addresses if the changes could not immediately be dismissed as spam.

  7. Nick Carr

    SEO folklore tells us that the Google algorithm filters out duplicate content. Where close similarities are found, whoever published first “wins”.

    Yes, I’m sure you’re right – you’d be at a disadvantage Googlewise, at least at the start.

    There’s also the Wikipedia “brand” to contend with.

    I’m not convinced that would be a huge barrier – but, as I mentioned, you’d need an effective ad campaign to get over it.

    Any site trying to leverage Wikipedia’s content would probably have to significantly alter it

    But that would undermine the elegance of the Larry model. I think you’d want to have locked content, sucked in directly from Wikipedia, with no opportunity for users to edit it. It wouldn’t be a wiki; it would be just an information site. What % of wikipedia visitors edit the content (or even care about the fact that they could edit the content) – it has to be a very, very, very small number.

    Anyway, I think the real solution is to let USERS of the wikipedia decide for themselves. Put a leaderboard up and let folks click a link right under it to “Turn off Advertising.”

    Again, the % of users who would take the trouble to override the default is so small that it’s hardly worth discussing. The vast majority of users aren’t USERS, they’re just users.

    I could see someone using a robot to add subtle spammy links to millions of pages across Wikipedia. I doubt the review system could handle a massive number of manipulative edits from many IP addresses if the changes could not immediately be dismissed as spam.

    That’s a nasty thought, but, like you, I’m surprised that spammers haven’t tried it.

    Would it surprise or dismay anyone if Google actually explicitly tweaked its algorithm to favor Wikipedia as a source of information?

    It would definitely surprise me.

  8. Seth Finkelstein

    I’ve actually been studying these issues from the SEO point of view, and come to two tentative conclusions:

    1) Google HAS NOT tweaked its algorithm to favor Wikipedia. But Wikipedia’s prominence is *consistent* with triggering several weaknesses in the ranking system. It’s not quite “gaming”, but it’s kind of an unhealthy co-dependency in ranking factors.

    2) Wikipedia-duplication-spamming faces a lot of problems from Google’s similarity filtering. I call this “Google Killed The Wikipedia Fork”. It’s a lot harder than it looks to do that sort of copy-and-grab-traffic.

    I’ve been wondering if it’s worth writing this up in a formal report – downside, The Wrath Of The Wikipedians can be unpleasant, is there an upside?

  9. Phil

    Why, in other words, hasn’t anyone done to Wikipedia what Larry Ellison last week did to RedHat?

    Oh, that Larry. I was still waiting for the Citizendium reference.

    Seth: I call this “Google Killed The Wikipedia Fork”

    Ouch. Perhaps this is a Citizendium post after all.

  10. Raphaël Labbé

    Michael, you’re correct about my point. I meant out-SEO. Greg’s point seems accurate to me, which increases the challenge at the same time (you have to bypass Google then!). But please, Seth, if you have a full report I would be a happy reader. The upside seems to be knowledge ;-) and my gratitude.

    When crawling the web to check inputs from users, we have given extra weight to Wikipedia. It seems natural, and Google may have done it a long time ago (when Wikipedia passed the 1,000 Alexa rank milestone in Jan-04 / the traffic jumped from 1M page views to 5M PV in one month). And that does not match with Wikipedia’s Google Trends.

    Putting a service on top of Wikipedia is definitely the way to solve this problem. U.[lik] is about taste; Wikipedia is cold information, whereas entertainment is not. And there are a lot of other possible examples (talk to a doctor about Wikipedia). But I think that keeping the edit possibility is a way to get to your own version of Wikipedia and maybe win on some articles.

    And since we are in the recommendation area, as Greg is, let me suggest building something that is useful to Wikipedia readers, such as article suggestion. Something with a passive CF ;-)

    Leafar

  11. Nick Carr

    Seth, I’d certainly like to hear more about the “unhealthy co-dependency in ranking factors.” Nick

  12. Greg L

    Seth —

    Your ALL CAPS response didn’t seem exactly tentative. How can you be so sure what’s in Google’s algo? And, in any event, as Nick says, please do tell about the “unhealthy co-dependency” thing…

  13. Seth Finkelstein

    Greg – by coincidence, take a look at the current post “Evidence of attraction” Google 81%, Yahoo 77%.

    It’s *NOT* (yup, all-caps) that there’s a special we-love-Wikipedia flag anywhere (unless you want to say that Yahoo has it too). It’s in the weightings of the page ranking factors. Where the unhealthy co-dependency comes in is that both Wikipedia and Google have a kind of mutual self-interest in “pushing buttons” on certain ranking factors … I’m not explaining this well, it really needs some SEO explained first.

  14. Chris_B

    I’m surprised that spammers haven’t tried it.

    From what I understand, Wikipedia is surprisingly bot resistant, but only to a certain type of bot. Were someone to code up a slow-acting DDOS-style distributed Wikipedia spam bot and infect enough PCs, it might be possible.

    The Wrath Of The Wikipedians can be unpleasant

    Thousands of unwashed tubby zealots waving their cupcake-stained fists in anger but soon being distracted and returning to their most beloved pursuit of “wikifying” another article on TV trivia.

    Humor aside, I can see lots of reasons why it’s a very bad idea to try to sell ads off Wikipedia. Due to all of the problems of unreliability, no “stable version”, constant edit wars, potential liability, plagiarism, etc., who would actually pay to advertise there except adult sites? IMNSHO, no publicly owned company should touch this idea with a 10 foot pole.
