It’s funny how a set of instructions – an algorithm – written by people can come to be granted, by those same people, a superhuman authority. As if a machine fashioned by man should, upon trembling into motion, shed its earthly origin and assume a god-granted imperium, beyond our small-minded questioning.
Last week, CNET’s Elinor Mills reported on how a web search for “Martin Luther King” returns, as its first result on Google and as its second result on Windows Live Search, a web site (martinlutherking.org) operated by a white supremacist organization named Stormfront. The site, titled “Martin Luther King Jr.: A True Historical Examination,” refers to King as “The Beast” and says he was “just a sexual degenerate, an America-hating Communist, and a criminal betrayer of even the interests of his own people.” The site also features an essay on “Jews & Civil Rights” by former Ku Klux Klan official David Duke.
What’s remarkable, though, is not that a search algorithm might be gamed by extremists but that the owners of the algorithm might themselves defend the offensive result – and reject any attempt to override it as an assault on the “integrity” of their system. AOL, because it subcontracts its search results to Google, finds itself in the uncomfortable position of promoting the white supremacist site to its customers. In response to an inquiry from CNET, the company was quick to distance itself from the search result and to place the responsibility for it on Google:
AOL spokesman Andrew Weinstein said the company has contacted Google about the Martin Luther King search results. “We get all of our organic search results from Google, as you know, so we don’t set the algorithms by which they are ranked. Although we can’t micro-manage billions of search results, our users would not expect this to be the first result for that common search, and we do not want to promote the Web sites of hate organizations, so we have asked Google to remove this particular site from the results it provides to us.”
That seems like an entirely reasonable position. Clearly, a white supremacist site is not the site that any rational person would consider an appropriate recommendation for someone looking for information on a black civil-rights leader. But Google doesn’t seem to agree. In fact, in responding to CNET, it defends the King result as being “relevant to the query” and suggests that it is evidence of the integrity of the Google PageRank algorithm:
At Google, a Web site’s ranking is determined by computer algorithms using thousands of factors to calculate a page’s relevance to any given query, a company representative said. The company can’t tweak the results because of that automation and the need to maintain the integrity of the results, she said. “In this particular example, the page is relevant to the query and many people have linked to it, giving it more PageRank than some of the other pages. These two factors contribute to its ranking,” the representative wrote in an e-mail.
A Microsoft spokesman is even more explicit in asserting that the King result is a manifestation of algorithmic “integrity”:
The results on Microsoft’s search engine are “not an endorsement, in any way, of the viewpoints held by the owners of that content,” said Justin Osmer, senior product manager for Windows Live Search. “The ranking of our results is done in an automated manner through our algorithm which can sometimes lead to unexpected results,” he said. “We always work to maintain the integrity of our results to ensure that they are not editorialized.”
By “editorialized” he seems to mean “subjected to the exercise of human judgment.” And human judgment, it seems, is an unfit substitute for the mindless, automated calculations of an algorithm. We are not worthy to question the machine we have made. It is so pure that even its corruption is a sign of its integrity.
Nick, I think all the evidence that your blame for this episode is misplaced can be found in these comments. When those who agree with you suggest a “Bury This!” option, which would effectively let any group hide any story from the public, I think it’s clear why editorializing needs to be kept separate from the underlying data whenever possible. By all means have a separate filter layer on top, but as search engines become the way we find information on the web, we must make every effort to keep them inclusive at their lowest level.
Responding to mturro… the issue isn’t whether it should be censored (as you and others note). The issue is whether this is even remotely the “most relevant” result for a search for “martin luther king.”
I think it’s safe to say that any mainstream editor would include hate speech (and FBI wiretaps, etc.) as relevant to a discussion of Martin Luther King, Jr., but I don’t think any would believe these were close to the top of relevance. (Jim Crow laws, segregation, and other forms of opposition to MLK would be much more relevant as the anti-positions… hate speech from a neo-Nazi organization currently in existence would be way down the list).
So, to me, as we ascribe a certain amount of importance to Google societally, it’s important that we understand its limitations… and I also think it’s prudent for Google to acknowledge (and continually improve) their limitations.
Software is being asked to take on more and more “advisory” (editorial?) roles… we in the industry (I am one of those) need to understand its limitations, so that we can eliminate them (or work around them with… God forbid… humans…)
Obviously I’m not advocating that Google stop trying to improve its search algorithms… of course they should. What I object to is the idea that they should make modifications in order to eliminate a specific offensive result from a particular search. That is introducing an editorial perspective to search… ascribing to it a function that is best left to a human being. Fortunately a human is most likely the one who initiated the search to begin with and it falls upon that person to deal with those editorial decisions and to decide what they accept as good or bad information… not the machine or its inventor.
As for the relevance of hate to the life and work of MLK… why would the distant history of Jim Crow or segregation be more relevant than the explicit ignorance of a group espousing hateful ideas now? I would think that the blinding hate displayed by the site in question is perhaps the best example… here and now… in the flesh… of what King was dealing with. The site serves as an extremely visible reminder that his struggle is not HISTORY but real and relevant TODAY.
This shows the great ability Google has to check political facts as Eric Schmidt claimed (see Techdirt).
I made a small cartoon.
Bye,
Oliver
Unfortunately, we seem to be missing the basic end-user rule of modern capitalism: caveat emptor. Google, Microsoft, et al. don’t -care- about truth or integrity, except as it impacts their basic business (making money for their shareholders). In this case, they get more kudos from being “impartial”, as indeed they are.
And if the white dudes behind martinlutherking.org can play the game better than anyone else, then surely that is not Google’s “fault”. No one is to “blame”. It is another sign of Adam Smith’s invisible hand in action. Revel in it, folks! This is what has made the Internet the place it is today.
(BTW, when I just did a search, the site is now #1. Free advertising from this column. Ta da!)
-mark.
The first issue is why this site name was not yanked by the powers that be (is it still ICANN?). Second, the responses by Google and Microsoft seem kind of silly. Why can’t the algorithm be tweaked? They have to know that this is not the info people are looking for.
If I searched for info on Dr. King and this was the first site, or even the 100th site, that popped up, I would be PO’d. How is the opinion of a fringe group (I hope!) useful to me when I’m looking for legit info on King? They should get off their high horse and take another look at their algorithms.
There are two separate questions: How can algorithmically produced results be questioned? And what does integrity mean with regard to search engine results?
I think the answer to the first cannot be that old simplistic complaint about ascribing god-like authority to computers. What should be questioned is the algorithm and the issue is that we don’t know how it works, so we cannot question it.
The other question about the integrity of search engine results cannot be answered by asking for the morality of its content. But you are right to question whether a racist pamphlet is the most relevant information about Martin Luther King. Whether it is the most relevant or not, however, depends on the intentions of the person who does the searching. This person, like the people who linked to the racist page, might be interested in racist attacks on Martin Luther King. Or they might be more interested in a biography of King, which is what you get when you check Wikipedia.
And there we are right at the center of the problem that search engines try to solve, which is to figure out what question the searcher has in mind based on the words he enters. In the case of Google, the question turns out to be whether the number of links to a page (and other criteria) is a good enough match for most searchers’ intentions.
Google’s definition of integrity is that nobody has tricked the algorithm into ranking a page higher than it would otherwise be. I think to accept this definition doesn’t mean to ascribe god-like authority to the algorithm. However, we should be able to question the algorithm. We should ask for it to be made public so it can be subjected to public scrutiny. But to ask for an ethics filter is not very different from asking for censorship. Who does the censoring? Google? Mr. Bush? The Chinese government? You?
At the heart of Google’s algorithm is the assumption that a link is a positive indication of a site’s value. This idea stems from the academic practice of citations. When you write an academic paper, you cite references to other papers in order to build upon the author’s previous work.
This works very well in academia because everyone is an adult and there is no benefit to be gained by gaming the system. Google uses and extends this citation idea to create its page ranking of importance. Thus, you could conclude that PageRank is broken.
Having a human intervene in this process is anathema to Google because a) it believes everything can be resolved to maths and b) intervening gets you into political arguments and accusations of bias. Earlier in these comments someone mentioned that Google had ‘fixed’ some results for heart attacks and suicide. This was a foolish thing done for good reasons.
Issues like this will continue to arise until someone cracks the semantic issue of understanding the intent behind a search request, the holy search grail of natural language search. When someone simply types “Martin Luther King” into a search engine, there is no indication of the context in which the search should occur, so you get what you consider are inappropriate results.
In summary, PageRank is not broken; you need to tell Google (and MSN) more about what you are looking for.
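The citation idea described in this comment, that a link counts as a vote for the page it points to, can be sketched in a few lines. This is only a toy illustration of the published PageRank idea, not Google’s actual ranking system, which by the company’s own account uses many other factors. The function name, the page names, the damping value of 0.85 and the iteration count are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: two pages link to a third, for whatever reason.
toy_web = {"blog_a": ["disputed_site"], "blog_b": ["disputed_site"], "disputed_site": []}
scores = pagerank(toy_web)
top_page = max(scores, key=scores.get)
```

Note that the calculation sees only that a link exists. It has no notion of whether the linking page was praising or condemning its target, which is the gap between a link and an academic citation that this comment is pointing at.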
While the white supremacist ranking is disturbing, it should be pointed out that it comes out this way partly because of the habit of referring to Martin Luther King _Junior_ as simply MLK. Type in his full name, and the first hit in rankings is the one at Stanford.
To me this just points up that you have to be careful what you’re asking for when you do a search. IOW, do a sloppy search and you’re not necessarily going to get an accurate result.
Seems to me that we like to shift blame whenever we can. The only problem with doing this is that although it may make us feel better about ourselves in the short term, in the long term it makes us weaker and hands over the power to some other ‘thing’ that is causing us grief and is out of our control, or is controlling us.
We can’t chop and change to make these searches personalised – Bertram Brookes understood this with his fundamental problem of information science. So rather than change the system to be all things to all people, make people understand the system (or become information literate) so they can use these tools with confidence. A quick evaluation of the authority and agenda of the MLK site provides us as much ‘information’ as the content itself, and is a valid consideration in any research on this topic.