Monday, December 30, 2013

Enter "Search"

You can throw around some impressive numbers talking about the Web these days: a trillion Web pages (so says Wired founder Kevin Kelly), and as of this writing 1.59 billion of them indexed on search engines. Google, of course, is the story here--as much today as a decade ago. The company's founders debuted their "BackRub" search engine on Stanford University's servers back in the late 1990s, and within a year the software was VC funded and moving out of its academic roots and into commercial tech stardom. Since then, many of the needle-in-haystack worries about finding information on the exponentially growing World Wide Web have become largely otiose. Why? Because, generally speaking, Google works.

But like many great ideas, the Google recipe for Web search is somewhat paradoxical. On the one hand, Google--as a company and as a search technology--is the paragon of science, engineering, and numbers. Indeed, the math-and-science ethos of Google is part of its corporate culture. Visit the Googleplex--the sprawling campus in Mountain View, California, where Google is headquartered--and you'll get a sense that everything from employee work schedules to seemingly cosmetic changes on its homepage to geek-talk about algorithms is subject to testing, to numbers. Google is data-driven, as they say. Data is collected about everything--both in the company and on the Web--and then analyzed to figure out what works. Former CEO Eric Schmidt remarked once, tellingly, about his company that "in the end, it's all just counting." And it is, of course.

On the other hand, though, what propelled Google to stardom as a search giant (and later as an advertising force) was the original insight of founders Larry Page and Sergey Brin--two Stanford computer science students at the time, as we all now know--that it's really people, and not purely data, that make Google shine. PageRank, named after its inventor Larry Page, is what made Page's pre-Google "BackRub" system so impressive. PageRank didn't process the words and data on Web pages; it processed links, the HTML back-links that connect Web page to Web page and make the Web, well, a "web."

Page's now-famous insight came from his academic interest in the graph-theoretic properties of collections of academic journal articles connected via author references, where the quality of a particular article could be judged by (roughly) examining references to it from articles whose authors had known authority and credentials on the same topic. Page simply imagined the then-nascent World Wide Web as another collection of articles (here: Web pages) and the HTML links connecting one to the other as the references. From there, the notion of "quality" implicit in peer-reviewed journals could be imported into the Web context, and he had the germ of a revolution in Web search.
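
For readers who like to see the idea in code, here is a minimal sketch of link-based ranking in Python. It is not Google's actual implementation. It is the textbook power-iteration version of PageRank, and the toy link graph, the page names, the damping value of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by who links to them (textbook power iteration).

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's settings.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web: three pages "reference" stanford.edu.
toy_web = {
    "stanford.edu": ["berkeley.edu"],
    "berkeley.edu": ["stanford.edu"],
    "fan-page": ["stanford.edu"],
    "blog": ["stanford.edu", "fan-page"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:15s} {score:.3f}")
```

In this toy Web the page that collects the most references, stanford.edu, comes out on top, and a link from a well-referenced page counts for more than a link from an obscure one. That is the journal-citation intuition carried over to Web pages.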

Of course it worked, and almost magically well. When Page (and soon Brin) demo'd the BackRub prototype, simple queries like "Stanford" or "Berkeley" would return the homepages of Stanford University or the University of California at Berkeley. (Yes, that's pretty much it. But it worked.) It seems a modest success today, but at the time, Web search was a relatively unimportant, boring part of the Web that used word-frequency calculations to match relevant Web pages to user queries. Search worked okay this way, but it wasn't very accurate and it wasn't very exciting. Scanning through pages of irrelevant results was commonplace.
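
To make the older approach concrete, here is a rough sketch of word-frequency matching in Python. It is a deliberately naive illustration and not the code of any real search engine of the period; the function names, the page texts, and the example.com-style addresses are all made up.

```python
from collections import Counter


def term_frequency_score(query, text):
    """Count how often the query's words appear in a page's text."""
    counts = Counter(text.lower().split())
    return sum(counts[term] for term in query.lower().split())


def keyword_search(query, pages):
    """Return URLs of matching pages, highest word count first."""
    scored = [(term_frequency_score(query, text), url) for url, text in pages.items()]
    # Sort by descending score, then by URL so ties come out in a stable order.
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [url for score, url in ordered if score > 0]


# Hypothetical pages: one real homepage, one page stuffed with the keyword.
pages = {
    "stanford.edu": "stanford university official homepage",
    "spam.example": "stanford stanford stanford cheap tickets stanford",
    "berkeley.edu": "university of california berkeley",
}

results = keyword_search("stanford", pages)
print(results)
```

Because this kind of scoring only counts words, the page that repeats "stanford" most often outranks the university's own homepage. Nothing in the calculation knows which page people actually consider authoritative.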

Most technologists and investors of the day therefore pictured search technology as a mere value-add to something else, and not a standalone application per se. The so-called portal sites like Yahoo!, which used human experts to collect and categorize Web pages into a virtual "mall" for Web browsers and shoppers, were thought to be the present and future of the Web. Search was simply one of the offerings on these large sites.

But Page and Brin captured the human element that Yahoo! used to classify Web pages much more powerfully, and they did it algorithmically--by computer code--leveraging human smarts about quality to rank Web pages. And this is the central paradox--while Google became the quintessential "scientific" company on the Web, it leaped to stardom with an insight that was all too human--people, not computers, are good at making judgments about content and quality. And of course, with this insight, the little BackRub system bogging down Stanford's servers quickly became the big Google search giant. Suddenly, almost overnight, search was all the rage.

Putting it a bit anachronistically, then, you could say Google was, from the beginning, a social networking technology--or at least a precursor. The idea that the intelligence of people can be harnessed by computation led to more recent tech "revolutions" like Web 2.0. For instance, in tagging systems like del.icio.us (later acquired by Yahoo!), users searched people-generated tags or "tagsonomies" of Web pages. Tagging systems were a transitional technology between the "Good Old Fashioned Web" of the late 1990s, with its portal sites (like Yahoo!) and boring keyword search, and a more people-centered Web where what you find interesting (by "tagging" it) is made available for me, and you and I can then "follow" each other when I discover that you tag things I like to read. Once this idea catches on, social networking sites like MySpace and later Facebook are, one might say, inevitable.

So by the mid-2000s, user-generated content (UGC) sites like the earlier del.icio.us, a host of user-driven or "voting" sites like Digg (where you could vote for or "digg" a Web page submitted on the site), and large collaboration projects like Wikipedia were simply transforming the Web. Everywhere you looked, it seemed, people were creating new and often innovative content online. As bandwidth increased, visual media sites for sharing photos and videos (e.g., YouTube) emerged and, within what seemed like months, became major Web sites. And as Web users linked to all of this UGC, and Google's servers indexed it, and its PageRank-based algorithms searched it by exploiting the human links, Google's power grew to almost Herculean proportions. Like the Sci-Fi creature that gets stronger from the energy of the weapons you use to shoot it, every fad or trend or approach that took fire on the Web translated ineluctably into a more and more powerful Google. By the end of the 2000s, it seemed every person on the planet with an Internet connection was "googling" things on the Web, to the tune of around 100 billion searches per month.

Excepting, perhaps, the idea of a perfect being like God, every other idea has its limits, and Google is no exception. Enter, again, our troubling question: how, if the Web is driven increasingly by human factors, and Google leverages such factors, can Google be making us stupid (as Carr puts it)? Why need we be assured we're not "gadgets" (as Lanier puts it)? If all this tech is really about people anyway, what gives? "What gives?" is a good way of putting things, and it's to this question that we now turn.
