12 May 2009 | Author: James Zigrino, Head of R&D
The third wave of search?
The first wave of search saw a battle between two fundamental ideologies about how to catalogue the web:
- Human-moderated, with listings vetted by an editor - used by the likes of DMOZ and Yahoo Directory
- Computer-indexed - used by the likes of Google, AltaVista and AllTheWeb, who utilised brute computing power tempered by a little human QA
In the end the latter won hands down and Google proved to be the best of the bunch. The human-moderated approach found a new home in blogging and social media, a much more distributed and democratic form than the old web directories (anyone who ever tried to debate with a DMOZ editor will attest to this!).
These in turn got indexed by those same brute-force engines. These days, arguably the greatest achievement of the Web 2.0 world - Wikipedia - has usurped DMOZ in Google's affections as the human-edited resource of choice.
The second wave was more subtle and it is still happening right now. Ask was probably the first off the mark, but soon everyone was at it: finding relationships and aggregating similar stuff together. So a search for, say, a band would return a mix of fan pages, lyrics and official information, all carefully identified and differentiated.
On Google you can now expect to find for any given search term a demarcated mix of news, shopping, local, web and other results. In fact, the very notion of a simple, ranked SERP seems to be disappearing before our eyes. You might be tempted to think that that is the end of the story - the future of search as we know it.
And to a certain extent you'd be absolutely right - intelligently grouping data together is the Executive Summary of the search world. It works and it's hard to beat. Its current zenith could well be Google's shopping search, which intelligently gathers prices, web-crawled and Google Base content, plus reviews, together in one handy service.
However, all of the above have one flaw: they can only give you examples of what you're looking for if exactly what you want already exists somewhere out there on the web.
Let's say you're looking for a graph of population figures for Iowa for the last five years. Google can only provide you with one if someone has already created such a graph and helpfully tagged it up and described it well enough for Google to find it on a web page and return it for your search. If there is nothing like that out there waiting for you, well ... tough. You'll just have to get out the graph paper and draw it yourself.
That was, perhaps, until now.
Up to now only humans could take existing information, infer from it and maybe synthesise new conclusions, just like when your friends know what you like (existing information) and buy you something you'll find interesting for your birthday (synthesise something new - an idea for a present).
It's not quite as easy as it looks, as anyone who has been stuck for a present idea will attest. Even we humans aren't so great at it. It's one reason why all adult males have too many pairs of socks and jumpers.
But, starting slowly with Amazon's "people who bought this..." feature, recommendation engines are catching up. Now, engines like Likaholix (created by two Google engineers, Bindu Reddy and Arvind Sundararajan) or Last.fm can make pretty accurate inferences given the right data. Google's personal search is also toying around with these same concepts.
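As a rough illustration of how a "people who bought this..." feature can work, the sketch below counts how often items turn up in the same basket and suggests the most frequent companions. The baskets, the item names and the `recommend` function are all invented for this example; it is not how Amazon, Likaholix or Last.fm actually do it.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase history: one set of items per customer.
BASKETS = [
    {"socks", "jumper", "scarf"},
    {"socks", "jumper"},
    {"socks", "slippers"},
    {"jumper", "scarf"},
]


def build_co_purchases(baskets):
    """Count, for every item, how often each other item shares a basket with it."""
    counts = defaultdict(Counter)
    for basket in baskets:
        for item, other in permutations(basket, 2):
            counts[item][other] += 1
    return counts


def recommend(item, baskets, limit=2):
    """Return up to `limit` items most often bought alongside `item`."""
    companions = build_co_purchases(baskets)[item]
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(companions.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:limit]]


print(recommend("socks", BASKETS))
```

Note that this can only ever suggest things other people have already bought together, which is exactly the limitation discussed next.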
But that's still not the future. If I want a graph of the population of Iowa, I don't want to be offered a graph of crop yields in Arkansas instead, because that's what other people like me were looking for.
What I really need is a search engine that can synthesise new information from what it already has to hand to answer my exact question if there isn't an exact answer readily available. Now that would be the future. That's what the all-knowing computer in Star Trek does. That would be the third wave.
So, can computers make that kind of inference? Well, Alison Pease of the University of Edinburgh (UK) has developed an AI program called HRL (http://homepages.inf.ed.ac.uk/apease/research/hrl.html) in which software agents select and brainstorm inferences from any information they are given. And it's pretty smart. For instance, HRL independently formulated a famous mathematical proposition called Goldbach's conjecture. OK, so we humans beat HRL to it by 250 years, but it's no less remarkable for it.
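For readers unfamiliar with it, Goldbach's conjecture says that every even number greater than 2 is the sum of two primes. The sketch below simply checks that claim for small numbers; it says nothing about how HRL arrived at the conjecture, and the function names are our own.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def goldbach_pair(even_number):
    """Return one pair of primes summing to even_number, or None if none is found."""
    for p in range(2, even_number // 2 + 1):
        if is_prime(p) and is_prime(even_number - p):
            return p, even_number - p
    return None


# Check every even number from 4 to 1000 and collect any that fail.
counterexamples = [n for n in range(4, 1001, 2) if goldbach_pair(n) is None]
print(goldbach_pair(28), counterexamples)
```

Checking examples like this is the easy part. Spotting the pattern in the first place, and proving it (which nobody has yet managed), is where the real inference lies.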
Does that help us with our Iowa graph? We're getting closer: in April, Stephen Wolfram - famous for the software application Mathematica - unveiled Wolfram Alpha, a search engine that can answer quite complex questions by building answers from other, unrelated data sets.
For instance, a query such as "What is 25 million dollars in 1945 worth in 2008?" would not trouble Wolfram Alpha at all. (The answer, depending on your luck with your bank of late, will be anywhere from 300 million to - *ahem* - zero.)
And our Iowa query finally gets answered too, without fuss.
What's interesting is how Wolfram's approach takes us all the way back to the beginning of our story: their solution takes 150 full-time expert "editors" to manually feed the data into Mathematica (but from there onwards, the system does all the intelligent combining and inference without human assistance).
We can't help thinking that if we could just combine Google's reach into books, web sites and public data, and its incredible ability to intelligently collect, group and combine data into sets, with Wolfram Alpha's ability to make inferences, we really would have a killer combination.
And possibly Wolfram thinks so too: Wolfram Alpha will launch - just like Google did - with a commercial API. That will allow other companies - maybe even other search engines - to process their data with Wolfram's technology. And then the possibilities are literally endless. I for one can't wait to see what they come up with.
In the meantime, want to try Wolfram Alpha for yourself? It's launching any day now... go take a look at www.wolframalpha.com