Search engines for grownups
Google is an amazing company. It is so all-pervasive that it has become a verb. It also annoys the hell out of me, and I avoid it whenever I can. No matter how annoying their interface becomes, how many weird and privacy-invading things they do, or how many crypto-religious fruitcakes they hire, they’re the only game in town for most people. I don’t like monopolies. I think monopolies are inherently evil and should be shunned by people with a conscience, or tamed by the judicial system. Since the US government is presently composed of ninnyhammers obsessed with irrelevant things, and geldings who have forgotten about the anti-trust laws, it falls to the individual to do something about it. Where is Teddy Roosevelt when you need him?
There are alternatives available. The problem is, nobody knows about them. Google dominates people’s thoughts about search the way Microsoft used to dominate people’s ideas about computers in general. Some of the alternatives are very much worth knowing about, even if you are happy with using Google.
For most people, the best alternative is Yandex.com. Yandex is the biggest player in the Russian market. It’s been around for longer than Google has, it is run by mature computer scientists who specialize in machine learning, and it is one of the best search engines you have never heard of. The English language version of their search engine is considered experimental, but the results are very good. For general search, it is as good as or better than Google. The results are uncannily accurate, and I’m really happy with how their page looks: clutter is practically nonexistent. The English language page is missing some “searchy” features at present. For example, there is no English language news aggregator (which means no news results in the basic search either). This feature exists in Russian, so I assume it is coming. Multimedia? Well, they’re not so hot here, but searching for funny pictures is a rare task for me. Google has a marginal win on maps for the US, mostly for the public transit option that works (Yandex seems OK for driving maps). The Russian language translation facilities at Yandex are, of course, excellent: much better than Google’s. As a slavophile, I find this invaluable.
One privacy advantage Yandex has which Google never will: Yandex does not do business with American intelligence agencies. I do not like the fact that Google has become an arm of US intelligence agencies. It is to their credit that Google discloses their relationship with the US government (most of Silicon Valley is in bed with the spooks, but they don’t talk about it). It is the surveillance state that I abhor. Yandex may very well be doing the same thing with the Russian government, but the FSB is a much smaller threat to American civil rights than our own spooks. While I see no imminent dangers from the all-seeing eye, and I am far from paranoid, the US is going through a weird time right now, and history is a dark and bloody subject. Do I really want the future government to know what websearches I was doing in 2010? No, thanks, tovarich.
As a crypto-academic consultant, I end up doing a lot of searches for technical papers. Google is OK at this (I have found no utility in Google Scholar; the regular search results are equivalent). Yandex actually does significantly better. Of course, these kinds of searches cast a broad net. If you have a decent idea of what you’re looking for, INSPEC is still the gold standard. You have to pay for INSPEC, or walk to a university library, but that is what serious people use for deep search in an academic subject.
Yandex does fail one important use case for me. One of the fundamental ways people get work done on computers is searching for error messages and bugs and “how-tos” on message boards. If you’re dealing with a computer problem, chances are good that someone else had the problem and asked others about it on an online forum, whether it is a compiler directive or a wonky KDE feature. This is a tremendously helpful knowledge base. Google beats everyone at this at present, mostly because you can sort by date. Close behind Google for this use is duckduckgo.com.
I have high hopes for Yandex. While Google hires a lot of rock star programmers and well known computer scientists, Google also seems unfocused and adolescent (read the takimag article for more concrete criticisms). The Yandex guys: they’re grownups. They have succeeded in a country of flinty hard men. People actually died trying to do business in Russia in the 90s; these guys made it. They’ve only been doing English for a little while, and they’re already better than Google at quite a few things. Search in Russian is much harder than search in English, as the language is strongly inflected. So, Yandex solved a much harder problem than Google did at the outset. Google wastes its time with nonsense like Google+ or attempts to bring about the “singularity” by hiring Crazy Ray Kurzweil. Meanwhile, Yandex is using its technology to assist particle physicists at CERN, which seems a bit more impressive. I’ve seen significant improvements in Yandex search results over the past few months. It is very exciting to watch a complex contraption like this improving so quickly. Consider this: they have achieved all this on revenues which are 1/60 of what Google takes in. The flabby marshmallows at Google may not be worried now, but these guys are coming for them. If I had a bunch of steel hard brainy Russian cossacks in my rear view mirror, I’d be nervous.
On a slightly different topic: one of the hardest things a technical or fact-oriented person looks for on the internets is data. Most search engines are completely useless for this type of thing. It’s really a different type of problem from ordinary search. I have only found two search engines which do this well.
One is Wolfram Alpha, which I made fun of at one point. I now find it indispensable for looking up simple facts and figures, using an English language query. It doesn’t have large amounts of data, but it’s easy to get to the data: just tell it what you need. Kudos to them for getting this right. It ain’t bad for doing integrals and such either; certainly more convenient than using some long-in-the-tooth open source computer algebra system like Axiom or Maxima. While it kind of sucked when it first came out, the suck is all gone: this is an excellent product every numerate individual should avail themselves of.
The other is quandl.com. I have been using it for only a few weeks, and don’t know how I lived without it. I had a lot less data to work with, and I went through a lot more trouble to obtain it. For quants, this is an indispensable tool for historical economic data. For datanauts in general, ditto. Before quandl, you had to scrape publicly available data from myriad websites. Post-quandl, well, it’s easy to get at, and if you register with them, you can download dynamically updated data in easily parsed CSV format all damn day. Hooray for Quandl! Please don’t sell out to a gigantor corp that will make you suck. If you must, sell out to Yandex!
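For the curious, here is roughly what “easily parsed” means in practice. This is only a minimal sketch using the Python standard library, not anything official from Quandl: the function name parse_series and the column names "Date" and "Value" are my own assumptions, so check the header of whatever file you actually download and adjust accordingly. It expects ISO dates (YYYY-MM-DD) and skips rows with a blank value.

```python
import csv
import io
from datetime import date


def parse_series(csv_text, date_col="Date", value_col="Value"):
    """Parse CSV text into a list of (date, float) pairs sorted by date.

    The column names are assumptions (hypothetical defaults); change them
    to match the header row of the file you downloaded.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        raw = (row.get(value_col) or "").strip()
        if not raw:
            continue  # skip rows where the observation is missing
        day = date.fromisoformat(row[date_col].strip())
        rows.append((day, float(raw)))
    rows.sort(key=lambda pair: pair[0])  # oldest observation first
    return rows


if __name__ == "__main__":
    import sys

    # Usage: python parse_series.py some_downloaded_file.csv
    if len(sys.argv) > 1:
        with open(sys.argv[1], newline="") as fh:
            for day, value in parse_series(fh.read()):
                print(day.isoformat(), value)
```

Once the data is a sorted list of date and value pairs, feeding it to whatever stats package you like is a one-liner.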