Google Scholar was revolutionary for a number of reasons. Acharya and his team worked hard to get academic publishers to allow Google to crawl their journals. Since many of the articles unearthed by Scholar were locked behind paywalls, simply locating something in a search would not mean that a user could read it. But he or she would know that it existed, and that makes a tremendous difference. (Imagine setting off on a research project and finding out months later that someone had done the same work.) Google also pushed the paywall publishers to allow users to see abstracts of the work. The world’s biggest online archive of journal articles, JSTOR, offered only scans of articles, and had no way to separate the abstract from the whole piece. (Those accessing JSTOR through subscribing institutions could see full text.) So Scholar convinced JSTOR to allow its users to see the first scanned page of the article for free. “Often the first page has the abstract, or in older articles you have the introduction,” says Acharya, whose job title at Google is Distinguished Engineer. “That at least allows you to get a sense of it so you can decide whether you should put in additional effort.” Google Scholar will then provide the information that will help users get the complete text, whether online for free, downloaded for a fee, or in a nearby library.

(All Google users benefited from all that newly crawled information, too, as the company included those articles and books in its general search index.)

At launch, Google Scholar won wide acclaim, even from those generally skeptical about the company. Two well-known library scientists, Shirl Kennedy and Gary Price, wrote, “When big announcements come from Google and web engines, we often get nervous…. Not this time, however. This is BIG news and something that should have been around for years.” (There was some criticism, though. One complaint was that Google Scholar had no API to allow other services to access it. Others said that since Google didn’t share information like its ranking algorithm and all its sources, it fell short of a “scholarly” standard.)

Some in the research community favorably contrasted it to Google’s more controversial Book Search, which was launched at the same time. Scholar avoided the sort of copyright controversy that Book Search generated, despite the fact that the scholarly publishing world is a war zone, with an increasing number of academics lodging protests against powerful publishers who control the major journals. This is a conflict pitting profit against public good. It was the principle of open research that led Internet activist Aaron Swartz to download a corpus of JSTOR documents legally provided to MIT; the government prosecution of that act ended only with Swartz’s suicide. Google Scholar does not officially take a stand on the issue, but its implicit philosophy seems to endorse an egalitarian spread of information. In any case, when possible, Scholar tries to help negotiate around paywalls for non-subscribers by linking to articles in multiple locations; often, authors of paywalled works have free versions on their personal websites.

Originally some of the biggest publishers, determined to keep a tight grip on the academic work they typically don’t pay for (and then sell for huge sums), refused to let Google crawl their contents.

Over the years Acharya has worked hard to change their minds. “It is knocking on one door after another,” he says. “Elsevier took three, four, five years. The American Chemical Society was somewhat slower, but largely it is knocking on door after door after door.”

Acharya has kept knocking on doors, because from the very moment Scholar launched, he has been devoted to improving the product. “The first version worked well, but I was not happy with it,” he says. Working with Verstak and a small team, he has consistently added features (one particularly useful addition identifies articles related to the ones ranked for a specific search) and even expanded Scholar’s reach to ambitious new realms, most notably judicial case law in 2009. (This was described as “a shot across the bow of the multi-billion dollar legal publishing business,” which previously controlled that public information.) Acharya’s role spans not only engineering but operations, partner relations, library liaison, contracts, and evangelism.

The engineering isn’t an afterthought, though. A lot of artificial intelligence is necessary to keep improving the system. For instance, Acharya and Verstak got a patent for “Identifying a primary version of a document.” (By the way, I found this factoid by using Google Scholar.)

Another innovation of Scholar has been its ability to correctly identify the authors of books and papers, an important feature for those interested in the work of a specific researcher. “Scholarship tends to have a lot of authors named as ‘Jay Smith’; there are a lot of Jay Smiths out there,” he says. “And if you think that’s an easy problem, think of the name Huang: there are about 200 Chinese last names that cover 95% of authors.” Google tackles this problem by creating clusters of papers that are likely to be written by the same individual and, for the last step, asks the actual authors (who almost inevitably use the service) to identify which groups of papers are theirs. Asking users directly to create search results seems very un-Googley, but as Acharya says, “We can’t automatically solve this problem entirely, so we just give you a list of clusters, you say, ‘These are mine,’ and you are done. The rest is automated.” Knowing who the authors are, Google can create profiles of where they fit into academia: who their coauthors are, who they have cited, who has cited them.
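For readers who want a feel for the cluster-then-confirm idea, here is a minimal sketch in Python. It is not Google's method, which the company has not published in this article; it simply assumes that two papers under the same author name probably belong to the same person if they share a coauthor. The function names (`cluster_papers`, `build_profile`) and the sample papers are invented for illustration.

```python
from collections import defaultdict


def cluster_papers(papers):
    """Group papers bearing one author name into likely same-person clusters.

    Assumption for this sketch: papers that share at least one coauthor
    (directly or through a chain of other papers) belong together.
    """
    parent = list(range(len(papers)))  # union-find forest over paper indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Index papers by coauthor name, then merge every paper in each bucket.
    by_coauthor = defaultdict(list)
    for idx, paper in enumerate(papers):
        for name in paper["coauthors"]:
            by_coauthor[name].append(idx)
    for indices in by_coauthor.values():
        for other in indices[1:]:
            union(indices[0], other)

    clusters = defaultdict(list)
    for idx, paper in enumerate(papers):
        clusters[find(idx)].append(paper["title"])
    # Sort for a stable, readable listing to show the author.
    return sorted(sorted(titles) for titles in clusters.values())


def build_profile(clusters, claimed):
    """The human step: the author says which clusters are 'mine'."""
    return sorted(title for i in claimed for title in clusters[i])


# Hypothetical papers, all credited to the same ambiguous name.
papers = [
    {"title": "Graph ranking", "coauthors": {"A. Lee", "B. Kim"}},
    {"title": "Citation indexing", "coauthors": {"B. Kim"}},
    {"title": "Protein folding", "coauthors": {"C. Diaz"}},
    {"title": "Enzyme kinetics", "coauthors": {"C. Diaz", "D. Roy"}},
    {"title": "Medieval trade", "coauthors": set()},
]

clusters = cluster_papers(papers)
profile = build_profile(clusters, claimed=[0, 2])
print(clusters)
print(profile)
```

The automated pass does the bulk of the grouping; the single-author paper with no coauthors is exactly the kind of case a heuristic like this cannot place, which is where asking the author comes in.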

Acharya’s continued leadership of a single, small team (now consisting of nine) is unusual at Google, and not necessarily seen as a smart thing by his peers. By concentrating on Scholar, Acharya in effect removed himself from the fast track at Google. He was one of a number of amazingly talented Ph.D. engineers who joined the company around 2000, and some of them are still doing work vital to Google’s core, pushing boundaries of computer science and artificial intelligence. He has the engineering chops to work with them. But he can’t bear to leave his creation, even as he realizes that at Google’s current scale, Scholar is a niche.

“I didn’t have the confidence that if I left it behind it would continue to be what I want it to be,” he says. “Normally you leave projects behind, because you do the next interesting thing. This seemed just too important to let my desire for a new project drive what I did next.”

Only at Google, of course, would the world’s most popular scholarly search service be seen as a relative backwater. Acharya isn’t permitted to reveal how big Scholar’s index is, though he does note that it’s an order of magnitude bigger than when it started. He can also say, “It’s pretty much everything: every major to medium size publisher in the world, scholarly books, patents, judicial opinions, small, most small journals…. It would take work to find something that’s not indexed.” (One serious estimate places the index at 160 million documents as of May 2014.) But like it or not, the niche reality was reinforced after Larry Page took over as CEO in 2011 and adopted an approach of “more wood behind fewer arrows.” Scholar was not discarded (it still commands huge respect at Google, which, after all, is largely populated by former academics), but it was clearly shunted to the back end of the quiver. Not only was Scholar missing from the list of top services (Image Search, News, etc.), but it was also bumped from the menu promising “more” services like Gmail and Calendar. Its new place was a menu labeled “even more.”

Asked who informed him of what many referred to as Scholar’s “demotion,” Acharya says, “I don’t think they told me.” But he says that the lower profile isn’t a problem, because those who do use Scholar have no problem finding it. “If I had seen a drop in usage, I would worry tremendously,” he says. “There was no drop in usage. I also would have felt bad if I had been asked to give up resources, but we have always grown in both machine and people resources. I don’t feel demoted at all.”

Acharya is now 50. He’s excited about adding new features to Scholar: improving the “alerts” function and other forms that help users discover information important to them that they might not know is out there. Would he want to continue working on Scholar for another ten years? “One always believes there are other opportunities, but the problem is how to pursue them when you are in a place you like and you have been doing really well. I can do problems that seem very interesting to me, but the biggest impact I can possibly make is helping people who are solving the world’s problems to be more efficient. If I can make the world’s researchers ten percent more efficient, consider the cumulative impact of that. So if I ended up spending the next ten years doing this, I think I would be extremely happy.”

That satisfaction seems plenty for Acharya, especially when he thinks of the millions of people — everywhere from rural India to Mountain View, California — who have the world’s scholarship at their fingertips, for free. But will Google itself spring for at least a doodle on November 18, when Scholar turns ten?