Andrew Ng wants to bring deep learning – an emerging computer science field that seeks to mimic the human brain with hardware and software – into the DIY era.

Last year at Google he built a computerized brain that worked as a cat detector. It used a roughly 1-billion-connection network trained on 1,000 computers to teach itself how to spot cat videos on YouTube. While this worked well, Ng says, some researchers walked away thinking, "If I don't have 1,000 computers, is there still any hope of my making progress on deep learning?" The system cost roughly $1 million.

“I was quite dismayed at this, particularly given that there are now a few other computer science research areas where a lot of the cutting-edge research is done only within giant companies,” he recalls. “Others simply don't have the resources to do similar work.”

On Monday, he's publishing a paper that shows how to build the same type of system for just $20,000 using cheap but powerful graphics processing units, or GPUs. It's a sort of DIY cookbook on how to build a low-cost neural network. He hasn’t yet decided whether the code for the model will be open sourced, but the new paper gives enough detail for people with sufficient coding brawn to build their own faux brains.

"I hope that the ability to scale up using much less expensive hardware opens up another avenue for everyone around the world," he says. "That's the reason I'm excited – you can now build a 1-billion-connection model with $20,000 worth of hardware. It opens up the world for researchers to improve the performance of speech recognition and computer vision."

Down the line, this research on souped-up versions of neural networks running on GPUs could give rise to more powerful – and financially lucrative – GPU-based applications at large tech companies.

Built by companies such as Nvidia and AMD, GPUs power the graphics card on your PC or video game console. But about a decade ago, computer scientists started to realize that they were also really good for doing certain types of mathematical calculations.

“GPUs are so incredibly powerful,” says David Anderson, a computer scientist at Berkeley. “Programs that previously ran on supercomputers, we’re now realizing we can rewrite to run on GPUs at a fraction of the price.” His team at Berkeley recently rejigged BOINC, a volunteer computing platform, so that it can run on GPUs. BOINC helps scientists analyze astronomical and biomedical data.

Already, universities and companies like Google, Shazam, Salesforce, Baidu and imgix are using these graphics chips to meet their ever-expanding computing needs, performing tasks as varied as voice recognition, quantum chemistry, and molecular modeling.

For this new research, Ng's team also built a super-sized, 11-billion-connection version of the cat detector for roughly $100,000. He wants to build a high-performance computer that will allow researchers who don’t have the deep pockets of some of these companies and universities to do research on deep learning. It's a bit like what Apple and Microsoft did for personal computing or what cheaper sequencing hardware did for genomics. Both democratized technologies that were inaccessible to many.

The Google Cat experiment ran on 1,000 computers with 16,000 CPUs. Ng’s group distributed their beefed-up, low-cost model, including the database of images on which it was trained, across 64 Nvidia GPUs on 16 computers and used special hardware to connect them in order to minimize the time required for these different modules to communicate with one another.
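
For a rough sense of scale, the figures quoted above can be put side by side. The short Python sketch below is only back-of-the-envelope arithmetic on the article's rounded numbers. The variable names are illustrative, and it assumes the GPUs and CPUs were spread evenly across machines, which the article does not state.

```python
# Back-of-the-envelope comparison using the rounded figures quoted in the article.
# All names are illustrative; an even split across machines is an assumption.

google_machines = 1_000
google_cpus = 16_000
google_cost_usd = 1_000_000
google_connections = 1_000_000_000

gpu_machines = 16
gpus_total = 64
small_model_cost_usd = 20_000       # 1-billion-connection GPU model
big_model_cost_usd = 100_000        # 11-billion-connection GPU model
big_model_connections = 11_000_000_000


def connections_per_dollar(connections, cost_usd):
    """Crude price-performance figure: model connections per dollar of hardware."""
    return connections / cost_usd


gpus_per_machine = gpus_total // gpu_machines
cpus_per_google_machine = google_cpus // google_machines

google_cpd = connections_per_dollar(google_connections, google_cost_usd)
small_cpd = connections_per_dollar(google_connections, small_model_cost_usd)
big_cpd = connections_per_dollar(big_model_connections, big_model_cost_usd)

print("GPUs per machine:", gpus_per_machine)
print("CPUs per Google machine:", cpus_per_google_machine)
print("Small GPU model vs. Google, connections per dollar:", small_cpd / google_cpd)
print("Big GPU model vs. Google, connections per dollar:", big_cpd / google_cpd)
```

Connections per dollar is a crude yardstick: it says nothing about training time or accuracy, only about how much model the hardware budget buys.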

Ng is excited about this progress, but he admits there’s still work to be done. The new model is not that much smarter – or faster – than the original cat detector even though its neural net has a whopping 11 billion connections, or roughly 10 times as many as its predecessor.

Plus, there are questions as to how easily Ng’s new model could be ported to other applications, given that his group had to devise specialized hardware and software to make it work.

“The infrastructure seems to be particular to their specific unsupervised learning algorithm. The useful algorithms for training these networks, like the supervised algorithms that we use, and the one Google uses to train their photo-tagger, are much harder to parallelize,” wrote NYU’s Yann LeCun, one of the pioneers of deep learning, in an email interview.

There are also issues with using GPUs that need to be worked out. Although Google is trailblazing into the GPU space, most large technology companies have not invested heavily in graphics chips because using them in the cloud can be complicated. CPUs are better at sharing computing resources and can switch easily between several jobs, but the technology to do that on GPUs is not yet mature, says Ng. Plus, running jobs on GPUs requires specialized code.

“[GPUs] are simply being co-opted by machine learning and AI researchers for a different purpose. So it’s not exactly a natural fit,” wrote Bruno Olshausen, a computational neuroscientist and the director of the Redwood Center for Theoretical Neuroscience at the University of California, Berkeley, in an email. “If we really want to make progress in building intelligent machines, then we will need to direct our efforts to build new types of hardware that are specifically adapted for neural computation.” Olshausen is currently working on this problem as part of an ongoing multi-university research project.