Beyang Liu

How did it start?

What was the most compelling part of the idea?

If Google Code Search had still been around, do you think you still would have started this?

You're using PySonar and RubySonar to do deep analysis and type inferencing. In terms of the accuracy, are you seeing one language be easier than others to get right? For example, Ruby and Rails have a lot of metaprogramming patterns, as does Python.

What's your existing stack, and what kind of data volumes are you looking at?

Something like this would be super useful for companies. Have you given any thought to that?

Quinn and I went to Stanford together, worked on a bunch of class projects together, and worked at Palantir together. One day I was chatting with him at my housewarming in Corona Heights - we were discussing issues we'd experienced as programmers. That's where the discussion really started.

Programmers today are mostly reading code, making decisions about the libraries to use, and figuring out how to put those libraries together. The other tools available today don't take advantage of all the information that's available. I like to say that the right example is worth a thousand lines of documentation - an example can show you at a glance how you should use an API and reason about it.

We do things quite a bit differently than Google Code Search. It was this really fast trigram index. It was great if you had a particular expression in mind. But they didn't go through and parse the code like we do. It couldn't tell you who else uses this particular function or repository. We can. The use-case that we're targeting is: "I'm writing some code. I want to use a specific function and figure out how to use it as quickly as possible."

Go is by far the easiest, since it's both statically typed and has good libraries for introspection. PySonar does a really good job at Python. For Ruby, we're using RubySonar and YARD. The particular tools we're using are in flux, but we're going to open source them soon. As far as metaprogramming goes, it turns out that with a few heuristics, you can capture a lot of the magic that people do in Ruby and Python.

Mostly Go on AWS. There's a massive PostgreSQL instance and a massive Elasticsearch node. We also have a lot of file storage on S3. Basically, all history of every repository we've ever looked at, we store. We also have this really cool setup with Makefiles - when you're processing, analyzing, and indexing repositories, you have a lot of complex and interdependent steps. Our system does all that with Makefiles very elegantly.
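As a rough illustration of why a Makefile-style setup suits interdependent processing steps, here is a minimal Python sketch. It is not Sourcegraph's actual build rules: the step names (`clone`, `analyze`, `extract_examples`, `index`) and both helper functions are hypothetical. It only models dependency ordering and invalidation, and leaves out the file-timestamp comparison that `make` itself performs.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
# These names are illustrative, not Sourcegraph's real build targets.
PIPELINE = {
    "clone": set(),
    "analyze": {"clone"},
    "extract_examples": {"analyze"},
    "index": {"analyze", "extract_examples"},
}


def build_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def stale_steps(deps, changed):
    """Return every step that must be re-run once `changed` is rebuilt."""
    stale = {changed}
    # Walk in dependency order so staleness propagates downstream.
    for step in build_order(deps):
        if deps[step] & stale:
            stale.add(step)
    return stale


if __name__ == "__main__":
    print(build_order(PIPELINE))
    print(sorted(stale_steps(PIPELINE, "analyze")))
```

Declaring only what each step depends on, and letting the tool work out the order and what needs redoing, is the property that makes this approach attractive for a pipeline with many interdependent stages.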
We're both systems folks, so we gravitate towards those kinds of solutions. Overall we have 5 terabytes of data right now. The git repositories live on EBS for speed and cost - it would be too expensive to fetch them from S3 for every request.

We have. We've talked to Twitter, Facebook, and a bunch of other companies about private installations. Down the road, yes, we are open to it. But for now we want to focus on building an awesome product that every single programmer can use. Sourcegraph should be one of the top three applications open while you're coding.