A few weeks ago I decided to scratch an itch I’ve been having for a while — to participate in some bug bounty programs. Perhaps the most daunting task of the bug bounty game is to pick a program which yields the highest return on investment. Soon though, I stumbled upon a web application that executes user-submitted code in a Python sandbox. This looked interesting so I decided to pursue it.

Let me out of this god-forsaken sandbox!

After a bit of poking around, I discovered how to break out of the sandbox with some hacks at the Python layer. Report filed. Bugs fixed, and a nice reward to boot, all within a couple days. Sweet! A great start to my bug bounty adventures. But this post isn’t about that report. All in all, the issues I discovered are not that interesting from a technical perspective. And it turns out the issues were only present because of a regression.

But I wasn’t convinced that securing a Python sandbox would be so easy. Without going into too much detail, the sandbox uses a combination of OS-level isolation and a locked-down Python interpreter. The Python environment uses a custom whitelisting/blacklisting scheme to prevent access to unblessed builtins, modules, functions, etc. The OS-based isolation offers some extra protection, but it is antiquated by today’s standards. Breaking out of the locked-down Python interpreter is not a 100% win, but it puts the attacker dangerously close to being able to compromise the entire system.

So I returned to the application and prodded some more. No luck. This is indeed a tough cookie. But then I had a thought — Python modules are often just thin wrappers of mountainous C codebases. Surely there are gaggles of memory corruption vulns waiting to be found. Exploiting a memory corruption bug would let me break out of the restricted Python environment.

Where to begin? I know the set of Python modules which are whitelisted for importation within the sandbox. Perhaps I should run a distributed network of AFL fuzzers? Or a symbolic execution engine? Or maybe I should scan them with a state of the art static analysis tool? Sure, I could have done any of those things. Or I could have just queried the some bug trackers.

Turns out I did not have this hindsight when beginning the hunt, but it did not matter much. My intuition led me to discovering an exploitable memory corruption vulnerability in one of the sandboxes’ whitelisted modules via manual code review and testing. The bug is in Numpy, a foundational library for scientific computing — the core of many popular packages, including scipy and pandas. To get a rough idea of Numpy’s potential as a source of memory corruption bugs, let’ check out the lines-of-code counts.