
ChocoPy is a programming language designed for classroom use in undergraduate compilers courses. It is a restricted subset of Python 3 that can easily be compiled to a target such as RISC-V. The language is fully specified using a formal grammar, typing rules, and operational semantics. ChocoPy is used to teach CS 164 at UC Berkeley. It was designed by Rohan Padhye and Koushik Sen, with substantial contributions from Paul Hilfinger.

At a glance, ChocoPy is:

Familiar: ChocoPy programs can be executed directly in a Python (3.6+) interpreter. ChocoPy programs can also be edited using standard Python syntax highlighting.

Safe: ChocoPy uses Python 3.6 type annotations to enforce static type checking. The type system supports nominal subtyping.

Concise: A full compiler for ChocoPy can be implemented in about 12 weeks by undergraduate students of computer science. This can be a hugely rewarding exercise for students.

Expressive: One can write non-trivial ChocoPy programs using lists, classes, and nested functions. Such language features also lead to interesting implications for compiler design.

Bonus: Due to static type safety and ahead-of-time compilation, most student implementations outperform the reference Python implementation on non-trivial benchmarks.
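To make the "Safe" point a little more concrete, here is a minimal sketch of what nominal subtyping means in practice. The classes `Animal`, `Dog`, and `Robot` and the function `describe` are hypothetical names invented for this illustration; they do not come from the ChocoPy materials.

```python
# Hypothetical classes for illustration only
class Animal(object):
    def sound(self: "Animal") -> str:
        return "..."

class Dog(Animal):
    # Overrides Animal.sound; Dog is a declared subclass of Animal
    def sound(self: "Dog") -> str:
        return "Woof"

class Robot(object):
    # Same method shape as Animal, but no declared inheritance relationship
    def sound(self: "Robot") -> str:
        return "Beep"

def describe(a: Animal) -> str:
    return a.sound()

pet: Animal = None

pet = Dog()            # Accepted: Dog is declared as a subclass of Animal
print(describe(pet))   # Prints "Woof"

# Under nominal subtyping, the assignment below should be rejected at
# compile time, because Robot does not inherit from Animal even though
# it defines a compatible sound() method:
# pet = Robot()
```

A plain Python interpreter runs this file without checking the annotations, so the commented-out assignment would only be flagged by a static checker that follows the declared class hierarchy.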

Try ChocoPy

    # Search in a list
    def contains(items:[int], x:int) -> bool:
        i:int = 0
        while i < len(items):
            if items[i] == x:
                return True
            i = i + 1
        return False

    if contains([4, 8, 15, 16, 23], 15):
        print("Item found!")  # Prints this
    else:
        print("Item not found.")

SPLASH-E Paper

Rohan Padhye, Koushik Sen, and Paul N. Hilfinger. 2019. ChocoPy: A Programming Language for Compilers Courses. In Proceedings of the 2019 ACM SIGPLAN SPLASH-E Symposium (SPLASH-E ’19), October 25, 2019, Athens, Greece. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3358711.3361627

Download: Paper PDF (1MB) | Slides PDF (90MB)

Teaching Resources

The following resources are available for conducting a compilers course with ChocoPy:

A language reference manual, complete with formal grammar, typing rules, and operational semantics.

An implementation guide describing calling conventions and runtime support in RISC-V.

A Java-based framework for conducting three Programming Assignments (PAs):

  1. Lexing and parsing ChocoPy into Abstract Syntax Trees (ASTs)
  2. Semantic analysis and type checking on ASTs
  3. Code generation from typed ASTs to RV32IM assembly

The framework uses JSON as an intermediate representation for piping results between compiler stages. The framework includes auto-grading support. All artifacts are platform independent and only require Java 8+ and Apache Maven.

A modified version of the Venus simulator for executing RISC-V programs. Venus can be deployed to the web (for online debugging) or compiled to the JVM (for testing and auto-grading).

A Java-based reference compiler that compiles ChocoPy programs to auto-documented RISC-V assembly. Students have access to a binary version of this compiler, for use in debugging their own assignments. The reference compiler also powers the online examples on this web page.

These resources can be made available to instructors upon request.
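As a rough illustration of the idea of piping JSON between compiler stages, the sketch below passes a tiny AST from a stand-in "parser" to a stand-in "type checker" as JSON text. The node layout (`kind`, `operator`, `left`, `right`, `value`, `inferredType`) and the function names are assumptions made up for this example; they are not the framework's actual schema or API.

```python
import json

# Stage 1 (stand-in for a parser): emit a tiny AST for `1 + 2` as JSON text.
# The field names here are hypothetical, not the framework's real schema.
def parse_stage() -> str:
    ast = {
        "kind": "BinaryExpr",
        "operator": "+",
        "left": {"kind": "IntegerLiteral", "value": 1},
        "right": {"kind": "IntegerLiteral", "value": 2},
    }
    return json.dumps(ast)

# Annotate each node with an inferred type (only two node kinds are handled).
def annotate(node: dict) -> dict:
    if node["kind"] == "IntegerLiteral":
        node["inferredType"] = "int"
    elif node["kind"] == "BinaryExpr":
        annotate(node["left"])
        annotate(node["right"])
        both_int = (node["left"]["inferredType"] == "int"
                    and node["right"]["inferredType"] == "int")
        node["inferredType"] = "int" if both_int else "object"
    return node

# Stage 2 (stand-in for a type checker): read JSON, annotate, write JSON.
def typecheck_stage(text: str) -> str:
    return json.dumps(annotate(json.loads(text)))

# Because each stage consumes and produces plain text, stages can be
# developed, tested, and graded independently of one another.
typed = json.loads(typecheck_stage(parse_stage()))
print(typed["inferredType"])
```

One appeal of a text-based intermediate form like this is that the output of one stage can be saved to a file and compared against an expected result, which is presumably part of what makes auto-grading straightforward.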

Language Features

Static Type Checking

    # A broken program
    def is_even(x:int) -> bool:
        if x % 2 == 1:
            return 0      # FIXME
        else:
            return True

    print(is_even("3"))   # FIXME
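For comparison, here is one possible repair of the program above, addressing the two lines marked FIXME. This is a sketch of the most direct fix, not necessarily the version the authors had in mind.

```python
# A repaired version of the broken program
def is_even(x:int) -> bool:
    if x % 2 == 1:
        return False   # Was `return 0`: an int where a bool is declared
    else:
        return True

print(is_even(3))      # Was is_even("3"): a str where an int is expected
```

A plain Python interpreter only discovers the problem in the broken version when the call is executed (the expression `"3" % 2` raises a `TypeError` at run time), whereas a static type checker can report both mistakes before the program runs.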

Nested functions with global/nonlocal variables

    # Compute x**y
    def exp(x: int, y: int) -> int:
        a: int = 0
        global invocations  # Count calls to this function

        def f(i: int) -> int:
            nonlocal a
            def geta() -> int:
                return a
            if i <= 0:
                return geta()
            else:
                a = a * x
                return f(i-1)
        a = 1
        invocations = invocations + 1
        return f(y)

    invocations:int = 0
    print(exp(2, 10))
    print(exp(3, 3))
    print(invocations)

Lists, classes, and dynamic dispatch

    # A resizable list of integers
    class Vector(object):
        # Attributes
        items: [int] = None
        size: int = 0

        # Constructor
        def __init__(self:"Vector"):
            self.items = [0]

        # Returns current capacity
        def capacity(self:"Vector") -> int:
            return len(self.items)

        # Increases capacity of vector by one element
        def increase_capacity(self:"Vector") -> int:
            self.items = self.items + [0]
            return self.capacity()

        # Appends one item to end of vector
        def append(self:"Vector", item: int):
            if self.size == self.capacity():
                self.increase_capacity()
            self.items[self.size] = item
            self.size = self.size + 1

    # A faster (but more memory-consuming) implementation of vector
    class DoublingVector(Vector):
        doubling_limit:int = 16

        # Overriding to do fewer resizes
        def increase_capacity(self:"DoublingVector") -> int:
            if (self.capacity() <= self.doubling_limit // 2):
                self.items = self.items + self.items
            else:
                # If doubling limit has been reached, fall back to
                # standard capacity increases
                self.items = self.items + [0]
            return self.capacity()

    vec:Vector = None
    num:int = 0

    # Create a vector and populate it with The Numbers
    vec = DoublingVector()
    for num in [4, 8, 15, 16, 23, 42]:
        vec.append(num)
        print(vec.capacity())

ChocoPy does not support modules and imports, higher-order functions, native dictionaries, or exceptions.

Want to use ChocoPy to run your own compilers course? Send an email to instructors@chocopy.org.

If you would like to reference ChocoPy in a research paper, please cite the SPLASH-E paper.