Home

Parsers turn strings of characters into meaningful data structures (like a JSON object!). nearley is a fast, feature-rich, and modern parser toolkit for JavaScript. nearley is an npm Staff Pick.

nearley 101

Install:

    $ npm install -g nearley

(or try nearley live in your browser here!)

Write your grammar:

    # Match a CSS color
    # http://www.w3.org/TR/css3-color/#colorunits

    @builtin "whitespace.ne" # `_` means arbitrary amount of whitespace
    @builtin "number.ne"     # `int`, `decimal`, and `percentage` number primitives

    csscolor -> "#" hexdigit hexdigit hexdigit hexdigit hexdigit hexdigit
              | "#" hexdigit hexdigit hexdigit
              | "rgb"  _ "(" _ colnum _ "," _ colnum _ "," _ colnum _ ")"
              | "hsl"  _ "(" _ colnum _ "," _ colnum _ "," _ colnum _ ")"
              | "rgba" _ "(" _ colnum _ "," _ colnum _ "," _ colnum _ "," _ decimal _ ")"
              | "hsla" _ "(" _ colnum _ "," _ colnum _ "," _ colnum _ "," _ decimal _ ")"

    hexdigit -> [a-fA-F0-9]
    colnum   -> int | percentage

Compile your grammar:

    $ nearleyc csscolor.ne -o csscolor.js

Test your grammar:

    $ nearley-test -i "#00ff00" csscolor.js
    Parse results:
    [ [ '#', [ '0' ], [ '0' ], [ 'f' ], [ 'f' ], [ '0' ], [ '0' ] ] ]

Turn your grammar into a generator:

    $ nearley-unparse -n 3 csscolor.js
    #Ab21F2
    rgb ( -29.889%,7,8172)
    #a40

Create beautiful railroad diagrams to document your grammar formally:

    $ nearley-railroad csscolor.ne -o csscolor.html

See a demo here.
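The parse results for "#00ff00" above come back as nested arrays, one small array per matched hexdigit. As a rough sketch of what you might do with them in plain JavaScript, the snippet below flattens that structure back into a string and reads out the color channels. The helpers `flatten` and `hexToRgb` are hypothetical names made up for this illustration and are not part of nearley; in a real grammar this kind of cleanup would more typically live in the grammar's postprocessors.

```javascript
// Hypothetical helper (not part of nearley): collapse nested parse
// arrays into a single string by joining every leaf in order.
function flatten(node) {
  return Array.isArray(node) ? node.map(flatten).join("") : String(node);
}

// Hypothetical helper: turn "#rgb" or "#rrggbb" into numeric channels.
function hexToRgb(hex) {
  let digits = hex.slice(1);
  if (digits.length === 3) {
    // Expand the short form, e.g. "a40" becomes "aa4400".
    digits = digits.split("").map((d) => d + d).join("");
  }
  return {
    r: parseInt(digits.slice(0, 2), 16),
    g: parseInt(digits.slice(2, 4), 16),
    b: parseInt(digits.slice(4, 6), 16),
  };
}

// The parse results shown above for the input "#00ff00".
const results = [["#", ["0"], ["0"], ["f"], ["f"], ["0"], ["0"]]];

const color = flatten(results[0]);
console.log(color, hexToRgb(color)); // #00ff00 { r: 0, g: 255, b: 0 }
```

The outer array is indexed with `results[0]` because the results are a list of parsings; for this unambiguous input there is exactly one.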

Features

nearley is the first JS parser to use the Earley algorithm (insert your own ‘early bird’ pun here). It also implements Joop Leo’s optimizations for right-recursion, making it effectively linear-time for LL(k) grammars.

nearley lives happily in node, but doesn’t mind the browser.

nearley outputs small files. And its expressive DSL comes with plenty of syntactic sugar to keep your source files short. And sweet.

nearley’s grammar language is powerful and expressive: you can use macros, import from a large builtin library of pre-defined parser-pieces, use a tokenizer for extra performance, and more!

nearley is built on an idiomatic streaming API. You even have access to partial parses to build predictive user interfaces.

nearley processes left recursion without choking. In fact, nearley will parse anything you throw at it without complaining or going into an infinite loop.

nearley handles ambiguous grammars gracefully. Ambiguous grammars can be parsed in multiple ways: instead of getting confused, nearley gives you all the parsings (in a deterministic order!).

nearley allows for debugging with generous error detection. When it catches a parse-time error, nearley tells you exactly what went wrong and where.

nearley is powerful enough to be bootstrapped. That means nearley uses nearley to compile parts of nearley. nearleyception!

nearley parsers can be inverted to form generators which output random strings that match a grammar. Useful for writing test cases, fuzzers, and Mad-Libs.

You can export nearley parsers as railroad diagrams, which provide easy-to-understand documentation of your grammar.

nearley comes with fantastic tooling. You can find editor plug-ins for vim, Sublime Text, Atom, and VS Code; there are also plug-ins for Webpack and gulp.
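To make the left-recursion and ambiguity points above a little more concrete, here is a toy Earley recognizer in JavaScript. It is only a sketch of the general technique and not nearley's actual code: it reports whether the input matches rather than building parse trees, it leaves out Leo's right-recursion optimization, and it assumes that terminals are single characters and that no rule is empty. The function name `recognize` and the rule format are made up for this example.

```javascript
// Toy Earley recognizer: a sketch of the technique, not nearley's code.
// Assumptions: terminals are single characters, and no rule is empty.
function recognize(rules, start, input) {
  const isNonterminal = (sym) => rules.some((r) => r.name === sym);
  // chart[i] holds items { rule, dot, origin } valid after i characters.
  const chart = Array.from({ length: input.length + 1 }, () => []);

  // Add an item to chart[i] unless an identical one is already there.
  const add = (i, rule, dot, origin) => {
    const seen = chart[i].some(
      (it) => it.rule === rule && it.dot === dot && it.origin === origin
    );
    if (!seen) chart[i].push({ rule, dot, origin });
  };

  rules.filter((r) => r.name === start).forEach((r) => add(0, r, 0, 0));

  for (let i = 0; i <= input.length; i++) {
    // chart[i] grows while we iterate, so use an index-based loop.
    for (let j = 0; j < chart[i].length; j++) {
      const { rule, dot, origin } = chart[i][j];
      const next = rule.symbols[dot];
      if (next === undefined) {
        // Complete: advance every item that was waiting for this rule.
        chart[origin]
          .filter((it) => it.rule.symbols[it.dot] === rule.name)
          .forEach((it) => add(i, it.rule, it.dot + 1, it.origin));
      } else if (isNonterminal(next)) {
        // Predict: the duplicate check in add() is what stops
        // left-recursive rules from being predicted forever.
        rules.filter((r) => r.name === next).forEach((r) => add(i, r, 0, i));
      } else if (input[i] === next) {
        // Scan: consume one matching character.
        add(i + 1, rule, dot + 1, origin);
      }
    }
  }

  // Accept if a start rule spans the whole input.
  return chart[input.length].some(
    (it) =>
      it.rule.name === start &&
      it.dot === it.rule.symbols.length &&
      it.origin === 0
  );
}

// Left-recursive and ambiguous: "1+1+1" can be grouped in two ways.
const rules = [
  { name: "expr", symbols: ["expr", "+", "expr"] },
  { name: "expr", symbols: ["1"] },
];

console.log(recognize(rules, "expr", "1+1+1")); // true
console.log(recognize(rules, "expr", "1+"));    // false
```

A naive recursive-descent parser would recurse forever on the first rule here, since `expr` begins with `expr`. The chart-based approach records each predicted item once per position, so the same grammar terminates.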

Projects using nearley

Artificial Intelligence, NLP, Linguistics: Shrdlite is a programming project in Artificial Intelligence, a course given at the University of Gothenburg and Chalmers University of Technology. It uses nearley for reading instructions in natural language (i.e. English). lexicon-grammars was used to parse lexicons for a project at Australian National University.

Standard formats: node-dmi is a module that reads iconstate metadata from BYOND DMI files, edtf.js is a parser for Extended Date Time Format, node-krl-parser is a KRL parser for node, bibliography is a BibTeX-to-HTML converter, biblatex-csl-converter converts between bibtex/CSL/JSON, scalpel parses CSS selectors (powering enzyme, Airbnb’s React testing tool), rfc5545-rrule helps parse iCalendar data, mangudai parses RMS scripts for Age of Empires II, tf-hcl parses and generates HCL config files, css-selector-inspector parses and tokenizes CSS3 selectors, css-property-parser validates and expands CSS shorthands, node-scad-parser parses OpenSCAD 3D models, js-sql-parse parses SQL statements, resp-parser is a parser for the RESP protocol, celio parses Celestia star catalogs.

Templating and files: uPresent is a markdown-based presentation authoring system, saison is a minimal templating language, Packdown is a tool to generate human-readable archives of multiple files.

Programming languages: Carbon is a C subset that compiles to JavaScript, optimized for game development, ezlang is a simple language, tlnccuwagnf is a fun general-purpose language, nanalang is a silly esoteric language, english is a less esoteric programming language, ecmaless is an easily-extensible language, hm-parser parses Haskell-like Hindley-Milner type signatures, kozily implements the Oz language, abstract-machine inspects execution models, fbp-types provides typechecking primitives for flow-based systems, lp5562 is an assembler for the TI LP5562 LED driver, VSL is a Versatile Scripting Language, while-typescript is an implementation of the WHILE language, lo is a language for secure distributed systems, jaco is an implementation of CMU’s C0 teaching language, walt is a subset of JavaScript that targets WebAssembly.

Mathematics: Solvent is a powerful desktop calculator, Truth-table is a tool to visualize propositional logic in truth tables, Emunotes is a personal Wiki with inline graphing and computation, react-equation parses and renders equations in React, the mLab generates category theory papers.

Domain-specific languages: Hexant is a cellular automata simulator with a DSL for custom automata, Dicetower is an advanced dice plugin for hubot, deck.zone is a language to create board games, in-seconds is a time calculator for music applications, website-spec is a tool for functional web testing, pianola allows declarative function composition, idyll is a markup language for data-driven documents, virtsecgroup provides virtual AWS security groups, deadfad is a hex editor that lets you specify structs, bishbosh helps you create command-line interfaces, syso codifies aspects of French legal contracts, siteswap parses Siteswap notation for juggling patterns, jsgrep provides syntactic grep for JavaScript, electro-grammar parses descriptions of electronic components like resistors and capacitors, cicero helps create smart legal contracts, Eventbot is a calendar plugin for Slack used by thousands of teams, Obyte is a cryptocurrency platform, OptiCSS is a CSS optimizer built by LinkedIn.

Other: ProceduralPsychEpisode generates “random episodes of the hilarious but formulaic show”

Parsing libraries: nearley is a parser toolkit for JavaScript. It has a nearley-based DSL to specify parsers.

Give to nearley

nearley has been maintained by volunteers since 2014. If you want to help support us, contact @kach or @tjvr on GitHub. We’ll send over our PayPal information – and maybe something nice. :-)