July 29, 2014 at 20:22
Tags: Compilation, LLVM & Clang

Clang tooling has seen lots of interest and development focus in the past few years. At last, we have a convenient, accurate, open-source and well-supported framework for programmatically analyzing and refactoring C++ code; I find this very exciting.

A great outcome of this rapid pace of development is that new APIs and tools spring up all the time. For example, some time ago the Clang tooling developers figured out that folks doing AST traversals have to write a lot of repetitive code to find interesting AST nodes, so they came up with a great new API called AST matchers, which I want to discuss here.

Visitors vs. matchers

Here's a motivating example. Suppose we're looking for pointer-typed variables being used in if comparisons. To make this more specific, let's say we're looking for cases where the pointer-typed variable is on the left-hand side of an equality comparison (==). To find such nodes in a recursive visitor, we'd have to write something like this:

    bool VisitIfStmt(IfStmt *s) {
      if (const BinaryOperator *BinOP =
              llvm::dyn_cast<BinaryOperator>(s->getCond())) {
        if (BinOP->getOpcode() == BO_EQ) {
          const Expr *LHS = BinOP->getLHS();
          if (const ImplicitCastExpr *Cast =
                  llvm::dyn_cast<ImplicitCastExpr>(LHS)) {
            LHS = Cast->getSubExpr();
          }
          if (const DeclRefExpr *DeclRef = llvm::dyn_cast<DeclRefExpr>(LHS)) {
            if (const VarDecl *Var =
                    llvm::dyn_cast<VarDecl>(DeclRef->getDecl())) {
              if (Var->getType()->isPointerType()) {
                Var->dump();  // YAY found it!!
              }
            }
          }
        }
      }
      return true;
    }

This is quite a bit of code, but nothing out of the ordinary if you've been working with Clang ASTs for a while. Perhaps it can be golfed down into a somewhat shorter form, but the main problem is that to write this one has to go through quite a bit of documentation and header files to figure out which methods to call and what kinds of objects they return.

Here's the equivalent AST matcher:

    Finder.addMatcher(
        ifStmt(hasCondition(binaryOperator(
            hasOperatorName("=="),
            hasLHS(ignoringParenImpCasts(declRefExpr(
                to(varDecl(hasType(pointsTo(AnyType))).bind("lhs")))))))),
        &HandlerForIf);

Some difference, right? The declarative nature of matcher definitions makes this very natural to read and to map to the actual problem. HandlerForIf is a MatchCallback object that has direct access to the bound nodes of the matcher:

    struct IfStmtHandler : public MatchFinder::MatchCallback {
      virtual void run(const MatchFinder::MatchResult &Result) {
        const VarDecl *lhs = Result.Nodes.getNodeAs<VarDecl>("lhs");
        lhs->dump();  // YAY found it!!
      }
    };

There's actually quite a bit of documentation available about AST matchers on the official Clang website. For a complete example that can be built outside of the LLVM tree, I redid the tooling sample from the previous article, now with AST matchers (all available in the llvm-clang-samples repository).
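To get a feel for why the declarative style composes so well, here is a toy sketch of the underlying idea: matchers as small predicates that nest inside one another. This is not Clang's implementation or API. The Node type and the helpers make, allOf, kindIs, textIs, isPointerTyped, hasChild, findAll, pointerOnLhsOfEquality and buildSampleTree are all hypothetical names invented for this illustration, and the "AST" is a simplified stand-in that only borrows Clang's node names as strings.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy AST node (hypothetical; far simpler than a real Clang AST node).
struct Node;
using NodePtr = std::shared_ptr<Node>;
struct Node {
  std::string kind;               // e.g. "IfStmt", "BinaryOperator"
  std::string text;               // operator spelling or variable name
  bool isPointer;                 // only meaningful for "DeclRefExpr"
  std::vector<NodePtr> children;  // child 0 is the condition / LHS
};

NodePtr make(std::string kind, std::string text = "", bool isPointer = false,
             std::vector<NodePtr> children = {}) {
  assert(!kind.empty() && "every toy node needs a kind");
  return std::make_shared<Node>(
      Node{std::move(kind), std::move(text), isPointer, std::move(children)});
}

// A matcher is just a predicate on a node; combinators build bigger ones.
using Matcher = std::function<bool(const Node &)>;

Matcher allOf(std::vector<Matcher> parts) {
  return [parts](const Node &n) {
    for (const Matcher &m : parts)
      if (!m(n)) return false;
    return true;
  };
}
Matcher kindIs(std::string kind) {
  return [kind](const Node &n) { return n.kind == kind; };
}
Matcher textIs(std::string text) {
  return [text](const Node &n) { return n.text == text; };
}
Matcher isPointerTyped() {
  return [](const Node &n) { return n.isPointer; };
}
Matcher hasChild(std::size_t index, Matcher inner) {
  return [index, inner](const Node &n) {
    return index < n.children.size() && inner(*n.children[index]);
  };
}

// Walk the whole tree and collect every node the matcher accepts.
void collect(const Node &root, const Matcher &m,
             std::vector<const Node *> &out) {
  if (m(root)) out.push_back(&root);
  for (const NodePtr &child : root.children) collect(*child, m, out);
}
std::vector<const Node *> findAll(const Node &root, const Matcher &m) {
  std::vector<const Node *> out;
  collect(root, m, out);
  return out;
}

// The toy counterpart of the matcher in the article: an if whose condition
// is "==" with a pointer-typed variable reference on the left-hand side.
Matcher pointerOnLhsOfEquality() {
  return allOf({kindIs("IfStmt"),
                hasChild(0, allOf({kindIs("BinaryOperator"), textIs("=="),
                                   hasChild(0, allOf({kindIs("DeclRefExpr"),
                                                      isPointerTyped()}))}))});
}

// Three ifs: "p == q" (pointer on LHS), "n == 0" (int), "0 == p" (pointer on RHS).
NodePtr buildSampleTree() {
  NodePtr ptrLhs = make("IfStmt", "", false,
      {make("BinaryOperator", "==", false,
            {make("DeclRefExpr", "p", true), make("DeclRefExpr", "q", true)})});
  NodePtr intLhs = make("IfStmt", "", false,
      {make("BinaryOperator", "==", false,
            {make("DeclRefExpr", "n", false), make("IntegerLiteral", "0")})});
  NodePtr ptrRhs = make("IfStmt", "", false,
      {make("BinaryOperator", "==", false,
            {make("IntegerLiteral", "0"), make("DeclRefExpr", "p", true)})});
  return make("CompoundStmt", "", false, {ptrLhs, intLhs, ptrRhs});
}
```

Calling findAll(*buildSampleTree(), pointerOnLhsOfEquality()) selects only the first if statement. The point of the sketch is that each combinator knows nothing about the others, so a new query is a new expression rather than a new hand-written traversal; the real AST matchers add static typing, node binding and many more predicates on top of the same basic shape.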