Swift Abstract Syntax Tree

The Swift compiler has an interesting mode, -dump-ast, which outputs the type-checked abstract syntax tree (AST) of a Swift source file (for example, swiftc -dump-ast Foo.swift). An AST represents the source code as a tree containing syntactic information.

For example, a piece of code like this:

    import Foundation

    class Foo {
        struct FooStruct {
            func fooStructFunc() {}
        }
        func fooFunc() { }
    }

    struct Bar {
        func barFunc() { }
    }

will generate this syntax tree:

    (source_file
      (import_decl 'Foundation')
      (class_decl "Foo" type='Foo.Type' access=internal
        (struct_decl "FooStruct" type='Foo.FooStruct.Type' access=internal
          (func_decl "fooStructFunc()" type='Foo.FooStruct -> () -> ()' access=internal
            (body_params
              (pattern_typed implicit type='Foo.FooStruct'
                (pattern_named implicit type='Foo.FooStruct' 'self'))
              (pattern_tuple type='()' names=))
            (brace_stmt))
          (constructor_decl implicit "init()" type='Foo.FooStruct.Type -> () -> Foo.FooStruct' access=internal designated
            (body_params
              (pattern_typed implicit type='inout Foo.FooStruct'
                (pattern_named implicit type='inout Foo.FooStruct' 'self'))
              (pattern_tuple implicit type='()' names=))
            (brace_stmt
              (return_stmt))))
        (func_decl "fooFunc()" type='Foo -> () -> ()' access=internal
          (body_params
            (pattern_typed implicit type='Foo'
              (pattern_named implicit type='Foo' 'self'))
            (pattern_tuple type='()' names=))
          (brace_stmt))
        (destructor_decl implicit "deinit" type='Foo -> ()' access=internal @objc
          (body_params
            (pattern_typed implicit type='Foo'
              (pattern_named implicit type='Foo' 'self')))
          (brace_stmt))
        (constructor_decl implicit "init()" type='Foo.Type -> () -> Foo' access=internal designated
          (body_params
            (pattern_typed implicit type='Foo'
              (pattern_named implicit type='Foo' 'self'))
            (pattern_tuple implicit type='()' names=))
          (brace_stmt
            (return_stmt))))
      (struct_decl "Bar" type='Bar.Type' access=internal
        (func_decl "barFunc()" type='Bar -> () -> ()' access=internal
          (body_params
            (pattern_typed implicit type='Bar'
              (pattern_named implicit type='Bar' 'self'))
            (pattern_tuple type='()' names=))
          (brace_stmt))
        (constructor_decl implicit "init()" type='Bar.Type -> () -> Bar' access=internal designated
          (body_params
            (pattern_typed implicit type='inout Bar'
              (pattern_named implicit type='inout Bar' 'self'))
            (pattern_tuple implicit type='()' names=))
          (brace_stmt
            (return_stmt)))))

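The above output can be represented something like this, which might be easier to understand:

    source_file
        import Foundation
        class Foo
            struct FooStruct
                func fooStructFunc()
            func fooFunc()
        struct Bar
            func barFunc()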

I have omitted some things, such as the implicit init and deinit, for simplicity.

There is an issue for SwiftPM where subclasses of XCTestCase need to conform to the XCTestCaseProvider protocol (Swift on Linux doesn't support reflection right now). The conformance is just boilerplate code that can be generated automatically.

This basically means we need to find all the methods inside subclasses of XCTestCase whose names begin with "test" and whose signature is () -> (), and then generate some boilerplate code. This is where the syntax tree can help.
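To make that filter concrete, here is a minimal sketch in current Swift syntax. It assumes a method's node text looks like the func_decl entries in the dump above; isTestMethod is a hypothetical helper introduced for illustration, not part of the original code.

```swift
// Hypothetical helper: decides whether the text of a func_decl node describes
// a test method. Assumes the text looks like:
//   func_decl "testFoo()" type='Foo -> () -> ()' access=internal
func isTestMethod(_ contents: String) -> Bool {
    guard contents.hasPrefix("func_decl") else { return false }

    // Assumption: the declaration name is the first double-quoted token.
    let quoted = contents.split(separator: "\"", omittingEmptySubsequences: false)
    guard quoted.count >= 3 else { return false }
    let name = quoted[1]

    // "testFoo()" has an empty argument list; "fixture(name:body:)" does not.
    guard name.hasPrefix("test"), name.hasSuffix("()") else { return false }

    // Assumption: the type is the first single-quoted token, and a method
    // with signature () -> () is dumped as 'Owner -> () -> ()'.
    let typed = contents.split(separator: "'", omittingEmptySubsequences: false)
    guard typed.count >= 3 else { return false }
    return typed[1].hasSuffix("-> () -> ()")
}

let candidate = "func_decl \"testFoo()\" type='Foo -> () -> ()' access=internal"
print(isTestMethod(candidate))
```

This only inspects one node's text; checking that the enclosing class actually inherits from XCTestCase would need additional information from the tree.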

Let's start by creating the Node type for the AST:

    class Node {
        var contents: String = ""
        var nodes: [Node] = []
    }

Each node holds its contents as a string, along with an array of its child nodes.
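As a quick illustration of that shape, the sketch below builds a small tree by hand and renders it with indentation. The makeNode and render helpers are hypothetical conveniences for this example only.

```swift
final class Node {
    var contents: String = ""
    var nodes: [Node] = []
}

// Hypothetical convenience for this example: creates a node with children.
func makeNode(_ contents: String, _ children: [Node] = []) -> Node {
    let node = Node()
    node.contents = contents
    node.nodes = children
    return node
}

// Returns one line per node, indented by two spaces per nesting level.
func render(_ node: Node, depth: Int = 0) -> [String] {
    let line = String(repeating: "  ", count: depth) + node.contents
    return [line] + node.nodes.flatMap { render($0, depth: depth + 1) }
}

let tree = makeNode("source_file", [
    makeNode("class_decl \"Foo\"", [makeNode("func_decl \"fooFunc()\"")]),
    makeNode("struct_decl \"Bar\"", [makeNode("func_decl \"barFunc()\"")]),
])
let lines = render(tree)
print(lines.joined(separator: "\n"))
```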

In the AST output, each node's contents sit between ( and ), and its child nodes are nested inside those parentheses in the same way. So let's declare a stack to hold the nodes we're currently processing.

    var stack = Array<Node>()

We also need a few other things:

a Bool, quoteStarted: to make sure we don't parse things inside quotes

a String, data: holds the current node's contents

an array of Node, sources: holds references to the source files, since a Swift module can have multiple source files

    var quoteStarted = false
    var data = ""
    var sources: [Node] = []
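To see why the quote flag matters, consider a node whose name and type both contain parentheses. The sketch below (countStructuralOpens is a hypothetical function written for this illustration) counts the opening parentheses that would start a node, with and without quote tracking.

```swift
// Counts "(" characters that would open a node. When respectingQuotes is true,
// parentheses inside "..." or '...' are ignored, mirroring the quoteStarted flag.
func countStructuralOpens(in text: String, respectingQuotes: Bool) -> Int {
    var quoteStarted = false
    var count = 0
    for char in text {
        if char == "\"" || char == "'" {
            quoteStarted = !quoteStarted
        } else if char == "(" && !(respectingQuotes && quoteStarted) {
            count += 1
        }
    }
    return count
}

let sample = "(func_decl \"fooFunc()\" type='Foo -> () -> ()' (brace_stmt))"
let naive = countStructuralOpens(in: sample, respectingQuotes: false)
let quoted = countStructuralOpens(in: sample, respectingQuotes: true)
print(naive, quoted) // prints "5 2"
```

Only two of the five opening parentheses start a node (func_decl and brace_stmt); the other three belong to the quoted name and type.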

Now we can iterate over the AST output, pushing a node onto the stack whenever we encounter ( and popping one whenever we encounter ), attaching the popped node to the node that is then on top of the stack.

    for char in astString.characters {
        if char == "(" && !quoteStarted {
            let node = Node()
            if data.characters.count > 0, let lastNode = stack.last, let chuzzed = data.chuzzle() {
                lastNode.contents = chuzzed
                if lastNode.contents == "source_file" {
                    sources += [lastNode]
                }
            }
            stack.append(node)
            data = ""
        } else if char == ")" && !quoteStarted {
            if case let poppedNode = stack.removeLast() where stack.count > 0 {
                if data.characters.count > 0, let chuzzed = data.chuzzle() {
                    poppedNode.contents = chuzzed
                }
                stack.last!.nodes += [poppedNode]
                data = ""
            }
        } else {
            data = data + String(char)
            if char == "\"" || char == "'" {
                quoteStarted = !quoteStarted
            }
        }
    }

That's it: we now have the ASTs inside the sources array.
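For readers who want to try the loop on a small input, here is a self-contained sketch in current Swift syntax. It is not the exact code above: chuzzled is an assumed stand-in for chuzzle() (taken here to trim whitespace and return nil for empty text), and root nodes are collected when the stack empties rather than by matching "source_file".

```swift
final class Node {
    var contents = ""
    var nodes: [Node] = []
}

// Assumed stand-in for chuzzle(): trims whitespace, returns nil if nothing is left.
func chuzzled(_ text: String) -> String? {
    var slice = Substring(text)
    while let first = slice.first, first.isWhitespace { slice.removeFirst() }
    while let last = slice.last, last.isWhitespace { slice.removeLast() }
    return slice.isEmpty ? nil : String(slice)
}

// Parses parenthesized AST dump text into a list of root nodes.
func parseAST(_ astString: String) -> [Node] {
    var stack: [Node] = []
    var sources: [Node] = []
    var quoteStarted = false
    var data = ""

    for char in astString {
        if char == "(" && !quoteStarted {
            // Text collected so far belongs to the node currently on top.
            if let top = stack.last, let text = chuzzled(data) {
                top.contents = text
            }
            stack.append(Node())
            data = ""
        } else if char == ")" && !quoteStarted {
            guard let popped = stack.popLast() else { continue }
            // A leaf node only receives its contents when it closes.
            if let text = chuzzled(data) { popped.contents = text }
            data = ""
            if let parent = stack.last {
                parent.nodes.append(popped)
            } else {
                sources.append(popped) // stack is empty: this was a root node
            }
        } else {
            data.append(char)
            if char == "\"" || char == "'" { quoteStarted.toggle() }
        }
    }
    return sources
}

let sample = """
(source_file
  (class_decl "Foo" type='Foo.Type'
    (func_decl "fooFunc()" type='Foo -> () -> ()'
      (brace_stmt))))
"""
let roots = parseAST(sample)
print(roots[0].contents, roots[0].nodes.count)
```

Like the original loop, this treats both quote characters as a single toggle, so it assumes quotes in the dump are never nested or mixed.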

Since we're only interested in classes and methods, let's declare a Type enum:

    enum Type: String {
        case Class = "class_decl"
        case Fn = "func_decl"
        case Unknown = ""
    }

and add a computed property on Node to get the type while iterating over the nodes:

    var type: Type {
        if contents.hasPrefix(Type.Class.rawValue) {
            return .Class
        }
        if contents.hasPrefix(Type.Fn.rawValue) {
            return .Fn
        }
        return .Unknown
    }

Now we can iterate over the sources and print each class and its methods:

    for source in sources {
        for node in source.nodes where node.type == .Class {
            print("----------------")
            print(node.name)
            print("----------------")
            for classNode in node.nodes where classNode.type == .Fn {
                print(classNode.name)
            }
        }
    }
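The loop above uses a name property on Node that is not shown in this post. One possible way to derive it, sketched in current Swift syntax, is to take the first double-quoted token in contents; that position is an assumption based on the dump format shown earlier.

```swift
final class Node {
    var contents = ""
    var nodes: [Node] = []
}

extension Node {
    // Assumption: the declaration name is the first double-quoted token,
    // as in: func_decl "fooFunc()" type='Foo -> () -> ()' access=internal
    var name: String {
        let parts = contents.split(separator: "\"", omittingEmptySubsequences: false)
        return parts.count >= 3 ? String(parts[1]) : ""
    }
}

let node = Node()
node.contents = "func_decl \"fooFunc()\" type='Foo -> () -> ()' access=internal"
print(node.name) // prints "fooFunc()"
```

Nodes without a quoted name, such as brace_stmt, yield an empty string in this sketch.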

I ran this on all of SwiftPM's test modules, and here is the result:

    ----------------
    DescribeTests
    ----------------
    testDescribingNoModulesThrows()
    ----------------
    DependencyResolutionTestCase
    ----------------
    testInternalSimple()
    testInternalComplex()
    testExternalSimple()
    testExternalComplex()
    testIndirectTestsDontBuild()
    ----------------
    InvalidLayoutsTestCase
    ----------------
    testNoTargets()
    testMultipleRoots()
    testInvalidLayout1()
    testInvalidLayout2()
    testInvalidLayout3()
    testInvalidLayout4()
    testInvalidLayout5()
    ----------------
    ModuleMapsTestCase
    ----------------
    fixture(name:CModuleName:rootpkg:body:)
    ----------------
    ValidLayoutsTestCase
    ----------------
    testSingleModuleLibrary()
    testSingleModuleExecutable()
    testSingleModuleCustomizedName()
    testSingleModuleSubfolderWithSwiftSuffix()
    testMultipleModulesLibraries()
    testMultipleModulesExecutables()
    testPackageIdentifiers()
    ----------------
    VersionGraphTests
    ----------------
    testNoGraph()
    testOneDependency()
    testOneDepenencyWithMultipleAvailableVersions()
    testTwoDependencies()
    testTwoDirectDependencies()
    testTwoDirectDependenciesWhereOneAlsoDependsOnTheOther()
    testSimpleVersionRestrictedGraph()
    testComplexVersionRestrictedGraph()
    testVersionConstrain()
    testTwoDependenciesRequireMutuallyExclusiveVersionsOfTheSameDependency_Simple()
    testTwoDependenciesRequireMutuallyExclusiveVersionsOfTheSameDependency_Complex()
    testVersionUnavailable()
    ----------------
    MockCheckout
    ----------------
    constrain(to:)
    setVersion(_:)
    ----------------
    _MockFetcher
    ----------------
    find(url:)
    finalize(_:)
    fetch(url:)
    ----------------
    GetTests
    ----------------
    testRawCloneDoesNotCrashIfManifestIsNotPresent()
    testRangeConstrain()
    testGitRepoInitialization()
    ----------------
    GitTests
    ----------------
    testHasVersion()
    testHasNoVersion()
    testCloneShouldNotCrashWihoutTags()
    testCloneShouldCrashWihoutTags()
    ----------------
    ManifestTests
    ----------------
    loadManifest(_:line:body:)
    testManifestLoading()
    testNoManifest()
    ----------------
    PackageTests
    ----------------
    testBasics()
    testExclude()
    testEmptyTestDependencies()
    testTestDependencies()
    testTargetDependencyIsStringConvertible()
    ----------------
    TOMLTests
    ----------------
    testLexer()
    testParser()
    testParsingTables()
    ----------------
    PackageTests
    ----------------
    testMatchDependencyWithPreReleaseVersion()
    ----------------
    VersionTests
    ----------------
    testEquality()
    testNegativeValuesBecomeZero()
    testComparable()
    testDescription()
    testFromString()
    testSort()
    testRange()
    testSuccessor()
    testPredecessor()
    ----------------
    PackageTests
    ----------------
    testUrlEndsInDotGit1()
    testUrlEndsInDotGit2()
    testUrlEndsInDotGit3()
    testUid()
    ----------------
    ModuleTests
    ----------------
    test1()
    test2()
    test3()
    test4()
    test5()
    test6()
    testIgnoresFiles()
    testModuleTypes()
    ----------------
    FileTests
    ----------------
    loadInputFile(_:)
    testOpenFile()
    testOpenFileFail()
    testReadRegularTextFile()
    testReadRegularTextFileWithSeparator()
    ----------------
    RmtreeTests
    ----------------
    testDoesNotFollowSymlinks()
    ----------------
    PathTests
    ----------------
    test()
    testPrecombined()
    testExtraSeparators()
    testEmpties()
    testNormalizePath()
    testJoinWithAbsoluteReturnsLastAbsoluteComponent()
    testParentDirectory()
    ----------------
    WalkTests
    ----------------
    testNonRecursive()
    testRecursive()
    testSymlinksNotWalked()
    testWalkingADirectorySymlinkResolvesOnce()
    ----------------
    StatTests
    ----------------
    test_isdir()
    test_isfile()
    test_realpath()
    test_basename()
    ----------------
    RelativePathTests
    ----------------
    testAbsolute()
    testRelative()
    testMixed()
    ----------------
    ResourcesTests
    ----------------
    testResources()
    ----------------
    ShellTests
    ----------------
    testPopen()
    testPopenWithBufferLargerThanThatAllocated()
    testPopenWithBinaryOutput()
    ----------------
    StringTests
    ----------------
    testTrailingChomp()
    testSeparatorChomp()
    testEmptyChomp()
    testChuzzle()
    ----------------
    URLTests
    ----------------
    testSchema()

Works pretty well!
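As a possible next step toward the boilerplate mentioned earlier, the collected method names could be turned into a conformance. The sketch below is hypothetical: providerBoilerplate is not from the original code, and the allTests member it emits is an assumption about what the XCTestCaseProvider requirement looks like, so it should be checked against the actual protocol before being relied on.

```swift
// Hypothetical generator. The extension layout and the `allTests` member are
// assumptions about the conformance, not something taken from the post.
func providerBoilerplate(className: String, methodNames: [String]) -> String {
    var lines = [
        "extension \(className): XCTestCaseProvider {",
        "    var allTests: [(String, () throws -> Void)] {",
        "        return [",
    ]
    // Keep only zero-argument methods whose names start with "test".
    for method in methodNames where method.hasPrefix("test") && method.hasSuffix("()") {
        let bare = String(method.dropLast(2)) // "testPopen()" -> "testPopen"
        lines.append("            (\"\(bare)\", \(bare)),")
    }
    lines += ["        ]", "    }", "}"]
    return lines.joined(separator: "\n")
}

let generated = providerBoilerplate(
    className: "FileTests",
    methodNames: ["loadInputFile(_:)", "testOpenFile()", "testOpenFileFail()"]
)
print(generated)
```

Helper methods such as loadInputFile(_:) are skipped because they either take arguments or don't start with "test".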

The gist for full code is available here