Parsing is used to check whether a string matches a regular expression. If there is a match then the string is tokenized and the data is extracted to be stored in the memory.

Example variable definition

example1: int = 10 mut example2: int = 20

Parsing for the variable definition will ignore the “= 10” and “= 20”. Other parsers such as an “AssignmentParser” and “IntegerParser” would be used for these. We are going to focus on the “example1:int” and “mut example2: int” sections.

Parser – Abstract class

First is an abstract parser class to define the structure of all parsers. All parsers must inherit from this class.

// Parser.scala abstract class Parser[T <: Block] { /** * A list of all regular expressions * * @return list of regexs */ def getRegexs: List[String] /** * Whether the regex is contained within the string * * @return true if exists */ def shouldParse(line: String): Boolean = getRegexs.exists(_.r.findFirstIn(line).nonEmpty) /** * Take the superBlock and the tokenizer for the line and return a blocks of this parsers's type. */ def parse(superBlock: Block, tokenizer: Tokenizer): Block }

The “shouldParse” method loops through all regular expressions contained within the parser and returns true if it is contained within the string.

The “parse” method is used to step through the tokenized string to extract all required information. E.g. variable name and type.

The “Block” class that is returned is used to store the extracted information in memory. You can create a stub class for now.

Define Variable Parser – Extends Parser.scala

To start we will create a parser that will be looking for a variable definition.

// DefineVariableParser.scala class DefineVariableParser extends Parser[DefineVariableBlock] { /** * A list of all regular expressions * * @return list of regexs */ override def getRegexs: List[String] = List( "(mut[ ]+)?[a-zA-Z][a-zA-Z0-9]*[ ]*:[ ]*[a-zA-Z][a-zA-Z0-9]*[ ]*" ) def parse(superBlock: Block, tokenizer: Tokenizer): DefineVariableBlock = { val nextToken: String = tokenizer.nextToken.token val mutable: Boolean = nextToken == "mut" // mutable/immutable val name: String = if (mutable) tokenizer.nextToken.token else nextToken // Variable name tokenizer.nextToken // skip ":" val varType = tokenizer.nextToken.token // Variable Type new DefineVariableBlock(superBlock, mutable, name, varType) } }

There first is a check to see if the “mut” keyword is used. If so “mutable” is set to true. If not then the variable name is extracted.

Next the “:” is skipped over.

Lastly the variable type is extracted.

The information is then passed as arguments to the Block class.

Conclusion

We have now defined a parser with a “shouldParse” and “parse” method. This will allow for us check what the string represents, and if matched will allow for us to extract all relevant data.

Source Files

To access the complete files access the structure/parser directory within my project.

https://github.com/Michael2109/cobalt