First, let’s merge the postfix and infix cases, as they are almost the same. The idea is to change the priorities for ! from (11, ()) to (11, 100), where 100 is a special, very strong priority meaning that the right-hand side of a "binary" operator is empty. We’ll handle this in a pretty crude way for now, but all the hacks will go away once we refactor the rest.

```rust
fn expr_bp(lexer: &mut Lexer, min_bp: u8) -> Option<S> {
    if min_bp == 100 {
        return None;
    }
    let mut lhs = match lexer.next() {
        Token::Atom(it) => S::Atom(it),
        Token::Op('(') => {
            let lhs = expr_bp(lexer, 0).unwrap();
            assert_eq!(lexer.next(), Token::Op(')'));
            lhs
        }
        Token::Op(op) => {
            let ((), r_bp) = prefix_binding_power(op);
            let rhs = expr_bp(lexer, r_bp).unwrap();
            S::Cons(op, vec![rhs])
        }
        t => panic!("bad token: {:?}", t),
    };

    loop {
        let op = match lexer.peek() {
            Token::Eof => break,
            Token::Op(op) => op,
            t => panic!("bad token: {:?}", t),
        };

        if let Some((l_bp, r_bp)) = infix_binding_power(op) {
            if l_bp < min_bp {
                break;
            }

            lexer.next();
            let rhs = expr_bp(lexer, r_bp);
            let mut args = Vec::new();
            args.push(lhs);
            args.extend(rhs);
            lhs = S::Cons(op, args);
            continue;
        }

        break;
    }

    Some(lhs)
}
```

Yup, we just check for a hard-coded 100 constant and use a bunch of unwraps all over the place. But the code is already smaller.

Let’s apply the same treatment to prefix operators. We’ll need to move their handling into the loop, and we also need to make lhs optional, which is no longer a big deal, as the function as a whole returns an Option. On a happier note, this will allow us to remove the if 100 wart. What’s more problematic is handling priorities: minus has different binding powers depending on whether it is in an infix or a prefix position. We solve this problem by just adding a prefix: bool argument to the binding_power function.

```rust
fn expr_bp(lexer: &mut Lexer, min_bp: u8) -> Option<S> {
    let mut lhs = match lexer.peek() {
        Token::Atom(it) => {
            lexer.next();
            Some(S::Atom(it))
        }
        Token::Op('(') => {
            lexer.next();
            let lhs = expr_bp(lexer, 0).unwrap();
            assert_eq!(lexer.next(), Token::Op(')'));
            Some(lhs)
        }
        _ => None,
    };

    loop {
        let op = match lexer.peek() {
            Token::Eof => break,
            Token::Op(op) => op,
            t => panic!("bad token: {:?}", t),
        };

        if let Some((l_bp, r_bp)) = binding_power(op, lhs.is_none()) {
            if l_bp < min_bp {
                break;
            }

            lexer.next();
            let rhs = expr_bp(lexer, r_bp);
            let mut args = Vec::new();
            args.extend(lhs);
            args.extend(rhs);
            lhs = Some(S::Cons(op, args));
            continue;
        }

        break;
    }

    lhs
}

fn binding_power(op: char, prefix: bool) -> Option<(u8, u8)> {
    let res = match op {
        '=' => (2, 1),
        '+' | '-' if prefix => (99, 9),
        '+' | '-' => (5, 6),
        '*' | '/' => (7, 8),
        '!' => (11, 100),
        '.' => (14, 13),
        _ => return None,
    };
    Some(res)
}
```

Keen readers might have noticed that we use 99 and not 100 here for the "no operand" case. This is not important yet, but it will be in the next step.
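To see the effect of the prefix flag in isolation, here is a small sketch that copies the binding_power table from the listing above and queries it directly. Nothing here is new except the calls described afterwards; it only makes visible that the same character yields different pairs depending on position.

```rust
// Copy of the `binding_power` table from the article, unchanged.
// `prefix` is true when the operator is seen with no left operand.
fn binding_power(op: char, prefix: bool) -> Option<(u8, u8)> {
    let res = match op {
        '=' => (2, 1),
        // Prefix position: 99 on the left stands for "no left operand".
        '+' | '-' if prefix => (99, 9),
        // Infix position: ordinary left-associative powers.
        '+' | '-' => (5, 6),
        '*' | '/' => (7, 8),
        // Postfix: 100 on the right stands for "no right operand".
        '!' => (11, 100),
        '.' => (14, 13),
        _ => return None,
    };
    Some(res)
}
```

Calling binding_power('-', true) gives Some((99, 9)), while binding_power('-', false) gives Some((5, 6)). Operators without a dedicated prefix entry, such as '*', return the same pair in both positions.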

We’ve unified prefix, infix and postfix operators. The next logical step is to treat atoms as nullary operators! That is, we’ll parse 92 into the (92) S-expression, with None for both lhs and rhs. We get this by using the (99, 100) binding power. At this stage, we can get rid of the distinction between atom tokens and operator tokens, and make the lexer return the underlying chars directly. We’ll also get rid of S::Atom, which gives us this somewhat large change:

```rust
use std::fmt; // needed for the `Display` impl below

enum S {
    Cons(char, Vec<S>),
}

impl fmt::Display for S {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            S::Cons(head, rest) => {
                if rest.is_empty() {
                    write!(f, "{}", head)
                } else {
                    write!(f, "({}", head)?;
                    for s in rest {
                        write!(f, " {}", s)?
                    }
                    write!(f, ")")
                }
            }
        }
    }
}

struct Lexer {
    tokens: Vec<char>,
}

impl Lexer {
    fn new(input: &str) -> Lexer {
        let mut tokens = input
            .chars()
            .filter(|it| !it.is_ascii_whitespace())
            .collect::<Vec<_>>();
        tokens.reverse();
        Lexer { tokens }
    }

    fn next(&mut self) -> Option<char> {
        self.tokens.pop()
    }
    fn peek(&mut self) -> Option<char> {
        self.tokens.last().copied()
    }
}

fn expr(input: &str) -> S {
    let mut lexer = Lexer::new(input);
    expr_bp(&mut lexer, 0).unwrap()
}

fn expr_bp(lexer: &mut Lexer, min_bp: u8) -> Option<S> {
    let mut lhs = match lexer.peek() {
        Some('(') => {
            lexer.next();
            let lhs = expr_bp(lexer, 0).unwrap();
            assert_eq!(lexer.next(), Some(')'));
            Some(lhs)
        }
        _ => None,
    };

    loop {
        let token = match lexer.peek() {
            Some(token) => token,
            None => break,
        };

        if let Some((l_bp, r_bp)) = binding_power(token, lhs.is_none()) {
            if l_bp < min_bp {
                break;
            }

            lexer.next();
            let rhs = expr_bp(lexer, r_bp);
            let mut args = Vec::new();
            args.extend(lhs);
            args.extend(rhs);
            lhs = Some(S::Cons(token, args));
            continue;
        }

        break;
    }

    lhs
}

fn binding_power(op: char, prefix: bool) -> Option<(u8, u8)> {
    let res = match op {
        '0'..='9' | 'a'..='z' | 'A'..='Z' => (99, 100),
        '=' => (2, 1),
        '+' | '-' if prefix => (99, 9),
        '+' | '-' => (5, 6),
        '*' | '/' => (7, 8),
        '!' => (11, 100),
        '.' => (14, 13),
        _ => return None,
    };
    Some(res)
}
```

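This is the stage where it becomes important that the "fake" binding power of unary - is 99. After parsing the first constant in 1 - 2, the r_bp is 100, and we need to avoid eating the following minus. Because 99 is less than 100, the recursive call refuses to treat that minus as a prefix operator and returns None, leaving it for the outer loop to handle as an infix operator.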
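The difference between 99 and 100 can be checked by experiment. The sketch below is not part of the article's code: it reuses S, Lexer and the loop-only shape of expr_bp, but threads a hypothetical unary_l_bp parameter through binding_power, and parse_with is a hypothetical helper that renders the result as a string. Parenthesis handling is left out to keep it short.

```rust
use std::fmt;

// Same S-expression type as in the article: atoms are nullary `Cons` nodes.
enum S {
    Cons(char, Vec<S>),
}

impl fmt::Display for S {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            S::Cons(head, rest) => {
                if rest.is_empty() {
                    write!(f, "{}", head)
                } else {
                    write!(f, "({}", head)?;
                    for s in rest {
                        write!(f, " {}", s)?
                    }
                    write!(f, ")")
                }
            }
        }
    }
}

struct Lexer {
    tokens: Vec<char>,
}

impl Lexer {
    fn new(input: &str) -> Lexer {
        let mut tokens = input
            .chars()
            .filter(|it| !it.is_ascii_whitespace())
            .collect::<Vec<_>>();
        tokens.reverse();
        Lexer { tokens }
    }
    fn next(&mut self) -> Option<char> {
        self.tokens.pop()
    }
    fn peek(&mut self) -> Option<char> {
        self.tokens.last().copied()
    }
}

// Hypothetical variant of `binding_power`: `unary_l_bp` replaces the
// hard-coded "fake" left power of prefix `+`/`-`, so 99 and 100 can be compared.
fn binding_power(op: char, prefix: bool, unary_l_bp: u8) -> Option<(u8, u8)> {
    let res = match op {
        '0'..='9' | 'a'..='z' | 'A'..='Z' => (99, 100),
        '=' => (2, 1),
        '+' | '-' if prefix => (unary_l_bp, 9),
        '+' | '-' => (5, 6),
        '*' | '/' => (7, 8),
        '!' => (11, 100),
        '.' => (14, 13),
        _ => return None,
    };
    Some(res)
}

fn expr_bp(lexer: &mut Lexer, min_bp: u8, unary_l_bp: u8) -> Option<S> {
    let mut lhs = None;
    loop {
        let token = match lexer.peek() {
            Some(token) => token,
            None => return lhs,
        };
        // Stop when the token is unknown or binds weaker than `min_bp`.
        let r_bp = match binding_power(token, lhs.is_none(), unary_l_bp) {
            Some((l_bp, r_bp)) if min_bp <= l_bp => r_bp,
            _ => return lhs,
        };
        lexer.next();
        let rhs = expr_bp(lexer, r_bp, unary_l_bp);
        let mut args = Vec::new();
        args.extend(lhs);
        args.extend(rhs);
        lhs = Some(S::Cons(token, args));
    }
}

// Hypothetical helper: parse `input` and render the tree as a string.
fn parse_with(input: &str, unary_l_bp: u8) -> String {
    let mut lexer = Lexer::new(input);
    expr_bp(&mut lexer, 0, unary_l_bp).unwrap().to_string()
}
```

With these definitions, parse_with("1 - 2", 99) yields (- 1 2), as expected. With parse_with("1 - 2", 100), the check min_bp <= l_bp passes for the minus inside the atom's right-hand side, so the atom swallows it as a prefix operator and the result is the malformed (1 (- 2)).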

The only things left outside the main loop are parentheses. We can deal with them using the (99, 0) priority: after ( we enter a new context where all operators are allowed.

```rust
fn expr_bp(lexer: &mut Lexer, min_bp: u8) -> Option<S> {
    let mut lhs = None;

    loop {
        let token = match lexer.peek() {
            Some(token) => token,
            None => break,
        };

        if let Some((l_bp, r_bp)) = binding_power(token, lhs.is_none()) {
            if l_bp < min_bp {
                break;
            }

            lexer.next();
            let rhs = expr_bp(lexer, r_bp);
            if token == '(' {
                assert_eq!(lexer.next(), Some(')'));
                lhs = rhs;
                continue;
            }

            let mut args = Vec::new();
            args.extend(lhs);
            args.extend(rhs);
            lhs = Some(S::Cons(token, args));
            continue;
        }

        break;
    }

    lhs
}

fn binding_power(op: char, prefix: bool) -> Option<(u8, u8)> {
    let res = match op {
        '0'..='9' | 'a'..='z' | 'A'..='Z' => (99, 100),
        '(' => (99, 0),
        '=' => (2, 1),
        '+' | '-' if prefix => (99, 9),
        '+' | '-' => (5, 6),
        '*' | '/' => (7, 8),
        '!' => (11, 100),
        '.' => (14, 13),
        _ => return None,
    };
    Some(res)
}
```

Or, after some control flow cleanup:

```rust
fn expr_bp(lexer: &mut Lexer, min_bp: u8) -> Option<S> {
    let mut lhs = None;

    loop {
        let token = match lexer.peek() {
            Some(token) => token,
            None => return lhs,
        };

        let r_bp = match binding_power(token, lhs.is_none()) {
            Some((l_bp, r_bp)) if min_bp <= l_bp => r_bp,
            _ => return lhs,
        };

        lexer.next();

        let rhs = expr_bp(lexer, r_bp);
        if token == '(' {
            assert_eq!(lexer.next(), Some(')'));
            lhs = rhs;
            continue;
        }

        let mut args = Vec::new();
        args.extend(lhs);
        args.extend(rhs);
        lhs = Some(S::Cons(token, args));
    }
}
```

This is still recognizably a Pratt parser, with its characteristic shape:

```rust
fn parse_expr() {
    loop {
        ...
        parse_expr()
        ...
    }
}
```
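As a sanity check, the pieces from this section can be assembled into one self-contained file. The sketch below combines the S type, the char-based Lexer, the cleaned-up expr_bp and the binding_power table that includes the ( entry. The only addition is the use std::fmt import. The inputs discussed after the code were chosen for illustration here and are not taken from the article.

```rust
use std::fmt;

enum S {
    Cons(char, Vec<S>),
}

impl fmt::Display for S {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            S::Cons(head, rest) => {
                if rest.is_empty() {
                    write!(f, "{}", head)
                } else {
                    write!(f, "({}", head)?;
                    for s in rest {
                        write!(f, " {}", s)?
                    }
                    write!(f, ")")
                }
            }
        }
    }
}

struct Lexer {
    tokens: Vec<char>,
}

impl Lexer {
    fn new(input: &str) -> Lexer {
        let mut tokens = input
            .chars()
            .filter(|it| !it.is_ascii_whitespace())
            .collect::<Vec<_>>();
        tokens.reverse();
        Lexer { tokens }
    }
    fn next(&mut self) -> Option<char> {
        self.tokens.pop()
    }
    fn peek(&mut self) -> Option<char> {
        self.tokens.last().copied()
    }
}

fn expr(input: &str) -> S {
    let mut lexer = Lexer::new(input);
    expr_bp(&mut lexer, 0).unwrap()
}

fn expr_bp(lexer: &mut Lexer, min_bp: u8) -> Option<S> {
    let mut lhs = None;
    loop {
        let token = match lexer.peek() {
            Some(token) => token,
            None => return lhs,
        };
        // Unknown tokens (such as ')') and weakly binding ones end this level.
        let r_bp = match binding_power(token, lhs.is_none()) {
            Some((l_bp, r_bp)) if min_bp <= l_bp => r_bp,
            _ => return lhs,
        };
        lexer.next();
        let rhs = expr_bp(lexer, r_bp);
        if token == '(' {
            // The parenthesized expression itself becomes the new lhs.
            assert_eq!(lexer.next(), Some(')'));
            lhs = rhs;
            continue;
        }
        let mut args = Vec::new();
        args.extend(lhs);
        args.extend(rhs);
        lhs = Some(S::Cons(token, args));
    }
}

fn binding_power(op: char, prefix: bool) -> Option<(u8, u8)> {
    let res = match op {
        '0'..='9' | 'a'..='z' | 'A'..='Z' => (99, 100),
        '(' => (99, 0),
        '=' => (2, 1),
        '+' | '-' if prefix => (99, 9),
        '+' | '-' => (5, 6),
        '*' | '/' => (7, 8),
        '!' => (11, 100),
        '.' => (14, 13),
        _ => return None,
    };
    Some(res)
}
```

Printing expr("1 + 2 * 3") gives (+ 1 (* 2 3)), expr("(1 + 2) * 3") gives (* (+ 1 2) 3), and expr("-9!") gives (- (! 9)), so precedence, grouping, prefix and postfix operators all go through the same loop.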