Ranger : An on-demand range generator for GCC

Ranger is a new approach to generating ranges in GCC. It is designed to be low-overhead and requires no infrastructure such as dominators or loop analysis. The interface is designed to be easy to use anywhere in GCC, and the API for querying is as simple as asking for the range of an ssa_name on an arbitrary statement.

Existing range infrastructure in the compiler works from the top down, walking the IL in dominator order (or inserting ASSERT_EXPRs), noting all ranges and propagating these values forward in case they are needed. When used outside of the VRP framework, the only range information available is a global range. The Ranger starts at the statement where the range is requested and walks backwards through the use-def chains, finding the requested (and related) ranges as required. It provides very accurate location-specific ranges, and working on demand provides some significant speedups, especially in passes which only casually use ranges. The Ranger also caches the ranges it looks up, and when used in a heavily utilized range environment such as VRP, it still appears to perform very well, with the possibility of still being faster.

The code currently resides in the svn branch ssa-range.

A list of shortcomings and a to-do list can be found here.

Thoughts on Trunk Integration. We think there is great value in adding this infrastructure to trunk, and realize there are some things to be addressed as we exit this prototyping stage. This link details what we think makes sense, and we welcome input on what you think.

On the branch, we have converted the following passes to use the Ranger, all with good results:

-walloca

gimple-ssa-sprintf warning pass

some other pass..

simple Ranger Branch VRP

Candidates for further conversion include any pass which uses global range information.

See the pass conversion part of the integration page.

Component Breakdown

There are 3 primary components to the Ranger.

irange : A new range representation to replace the current value_range structure from tree-vrp. The new irange class is implemented directly with wide-ints, and maintains ranges as a series of normalized sub-ranges. It currently sets a compile time limit of 3 sub-ranges to represent a range, but the class API is designed to eventually allow optimizations to change this number in case something like switch optimizations wants to have very precise ranges with an arbitrary number of sub-ranges. It also allows all code relating to managing ranges such as intersect and union to be contained in one place.

range-ops : Another new class related closely to the irange class. This class manages performing various tree-code like operations on ranges. It maintains a table of implemented operations, which provides a central location for things like adding, subtracting, anding, multiplying, or whatever other tree-code operation is desired. New or missing operations are easy to add, and once added are used throughout the ranger in all operational aspects. I.e., no special casing is required for most classes of operations, as the ranger simply invokes the range-op handler to perform range operations.

Ranger : A set of classes which implement the on-demand model. It is broken into a number of smaller components which manage various tasks at the statement, basic-block, and cross-cfg level.

This document introduces the Ranger and its technology. It is long and discusses most of the technical details involved in implementing all 3 components of the Ranger. I suggest reading it to get a feel for exactly what the Ranger is, how it works, and where it is headed.

Primary API

The top level ranger class is called path_ranger and is found in ssa-range.h. There are 4 primary APIs of interest:

bool path_range_edge (irange& r, tree name, edge e) : Returns the range of NAME on edge E.

bool path_range_entry (irange& r, tree name, basic_block bb) : Returns the range of NAME on entry to basic block BB.

bool path_range_stmt (irange& r, gimple *g) : Returns the range of the result of the expression in statement G.

bool path_range_on_stmt (irange& r, tree name, gimple *g) : Returns the range of NAME if it were used on statement G.

There are a few other entry points for more specialized needs. They will all be documented in a coming API document which will be located here when finished.

Class Irange

(No, this is not a product of that fruit company). The new irange class was created to provide a single representation of ranges which is not limited to a range or anti-range. It implements all the basic kinds of operations you would expect, plus a few others.

union

intersect

not

upper/lower bound

contains_p

cast (yes, casting from one type to another is handled here, which has been very useful)
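To make the sub-range representation and the operations listed above more concrete, here is a minimal standalone sketch. It is not the branch's irange: the name toy_irange and its members are hypothetical, it uses int64_t instead of wide-ints, and the way it collapses pairs once the limit is exceeded (merging the last two) is only one conservative choice assumed for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-in for irange (hypothetical): a sorted list of disjoint,
// non-adjacent [lo, hi] pairs, capped at a fixed number of pairs.
class toy_irange {
  using pair_t = std::pair<int64_t, int64_t>;

public:
  static constexpr std::size_t max_pairs = 3;  // mirrors the compile-time limit

  toy_irange() = default;
  toy_irange(int64_t lo, int64_t hi) {
    assert(lo <= hi);
    m_pairs.push_back({lo, hi});
  }

  bool empty_p() const { return m_pairs.empty(); }
  std::size_t num_pairs() const { return m_pairs.size(); }
  int64_t lower_bound() const { assert(!empty_p()); return m_pairs.front().first; }
  int64_t upper_bound() const { assert(!empty_p()); return m_pairs.back().second; }

  bool contains_p(int64_t v) const {
    for (const pair_t &p : m_pairs)
      if (p.first <= v && v <= p.second)
        return true;
    return false;
  }

  // Union: pool the pairs from both ranges, then renormalize.
  void union_(const toy_irange &other) {
    m_pairs.insert(m_pairs.end(), other.m_pairs.begin(), other.m_pairs.end());
    normalize();
  }

  // Intersection: keep the overlap of every pair of sub-ranges.
  void intersect(const toy_irange &other) {
    std::vector<pair_t> out;
    for (const pair_t &a : m_pairs)
      for (const pair_t &b : other.m_pairs) {
        int64_t lo = std::max(a.first, b.first);
        int64_t hi = std::min(a.second, b.second);
        if (lo <= hi)
          out.push_back({lo, hi});
      }
    m_pairs = out;
    normalize();
  }

private:
  // Sort, merge overlapping or adjacent pairs, then enforce the pair limit.
  void normalize() {
    std::sort(m_pairs.begin(), m_pairs.end());
    std::vector<pair_t> out;
    for (const pair_t &p : m_pairs) {
      // The first test short-circuits before "+ 1" could overflow.
      bool touches = !out.empty()
                     && (out.back().second >= p.first
                         || out.back().second + 1 == p.first);
      if (touches)
        out.back().second = std::max(out.back().second, p.second);
      else
        out.push_back(p);
    }
    // Assumption of this sketch: when over the limit, widen by merging
    // the last two pairs. The result is less precise but still correct.
    while (out.size() > max_pairs) {
      out[out.size() - 2].second = out.back().second;
      out.pop_back();
    }
    m_pairs = out;
  }

  std::vector<pair_t> m_pairs;
};
```

Because every operation ends in normalize(), callers never see overlapping or unsorted pairs, which is what keeps lower_bound, upper_bound and contains_p trivial.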

We have routines which can convert back and forth between our irange class and value_range for easy inter-operability for now. The irange class implements a super-set of the ranges value_range can represent, with one exception. Value_range utilizes trees for the end points, and within VRP end points are sometimes represented with expressions that it wishes to defer resolving until a later point in time. The initial prototypes for the Ranger also included symbolic endpoints. During the evolution of the Ranger design it was discovered this was actually redundant and limiting.

The ranger evaluates ranges on-demand by looking back through def chains of SSA names. This stepping back through the def chain in effect recreates the symbolic expressions that were being placed in the range class. Where VRP defers evaluating a range, the ranger calculates a first guess at the range. If it is later discovered to be worth reevaluating a range because one or more inputs have changed, we simply invoke the ranger mechanism again to look up the value using the new range inputs. It will look back through the same use-def chain and calculate a new value.

This provides a cleaner interface than the existing symbolic range mechanism.

There is no special handling to evaluate expressions within ranges. Most code is fairly straightforward now.

The "expressions" that can be handled are the full range of expressions the ranger understands by walking the IL, not just the subset we decide to allow in value_range.

We never have to restrict or give up on performing a union or intersection because there is one or more symbolic expressions involved.

There is much less special casing for handling the various states. VRP is full of "is this or that end point a symbolic" checks. We only deal with wide-int based range endpoints.

This also enables us to expand the number of sub-ranges. When trying to utilize symbolics within ranges, I found it very difficult to perform operations such as union on more than one range (or anti-range).
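The on-demand re-evaluation described in this section can be illustrated with a small standalone sketch. Everything in it is hypothetical (toy_range, toy_def, toy_function and range_of are not names from the branch), the only "statement" it understands is name = operand + constant, and it ignores overflow, PHIs and loops. The point is only that the def chain itself plays the role of the symbolic expression, so changing an input and asking again yields the new range.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy single-interval range (hypothetical, not irange).
struct toy_range {
  int64_t lo, hi;
};

// One toy statement: name = operand + addend.
struct toy_def {
  std::string operand;
  int64_t addend;
};

struct toy_function {
  std::map<std::string, toy_def> defs;      // name -> defining statement
  std::map<std::string, toy_range> inputs;  // ranges of names with no def here

  // Walk back through the def chain on demand; nothing is precomputed and
  // no symbolic end points are stored anywhere.
  toy_range range_of(const std::string &name) const {
    auto d = defs.find(name);
    if (d == defs.end()) {
      // Reached the start of the chain: use the known input range.
      auto in = inputs.find(name);
      assert(in != inputs.end());
      return in->second;
    }
    toy_range r = range_of(d->second.operand);
    return {r.lo + d->second.addend, r.hi + d->second.addend};
  }
};
```

A caller that learns a better range for an input simply updates the inputs table and calls range_of again; the same chain is walked and a new result comes back.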

Class range-op

Range-op was designed to abstract all operations to a simple interface that once implemented, would provide a single location which performs all the actions required to perform an operation on a range. Most importantly, it also allows a range for an operand to be calculated from the range of the definition and the range of the other operands. range-op consists of 3 primary routines that need to be implemented, either whole or in part.

Given lhs = op1 OP op2:

bool fold_range (irange& r, const irange& op1, const irange& op2) : Perform the operation op1 OP op2.

bool op1_irange (irange& r, const irange& lhs, const irange& op2) : Calculate op1 given a range for lhs and op2.

bool op2_irange (irange& r, const irange& lhs, const irange& op1) : Calculate op2 given a range for lhs and op1.

This is where the Ranger gets its ability to work backwards through the IL determining ranges:

  a_6 = b_5 + 10
  if (a_6 > 10)

The Ranger can trivially calculate the range for a_6 on the true edge to be [11, MAX]. By utilizing the backward calculating ability, it can determine that the range on this edge for b_5 is:

[11, MAX] = b_5 + 10, or [11, MAX] - 10 = b_5, or finally resolves to [1, MAX-10]

If the incoming range of b_5 changes, or we detect a change in the range of a_6 which changes its range, we simply invoke the backward walking calculation again and get a new result.
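The a_6 / b_5 example above can be reproduced with a small standalone sketch. It is not the range-ops code: toy_range, fold_plus, op1_plus and true_edge_gt are hypothetical names, ranges are single intervals over a 32-bit signed type held in int64_t so the arithmetic cannot overflow, and signed overflow in the original program is assumed not to happen.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Toy single-interval range (hypothetical, not irange).
struct toy_range {
  int64_t lo, hi;
};

// Bounds of the 32-bit signed type being modelled.
constexpr int64_t type_min = std::numeric_limits<int32_t>::min();
constexpr int64_t type_max = std::numeric_limits<int32_t>::max();

// Range of x on the true edge of "if (x > c)". Assumes c < type_max.
toy_range true_edge_gt(int64_t c) {
  assert(c < type_max);
  return {c + 1, type_max};
}

// Forward direction: lhs = op1 + op2, clamped to the type's bounds.
bool fold_plus(toy_range &r, const toy_range &op1, const toy_range &op2) {
  r.lo = std::max(type_min, op1.lo + op2.lo);
  r.hi = std::min(type_max, op1.hi + op2.hi);
  return r.lo <= r.hi;  // false if nothing representable remains
}

// Backward direction: solve lhs = op1 + op2 for op1, i.e. op1 = lhs - op2.
bool op1_plus(toy_range &r, const toy_range &lhs, const toy_range &op2) {
  r.lo = std::max(type_min, lhs.lo - op2.hi);
  r.hi = std::min(type_max, lhs.hi - op2.lo);
  return r.lo <= r.hi;
}
```

Feeding true_edge_gt(10) as lhs and [10, 10] as op2 into op1_plus gives [1, MAX-10], matching the result worked out by hand above.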

Once an operation is implemented and a handler installed, it is available to all aspects of the Ranger. Virtually every manipulation of ranges within the IL is handled through the general range-ops interface. Using this general model, we have found numerous unexpected ranges, producing some surprising results. Quite frequently now I find the "bug" in ranges I am looking at is in fact a correct range; it is my assumption or test case that is wrong. It is rapidly growing smarter than I am as we teach it more operations.

Not all operations have an algebraic equivalent to allow op1 or op2 to be calculated. That just means they don't get implemented, and we cannot wind backwards through those operations.

A guide for implementing range operations is here. Adding a new operation is straightforward most of the time.

The complete patch for adding ABS_EXPR support is here. Aldy implemented this code for ABS_EXPR, which he managed from scratch even without consulting the document.

Technical notes on the Ranger

After many prototypes and discoveries, we have arrived at what we think is a good approach for the on-demand range model. It operates from simple basic principles and builds on each of those principles to get to the path_ranger which puts the whole thing together. Each layer is contained within a file of its own.

ssa-range-stmt.[ch]

This file contains a class which represents a gimple statement. It performs a lot of summarizing the range related information in a statement for easy consumption by the ranger. In particular, it acts as a direct interface to the range-ops routines as applied to this particular statement.

It defines class range_stmt, which is initialized by assigning/creating from a gimple statement. It works for unary and binary statements only at this point. During initialization it examines the statement to see if it is something the ranger understands (i.e., it has a range-op handler), and that all the operands are irange compatible (integral, ssa_names, etc.). If any of these conditions are not met, the statement is considered invalid. You will see this coding paradigm throughout the ranger:

  range_stmt s(stmt);
  if (!s.valid())
    return;

This allows the statement to then be used without any additional error checking.

ssa-range-bb.[ch]

This file contains the block_ranger. It is responsible for determining what ssa_names (if any) within a block can have their ranges calculated, what input requirements are, and actually calculating the range within a block.

It has a private class gori (Generates Outgoing Range Info) which is responsible for determining (on-demand) the definition chains within a block, as well as any imports to the chain. Imports are ssa-names which can affect the calculation of a range with a different input. This will primarily be of use when implementing an iterative solution which recalculates ranges when input conditions change, subsuming the need to have symbolic ranges in class irange. Aldy has found another use for it in his backwards threading algorithm as well.

The export list is the list of ssa-names in the block for which a range might be generated. Any name not in this list doesn't require examining this block at all. This information is utilized by the path ranger to decide which blocks to generate a range for, and which ones to simply pass ranges through.

The block ranger itself has a similar API to the path ranger, except it only calculates ranges within the block, not utilizing any external information. It generates what we refer to as the static range. As it is calculated with no external inputs, it is a constant, and can be cached on edges. We don't do this yet, but have thoughts of doing so to speed things up if needed.

Important to note: The block ranger is all about static ranges. The path ranger will then combine the static ranges to produce dynamic ranges over the CFG.

Also of note, the protected routine get_operand_range() is used to pick up the range of an ssa_name on a statement within the engine. As expected, the block_ranger's version will look within the block, and then stop looking. The routine is virtual, which allows the path_ranger to implement a version which will look outside the block. The same block_ranger routines which calculate static ranges can then also pick up incoming ranges from the path_ranger when desired to produce more accurate results, and the path_ranger does not need to re-implement any of the basic static range analysis code.
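The virtual get_operand_range() arrangement described above can be sketched as follows. This is a standalone illustration only: the classes toy_block_ranger and toy_path_ranger, the signature given to get_operand_range, and the helpers set_local, set_incoming and range_of_plus are all hypothetical, and the only statement modelled is name + constant.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy single-interval range (hypothetical, not irange).
struct toy_range {
  int64_t lo, hi;
};

class toy_block_ranger {
public:
  virtual ~toy_block_ranger() = default;

  // Shared analysis code: range of "name + addend" on a statement.
  // It never needs to know where the operand's range came from.
  bool range_of_plus(toy_range &r, const std::string &name,
                     int64_t addend) const {
    toy_range op;
    if (!get_operand_range(op, name))
      return false;
    r = {op.lo + addend, op.hi + addend};
    return true;
  }

  // Record a range established inside the block.
  void set_local(const std::string &name, toy_range r) { m_local[name] = r; }

protected:
  // Static version: look only within the block, then stop looking.
  virtual bool get_operand_range(toy_range &r, const std::string &name) const {
    auto it = m_local.find(name);
    if (it == m_local.end())
      return false;
    r = it->second;
    return true;
  }

private:
  std::map<std::string, toy_range> m_local;
};

class toy_path_ranger : public toy_block_ranger {
public:
  // Record a range flowing into the block from elsewhere in the CFG.
  void set_incoming(const std::string &name, toy_range r) {
    m_incoming[name] = r;
  }

protected:
  // Dynamic version: try the block first, then look outside it.
  bool get_operand_range(toy_range &r, const std::string &name) const override {
    if (toy_block_ranger::get_operand_range(r, name))
      return true;
    auto it = m_incoming.find(name);
    if (it == m_incoming.end())
      return false;
    r = it->second;
    return true;
  }

private:
  std::map<std::string, toy_range> m_incoming;
};
```

Note that range_of_plus lives only in the base class; the derived class gains more accurate answers purely by overriding the operand lookup.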

ssa-range.[ch]

This contains the code for the path_ranger which is the primary interface most people will care about. It also contains a few private classes which are used for various purposes:

class non_null_ref - Used to track non null pointer references within blocks

class ssa_block_ranges - cache for ranges of an ssa_name on exit from basic blocks

class block_range_cache - managed list of ssa_block_ranges

Path_ranger inherits from block_ranger and takes care of acquiring any requisite ranges spanning the CFG.

A key routine is path_ranger::path_range_stmt(), which calculates a range for the result of a statement. You may have noted that the range-ops interface only handles unary and binary opcodes. Other statements of interest are handled here, such as PHIs and CALLs. We may someday try to unify these into range-ops or an equivalent, but it hasn't seemed necessary so far. Also, certain things like PHI nodes require CFG and edge awareness, along with other non-standard handling.

class path_ranger also has a method called exercise. This loops through the entire IL asking for the range of every ssa_name at the beginning of each block, as well as on each outgoing edge. This provides coverage to check that there are no ICEs when calculating any range at any place. When -fdump-tree-rvrp-details is used, the dump calls exercise and lists any range which doesn't map to range_for_type(). This is how we show which ranges we know.

ssa-range-global.[ch]

This implements a global cache which the Ranger utilizes to track what it currently knows about an SSA_NAME. It is local to the ranger, and is currently thrown away when the ranger is done. In the fullness of time, this could perhaps become persistent, but for now it forces re-evaluation so we do not need to detect whether any of the input conditions to a range have been altered since the last calculation. It is also worth noting that when no range information is available, the Ranger will start with whatever GCC's current global range is, so any previous runs of VRP or loop analysis will be integrated into the results.
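The behaviour described above (a cache private to one ranger instance, seeded from whatever was already known, and discarded afterwards) can be sketched in standalone form. The names toy_global_cache, prior_ranges, get_range and set_range are hypothetical, SSA names are reduced to version numbers, and the prior_ranges map merely stands in for the range information GCC already holds.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy single-interval range (hypothetical, not irange).
struct toy_range {
  int64_t lo, hi;
};

// Stand-in for ranges known before the ranger starts (for example from an
// earlier VRP run), keyed by SSA version number.
using prior_ranges = std::map<unsigned, toy_range>;

class toy_global_cache {
public:
  explicit toy_global_cache(const prior_ranges &prior) : m_prior(prior) {}

  // Return what is currently known about VERSION: the ranger's own value if
  // it has computed one, otherwise whatever was known before it started.
  bool get_range(toy_range &r, unsigned version) const {
    auto it = m_cache.find(version);
    if (it != m_cache.end()) {
      r = it->second;
      return true;
    }
    auto p = m_prior.find(version);
    if (p != m_prior.end()) {
      r = p->second;
      return true;
    }
    return false;
  }

  // Record a range the ranger has just calculated.
  void set_range(unsigned version, const toy_range &r) { m_cache[version] = r; }

private:
  const prior_ranges &m_prior;            // not owned; must outlive the cache
  std::map<unsigned, toy_range> m_cache;  // discarded with the cache object
};
```

Since the cache dies with its owner, a later instance starts again from the prior ranges and recomputes, which avoids having to track whether any input has changed in the meantime.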

ssa-range-vrp.[ch]

This implements the Ranger VRP (RVRP) pass. It is basically a fast and simple branch-oriented VRP which demonstrates both the ease of use and the power of the ranger. It simply looks at the last statement of each block and, if it is a branch, checks to see if the branch can be folded. The primary loop of the optimization is:

  path_ranger ranger;
  FOR_EACH_BB_FN (bb, cfun)
    {
      gcond *cond;
      gimple *stmt = last_stmt (bb);
      if (stmt && (cond = dyn_cast <gcond *> (stmt)))
        {
          // See if there is any range produced by the condition
          irange r;
          if (ranger.path_range_stmt (r, stmt))
            {
              // If it evaluates to a constant, retrieve that value
              if (r.singleton_p (i))
                {
                  if (!argument_ok_to_propagate (gimple_cond_lhs (cond))
                      || !argument_ok_to_propagate (gimple_cond_rhs (cond)))
                    continue;
                  if (wi::eq_p (i, 0))
                    gimple_cond_make_false (cond);
                  else
                    gimple_cond_make_true (cond);
                  update_stmt (cond);
                }
            }
        }
    }