What is Teradata Query Rewriting?

Teradata query rewriting is part of the Teradata optimization process.

The optimizer replaces your query with a better version, which is assumed to be less resource-intense and, hopefully, faster.

Of course, both queries have to deliver the same result set. To achieve better performance, Teradata utilizes several optimization strategies.

We will show you the most frequently applied optimization techniques.

1. Removing of unreferenced columns and expressions from the VIEW select list

Without this enhancement, all VIEW columns would have to be spooled during query execution.

Let’s assume we have the following view definition:

CREATE VIEW MyView AS

SELECT col1, col2, AVG(col3) myavg

FROM table1 t01, table2 t02

WHERE t01.pk=t02.pk GROUP BY 1,2;

We are running the following query:

SELECT MAX(myavg) FROM MyView;

In the above example, view columns col1 and col2 can be eliminated, and only the column myavg will be spooled.

Another example will show you that it is even possible to eliminate spool usage:

SELECT COUNT(*) FROM MyView;

The above query does not reference any view column. The number of rows can be derived from the cylinder index alone!

The elimination of view columns from the spool is a widespread technique the optimizer uses for spool usage reduction.

2. Conversion of outer joins to inner joins

Whenever the optimizer detects that a query can be satisfied with an inner join, the outer join will be replaced to improve join performance:

SELECT DISTINCT(col1)

FROM table1 LEFT OUTER JOIN table2 ON table1.pk=table2.pk

WHERE table2.col2 = 100 ;

In the above example, the outer join can safely be converted to an inner join as the WHERE condition filters any non-matching rows.

3. Folding of Views

When applying this enhancement, the optimizer completely removes the reference to views from the execution plan (this is called view folding).

Let’s assume we have the following view definition:

CREATE VIEW MyView AS

SELECT col1, col2, SUM(col3*col4) the_sum

FROM table1, table2

WHERE table1.pk=table2.pk GROUP BY 1,2;

We run the following query:

SELECT col2 FROM MyView WHERE the_sum > 100;

The optimizer would rewrite the above example in the following form:

SELECT col2 FROM table1, table2

WHERE table1.pk=table2.pk

GROUP BY 1,2

HAVING SUM(col3*col4) > 100;

As you can see, the view was eliminated from the execution plan.

4. Predicate Pushdown

With this technique, the optimizer pushes down existing query predicates. In the above example, the WHERE predicate would be applied already in the inner sub-query t01:

SELECT SUM(the_sum) the_overall_sum

FROM (

SELECT col1, col2, SUM(col3*col4) the_sum

FROM table1, table2

WHERE table1.col1=table2.col2

GROUP BY 1, 2 ) t01

WHERE col1 IN (1,2,3,4,5) ;

5. Elimination of SET operator branches

In case a SET branch contains an unsatisfiable condition, the branch will be eliminated from the query.

Let’s assume we have the example query below:

SELECT * FROM table1 WHERE month_id= 1

UNION ALL

SELECT * FROM table2 WHERE month_id = 1 ;

Further, we assume to have a check constraint on month_id, ensuring that table1 only contains month_id = 1 and table2 contains month_id = 2:

The optimizer can safely rewrite the above query to:

SELECT * FROM table1 WHERE month_id = 1 ;

6. Elimination of Joins

Unneeded Joins are eliminated from the execution plan:

SELECT table1.* FROM table1 LEFT JOIN table2 ON table1.pk=table2.pk ;

The above query can be satisfied by spooling only table1:

SELECT table1.* FROM table1;

As you can see, the optimizer tries its best to improve your queries. I hope you enjoyed this post!