the original test environmet by dark100 at http://sljit.sourceforge.net/regex_perf.html

clone this test environmet

Performance comparison of regular expression engines

Introduction

Participants

The engines were not fine tuned (because of my lack of knowledge about their internal workings). I just compiled them with the default options. I know enabling or disabling some features can heavily affect the results. If you feel that you have a better configuration just drop me an e-mail and I will update the results (mailto:nasciiboy@gmail.com).

The regular expression engines are compiled with -O3 to allow the best performance.

This comparison page was inspired by the work of John Maddock (See his own regex comparison here). The input (data.txt) is also the same he used before: mtent12.zip. It is a text file (e-book) which size is about 20 Mbytes.

Only common patterns are selected, they are not pathological cases nor have any PERL specific features. The comparison was caseful.

Results

x86-64 bit Intel Cerelon 847 1.1GHz (GCC 7.2.0, Go 1.9.1 – GNU/Linux)

Regular expression tre re2Golang re2 pcre-JIT pcre-DFA pcre onig regexp4Golang regexp4 regexp3 ".|

" "." 6486ms (19285324) 34159ms (19285324) 9072ms (19285324) 1130ms (19285324) 6677ms (19285324) 4448ms (19285324) 12876ms (19285324) 1553ms (19285324) 605ms (19285324) 1808ms (19285324) "\d" ":d" 1044ms (27084) 3166ms (27084) 134ms (27084) 55ms (27084) 68ms (27084) 65ms (27084) 127ms (27084) 1340ms (27084) 594ms (27084) 1853ms (27084) "\S" ":S" 4633ms (15451664) 32530ms (15451664) 7405ms (15451664) 931ms (15451664) 4593ms (15451664) 3086ms (15451664) 10453ms (15451664) 2056ms (15451664) 810ms (15451664) 1912ms (15451664) "\S+" ":S+" 2565ms (3414592) 8000ms (3414592) 1957ms (3414592) 333ms (3414592) 1619ms (3414592) 924ms (3414592) 3114ms (3414592) 1521ms (3414592) 654ms (3414592) 1059ms (3414592) "\w" ":w" 4769ms (14751878) 32506ms (14751878) 7052ms (14751878) 952ms (14751878) 4537ms (14751878) 2994ms (14751878) 10865ms (14751878) 1811ms (14750958) 819ms (14750958) 1888ms (14750958) "[:w_]" "\w" 4762ms (14751878) 29682ms (14751878) 7066ms (14751878) 952ms (14751878) 4486ms (14751878) 2989ms (14751878) 10663ms (14751878) 2472ms (14751878) 1160ms (14751878) 3171ms (14751878) "[a-zA-Z0-9_]" "[a-zA-Z0-9_]" 4650ms (14751878) 29437ms (14751878) 7121ms (14751878) 994ms (14751878) 4538ms (14751878) 3074ms (14751878) 10483ms (14751878) 2735ms (14751878) 1104ms (14751878) 5260ms (14751878) "[a-zA-Z]+" "[a-zA-Z]+" 2317ms (3495761) 7582ms (3495761) 2048ms (3495761) 327ms (3495761) 1551ms (3495761) 1007ms (3495761) 2956ms (3495761) 2250ms (3495761) 860ms (3495761) 2771ms (3495761) "[.\s]+" "[.:s]+" 2205ms (991813) 9541ms (3430783) 2036ms (3430783) 399ms (3430783) 1037ms (3430783) 949ms (3430783) 2676ms (3430783) 2851ms (3430783) 1363ms (3430783) 3741ms (3430783) "([^

]+)" "<[^\

]+>" 1591ms (314387) 5611ms (314387) 430ms (314387) 82ms (314387) 1038ms (314387) 210ms (314387) 697ms (314387) 1301ms (314387) 578ms (314387) 788ms (314387) "e" "e" 520ms (1781425) 2842ms (1781425) 1000ms (1781425) 139ms (1781425) 439ms (1781425) 380ms (1781425) 1434ms (1781425) 1739ms (1781425) 663ms (1781425) 1809ms (1781425) "(((((e)))))" "<<<<<e>>>>>" 529ms (1781425) 4399ms (1781425) 1001ms (1781425) 203ms (1781425) 1269ms (1781425) 1256ms (1781425) 1971ms (1781425) 6168ms (1781425) 2360ms (1781425) 19939ms (1781425) "((((((((((e))))))))))" "<<<<<<<<<<e>>>>>>>>>>" 525ms (1781425) 6709ms (1781425) 1006ms (1781425) 298ms (1781425) 1850ms (1781425) 1945ms (1781425) 2176ms (1781425) 12171ms (1781425) 4034ms (1781425) 58376ms (1781425) "Twain" "Twain" 1022ms (2388) 12ms (2388) 8ms (2388) 49ms (2388) 47ms (2388) 10ms (2388) 52ms (2388) 1610ms (2388) 589ms (2388) 2450ms (2388) "(Twain)" "<Twain>" 1015ms (2388) 13ms (2388) 8ms (2388) 49ms (2388) 48ms (2388) 14ms (2388) 53ms (2388) 2294ms (2388) 927ms (2388) 6027ms (2388) "(?i)Twain" "#*Twain" 1337ms (2657) 3515ms (2657) 166ms (2657) 51ms (2657) 287ms (2657) 203ms (2657) 349ms (2657) 1616ms (2657) 727ms (2657) 2575ms (2657) "(?i:Tw)(?:(?:a?A?[i_I]{0,2})?[nN])" "(Tw)#*((a?A?[i_I]{0,2})?[nN])" 1375ms (2989) 3474ms (2989) 168ms (2989) 56ms (2989) 319ms (2989) 236ms (2989) 353ms (2989) 2154ms (2989) 1021ms (2989) 9216ms (2989) "[a-z]shing" "[a-z]shing" 1597ms (1877) 3614ms (1877) 277ms (1877) 47ms (1877) 2302ms (1877) 1487ms (1877) 49ms (1877) 3250ms (1877) 1267ms (1877) 5413ms (1877) "Huck[a-zA-Z]+|Saw[a-zA-Z]+" "Huck[a-zA-Z]+|Saw[a-zA-Z]+" 1779ms (396) 4431ms (396) 126ms (396) 8ms (396) 72ms (396) 70ms (396) 117ms (396) 3894ms (396) 1456ms (396) 6599ms (396) "[a-q][^u-z]{13}x" "[a-q][^u-z]{13}x" 4446ms (5021) 11181ms (5021) 613ms (5021) 5ms (5021) 6169ms (5021) 1742ms (5021) 135ms (5021) 9796ms (5021) 3347ms (5021) 10391ms (5021) "Tom|Sawyer|Huckleberry|Finn" "Tom|Sawyer|Huckleberry|Finn" 3151ms (3015) 9003ms (3015) 134ms (3015) 89ms (3015) 96ms (3015) 97ms (3015) 141ms (3015) 6608ms (3015) 2608ms (3015) 10659ms (3015) "(Tom|Sawyer|Huckleberry|Finn)" "<Tom|Sawyer|Huckleberry|Finn>" 3302ms (3015) 8876ms (3015) 131ms (3015) 87ms (3015) 100ms (3015) 102ms (3015) 142ms (3015) 7240ms (3015) 2964ms (3015) 23149ms (3015) "[hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo]" "[hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo]" 3656ms (534) 4503ms (534) 250ms (534) 247ms (534) 889ms (534) 638ms (534) 687ms (534) 2903ms (534) 1589ms (534) 10916ms (534) "Tom.{10,25}river|river.{10,25}Tom" 1844ms (2) 4541ms (2) 152ms (2) 44ms (2) 247ms (2) 211ms (2) 236ms (2) N/A N/A N/A "ing[^a-zA-Z]" "ing[^a-zA-Z]" 1149ms (85956) 228ms (85956) 111ms (85956) 54ms (85956) 243ms (85956) 140ms (85956) 140ms (85956) 1611ms (85956) 627ms (85956) 3131ms (85956) "[a-zA-Z]ing[^a-zA-Z]" "[a-zA-Z]ing[^a-zA-Z]" 1859ms (85823) 3884ms (85823) 309ms (85823) 56ms (85823) 2376ms (85823) 1549ms (85823) 143ms (85823) 3274ms (85823) 1328ms (85823) 6761ms (85823) "([a-zA-Z]+ing)" 2070ms (95863) 5411ms (95863) 318ms (95863) 213ms (95863) 6616ms (95863) 4174ms (95863) 2603ms (95863) N/A N/A N/A

x86-64 bit AMD A6-9500 3.8GHz (GCC 7.2.1, Go 1.9.1 – GNU/Linux)

Regular expression tre re2Golang re2 pcre-JIT pcre-DFA pcre onig regexp4Golang regexp4 regexp3 ".|

" "." 2039ms (19285324) 12477ms (19285324) 3361ms (19285324) 594ms (19285324) 1673ms (19285324) 1379ms (19285324) 4188ms (19285324) 559ms (19285324) 190ms (19285324) 660ms (19285324) "\d" ":d" 347ms (27084) 1064ms (27084) 46ms (27084) 16ms (27084) 19ms (27084) 20ms (27084) 43ms (27084) 478ms (27084) 253ms (27084) 669ms (27084) "\S" ":S" 1468ms (15451664) 9661ms (15451664) 2779ms (15451664) 339ms (15451664) 1040ms (15451664) 964ms (15451664) 3649ms (15451664) 721ms (15451664) 287ms (15451664) 709ms (15451664) "\S+" ":S+" 908ms (3414592) 2793ms (3414592) 728ms (3414592) 113ms (3414592) 501ms (3414592) 303ms (3414592) 943ms (3414592) 531ms (3414592) 200ms (3414592) 368ms (3414592) "\w" ":w" 1483ms (14751878) 9036ms (14751878) 2637ms (14751878) 345ms (14751878) 1001ms (14751878) 927ms (14751878) 3387ms (14751878) 637ms (14750958) 305ms (14750958) 693ms (14750958) "\w" "[:w_]" 1479ms (14751878) 9049ms (14751878) 2627ms (14751878) 345ms (14751878) 1003ms (14751878) 920ms (14751878) 3364ms (14751878) 849ms (14751878) 433ms (14751878) 1190ms (14751878) "[a-zA-Z0-9_]" "[a-zA-Z0-9_]" 1476ms (14751878) 9898ms (14751878) 2636ms (14751878) 346ms (14751878) 1016ms (14751878) 966ms (14751878) 3384ms (14751878) 920ms (14751878) 390ms (14751878) 1953ms (14751878) "[a-zA-Z]+" "[a-zA-Z]+" 782ms (3495761) 3242ms (3495761) 744ms (3495761) 116ms (3495761) 491ms (3495761) 292ms (3495761) 977ms (3495761) 750ms (3495761) 301ms (3495761) 927ms (3495761) "[.\s]+" "[.:s]+" 658ms (991813) 3080ms (3430783) 684ms (3430783) 117ms (3430783) 329ms (3430783) 276ms (3430783) 865ms (3430783) 955ms (3430783) 484ms (3430783) 1338ms (3430783) "([^

]+)" "<[^\

]+>" 562ms (314387) 1909ms (314387) 149ms (314387) 32ms (314387) 342ms (314387) 64ms (314387) 234ms (314387) 433ms (314387) 174ms (314387) 246ms (314387) "e" "e" 166ms (1781425) 952ms (1781425) 339ms (1781425) 54ms (1781425) 149ms (1781425) 120ms (1781425) 440ms (1781425) 581ms (1781425) 231ms (1781425) 679ms (1781425) "(((((e)))))" "<<<<<e>>>>>" 166ms (1781425) 1472ms (1781425) 347ms (1781425) 94ms (1781425) 433ms (1781425) 381ms (1781425) 540ms (1781425) 2126ms (1781425) 824ms (1781425) 7649ms (1781425) "((((((((((e))))))))))" "<<<<<<<<<<e>>>>>>>>>>" 166ms (1781425) 2267ms (1781425) 342ms (1781425) 136ms (1781425) 841ms (1781425) 617ms (1781425) 678ms (1781425) 3981ms (1781425) 1465ms (1781425) 24102ms (1781425) "Twain" "Twain" 334ms (2388) 5ms (2388) 3ms (2388) 12ms (2388) 16ms (2388) 4ms (2388) 13ms (2388) 559ms (2388) 211ms (2388) 1030ms (2388) "(Twain)" "<Twain>" 333ms (2388) 5ms (2388) 3ms (2388) 12ms (2388) 16ms (2388) 5ms (2388) 14ms (2388) 781ms (2388) 336ms (2388) 2506ms (2388) "#*Twain" "(?i)Twain" 445ms (2657) 1204ms (2657) 65ms (2657) 13ms (2657) 96ms (2657) 71ms (2657) 117ms (2657) 543ms (2657) 246ms (2657) 1061ms (2657) "(?i:Tw)(?:(?:a?A?[i_I]{0,2})?[nN])" "(Tw)#*((a?A?[i_I]{0,2})?[nN])" 458ms (2989) 1191ms (2989) 68ms (2989) 25ms (2989) 102ms (2989) 80ms (2989) 113ms (2989) 729ms (2989) 334ms (2989) 3617ms (2989) "[a-z]shing" "[a-z]shing" 524ms (1877) 1320ms (1877) 90ms (1877) 11ms (1877) 750ms (1877) 498ms (1877) 12ms (1877) 1073ms (1877) 436ms (1877) 1837ms (1877) "Huck[a-zA-Z]+|Saw[a-zA-Z]+" "Huck[a-zA-Z]+|Saw[a-zA-Z]+" 548ms (396) 1511ms (396) 42ms (396) 4ms (396) 23ms (396) 23ms (396) 39ms (396) 1271ms (396) 494ms (396) 2483ms (396) "[a-q][^u-z]{13}x" "[a-q][^u-z]{13}x" 1467ms (5021) 3540ms (5021) 359ms (5021) 3ms (5021) 1985ms (5021) 588ms (5021) 40ms (5021) 3306ms (5021) 1190ms (5021) 3356ms (5021) "Tom|Sawyer|Huckleberry|Finn" "Tom|Sawyer|Huckleberry|Finn" 931ms (3015) 2794ms (3015) 44ms (3015) 24ms (3015) 30ms (3015) 31ms (3015) 47ms (3015) 2295ms (3015) 864ms (3015) 4132ms (3015) "(Tom|Sawyer|Huckleberry|Finn)" "<Tom|Sawyer|Huckleberry|Finn>" 937ms (3015) 2956ms (3015) 44ms (3015) 26ms (3015) 31ms (3015) 33ms (3015) 47ms (3015) 2511ms (3015) 1011ms (3015) 9374ms (3015) "[hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo]" "[hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo][hHeELlOo]" 1104ms (534) 1526ms (534) 90ms (534) 81ms (534) 288ms (534) 203ms (534) 227ms (534) 954ms (534) 594ms (534) 3782ms (534) "Tom.{10,25}river|river.{10,25}Tom" 604ms (2) 1520ms (2) 57ms (2) 20ms (2) 76ms (2) 71ms (2) 76ms (2) N/A N/A N/A "ing[^a-zA-Z]" "ing[^a-zA-Z]" 379ms (85956) 76ms (85956) 41ms (85956) 23ms (85956) 79ms (85956) 48ms (85956) 39ms (85956) 592ms (85956) 222ms (85956) 1255ms (85956) "[a-zA-Z]ing[^a-zA-Z]" "[a-zA-Z]ing[^a-zA-Z]" 639ms (85823) 1341ms (85823) 103ms (85823) 24ms (85823) 767ms (85823) 485ms (85823) 39ms (85823) 1143ms (85823) 461ms (85823) 2355ms (85823) "([a-zA-Z]+ing)" 717ms (95863) 1903ms (95863) 106ms (95863) 75ms (95863) 1766ms (95863) 1424ms (95863) 925ms (95863) N/A N/A N/A

Compile

Deps

bash

make

gcc

g++

go

get data

$ wget http://www.gutenberg.org/files/3200/old/mtent12.zip $ dos2unix mtent12.txt data.txt $ dos2unix data.txt $ rm mtent12*

build benchmarks

$ ./rebuild.sh $ ./rebench.sh

clean