Making find -exec faster

Written on 15 Jan 2015, last updated on 3 May 2019

Here’s a little find trick that few people seem to know:

    # 13 seconds
    $ time find . -type f -exec stat {} \; > /dev/null
    13.20s real 3.94s user 9.22s sys

    # 1.5 seconds; that's almost 10 times faster!
    $ time find . -type f -exec stat {} + > /dev/null
    1.48s real 0.68s user 0.79s sys

    # Run the first command again, to make sure we’re not being biased by fs
    # cache or got some fluke
    $ time find . -type f -exec stat {} \; > /dev/null
    13.40s real 3.67s user 9.51s sys

    # FYI
    $ find . -type f | wc -l
    2641

That’s quite a large difference! All we did was swap the ; for a +.

Let’s see what POSIX has to say about it:

If the primary expression is punctuated by a <semicolon>, the utility utility_name shall be invoked once for each pathname [.. snip ..]

If the primary expression is punctuated by a <plus-sign>, the primary shall always evaluate as true, and the pathnames for which the primary is evaluated shall be aggregated into sets. The utility utility_name shall be invoked once for each set of aggregated pathnames.

Or in plain English: if you use ; find will execute the utility once for every path; if you use + it will cram as many paths as it can in an invocation.
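If you want to see the difference without timing anything, here is a small sketch that counts invocations directly. The scratch directory and file names are made up for the demo; the inline sh script just prints one line each time it is started.

```shell
# Build a scratch directory with five empty files (names are arbitrary).
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c" "$dir/d" "$dir/e"

# With ';' the inline script is started once per file, so it prints five lines.
per_file=$(find "$dir" -type f -exec sh -c 'echo "$#"' _ {} \; | wc -l | tr -d ' ')

# With '+' the paths are batched; five short paths fit in a single
# invocation, so the script prints just one line.
batched=$(find "$dir" -type f -exec sh -c 'echo "$#"' _ {} + | wc -l | tr -d ' ')

echo "invocations with ';': $per_file"
echo "invocations with '+': $batched"

rm -rf "$dir"
```

With only five files everything fits in one batch; with enough files to exceed the limit discussed next, find would start more than one batch.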

How many? Well, as many as ARG_MAX allows. Quoting from POSIX again:

{ARG_MAX}

Maximum length of argument to the exec functions including environment data.

Minimum Acceptable Value: {_POSIX_ARG_MAX}

{_POSIX_ARG_MAX}

Maximum length of argument to the exec functions including environment data.

Value: 4096

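Most contemporary systems have it set much higher though; Linux (3.16, x86_64) defines ARG_MAX as 131072 (128k), while FreeBSD (10, i386) gives it as 262144 (256k). On Linux the effective limit can be larger than that constant, since kernels from 2.6.23 onwards derive it from the stack size limit.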
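To check the value on your own machine, getconf can report it. The sketch below also estimates how much of the limit the environment already uses; that part is only an approximation, since the exact accounting (terminators, pointers) differs between systems.

```shell
# Ask the running system for its ARG_MAX; the value varies by OS and configuration.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"

# The environment counts against the same limit, so estimate its size too.
env_bytes=$(env | wc -c | tr -d ' ')
echo "environment: roughly $env_bytes bytes"
echo "left for arguments: roughly $((arg_max - env_bytes)) bytes"
```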

Let’s verify this with truss:

    # Number of files we have
    $ find . -type f | wc -l
    2641

    $ truss find . -type f -exec stat {} \; >& truss-slow
    $ truss find . -type f -exec stat {} + >& truss-fast

    # Less than ARG_MAX, so we expect one fork()
    $ find . -type f | xargs | wc -c
    119528

    # Yup!
    $ grep fork truss-fast | wc -l
    1

    # And we fork() once for every file
    $ grep fork truss-slow | wc -l
    2641

There is one small caveat. This won’t work:

    # FreeBSD find
    $ find . -type f -exec cp {} /tmp +
    find: -exec: no terminating ";" or "+"

    # GNU find is even more cryptic
    $ find . -type f -exec cp {} /tmp +
    find: missing argument to `-exec'

Going back to POSIX:

Only a <plus-sign> that immediately follows an argument containing only the two characters “{}” shall punctuate the end of the primary expression. Other uses of the <plus-sign> shall not be treated as special.

In other words, the command needs to end with {} +. The command cp {} /tmp + doesn’t, and thus gives an error.

We can work around this by spawning a sh one-liner:

$ find . -type f -exec sh -c 'cp "$@" /tmp' _ {} +

You need to pass the _ because sh -c assigns the first argument after the command string to the special parameter $0. Without a placeholder, the first path would end up in $0 and never reach "$@".
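The $0 behaviour is easy to check on its own, and the same pattern can be tried safely in a scratch directory instead of /tmp. This is a rough sketch: the directory names and the dest variable are made up for the demo, and passing the destination as an extra argument is a variation on the one-liner above rather than anything find requires.

```shell
# The first argument after the script becomes $0; the rest become $1, $2, ...
with_placeholder=$(sh -c 'echo "$@"' _ one two three)
echo "$with_placeholder"      # prints: one two three

# Without a placeholder, "one" is taken as $0 and is missing from "$@".
without_placeholder=$(sh -c 'echo "$@"' one two three)
echo "$without_placeholder"   # prints: two three

# The workaround, tried in scratch directories (hypothetical names).
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/a" "$src/b" "$src/c"

# "{} +" still ends the command; the destination travels as $1 and is
# shifted away so that "$@" holds only the paths find supplied.
find "$src" -type f -exec sh -c 'dest=$1; shift; cp "$@" "$dest"' _ "$dst" {} +

copied=$(ls "$dst" | wc -l | tr -d ' ')
echo "copied $copied files"

rm -rf "$src" "$dst"
```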

Footnotes

Linux users can use strace; OpenBSD users ktrace.