Find -exec, actions, fork and speed

Assuming you understand a command such as find . -type f -exec ls -l {} \; this post explains certain subtleties about find, fork, and speed. First, remember not to pipe find into the very bad and dangerous xargs (by default it mangles filenames containing whitespace or quotes), as explained in this must-read “find guide”[1].

So, if you can’t use xargs, you’ll use -exec. What I want to discuss is the difference between \; and \+ at the end of a find -exec command. (The backslash before + is optional, since + is not special to the shell, but it is harmless.)

find /tmp/find -type f -exec ls -artl {} \;

In this find expression (using \;), each time find finds a matching filename, it forks a child process (a clone() call on Linux) that runs ls -artl on that single file. As a result, the files aren’t sorted by modification time, since each ls only ever sees one file.

 #  find /tmp/find -type f -exec ls -artl {} \;
-rw-r--r--  1  root  root  0     Nov  23  11:49  /tmp/find/new/quux
-rw-r--r--  1  root  root  1024  Nov  16  10:22  /tmp/find/veryold/foo
-rw-r--r--  1  root  root  520   Oct  8   2010   /tmp/find/veryold/bar
-rw-r--r--  1  root  root  961   Nov  22  14:39  /tmp/find/old/baz

As you can see, the files are not sorted, and strace shows one clone call, and therefore one new PID, per file found.

# strace find /tmp/find -type f -exec ls -artl {} \; 2>&1 | grep clone
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7947
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7948
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7949
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7950
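
The same behaviour can be observed without strace. The sketch below is only an illustration: it swaps ls for a tiny sh -c script that reports how many arguments it received, and the scratch directory, file names and variable names are made up for the example.

```shell
#!/bin/sh
# Build a throwaway tree with five files (names are arbitrary).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/old" "$demo_dir/new"
touch "$demo_dir/old/foo" "$demo_dir/old/bar" "$demo_dir/old/baz" \
      "$demo_dir/new/quux" "$demo_dir/new/with space"

# With \; the inline script is started once per file, so it prints
# one line per file, and each line reports a single argument.
semicolon_output=$(find "$demo_dir" -type f -exec sh -c 'echo "$# argument(s)"' sh {} \;)
semicolon_runs=$(printf '%s\n' "$semicolon_output" | wc -l | tr -d ' ')

echo "$semicolon_output"
echo "commands started: $semicolon_runs"

# Clean up the scratch tree.
rm -rf "$demo_dir"
```

Each output line comes from a separate process, which is why a sorting command such as ls -t has nothing to sort when it is called this way.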

find /tmp/find -type f -exec ls -artl {} \+

On the other hand, in this expression (using \+), find collects the matching filenames and appends as many of them as possible to a single command line. ls -artl is therefore run only once on the whole file list, which is much faster and actually sorts the files by modification time even though they are in different directories.
Note: on very large lists, a single command line would exceed the OS limit on the size of a command's arguments (ARG_MAX). In that case find splits the list and runs the command several times, so the sort only applies within each batch.
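
ARG_MAX can be read with getconf. The sketch below also prints how many paths each invocation received, which is one way to check whether a given list fitted into a single command line. It assumes a system where getconf ARG_MAX returns a number, and the scratch directory and variable names are hypothetical.

```shell
#!/bin/sh
# Limit on the combined size of a command's arguments and environment.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"

# Hypothetical scratch tree with 200 empty files.
batch_dir=$(mktemp -d)
i=1
while [ "$i" -le 200 ]; do
    : > "$batch_dir/file$i"
    i=$((i + 1))
done

# Each line printed is one invocation and shows how many paths it got.
# A small tree normally fits in one batch; a huge one prints several lines.
batch_sizes=$(find "$batch_dir" -type f -exec sh -c 'echo "$#"' sh {} +)
echo "paths per invocation:" $batch_sizes

# Sum the paths over all batches to confirm that none was dropped.
total_paths=0
for n in $batch_sizes; do
    total_paths=$((total_paths + n))
done
echo "total paths handled: $total_paths"

# Clean up the scratch tree.
rm -rf "$batch_dir"
```

If more than one number shows up on the "paths per invocation" line, the command was run in several batches and any sorting done by the command is per batch only.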

#  find /tmp/find -type f -exec ls -artl {} \+
-rw-r--r-- 1 root root  520 Oct  8  2010 /tmp/find/veryold/bar
-rw-r--r-- 1 root root 1024 Nov 16 10:22 /tmp/find/veryold/foo
-rw-r--r-- 1 root root  961 Nov 22 14:39 /tmp/find/old/baz
-rw-r--r-- 1 root root    0 Nov 23 11:49 /tmp/find/new/quux

As you can see, the files are sorted, and strace shows a single clone call, hence a single child PID.

# strace find /tmp/find -type f -exec ls -artl {} \+ 2>&1 | grep clone
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7581728) = 7956

To make it short, -exec command {} \; on 10K files will trigger 10K commands of one argument each and consume 10K PIDs, while \+ only requires a single PID to run a single command of 10K arguments (or a handful of commands if the list does not fit within ARG_MAX).
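
To check the numbers on your own machine, a rough sketch along these lines counts how many commands each form starts. It uses 1000 files rather than 10K to stay quick, and the directory and variable names are invented for the example.

```shell
#!/bin/sh
# Hypothetical benchmark tree: 1000 empty files in a scratch directory.
bench_dir=$(mktemp -d)
i=1
while [ "$i" -le 1000 ]; do
    : > "$bench_dir/f$i"
    i=$((i + 1))
done

# The inline script prints one line each time it is started, so
# counting lines counts how many commands (and PIDs) each form needs.
runs_semicolon=$(find "$bench_dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
runs_plus=$(find "$bench_dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

printf '%s started %s commands\n' '\;' "$runs_semicolon"
printf '%s started %s command(s)\n' '+' "$runs_plus"

# Clean up the scratch tree.
rm -rf "$bench_dir"
```

Where the shell provides a time keyword, prefixing each find with it gives a rough idea of the cost of the extra forks; the exact figures depend on the machine and the command being run.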

[1] http://mywiki.wooledge.org/UsingFind#Actions_in_bulk:_xargs.2C_-print0_and_-exec_.2B-

