Find -exec, actions, fork and speed

Assuming you understand a command such as find . -type f -exec ls -l {} \; this post explains certain subtleties about find, fork, and speed. First, remember not to pipe find into plain xargs, which is dangerous because by default it splits its input on whitespace and treats quotes specially, so unusual filenames break it, as explained in this must-read “find guide”[1].

So, if you can’t use xargs, you’ll use -exec. What I want to talk about is the difference between \; and \+ at the end of a find -exec command. (The backslash before + is optional, since + is not special to the shell, but it is harmless.)

find /tmp/find -type f -exec ls -artl {} \;

In this find expression (using \;), ls -artl is run each time find finds a matching filename. This means find forks a lot (actually, it clones, then executes ls), and the files aren’t sorted by modification time, since each ls invocation only sees a single file…

 #  find /tmp/find -type f -exec ls -artl {} \;
-rw-r--r--  1  root  root  0     Nov  23  11:49  /tmp/find/find_new/quux
-rw-r--r--  1  root  root  1024  Nov  16  10:22  /tmp/find/find_veryold/foo
-rw-r--r--  1  root  root  520   Oct  8   2010   /tmp/find/find_veryold/bar
-rw-r--r--  1  root  root  961   Nov  22  14:39  /tmp/find/find_old/baz

As you can see, the files are not sorted, and strace shows lots of PIDs because find clones once per file to execute ls.

# strace  find /tmp/find -type f -exec ls -artl {} \; &>/dev/stdout | grep clone
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7947
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7948
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7949
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7950

find /tmp/find -type f -exec ls -artl {} \+

In this find expression (using \+), find accumulates a list of filenames and appends it to the command given to -exec. This means ls -artl is run only once, on the whole file list (or a few times, if the list is too long to fit on one command line): there is a single fork and a single execution of ls, which is much faster and actually sorts the files by modification time.

#  find /tmp/find -type f -exec ls -artl {} \+
-rw-r--r-- 1 root root  520 Oct  8  2010 /tmp/find/find_veryold/bar
-rw-r--r-- 1 root root 1024 Nov 16 10:22 /tmp/find/find_veryold/foo
-rw-r--r-- 1 root root  961 Nov 22 14:39 /tmp/find/find_old/baz
-rw-r--r-- 1 root root    0 Nov 23 11:49 /tmp/find/find_new/quux

As you can see, the files are sorted, and strace shows only a single PID.

# strace  find /tmp/find -type f -exec ls -artl {} \+ &>/dev/stdout | grep clone
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7581728) = 7956
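
If you want to check the sorting behaviour without my /tmp/find tree, here is a minimal sketch. The scratch directory and the file names old and new are made up for the example, and it assumes the file list is short enough to fit in a single ls invocation.

```shell
# Hypothetical scratch directory, not the /tmp/find tree used in this post.
dir=$(mktemp -d)

# Give the two files clearly different modification times (CCYYMMDDhhmm).
touch -t 202001010000 "$dir/new"
touch -t 201001010000 "$dir/old"

# ls -tr lists the oldest file first, but it can only sort the files
# it receives in one invocation, which is what {} + provides here.
sorted=$(find "$dir" -type f -exec ls -tr {} +)
oldest=$(printf '%s\n' "$sorted" | head -n 1)

echo "oldest file according to ls: $oldest"
rm -rf "$dir"
```

With \; instead of +, each ls would receive one file, so the output order would simply be the order in which find walks the directory.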

To make it short, -exec rm {} \+ on 1000 files will do 1 x rm (1000 files), where \; would have done 1000 x rm (1 file).
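
The same count can be observed without strace. This is only a sketch: the scratch directory, the four file names and the small sh -c 'echo run' command are assumptions made for the example, standing in for ls or rm. Each invocation of the command prints one line, so counting lines counts invocations.

```shell
# Hypothetical scratch directory with four empty files.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c" "$dir/d"

# With \; the command is started once per matching file.
per_file=$(find "$dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# With + the filenames are batched onto one command line,
# so a short list like this one needs a single invocation.
batched=$(find "$dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "runs with \\; : $per_file"
echo "runs with +  : $batched"
rm -rf "$dir"
```

On a tree of 1000 files the first counter would reach 1000, while the second would normally stay at 1.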

[1] http://mywiki.wooledge.org/UsingFind#Actions_in_bulk:_xargs.2C_-print0_and_-exec_.2B-
