Assuming you understand a command such as find . -type f -exec ls -l {} \; this post explains certain subtleties about find, fork, and speed. First, remember not to pipe find into plain xargs, which is dangerous because it splits filenames on whitespace and quotes, as explained in this must-read “find guide”[1].
So, if you can’t use xargs, you’ll use -exec. What I want to talk about is the difference between \; and \+ at the end of a find -exec command. (The + does not strictly need the backslash, but \+ works as well.)
find /tmp/find -type f -exec ls -artl {} \;
In this find expression (using \;), each time find finds a matching filename, it forks a child process (the clone calls in the strace output below) and runs ls -artl on that single file. As a result, the files aren’t sorted by modification time, since each ls invocation sees only one file.
# find /tmp/find -type f -exec ls -artl {} \;
-rw-r--r-- 1 root root 0 Nov 23 11:49 /tmp/find/new/quux
-rw-r--r-- 1 root root 1024 Nov 16 10:22 /tmp/find/veryold/foo
-rw-r--r-- 1 root root 520 Oct 8 2010 /tmp/find/veryold/bar
-rw-r--r-- 1 root root 961 Nov 22 14:39 /tmp/find/old/baz
As you can see, the files are not sorted, and strace shows one clone call, and therefore one new PID, per file found.
# strace find /tmp/find -type f -exec ls -artl {} \; &>/dev/stdout | grep clone
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7947
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7948
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7949
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7705728) = 7950
find /tmp/find -type f -exec ls -artl {} \+
On the other hand, in this expression (using \+), find accumulates the matching filenames and passes them to the command as one batch. This means ls -artl is triggered only once for the whole file list (or a few times, if the list exceeds the system’s command-line length limit), which is much faster and actually sorts the files by modification time even though they are in different directories.
# find /tmp/find -type f -exec ls -artl {} \+
-rw-r--r-- 1 root root 520 Oct 8 2010 /tmp/find/veryold/bar
-rw-r--r-- 1 root root 1024 Nov 16 10:22 /tmp/find/veryold/foo
-rw-r--r-- 1 root root 961 Nov 22 14:39 /tmp/find/old/baz
-rw-r--r-- 1 root root 0 Nov 23 11:49 /tmp/find/new/quux
As you can see, the files are sorted, and strace shows only a single clone call, hence a single PID.
# strace find /tmp/find -type f -exec ls -artl {} \+ &>/dev/stdout | grep clone
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7581728) = 7956
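If you want to reproduce the sorting behaviour without touching real files, here is a minimal sketch. The scratch directory, the file names and the timestamps are made up for the illustration; only the find and ls usage comes from the post.

```shell
#!/bin/sh
# Hypothetical scratch tree: three files with known mtimes in two directories.
dir=$(mktemp -d)
mkdir "$dir/old" "$dir/new"
touch -t 201010080000 "$dir/new/bar"    # oldest file, deliberately under "new"
touch -t 202001010000 "$dir/old/baz"
touch -t 202306010000 "$dir/new/quux"   # newest file

# With +, a single ls receives all three paths, so -t -r sorts across directories.
sorted=$(find "$dir" -type f -exec ls -tr {} +)
printf '%s\n' "$sorted"

# With \;, each ls receives one path, so the order is just find's traversal order.
find "$dir" -type f -exec ls -tr {} \;

rm -rf "$dir"
```

The first listing should come out oldest first regardless of directory, while the second simply follows whatever order find walks the tree in.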
To make it short, -exec command {} \; on 10K files will trigger 10K commands of one argument each and consume 10K PIDs, while \+ only requires a single PID to run a single command of 10K arguments (as long as they fit within the command-line length limit).
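A quick way to check this without strace is to count how many times the command is launched. This is only a sketch: the scratch directory and the 100 dummy files are assumptions for the demo, and sh -c 'echo run' stands in for a real command, printing one line per invocation.

```shell
#!/bin/sh
# Hypothetical scratch directory holding 100 empty files.
dir=$(mktemp -d)
i=1
while [ "$i" -le 100 ]; do
    touch "$dir/f$i"
    i=$((i + 1))
done

# Each invocation of the inner sh prints exactly one line, so the line count
# equals the number of processes find had to start.
per_file=$(find "$dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
batched=$(find "$dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "semicolon form: $per_file invocation(s)"
echo "plus form:      $batched invocation(s)"

rm -rf "$dir"
```

With a list this small, the semicolon form should report 100 invocations and the plus form just 1. On a much larger tree the plus form may report a handful, since find starts a new batch whenever the argument list gets too long.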
[1] http://mywiki.wooledge.org/UsingFind#Actions_in_bulk:_xargs.2C_-print0_and_-exec_.2B-