Delete all but the most recent X files in bash

Published 10:09 Sep 04, 2023.

Created by @ezra. Categorized in #UNIX/Linux, and tagged as #UNIX/Linux.

Source format: Asciidoc

To give a more concrete example, imagine some cron job writing out a file (say, a log file or a tarred-up backup) to a directory every hour. I’d like a way to have another cron job remove the oldest files in that directory until no more than, say, 5 remain.

And just to be clear: if there’s only one file present, it should never be deleted.
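To try the commands in this post without risking real data, a scratch directory can stand in for the cron job's output. The following is a minimal sketch; the `backup-N.tar` names and the timestamps are made up for illustration, and `mktemp -d` is assumed to be available (it is on GNU and BSD systems, though it is not part of POSIX).

```shell
# Scratch directory for experiments; nothing outside it is touched.
sandbox=$(mktemp -d)

# Create backup-1.tar ... backup-8.tar with increasing modification times
# (01:00 through 08:00 on 2023-01-01), so backup-8.tar is the newest file.
for i in 1 2 3 4 5 6 7 8; do
  touch -t "202301010${i}00" "$sandbox/backup-$i.tar"
done

# A subdirectory, which any cleanup command must leave alone.
mkdir "$sandbox/subdir"

# List newest first; directories carry a trailing slash.
(cd "$sandbox" && ls -tp)
```

Running any of the cleanup commands from inside `$sandbox` should then leave `backup-4.tar` through `backup-8.tar` plus `subdir`.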

Here’s a pragmatic, POSIX-compliant solution that comes with only one caveat: it cannot handle filenames with embedded newlines - but I don’t consider that a real-world concern for most people.

For the record, here’s the explanation for why it’s generally not a good idea to parse ls output.
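The newline caveat is easy to reproduce. The sketch below, using made-up filenames in a throwaway directory, compares the real number of directory entries with the number of lines that a line-based consumer of `ls` output sees.

```shell
demo=$(mktemp -d)
touch "$demo/plain.log"
# One single file whose name contains an embedded newline.
touch "$demo/$(printf 'two\nlines.log')"

# Glob expansion yields one word per directory entry, whatever the name.
real_count=0
for f in "$demo"/*; do
  real_count=$((real_count + 1))
done

# A pipeline reading ls output line by line sees the odd name as two lines.
line_count=$(( $(ls "$demo" | wc -l) ))

echo "files: $real_count, lines from ls: $line_count"
```

With common `ls` implementations writing to a pipe, this should report 2 files but 3 lines, so `tail` and `xargs` would be handed two name fragments that do not exist. For filenames without newlines, which covers the cron scenario described here, the pipeline that follows is unaffected.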

ls -tp | grep -v '/$' | tail -n +6 | xargs -I {} rm -- {}
Note

This command operates in the current directory; to target a directory explicitly, use a subshell, (…), with cd:

(cd /path/to && ls -tp | grep -v '/$' | tail -n +6 | xargs -I {} rm -- {})

The same applies analogously to the commands below.

The above is inefficient, because xargs has to invoke rm separately for each filename.

However, your platform’s specific xargs implementation may allow you to solve this problem:

A solution that works with GNU xargs is to use -d '\n', which makes xargs consider each input line a separate argument, yet passes as many arguments as will fit on a command line at once:

ls -tp | grep -v '/$' | tail -n +6 | xargs -d '\n' -r rm --
Note

Option -r (--no-run-if-empty) ensures that rm is not invoked if there’s no input.

A solution that works with both GNU xargs and BSD xargs (including on macOS), though technically still not POSIX-compliant, is to first translate newlines to NUL (0x0) characters and then use -0 to read NUL-separated input. This also passes (typically) all filenames to rm at once:

ls -tp | grep -v '/$' | tail -n +6 | tr '\n' '\0' | xargs -0 rm --
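The difference between the per-line and the batched variants can be made visible by substituting `echo` for `rm`, which also serves as a dry run. This is a small sketch with made-up filenames; `names` is a hypothetical helper that stands in for the `ls -tp | grep -v '/$' | tail -n +6` part of the pipeline.

```shell
# Three sample names, one with an embedded space; echo stands in for rm.
names() { printf '%s\n' 'a.log' 'b c.log' 'd.log'; }

# -I {}: one command invocation per input line, hence three output lines.
per_item=$(names | xargs -I {} echo rm -- {})

# tr + -0: all names are passed to a single invocation, hence one output line.
batched=$(names | tr '\n' '\0' | xargs -0 echo rm --)

printf '%s\n' "$per_item" '---' "$batched"
```

Prefixing the real command with `echo` in the same way is a cheap check of what would be deleted before a cron job is allowed to do it for real.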

Explanation:

  • ls -tp prints the names of filesystem items sorted by how recently they were modified, in descending order (most recently modified items first) (-t), with directories printed with a trailing / to mark them as such (-p).

    • Note: It is the fact that ls -tp always outputs file/directory names only, not full paths, that necessitates the subshell approach mentioned above for targeting a directory other than the current one: (cd /path/to && ls -tp …).

  • grep -v '/$' then weeds out directories from the resulting listing, by omitting (-v) lines that have a trailing / (/$).

    • Caveat: Since a symlink that points to a directory is technically not itself a directory, such symlinks will not be excluded.

  • tail -n +6 skips the first 5 entries in the listing, in effect returning all but the 5 most recently modified files, if any.

    • Note that in order to exclude N files, N+1 must be passed to tail -n +.

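  • xargs -I {} rm -- {} (and its variations) then invokes rm on all these files. If there are no matches at all, xargs -I {} won’t invoke rm. GNU xargs without -I, however, runs the command once even on empty input unless -r is given, so the -0 variant may print a harmless "missing operand" error from rm when there is nothing to delete.

    • xargs -I {} rm -- {} defines the placeholder {}, which represents each input line as a whole, so rm is invoked once per input line, with filenames containing embedded spaces handled correctly.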

    • -- in all cases ensures that any filenames that happen to start with - aren’t mistaken for options by rm.
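The pieces explained above can be combined into a small reusable function. This is a sketch rather than a definitive implementation: `prune_old_files` is a hypothetical name, and it deliberately swaps xargs for a plain `while read` loop so that empty input needs no special handling on any platform. The newline caveat still applies.

```shell
# prune_old_files DIR KEEP
# Deletes all regular files in DIR except the KEEP most recently modified.
# The function body is a subshell, so the caller's working directory
# and variables are left untouched.
prune_old_files() (
  dir=$1
  keep=$2

  # Refuse anything that is not a non-negative integer.
  case $keep in
    '' | *[!0-9]*) echo "keep must be a non-negative integer" >&2; exit 2 ;;
  esac

  cd "$dir" || exit 1

  # tail -n +K starts printing at line K, so skipping $keep entries
  # requires passing keep + 1.
  ls -tp | grep -v '/$' | tail -n +"$((keep + 1))" |
    while IFS= read -r f; do
      rm -- "$f"
    done
)

# Usage example in a throwaway directory: seven files, keep the newest five.
d=$(mktemp -d)
for i in 1 2 3 4 5 6 7; do
  touch -t "202301010${i}00" "$d/f$i"
done
prune_old_files "$d" 5
ls "$d"
```

Calling the function on a directory that holds fewer than KEEP files is a no-op, which covers the requirement that a lone file is never deleted.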


A variation on the original problem, in case the matching files need to be processed individually or collected in a shell array:

# One by one, in a shell loop (POSIX-compliant):
ls -tp | grep -v '/$' | tail -n +6 | while IFS= read -r f; do echo "$f"; done
# One by one, but using a Bash process substitution (<(...)),
# so that the variables inside the `while` loop remain in scope:
while IFS= read -r f; do echo "$f"; done < <(ls -tp | grep -v '/$' | tail -n +6)
# Collecting the matches in a Bash *array*:
IFS=$'\n' read -d '' -ra files  < <(ls -tp | grep -v '/$' | tail -n +6)
printf '%s\n' "${files[@]}" # print array elements
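In Bash 4 or later, the `mapfile` builtin (also known as `readarray`) is an alternative way to fill the array. Unlike `read -d ''`, which returns a non-zero status when it reaches end of input, `mapfile` returns success, which matters in scripts run with `set -e`. The sketch below assumes Bash 4+ (so not the Bash 3.2 that ships with macOS) and uses made-up filenames in a throwaway directory.

```shell
# Work in a scratch directory with seven files of increasing age order.
cd "$(mktemp -d)" || exit 1
for i in 1 2 3 4 5 6 7; do
  touch -t "202301010${i}00" "f$i"
done

# One array element per input line; -t strips the trailing newline.
mapfile -t files < <(ls -tp | grep -v '/$' | tail -n +6)

echo "${#files[@]} file(s) beyond the newest 5"

# Only act when the array is non-empty, so rm never runs without operands.
if (( ${#files[@]} > 0 )); then
  printf 'would remove: %s\n' "${files[@]}"
fi
```

Replacing the `printf` line with `rm -- "${files[@]}"` would turn the dry run into an actual deletion.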