Friday, September 11, 2009

Bash: Finding files between two dates in the current directory

Today my boss asked me for a bash command (or script) to find some files between two dates.
Thanks to Jadu Saikia's post over at Unstableme, UNIX BASH scripting: Find Files between two dates, I had a starting point.

This will find all files last modified between the two dates (20071019 and 20071121 in this case):
find . -type f -exec ls -l --time-style=full-iso {} \; | awk '{print $6,$NF}' | awk '{gsub(/-/,"",$1);print}' | awk '$1>= 20071019 && $1<= 20071121 {print $2}'
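
To see what each stage of that pipeline does, here is a small walk-through on a single line of ls output. The sample line (owner, size, file name) is made up for illustration; only the column layout matters.

```shell
# A made-up line in the layout that `ls -l --time-style=full-iso` prints.
# Field 5 is the size, field 6 the date, and the last field the file name.
line='-rw-r--r-- 1 user group 1234 2007-11-01 12:00:00.000000000 +0000 ./report.txt'

# Stage 1: keep only the date and the file name.
echo "$line" | awk '{print $6,$NF}'
# prints: 2007-11-01 ./report.txt

# Stage 2: strip the dashes so the date becomes a plain number.
echo "$line" | awk '{print $6,$NF}' | awk '{gsub(/-/,"",$1);print}'
# prints: 20071101 ./report.txt

# Stage 3: print the file name only when the date is inside the range.
echo "$line" | awk '{print $6,$NF}' | awk '{gsub(/-/,"",$1);print}' \
  | awk '$1>= 20071019 && $1<= 20071121 {print $2}'
# prints: ./report.txt
```

Because the name is taken from the last whitespace-separated field, file names containing spaces will be cut short.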

Now, if you want just the PGP files in the current directory, you would do:
find *.pgp -type f -exec ls -l --time-style=full-iso {} \; | awk '{print $6,$NF}' | awk '{gsub(/-/,"",$1);print}' | awk '$1>= 20071019 && $1<= 20071121 {print $2}'

The second thing my boss wanted was the file size, which the awk commands were dropping. We can fix that by updating the command to:
find *.pgp -type f -exec ls -lh --time-style=full-iso {} \; | awk '{print $6,$NF,$5}' | awk '{gsub(/-/,"",$1);print}' | awk '$1>= 20090624 && $1<= 20090901 {print $2,$3}'

We added $5 to the first awk command and $3 to the final one. I also like human-readable file sizes, so I added -h to the ls command.
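
As an aside, GNU find can do the date test itself with -newermt, which skips ls and awk entirely. This is a different technique from the pipeline above and only a sketch: it assumes GNU findutils, the function name files_between is made up for illustration, and the end date has to be the day after the last day you want, because -newermt compares against midnight at the start of the given date.

```shell
# Hypothetical helper (not from the original post). Requires GNU find.
# $1 = first day (YYYY-MM-DD), $2 = the day AFTER the last day wanted.
files_between() {
  # Keep files modified after midnight starting $1
  # and not after midnight starting $2.
  # %p prints the path, %s the size in bytes.
  find . -type f -newermt "$1" ! -newermt "$2" -printf '%p %s\n'
}

# Files modified from 2007-10-19 through 2007-11-21:
files_between 2007-10-19 2007-11-22
```

The sizes come out in bytes rather than in the human-readable form that ls -h gives.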

2 comments:

Tom Ashley said...

I just came across this as it is a solution to a problem I have.
However I believe this to be a very slow solution.
Passing 'ls' to -exec of a find command is very slow.
A better solution is to pipe the find command into xargs and run it this way.

As a benchmark, running it against 72000+ files, the speed difference was 1 minute with xargs vs 10 minutes with -exec.

You could always use the -ls switch for find, but this doesn't allow you to select the arguments to pass to ls.

Good solution though.
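
The xargs approach Tom describes could look something like the sketch below. It assumes GNU find, xargs, and ls; the function name files_between_xargs is hypothetical, and like the original command it assumes file names without whitespace once the lines reach awk.

```shell
# Hypothetical wrapper; $1 and $2 are inclusive bounds in YYYYMMDD form.
files_between_xargs() {
  # -print0 / -0 pass names safely from find to xargs;
  # -r (GNU xargs) skips running ls when find returns nothing.
  # xargs hands many files to each ls call instead of one ls per file.
  find . -type f -print0 |
    xargs -0 -r ls -l --time-style=full-iso |
    awk -v from="$1" -v to="$2" \
      '{gsub(/-/,"",$6)} $6+0 >= from+0 && $6+0 <= to+0 {print $NF}'
}

files_between_xargs 20071019 20071121
```

The saving comes from starting far fewer ls processes, so the gap grows with the number of files.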

Richard said...

If you're trying to find folders instead of files between two dates, just add the -d parameter to the ls command (and change -type f to -type d in the find command, so that find returns directories).

Also, you can combine the first two awk commands into one like so:
awk '{gsub(/-/,"",$6);print $6,$NF}'
I am not good enough with awk to combine all three of them together.
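
For reference, all three awk stages can be folded into one. This is a minimal sketch, not the original author's command: the function name filter_by_date is made up for illustration, and it assumes input in the layout printed by ls -l --time-style=full-iso, with file names that contain no whitespace.

```shell
# Hypothetical helper; $1 and $2 are inclusive bounds in YYYYMMDD form.
filter_by_date() {
  awk -v from="$1" -v to="$2" '{
    d = $6               # date column, e.g. 2007-11-01
    gsub(/-/, "", d)     # becomes 20071101
    # Adding 0 forces a numeric comparison.
    if (d + 0 >= from + 0 && d + 0 <= to + 0) print $NF
  }'
}

find . -type f -exec ls -l --time-style=full-iso {} \; | filter_by_date 20071019 20071121
```

To print the size as well, change the print statement to print $NF, $5.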