Find files and calculate total disk space usage

At work, I have a Solaris SVM (formerly DiskSuite) device that houses many end-user files. Over the past year, this file system has grown by over 75%. Initially, I was growing the file system as it reached capacity, but one thing I have noticed over the years as a sysadmin is that users will always use the disk space if it's there. Business reasons rule out applying quotas to the file system, so I decided to do some upfront grunt work instead.

A few du/find combinations later, I noticed that 80% of the disk space was being consumed by PDF files compressed with gzip. Furthermore, most of those PDF files were over a year old. These files are certainly candidates for deletion, but I will let the business decide that one.

Here is how you can find all gzipped PDF files under the current working directory that have not been modified in over a year and calculate the total disk space they consume.

# uname -a
SunOS host 5.10 Generic_127111-03 sun4u sparc SUNW,Sun-Fire-V890
# cd /some_filesystem/
# find . -name '*.pdf.gz' -type f -mtime +365 -ls | awk '{ sum += $7 } END { kb = sum / 1024; mb = kb / 1024; gb = mb / 1024; printf "%.0f MB (%.2fGB) disk space used\n", mb, gb}'
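The awk half of that pipeline can be wrapped in a small function so it is easier to reuse and to test against sample input. This is only a sketch: the name sum_ls_sizes is made up for illustration, and the file count is an addition that the one-liner above does not print. It relies on the same assumption as the original command, namely that field 7 of each `find -ls` line is the file size in bytes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original one-liner): read `find -ls`
# output on stdin, add up field 7 (assumed to be the size in bytes), and
# print the number of lines seen along with the total in MB and GB.
sum_ls_sizes() {
    awk '{ sum += $7; n++ } END { printf "%d files, %.0f MB (%.2fGB) disk space used\n", n, sum / 1048576, sum / 1073741824 }'
}

# Usage, run from the top of the file system being checked:
#   find . -name '*.pdf.gz' -type f -mtime +365 -ls | sum_ls_sizes
```

Dividing by 1048576 and 1073741824 is the same arithmetic as the chained divisions by 1024 in the original command, just written in one step.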

Once you’ve discovered how much disk space those files are consuming, the next step is up to you, but if you had to, you could just as easily remove the offending files. I won’t go into preaching about having a properly backed-up host.

Modify the find command to remove the files:

# find . -name '*.pdf.gz' -type f -mtime +365 -exec rm -f {} \;

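Too slow with one rm process per file, or hitting "Argument list too long" when you try a shell glob instead? Pipe stdout to xargs, which batches the file names into as few rm invocations as it can.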

# find . -name '*.pdf.gz' -type f -mtime +365 -print | xargs rm -f
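One caveat with the xargs form: by default xargs splits its input on whitespace, so a file name containing a space gets handed to rm as two separate arguments. A possible alternative, sketched below, is the POSIX `-exec ... {} +` terminator, which batches paths the way xargs does but passes each one as a single argument. The function name purge_old and its three parameters are hypothetical, and the sketch assumes your find supports the `+` terminator, so check find(1) on your host before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: delete regular files under directory $1 whose names
# match pattern $2 and which were last modified more than $3 days ago.
# `-exec ... {} +` gives many paths to each rm call, and each path is passed
# as one argument, so spaces in file names are not split.
purge_old() {
    find "$1" -name "$2" -type f -mtime +"$3" -exec rm -f {} +
}

# Usage (path taken from the example above):
#   purge_old /some_filesystem '*.pdf.gz' 365
```

Swapping `-exec rm -f {} +` for `-print` first gives a dry run that lists what would be removed.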

As always, your mileage may vary.

