One of the common problems while working in Linux is finding large files to free up some space. Suppose your file system is full and you are receiving alerts to free up space, or your host has run out of space and your server is not starting up. The first thing you do is find the top 10 largest files and see if you can delete them. Usually, old files and large Java heap dumps are good candidates for removal. If you are running Java applications, like core Java-based programs or web applications running on Tomcat, then you can remove those heap dump files and free some space, but the big question is: how do you find them? How do you know the size of the biggest file in your file system, especially if you don't know which directory it is in? We'll try to answer those questions in this article.
When I was new to Linux, I didn't have any other choice but to go to the log directory, look for old files that were larger than the rest, and delete them. That worked well until one day our server died due to a huge cache file.
I wasn't able to locate it because it wasn't in the log directory. Then I came to know about the find command, which lets you search subdirectories for large files, as shown below:
$ find . -size +1G
This command prints all files larger than 1GB in the current directory and its subdirectories.
The only problem with this command is that it doesn't print the size of each file. You can solve that with the -printf option (available in GNU find), which lets you specify a format string, much like Java's printf() method.
By the way, if you are new to the beautiful but vast world of Linux commands, I highly recommend going through a comprehensive Linux course to learn in a structured way. If you need a recommendation, consider the Linux Mastery: Master the Linux Command Line in 11.5 Hours course on Udemy. It's the highest-rated Linux course on Udemy and very hands-on, with an enthusiastic instructor.
How to find large files with their size in Linux and UNIX?
You can use the find command and the du command to find the large files and directories that are hogging disk space. If your file system is 100% full or close to it, you will need to find these big files and directories so that you can delete them if they are not needed. Generally, old log files and core dump files are good candidates for freeing disk space.
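Before hunting for individual files, it can help to confirm which mounted file system is actually full. Here is a minimal sketch using df; the function name full_filesystems and the default threshold of 90% are illustrative assumptions, not something from the original commands, and mount points containing spaces are not handled:

```shell
# Print mount point and usage for every file system at or above a usage threshold.
# full_filesystems is a hypothetical helper name; the default of 90 is arbitrary.
full_filesystems() {
    threshold=${1:-90}
    # -P forces the portable format: one line per file system, usage in column 5
    df -P | awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # "87%" -> "87"
        if (use + 0 >= t) print $6, $5 # mount point, then usage
    }'
}

# Example: full_filesystems 90
```

Once you know which mount point is full, you can run the find and du commands from that directory instead of searching the whole machine.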
1. Finding big files using the find command in Linux
You can further tweak the command so that it prints the size of each file along with its path. Here is the modified command to find large files with their size:
$ find . -size +1G -printf '%s %p\n'
Here %s is the file size in bytes and %p is the path.
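Since old files are usually the first candidates for deletion, it can be handy to print the last-modified date next to the size. The sketch below assumes GNU find (which provides -printf and the %T date directives); the function name big_files_with_dates and its defaults are made up for illustration:

```shell
# List files above a size threshold with size in bytes, last-modified date, and path.
# Assumes GNU find; big_files_with_dates is a hypothetical helper name.
big_files_with_dates() {
    dir=${1:-.}
    min=${2:-+1G}   # any find -size argument, e.g. +100M or +1G
    # %s = size in bytes, %TY-%Tm-%Td = modification date, %p = path
    find "$dir" -type f -size "$min" -printf '%s\t%TY-%Tm-%Td\t%p\n'
}

# Example: big_files_with_dates /var/log +100M
```

The columns are tab-separated, so the output can be piped into sort or cut for further filtering.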
Alternatively, you can use the -exec option to run ls on each file that the find command returns and print its size, as shown below:
$ find . -size +100M -exec ls -sh {} \;
This is good enough to see which files you can delete to free some space, but the problem is that you have to guess the threshold: if no file is larger than the size you pick (say 1GB), the command prints nothing and you have to try again with a smaller number. I usually start with some hypothetical large number like 10GB and work downward, but that is just a workaround, not a proper fix. Let's see what we can do next.
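One way to avoid guessing a threshold is to print the size of every file and let sort pick the biggest ones. This is a rough sketch that assumes GNU find; the function name largest_files and its defaults are illustrative only:

```shell
# Print the N largest regular files under a directory, largest first.
# Assumes GNU find for -printf; largest_files is a hypothetical helper name.
largest_files() {
    dir=${1:-.}
    count=${2:-10}
    # -xdev keeps find on the starting file system; errors such as
    # "Permission denied" are discarded so they don't clutter the list
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null |
        sort -nr |
        head -n "$count"
}

# Example: largest_files /app 10
```

The first column is the size in bytes, so the list is sorted numerically regardless of how large or small the biggest file turns out to be.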
Btw, if you are new to the find command, I suggest you get familiar with its different options, as it's a very important and powerful command. You can check the Linux Command Line Interface (CLI) Fundamentals course to learn more about the various options of the find command in Linux.
2. Finding large files using the du command in Linux
Btw, you can also use the du (disk usage) command to find large files and directories along with their size, as shown below:
$ du -a . | sort -n -r | head -n 10
16095096 .
13785288 ./logs
6095380 ./logs/app
2125252 ./temp
2125244 ./temp/data
2125240 ./temp/data/app
This is the right command because it lists both directories and files. The first column is the disk usage in blocks (1KB each by default with GNU du). I have also combined the output of the du command with the sort command to print the top 10 largest files and directories.
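If you prefer sizes like 2.1G or 350M instead of raw block counts, the same idea works with human-readable output. This sketch assumes GNU coreutils, since both du -h and sort -h are needed; the function name largest_entries is hypothetical:

```shell
# Show the N largest files and directories under a path with human-readable sizes.
# Assumes GNU coreutils (du -h and sort -h); largest_entries is a hypothetical name.
largest_entries() {
    dir=${1:-.}
    count=${2:-10}
    # -a: include files as well as directories
    # -h: human-readable sizes, which sort -h knows how to compare
    # -x: do not cross into other mounted file systems
    du -ahx "$dir" 2>/dev/null | sort -rh | head -n "$count"
}

# Example: largest_entries /app 10
```

Note that a plain sort -n would order values such as 900K and 2G incorrectly, which is why sort -h is used here.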
This is exactly what we are looking for. In fact, this is also one of the frequently asked Linux interview questions, so if you know this trick you can answer this question in interviews as well.
As I have said, a good knowledge of various Linux commands is very important for a programmer working on a Linux machine. I know you can always Google things but you have to know what to Google and that's why basic information about various Linux commands is essential.
If you feel that you don't know enough Linux commands then you can join a comprehensive course like Linux Command Line Basics to get hold of the Linux commands which matter most.
That's all about how to find large files and directories in Linux. As I said earlier, I used to search for large files by using the find command with the -size option, but that is more or less guesswork because you never know the size of the largest file on a machine. Still, by using a reasonably high size, you can usually find the big files in your file system.
One more command you can use to find large files with their size in Linux is the disk usage or du command, which lists both files and directories.
Thanks for reading this article so far. If you like this find command and du command Linux tutorial and my explanation then please share it with your friends and colleagues.
P. S. - If you are looking for some free online courses to start your Linux journey, you should check out my list of Free Linux online Courses for Programmers, Cloud Engineers, Data scientists, IT Professionals, and System Administrators.
We have been facing space issues quite often on our app server. Whenever I use the find command, it doesn't produce output that adds up to the total size. For example, the /app partition has 100GB; find lists the top 50 files, adding up to 6 or 7 GB, but the partition still shows as 100% full. I was using a command like:
$ find /app -printf '%s %p\n'| sort -nr | head -10
After more investigation, we found that the space was hogged by deleted files. Our process was creating cache files and deleting them while keeping their file descriptors open. Since our process only restarts on Sunday, it was holding references to a lot of deleted files, which accounted for the missing 50+ GB of space.
The solution was to locate those files, find the process, and restart it to free up that space. That's where the lsof command helped.
I used the following command to find the deleted files that were still holding disk space:
$ lsof -F sn0 | tr -d '\000' | grep deleted | sed 's/^[a-z]*\([0-9]*\)n/\1 /' | sort -n
After that, I found the process using ps -ef | grep "text" and killed it.
Boom....
we have all the disk space back :-)
So, if the find command doesn't show you enough large files to delete, check whether a process is keeping references to deleted files and hogging space.
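The check described in this comment can also be sketched without lsof by reading /proc directly. This is only an illustration under some assumptions: it is Linux-only, it needs GNU stat and readlink, it sees other users' processes only when run as root, and the function name list_deleted_open_files is made up. If lsof is installed, lsof +L1 is another way to list open files whose link count has dropped to zero.

```shell
# List deleted files that are still held open by a process, largest first.
# Linux-only sketch that reads /proc; list_deleted_open_files is a hypothetical name.
# Output columns (tab-separated): size in bytes, PID, original path.
list_deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        # Each fd entry is a symlink; deleted targets end with " (deleted)"
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *' (deleted)')
                pid=${fd#/proc/}
                pid=${pid%%/*}
                # -L follows the fd link to the still-open file to get its size
                size=$(stat -L -c %s "$fd" 2>/dev/null) || continue
                printf '%s\t%s\t%s\n' "$size" "$pid" "$target"
                ;;
        esac
    done | sort -nr
}

# Example: list_deleted_open_files | head -n 10
```

The PID column tells you which process to restart in order to release the space.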
How do you find the combined size of a folder from the root location using du -h? du -h does not add up the other directories inside the main folder.