2025-01-14 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to find and delete duplicate files in Linux. I think it is very practical, and I hope you get something useful out of it.
Find and delete duplicate files in Linux
For the purposes of this guide, I will discuss the following three tools:
Rdfind
Fdupes
FSlint
These three tools are free and open source and run on most Unix-like systems.
1. Rdfind
Rdfind, short for "redundant data find", is a free and open source tool that finds duplicate files by walking through directories and subdirectories. It compares files based on their content, not their names. Rdfind uses a ranking algorithm to distinguish original files from duplicates: if two or more identical files are found, it decides which one is the original and treats the rest as duplicates. Once duplicates are found, it reports them to you, and you can decide whether to delete them or replace them with hard links or symbolic (soft) links.
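Content-based duplicate detection, hashing what is in each file rather than comparing names, can be sketched with standard GNU coreutils. This is only an illustration of the general technique, not rdfind's actual implementation, and the /tmp/dupdemo files are made up for the demo:

```shell
# Create a small test directory with one pair of identical files
# (hypothetical paths, for illustration only).
mkdir -p /tmp/dupdemo
printf 'hello\n' > /tmp/dupdemo/a.txt
printf 'hello\n' > /tmp/dupdemo/b.txt
printf 'world\n' > /tmp/dupdemo/c.txt

# Hash every file, sort so equal checksums are adjacent, then print
# only the groups whose first 32 characters (the MD5 digest) repeat.
find /tmp/dupdemo -type f -exec md5sum {} + \
    | sort \
    | uniq -w32 --all-repeated=separate
```

Here a.txt and b.txt are reported as one duplicate group, while c.txt, whose content differs, is not printed at all.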
Install Rdfind
Rdfind is available in the AUR, so on Arch-based systems you can install it with an AUR helper such as Yay:
$ yay -S rdfind
On Debian, Ubuntu, Linux Mint:
$ sudo apt-get install rdfind
On Fedora:
$ sudo dnf install rdfind
On RHEL, CentOS:
$ sudo yum install epel-release
$ sudo yum install rdfind
Usage
Once the installation is complete, run the rdfind command with the path of the directory you want to scan for duplicate files:
$ rdfind ~/Downloads
Rdfind scans the ~/Downloads directory and stores the results in a file called results.txt in the current working directory. The names of possible duplicate files are listed in results.txt.
$ cat results.txt
# Automatically generated
# duptype id depth size device inode priority name
DUPTYPE_FIRST_OCCURRENCE 1469 8 9 2050 15864884 1 /home/sk/Downloads/tor-browser_en-US/Browser/TorBrowser/Tor/PluggableTransports/fte/tests/dfas/test5.regex
DUPTYPE_WITHIN_SAME_TREE -1469 8 9 2050 15864886 1 /home/sk/Downloads/tor-browser_en-US/Browser/TorBrowser/Tor/PluggableTransports/fte/tests/dfas/test6.regex
[...]
DUPTYPE_FIRST_OCCURRENCE 13 0 403635 2050 15740257 1 /home/sk/Downloads/Hyperledger (1).pdf
DUPTYPE_WITHIN_SAME_TREE -13 0 403635 2050 15741071 1 /home/sk/Downloads/Hyperledger.pdf
# end of file
By checking the results.txt file, you can easily find those duplicate files. You can delete them manually if you like.
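Based on the results.txt format shown above, the duplicate entries (the DUPTYPE_WITHIN_SAME_TREE lines) can be pulled out with a little awk. The sample file below is recreated inline so the snippet is self-contained; note that this simple field-based parse would mangle paths containing runs of multiple spaces:

```shell
# Recreate a minimal results.txt in the format shown above.
cat > /tmp/results.txt <<'EOF'
# Automatically generated
# duptype id depth size device inode priority name
DUPTYPE_FIRST_OCCURRENCE 13 0 403635 2050 15740257 1 /home/sk/Downloads/Hyperledger (1).pdf
DUPTYPE_WITHIN_SAME_TREE -13 0 403635 2050 15741071 1 /home/sk/Downloads/Hyperledger.pdf
EOF

# Print only the paths of the duplicates (everything after the first
# seven fields); first occurrences are left alone.
awk '/^DUPTYPE_WITHIN_SAME_TREE/ {
    $1 = $2 = $3 = $4 = $5 = $6 = $7 = ""   # blank the metadata fields
    sub(/^ +/, "")                          # strip the leftover spaces
    print                                   # what remains is the path
}' /tmp/results.txt
```

This prints /home/sk/Downloads/Hyperledger.pdf, which you can review before deciding what to delete.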
In addition, you can use the -dryrun option to find all duplicate files without changing anything, printing a summary to the terminal:
$ rdfind -dryrun true ~/Downloads
Once duplicate files are found, you can replace them with hard links or symbolic links.
To replace all duplicate files with hard links, run:
$ rdfind -makehardlinks true ~/Downloads
To replace all duplicate files with symbolic (soft) links, run:
$ rdfind -makesymlinks true ~/Downloads
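To see what replacing a duplicate with a hard link actually does, here is a self-contained sketch using ln and stat (the file names are hypothetical; this only illustrates the effect that rdfind automates):

```shell
# Two identical files, stored twice on disk (hypothetical names).
mkdir -p /tmp/linkdemo
printf 'same content\n' > /tmp/linkdemo/original.txt
printf 'same content\n' > /tmp/linkdemo/copy.txt

# Replace the copy with a hard link to the original.
ln -f /tmp/linkdemo/original.txt /tmp/linkdemo/copy.txt

# Both names now refer to the same inode, and the link count is 2,
# so the data is stored only once.
stat -c '%i %h' /tmp/linkdemo/original.txt
stat -c '%i %h' /tmp/linkdemo/copy.txt
```

The two stat lines print the same inode number and a link count of 2. A symbolic link behaves differently: deleting the target would leave the symlink dangling, while hard links keep the data alive until every name is removed.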
If there are empty files in the directory that you want to ignore, use the -ignoreempty option as follows:
$ rdfind -ignoreempty true ~/Downloads
If you no longer want the old files, you can delete the duplicates instead of replacing them with hard or soft links.
To delete duplicate files, run:
$ rdfind -deleteduplicates true ~/Downloads
If you do not want to ignore empty files, and want them deleted along with all the duplicates, run:
$ rdfind -deleteduplicates true -ignoreempty false ~/Downloads
For more details, see the help section:
$ rdfind --help
Man pages:
$ man rdfind
2. Fdupes
Fdupes is another command-line tool to identify and remove duplicate files in specified directories and subdirectories. It is a free and open source tool written in C. Fdupes identifies duplicate files by comparing file sizes, then partial MD5 signatures, then full MD5 signatures, and finally performing a byte-by-byte comparison.
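The reason for this staged comparison is cost: comparing sizes is nearly free, while hashing or byte-comparing reads every byte, so only files that share a size need any further checks. The first stage can be sketched with find and awk (an illustration of the idea, not Fdupes's actual code; the /tmp/stagedemo files are made up):

```shell
# Sample files: x1 and x2 share a size, y does not.
mkdir -p /tmp/stagedemo
printf 'aaaa\n' > /tmp/stagedemo/x1
printf 'aaaa\n' > /tmp/stagedemo/x2
printf 'bb\n'   > /tmp/stagedemo/y

# Emit "size path" for each file, then keep only the size groups that
# contain more than one file; only these need hashing at all.
find /tmp/stagedemo -type f -printf '%s %p\n' | sort -n \
    | awk '{ count[$1]++; lines[$1] = lines[$1] $0 "\n" }
           END { for (s in count) if (count[s] > 1) printf "%s", lines[s] }'
```

Only x1 and x2 survive this filter; y is ruled out without its contents ever being read.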
Similar to the Rdfind tool, Fdupes comes with a handful of options to perform operations such as:
Recursively search for duplicate files in directories and subdirectories
Exclude empty and hidden files from the calculation
Show duplicate file size
Delete a duplicate file immediately
Exclude duplicates that differ in owner/group or permission bits
More
Install Fdupes
Fdupes exists in the default repository of most Linux distributions.
On Arch Linux and its variants such as Antergos and Manjaro Linux, install it using Pacman as follows.
$ sudo pacman -S fdupes
On Debian, Ubuntu, Linux Mint:
$ sudo apt-get install fdupes
On Fedora:
$ sudo dnf install fdupes
On RHEL, CentOS:
$ sudo yum install epel-release
$ sudo yum install fdupes
Usage
Fdupes is very simple to use. Just run the following command to find duplicate files in a directory, for example ~/Downloads:
$ fdupes ~/Downloads
Sample output from my system:
/home/sk/Downloads/Hyperledger.pdf
/home/sk/Downloads/Hyperledger (1).pdf
You can see that there is a duplicate file in the /home/sk/Downloads/ directory. By default, Fdupes only shows duplicates in the top-level directory. To display duplicates in subdirectories as well, use the -r option:
$ fdupes -r ~/Downloads
Now you will see the duplicate files in the /home/sk/Downloads/ directory and its subdirectories.
Fdupes can also be used to quickly find duplicate files from multiple directories.
$ fdupes ~/Downloads ~/Documents/ostechnix
You can even search multiple directories and recursively search one of them, as follows:
$ fdupes ~/Downloads -r ~/Documents/ostechnix
The above command will search for duplicate files in the ~/Downloads directory and in the ~/Documents/ostechnix directory and its subdirectories.
Sometimes, you may want to know the size of the duplicate files in a directory. Use the -S option:
$ fdupes -S ~/Downloads
403635 bytes each:
/home/sk/Downloads/Hyperledger.pdf
/home/sk/Downloads/Hyperledger (1).pdf
Similarly, to display the sizes of duplicate files in the parent directory and subdirectories, use the -Sr option.
We can use the -n and -A options to exclude zero-length files and hidden files, respectively.
$ fdupes -n ~/Downloads
$ fdupes -A ~/Downloads
When searching for duplicate files in the specified directory, the first command excludes zero-length files, and the second excludes hidden files.
To summarize duplicate file information, use the -m option:
$ fdupes -m ~/Downloads
1 duplicate files (in 1 sets), occupying 403.6 kilobytes
To delete all duplicate files, use the -d option:
$ fdupes -d ~/Downloads
Sample output:
[1] /home/sk/Downloads/Hyperledger Fabric Installation.pdf
[2] /home/sk/Downloads/Hyperledger Fabric Installation (1).pdf
Set 1 of 1, preserve files [1 - 2, all]:
This command will prompt you to choose which files to keep; the other duplicates are deleted. Enter a number to keep the corresponding file and delete the rest. Be careful with this option: if you are not careful, you may delete the original file.
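A more cautious alternative to deleting is to move suspected duplicates into a quarantine directory, so nothing is lost if you picked the wrong file. Here is a minimal checksum-based sketch (it does not use fdupes itself, so the example stays self-contained; the /tmp/fddemo paths are made up):

```shell
# Sample files: two identical "reports" and one unrelated file.
mkdir -p /tmp/fddemo/quarantine
printf 'report\n' > /tmp/fddemo/report.pdf
printf 'report\n' > /tmp/fddemo/report-copy.pdf
printf 'notes\n'  > /tmp/fddemo/notes.txt

# Keep the first file seen for each checksum; move later files with the
# same checksum into the quarantine directory instead of deleting them.
seen=""
for f in /tmp/fddemo/*.pdf /tmp/fddemo/*.txt; do
    sum=$(md5sum "$f" | cut -d' ' -f1)
    case " $seen " in
        *" $sum "*) mv "$f" /tmp/fddemo/quarantine/ ;;  # duplicate
        *)          seen="$seen $sum" ;;                # first occurrence
    esac
done
ls /tmp/fddemo/quarantine
```

After reviewing the quarantine directory, you can delete its contents for real, or move a file back if it turns out to be the one you wanted.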
If you want to keep the first file in each set of duplicates and delete the others without prompting, use the -dN option (not recommended).
$ fdupes -dN ~/Downloads
To delete duplicates as they are encountered, use the -I flag:
$ fdupes -I ~/Downloads
For more details on Fdupes, see the help section and the man page.
$ fdupes --help
$ man fdupes
3. FSlint
FSlint is another tool for finding duplicate files; I sometimes use it to remove unwanted duplicates and free up disk space on Linux systems. Unlike the other two tools, FSlint has both a GUI and a CLI mode, so it is friendlier for beginners. FSlint finds not only duplicate files, but also bad symbolic links, problematic file names, temporary files, bad user IDs, empty directories, non-stripped binaries, and more.
Install FSlint
FSlint exists in AUR, so you can install it using any AUR helper.
$ yay -S fslint
On Debian, Ubuntu, Linux Mint:
$ sudo apt-get install fslint
On Fedora:
$ sudo dnf install fslint
On RHEL, CentOS:
$ sudo yum install epel-release
$ sudo yum install fslint
Once the installation is complete, launch it from the menu or application launcher.
When launched, FSlint presents a friendly, self-explanatory GUI. In the "Search path" column, add the directory path you want to scan, then click the "Find" button in the lower left corner to look for duplicate files. Check the recurse option to search directories and subdirectories recursively. FSlint will quickly scan the given directory and list the duplicate files.
Select duplicate files from the list to clean up, or you can select "Save", "Delete", "Merge" and "Symlink" to manipulate them.
In the "Advanced search parameters" column, you can specify the excluded path when searching for duplicate files.
FSlint command line options
FSlint provides the following CLI toolset to find duplicate files in your file system.
findup - find duplicate files
findnl - find name lint (problematic file names)
findu8 - find file names with invalid UTF-8 encoding
findbl - find bad links (problematic symbolic links)
findsn - find same-name files (file names that may conflict)
finded - find empty directories
findid - find files owned by dead user IDs
findns - find non-stripped executables
findrs - find redundant whitespace in file names
findtf - find temporary files
findul - find possibly unused libraries
zipdir - reclaim wasted space in ext2 directory entries
All of these tools are located under /usr/share/fslint/fslint/.
For example, to look for duplicate files in a given directory, run:
$ /usr/share/fslint/fslint/findup ~/Downloads/
Similarly, the command to find empty directories is:
$ /usr/share/fslint/fslint/finded ~/Downloads/
To get more details about each tool, for example findup, run:
$ /usr/share/fslint/fslint/findup --help
For more details on FSlint, see the help section and the man page.
$ /usr/share/fslint/fslint/fslint --help
$ man fslint
The above is how to find and delete duplicate files in Linux. I believe some of these techniques will come up in your daily work, and I hope you learned something useful from this article.