2025-01-17 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/01 Report --
Many readers are unfamiliar with the material in this article, "How to elegantly clean up duplicate files in Linux," so the editor has summarized it below with detailed content and clear steps; it should have some reference value. I hope you get something out of it after reading.
1. rdfind: find duplicate files in Linux
rdfind, short for "redundant data find", is a free and open-source tool that finds duplicate files by traversing directories and subdirectories. It compares files by content rather than by name. rdfind uses a ranking algorithm to distinguish originals from duplicates: if two or more identical files are found, it intelligently picks one as the original and marks the rest as duplicates. Once duplicates are found, it reports them to you, and you can decide whether to delete them or replace them with hard links or symbolic (soft) links.
To install rdfind in Linux, use the command below for your distribution:
$ sudo apt-get install rdfind     [On Debian/Ubuntu]
$ sudo yum install rdfind         [On CentOS/RHEL]
$ sudo dnf install rdfind         [On Fedora 22+]
$ sudo pacman -S rdfind           [On Arch Linux]
To run rdfind on a directory, simply type rdfind and the target directory. Let's look at an example:
linuxmi@linuxmi:~$ rdfind /home/user
As you can see, rdfind saves the results in a file named results.txt in the directory you ran it from. This file lists every duplicate that rdfind found; you can review it and delete duplicates manually as needed.
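If you prefer to script against that report, the duplicates can be pulled out with standard text tools. This is only a sketch against rdfind's default report format, in which originals are tagged DUPTYPE_FIRST_OCCURRENCE and the path is the last whitespace-separated field; the synthetic report below stands in for a real run:

```shell
# Work in a scratch directory and fake a tiny rdfind report
# (a real results.txt comes from running rdfind itself).
cd "$(mktemp -d)"
cat > results.txt <<'EOF'
# Automatically generated by rdfind
DUPTYPE_FIRST_OCCURRENCE 1 1 6 2049 101 1 /home/user/a.txt
DUPTYPE_WITHIN_SAME_TREE -1 1 6 2049 102 1 /home/user/a-copy.txt
EOF
# Keep non-comment lines that are NOT originals, print the path column.
# (Paths containing spaces would need more careful parsing than $NF.)
grep -v '^#' results.txt | grep -v 'DUPTYPE_FIRST_OCCURRENCE' | awk '{print $NF}'
# prints: /home/user/a-copy.txt
```

The exact column layout can vary between rdfind versions, so check the header comment of your own results.txt before reusing this.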
Another thing you can do is use the -dryrun option, which lists the duplicates without changing anything:
rdfind -dryrun true /home/user
After you find the duplicates, you can choose to replace them with hard links.
rdfind -makehardlinks true /home/user
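To see what the hard-link replacement actually buys you, here is a miniature of the same idea using plain coreutils; the file names are invented for the demo:

```shell
# Miniature of -makehardlinks: after ln -f, both names refer to the same
# inode, so the "duplicate" stops consuming extra disk space.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/orig.txt"
cp "$tmp/orig.txt" "$tmp/dup.txt"      # a true duplicate: its own inode
ln -f "$tmp/orig.txt" "$tmp/dup.txt"   # replace it with a hard link
stat -c '%i' "$tmp/orig.txt" "$tmp/dup.txt"  # same inode number, twice
```

Note that after hard-linking, editing either name edits both, which is why rdfind leaves this decision to you.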
If you want to delete the duplicates, you can run:
rdfind -deleteduplicates true /home/user
To see other useful rdfind options, consult its manual:
man rdfind
2. fdupes: scan for duplicate files in Linux
fdupes is another program that lets you identify duplicate files on your system. It is free, open source, and written in C. It determines duplicates by:
Comparing partial md5sum signatures
Comparing full md5sum signatures
Byte-by-byte comparison verification
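The middle step (comparing full md5sum signatures) can be sketched with nothing but coreutils. This is not how fdupes is implemented, just an illustration of the content-hash idea:

```shell
# Hash every file, sort so equal hashes become adjacent, then keep only
# lines whose first 32 characters (the md5 hash) occur more than once.
tmp=$(mktemp -d)
printf 'same\n'  > "$tmp/a.txt"
printf 'same\n'  > "$tmp/b.txt"
printf 'other\n' > "$tmp/c.txt"
find "$tmp" -type f -exec md5sum {} + | sort | uniq -w32 --all-repeated=separate
# prints the a.txt and b.txt lines; c.txt drops out
```

fdupes is faster than this naive pipeline because it compares sizes and partial hashes first, only hashing whole files when earlier checks match.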
Just like rdfind, it has similar options:
Search recursively
Exclude empty files
Show the size of duplicate files
Delete duplicates immediately
Exclude files with a different owner
To install fdupes in Linux, use the command below for your distribution:
$ sudo apt-get install fdupes     [On Debian/Ubuntu]
$ sudo yum install fdupes         [On CentOS/RHEL]
$ sudo dnf install fdupes         [On Fedora 22+]
$ sudo pacman -S fdupes           [On Arch Linux]
The fdupes syntax is similar to rdfind's: type the command followed by the directory you want to scan.
$ fdupes <directory>
linuxmi@linuxmi:~$ fdupes /home/linuxmi/www.linuxmi.com
/home/linuxmi/www.linuxmi.com/linuxmi.txt
/home/linuxmi/www.linuxmi.com/linuxmi (copy).txt
To search for files recursively, specify the -r option like this:
$ fdupes -r <directory>
linuxmi@linuxmi:~$ fdupes -r /home/linuxmi/www.linuxmi.com
/home/linuxmi/www.linuxmi.com/linuxmi.txt
/home/linuxmi/www.linuxmi.com/linuxmi (copy).txt
/home/linuxmi/www.linuxmi.com/color-schemes/.git/logs/refs/remotes/origin/HEAD
/home/linuxmi/www.linuxmi.com/color-schemes/.git/logs/refs/heads/master
/home/linuxmi/www.linuxmi.com/color-schemes/.git/logs/HEAD
/home/linuxmi/www.linuxmi.com/color-schemes/script/test
/home/linuxmi/www.linuxmi.com/test
You can also specify multiple directories to search recursively:
$ fdupes -r <directory-1> <directory-2>
To have fdupes calculate the size of the duplicate files, use the -S option:
$ fdupes -S <directory>
To collect summary information about the files found, use the -m option:
$ fdupes -m <directory>
linuxmi@linuxmi:~$ fdupes -m /home/linuxmi/www.linuxmi.com/
Finally, if you want to delete all duplicates, use the -d option as shown below:
$ fdupes -d <directory>
fdupes will ask which of the found files to preserve; enter the number of the file you want to keep:
linuxmi@linuxmi:~$ fdupes -d /home/linuxmi/www.linuxmi.com/
An approach that is absolutely not recommended is the -N option, which keeps only the first file of each set and deletes the rest without asking:
$ fdupes -dN <directory>
To get the list of options available with fdupes, view the help by running:
$ fdupes --help
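For scripted cleanup, a common pattern is to have fdupes print every duplicate except the first file of each set (the -f/--omitfirst flag), review that list, and only then delete. The demo below fakes the fdupes output with made-up paths so the deletion step itself can be run safely:

```shell
# fdupes -r -f DIR prints each set minus its first member, one path per
# line, with blank lines between sets. Simulate that output here:
tmp=$(mktemp -d)
printf 'same\n' > "$tmp/keep.txt"
printf 'same\n' > "$tmp/extra.txt"
printf '%s\n\n' "$tmp/extra.txt" > "$tmp/dupes.txt"  # stand-in for: fdupes -r -f "$tmp" > dupes.txt
# Drop blank lines, then remove the listed (newline-delimited) paths:
sed '/^$/d' "$tmp/dupes.txt" | xargs -d '\n' rm -f --
ls "$tmp"   # keep.txt and dupes.txt remain; extra.txt is gone
```

Reviewing the list before the rm step is the whole point of this pattern; piping fdupes straight into rm gives you the same risks as -dN.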
3. dupeGuru: find duplicate files in Linux
dupeGuru is an open-source, cross-platform tool for finding duplicate files on a Linux system. It can scan either file names or file contents in one or more folders, and it can also find file names similar to the one you are searching for.
dupeGuru has versions for Windows, macOS, and Linux. Its fast fuzzy-matching algorithm helps you find duplicate files in a short time. It is customizable: you can pick out exactly the duplicates you want and purge the unwanted files from your system.
To install dupeGuru in Linux, use the commands below for your distribution:
------ On Debian/Ubuntu/Mint ------
$ sudo add-apt-repository ppa:dupeguru/ppa
$ sudo apt-get update
$ sudo apt-get install dupeguru
------ On Arch Linux ------
$ sudo pacman -S dupeguru
4. FSlint: duplicate file finder for Linux
FSlint is a free utility for finding and cleaning various kinds of lint on a file system. It reports duplicate files, empty directories, temporary files, duplicate/conflicting (binary) names, broken symbolic links, and more. It has both a command-line and a GUI mode.
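Two of FSlint's simpler checks are easy to approximate from the command line with find alone; this is a rough stand-in for servers without a GUI, not FSlint's actual implementation:

```shell
# Sandbox with one empty directory, one empty file, one real file.
tmp=$(mktemp -d)
mkdir "$tmp/empty-dir"
: > "$tmp/empty-file"
printf 'x\n' > "$tmp/real-file"
# Empty directories (one kind of "lint" FSlint reports):
find "$tmp" -mindepth 1 -type d -empty
# Empty files (another kind):
find "$tmp" -type f -empty
```

FSlint's duplicate-file and bad-name checks are more involved, so use the tool itself for those.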
To install FSlint in Linux, use the command below for your distribution:
$ sudo apt-get install fslint     [On Debian/Ubuntu]
$ sudo yum install fslint         [On CentOS/RHEL]
$ sudo dnf install fslint         [On Fedora 22+]
$ sudo pacman -S fslint           [On Arch Linux]
That is all for this article on how to elegantly clean up duplicate files in Linux; I believe you now have some understanding of it. I hope the content shared by the editor is helpful to you. If you want to learn more, please follow the industry information channel.
© 2024 shulou.com SLNews company. All rights reserved.