Detailed explanation of Hadoop Recycle Bin and fs.trash parameters

2025-03-29 Update From: SLTechnology News&Howtos shulou

Shulou (Shulou.com) 06/03 Report

Foreword:

On Linux, one of the biggest everyday inconveniences is that there is no Recycle Bin: a careless rm -rf can cause serious data loss. HDFS, by contrast, has a trash feature (a Recycle Bin) from which mistakenly deleted data can be recovered. Trash is disabled by default in Hadoop, so it only helps if you enable it in advance by editing core-site.xml in the Hadoop configuration directory (etc/hadoop in the 2.8.1 installation used here). Let's test the difference before and after enabling it.

1. Trash not enabled

    [hadoop@hadoop000 ~]$ hdfs dfs -put test.log /
    [hadoop@hadoop000 ~]$ hdfs dfs -ls /
    Found 3 items
    -rw-r--r--   1 hadoop supergroup         34 2018-05-23 16:49 /test.log
    drwx------   - hadoop supergroup          0 2018-05-19 15:48 /tmp
    drwxr-xr-x   - hadoop supergroup          0 2018-05-19 15:48 /user
    # Delete test.log and note the output
    [hadoop@hadoop000 ~]$ hdfs dfs -rm -r /test.log
    Deleted /test.log
    # List again: test.log has been deleted for good
    [hadoop@hadoop000 ~]$ hdfs dfs -ls /
    Found 2 items
    drwx------   - hadoop supergroup          0 2018-05-19 15:48 /tmp
    drwxr-xr-x   - hadoop supergroup          0 2018-05-19 15:48 /user

2. Trash enabled

    [hadoop@hadoop000 hadoop]$ pwd
    /opt/software/hadoop-2.8.1/etc/hadoop
    # Add the fs.trash parameters to enable trash
    # (no restart is needed for the hdfs client to start using the trash)
    [hadoop@hadoop000 hadoop]$ vi core-site.xml
    <property>
        <name>fs.trash.interval</name>
        <value>1440</value>
    </property>
    <property>
        <name>fs.trash.checkpoint.interval</name>
        <value>1440</value>
    </property>

fs.trash.interval is the retention period. When a file is deleted, it is moved into the trash directory instead of being removed immediately, and HDFS only deletes the data for real once the retention period has elapsed. The unit is minutes: 1440 minutes = 60 × 24, which is exactly one day. fs.trash.checkpoint.interval is the interval between trash checkpoints, and it should be less than or equal to fs.trash.interval.
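The relationship between the two parameters can be checked mechanically. The following is a minimal sketch, not part of Hadoop: check_trash_settings and read_properties are hypothetical helper names, CORE_SITE is sample text mirroring the values above, and the values are assumed to be whole minutes. It treats a checkpoint interval of 0 as "same as fs.trash.interval", which is how core-default.xml describes the default.

```python
import xml.etree.ElementTree as ET

# Sample text mirroring the two properties set in this article (an assumption,
# not read from a real cluster).
CORE_SITE = """<configuration>
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
  <property>
    <name>fs.trash.checkpoint.interval</name>
    <value>1440</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a dict of property name -> value from Hadoop-style config XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def check_trash_settings(xml_text):
    """Hypothetical helper: return (interval, checkpoint, problems), in minutes."""
    props = read_properties(xml_text)
    interval = int(props.get("fs.trash.interval", "0"))  # 0 means trash is off
    checkpoint = int(props.get("fs.trash.checkpoint.interval", "0"))
    if checkpoint == 0:
        # A zero checkpoint interval falls back to the retention period.
        checkpoint = interval
    problems = []
    if interval <= 0:
        problems.append("fs.trash.interval is 0 or unset: trash is disabled")
    elif checkpoint > interval:
        problems.append("fs.trash.checkpoint.interval exceeds fs.trash.interval")
    return interval, checkpoint, problems


if __name__ == "__main__":
    interval, checkpoint, problems = check_trash_settings(CORE_SITE)
    print(f"retention: {interval} min, checkpoint every {checkpoint} min")
    for problem in problems:
        print("warning:", problem)
```

To check a real installation, read the text of its core-site.xml and pass it to check_trash_settings in place of CORE_SITE.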
For the parameter definitions, see the official documentation: http://hadoop.apache.org/docs/r2.8.4/hadoop-project-dist/hadoop-common/core-default.xml

    [hadoop@hadoop000 ~]$ hdfs dfs -put test.log /
    [hadoop@hadoop000 ~]$ hdfs dfs -ls /
    Found 3 items
    -rw-r--r--   1 hadoop supergroup         34 2018-05-23 16:54 /test.log
    drwx------   - hadoop supergroup          0 2018-05-19 15:48 /tmp
    drwxr-xr-x   - hadoop supergroup          0 2018-05-19 15:48 /user
    # Delete test.log; note that the output is different this time
    [hadoop@hadoop000 ~]$ hdfs dfs -rm -r /test.log
    18/05/23 16:54:55 INFO fs.TrashPolicyDefault: Moved: 'hdfs://192.168.6.217:9000/test.log' to trash at: hdfs://192.168.6.217:9000/user/hadoop/.Trash/Current/test.log
    # The deleted file is now in the trash
    [hadoop@hadoop000 ~]$ hdfs dfs -ls /user/hadoop/.Trash/Current
    Found 1 items
    -rw-r--r--   1 hadoop supergroup         34 2018-05-23 16:54 /user/hadoop/.Trash/Current/test.log
    # Restore the mistakenly deleted file
    [hadoop@hadoop000 ~]$ hdfs dfs -mv /user/hadoop/.Trash/Current/test.log /test.log
    [hadoop@hadoop000 ~]$ hdfs dfs -ls /
    Found 3 items
    -rw-r--r--   1 hadoop supergroup         34 2018-05-23 16:54 /test.log
    drwx------   - hadoop supergroup          0 2018-05-19 15:48 /tmp
    drwxr-xr-x   - hadoop supergroup          0 2018-05-19 15:48 /user
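The restore step is just a move from the trash path back to the original path, and that mapping can be sketched in a few lines. This is an illustrative sketch only: original_path and restore_command are hypothetical helpers, and they assume the per-user layout seen in the session above, /user/<name>/.Trash/<directory>/<original path>. The code only builds the hdfs dfs -mv command line; it does not run it.

```python
import shlex

TRASH_DIR = ".Trash"


def original_path(trash_path):
    """Map /user/<name>/.Trash/<dir>/<rest> back to /<rest>.

    Assumes the per-user trash layout shown in this article.
    """
    parts = trash_path.strip("/").split("/")
    # Expected parts: "user", <name>, ".Trash", <dir>, then the original path.
    if len(parts) < 5 or parts[0] != "user" or parts[2] != TRASH_DIR:
        raise ValueError(f"not a trash path: {trash_path!r}")
    return "/" + "/".join(parts[4:])


def restore_command(trash_path):
    """Build (but do not run) the hdfs dfs -mv command that restores the file."""
    return ["hdfs", "dfs", "-mv", trash_path, original_path(trash_path)]


if __name__ == "__main__":
    cmd = restore_command("/user/hadoop/.Trash/Current/test.log")
    # Print the command for review before running it by hand.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to try anywhere; the printed line can be pasted into a shell on a machine with the hdfs client once it has been reviewed.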
