Welcome to Linux教程網

Linux Tip: The Fastest Way to Delete a Million Files at Once

Date: 2017/3/6 14:26:33   Editor: 關於Unix

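  Initial benchmark

  Yesterday I came across a very interesting way to delete a huge number of files in a single directory. The method comes from Zhenyu Lee at http://www.quora.com/How-can-someone-rapidly-delete-400-000-files.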

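  Instead of find or xargs, he made creative use of rsync: running rsync --delete replaces the contents of the target directory with those of an empty directory. I then ran an experiment to compare the different methods. To my surprise, Lee's method was much faster than the others. My benchmark follows.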

  Environment:

  CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz

  MEM: 4G

  HD: ST3250318AS: 250G/7200RPM

  Method                                      # of Files   Deletion Time
  rsync -a --delete empty/ s1/                1000000      6m50.638s
  find s2/ -type f -delete                    1000000      87m38.826s
  find s3/ -type f | xargs -L 100 rm          1000000      83m36.851s
  find s4/ -type f | xargs -L 100 -P 100 rm   1000000      78m4.658s
  rm -rf s5                                   1000000      80m33.434s

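  Combining --delete with --exclude lets you delete selectively: files that match an --exclude pattern are left alone, and everything else is removed. The method is also a perfect fit when you need to keep the directory itself for other uses.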

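  Re-benchmark

  A few days ago, Keith-Winstein replied to this Quora thread saying that my earlier benchmark could not be reproduced, because the operations took far too long. To clarify: the numbers may have been inflated because my computer had been through a lot over the past few years, and there may have been some file system errors during the benchmark. I am not sure that is the cause, though. I have now got hold of a newer computer and run the benchmark again. This time I used /usr/bin/time, which provides more detailed information. The new results are below.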

  (Each run used 1000000 files. Every file was 0 bytes in size.)

  Command                                     Elapsed (s)   System Time (s)   %CPU   cs (Vol/Invol)
  rsync -a --delete empty/ a/                 12.42         10.60             95     106/22
  find b/ -type f -delete                     28.51         14.46             52     14849/11
  find c/ -type f | xargs -L 100 rm           41.69         20.60             54     37048/15074
  find d/ -type f | xargs -L 100 -P 100 rm    34.32         27.82             89     929897/21720
  rm -rf f                                    31.29         14.80             47     15134/11
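  The article does not say how the test files were created. One possible way to build a comparable set of zero-byte files is sketched below. The use of seq and touch, the COUNT variable, and its default of 1000 are all assumptions; set COUNT=1000000 to match the scale of the benchmark.

```shell
#!/bin/sh
# Sketch: create a directory full of zero-byte files for a deletion test.
# COUNT is a hypothetical knob; the article used 1000000 files per run.
set -eu

count=${COUNT:-1000}
dir=$(mktemp -d)

# seq prints 1..count, one per line; xargs batches the names so that
# touch is invoked only a few times instead of once per file.
( cd "$dir" && seq 1 "$count" | xargs touch )

created=$(find "$dir" -type f | wc -l | tr -d ' ')
# -size +0 matches only files that occupy at least one block, i.e. non-empty ones.
nonempty=$(find "$dir" -type f -size +0 | wc -l | tr -d ' ')

echo "created $created files, $nonempty of them non-empty"
rm -rf "$dir"
```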

  Raw output

  # method 1

  ~/test $ /usr/bin/time -v rsync -a --delete empty/ a/

  Command being timed: "rsync -a --delete empty/ a/"

  User time (seconds): 1.31

  System time (seconds): 10.60

  Percent of CPU this job got: 95%

  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:12.42

  Average shared text size (kbytes): 0

  Average unshared data size (kbytes): 0

  Average stack size (kbytes): 0

  Average total size (kbytes): 0

  Maximum resident set size (kbytes): 0


  Average resident set size (kbytes): 0

  Major (requiring I/O) page faults: 0

  Minor (reclaiming a frame) page faults: 24378

  Voluntary context switches: 106

  Involuntary context switches: 22

  Swaps: 0

  File system inputs: 0

  File system outputs: 0

  Socket messages sent: 0

  Socket messages received: 0

  Signals delivered: 0

  Page size (bytes): 4096

  Exit status: 0

  # method 2

  Command being timed: "find b/ -type f -delete"

  User time (seconds): 0.41

  System time (seconds): 14.46

  Percent of CPU this job got: 52%

  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:28.51

  Average shared text size (kbytes): 0

  Average unshared data size (kbytes): 0

  Average stack size (kbytes): 0

  Average total size (kbytes): 0

  Maximum resident set size (kbytes): 0

  Average resident set size (kbytes): 0

  Major (requiring I/O) page faults: 0

  Minor (reclaiming a frame) page faults: 11749

  Voluntary context switches: 14849

  Involuntary context switches: 11

  Swaps: 0

  File system inputs: 0

  File system outputs: 0

  Socket messages sent: 0

  Socket messages received: 0

  Signals delivered: 0

  Page size (bytes): 4096

  Exit status: 0

  # method 3

  find c/ -type f | xargs -L 100 rm

  ~/test $ /usr/bin/time -v ./delete.sh

  Command being timed: "./delete.sh"

  User time (seconds): 2.06

  System time (seconds): 20.60

  Percent of CPU this job got: 54%

  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:41.69

  Average shared text size (kbytes): 0

  Average unshared data size (kbytes): 0

  Average stack size (kbytes): 0

  Average total size (kbytes): 0

  Maximum resident set size (kbytes): 0

  Average resident set size (kbytes): 0

  Major (requiring I/O) page faults: 0

  Minor (reclaiming a frame) page faults: 1764225

  Voluntary context switches: 37048

  Involuntary context switches: 15074

  Swaps: 0

  File system inputs: 0

  File system outputs: 0

  Socket messages sent: 0

  Socket messages received: 0

  Signals delivered: 0

  Page size (bytes): 4096

  Exit status: 0

  # method 4

  find d/ -type f | xargs -L 100 -P 100 rm

  ~/test $ /usr/bin/time -v ./delete.sh

  Command being timed: "./delete.sh"

  User time (seconds): 2.86

  System time (seconds): 27.82

  Percent of CPU this job got: 89%

  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:34.32

  Average shared text size (kbytes): 0

  Average unshared data size (kbytes): 0

  Average stack size (kbytes): 0

  Average total size (kbytes): 0

  Maximum resident set size (kbytes): 0

  Average resident set size (kbytes): 0

  Major (requiring I/O) page faults: 0

  Minor (reclaiming a frame) page faults: 1764278

  Voluntary context switches: 929897

  Involuntary context switches: 21720

  Swaps: 0

  File system inputs: 0

  File system outputs: 0

  Socket messages sent: 0

  Socket messages received: 0

  Signals delivered: 0

  Page size (bytes): 4096

  Exit status: 0
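  Methods 3 and 4 above are timed through ./delete.sh rather than directly. The article does not show that script. A likely reason for it is that /usr/bin/time measures a single command, so a pipeline has to be wrapped in a script (or in sh -c '...') for the whole pipeline to be measured. The sketch below uses a hypothetical reconstruction of such a wrapper for method 3, run against a small temporary directory of 500 files.

```shell
#!/bin/sh
# Sketch: time a whole pipeline by wrapping it in a script.
# The contents of delete.sh are an assumption; the article only shows
# the pipeline and the "./delete.sh" invocation.
set -eu

work=$(mktemp -d)
mkdir "$work/c"
( cd "$work/c" && seq 1 500 | xargs touch )

cat > "$work/delete.sh" <<'EOF'
#!/bin/sh
find c/ -type f | xargs -L 100 rm
EOF
chmod +x "$work/delete.sh"

cd "$work"
# -v is a GNU time option; if it is unavailable, run the script untimed.
if /usr/bin/time -v true >/dev/null 2>&1; then
    /usr/bin/time -v ./delete.sh
else
    ./delete.sh
fi

left=$(find c/ -type f | wc -l | tr -d ' ')
echo "files left in c/: $left"
cd / && rm -rf "$work"
```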

  # method 5

  ~/test $ /usr/bin/time -v rm -rf f

  Command being timed: "rm -rf f"

  User time (seconds): 0.20

  System time (seconds): 14.80

  Percent of CPU this job got: 47%

  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:31.29

  Average shared text size (kbytes): 0

  Average unshared data size (kbytes): 0

  Average stack size (kbytes): 0

  Average total size (kbytes): 0

  Maximum resident set size (kbytes): 0

  Average resident set size (kbytes): 0

  Major (requiring I/O) page faults: 0

  Minor (reclaiming a frame) page faults: 176

  Voluntary context switches: 15134

  Involuntary context switches: 11

  Swaps: 0

  File system inputs: 0

  File system outputs: 0

  Socket messages sent: 0

  Socket messages received: 0

  Signals delivered: 0

  Page size (bytes): 4096


  Exit status: 0

  I am really curious why Lee's method is faster than the others, faster even than rm -rf. If anyone knows, please write it below. Thank you very much.

Copyright © Linux教程網 All Rights Reserved