About drive wiping: You're probably better off using the ATA Secure Erase command, which is very quick and covers the entire disk. dd and other tools risk skipping blocks that the drive has marked as bad, for example.
He's right that a single overwrite with zeros is probably good enough to make sure that the data is gone, but it's probably not enough to persuade other people that it's gone. A few passes of pseudorandom data are probably better if you need to persuade other people that the data has gone.
But if it's really important, drive wiping is what you do to protect the drives until you get a chance to grind them.
There is also a cryptographic erase option on secure erase if the drive supports it. It is nearly instantaneous and you can follow up with other slower methods if desired.
Also for SSDs, using the secure erase method is important because of overprovisioning and garbage collection. If that is not available, on most SSD algorithms, doing two full-pass writes (with random sector data if the drive supports compression) will get you as close to wiping out all contents as possible.
The OPAL standard has that cryptographic erase function. I've used it before, but did not deeply verify if any data was recoverable. At least the partition table was gone. In theory, the command destroys the old key and creates a new one that the drive uses to read and write data. A different key means everything is noise. You do need access to the printed label on the drive itself to do it (for the PSID).
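To illustrate the "a different key means everything is noise" point, here is a toy sketch in Python. It is not how a self-encrypting drive is implemented: real drives use a hardware AES engine, while this uses a hypothetical SHA-256-based keystream purely so the example runs with the standard library. The names `keystream` and `xor_with_key` are made up for the illustration.

```python
import hashlib
import os

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + block counter (stand-in for the drive's cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

# The drive always stores ciphertext produced with its current media key.
old_key = os.urandom(32)
plaintext = b"partition table and user data " * 4
on_media = xor_with_key(plaintext, old_key)

# "Cryptographic erase": the old key is discarded and replaced.
new_key = os.urandom(32)
readback = xor_with_key(on_media, new_key)

print(xor_with_key(on_media, old_key) == plaintext)  # old key still decrypts
print(readback == plaintext)                         # new key yields unrelated bytes
```

Nothing on the media is rewritten in this model, which is why the operation can be nearly instantaneous. It also means the guarantee rests entirely on the old key really being destroyed.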
>but it's probably not enough to persuade other people that it's gone.
I believe there is a long-standing bounty for anyone who can retrieve useful data from a drive that has been zeroed once. No one has been able to thus far.
A lot of the disk wiping "culture" stems from a much earlier time when disk technology was less reliable, especially with regard to writes. Peter Gutmann himself says that the Gutmann method is long antiquated and only worked with MFM/RLL encoded disks from the 80s and early 90s.
Perhaps instead of humoring these people, we should be educating them. A zeroed-out disk is a wiped disk until someone proves otherwise.
This reminds me of assertions we used to take for granted about DRAM. We used to assume that the contents are lost when you cut the power, but then someone turned a can of cold air on a DIMM. We usually assume that bits are completely independent of each other, but then someone discovered Rowhammer. The latter is especially interesting because it only works on newer DIMM technology. Technology details change, and it's hard to predict what the ramifications will be. A little extra caution isn't necessarily a bad thing.
This may be my paranoia talking, but is there a way short of an examination with a scanning electron microscope to ensure that the erase command is actually doing what it's supposed to do?
Not so much that the drive manufacturers are engaging in malfeasance (though that's certainly not off the table), but that it's not unheard of for certain agencies in certain governments to crock low level system components (intercepting them in shipping and so forth) so they work against the user.
You could just read the disk again and check for nonzero blocks. If you don't trust the disk to read or write to itself either, then you might as well just toss it.
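A minimal sketch of that read-back check in Python, assuming the wiped device (or an image of it) can be opened as an ordinary file. The function name `first_nonzero_offset` and the 1 MiB block size are choices made for this example. It can only see sectors the drive exposes, so remapped or overprovisioned areas are outside its reach.

```python
import os
import tempfile

def first_nonzero_offset(path: str, block_size: int = 1024 * 1024):
    """Return the byte offset of the first nonzero byte in path, or None if all zero."""
    offset = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                return None  # reached end of file without finding data
            if block.count(0) != len(block):
                # Leading zero bytes in this block give the position inside it.
                stripped = block.lstrip(b"\x00")
                return offset + (len(block) - len(stripped))
            offset += len(block)

# Demo on a temporary file standing in for the disk.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes(3 * 1024 * 1024))  # 3 MiB of zeros
    demo_path = tmp.name
clean_result = first_nonzero_offset(demo_path)

with open(demo_path, "r+b") as f:
    f.seek(2_000_000)
    f.write(b"\x01")  # plant one leftover byte
dirty_result = first_nonzero_offset(demo_path)
os.remove(demo_path)

print(clean_result, dirty_result)  # None 2000000
```

On a real device you would pass something like /dev/sdX as `path` and run it with read permission on the device.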
> Unfortunately, there is no command-line option to have `dd` print progress
How difficult could it be to write a dd command from scratch that does include progress-reporting? I mean, dd is simply reading blocks of data from one file descriptor and writing them to another.
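Not very, at least for the basic case. Here is a rough Python sketch of such a copy loop; `copy_with_progress` and its parameters are invented for this example, and it ignores most of what real dd does (conv= options, skip/seek, retrying short writes on unbuffered raw devices).

```python
import io
import sys
import time

def copy_with_progress(src, dst, block_size=1024 * 1024, report_every=1.0, log=sys.stderr):
    """Copy src to dst in block_size reads, reporting progress; return bytes copied."""
    total = 0
    start = last = time.monotonic()
    while True:
        block = src.read(block_size)
        if not block:
            break  # end of input
        dst.write(block)
        total += len(block)
        now = time.monotonic()
        if now - last >= report_every:
            elapsed = now - start
            print(f"{total} bytes copied, {elapsed:.1f} s, "
                  f"{total / elapsed / 1e6:.1f} MB/s", file=log)
            last = now
    elapsed = time.monotonic() - start
    print(f"{total} bytes copied in {elapsed:.1f} s", file=log)
    return total

# Demo with in-memory streams; real use would pass files opened in binary mode.
demo_src = io.BytesIO(b"x" * 3_000_000)
demo_dst = io.BytesIO()
demo_total = copy_with_progress(demo_src, demo_dst, block_size=65536)
```

For a disk image you would call it as `copy_with_progress(open(image_path, "rb"), open(device_path, "wb"))`, where both paths are placeholders.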
> The other way to see progress on `dd` is to send the USR1 signal to the dd process: kill -USR1 <dd pid>. (Not kill -3: signal 3 is SIGQUIT, which terminates dd.)
Be careful with this on some distributions and builds of dd. Purely anecdotal evidence, but in college I had a friend imaging a very large (5400RPM) drive and about 10 hours into the process he lamented that he wished he could see how far along it was.
I popped open a terminal, ps -A |grep dd, kill -USR1 $PID, and it just exited.
I tend to run `watch kill -USR1 $(pidof dd)` in a second terminal. Watch executes your command repeatedly (by default every two seconds), so you then get regular dd status updates.
'>>' will cause O_APPEND to be specified as flags when opening "/dev/sda". I'm pretty sure this flag is ignored on block devices as it's obviously useless.
The redirection will happen in your shell, but the command will be called inside the shell invoked by sudo. So that won't work either unless you have write privs to the block device, but if that were the case, you probably didn't need sudo to read the dmg file. So this will likely fail for a reason other than the one you pointed out.
I know you meant to use the pipe instead of redirection, but it might be worth updating your comment for the benefit of others who are less command line literate :)
mysqldump -u mysql db | ssh user@rsync.net "cat > db_dump"
Namely, the syntax is one character shorter. (But only because I used whitespace around >).
With dd, you can control the transfer units (the size of the read and write system calls which are performed) whereas cat chooses its own buffering. However, this doesn't matter on regular files and block devices. The transfer sizes only matter on raw devices where the block size must be observed. E.g. traditional tape devices on Unix where if you do a short read, or oversized write, you get truncation.
> is there a disadvantage to using a higher blocksize?
Maybe, depending on the details. Imagine reading 4 GB from one disk then writing it all to another, all at 1 MB/sec. If your block size is 4 GB, it'll take 4000 seconds to read, then another 4000 seconds to write... and will also use 4 GB of memory.
If your block size is 1 MB instead, then the system has the opportunity to run things in parallel, so it'll take 4001 seconds, because every read beyond the first happens at the same time as a write.
And if your block size is 1 byte, then in theory the transfer would take almost exactly 4000 seconds... except that now the system is running in circles ferrying a single byte at a time, so your throughput drops to something much less than 1 MB/sec.
In practice, a 1 MB block size works fine on modern systems, and there's not much to be gained by fine-tuning.
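The arithmetic above can be written as a small model. This is a sketch under simplifying assumptions that are mine, not the commenter's: reads and writes run at the same rate, the read of one block fully overlaps the write of the previous one, the total is a multiple of the block size, and per-call overhead is zero (which is exactly what breaks down in the 1-byte case).

```python
MB = 10**6
GB = 10**9

def transfer_seconds(total_bytes: int, block_bytes: int, rate_bytes_per_s: float) -> float:
    """Two-stage pipeline: N blocks need N + 1 block-times (first read has no overlap)."""
    blocks = -(-total_bytes // block_bytes)        # ceiling division
    block_time = block_bytes / rate_bytes_per_s    # time to read (or write) one block
    return (blocks + 1) * block_time

print(transfer_seconds(4 * GB, 4 * GB, 1 * MB))  # 8000.0: read everything, then write everything
print(transfer_seconds(4 * GB, 1 * MB, 1 * MB))  # 4001.0: only the first read is not overlapped
print(transfer_seconds(4 * GB, 1, 1 * MB))       # just over 4000, before syscall overhead
```

The model shows why the gain flattens out quickly: going from 4 GB blocks to 1 MB blocks saves almost half the time, while going smaller than that saves at most one more second and costs far more in overhead.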
It is worth noting that the shred program mentioned is more or less useless on modern filesystems for a variety of reasons. The man page lists filesystems on which it will fail to work correctly (btrfs, ext3, NFS).
It may well be that the only usable filesystem for it is FAT32 (and possibly NTFS, not sure on that though).
https://github.com/Drive-Trust-Alliance/sedutil/blob/master/...
To randomize the drive/partition using a randomly-seeded AES cipher from OpenSSL (displaying the optional progress meter with pv):
# openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" -nosalt </dev/zero \
    | pv -bartpes <DISK_SIZE> | dd bs=64K of=/dev/sd"X"
https://wiki.archlinux.org/index.php/Securely_wipe_disk/Tips...
Then I take out the drill press and make a bunch of holes.
Not so much that the drive manufacturers are engaging in malfeasance (though that's certainly not off the table), but that it's not unheard of for certain agencies in certain governments to crock low level system components (intercepting them in shipping and so forth) so they work against the user.
..or just plain ignorance. A study indicates that back in 2011, half of the major drive vendors weren't doing the erase correctly. https://www.usenix.org/legacy/events/fast11/tech/full_papers...
You only need one.
https://en.wikipedia.org/wiki/Data_erasure#Number_of_overwri...
How difficult could it be to write a dd command from scratch that does include progress-reporting? I mean, dd is simply reading blocks of data from one file descriptor and writing them to another.
dd if=/dev/zero count=10 bs=1M | pv > file.bin
The other way to see progress on `dd` is to send the USR1 signal to the dd process: kill -USR1 <dd pid>. (Not kill -3: signal 3 is SIGQUIT, which terminates dd.)
Be careful with this on some distributions and builds of dd. Purely anecdotal evidence, but in college I had a friend imaging a very large (5400RPM) drive and about 10 hours into the process he lamented that he wished he could see how far along it was.
I popped open a terminal, ps -A |grep dd, kill -USR1 $PID, and it just exited.
He was rather pissed that I lost him 10 hours.
https://github.com/Xfennec/progress
Anyway, this "trick" is mentioned on the man page, which isn't that long. No additional tool required.
(This has caught me out before. Oh how I wish these things were standardised...)
dd if=/dev/zero of=/dev/null status=progress
4814691328 bytes (4,8 GB) copied, 4,000000 s, 1,2 GB/s
It's only the size copied and the speed but it's usually enough.
https://www.gnu.org/software/ddrescue/
However, I often do the following, which works pretty well:
[1] http://www.rsync.net/products/attic.html