The Linux Page

Low Level Formatting Hard Drives

[Image: dog that shredded a bed to pieces, by Gomagoti (License CC: Some rights reserved)]

WARNING

Do not apply these techniques to an SSD. Chances are they will not actually overwrite the data, and they will shorten the lifetime of the SSD for no good reason. An SSD allocates blocks of data differently than an HDD does.

To low level format a hard drive, the best approach is to overwrite it with all zeroes or with random values.

One of the easiest ways is to use the cat command with the zero device:

cat /dev/zero >/dev/sdb

The zero device is faster than the random device. Note that we clear the entire device (/dev/sdb), not just a partition (e.g. /dev/sdb1).

Some people may still be able to restore the data even after you have cleared it this way, but they will need some $100,000 worth of gear to do it. They had better need your data really badly (as in: make at least $200,000 in return, although they may use the same gear on many hard drives... but crooks don't think that way.)

To make their job harder still, I suggest you use the random device instead:

cat /dev/urandom >/dev/sdb

This is neat. However, cat has no idea how big /dev/sdb is: it keeps copying data for as long as possible and only stops when a write fails at the end of the device. In other words, it always ends with an error, and in some cases it makes the device unusable afterward (you'll need a reboot to access the device again.)

A better tool to copy a known amount of data is dd. First we need to determine the number of blocks. Using fdisk, you can wipe out the existing partition table and then create a single entry representing the entire hard drive (although if your drive is larger than 2 TB, the classic MS-DOS partition table that fdisk creates cannot cover it; check out parted instead.) fdisk commands are single letters. The session would look something like this:

fdisk /dev/sdb
m
p
d
15 to 1
[repeat d until all partitions are gone, use p to verify]
n
1
<enter>
<enter>
p
w

The m command gives you help information about all the available commands.

The p command prints the existing partition table.

The d command is used to delete a partition.

The n command creates a new partition. It is followed by 1 (the partition number), then <enter> twice to accept the default first and last blocks, which span the whole drive.

Use p again to review the new partition table before it is written.

The w command writes the new partition table to the hard drive.
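
The 2 TB caveat mentioned earlier can be checked before picking a tool. The sketch below is only an illustration: needs_gpt is a hypothetical helper name, it assumes 512-byte sectors (which puts the MS-DOS partition table limit at 2^32 sectors, or 2 TiB), and the commented usage assumes the util-linux blockdev tool is installed.

```shell
#!/bin/sh
# needs_gpt BYTES: succeed (exit status 0) when a drive of BYTES bytes is too
# large for an MS-DOS partition table. Assumption: 512-byte sectors.
needs_gpt() {
    limit=$((4294967296 * 512))   # 2^32 sectors of 512 bytes = 2 TiB
    [ "$1" -gt "$limit" ]
}

# Usage (as root, /dev/sdb being the drive used in this article):
#   size=$(blockdev --getsize64 /dev/sdb)
#   if needs_gpt "$size"; then echo "use parted"; else echo "fdisk is fine"; fi
```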

Now we're ready to use the dd command:

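dd bs=1024 count=123456789 if=/dev/urandom of=/dev/sdb1

Here count is the number of blocks reported by the p command. fdisk counts those blocks in units of 1,024 bytes whereas dd defaults to 512-byte blocks, hence bs=1024; without it only half of the partition would be overwritten. In the following example, count would be 97659103.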

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       12158    97659103+  8e  Linux
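
The conversion from the Blocks column to dd arguments is easy to get wrong, so here is a small sketch of the arithmetic. It assumes the column is in units of 1,024 bytes and that a trailing + marks one extra 512-byte sector, as in older fdisk output like the table above; blocks_to_sectors is a hypothetical helper name.

```shell
#!/bin/sh
# blocks_to_sectors BLOCKS: convert an fdisk "Blocks" value (1 KiB units, with
# an optional trailing "+" meaning one extra 512-byte sector) into a number of
# 512-byte sectors, which is dd's default block size.
blocks_to_sectors() {
    blocks=${1%+}                 # strip the trailing "+" if present
    sectors=$((blocks * 2))       # 1,024 bytes = two 512-byte sectors
    case $1 in
        *+) sectors=$((sectors + 1)) ;;
    esac
    echo "$sectors"
}

blocks_to_sectors "97659103+"     # prints 195318207
```

With dd's default 512-byte block size, count=195318207 would cover the example partition down to its last sector. Counting in 1,024-byte blocks with bs=1024 works too, but leaves the odd half block untouched.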

The if option defines the input file. We use /dev/urandom, although you could use /dev/zero too.

The of option defines the output "file" or block device. Notice that here I used /dev/sdb1 instead of /dev/sdb. Since the partition spans the entire drive, this deletes everything; as for the old partition table, we just destroyed it with fdisk anyway.
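
Since a mistake in the of option cannot be undone, it may help to rehearse the command on an ordinary file first. The sketch below is an illustration, not part of the procedure itself: rehearse_wipe is a hypothetical helper, and conv=notrunc is added so that dd overwrites the file in place, the way it would a block device.

```shell
#!/bin/sh
# rehearse_wipe FILE BLOCKS: overwrite the first BLOCKS 1 KiB blocks of FILE
# with random data, in place (conv=notrunc keeps dd from truncating the file).
rehearse_wipe() {
    dd bs=1024 count="$2" if=/dev/urandom of="$1" conv=notrunc 2>/dev/null
}

# Build a 64 KiB scratch file full of zeroes, then wipe it.
scratch=$(mktemp)
dd bs=1024 count=64 if=/dev/zero of="$scratch" 2>/dev/null
rehearse_wipe "$scratch" 64
wc -c < "$scratch"                # size is unchanged: 65536 bytes
rm -f "$scratch"
```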

I've read that to really clear everything you want to repeat the process a few times (3 to 5 times.) If you have a very slow drive, think twice about it... Anyway, it can run in the background while you do other things. In my case, I did that on a 120 GB laptop hard drive which had a mega speed of 11 MB/s. So... 120,000 / 11 = 10,909 seconds, or a little over 3 hours to clear the drive. Not only that, when you use /dev/urandom, the computer has to come up with the random data, and that may be even slower than the 11 MB/s output! It took more than 5 hours to reset that drive to random numbers!
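
The estimate above is easy to redo for other drives and for several passes. Here is a minimal sketch of the same arithmetic, using integer division; wipe_seconds and format_duration are hypothetical helper names, and sizes are in megabytes as in the calculation above.

```shell
#!/bin/sh
# wipe_seconds SIZE_MB SPEED_MB_PER_S [PASSES]: estimated duration in seconds.
wipe_seconds() {
    passes=${3:-1}                # default to a single pass
    echo $(( $1 * passes / $2 ))
}

# format_duration SECONDS: print a duration as hours, minutes and seconds.
format_duration() {
    echo "$(( $1 / 3600 ))h $(( $1 % 3600 / 60 ))m $(( $1 % 60 ))s"
}

format_duration "$(wipe_seconds 120000 11)"      # prints 3h 1m 49s
format_duration "$(wipe_seconds 120000 11 3)"    # prints 9h 5m 27s
```

This only accounts for the speed of the drive; as noted, generating the random data can make the real run slower.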

Now, the neat thing with dd is that it understands the USR1 signal and prints out progress statistics when you send that signal to it. This way you can see whether the process is stuck.

First you need to find the pid (you could start dd in the background and use $! but I don't recommend it.)

ps -ef | grep dd

The list that ps outputs will include dd; the first number on its line is the process identifier. Using that number, run the kill command as in:

kill -USR1 1234

At that point you get some output from dd telling you where it's at. The first number between parentheses is the total amount written so far, in kilobytes or gigabytes. That's probably the easiest way to see how much is already done (e.g. if it says 40 GB and you have a 120 GB hard drive, then one third is done.) There is also the number of seconds it has been working and the speed at which data is flowing. At the beginning it may look like it goes really fast; that's because the writes are first buffered in your free memory (e.g. if you have 96 GB of RAM, then it may buffer 90+ GB before really starting to write to the output, which goes really fast and thus gives you a crazy speed...)
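
The whole sequence can be tried safely by copying zeroes to /dev/null, which touches no disk. This is a sketch under assumptions: it relies on GNU dd handling USR1 as described above (other dd implementations may simply be terminated by that signal), it uses $! to grab the pid of the background job, and demo_dd_progress and percent_done are hypothetical helper names.

```shell
#!/bin/sh
# demo_dd_progress: start a harmless dd in the background, ask it for
# statistics with USR1, then stop it. Prints whatever dd reported.
demo_dd_progress() {
    log=$(mktemp)
    dd if=/dev/zero of=/dev/null bs=1024 2>"$log" &
    pid=$!                            # pid of the background dd
    sleep 1                           # let dd install its signal handler
    kill -USR1 "$pid"                 # dd writes its statistics to stderr
    sleep 1
    kill "$pid" 2>/dev/null || true   # stop the demo
    wait "$pid" 2>/dev/null || true
    cat "$log"
    rm -f "$log"
}

# percent_done DONE TOTAL: integer percentage, both values in the same unit.
percent_done() {
    echo $(( $1 * 100 / $2 ))
}

percent_done 40 120                   # prints 33
```

Calling demo_dd_progress takes about two seconds and shows the same kind of report as a real wipe. Recent GNU versions of dd also accept status=progress, which prints these statistics continuously without any signal.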