The Linux Page

Low Level Formatting an HDD or SSD


WARNING

Do not apply the Secure Erase on a USB Drive or a drive connected through an SAS or RAID system as it may render your drive unusable. (See Brick (electronics) for an idea of what you could end up with in this case.)

Secure Erase

Modern hard drive controllers and drivers on Linux can make use of the special feature called SECURE ERASE.

To see whether your drive supports that feature, use the hdparm command:

sudo hdparm -I /dev/sdX

where X is the letter of the drive you would like to erase (e.g. /dev/sdf). Secure Erase is an ATA command addressed to the drive itself, so it always wipes the entire drive, never a single partition. To clear only one partition, see the Old Method below.

What you are interested in is the Security section. Here is that section for one of my 10TB hard drives:

Security:
    Master password revision code = 65534
        supported
    not    enabled
    not    locked
    not    frozen
    not    expired: security count
    not    supported: enhanced erase
    1072min for SECURITY ERASE UNIT.

As we can see, the security erase would take about 1072 minutes (over 17 hours). Another important point: the drive must be "not frozen". If it is currently frozen, the security erase command will not work. A trick to remove the freeze is to suspend the computer and wake it back up. If that doesn't work, you'll have to check your BIOS settings, since the BIOS is what freezes a drive in this way.

Note: the "not enabled" means that there is no password. See below for details about that one.
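As a rough sketch, the two checks above (feature supported, drive not frozen) can be automated by filtering the hdparm output. The function name check_secure_erase is made up for this example, and the parsing assumes the Security section is laid out as in the listing above, which may vary between hdparm versions.

```shell
#!/bin/sh
# check_secure_erase (hypothetical helper): reads `hdparm -I` output on stdin
# and prints "ready", "frozen", or "unsupported".
# Assumption: the Security section looks like the listing shown above.
check_secure_erase() {
    awk '
        /^Security:/            { in_sec = 1; next }   # start of the section
        in_sec && /^[^ \t]/     { in_sec = 0 }         # next unindented header ends it
        in_sec && $1 == "supported" { supported = 1 }  # a bare "supported" line
        in_sec && $1 == "frozen"    { frozen = 1 }     # "frozen" without a leading "not"
        END {
            if (!supported)  print "unsupported"
            else if (frozen) print "frozen"
            else             print "ready"
        }
    '
}
```

Usage would be something like `sudo hdparm -I /dev/sdX | check_secure_erase`. Only go further when it prints "ready".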

The command to do a security erase on your drive is as follows:

sudo hdparm --user-master u --security-erase "password1" /dev/sdX

This command will work properly on SSD drives as well since it sends an ATA command to the controller which can then do what is required to erase all the concerned chips available on the drive.

Note that you MUST be sure that /dev/sdX is the drive you really want to erase. If you use the wrong one... you know, that's a one way street. After that erase the data is 100% gone.

"password1" is required. It acts as a security check that you're not erasing the wrong drive. By default, your drive probably has no password, so you have to set one before running the erase. You should be able to set up a password using your BIOS, but that would require rebooting (and some BIOSes may have a hard time booting in that situation). Instead you can use the hdparm command like so:

sudo hdparm --user-master u --security-set-pass password1 /dev/sdX

The erase will remove the password so you won't have to do anything more. Any password will do; "password1" is simply easy to remember.
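Since the order matters (set the password first, erase second), here is a small sketch that only prints the two commands for review instead of running them. The function name secure_erase_plan is hypothetical, and the device check is a simplification that only accepts names of the form /dev/sdX with a single letter.

```shell
#!/bin/sh
# secure_erase_plan DEVICE [PASSWORD] (hypothetical helper)
# Prints, without executing, the two hdparm commands in the required order.
secure_erase_plan() {
    dev=$1
    pass=${2:-password1}
    case $dev in
        /dev/sd[a-z]) ;;   # a whole drive such as /dev/sdf
        *)
            echo "refusing: $dev does not look like a whole /dev/sdX drive" >&2
            return 1
            ;;
    esac
    # Step 1: set a temporary user password (the erase removes it again).
    printf 'hdparm --user-master u --security-set-pass %s %s\n' "$pass" "$dev"
    # Step 2: issue the actual erase with that same password.
    printf 'hdparm --user-master u --security-erase %s %s\n' "$pass" "$dev"
}
```

Calling `secure_erase_plan /dev/sdf` prints both lines so you can read them twice before pasting them after sudo.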

Source: Solid State Drive Clearing


Old Method

WARNING

Do not apply these commands on an SSD. Chances are it will not overwrite the data and it will reduce the lifetime of the SSD for no good reason. An SSD allocates blocks of data in a different way than HDDs.

Quick Reset

To low level format a hard drive, the best approach is to overwrite it with all zeroes or with random values.

One of the easiest ways is to use the cat command with the zero device:

cat /dev/zero >/dev/sdb

The zero device is faster than the random device. Also we clear the entire device (/dev/sdb) and not just a partition (i.e. /dev/sdb1).

Magnetic Data is Sticky

Although some people would be able to restore the data even after you cleared it with zeroes, they would need some $100,000 worth of gear to do it. They had better need your data really badly (as in: make at least $200,000 in return, although they may process many hard drives... but crooks don't think that way.)

To fix the problem, I suggest you use the random device instead:

cat /dev/urandom >/dev/sdb

This is neat. However, cat keeps copying data for as long as it can, until it runs into the end of the /dev/sdb device. In other words, it ends with an error and in some cases makes the device unusable afterward (you'll need a reboot to access the device again.)

Reset with the Exact Size

I have not tried it yet, but there is a tool called pv which seems to be even better than dd. It offers many more options and, in particular, it can be made to continue even if write errors occur. So far I never had such problems with my hard drives (except for one which completely crashed not long after I started getting some dead sectors). dd, on the other hand, stops at the first write error. You would have to determine the sector where it stopped, restart the process just after that sector, and try again until it really restarts... It would be tedious.

A better tool to copy data with a known size is dd. First we need to determine the number of blocks. Using fdisk you can wipe out the existing partition table and then create one entry representing the entire hard drive (although if your drive is larger than 2TB, older versions of fdisk won't work too well; newer versions seem to have been enhanced to support very large disks properly. Otherwise consider using parted instead.) fdisk commands are one letter. It would be something like this:

fdisk /dev/sdb
m
p
d
15 to 1
[repeat d until all partitions are gone, use p to verify]
n
1
<enter>
<enter>
p
w

The m command gives you help information about all the available commands.

The p command prints the existing partition table.

The d command is used to delete a partition.

The n command is to create a new partition. It is followed by 1 (partition number) then <enter> twice to accept the default block numbers (first and last by default.)

Use p again to review the new partition table before saving it.

The w command writes the new partition to the hard drive.

Now we're ready to use the dd command:

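dd status=progress bs=1024 count=<see below> if=/dev/urandom of=/dev/sdb1

The count option is the number of blocks as shown by the p command. That Blocks column counts 1024-byte blocks whereas dd defaults to 512-byte blocks, hence the bs=1024 option. In the following partition example, count would be 97659103: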

   Device   Boot  Start   End      Blocks     Id  System
/dev/sda1         1       12158    97659103+  8e  Linux

The if option defines the input file. Here I used the urandom device, although you could use /dev/zero too.

The of option defines the output "file" or block device. Notice that here I used /dev/sdb1 instead of /dev/sdb. Since the partition covers the entire drive, this overwrites everything except the partition table, and we just recreated that one anyway.

I've read that to really clear everything you want to repeat the process a few times (3 to 5 times.) If you have a very slow drive, think twice about it... Of course, it can run in the background while you do other work. In my case, I did that on a 120GB laptop hard drive which had a mega speed of 11MB/s. So... 120,000 / 11 = 10,909 seconds, or a little over 3 hours to clear the drive. Not only that, when you use /dev/urandom, the computer has to come up with random data. That may be even slower than the 11MB/s output! It took more than 5h to reset that drive to random numbers!
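The back-of-the-envelope estimate above is easy to redo for other drives and for several passes. This is only a sketch: wipe_eta is a name made up here, sizes are in MB, the speed is in MB/s, and it ignores the extra cost of generating random data.

```shell
#!/bin/sh
# wipe_eta SIZE_MB SPEED_MB_PER_S [PASSES] (hypothetical helper)
# Prints the estimated duration of a wipe in seconds and in hours/minutes.
wipe_eta() {
    size_mb=$1
    speed=$2
    passes=${3:-1}
    secs=$(( size_mb * passes / speed ))   # integer division, rounds down
    printf '%d s (%dh%02dm)\n' "$secs" $(( secs / 3600 )) $(( secs % 3600 / 60 ))
}

wipe_eta 120000 11      # the 120GB laptop drive above, one pass
wipe_eta 120000 11 3    # the same drive, three passes
```

The first call reproduces the 10,909 seconds computed above; three passes come to a bit over 9 hours at the same speed.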

Current Status

Update

As mentioned by someone in a comment, you can instead use the status=progress command line option like so:

dd status=progress ...

That way you get progress information about once per second. I've added that option above, which makes this section pretty useless unless you somehow did not include the option on the command line.

Now, the neat thing with dd is that it understands the USR1 signal and prints out progress statistics when you send it that signal. This way you can see whether the process is stuck.

First you need to find the PID (you could start dd in the background and use $! but I don't recommend it.)

ps -ef | grep dd

The list that ps outputs will include dd. The first number is the process identifier (PID).

On modern Linux, you can also use one of the following:

pidof dd
pgrep -x dd

The -x command line option is important to make sure you only get information about dd and not about any process that happens to include the letters dd in its name.

Using that PID number, run the kill command as in:

kill -USR1 1234

At that point you get some output from dd telling you where it's at. The first number between parentheses is the total amount copied so far, in kilobytes or gigabytes. That's probably the easiest way to see how much is already done (e.g. if it says 40GB and you have a 120GB hard drive, then 1/3rd is done.) There is also the number of seconds it has been working and the speed at which data is flowing. At the beginning it may look like it goes really fast; that's because the writes are first buffered in memory (e.g. if you have 96GB of RAM, it may buffer 90+GB before really starting to write to the output, which goes really fast and thus gives you a crazy speed...)
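If you would rather not do the division in your head, the status line can be turned into a percentage. A minimal sketch, assuming the GNU dd status format where the line starts with the byte count followed by the word "bytes"; dd_percent is a hypothetical name and the total size is something you supply yourself.

```shell
#!/bin/sh
# dd_percent TOTAL_BYTES (hypothetical helper)
# Reads dd status lines on stdin and prints the percentage already copied.
# Lines such as "records in" / "records out" are ignored.
dd_percent() {
    awk -v total="$1" '$2 == "bytes" { printf "%d\n", $1 * 100 / total }'
}

# Example with a 120 GiB drive of which 40 GiB are done:
echo '42949672960 bytes (43 GB, 40 GiB) copied, 3906 s, 11.0 MB/s' |
    dd_percent 128849018880
```

Keep in mind that dd prints its statistics on stderr, so the output has to be redirected (2>&1) or copied from the terminal before it can be piped into such a filter.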

What's up with SSDs?

It has been a while now. To make things faster and especially to extend the lifetime of the drives (the number of writes is limited), SSD controllers have implemented some special behaviors to handle where data gets written.

When you ask the controller to write data on block N, the value of N does not change for you, but as far as the controller is concerned the physical location changes on each write. The idea is quite simple: an SSD is composed of many chips. You can do a write on any one chip. As far as the chip is concerned, this counts as aging by one more day. (This is, of course, a metaphor.)

Any one chip has a maximum age; say 100,000 days. In other words, we can write to that one chip a maximum of 100,000 times before the chip is much more likely to fail (in general, manufacturers use a 20% gap. In other words, if they tell you that it can survive 100,000 writes, it is likely that the chip supports 120,000. This makes sure that the 100,000th write works as expected and you will be able to re-read your data.)

Once all the chips have been written 100,000 times, your drive should only be used in read-only mode.

Of course, as you use your drive some blocks get written once and never get modified (i.e. you install your OS once and never upgrade). So those may get a very small age and sit there. On the other hand, some other blocks may be overwritten all the time. Say you have many services writing logs, those are going to be written, deleted, overwritten, deleted, etc. In that case, a new block is going to be used for each write and it will rotate through your drive. However, as far as the OS is concerned, the blocks have the same number, yet the controller uses a different chip each time.

As a result the OS has no means to say "I want to write data on this very chip". It can only say "overwrite block N with this new data". So when it comes to clearing a drive, the controller prevents us from doing so. If many of the chips have an age of 50,000 and one has an age of 10 or so, most of your writes are going to land on the one with age 10 if the controller is given the chance to reuse that chip. It is very likely that cat or dd will not be able to overwrite the entire disk and will just make some of the chips look much older for no good reason.
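To make the idea concrete, here is a toy simulation of that behavior. It is a deliberately simplified model, not how any real controller works: every write to the same logical block N is sent to whichever chip currently has the lowest age. The name wear_sim and the "pick the youngest chip" rule are assumptions made for the illustration.

```shell
#!/bin/sh
# wear_sim CHIPS WRITES (hypothetical toy model)
# Simulates WRITES overwrites of one logical block on a drive with CHIPS
# chips, always picking the chip with the lowest age, then prints each
# chip's age.
wear_sim() {
    awk -v chips="$1" -v writes="$2" 'BEGIN {
        for (c = 0; c < chips; c++) age[c] = 0
        for (w = 0; w < writes; w++) {
            best = 0
            for (c = 1; c < chips; c++)
                if (age[c] < age[best]) best = c   # youngest chip wins
            age[best]++                            # the write lands there
        }
        for (c = 0; c < chips; c++) printf "%s%d", (c ? " " : ""), age[c]
        print ""
    }'
}

wear_sim 4 10   # prints: 3 3 2 2
```

Ten overwrites of the "same" block end up spread over all four chips, and the OS never gets to choose which one.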

Re: Low Level Formatting an HDD or SSD

I wouldn't say a big mistake, but it's not going to work with just `dd ...` or `cat ...`. If the drive has a working SMART system that offers an erase function, it will be capable of deleting everything properly. In all other cases, it will most likely destroy the file system, but not all the data. So it's probably enough for yourself or if you want to give the USB drive (or an SSD) to a good friend, but if a good hacker gets that drive, he will be able to read the data that was not properly deleted (i.e. the data on sectors that the write won't touch because they already got written more than some other sectors).

For HDD, when you say you want to write to sector N, the controller writes to sector N. So for those, it works every time either way, although the SMART system may offer an erase function which is much faster (local to the drive, at least, so no requirement on DMA for that clear).

You could also use `/dev/urandom`, which gives a better chance of hiding the existing data (with a really powerful magnetic reader, you will still be able to see the old data if you only write zeroes; random data will make it way more confusing to such readers).

Re: Low Level Formatting an HDD or SSD

Ok, so basically it is due to the type of communication interface, in this case USB? And why the "brick"? I've always used "dd" with "/dev/zero" to format my USB (SATA) drives, so have I made a big mistake so far?
Thank you.

Re: Low Level Formatting an HDD or SSD

A USB drive is essentially the same as an SSD drive, so trying to just format such a drive is not likely to write data where you think you are writing it. It will work only if the USB drive has a smart controller which can do a real clear of all the cells.

The concept is pretty simple: if you write 10 times on Cell A, the next 10 writes, which have nothing to do with that specific file, would end up on Cell B. It is done that way so on average we write an equal number of times to each cell. Great technology to make the drive last as long as possible, not so great if you want to do a low level format (because the driver tells you that you are writing on sector N when in reality you are writing on sector M, but because that's done under the hood, you have no way of knowing.)

Re: Low Level Formatting an HDD or SSD

Hi, just one question: why do you advise against low-level formatting on USB-connected disks?
Thank you.

Re: Low Level Formatting an HDD or SSD

I just got a new 8TB hard drive since my second 2TB drive died. I created a GPT label and partition. Then I formatted that partition and from the start, I'm "missing" 390GB of space. This is because the system sets aside a lot of space to manage the drive: metadata, the ext4 journal and, by default, 5% of the blocks reserved for root.

Here are the numbers:

alexis@isabelle:~$ sudo mkfs -t ext4 /dev/sdb1
mke2fs 1.44.1 (24-Mar-2018)
Creating filesystem with 1953506385 4k blocks and 244191232 inodes
Filesystem UUID: bf425244-bb95-4332-a8eb-1e02e254cdf7
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
	102400000, 214990848, 512000000, 550731776, 644972544, 1934917632

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done

alexis@isabelle:~$ df /mnt/unsafe
Filesystem            1K-blocks      Used  Available Use% Mounted on
/dev/sdb1            7751366384     94236 7360554488   1% /mnt/unsafe
alexis@isabelle:~$ expr 7751366384 - 7360554488
390811896

If this is what you are seeing, then it's normal. It would have nothing to do with the fact that the formatting blocked at 300Gb.
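As a rough cross-check of those numbers: the gap is close to what the default ext4 reserve for root would explain on its own. This sketch assumes the mke2fs default of 5% reserved blocks (the mkfs command above did not change it); reserved_kb is a name made up for the example.

```shell
#!/bin/sh
# reserved_kb BLOCKS [BLOCK_KB] [PERCENT] (hypothetical helper)
# Estimates the space reserved for root, in 1K blocks as df reports them.
reserved_kb() {
    blocks=$1          # number of filesystem blocks, from the mkfs output
    block_kb=${2:-4}   # block size in KiB (4k blocks in the listing above)
    percent=${3:-5}    # assumption: default reserve of 5%
    echo $(( blocks * percent / 100 * block_kb ))
}

reserved_kb 1953506385   # prints: 390701276
```

That is within about 0.03% of the 390811896 computed with expr above. On a pure data drive the reserve can be lowered with the -m option of tune2fs.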

Another possibility is that you have bad blocks, in which case I would get a new drive. You can keep that old drive for "unsafe data" (temporary files, etc.)

There is a smartmontools package for Linux which includes a smartctl command. That can be used to check the drive through the controller and get an idea of the drive's health as viewed by the controller (as opposed to the OS, which doesn't use that info).

Re: Low Level Formatting an HDD or SSD

But I do not want to destroy the hard drive, I want to fix the broken part when using the Hard Disk tool that comes in linux mint (which has the option low level format and I don't know if you write only 0 or something else, I just know I can't access that part).

When formatting the hard drive to partition it, I made the mistake of using that option, the hard drive was frozen and that's how I ran out of 300 GB.

I'm sorry, I'm not good at English and I don't know how to explain my problem better.

Re: Low Level Formatting an HDD or SSD

I suppose that by "low level formatting" you are referring to my examples above. With "dd", "cat", or "pv", I really do not see how that 300GB "loss" could happen unless you did a low level format of the MBR, which includes the partition table. In that case, the partition table would have been "tweaked" in such a way that it "created" a 300GB partition. I don't really see anything else that could have happened.

I would try by resetting the MBR and try formatting again. To clear the MBR, write two blocks with zeroes. Something like this:

dd if=/dev/zero of=/dev/hdz bs=512 count=2

fdisk will be able to handle a completely cleared MBR and add new partitions as expected. You may need to reboot your machine between such writes, though, because the kernel may cache the old data and try to use that instead of properly re-reading the data from the drive (at least that was the case in the old days).

Note that resetting a drive before using it is not useful. All you need to do with Linux is format the file system (see "mkfs"). The reset example shown here is if you want to get rid of a hard drive without shredding it to pieces (which costs some money).

The fact that it stopped at 300GB may be related to a speed issue. At times, writing too fast to a hard drive can lead to such problems. Off the top of my head, I'm not too sure how to pace one of the tools I mentioned above.

Also, as a note, I should not call this "low level formatting" because it's not. The low level is done at the factory now and you can't do that again. The low level formatting writes the necessary codes so later the system can read/write sectors. What we do here, though, is just resetting the data within those sectors.

Re: Low Level Formatting an HDD or SSD

I made a mistake with a 6 TB hard drive.

I tried to format it at a low level with the default disk tool in Linux Mint and it froze at 300 GB. Now it shows 300 GB as occupied on the hard disk and I cannot recover them. Is there any way to fix it without having to format it at a low level again? I'm afraid it will freeze again.

I tried using fast formatting, but the 300 GB still show up as used. Could it be that when formatting at a low level those sectors got some kind of ghost file from the incomplete process?

Re: Low Level Formatting an HDD or SSD

There's no evidence that it's possible to recover data from a zeroed modern hard drive, no matter how much you're willing to spend. Except for a few reallocated blocks, which won't be much.

Still, if you want random data, /dev/urandom is going to be brutally slow. I get 9MB/s here. Meanwhile a simple openssl command gives me 530MB/s. Here's a nice quick source of random data:

openssl enc -rc4-40 -pass pass:"$(head -c128 /dev/urandom|base64)" -nosalt </dev/zero

I can also highly recommend installing and using "pv" instead of cat or dd or similar, as it gives a nice progress bar and has lots of options.

I've wiped hundreds of drives and NEVER had an issue with making them "unusable" with the sole exception of awful buggy Silicon Power USB thumb drives. Nearly all commands will quit at the first error you get when trying to go past EOM and quit quite gracefully. Only with some ignore-error options to dd can you get it to keep going.

Still, if you have a driver bug and want to stop right at the end of the media without errors, no need to fool with fdisk. blockdev --getsize64 will tell you exactly how many bytes your drive has. Pass that number to the "-Ss" option of pv, and it'll completely wipe the drive without hitting EOM and reporting an error.
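The same "stop exactly at the end" idea can be sketched without pv by capping the input with head -c. This is an alternative under the same assumption (you know the exact byte count, for instance from blockdev --getsize64), not the pv command described above; wipe_exact is a hypothetical name and bs=1M assumes GNU dd.

```shell
#!/bin/sh
# wipe_exact TARGET SIZE_BYTES [SOURCE] (hypothetical helper)
# Overwrites exactly SIZE_BYTES of TARGET with data read from SOURCE
# (default /dev/zero), so the write never runs past the end of the media.
wipe_exact() {
    target=$1
    size=$2
    src=${3:-/dev/zero}
    # head stops after SIZE_BYTES; conv=notrunc keeps a regular file's length.
    head -c "$size" "$src" | dd of="$target" bs=1M conv=notrunc 2>/dev/null
}
```

On a real drive that would be run as root along the lines of `wipe_exact /dev/sdX "$(blockdev --getsize64 /dev/sdX)" /dev/urandom`, after checking three times that /dev/sdX is the right device.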

Re: Low Level Formatting Hard Drives

I'm usually using bash, indeed. And I meant to say $$. I fixed it in my post.

I also found out how to properly erase an SSD drive so I added that info. The same info can be used to erase any drive (HDD or SSD), just be careful if you have a SAS/RAID/USB drive, those may explode instead.

Thank you for your comment!

Re: Low Level Formatting Hard Drives

Instead of using ps and grep to get the process id of dd, you could use either "pidof dd", or "pgrep -x dd" (the "-x" argument is normally used to tell pgrep to not use regular expressions, but it can also ensure process name is EXACTLY what you typed, as opposed to any process that happens to have two instances of the letter "d" in a row).

Also, assuming you are using BASH as your shell, I don't think $? does what you think it does. $? just prints the exit status of your previous command, NOT the PID. Perhaps you meant to type $$, which would print the shell's PID? (Not that this would make much sense, either.)

Another thing: Instead of constantly sending SIGUSR1 to dd, you could add the "status=progress" argument. This will cause dd to print the same data it would print with SIGUSR1 (minus the "x records in" and "y records out" lines), but it does so automatically. (Once every second, if I'm not mistaken.) Thanks to the magic of Carriage Returns, it does this without filling up your terminal with a million lines of (mostly) the same output.