
Testing your memory on a live Linux system

Memory chips: how to test their integrity

Today I wanted to test the memory on a remote server. I could not just reboot and run memtest86+, so I had to look for a different way to test most of the computer's memory without rebooting.

Test Memory Through Linux File Cache with On-Disk Temporary File

WARNING: This test assumes that /tmp is NOT mounted as tmpfs. To verify, use df /tmp. If it says tmpfs in the Filesystem column, this test is likely to fail as written.

# Good, using a physical disk
% df /tmp
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/md6         4675752  9808   4405380   1% /tmp

# Invalid, using tmpfs (i.e. RAM disk)
% df /tmp
Filesystem     1K-blocks  Used Available Use% Mounted on
tmpfs            4675752  9808   4405380   1% /tmp

I found an interesting page describing a way to do just that using md5sum on a very large file.

This is a verbatim copy of the command lines proposed there.

WARNING: see the paragraph about bs and count below.

dd if=/dev/urandom bs=768304 of=/tmp/memtest count=1050
md5sum /tmp/memtest; md5sum /tmp/memtest; md5sum /tmp/memtest

The size, 768304 (in KB), is expected to be close to your total memory size. You should know how much memory you have; otherwise, type free or top and you will see it somewhere. It is also available in /proc/meminfo on the MemTotal line (usually the first line):

% grep MemTotal /proc/meminfo
MemTotal:        8165856 kB
%

This example shows the total amount of RAM for an 8 GB server.

The count parameter tells dd how many blocks of bs bytes to write. I think there is a small mistake in the original command: the two numbers should probably be inverted. The default bs is 512, which is the size of one block, and count is how many of those blocks get written. With bs=1024, count is then simply the file size in KB:

dd if=/dev/urandom bs=1024 of=/tmp/memtest count=768304

This means: write 768304 blocks of 1024 bytes each to /tmp/memtest, using random data as provided by /dev/urandom. (On the 8 GB server above, the count would be 8165856 instead.)
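
If you do not want to compute the count by hand, you can derive it directly from /proc/meminfo. A minimal sketch: since MemTotal is reported in kB, it can be used as the count as long as bs=1024.

# use the MemTotal value (in kB) as the number of 1 KB blocks
count=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
dd if=/dev/urandom bs=1024 of=/tmp/memtest count=$count

Make sure /tmp actually has that much free space first (df /tmp, as shown above).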

The second command line of the verbatim copy runs md5sum 3 times on that file. Since Linux has the habit of caching file data in RAM, a lot of that data will still reside in memory when the 2nd and 3rd md5sum commands run. And that means you're testing a great deal of your computer's memory.

The idea is that if the memory is going bad, each md5sum will return a different result, and you are likely to get ECC errors on your console and in your syslog file (if such reporting is turned on in your BIOS and kernel).
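
Rather than comparing the three checksums by eye, you can let the shell do it. Here is a minimal sketch that wraps the three md5sum passes from above and complains on the first mismatch:

#!/bin/sh
# run md5sum three times on the test file; any disagreement
# between passes hints at bad RAM (or a file that changed)
first=$(md5sum /tmp/memtest | cut -d' ' -f1)
for pass in 2 3; do
    sum=$(md5sum /tmp/memtest | cut -d' ' -f1)
    if [ "$sum" != "$first" ]; then
        echo "pass $pass: checksum mismatch ($sum vs $first)" >&2
        exit 1
    fi
done
echo "all 3 passes agree: $first"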

Original post: Linux Server Memory Check

Test Memory with tmpfs File

It is also possible to create a tmpfs RAM disk and then run a test against that disk. By default, the size of a tmpfs RAM disk is 50% of the computer's total RAM. It may be set higher on servers supporting very large amounts of memory (like 256 GB of RAM).

WARNING: This test is likely to use 50% of your RAM. Use it with caution on a live system serving clients.

You can then fill the disk with zeroes to see how fast it goes. Note that the test really works only once: repeating it will neither run at the same speed nor check all the RAM over and over again.

sudo mkdir /mnt/memtest
sudo mount -t tmpfs tmpfs /mnt/memtest
dd if=/dev/zero of=/mnt/memtest/test bs=1M
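
If 50% of your RAM is more than you can spare, tmpfs accepts an explicit size at mount time. For example, to cap the RAM disk at 2 GB (the 2G value here is only an example, adjust it to your situation):

sudo mount -t tmpfs -o size=2G tmpfs /mnt/memtest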

The dd command, in this case, runs until the filesystem is filled up. However, you fill the whole memory with zeroes only, which is not a very good way to make sure that the memory is indeed valid. It will, however, tell you how fast the file was created and thus give you an idea of the memory access speed. Note that writing to a regular disk adds overhead for handling the filesystem structures, and a file of just zeroes may even end up as a sparse file (where zero-filled parts are never committed to disk), although dd only does that when given conv=sparse.

Changing the dd command to use /dev/urandom will verify the memory better; however, it won't show you a valid speed, because /dev/urandom is likely to be the bottleneck in this case. You are more likely to measure the speed at which /dev/urandom produces random data.
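
If you want random data without being limited by /dev/urandom, one possibility is to generate the random file once on a regular disk and then copy it into the tmpfs mount, verifying the copy with md5sum. A sketch of that idea (the 1 GB size and the /var/tmp path are just examples):

# generate the random data once, on a regular disk (slow, but done once)
dd if=/dev/urandom of=/var/tmp/random.dat bs=1M count=1024
md5sum /var/tmp/random.dat

# copy into the RAM disk and verify; a differing checksum
# points at a memory (or copy) problem
cp /var/tmp/random.dat /mnt/memtest/random.dat
md5sum /mnt/memtest/random.dat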

Source: What are the best possible way to benchmark the RAM (no-ECC) under linux/arm?

Run memtest when Physical Access to Computer is Available

Now, if you can, try your memory with memtest86+. It's a much better RAM test than the examples above.

Run Speed Test on Memory

In order to verify that your memory is still running at full speed, you may use the mbw command.

First install it if you don't already have it installed:

apt-get install mbw

If you're not under Debian or an RPM system, grab the mbw source from github.
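
The build itself is a simple Makefile affair. Something along these lines should do, although the exact repository URL is an assumption, so double check on github:

git clone https://github.com/raas/mbw.git
cd mbw
make
sudo cp mbw /usr/local/bin/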

Then run the command with the parameters best adapted to your situation. For example, the following gives me a good idea of my memory bandwidth in plain CPU code (as opposed to SIMD or some other form of parallelism). That better represents the speed at which a regular algorithm can access memory, so you can check whether your own algorithm is faster than, about as fast as (i.e. memory access being the bottleneck of your routine), or slower than your memory:

% mbw -t1 2Gb
Long uses 8 bytes. Allocating 2*262144 elements = 4194304 bytes of memory.
Getting down to business... Doing 10 runs per test.
0      Method: DUMB    Elapsed: 0.00051    MiB: 2.00000    Copy: 3891.051 MiB/s
1      Method: DUMB    Elapsed: 0.00041    MiB: 2.00000    Copy: 4854.369 MiB/s
2      Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 4987.531 MiB/s
3      Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5025.126 MiB/s
4      Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5037.783 MiB/s
5      Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5050.505 MiB/s
6      Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5050.505 MiB/s
7      Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5037.783 MiB/s
8      Method: DUMB    Elapsed: 0.00041    MiB: 2.00000    Copy: 4866.180 MiB/s
9      Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5025.126 MiB/s
AVG    Method: DUMB    Elapsed: 0.00041    MiB: 2.00000    Copy: 4854.369 MiB/s

As we can see in this example, the copy runs at an average speed of about 4.8 GiB/s. Note that mbw takes its array size in MiB, so the 2Gb argument was actually parsed as just 2 MiB, as the output above shows. Also note that the copy uses a long, i.e. 8 aligned bytes at a time, which is much faster than a parser that reads its input one byte at a time.

However, image processing using SIMD would probably manage to use the memory at its full speed, beyond the plain copy speed this test shows.
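
If you want to see how much headroom an optimized copy has over the plain loop, mbw can run other test types as well: as far as I can tell from its usage text, 0 uses the libc memcpy(), 1 is the dumb element-by-element copy used above, and 2 is a block-wise copy. For example (the 256 MiB size is arbitrary, chosen to be much larger than the CPU caches):

% mbw -n 10 -t0 256
% mbw -n 10 -t2 256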