Today I wanted to test the memory on a remote server. I could not just reboot it and run memtest86+, so I had to look for a different way to test most of the computer's memory without rebooting.
WARNING: This test assumes that /tmp is NOT mounted as tmpfs. To verify, use df /tmp. If it says tmpfs under the Filesystem column, then this test is likely to fail as written.
# Good, using a physical disk
% df /tmp
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/md6         4675752  9808   4405380   1% /tmp

# Invalid, using tmpfs (i.e. a RAM disk)
% df /tmp
Filesystem     1K-blocks  Used Available Use% Mounted on
tmpfs            4675752  9808   4405380   1% /tmp
I found an interesting page describing a way to do so using md5sum on a very large file.
Here is a verbatim copy of the proposed Linux command line.
WARNING: see the paragraph about bs and count below.
dd if=/dev/urandom bs=768304 of=/tmp/memtest count=1050
md5sum /tmp/memtest; md5sum /tmp/memtest; md5sum /tmp/memtest
The size, 768304 (in Kb), is expected to be close to your memory size. You should know how much memory you have; otherwise, type free or top and you'll find it there. It's also available in /proc/meminfo on the MemTotal line (usually the first line):
% grep MemTotal /proc/meminfo
MemTotal:        8165856 kB
%
This example shows the total amount of RAM for an 8Gb server.
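Rather than hard-coding the size, you can derive it from /proc/meminfo. Here is a minimal sketch (my addition, not from the original post) that builds the dd command from MemTotal, assuming the corrected bs=1024 form discussed just below and enough free space under /tmp (check df /tmp first):

# Read MemTotal (in Kb) and use it as the count of 1Kb blocks
mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
dd if=/dev/urandom bs=1024 of=/tmp/memtest count=$mem_kb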
The count parameter says how many times a block of bs bytes gets written. I think there is a little mistake in the original command: the two numbers should probably be inverted, with bs set to 1024 and count set to 768304, as in the command below. (The default bs is 512, which is the size of a disk block, and count is how many times you repeat that block.)
dd if=/dev/urandom bs=1024 of=/tmp/memtest count=768304
This means: write 768304 blocks of 1024 bytes to /tmp/memtest, using random data as provided by /dev/urandom.
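As a quick sanity check (again my addition, not from the original post), you can verify the size that was actually written:

# 768304 blocks x 1024 bytes = 786,743,296 bytes (about 750Mb)
ls -l /tmp/memtest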
The second line of the verbatim snippet above runs md5sum 3 times on that file. Since Linux aggressively caches file data, a lot of that data will still reside in memory when the 2nd and 3rd md5sum commands run. That means you're testing a great deal of the memory of your computer.
The idea is that if the memory is going bad, then each md5sum will return a different result, and you are also likely to get ECC errors on your console and in your syslog file (if such reporting is turned on in your BIOS and kernel).
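Comparing three long checksums by eye is error prone, so here is a small sketch (my addition) that automates the comparison:

# Compute the checksum three times and compare the results
sum1=$(md5sum /tmp/memtest | cut -d' ' -f1)
sum2=$(md5sum /tmp/memtest | cut -d' ' -f1)
sum3=$(md5sum /tmp/memtest | cut -d' ' -f1)
if [ "$sum1" = "$sum2" ] && [ "$sum2" = "$sum3" ]
then
    echo "OK: all three checksums match ($sum1)"
else
    echo "FAILURE: checksums differ, suspect bad RAM"
fi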
Original post: Linux Server Memory Check
It is also possible to create a tmpfs RAM disk and then run a test against that disk. By default, the size of a tmpfs RAM disk is 50% of the computer's total RAM. It may get higher on servers supporting very large amounts of memory (like 256Gb of RAM.)
WARNING: This test is likely going to use 50% of your RAM. Use it with care on a live system serving clients.
You can then fill the disk with zeroes to see how fast it goes. Note that the test really works only once: repeating it will not run at the same speed, nor check all of the RAM over and over again.
sudo mkdir /mnt/memtest
sudo mount -t tmpfs /mnt/memtest /mnt/memtest
dd if=/dev/zero of=/mnt/memtest/test bs=1M
The dd command, in this case, runs until the filesystem is full. However, you fill the whole memory with zeroes only, which is not a very good test to make sure that the memory is indeed valid. It will, however, give you information about the speed at which the file was created and thus an idea of the memory access speed. Note that writing to a filesystem has overhead in handling the file format, and a file of just zeroes is likely to end up being created as a sparse file (where parts of the file are not committed to disk because they are all zeroes.)
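If the 50% default is more than you can spare, tmpfs accepts an explicit size option. A minimal sketch (the 1G figure is only an example, adjust it to your situation), including the umount that releases the RAM afterward:

# Mount a tmpfs of an explicit size, run the test, then release the RAM
sudo mkdir -p /mnt/memtest
sudo mount -t tmpfs -o size=1G tmpfs /mnt/memtest
dd if=/dev/zero of=/mnt/memtest/test bs=1M
sudo umount /mnt/memtest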
Changing the dd command to use /dev/urandom will better verify the memory; however, it won't show you a valid speed because /dev/urandom is likely the bottleneck in this case. You are then more likely to be measuring the speed at which /dev/urandom gives you random data.
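For completeness, here is that variant combined with the md5sum trick from above so the data actually gets verified (my combination, not from the source):

# Fill the RAM disk with random data (dd stops when the tmpfs is full),
# then checksum the file twice; differing sums would point at bad RAM
dd if=/dev/urandom of=/mnt/memtest/test bs=1M
md5sum /mnt/memtest/test; md5sum /mnt/memtest/test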
Source: What are the best possible way to benchmark the RAM (no-ECC) under linux/arm?
Now, if you can, try your memory with memtest86+. It's a much better RAM test than the examples above.
In order to verify that your memory is still running at full speed, you may use the mbw command.
First install it if you don't already have it installed:
apt-get install mbw
If you're not on Debian or an RPM-based system, grab the mbw source from GitHub.
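Building it is straightforward; a sketch assuming the usual raas/mbw repository (a plain Makefile, no configure step):

git clone https://github.com/raas/mbw.git
cd mbw
make
./mbw -t1 256    # quick smoke test over 256MiB arrays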
Then run the command with parameters best adapted to your situation. For example, the following gives me a good idea of my memory bandwidth in plain CPU code (as opposed to SIMD or some other parallelism). That better represents the speed at which a regular algorithm can access memory, so you can see whether your own algorithm is faster than your memory, about the same speed (i.e. memory access being the bottleneck of your routine), or slower.
% mbw -t1 2Gb
Long uses 8 bytes. Allocating 2*262144 elements = 4194304 bytes of memory.
Getting down to business... Doing 10 runs per test.
0   Method: DUMB    Elapsed: 0.00051    MiB: 2.00000    Copy: 3891.051 MiB/s
1   Method: DUMB    Elapsed: 0.00041    MiB: 2.00000    Copy: 4854.369 MiB/s
2   Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 4987.531 MiB/s
3   Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5025.126 MiB/s
4   Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5037.783 MiB/s
5   Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5050.505 MiB/s
6   Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5050.505 MiB/s
7   Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5037.783 MiB/s
8   Method: DUMB    Elapsed: 0.00041    MiB: 2.00000    Copy: 4866.180 MiB/s
9   Method: DUMB    Elapsed: 0.00040    MiB: 2.00000    Copy: 5025.126 MiB/s
AVG Method: DUMB    Elapsed: 0.00041    MiB: 2.00000    Copy: 4854.369 MiB/s
As we can see in this example, the test gives us an average copy speed of about 4.8Gb/s. (Note that mbw takes its array size in MiB, so the 2Gb argument above was actually parsed as 2MiB, as the "4194304 bytes" line in the output confirms.) Also note that the copy uses "long", which is 8 bytes at a time, aligned. This is much faster than a parser that would read input data one byte at a time.
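If you actually want the arrays to span 2Gb, pass the size as a plain number of MiB. A hedged sketch (mbw allocates two arrays of that size, so make sure you have twice that much RAM free):

# 2048MiB per array = 2Gb copied per run (about 4Gb allocated in total)
mbw -t1 -n 10 2048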
Image processing using SIMD, however, would probably be able to use the memory at full speed, beyond what this plain CPU test shows.
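mbw itself can give a hint of that gap: test 0 uses the libc memcpy(), which is typically SIMD-optimized, while test 1 is the plain loop used above. A quick comparison sketch:

# Compare the optimized memcpy (test 0) against the plain copy loop (test 1)
mbw -t0 -n 10 256
mbw -t1 -n 10 256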