This page is currently a scratch pad for my installation of the BadRAM kernel patch. I'll clean this up later, or you can.
An alternative to the BadRAM patch that I find just as easy is the memmap kernel parameter (available by default in at least Ubuntu 8.04). I describe how to use it here: http://gquigs.blogspot.com/2009/01/bad-memory-howto.html
More information on the patch can be found here: http://rick.vanrein.org/linux/badram/index.html
Information about compiling a custom kernel on Ubuntu can be found here: [KernelCustomBuild]
An old thread on the forums is here: http://ubuntuforums.org/showthread.php?t=102254
Why do this?
You have a computer with a stick of memory that is predictably bad. This means that memory errors only come from the same couple of addresses. With the BadRAM patch, one can tell the kernel not to allow these addresses to be used. I have a 512MB stick of memory that works great, except it has almost a megabyte that returns errors in Memtest, and if I use this memory in my system I often encounter crashes. With the BadRAM patch, I can avoid this problem.
Building the Patch
If the system you are using already has one or more bad sticks of memory, you have two options:
a) Put good memory in temporarily until you've built the patched kernel.
b) If memory errors don't show up in lower memory, add the "mem=##M" kernel option to temporarily disable memory above that point.
Example: According to memtest, my stick of memory has errors between 315MB and 325MB. By adding "mem=314M" I tell the kernel to pretend I only have 314MB. My system is then stable enough to build the patched kernel, at which point I can pass a parameter to skip ONLY the section between 315MB and 325MB (thus giving me 502MB of usable memory).
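The arithmetic in the example above can be sketched in shell. This is only an illustration: mem_param is a hypothetical helper name, and the numbers are the ones from my memtest run. The kernel documents the size suffix for mem= as K, M, or G.

```shell
#!/bin/sh
# Sketch: derive a temporary mem= limit from the first bad megabyte that
# memtest reported. mem_param is a hypothetical helper, not part of any tool.
mem_param() {
    first_bad_mb=$1
    # Stop one megabyte short of the first failing address.
    echo "mem=$(( first_bad_mb - 1 ))M"
}

mem_param 315                  # prints mem=314M

# Usable memory once only the bad 315MB-325MB section is skipped on a 512MB stick.
echo $(( 512 - (325 - 315) ))  # prints 502
```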
Start by following the instructions for building a Custom Kernel linked above. Once you've downloaded the source, apply Rick van Rein's BadRAM patch. Then you can build the kernel as normal. Be sure to enable BadRAM support in the build.
After you've built the new kernel, add it to your /boot/grub/menu.lst file as a boot entry. On its kernel line, add the badram= parameter listing the addresses in memory to avoid. Memtest can provide you with such a list.
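As a rough sketch of what such an entry might look like, the script below joins address,mask pairs into a badram= parameter and prints a GRUB legacy stanza. Everything in it is an assumption for illustration: the helper name badram_param, the kernel version string, the root device, and the address/mask pair are hypothetical placeholders, so substitute the pairs memtest gives you and the file names from your own /boot.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder, not taken
# from a real machine. Use the address,mask pairs memtest reports.

# Join address,mask pairs into a single badram= kernel parameter.
badram_param() {
    # Pairs are required, so zero or an odd number of arguments is an error.
    [ $# -gt 0 ] && [ $(( $# % 2 )) -eq 0 ] || return 1
    # "$*" joins the arguments with the first character of IFS (a comma here).
    ( IFS=,; echo "badram=$*" )
}

param=$(badram_param 0x13b00000 0xfff00000)

# Print a stanza that could be pasted into /boot/grub/menu.lst.
cat <<EOF
title   Ubuntu, kernel 2.6.20-badram
root    (hd0,0)
kernel  /boot/vmlinuz-2.6.20-badram root=/dev/sda1 ro $param
initrd  /boot/initrd.img-2.6.20-badram
EOF
```

Keeping the existing stock kernel entry in menu.lst alongside the new one leaves a fallback if the patched kernel does not boot.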
Boot the system from your new kernel. Success.
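One way to confirm that the new kernel is the one running, and that it was actually handed the parameter, is to look at the kernel version and command line after boot. A minimal sketch, assuming the parameter is spelled badram= as in Rick's documentation; has_badram_arg is a hypothetical helper name:

```shell
#!/bin/sh
# has_badram_arg is a hypothetical helper: it succeeds when the given kernel
# command line contains a badram= parameter.
has_badram_arg() {
    case " $1 " in
        *" badram="*) return 0 ;;
        *) return 1 ;;
    esac
}

# The version should carry the suffix you gave the custom kernel.
uname -r

# /proc/cmdline holds the parameters the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    if has_badram_arg "$(cat /proc/cmdline)"; then
        echo "badram= is present on the kernel command line"
    else
        echo "badram= is missing; check the kernel line in menu.lst"
    fi
fi
```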
I relied heavily on az's comments here: http://ubuntuforums.org/showpost.php?p=565328&postcount=7
I followed 'What you'll need' from the [KernelCustomBuild] page. Since I'm not a kernel developer, I next did:
- cd /usr/src
- sudo apt-get source linux-source
Next I unpacked (note that your filename may be different)
- sudo tar xvjf linux-source-2.6.20.tar.bz2
Download the patch (Check Rick's website for the URL)
- sudo wget (URL-TO-THE-2.6.20-PATCH)
Enter the folder
- cd linux-source-2.6.20
Copy the existing kernel config from the boot partition
- sudo cp /boot/config-2.6.20-16-generic .config
Apply the kernel patch:
- sudo patch -p1 < ../BadRAM-(whatever).patch
Build the kernel package (Press Y to add support for BADRAM)
- sudo make-kpkg --initrd --append-to-version=-badram --stem=linux kernel_image kernel_headers
Install the new kernel package and kernel headers package
- sudo dpkg -i linux-image-(whatever)-badram.deb
- sudo dpkg -i linux-headers-(whatever)-badram.deb
On reboot, everything seemed to work fine except Xorg. I couldn't get X to start, even with the nv driver. I've yet to figure out how to build the restricted-drivers package against the custom kernel headers installed in the step above.
Still to add: running memtest and determining the bad memory addresses; editing /boot/grub/menu.lst to block those bad memory addresses; building the restricted-drivers package to allow the use of vmware-player, nvidia drivers, etc. from aptitude.
BADRAM setting in Grub2
The GRUB2 config file in Natty (Ubuntu 11.04), /etc/default/grub, has a line for configuring bad RAM exclusions, so I assume that is the preferred way of mapping out a section of memory that is showing errors. The line I set was:
- GRUB_BADRAM="0x7DDF0000,0xffffc000"
Every web site I could find suggested running memtest86 and letting it show you the BadRAM settings. memtest86 gave me a page of values I would have had to enter. I could see that all the addresses were in one 16K block, so I just wanted to map that 16K block out of action. Here is how I generated the correct entry.
The first parameter is easy. That is the base address of the bad memory. In my case, I could see that all the bad addresses were greater than 0x7DDF0000 and less than 0x7DDF4000. So, I took the beginning of the 16K block as my starting address.
The second parameter is a mask. You put 1s in the bit positions where every address in the range shares the same value, and 0s where the bits vary. This means you need to pick your address range so that only the low-order bits vary. Looking at my address, the first part of the mask is easy: it starts with 0xffff. For the next nibble, I will explain with bit patterns. Within the block, that nibble ranges from 0000 to 0011, so its mask is 1100, or hex c. The last 3 nibbles need to be all 0s in the mask, since we want the entire range mapped out. So, we get a total result of 0xffffc000.
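The mask reasoning above can be checked with shell arithmetic. This is a sketch under two assumptions: the bad block is a power of two in size and its base address is aligned to that size, as in my 16K case. The helper names badram_mask and in_block are made up for this example.

```shell
#!/bin/sh
# badram_mask SIZE: print the 32-bit mask for an aligned block of SIZE bytes.
# SIZE must be a power of two. Hypothetical helper, for illustration only.
badram_mask() {
    size=$1
    # Reject zero, negatives, and anything that is not a power of two.
    if [ $(( $size )) -le 0 ] || [ $(( $size & ($size - 1) )) -ne 0 ]; then
        return 1
    fi
    # Clear the low-order bits that vary inside the block, keep the rest set.
    printf '0x%08x\n' $(( ~($size - 1) & 0xffffffff ))
}

# in_block ADDR BASE MASK: succeed if ADDR matches BASE on every 1 bit of MASK.
in_block() {
    [ $(( $1 & $3 )) -eq $(( $2 & $3 )) ]
}

mask=$(badram_mask 0x4000)     # 16K block
echo "$mask"                   # prints 0xffffc000

# An address inside the block matches, the first address past it does not.
in_block 0x7DDF2abc 0x7DDF0000 "$mask" && echo "0x7DDF2abc is mapped out"
in_block 0x7DDF4000 0x7DDF0000 "$mask" || echo "0x7DDF4000 is still usable"

# The resulting line for /etc/default/grub.
echo "GRUB_BADRAM=\"0x7DDF0000,$mask\""
```

If the bad addresses straddle an alignment boundary, one pair will not cover them exactly; you would need either a larger aligned block or several address,mask pairs.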
After setting this line in /etc/default/grub, I ran sudo update-grub and rebooted and my bad memory was no longer being used. No kernel patches are needed to map out bad memory using this method.
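To double-check that the setting made it into the generated configuration, you can look for a badram command in grub.cfg. This sketch assumes update-grub writes the setting out as a line beginning with "badram" in /boot/grub/grub.cfg, which is what I understand the GRUB2 scripts to do; grub_cfg_has_badram is a hypothetical helper name.

```shell
#!/bin/sh
# grub_cfg_has_badram FILE: succeed if FILE contains a GRUB "badram" command.
# Hypothetical helper, for illustration only.
grub_cfg_has_badram() {
    grep -Eq '^[[:space:]]*badram[[:space:]]' "$1"
}

cfg=/boot/grub/grub.cfg
# Reading grub.cfg may need root on some systems, so only check if readable.
if [ -r "$cfg" ]; then
    if grub_cfg_has_badram "$cfg"; then
        echo "badram entry found in $cfg"
    else
        echo "no badram entry in $cfg; re-run sudo update-grub"
    fi
fi
```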