A site for solving at least some of your technical problems...
Debian or Ubuntu and pivot_root and GRUB and hard drives... In general, I like Linux, but once in a while, it generates quite a few problems!
Yesterday, I ran apt-get upgrade to upgrade my system, which I hadn't done for the last 2 months or so. 460 packages were upgraded and many more left untouched. So far so good. The TeX installation failed because I said I wanted a group and the installer did not automatically create that group. I created the group and all was fine (though, looking at the logs, many X11 entries were half installed).
As it went on, I noticed a few kernel updates and stuff about GRUB so I thought, let's reboot to run the latest 100%. Yeah... unless it doesn't reboot, hey?!
First of all, GRUB has a part in its menu.lst that is automatically updated. That part is just not trustworthy: it fills in some defaults (at least that is very much what it feels like), and of course my system doesn't work well with defaults.
Just to mention, my HD is a copy of a RAID (yeah... I don't have my RAID on that computer anymore, it's for the company server now!) and somehow GRUB is able to tell that it was installed for RAID but not that it isn't anymore. So all the entries were back to root=/dev/md0.
Okay! So today I got the answer! When I need to install GRUB from a Linux Rescue, it uses the "df" command to determine what's what. The problem is that df just prints the mtab file in a formatted way. But the mtab file is NOT up to date!!! So, now that I know, I can tell you that to fix the boot you do this:
Note: On newer kernels (2.6.x), it looks like the drives are much more likely to be sda, sdb, ... I had a boot problem the other day because that changed and the boot drive could not be mounted since I was telling the system to look for hda2, and it did not know about hda at all! Try changing your /etc/fstab to use /dev/sda1, /dev/sda2, ..., /dev/sdb1, ... instead of hd. The s comes from the SCSI disk driver (sd), in case you were wondering: as drives move from parallel (IDE/ATA) to serial (SATA), they are handled through the SCSI layer and named accordingly.
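The /etc/fstab change described in the note can be sketched in Python. This is only an illustration: the function name rename_fstab_devices and its mapping argument are made up here, and which hdX becomes which sdX depends on your hardware, so the mapping has to be supplied by hand. Work on a copy of the file, not on the live one.

```python
import re

# Matches a device in the first column of an fstab line, e.g. /dev/hda2.
# Commented-out lines start with '#', so they never match.
DEVICE = re.compile(r"^/dev/(hd[a-z])(\d*)(?=\s)", re.MULTILINE)


def rename_fstab_devices(fstab_text, mapping):
    """Return fstab_text with old IDE names replaced by their new names.

    mapping is a hand-written dict such as {"hda": "sda"} (an assumption;
    the right pairs depend on the machine).
    """
    def swap(match):
        disk, partition = match.group(1), match.group(2)
        if disk not in mapping:
            return match.group(0)  # leave unknown disks untouched
        return "/dev/" + mapping[disk] + partition

    return DEVICE.sub(swap, fstab_text)


sample = (
    "/dev/hda2 / ext3 defaults 0 1\n"
    "# /dev/hda1 old entry\n"
    "/dev/hdc /media/cdrom iso9660 ro 0 0\n"
)
print(rename_fstab_devices(sample, {"hda": "sda"}))
```

Only the first column is rewritten, and disks that are missing from the mapping (the CD-ROM on hdc in the sample) are left alone.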
Yeah... well, I lost my 170GB RAID for a 200GB SATA. That's certainly a lot better: that SATA is 133MHz access whereas the RAID was at 33MHz. (This is because I have a CDROM, like most of us, which only takes 33MHz... and that drops one channel to 33MHz, and for a RAID you need everything the same, so all the IDE drives were at 33MHz... it seems to me that one of the drives should have been at least at 100MHz, but well...)
So, the SATA drive in my BIOS is recognized as the 1st drive. Practical when you say: Boot on drive C: (for systems which are brain dead in regard to booting). That's what I have. Once the system is booted, it looks at the IDEs first because their hardware connection comes first. That means my SATA drive, instead of being hda, is hde!
A lot better, the Live CD could boot into a more real console. That console could be used to check and edit the setup much more easily, and especially I could run GRUB after a chroot. Very cool. So I played around and around to try to figure out all of these things. Since the GRUB installation program had wiped out my previous setup, the problem was to find out what that was. (Now I made a backup of the menu.lst AND I have the setup outside the "auto generated" marks. Note that this is a special Debian feature, you don't have it in Red Hat [maybe in Fedora?])
The very first error was something like:
Ha! Of course, GRUB counts from 0 and not 1 like everybody else, so I lost a good hour figuring out that I had to put 9 (for partition 10). At first I was very much wondering which drive I was on, since GRUB would say that my SATA was hd1 when I'm in a booted shell. But it was hd0 anyway. Okay, so it was (hd0,9) for my SATA drive.
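The off-by-one is easy to capture in a few lines of Python. This is just a sketch of the numbering rule for this old-style menu.lst GRUB; the function name grub_legacy_device is invented for the example, and the disk index is passed in by hand because it follows the BIOS boot order and cannot be derived from the Linux name (hde was hd0 here).

```python
def grub_legacy_device(bios_disk_index, linux_partition_number):
    """Build the (hdX,Y) name used in menu.lst.

    bios_disk_index: 0 for the first BIOS disk (an input you must know).
    linux_partition_number: the number at the end of the Linux device
    name, e.g. 10 for /dev/hde10. Linux counts partitions from 1,
    GRUB counts them from 0.
    """
    if bios_disk_index < 0:
        raise ValueError("disk index starts at 0")
    if linux_partition_number < 1:
        raise ValueError("Linux partition numbers start at 1")
    return "(hd%d,%d)" % (bios_disk_index, linux_partition_number - 1)


# /dev/hde10 on the first BIOS disk
print(grub_legacy_device(0, 10))
```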
The file system is recognized or at least so I thought. Now it tries to read the input file and I get a kernel panic after the system says that it cannot find some file (pivot_root here, you could get some others too.)
pivot_root: No such file or directory
/sbin/init: 424: cannot open dev/console: No such file
Kernel panic: Attempted to kill init!
This one took me the rest of my day to figure out. I tried all sorts of things and the main one, I thought, would be to put root=/dev/blah to make sure that the correct device was used to do the root pivoting.
So I wrote root=/dev/hde. And that didn't work either. Now the error was invalid file system. On to the Ubuntu shell, run fsck on all the partitions: not even 1 error, of course, since I had shut down cleanly. So?! What was wrong? Very simple: you need to specify the exact partition and not just a device. So all I had to do here was to add 10 at the end: root=/dev/hde10. And it worked, finally!
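A check like the following would have caught this mistake before rebooting. It is a rough sketch, not something GRUB or the kernel provides: the function name root_argument_problem is made up, and the pattern only knows about hdX and sdX names, so anything else (such as /dev/md0) is assumed to be fine.

```python
import re

# A whole IDE or SCSI/SATA disk: /dev/hde, /dev/sda, ... with no digits.
WHOLE_DISK = re.compile(r"^/dev/(hd|sd)[a-z]+$")


def root_argument_problem(kernel_cmdline):
    """Return a message if root= names a whole disk, else None."""
    for word in kernel_cmdline.split():
        if word.startswith("root="):
            device = word[len("root="):]
            if WHOLE_DISK.match(device):
                return device + " is a whole disk; add the partition number"
            return None
    return "no root= argument found"


print(root_argument_problem("ro root=/dev/hde"))    # reports the problem
print(root_argument_problem("ro root=/dev/hde10"))  # None
```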
Okay, this last one is certainly my fault! I could have paid a little more attention. I'm just thinking that it would be neat if software like GRUB could know that the specified device was wrong and explain what's wrong instead of the errors we usually get... Maybe one day it will be like that.
Argh! I tried today to go back to a RAID system since I got new drives (2x 300GB at $99 each in April 2006!). And I get similar errors, but the other way around. More or less, I have to boot on /dev/hde5 (i.e. I have to tell the kernel & initrd that the root is on hde5) even though the RAID is on /dev/md5.
cramfs: wrong magic
Which means it's trying to read the wrong stuff. The problem is it doesn't tell you what it's trying to read. So I dunno what to do next to fix that part. Now it looks like it boots just fine. The RAID is up and I can play around and the RAID seems to be fully functional.
It goes on... 8-) This is not really a booting issue, it's about starting X-Windows. With all the changes I have made, somehow the /tmp directory got its permissions changed from the usual drwxrwxrwt to drwxr-xr-x. I had to do a chmod 1777 /tmp before my X11 session would start. I'm wondering whether something else got messed up like that!
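To see how the octal mode relates to the ls-style string, here is a small Python sketch using the standard stat module. The helper name tmp_mode_is_ok is invented for this example; the expected value 1777 is the one from the chmod command above.

```python
import stat


def tmp_mode_is_ok(mode):
    """True if the permission bits are 1777 (rwx for all + sticky bit)."""
    return stat.S_IMODE(mode) == 0o1777


broken = stat.S_IFDIR | 0o755    # what /tmp had become
fixed = stat.S_IFDIR | 0o1777    # after chmod 1777 /tmp

print(stat.filemode(broken), tmp_mode_is_ok(broken))
print(stat.filemode(fixed), tmp_mode_is_ok(fixed))

# On a real system the mode would come from: os.stat("/tmp").st_mode
```

The trailing t in the second string is the sticky bit: everybody can create files in /tmp, but only the owner of a file can delete it.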
All software makes use of numbers. Everything is a number. The most basic number in a computer is 0 or 1. This is called a bit. These are represented with electricity. Although in most cases we see it as 0 - Ground and 1 - Voltage (e.g. 1 volt), the bit representation in software and in hardware may be interpreted either way (i.e. a 0 could mean that the voltage is 1V and not 0V).
Combining these zeroes and ones lets us handle much larger numbers. With 8 bits, you can have numbers from 0 to 255 (unsigned) or -128 to +127 (signed). Nowadays, computers can handle a much larger number of bits in one cycle. Most processors use 64 bits but they can calculate on 128, 256, and for some 1024 bits at once. Also with parallelism, the size can be viewed as even larger (e.g. handling a 64 bit number in each of 1,536 threads, like on my old NVIDIA Quadro 600, is equivalent to one large number of 98,304 bits! That would be 2 to the power 98,304 possibilities, or about 2.8359e+29592 in decimal).
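The ranges and the big number quoted above can be checked with a few lines of Python. The function names are made up for this sketch; the signed range assumes the usual two's complement layout, and the decimal size is estimated with a logarithm instead of printing all the digits.

```python
import math


def unsigned_range(bits):
    """Smallest and largest unsigned value that fit in `bits` bits."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Smallest and largest two's complement value on `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def power_of_two_in_decimal(bits):
    """Return (mantissa, exponent) with 2**bits ~ mantissa * 10**exponent."""
    log10_value = bits * math.log10(2)
    exponent = math.floor(log10_value)
    return 10 ** (log10_value - exponent), exponent


print(unsigned_range(8), signed_range(8))
total_bits = 64 * 1536
mantissa, exponent = power_of_two_in_decimal(total_bits)
print(total_bits, "bits: about %.4fe+%d" % (mantissa, exponent))
```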
Integers are easy to handle. When working on math problems you generally see the set of available numbers as equivalent to Z, although mathematicians know that computers can really only handle a limited set of numbers. For example, on a 64 bit computer, the usual range is -9223372036854775808 to 9223372036854775807. This is generally enough, although at times some equations have to be reworked to avoid really large or small intermediate numbers that work fine in math equations, but not so well on computers.
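As an illustration of reworking an equation to avoid a large intermediate value, take the midpoint of two numbers. Python integers never overflow, so this sketch simulates a 64 bit register with a made-up helper, wrap_int64, that assumes two's complement wraparound (some languages behave differently on overflow). The reworked formula assumes 0 <= a <= b.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def wrap_int64(value):
    """Simulate what is left of `value` in a signed 64 bit register."""
    return (value - INT64_MIN) % 2 ** 64 + INT64_MIN


def naive_midpoint(a, b):
    # (a + b) / 2 as written in math: the sum may not fit in 64 bits.
    return wrap_int64(a + b) // 2


def safe_midpoint(a, b):
    # Same value, reworked so no intermediate result exceeds b.
    return a + (b - a) // 2


a, b = INT64_MAX - 10, INT64_MAX
print(naive_midpoint(a, b))  # wrong: the sum wrapped around
print(safe_midpoint(a, b))   # INT64_MAX - 5
```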
Now, math also includes other sets of numbers such as D, R, and C. Computers do not offer any way to represent all the numbers in R or C, but they can offer D to some extent. These numbers are called floating point numbers because they are stored with an exponent. The exponent makes the decimal point "float" to whatever position the exponent's value indicates. Using a 64 bit floating point number, you can have positive and negative numbers with magnitudes ranging between about 10^-308 and 10^+308. This includes a positive zero (+0) and a negative zero (-0), which is important in a few cases (although +0 = -0 is true, you can get the sign of a number and distinguish both zeroes). Note that at first integers were also going to have a positive and a negative zero, but it was instead decided to have one more negative number (remember, with 8 bits we have signed numbers from -128 to +127; this is because the non-negative half includes 0, which the negative half does not).
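The two zeroes and the 64 bit limits can be observed directly in Python, whose float type is a 64 bit floating point number on common platforms. The helper is_negative_zero is a name invented for this sketch.

```python
import math
import sys


def is_negative_zero(x):
    """True only for -0.0: equal to zero, but with the sign bit set."""
    return x == 0.0 and math.copysign(1.0, x) < 0


print(0.0 == -0.0)             # True: the two zeroes compare equal
print(is_negative_zero(-0.0))  # True
print(is_negative_zero(0.0))   # False

# Largest finite value and smallest normal positive value
print(sys.float_info.max, sys.float_info.min)
```

math.copysign is the trick here: comparing with == cannot tell the two zeroes apart, but copying the sign onto 1.0 can.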