How Linux starts up.


Sorry, this page isn't ready yet; I have a fair amount of research to do before I can finalize it. That said, it is rather close to the mark, and probably usable for basic concepts.

When sufficient current is sent through the coil of a magnetic recording head, the magnetism at the pole pieces reorients magnetizable crystals in the magnetic media. The pole pieces amount to a gap in a band of ferrous metal, at the point where the head meets the media. This gap concentrates the magnetic flux onto the smallest stripe of media the head design engineers can come up with while still meeting other vital criteria, things like frequency response (AKA bit rate) and the like. If there is to be any possibility of "playback" at a later point in time, direct current is not useful. Aside from the fact that you can't impress information on pure DC, a coil bathed in a static magnetic field, even a strong one, never generates any current. No, it is the alternation of magnetism that makes it possible to generate electrical current in a coil of wire. If the alternation happens to carry information, then on playback a tiny fraction of that energy is regenerated. If you're curious where conservation of energy comes into all of this, it's the mechanical energy that rotates the disc past the head that provides the power to regenerate the signal during playback. In the drawing below I show what that looks like.

Above is an enlarged view of, generally, what those recorded stripes look like, with north / south pole markings.

In the photo below, in the upper left hand corner, is a magnified closeup of some ferrous particles clinging to magnetized regions of a recorded data disk. Notice the stripes of particles revealing head gap shape and orientation.

The radial spoked pattern is an artifact of the sectored recording technique of a formatted data disk.

In the preceding two illustrations very old technology was used. Today nano sized bar magnets are used instead, whose ends are painted two different colors, red for north pole and green for south pole, for example. Millions of these low strength magnets are then suspended in a transparent solution, usually some low viscosity, very dense, pH neutral oil that approximates the density of the magnets themselves. If a hard drive platter were carefully removed from the hard drive and placed on the spindle of a special machine designed to optically read the color of the ends of these nano magnets, at extreme magnification, with a computer controlled camera to see them with and appropriate software to interpret what is being seen, it is possible to recover the "lost" data optically. Even more interesting, especially to various three letter government agencies throughout the world, is the fact that as drives mechanically age, the alignment to track position shifts a tiny amount. This barely perceptible shift often permits recovery of data that was overwritten many, many times, long ago, by a special type of optical examination that allows the microscope to be tilted while the angle of the gentle wash of the suspension oil is changed to optimize the viewing angle. With several such views, combined with a CAT (Computer Aided Tomographic) algorithm, an old set of long ago erased / overwritten data can be retrieved; and given that a CRC (Cyclic Redundancy Check) is routinely written on the disc platter itself, the program that is attempting to read this old data can verify that it is correct! So what does all of this mean to you? Just this: if you have a secret that you absolutely MUST keep, don't ever allow it to be written to the hard drive in the clear. You simply cannot ever ensure that it will be erased completely. When in doubt, encrypt; these agencies are devilishly good at recovering this sort of thing.
They also have a proven track record of not being very good at breaking modern codes. Couple that with steganography and they even have difficulty determining if you have given them a "useful" key; e.g. some keys may decrypt pointless, irrelevant information buried in an encrypted file, whereas other keys will decrypt sensitive information from the same file. Since it is a well known technique to add meaningless noise to these kinds of files, an interrogator faced with such a file can't know for certain if the data bears more than one message. Ultimately it all comes down to how well you can take the interrogator's cattle prod, how far the laws in your country are willing to go to get you to reveal the key, and how desperate they are that they may be willing to side step those laws. The safest bet is to never trust the innocent looking machine on your desktop with anything sensitive. Remember too that from time to time memory that is virtualized may be sent temporarily to the swap partition, so if you do sensitive work, you may want to run with swap turned off. Second to that is looping swap through a cryptographic tool known for security; in that case you don't even have the key, it is pulled out of thin air when the machine is powered up, and vanishes upon power down.

Assume, as you look at the top of the drawing below, that the front edge of that series of four platters is where the read head stack is, and the platter stack is rotating counter clockwise. The top platter, which is associated with "Head Zero", is just about to put the Master Boot Record under head zero. Directly beneath it is the platter that is associated with head one and the Secondary Boot Record.

Because the startup has a series of rules that change with each stage of the booting process, I have chosen to start this description at a point before the re-boot, a point when something has been done that necessitates a change in boot sequence. Usually such a circumstance is the result of adding a new operating system, changing a kernel, or altering Lilo in some way. So with the change made and /etc/lilo.conf edited, and after a test run, you are ready to do it for real: you run lilo and re-boot. This part of my site is about what goes on here. At this point, the moment you run lilo, everything about where things are stored on the hard drive is known to the system. This can be insanely complicated; modern drives have multiple modes to operate in, with several differing Cylinder Head Sector geometries available to them, but all you have to do is run lilo. Lucky for you, Linux knows the rest.
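To give a feel for what one of those geometries means in practice, here is a minimal Python sketch of the textbook conversion between a Cylinder:Head:Sector triple and a flat block number. The function names and the 16 head, 63 sector geometry used in the example are my own assumptions for illustration; a real drive reports its own geometry, and this is not code taken from Lilo.

```python
def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Convert a C:H:S address to a flat block number.

    Cylinders and heads count from 0, sectors count from 1.
    """
    if cylinder < 0 or not 0 <= head < heads or not 1 <= sector <= sectors_per_track:
        raise ValueError("address does not fit the given geometry")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a flat block number back to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("block number must not be negative")
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    # Hypothetical geometry: 16 heads, 63 sectors per track.
    print(chs_to_lba(0, 0, 1, 16, 63))   # the very first sector, where the MBR lives
    print(lba_to_chs(1008, 16, 63))
```

The same sector gets a different C:H:S triple under a different geometry, which is part of why it is a relief to let lilo and the running Kernel sort the numbers out.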

At cold boot time the whole process is hanging by a thread the size of the MBR. Think of it: Linux, and in multiboot systems other operating systems, are started running by that same 512 bytes of machine executable binary, and a good portion of that is reserved for the Partition Table. I submit to you it's a wonder it works at all! Part of the reason it does work is that they cheat: when Lilo built the MBR, it took advantage of the running Linux, grilling it into revealing its innermost secrets of where, in terms of C:H:S numbers, the various essential files are on the disc drive, numbers that a BIOS could use to get the Linux that was already running started up again. Another reason is that Lilo knows the machine comes up very stupid, but in reality the only thing that the MBR really needs to do is load up about five Kbytes that contain a much fancier bootstrap loader. Linux gives Lilo the location of the boot loader's data, /boot/boot.b, in C:H:S (Cylinder, Head, and Sector) numbers or, if the drive is running in LBA mode, the Logical Block Address of that file's data, so Lilo can build an MBR capable of fetching the boot.b data without knowing anything about filesystems and the like. I've just given you all you need to know to write your own bootstrap loader, except for one little detail that you may or may not have caught onto on your own. The README that follows keeps referring to BIOS int 0x13 calls, the BIOS disk services. For hard drives these calls date back to the earliest PCs that had hard drive extension ROMs soldered into the hard drive controller card. Before that you had to boot your hard drive up with a floppy diskette, or an audio cassette recorder, jacked into the data port of an IBM PC! The CPU was an Intel i8088. Now I realize that Linux won't run on anything less than an i386, but the BIOS ROM is a different matter. It, for reasons of backward compatibility, will boot up and run MS-DOS, which is an i8088 based operating system.
At the moment you switch on "Protected Mode" all of that i8088 based code in BIOS ROM is useless; it's written for an i8088, and the machine is in i386, i486, or Pentium mode now, so say good bye to BIOS calls. This in a large way held back the adoption of "Protected mode" programming, but what it really meant was that you had to duplicate all those BIOS calls in the protected mode CPU instruction set before you hit the switch, especially with Linux, because the security ethos of Linux with respect to protected mode is to have no way back to real mode once protected mode is turned on. Well, the unwritten text here, the silent story, is that the MBR is by definition composed of i8088 / i8086 code. Ditto for the Lilo boot loader and, surprise, ditto for the first part of a modern Linux Kernel. Such a Kernel, once loaded into memory, executes enough i8088 instructions to set up the protected mode memory model and get protected mode turned on. Once that's done, an in-memory decompression routine uncompresses the rest of the Kernel memory image into the now waiting vast 32 bit addressed memory pool. That hunk of Kernel code holds, among other things, its own protected mode replacements for the BIOS services; the old BIOS is no longer accessible, and a running Kernel offers no way to switch back to real mode. You're probably wondering how DosEMU, a platform capable of running a full i8088 based MS-DOS implementation, manages to work. The answer is that it is a faithful emulation of an i8088 running in, you guessed it, protected mode. Once the Kernel is started, there is no returning to real mode. OK, here's that excerpt of the README file I've been telling you about, found in the Lilo source code tarball. Pay close attention to the explanation leading up to the phrase...

* LILO does not know how to read a file system. Instead, the map installer asks the kernel for the physical location of files (e.g. the kernel image(s)) and records that information. This allows LILO to work with most file systems that are supported by Linux.
...about a third of the way through the excerpt shown below, and highlighted in bold print to make it easier to locate.
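For the curious, "asking the kernel for the physical location of files" can be sketched from user space in Python. This is only an illustration under my own assumptions: it uses the Linux FIBMAP and FIGETBSZ ioctls (request numbers 1 and 2 in linux/fs.h), which as far as I know is the kind of request a map installer makes, it normally needs root, and not every filesystem supports it. The function name and the injectable ioctl parameter are hypothetical conveniences, not anything from the Lilo source.

```python
import fcntl
import os
import struct

FIBMAP = 1     # ioctl: translate a file's logical block number to a device block number
FIGETBSZ = 2   # ioctl: report the filesystem block size in bytes


def file_block_map(fd, ioctl=fcntl.ioctl):
    """Return (block_size, [device block number for each block of the file]).

    The ioctl parameter exists only so the function can be exercised
    without root by passing in a stand-in.
    """
    reply = ioctl(fd, FIGETBSZ, struct.pack("i", 0))
    block_size = struct.unpack("i", reply)[0]
    size = os.fstat(fd).st_size
    block_count = (size + block_size - 1) // block_size
    blocks = []
    for logical in range(block_count):
        reply = ioctl(fd, FIBMAP, struct.pack("i", logical))
        blocks.append(struct.unpack("i", reply)[0])
    return block_size, blocks
```

Run as root against a file such as a kernel image, the returned list is exactly the sort of information that lets a boot loader fetch the file later without understanding the filesystem it lives on.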

It is here that I want to point out how Micro$oft envisioned the bootstrap while allowing for third party developers to start their operating systems, and to point out one key difference between their system and Linux. They envisioned four physical partitions whose whereabouts are known to the MBR and written to a "Partition Table" within the MBR, actually from the 447th byte to the end of the MBR, with the exception of the two ID bytes at the very end (0x55, 0xAA). If you look at those two bytes in binary, a curious pattern emerges: 0x55=01010101 and 0xAA=10101010, strictly alternating ones and zeros, with each byte the bitwise inverse of the other. This pattern is believed to be the least likely to randomly appear when uninitialized RAM (Random Access Memory) is first switched on, so it became a standard to signify that something meaningful has very deliberately been placed in that memory, as opposed to merely containing the random ones and zeros memory might have when it was first switched on. So why on earth would you use this pattern on a disk drive... Hmm? I suspect the general idea simply got grandfathered in over time, and its use was extended to anything that needs an ID of this sort. In the "Partition Table" itself there are pointers (address numbers) that point to the locations of up to four partitions out on the disk drive, and as far as Micro$oft is concerned, that's all anyone ever needs to get their system booted up, because in Micro$oft's view you only need to read the first sector of any bootable partition, "The Boot Sector", place it in memory location 0x7C00 and jump to the first byte, and that operating system or specialized application program will boot up, simple as that. Linux breaks these rules: unless you specifically tell Lilo-Linux to place a boot loader at the beginning of a Linux partition, it doesn't bother to do so; normally it doesn't use it anyway.
A Lilo MBR has all that is necessary, because it holds the C:H:S numbers, or LBA address, of the place where boot.b's data is. In contrast, boot.b is far more intelligent: it knows how to load the map file, and from there it can get the C:H:S/LBA addresses of all the components it needs, including the kernel, but NOT the modules.
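The layout just described is simple enough to pick apart by hand. Below is a minimal Python sketch that checks the two ID bytes and decodes the four 16 byte Partition Table entries starting at offset 446 (the 447th byte). The function names, the dictionary keys, and the demo sector built by build_demo_mbr are all made up for this example; only the offsets and the standard entry layout are the real thing.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446       # the 447th byte, counting from one
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"  # the two ID bytes at offsets 510 and 511


def parse_mbr(sector):
    """Return the non-empty Partition Table entries of a 512 byte MBR."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512 byte sector")
    if sector[510:512] != SIGNATURE:
        raise ValueError("ID bytes 0x55 0xAA not found")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status, first C:H:S, type, last C:H:S, starting LBA, sector count
        status, _chs_first, ptype, _chs_last, lba_start, count = struct.unpack(
            "<B3sB3sII", raw)
        if ptype == 0:
            continue  # unused slot
        partitions.append({"bootable": status == 0x80, "type": ptype,
                           "lba_start": lba_start, "sectors": count})
    return partitions


def build_demo_mbr():
    """Build a fake MBR with one bootable Linux (type 0x83) partition."""
    sector = bytearray(SECTOR_SIZE)
    sector[446:462] = struct.pack("<B3sB3sII", 0x80, b"\x01\x01\x00",
                                  0x83, b"\xfe\xff\xff", 63, 2048)
    sector[510:512] = SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    print(parse_mbr(build_demo_mbr()))
```

To look at a real one you would read the first 512 bytes of the raw disk device as root and hand them to parse_mbr; the device name depends on your system, so I leave it out.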

Moving right along... In the next illustration, the top half of the figure below, I detail the first half of the sequence of operations of building a bootstrap from a running Linux operating system, through shutdown, and subsequent restart, Operating System selection load, and execute.

Before Shutdown:

Either Linux has just been installed, or a Kernel has been changed:

Usually the /etc/lilo.conf file has been edited, or in the case of a fresh install lilo.conf was freshly created by the installer itself; either way, the next task is to run lilo. On the right hand side below you see a heading, "These tracks written by Lilo", with arrows pointing from files found in the /boot directory to places on the beginning sectors of the drive. To put a finer point on it, the MBR is written with some special code embedded in it to enable it to locate boot.b and place a copy of those bytes into a series of sequential memory locations suitable for execution. The whereabouts, in C:H:S/LBA disc address notation, of the Kernel (vmlinuz and initrd) and several other files in the /boot directory, depending on what directives are in lilo.conf, are placed in the map file, and the whereabouts in C:H:S/LBA notation of the map file itself are "Hard Coded" into the boot.b (boot loader) program.

Shutdown/reboot:

Now all the careful preparation pays off:

I now direct your attention to the left hand side below, where you see a heading depicting the code in the BIOS ROM. When it reaches the stage where the hard drive is called upon to boot the system, the ROM only has to do one simple thing: read in the MBR, copy that data to sequential memory locations beginning at 0x7C00 for 512 bytes (the size of one sector), and then get the CPU to perform a jump to location 0x7C00 to execute the code found there.

Now boot.b wakes up:

Those actions now load in the boot.b file, although none of the code executing so far knows anything about files, or subdirectories, or anything like that. These CPU instructions are simply, and mindlessly, going through the motions of picking up sectors of data, plugging that data into known standard places in memory, and then jumping to that place, simple as that. boot.b is a little different: it has embedded within its executable code the C:H:S/LBA address of the map file, and the map file has the C:H:S/LBA addresses of the Kernel and all the other pieces necessary to start up the Kernel. The Kernel does understand filesystems, so from there things take on a more intelligent, and streamlined, approach.
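The chain of "fetch these sectors, then follow the addresses found inside them" can be mimicked with a toy disk image in Python. Everything about the layout here is invented for the demonstration (a count byte followed by one byte sector numbers, and the particular sector numbers chosen); the real MBR, boot.b and map file formats are different. The only point is that nothing in the chain ever needs to know what a file or a directory is.

```python
SECTOR = 512


def read_sectors(disk, addresses):
    """Fetch raw sectors by number and glue them together; no filesystem involved."""
    return b"".join(bytes(disk[a * SECTOR:(a + 1) * SECTOR]) for a in addresses)


def build_toy_disk():
    """Create a 64 sector toy disk laid out in the made-up format described above."""
    disk = bytearray(64 * SECTOR)

    def put(sector, data):
        disk[sector * SECTOR:sector * SECTOR + len(data)] = data

    put(0, bytes([2, 10, 11]))          # "MBR": boot.b occupies sectors 10 and 11
    disk[510:512] = b"\x55\xaa"         # ID bytes at the end of sector 0
    put(10, bytes([20]) + b"BOOT.B")    # "boot.b": its first byte names the map sector
    put(20, bytes([3, 30, 31, 33]))     # "map": the kernel lives in sectors 30, 31, 33
    put(30, b"KERNEL-A")
    put(31, b"KERNEL-B")
    put(33, b"KERNEL-C")                # deliberately not contiguous
    return disk


def toy_boot(disk):
    """Follow MBR -> boot.b -> map -> kernel and return the kernel bytes."""
    mbr = read_sectors(disk, [0])
    if mbr[510:512] != b"\x55\xaa":
        raise ValueError("no boot signature")
    boot_b = read_sectors(disk, mbr[1:1 + mbr[0]])
    mapping = read_sectors(disk, [boot_b[0]])
    return read_sectors(disk, mapping[1:1 + mapping[0]])


if __name__ == "__main__":
    kernel = toy_boot(build_toy_disk())
    print(len(kernel), kernel[:8])
```

Notice that the toy kernel is scattered over sectors that are not next to each other; a list of addresses handles that without any trouble, which is why a map of sector addresses is enough to stand in for a filesystem at this stage.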

In the bottom half of the drawing above, because the monolithic component of the Kernel has gotten so big, it has now become necessary to load even the monolithic component of the Kernel in two pieces. The first piece has an i8088 based front end followed by a decompressor: after protected mode has been switched on and a flat memory model is in place, the vmlinuz part of the Kernel is decompressed into memory above the one megabyte limit that an i8088 CPU can access. Next the initrd part of the monolithic component of the Kernel is loaded, and by now I suspect rudimentary filesystems are understood by the Kernel thus far loaded, as it widens its task to include starting up Linux. At some point it begins to set up the Kernel modules; these are modular components of the Linux Kernel, as differentiated from the monolithic component of the Kernel we've already loaded. The modular portion can be literally anywhere that a carefully designed monolithic component of a Kernel can reach. For example, if you placed your Kernel modules on a USB thumb drive, but one of the modules the Kernel needed, usb_storage.o, was itself stored on that thumb drive, your Kernel has no hope whatever of reaching any of its modules. This is the kind of thing you learn in Kernel config design; you can very easily paint yourself into the proverbial corner. So there are limits as to where the modules can be placed, as well as practical considerations. It is possible to place your Kernel's modules out on the Internet; I will leave it up to you to ponder the wisdom, or lack thereof, in entertaining such a design feature :-)

Looking at the bottom right of the above pictorial you'll see an illustration of a Kernel represented as a system of many operations that programs call via a function dispatch table. In my illustration I show calls like "This", "That" and "Other" linking through a series of function dispatch hooks that are vectored off to an "Answering Machine". There are many ways to treat modules, but here is a simple one: what if the "Answering Machine" did a little more than just answer a phone? What if, as in the third hook from the left, the "Answering Machine" ran a small program to load the indicated module from the module "cloud" to the left and, if memory resources were tight, disconnected a lesser used module in the process? From the user's point of view, other than speed, a barely noticeable pause while the module gets loaded, the Kernel appears to have nearly limitless functionality!
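The "Answering Machine" idea can be sketched in a few lines of Python. This is a toy model of the picture, not how the Linux Kernel actually loads modules: the class name, the max_loaded limit, and the "cloud" of loader functions are all hypothetical, and the least recently used module is the one dropped when room is needed.

```python
class ToyKernel:
    """Dispatch calls by name, loading modules on demand from a 'cloud'."""

    def __init__(self, module_cloud, max_loaded=2):
        self.module_cloud = module_cloud  # name -> function that "loads" the module
        self.max_loaded = max_loaded      # how many modules fit in memory at once
        self.loaded = {}                  # dict order doubles as least-recently-used order
        self.load_count = 0               # how often the slow path had to run

    def call(self, name, *args):
        if name in self.loaded:
            # Fast path: already resident, just mark it as recently used.
            handler = self.loaded.pop(name)
        else:
            # The answering machine: make room if needed, then load the module.
            if len(self.loaded) >= self.max_loaded:
                least_used = next(iter(self.loaded))
                del self.loaded[least_used]
            handler = self.module_cloud[name]()
            self.load_count += 1
        self.loaded[name] = handler       # most recently used goes to the end
        return handler(*args)


if __name__ == "__main__":
    cloud = {
        "this": lambda: (lambda x: x + 1),
        "that": lambda: (lambda x: x * 2),
        "other": lambda: (lambda x: -x),
    }
    kernel = ToyKernel(cloud, max_loaded=2)
    print(kernel.call("this", 1), kernel.call("that", 3), kernel.call("other", 4))
    print(list(kernel.loaded), kernel.load_count)
```

The caller never sees whether a module was already resident or had to be fetched; the only difference is the pause on the slow path.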


The large print Giveth, and the small print Taketh away
CopyLeft License Copyright © 2000 Jim Phillips
Know then: You have certain rights to the source data, and distribution there of, under a CopyLeft License