Introduction

This document attempts to give a user-oriented overview of the architecture and functioning of an Ubuntu system. It does not provide instructions for accomplishing specific tasks, but it should give you enough knowledge to find your own way around the basic components of the system.

The Boot Process

When your computer is powered on, it starts executing a program that is found in a persistent memory storage (for instance a ROM chip). This booting firmware (which, on x86 machines, is called the BIOS) can initialize some subsystems that are needed for the proper functioning of your computer; in particular, it has the ability to read data from a mass storage device, such as a hard drive.

Thanks to this ability, the boot firmware can load program code from the hard drive into main storage (RAM), and instruct the system processor (CPU) to pass control to it and execute it.

There are conventions as to where the firmware is expected to find this piece of code: on x86 machines, for example, it must reside in the drive's master boot record (MBR), which is the first 512-byte sector of the drive. The program that is found there is usually a first-stage bootloader.
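To make the 512-byte convention concrete, here is a minimal Python sketch that checks whether a sector looks like a bootable MBR. It relies on details the text above does not state: the two-byte boot signature (0x55, 0xAA) at the end of the sector, and the partition table starting at byte 446 in the classic layout. The function name and the in-memory test sector are made up for illustration; no real disk is read.

```python
SECTOR_SIZE = 512              # size of the master boot record, in bytes
BOOT_SIGNATURE = b"\x55\xaa"   # marker stored in the last two bytes of the sector
PARTITION_TABLE_OFFSET = 446   # classic layout: bootloader code must fit before this


def is_bootable_mbr(sector: bytes) -> bool:
    """Return True if the sector is 512 bytes long and ends with the boot signature."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


# Build a fake sector in memory instead of reading a real drive.
fake_sector = bytes(510) + BOOT_SIGNATURE
print(is_bootable_mbr(fake_sector))
print("bytes available for first-stage code:", PARTITION_TABLE_OFFSET)
```

The small amount of room left for code is the reason a first-stage bootloader can do little more than locate and load something larger.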

This bootloader needs to know where to find, and how to start, a proper operating system, or at least a second-stage bootloader. On Ubuntu, the standard first-stage bootloader is a program called GRUB, which can start a few operating systems (Linux among them) directly, without calling other intermediate bootloaders.

GRUB (GRand Unified Bootloader) is in reality quite a complex program, and it cannot fit into the 512 bytes provided by the boot sector. Instead, it has its own second-stage loader, which resides on an ordinary partition; in fact, the first-stage part of GRUB includes mini-drivers for a few standard filesystems, allowing it to read its second-stage code and configuration data as ordinary files stored in the operating system's standard directory structure. The GRUB configuration file is, on Ubuntu, located in /boot/grub/menu.lst.
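As an illustration of what such a configuration file contains, the following Python sketch parses a small menu.lst-style excerpt into one dictionary per boot entry. The excerpt is hypothetical (device names and kernel options are invented), and the parser is a simplification that only understands "title" blocks; it is not how GRUB itself reads the file.

```python
# Hypothetical excerpt in the style of /boot/grub/menu.lst; not from a real system.
SAMPLE_MENU_LST = """\
default 0
timeout 3

title  Ubuntu
root   (hd0,0)
kernel /vmlinuz root=/dev/hda1 ro quiet
initrd /initrd.img

title  Ubuntu (recovery mode)
root   (hd0,0)
kernel /vmlinuz root=/dev/hda1 ro single
initrd /initrd.img
"""


def parse_menu_entries(text: str) -> list[dict[str, str]]:
    """Collect the commands of each 'title' block into a dictionary."""
    entries: list[dict[str, str]] = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        command = parts[0]
        argument = parts[1] if len(parts) > 1 else ""
        if command == "title":
            current = {"title": argument}
            entries.append(current)
        elif current is not None:
            current[command] = argument  # commands before the first title are ignored
    return entries


for entry in parse_menu_entries(SAMPLE_MENU_LST):
    print(entry["title"], "->", entry["kernel"])
```

Each entry names a kernel file and, optionally, an initial RAM disk image, which is the information the second stage needs in order to start the system.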

When GRUB's second stage has been loaded, it proceeds to load an operating system (or present a choice of systems to the user) according to the contents of its configuration file. After this, control passes to the Linux kernel, which resides in the file pointed to by the /vmlinuz symbolic link.
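A symbolic link such as /vmlinuz is simply a small file that names another file. The Python sketch below shows how a link can be followed to its target; it works on a throwaway directory with an invented file name (vmlinuz-example) so that it does not depend on the layout of any particular system.

```python
import os
import tempfile


def resolve_link(path: str) -> str:
    """Return the final target of a symbolic link (or the path itself if it is not one)."""
    return os.path.realpath(path)


# Recreate the idea in a temporary directory: a "vmlinuz" link pointing at a kernel file.
with tempfile.TemporaryDirectory() as tmp:
    kernel = os.path.join(tmp, "vmlinuz-example")   # hypothetical kernel image name
    link = os.path.join(tmp, "vmlinuz")
    open(kernel, "w").close()
    os.symlink("vmlinuz-example", link)
    print(os.path.islink(link), os.path.basename(resolve_link(link)))
```

On an installed system, calling resolve_link("/vmlinuz") would show which kernel image the link currently points to, assuming the link exists there.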

The Linux System

Linux is what is normally called a "kernel" in computing; its purpose is to supervise the functioning of the system, by giving and taking control of the CPU to and from user programs, and regulating access to the subsystems (main storage, mass storage, various peripherals) that programs can access.

Any modern CPU has a distinction between unprivileged and privileged modes of operation. Code that runs in a privileged mode is a supervisor that has the ability to make unprivileged code execute, while still restricting its possible behaviors: for instance, it can decide which parts of the main storage are accessible to the unprivileged program, instructing a part of the CPU (the MMU) to block any attempt to reach "forbidden" memory locations and to return control to the supervisor as soon as such an attempt is made.

The Linux kernel is the part of the Ubuntu system that runs in privileged mode on the CPU.

Linux is considered a monolithic kernel: this means that, rather than just supplying the very basic functions that allow independent programs (processes or tasks) to operate and coordinate with one another, it also directly provides many higher-level abstractions, such as filing systems, networking layers or device drivers.

The many parts that compose the Linux kernel have been subdivided into modules, which can reside either inside the Linux core kernel image (the file pointed to by /vmlinuz), or as separate files in the /lib/modules/ directory. In this latter case, the kernel is able to load and unload modules dynamically, so that parts of the kernel are only present in the system's main storage when they are actually required.
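The set of modules that are currently loaded can be inspected from userland. The sketch below assumes the /proc/modules file, which this document does not otherwise mention: each of its lines gives a module name, its size, a use count and the list of modules that depend on it. The sample text is hypothetical; on a live system you would pass the contents of /proc/modules instead.

```python
# Hypothetical sample in the format of /proc/modules (sizes and addresses invented).
SAMPLE_PROC_MODULES = """\
ext3 133768 1 - Live 0xf8a5e000
jbd 59816 1 ext3, Live 0xf8a4e000
snd_pcm 79876 0 - Live 0xf8b12000
"""


def parse_modules(text: str) -> dict[str, dict]:
    """Map each loaded module name to its size, use count and dependent modules."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed module line
        name, size, use_count, users = fields[:4]
        # The fourth field is "-" or a comma-terminated list of modules using this one.
        used_by = [] if users == "-" else [u for u in users.split(",") if u]
        modules[name] = {
            "size": int(size),
            "use_count": int(use_count),
            "used_by": used_by,
        }
    return modules


for name, info in parse_modules(SAMPLE_PROC_MODULES).items():
    print(name, info["size"], info["used_by"])
```

A module with a non-empty "used_by" list cannot be unloaded until the modules that depend on it have been unloaded first.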

The only Linux modules that must reside inside the core image are those needed during the boot process; specifically, the filesystem handlers are required to be available, in order for the kernel to be able to load further modules from the disk.

In Ubuntu, since the user is able to select among a number of different filesystems for the boot partition, the core image only contains what is needed to parse a small filesystem image, located in the file linked to by /initrd.img, which is read into RAM and contains the filesystem handlers and other core modules.

Linux publishes system calls to applications to allow them to make use of the services it provides. Device drivers, in particular, are typically accessed via filesystem calls, and the peripherals they control are represented by the special files that are found in the /dev/ directory: they can be read from and written to like other files, although in addition to that they normally also offer means to set and get device parameters.
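The following Python sketch illustrates the "devices are files" idea using two device files that are assumed to exist on any Linux system, /dev/urandom and /dev/null (neither is named in the text above). They are opened, read and written with the same calls used for ordinary files; the helper function name is invented for this example.

```python
import os
import stat


def is_character_device(path: str) -> bool:
    """True if the path names a character device special file."""
    return stat.S_ISCHR(os.stat(path).st_mode)


# /dev/urandom can be read like a regular file; the kernel supplies random bytes.
with open("/dev/urandom", "rb") as dev:
    random_bytes = dev.read(8)

# /dev/null can be written like a regular file; the kernel discards the data.
with open("/dev/null", "wb") as dev:
    written = dev.write(b"discarded")

print(is_character_device("/dev/null"), len(random_bytes), written)
```

Setting and getting device parameters is done through additional calls (such as ioctl) that are specific to each kind of device, so they are left out of this sketch.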

Other means to interact with Linux using only filesystem-type calls are the /sys/ and /proc/ directories, which mount "virtual" filesystems whose files contain information about the current system state.
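Because these virtual files contain plain text, they can be processed with ordinary file and string operations. The sketch below parses text in the format of /proc/meminfo, a file assumed here as an example (it is not named in the text above). The sample values are invented; on a live system the same function could be given the contents of the real file.

```python
# Hypothetical sample in the format of /proc/meminfo (values invented).
SAMPLE_MEMINFO = """\
MemTotal:        1035244 kB
MemFree:          204512 kB
SwapTotal:        524280 kB
"""


def parse_meminfo(text: str) -> dict[str, int]:
    """Map each 'Name: value' line to an integer value (kB where a unit is given)."""
    values = {}
    for line in text.splitlines():
        key, separator, rest = line.partition(":")
        fields = rest.split()
        if separator and fields:
            values[key.strip()] = int(fields[0])
    return values


info = parse_meminfo(SAMPLE_MEMINFO)
print("free memory:", info["MemFree"], "kB of", info["MemTotal"], "kB")
```

To query the running system instead, read the file first: parse_meminfo(open("/proc/meminfo").read()).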

The Init Process

After Linux has completed its initial setup (which can take some seconds, as subsystems are probed and initialized), it loads the first non-kernel program into memory (which is, by default, expected to reside in the /sbin/init file); this becomes the first userland process, and is given Process ID number 1.

In Linux, every process must have a loaded parent process, that is, the process that originally spawned (or, in Unix terms, forked) it. The only exception to this rule is process 1, i.e. Init. When Init stops executing, Linux halts the system.
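The parent relationship means that any userland process can be traced back to Init by following parent links. The Python sketch below does this on a hypothetical process table (a plain dictionary of invented process IDs), and also prints the IDs of the running interpreter and its parent using the standard os module.

```python
import os


def ancestry(pid: int, parent_of: dict[int, int]) -> list[int]:
    """Follow parent links from pid until process 1 (Init) is reached."""
    chain = [pid]
    while chain[-1] != 1:
        chain.append(parent_of[chain[-1]])
    return chain


# Hypothetical process table: process ID -> parent process ID.
SAMPLE_PARENTS = {4321: 4100, 4100: 3900, 3900: 1}

print(ancestry(4321, SAMPLE_PARENTS))   # [4321, 4100, 3900, 1]
print("this process:", os.getpid(), "parent:", os.getppid())
```

The sketch assumes every entry in the table eventually leads to process 1; a table taken from a real system may also contain kernel threads, which do not descend from Init.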

The version of Init that Ubuntu 6.10 uses is called upstart, and replaces the older sysvinit that is found in previous versions. What upstart is instructed to do upon its invocation is described by the contents of the /etc/event.d directory; sysvinit, on the other hand, used a single file called /etc/inittab.

No matter what the flavor of Init, it will execute some of the scripts that are in the /etc/init.d directory: each of these scripts initializes a specific service, and new scripts are added when you install packages that provide daemons (i.e. programs that run in the background to provide services).

Ultimately, Init will run the script that starts a display manager (e.g. /etc/init.d/gdm for the Gnome Display Manager, or /etc/init.d/kdm for the KDE Display Manager), which is responsible for all further graphical interaction with the user.

The X Window Server

For the display manager to be able to present any kind of graphical interface to the user, it needs some means to access the graphics subsystem.

The Linux kernel only provides limited means to draw to the screen; what is used for the graphical interface is, instead, a userland daemon called the X window server. X can access the graphics card's hardware either directly, using its own loadable modules for driving the various different cards that are available on the market, or it can use the kernel-provided graphics facilities, known as the framebuffer. The former approach is the one normally used.

The X server accepts requests from client applications to create windows, which are (normally rectangular) "virtual screens" that the client program can draw into. Windows are then composed on the actual screen by the X server (or by a separate composite manager) as directed by the window manager, which usually communicates with the user via graphical controls such as buttons and draggable titlebars and borders. A separate window decorator is sometimes used to render these objects to the screen, and to react to user actions and report them to the window manager.

The protocol that client applications use to communicate with the X server can be established over various different channels, such as Unix domain sockets or TCP connections. This allows clients and the server to reside on different physical machines.
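Clients are usually told which channel to use through a display name. The sketch below assumes the common X11 conventions, none of which are stated in the text above: the name has the form host:display.screen, an empty host means a Unix domain socket under /tmp/.X11-unix/, and a TCP connection goes to port 6000 plus the display number. The function name and the example host are made up, and less common forms (such as IPv6 addresses) are not handled.

```python
def parse_display(display: str) -> dict:
    """Split an X display name and work out which channel a client would use."""
    host, _, rest = display.rpartition(":")
    number_text, _, screen_text = rest.partition(".")
    number = int(number_text)
    screen = int(screen_text) if screen_text else 0
    if host in ("", "unix"):
        # Local server: talk over a Unix domain socket.
        transport, address = "unix", f"/tmp/.X11-unix/X{number}"
    else:
        # Remote server: talk over TCP, one port per display number.
        transport, address = "tcp", (host, 6000 + number)
    return {"transport": transport, "address": address, "screen": screen}


print(parse_display(":0"))
print(parse_display("remote.example.org:1.0"))
```

The first call describes a server on the same machine; the second describes a server on another machine, which is what lets an application run in one place and display its windows in another.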

The Display Manager

When the display manager is run by Init, it starts an X server, and registers as a client application to it. It can then proceed to identify and authenticate the local user and, if authentication succeeds, start a user session.

From this point onwards, programs are started under the identity of the user who has logged in; in contrast, the Init process, the display manager and the X server run under a special user identity (called superuser or root) that is granted special privileges by the kernel, since it must be able to control parts of the system that normal users should not be able to access.

This user/superuser distinction should, however, not be confused with the similar distinction that has been made between privileged ("kernel mode") and unprivileged ("userland") CPU operating modes: processes that run as the superuser still run in userland, but are allowed access to kernel system calls that other processes are denied.
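A userland program can ask the kernel which identity it is running under. The minimal Python sketch below assumes the usual Unix convention that the superuser has user ID 0 (the text above does not give the number); the function name is invented for the example.

```python
import os


def is_superuser(effective_uid: int) -> bool:
    """By Unix convention, the superuser (root) is the user with ID 0."""
    return effective_uid == 0


# os.geteuid() is itself a system call: the program stays in userland and
# asks the kernel for the answer, whether or not it runs as root.
print("running as root:", is_superuser(os.geteuid()))
```

Whatever the answer, the program executing this check is an ordinary unprivileged-mode process; only the set of requests the kernel will honor on its behalf changes.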

Among the programs that the display manager starts there are usually a window manager and a shell, that is, a graphical environment that presents itself to the user and allows the user to manually start the applications of their choice.

The Desktop Environment

The window manager, the window decorator, the shell and application programs should coordinate together to form a coherent desktop environment.

The two best known frameworks of programs, interfaces and libraries that are today employed to achieve this are Gnome and KDE, shipped respectively with the Ubuntu and Kubuntu distributions.

These frameworks provide application programming interfaces (APIs) that let applications make use of libraries to access the functions provided by the various subsystems: graphics, sound, printers, scanners... and the desktop environment's own facilities (window managing, clipboards, notification areas, etc.).

Efforts to create standard specifications for desktop management interfaces have been underway for a long time, culminating in today's freedesktop.org project.

The Graphical Interface Libraries

Perhaps the most vital facility provided by the desktop environment to applications is a means to compose interactive graphical interfaces inside windows: on one hand, this relieves the application programmer from "reinventing the wheel" every time an application is written; on the other hand, it helps the user by making interface elements consistent between applications, with a uniform "look and feel".

Historically, applications accessed such GUI toolkits directly, while nowadays most applications only see the desktop environment's own APIs, and delegate to them the task of talking to the GUI toolkit. This allows the desktop environment to take charge of inter-application interface needs, such as clipboards, drag-and-drop, etc.

The two most common GUI toolkits today are GTK+ and Qt, which are employed by Gnome and KDE respectively.

The Sound System

ALSA is the standard sound interface offered by the Linux kernel. ALSA can drive most soundcards on the market, and offers interfaces that applications can access directly.

However, there is also a need to coordinate concurrent access to the sound system and, in some cases, to support operation over a network. The X server solves the same problems for video: it lets multiple applications access the screen concurrently, while the user controls (by means of the window manager) how the graphics output of the various applications is multiplexed.
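At its simplest, coordinating concurrent access to a soundcard means mixing the streams of several applications into one. The Python sketch below illustrates the idea by summing signed 16-bit samples and clipping the result; the sample format and the function name are assumptions made for this example, and it does not claim to reflect how any particular sound daemon is implemented.

```python
SAMPLE_MIN, SAMPLE_MAX = -32768, 32767   # range of a signed 16-bit audio sample


def mix(streams: list[list[int]]) -> list[int]:
    """Sum the streams sample by sample, clipping to the 16-bit range."""
    if not streams:
        return []
    length = max(len(stream) for stream in streams)
    mixed = []
    for i in range(length):
        # Streams that have already ended contribute silence.
        total = sum(stream[i] for stream in streams if i < len(stream))
        mixed.append(max(SAMPLE_MIN, min(SAMPLE_MAX, total)))
    return mixed


music = [1000, 2000, 30000]
alert = [500, -2500, 10000, 7]
print(mix([music, alert]))   # [1500, -500, 32767, 7]
```

The third sample shows why clipping is needed: the two inputs add up to more than a 16-bit sample can hold, so the result is capped at the maximum value.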

The Gnome environment currently uses ESD (the Enlightened Sound Daemon) to this end, and KDE uses aRtsd (the analog Real time synthesizer daemon). Both products, however, have proven problematic, and are in the process of being replaced. Many applications that access the soundcard offer the user a choice of using ESD, aRtsd, ALSA or even OSS (the driver framework that preceded ALSA), to work around the many compatibility problems that have unfortunately arisen.

Other Services

Gnome and KDE both offer APIs to access CUPS, the Common Unix Printing System; for scanning, SANE is the commonly used backend, which KDE offers an interface to, although many applications access it directly.

Gnome and KDE also offer their own filing system abstraction on top of the ones provided by the Linux kernel; they are called, respectively, GNOME VFS and KIO, and one of their main purposes is to provide userland filesystem-style access to services that the kernel does not handle, or handles in a more static manner, such as network transfer protocols.

Due to the pervasiveness of multimedia in modern applications, a need has arisen for a standard API that handles as much of it as possible. The term "multimedia" covers many different types of file, such as sound, video and graphics; each has its own codecs, compression schemes, etc., and may be accessed in many ways (for example, through ALSA, ESD or aRtsd). Writing even a music player would therefore be a very large task for a developer. The GStreamer framework has been created to handle all of these different operations: application developers only talk to one system, GStreamer, and gain access to anything that GStreamer can handle. This is good for users as well, since installing a GStreamer plugin for a new format, for example MP3, makes that format available to every GStreamer-aware application.

Other services that the desktop environment may provide are session management, passwords management, file format decoding and conversion, etc.

KnowThyUbuntu (last edited 2009-08-07 15:46:55 by 76-10-161-77)