http://www.dsslive.org/mediawiki/index.php/About
About
Contents
1 ABSTRACT
2 Introduction
3 Knoppix overview
3.1 Boot loader initialized RAM disk (initrd)
3.2 Process control initialization (init)
3.3 Knoppix and Debian
4 Dsslive
4.1 Early User Space
4.2 Initramfs-tools (mkinitramfs)
4.2.1 Mountroot: yuch
5 Unionfs
5.1 Unionfs: The Design
5.2 Unionfs and The Upstream Salmon Struct (USS)
6 Deliver
7 Debconf: Autoconfigure, Reconfigure (Debian Way)
8 Features and Future of Dsslive
ABSTRACT
It is becoming more and more common for operating systems to be distributed on live CDs. Live CDs are bootable CD-ROMs (or DVDs) that contain an entire operating system in compressed format. The major drawbacks of live CDs are that the file system is read-only and that configuration changes are lost after a reboot. Unionfs can help solve both of these problems. At the moment the framework that Dsslive offers is compatible with Debian, Ubuntu and the other Debian-based systems, so Dsslive can be called a Debian-based livecd. The concept behind the ISO is that anyone can create their own livecd in a few steps, which is why it should be considered a "framework for livecd creation". The power of Dsslive lies mainly in its design.
Introduction
There is no great magic behind Dsslive other than that it is a pure Debian system booting from a CD-ROM. Pure means that it uses the default configuration files and a stock kernel. This has been achieved by designing the USS (Upstream Salmon Struct), in which a unionfs is used as the root file system.
Nowadays many livecds are Knoppix derivatives. Although Knoppix is based on Debian, it differs in some important core packages, such as sysvinit, and in some main configuration files, such as inittab. Knoppix uses its own kernel, an initrd to mount the root filesystem, hwsetup (a tool based on the Red Hat Kudzu library) to load all the needed modules, and a single runlevel script, S00knoppix-autoconfig. Dsslive, by contrast, uses a stock kernel, an initramfs to mount the root filesystem that can be used even on the installed system, hotplug as the autodiscovery tool instead of hwsetup, and the default runlevels.
Knoppix overview
When Knoppix was first released it was heralded as revolutionary in the Linux world. Its autodetection and configuration capabilities were unsurpassed. Many people remarked that "if Knoppix can't do it, Linux can't do it". Much progress has been made since the first versions of Knoppix: hotplug can detect most existing hardware, Xorg can generate a usable xorg.conf, and udev can be integrated within the initramfs and even replaces devfs. Does a livecd still need the Knoppix way of reconfiguring the X server? hwsetup to load the modules? The hacked linuxrc within an initrd? Can a livecd use the default runlevels?
Boot loader initialized RAM disk (initrd)
The special file /dev/initrd is a read-only block device. It is a RAM disk that is initialized (i.e. loaded) by the boot loader before the kernel is started. The kernel can then use the contents of the block device /dev/initrd for a two-phase system boot-up [0].
In the first boot-up phase, the kernel starts up and mounts an initial root file system from the contents of /dev/initrd (i.e. the RAM disk initialized by the boot loader). In the second phase, additional drivers or other modules are loaded from the initial root device's contents. After loading the additional modules, a new root file system (i.e. the normal root file system) is mounted from a different device.
When booting up with initrd, the system boots as follows:
1. The boot loader loads the kernel program and /dev/initrd's contents into memory.
2. On kernel startup, the kernel uncompresses and copies the contents of the device /dev/initrd onto device /dev/ram0 and then frees the memory used by /dev/initrd.
3. The kernel then read-write mounts device /dev/ram0 as the initial root file system.
4. If the indicated normal root file system is also the initial root file system (e.g. /dev/ram0) then the kernel skips to the last step for the usual boot sequence.
5. If the executable file /linuxrc is present in the initial root file system, /linuxrc is executed with uid 0. (The file /linuxrc must have executable permission. The file /linuxrc can be any valid executable, including a shell script.)
6. The file linuxrc mounts the "real" root file system and places the root file system at the root directory using the pivot_root system call.
7. The usual boot sequence (e.g. invocation of /sbin/init) is performed on the normal root file system.
The main motivation for implementing initrd was to allow for modular kernel configuration at system installation. Knoppix uses it to boot the system from a CD-ROM on a large number of machines with different hardware.
The boot ramdisk tries to autoprobe for the most common SCSI adapters and identifies the CD-ROM drive where the Knoppix CD is located. The minirootdisk features a statically linked shell with commands like mount built in.
The boot script (linuxrc) tries to find the Knoppix CD by mounting all CD-ROM drives and checking for a directory named KNOPPIX. This directory may contain either a directory tree for the root filesystem or a file of the same name holding a compressed iso9660 image of the file system, which is then mounted via the cloop device. If no CD is found, an attempt is made to find the KNOPPIX directory, containing a complete installation tree, on an existing hard disk partition. Once the root file system has been found, linuxrc hands control to init [1].
Process control initialization (init)
Init is the parent of all processes. Its primary role is to create processes from a script stored in the file /etc/inittab. This file usually has entries which cause init to spawn gettys on each line on which users can log in. It also controls autonomous processes required by any particular system.
The selected group of processes allowed to exist are configured using runlevels and the processes spawned by init for each of these runlevels are defined in the /etc/inittab file.
Init can be in one of eight runlevels: 0-6 and S. Runlevels 0, 1, and 6 are reserved. Runlevel 0 is used to halt the system, runlevel 6 is used to reboot the system, and runlevel 1 is used to get the system down into single user mode. Runlevel S is not really meant to be used directly, but more for the scripts that are executed when entering runlevel 1-5.
After init is invoked as the last step of the kernel boot sequence, it looks for the file /etc/inittab to see if there is an entry of the type initdefault. The initdefault entry determines the initial runlevel of the system.
All scripts executed by the init system are located in /etc/init.d. The directories /etc/rc?.d (? = S,0-6) contain relative links to those scripts. These links are named
S<2-digit-number><original-name>
K<2-digit-number><original-name>.
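The naming scheme above can be illustrated with a short Python sketch. The helper names `parse_rc_link` and `execution_order` are hypothetical, and the link names used with them are made up for illustration. The only behaviour assumed beyond the text is the usual sysvinit convention that K links stop a service, S links start one, and K links are processed before S links, each group in lexical order.

```python
import re

# S or K, a two-digit sequence number, then the name of the script in /etc/init.d.
LINK_RE = re.compile(r"^([SK])(\d{2})(.+)$")


def parse_rc_link(name):
    """Split an rc?.d link name into (action, sequence number, script name)."""
    match = LINK_RE.match(name)
    if match is None:
        raise ValueError(f"not an rc link name: {name!r}")
    kind, number, script = match.groups()
    action = "start" if kind == "S" else "stop"
    return action, int(number), script


def execution_order(names):
    """Sort link names: all K links first, then all S links, each group lexically."""
    return sorted(names, key=lambda name: (name[0] != "K", name))
```

For example, `execution_order(["S20ssh", "K01gdm", "S10sysklogd"])` puts `K01gdm` first, followed by the two S links in numeric order.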
The following runlevels are defined:
N System bootup (NONE).
S Single user mode (not to be switched to directly)
0 halt
1 single user mode
2 .. 5 multi user mode
6 reboot
Knoppix uses a forked sysvinit package in which /etc/inittab has been modified to boot non-interactively into runlevel 5 (the Debian default is 2); runlevels 0 and 6 run the scripts knoppix-halt and knoppix-reboot respectively.
The directory /etc/rcS.d contains a single script, knoppix-autoconfig, which performs the automatic hardware setup and configures the environment. hwsetup, a tool that uses the kudzu library [2], detects devices, loads all necessary driver modules for known hardware, sets up symbolic links in /dev and writes configuration parameters and options to the corresponding files in /etc/sysconfig/ on the ramdisk.
The directory /etc/rc5.d is empty and the X Window session, in runlevel 5, is called from /etc/inittab using the following command:
x5:5:wait:/etc/init.d/xsession start
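An inittab entry such as the one above has four colon-separated fields: id, runlevels, action and process. The following Python sketch splits such a line and finds the initdefault runlevel. The names `InittabEntry`, `parse_inittab_line` and `default_runlevel` are hypothetical, and the `SAMPLE` excerpt is an assumption built from the two entries described in the text, not the real Knoppix inittab.

```python
from typing import NamedTuple, Optional


class InittabEntry(NamedTuple):
    ident: str
    runlevels: str
    action: str
    process: str


def parse_inittab_line(line: str) -> Optional[InittabEntry]:
    """Parse one id:runlevels:action:process line; return None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Split at most three times: the process field may itself contain colons.
    ident, runlevels, action, process = line.split(":", 3)
    return InittabEntry(ident, runlevels, action, process)


def default_runlevel(lines) -> Optional[str]:
    """Return the runlevel named by the first initdefault entry, if any."""
    for line in lines:
        entry = parse_inittab_line(line)
        if entry is not None and entry.action == "initdefault":
            return entry.runlevels
    return None


# Hypothetical excerpt for illustration only.
SAMPLE = [
    "# boot non-interactively into runlevel 5",
    "id:5:initdefault:",
    "x5:5:wait:/etc/init.d/xsession start",
]
```

With this sample, `default_runlevel(SAMPLE)` yields the runlevel of the initdefault entry, and parsing the last line gives the `wait` action with `/etc/init.d/xsession start` as the process.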
Knoppix and Debian
Debian comes in three versions:
stable
testing
unstable
Knoppix is normally built from a mix of testing and unstable mirrors. It is intended as a preview of Linux for people approaching GNU/Linux systems for the first time, sparing them a long and possibly complicated installation and configuration process. Advanced users can also use it to recover systems in all kinds of emergencies, since the necessary filesystems are built into the kernel and repair tools are available.
Knoppix, however, is not intended for commercial installs as Debian stable is. Knoppix can be installed, but the safer way to update the whole system is to use unstable apt mirrors.
Probably the best way to understand how Knoppix differs from Debian is to build it from scratch. This is really hard for a non-advanced user who doesn't know Debian in depth, and it can be hard even for a user with good experience of Debian systems who doesn't know the Knoppix design deeply.
Dsslive
Dsslive is a "framework for livecd creation": it makes customizing a livecd, or building one from scratch, as easy as possible. It is a pure Debian system booting from a CD-ROM: all the packages use their default configuration, and sources.list can contain just one Debian mirror. Anyone familiar with apt can easily build it from scratch.
During the boot process the isolinux bootloader loads an initramfs, unionfs is used to merge the contents of different directories, and the klibc run-init utility is used to switch to the unified root. Hardware autodetection is carried out by udev and hotplug, and the environment is configured using debconf. The latest Dsslive version (0.3-1) is based on breezy, and there are no differences between the livecd and the installed Ubuntu.
Early User Space
Starting with kernel 2.5.x, the old "initial ramdisk" protocol is being replaced by the new "initial ramfs" (initramfs) protocol. The initramfs contents are passed using the same memory buffer protocol used by the initrd protocol, but the contents are different. The initramfs buffer contains an archive which is expanded into a ramfs filesystem.
"Early userspace" is a set of libraries and programs that provide various pieces of functionality that are important enough to be available while a Linux kernel is coming up, but that don't need to be run inside the kernel itself.
It consists of several major infrastructure components:
gen_init_cpio, a program that builds a cpio-format archive containing a root filesystem image. This archive is compressed, and the compressed image is linked into the kernel image.
initramfs, a chunk of code that unpacks the compressed cpio image midway through the kernel boot process.
klibc, a userspace C library, currently packaged separately, that is optimized for correctness and small size.
The initramfs buffer format is based around the "newc" or "crc" CPIO formats, and can be created with the cpio utility. The cpio archive can be compressed using gzip. One valid version of an initramfs buffer is thus a single .cpio.gz file.
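To make the buffer format concrete, here is a minimal Python sketch that hand-writes a "newc" cpio archive and gzips it. It is an illustration of the format only, not how mkinitramfs or gen_init_cpio work internally. The function names `newc_entry` and `build_initramfs`, the file name `init` and its one-line content are assumptions; a real initramfs also needs directory entries, device nodes and a usable /init.

```python
import gzip


def newc_entry(name, data=b"", mode=0o100644, ino=0):
    """Return one newc record: 110-byte ASCII header, NUL-terminated name, data."""
    name_bytes = name.encode() + b"\0"
    fields = [
        ino,              # c_ino
        mode,             # c_mode (file type and permission bits)
        0, 0,             # c_uid, c_gid
        1,                # c_nlink
        0,                # c_mtime
        len(data),        # c_filesize
        0, 0, 0, 0,       # c_devmajor, c_devminor, c_rdevmajor, c_rdevminor
        len(name_bytes),  # c_namesize, including the trailing NUL
        0,                # c_check, unused by the newc variant
    ]
    # Magic "070701" followed by thirteen 8-digit hexadecimal fields.
    header = b"070701" + b"".join(b"%08X" % field for field in fields)
    record = header + name_bytes
    record += b"\0" * (-len(record) % 4)  # pad header + name to a 4-byte boundary
    record += data
    record += b"\0" * (-len(record) % 4)  # pad the data to a 4-byte boundary
    return record


def build_initramfs(files):
    """Pack {name: bytes} into a gzip-compressed newc archive (a .cpio.gz buffer)."""
    archive = b""
    for ino, (name, data) in enumerate(sorted(files.items()), start=1):
        archive += newc_entry(name, data, mode=0o100755, ino=ino)
    archive += newc_entry("TRAILER!!!", mode=0)  # end-of-archive marker
    return gzip.compress(archive)


image = build_initramfs({"init": b"#!/bin/sh\n"})
```

In practice the same result is obtained with the cpio utility in newc mode piped through gzip; the sketch only shows what ends up inside the buffer.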
The klibc distribution contains some of the necessary software to make early userspace useful.
The standalone klibc distribution currently provides three components, in addition to the klibc library [4]:
ipconfig, a program that configures network interfaces. It can configure them statically, or use DHCP to obtain information dynamically.
nfsmount, a program that can mount an NFS filesystem.
kinit, which uses ipconfig and nfsmount to replace the old support for IP autoconfig, mount a filesystem over NFS, and continue system boot using that filesystem as root.
kinit is built as a single statically linked binary to save space.
Eventually, several more chunks of kernel functionality will hopefully move to early userspace, such as almost all of init/do_mounts* and ACPI table parsing.
The kernel currently has three ways to mount the root filesystem:
all required device and filesystem drivers compiled into the kernel, no initrd. init/main.c:init() will call prepare_namespace() to mount the final root filesystem, based on the root= option and optional init= to run some other init binary than listed at the end of init/main.c:init().
some device and filesystem drivers built as modules and stored in an initrd. The initrd must contain a binary '/linuxrc' which is supposed to load these driver modules. It is also possible to mount the final root filesystem via linuxrc and use the pivot_root syscall. The initrd is mounted and executed via prepare_namespace(), as described in the initrd section above.
using initramfs. The call to prepare_namespace() must be skipped, which means that a binary must do all the work. This binary can be stored in the initramfs either by modifying usr/gen_init_cpio.c or via the new initrd format, a cpio archive. It must be called "/init", and it is responsible for doing everything prepare_namespace() would do.
To maintain backwards compatibility, the /init binary will only run if it comes via an initramfs cpio archive. If this is not the case, init/main.c:init() will run prepare_namespace() to mount the final root and exec one of the predefined init binaries.
Initramfs-tools (mkinitramfs)
The move to early userspace is necessary because finding and mounting the real root device is complex. Root partitions can span multiple devices (raid or separate journal). They can be out on the network (requiring dhcp, setting a specific mac address, logging into a server, etc). They can live on removable media, with dynamically allocated major/minor numbers and persistent naming issues requiring a full udev implementation to sort out. They can be compressed, encrypted, copy-on-write, loopback mounted.
Dsslive uses the mkinitramfs script, which comes with the initramfs-tools Debian package, to generate the initramfs. There are no differences between the initramfs used to boot a system that resides on a hard disk and the one used on the CD-ROM, except that the Dsslive one includes the file "yuch" and the three directories "yuch-top", "yuch-premount" and "yuch-bottom".
Generating the initramfs takes two simple steps:
export the yuch package from svn
run mkinitramfs -d `pwd`/yuch -o initramfs.gz in a console
[Figure: Initramfs root tree]
[Figure: Initramfs scripts tree (yuch, local, bfs)]
The init script runs "local" unless something different has been specified. To use the "yuch" set of scripts, isolinux.cfg should be as follows:
label linux
kernel vmlinuz
append ramdisk_size=100000 root=/dev/ram0 initrd=initramfs.gz boot=yuch
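How the boot= parameter on the append line selects a set of scripts can be sketched in Python as follows. The real init is a shell script; the function names `parse_cmdline` and `boot_script` are hypothetical, and the only behaviour taken from the text is that "local" is used when boot= is absent and that the selected file lives under /scripts.

```python
def parse_cmdline(cmdline):
    """Turn a kernel command line into a dict; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def boot_script(cmdline, default="local"):
    """Return the path of the script set selected by boot=, falling back to the default."""
    return "/scripts/" + str(parse_cmdline(cmdline).get("boot", default))


APPEND = "ramdisk_size=100000 root=/dev/ram0 initrd=initramfs.gz boot=yuch"
```

With the append line shown above, `boot_script(APPEND)` selects `/scripts/yuch`; without `boot=yuch` it falls back to `/scripts/local`.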
The init executable boots the system by sourcing the file "function". This file contains run_scripts, which runs the scripts within the specified path, and load_modules, which uses the udev sys directory to find the module associated with a particular "modalias" (coldplug). The function "mountroot" is the boot script itself. The init sequence is:
run_scripts /scripts/init-top
. /scripts/$BOOT (where $BOOT is the value of the boot parameter passed on the kernel command line; for Dsslive it is "yuch", and this is the file where the function "mountroot" is defined)
load_modules
parse_numeric $ROOT
run_scripts /scripts/init-premount
mountroot
run_scripts /scripts/init-bottom
mount -n -o move /dev $rootmnt/dev, umount /sys, umount /proc
exec run-init $rootmnt $init
Mountroot: yuch
run_scripts /scripts/yuch-top
run_scripts /scripts/yuch-premount
    tmpfs
    findcd
    preuss
mount -t unionfs -o dirs=$merge unionfs $UNIONFS
run_scripts /scripts/yuch-bottom
    init
    fstab
    locales
    clock
    host
    log
Unionfs
Unionfs is a stackable file system that operates on multiple underlying file systems. It merges the updated contents of multiple directories but keeps their original physical contents separate.
The Dsslive implementation of Unionfs merges the Dsslive RAM disk with the read-only file systems on the boot CD, so any read-only file can be modified as if it were writable.
Unionfs is part of a project called the File System Translator, or FiST. The goal is to address the problem of file system development, a critical area of operating-system engineering. The FiST lab notes that even small changes to existing file systems require deep understanding of kernel internals, making the barrier to entry for new developers high. Moreover, porting file system code from one operating system to another is almost as difficult as the first port.
FiST, developed by Erez Zadok and Jason Nieh in the computer science department at Columbia University, combines two methods to solve the above problems in a novel way: a set of stackable file system templates for each operating system, and a high-level language that can describe stackable file systems in a cross-platform portable fashion.
The idea is that with FiST, a stackable file system would need to be described only once. Then FiST's code-generation tool would compile one system description into loadable kernel modules for different operating systems (currently Solaris, Linux and FreeBSD are supported).
Unionfs: The Design
Within the "preuss" script, Dsslive mounts several compressed file systems at different mount points and uses a rw directory as the last layer; the system, however, needs to have everything in one place (the root directory). Unionfs offers a solution: it virtually merges, or unifies, the views of different directories (recursively) so that they appear to be one tree, without physically merging the disparate directories. Such namespace unification has the benefit of allowing the files to remain physically separate while appearing to reside in one location. The collection of merged directories is called a union, and each physical directory is called a branch. When creating the union, each branch is assigned a precedence and access permissions (i.e., read-only or read-write).
Unionfs is a namespace-unification file system that addresses all of the known complexities of maintaining Unix semantics without compromising versatility and the features offered. It supports two file deletion modes that address even partial failures. It allows efficient insertion and deletion of arbitrary read-only or read-write directories into the union. Unionfs includes efficient in-kernel handling of files with identical names; a careful design that minimizes data movement across branches; several modes for permission inheritance; and support for snapshots and sandboxing. Unionfs has an n-way fan-out architecture [5,6]. The benefit of this approach is that Unionfs has direct access to all underlying directories or branches, in any order.
Even though the concept of virtual namespace unification appears simple, three key problems arise when using it as the root file system of Dsslive.
The first is that two or more unified directories can contain files with the same name. If such directories are unified, duplicate names must not be returned to user space. Unionfs solves this by defining a priority ordering of the individual directories being unified. When several files have the same name, the file from the directory with the highest priority takes precedence.
The second problem relates to file deletion. Files with the same name can appear in several of the directories being merged, and files to be deleted can reside on a ro branch. Unionfs handles this situation by inserting a whiteout, a special high-priority entry that marks the file as deleted. File system code that sees a whiteout for a file behaves as if the file doesn't exist.
The third problem is related to the previous one and involves mixing ro and rw directories in the union. When a user wants to modify a file that resides in a ro branch, Unionfs performs a "copyup": the file is copied to a higher-priority rw directory and modified there.
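The three mechanisms (priority ordering, whiteouts and copyup) can be modelled in a few lines of Python. This is a toy model using dictionaries in place of directories, not Unionfs code: the class names `Union` and `Whiteout` are hypothetical, and the model assumes a single rw branch with the highest priority, as in the Dsslive layout described above.

```python
class Whiteout:
    """Marker stored in the rw branch to hide a name that exists in a ro branch."""


WHITEOUT = Whiteout()


class Union:
    def __init__(self, rw_branch, ro_branches):
        # Branches are plain dicts mapping path -> content.
        # The rw branch has the highest priority; ro_branches follow in priority order.
        self.rw = rw_branch
        self.ro = list(ro_branches)

    def _branches(self):
        return [self.rw] + self.ro

    def read(self, path):
        # The first branch that knows the name wins, even if it only holds a whiteout.
        for branch in self._branches():
            if path in branch:
                if branch[path] is WHITEOUT:
                    break
                return branch[path]
        raise FileNotFoundError(path)

    def listdir(self):
        visible = {}
        for branch in self._branches():
            for path, content in branch.items():
                visible.setdefault(path, content)  # the higher-priority entry is kept
        return sorted(p for p, c in visible.items() if c is not WHITEOUT)

    def append(self, path, data):
        # Copyup: fetch the current content (possibly from a ro branch),
        # then store the modified copy in the rw branch only.
        self.rw[path] = self.read(path) + data

    def delete(self, path):
        self.read(path)  # raises FileNotFoundError if the file is not visible
        self.rw.pop(path, None)
        if any(path in branch for branch in self.ro):
            self.rw[path] = WHITEOUT  # mask the copies that cannot be removed
```

In this model the ro dictionaries are never modified: appending to a file from a ro branch leaves the changed copy in the rw dictionary, and deleting it leaves only a whiteout there.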
Unionfs and The Upstream Salmon Struct (USS)
The power of Dsslive lies in its design, which offers high modularity and makes customization as easy as possible. This has been achieved by designing the USS and using Unionfs underneath.
The unified root file system is made up of the contents of different modules; each module is a squashfs compressed file system:
base: console mode module; it contains a basic bootstrapped Debian system
kernel: it contains the /lib/modules/ directory plus kernel related utilities
xserver: graphical mode modules; the priority in the unified directory is defined by sorting the module names
deliver: it contains the runlevel scripts needed to reconfigure the debconf database and the environment, reading the user configuration from /proc/cmdline (e.g. locales)
overall: the rw branch; it can reside in RAM or even on an external hard disk
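The mountroot step passes the branch list to unionfs as `dirs=$merge`. The sketch below shows one way such a string could be assembled from the module list; it is not the preuss script. The function name `build_dirs_option`, the mount point paths, the ascending sort direction and the `path=rw:path=ro` branch syntax (as used by the unionfs 1.x mount option, with the leftmost branch having the highest priority) are all assumptions to be checked against the unionfs version in use.

```python
def build_dirs_option(rw_branch, module_mounts):
    """Build a unionfs 'dirs=' option: the rw branch first, then the ro modules."""
    # Assumption: modules are ordered by sorting their names in ascending order.
    ro_branches = sorted(module_mounts)
    parts = [f"{rw_branch}=rw"] + [f"{path}=ro" for path in ro_branches]
    return "dirs=" + ":".join(parts)


# Hypothetical mount points for the squashfs modules and the rw branch.
option = build_dirs_option("/overall", ["/modules/kernel", "/modules/base"])
```

Putting the rw branch first matches the copyup behaviour described earlier, where modified files land in the highest-priority writable directory.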
There is no need to explain base, kernel and xserver in detail: the packages in those modules are installed using the "noninteractive" debconf frontend and keep their default configurations. That's why Dsslive can be considered a pure Debian system booting from a CD-ROM. To let users use their own locales and video card, however, some packages need to be reconfigured, and this is done by the runlevel scripts in deliver.
Deliver
The scripts in "yuch-bottom", a directory within the initramfs, parse cmdline parameters such as lang, username, hostname, etc. and write the resulting environment variables to the file /etc/deliver.conf. Deliver uses those variables to reconfigure some packages, updating the debconf database at the same time.
The scripts in deliver are plain-text, architecture-independent bash scripts, so they can be used not just for an i386 livecd but also for ppc or sparc and all the other 11 architectures that Debian supports.
[Figure: Deliver's runlevels]
Since Dsslive uses a stock kernel, all that is needed to put an installed system on a CD-ROM and boot it is essentially the initramfs and the deliver module.
The runlevel scripts are merged with all the other default ones. In single user mode:
DSSLV-disks: scan for devices and swap partition
DSSLV-status: rebuild the apt database file (/var/lib/dpkg/status)
DSSLV-locales: set the $LC_ALL environment variable to the proper language
DSSLV-env: set the keyboard keymap to the proper language
When entering multi-user mode (runlevel 2):
DSSLV-adduser: add the user to the needed groups
DSSLV-mkxorg: generate the xorg.conf file by reconfiguring xserver-xorg; with Xorg the video card modules can be detected by the X server itself, which was not possible with XFree86
DSSLV-init: if no login manager is present, configure the files that allow the added user to start the X server
Debconf: Autoconfigure, Reconfigure (Debian Way)
Unlike Knoppix, Dsslive uses debconf to configure the system. Debconf provides a consistent interface for configuring packages and lets the user choose from several user interface frontends. It supports preconfiguring packages before they are installed, which allows large installs and upgrades to ask for all the necessary information up front, or to run without any user interaction (the "noninteractive" frontend). It also lets the user skip less important questions and information while installing a package and revisit them later.
The noninteractive frontend is used during livecd creation. It never interacts with the user at all and uses the default answers for all questions. It will occasionally mail root with messages the package wanted to display; otherwise it is completely silent and unobtrusive, a perfect frontend for automatic installs. When this frontend is used and non-default answers to questions are required, the packages involved need to be reconfigured. Dsslive uses a modified /etc/debconf.conf that allows the answers for some packages to be set statically during reconfiguration, while keeping the real debconf configuration files unchanged.
Debconf uses a rather flexible and potentially complicated backend database for storing data such as the answers to questions. The file /etc/debconf.conf is used to configure this database. Generally, the backend database is located in /var/cache/debconf/.
New drivers can be created with a minimum of effort, and sets of drivers can be combined in various ways.
# This is a sample config file that is
# sufficient to use debconf.
Config: configdb
Templates: templatedb

Name: configdb
Driver: File
Filename: /var/cache/debconf/config.dat

Name: templatedb
Driver: File
Mode: 644
Filename: /var/cache/debconf/templates.dat
The format of this file is a series of stanzas, each separated by at least one wholly blank line. Comment lines beginning with a "#" character are ignored.
The first stanza of the file is special: it is used to configure debconf as a whole. Two fields are required in this first stanza:
Config: Specifies the name of the database from which to load config data.
Templates: Specifies the name of the database to use for the template cache.
A database stanza begins by naming the database, then it indicates what database driver should be used for this database. If a database has been marked as readonly, debconf will not write anything to it.
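The stanza format described above is simple enough to read with a few lines of Python. This is an illustrative reader, not debconf's own parser: the function name `parse_stanzas` is hypothetical, and it deliberately handles only what the text states (blank-line separators, "#" comment lines, "Field: value" pairs), not trailing comments on a field line.

```python
def parse_stanzas(text):
    """Split a debconf.conf-style text into a list of {field: value} dicts."""
    stanzas, current = [], {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # comment lines are ignored and do not end a stanza
        if not line.strip():
            if current:  # a wholly blank line closes the current stanza
                stanzas.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


SAMPLE = """\
# This is a sample config file that is
# sufficient to use debconf.
Config: configdb
Templates: templatedb

Name: configdb
Driver: File
Filename: /var/cache/debconf/config.dat

Name: templatedb
Driver: File
Mode: 644
Filename: /var/cache/debconf/templates.dat
"""

stanzas = parse_stanzas(SAMPLE)
```

The first element of `stanzas` is the special global stanza with the Config and Templates fields; the remaining elements are database stanzas, each carrying its Name and Driver.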
A number of drivers are available, and more can be written with little difficulty. Drivers come in two general types. First there are real drivers that actually access and store data in some kind of database, which might be on the local filesystem, or on a remote system. Then there are meta-drivers that combine other drivers together to form more interesting systems.
File: This database driver allows debconf to store a whole database in a single flat text file. This makes it easy to archive, transfer between machines, and edit. It is one of the more compact database formats in terms of disk space used. It is also one of the slowest. On the downside, the entire file has to be read in each time debconf starts up, and saving it is also slow.
DirTree: This database driver allows debconf to store data in a hierarchical directory structure. The names of the various debconf templates and questions are used as-is to form directories with files in them. This format for the database is the easiest to browse and fiddle with by hand. It has very good load and save speeds. It also typically occupies the most space, since a lot of small files and subdirectories do take up some additional room.
Pipe: This special-purpose database driver reads and writes the database from standard input/output. It may be useful to fit special needs.
Stack: This driver stacks up a number of other databases (of any type), and allows them to be accessed as one. When debconf asks for a value, the first database on the stack that contains the value returns it. If debconf writes something to the database, the write normally goes to the first driver on the stack that has the item debconf is modifying, and if none do, the new item is added to the first writable database on the stack. Things become more interesting if one of the databases on the stack is readonly. Consider a stack of the databases foo, bar, and baz, where foo and baz are both readonly. Debconf wants to change an item, and this item is only present in baz, which is readonly. The stack driver is smart enough to realize that won't work, and it will copy the item from baz to bar, and the write will take place in bar. Now the item in baz is shadowed by the item in bar, and it will no longer be visible to debconf.
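The read and write rules of the Stack driver can be modelled with a small Python sketch. It is a simplified model of the behaviour described above, not debconf's implementation; the class names `Database` and `Stack` and the item name `pkg/question` are hypothetical.

```python
class Database:
    def __init__(self, name, items=None, readonly=False):
        self.name = name
        self.items = dict(items or {})
        self.readonly = readonly


class Stack:
    def __init__(self, databases):
        self.databases = list(databases)  # first entry is the top of the stack

    def get(self, key):
        # The first database on the stack that contains the item answers.
        for db in self.databases:
            if key in db.items:
                return db.items[key]
        return None

    def set(self, key, value):
        """Store a value and return the name of the database that received it."""
        # Prefer the first database that already has the item, if it is writable.
        for db in self.databases:
            if key in db.items:
                if not db.readonly:
                    db.items[key] = value
                    return db.name
                break  # hit a readonly copy: fall through to the first writable one
        for db in self.databases:
            if not db.readonly:
                db.items[key] = value
                return db.name
        raise PermissionError("no writable database on the stack")


# The foo, bar, baz example from the text: foo and baz are readonly.
foo = Database("foo", readonly=True)
bar = Database("bar")
baz = Database("baz", {"pkg/question": "old"}, readonly=True)
stack = Stack([foo, bar, baz])
target = stack.set("pkg/question", "new")
```

Here the write lands in bar, and later reads see the value in bar while the copy in baz stays untouched. Note that in this model a readonly database placed above the writable ones keeps answering reads for the items it holds, whatever is written below it.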
The Dsslive debconf.conf is as follows:
Config: configdb
Templates: templatedb

# World-readable, and accepts everything but passwords.
Name: config
Driver: DirTree
Reject-Type: password
Directory: /DSSLV/debconf/config
Extension: .txt
Readonly: false

# DSS config to override the default values
Name: DSS-config
Driver: DirTree
Reject-Type: password
Directory: /DSSLV/debconf/DSS-config
Extension: .txt
Readonly: true #(should be ro)

# Not world readable (the default), and accepts only passwords.
Name: passwords
Driver: File
Mode: 600
Backup: false
Required: false
Accept-Type: password
Filename: /DSSLV/debconf/passwords.dat

# Set up the configdb database. By default, it consists of a stack of two
# databases, one to hold passwords and one for everything else.
Name: configdb
Driver: Stack
Stack: DSS-config, config, passwords

# Set up the templatedb database, which is a single flat text file
# by default.
Name: templatedb
Driver: DirTree
Directory: /DSSLV/debconf/templates
Extension: .txt
Readonly: false
With this configuration it's possible to define the answers by inserting txt files in the readonly folder.
# cat /DSSLV/debconf/DSS-config/xserver-xorg/autodetect_keyboard.txt
Name: xserver-xorg/autodetect_keyboard
Template: xserver-xorg/autodetect_keyboard
Value: true
Owners: xserver-xorg
Features and Future of Dsslive
Dsslive aims for high usability. At the moment the Linux desktop lacks coordination among separate projects and common standards, although some projects are trying to fix these problems. Dsslive will follow freedesktop.org and Linux base standards where possible. There is a kernel, there are hal and dbus, there is an X server and there are different desktop managers. The user should be able to switch between gnome, kde or xfce4 and find, for example, the same menu layout.
Linux is constructed in layers. Each layer builds upon the next from the kernel up to the graphical programs. But what holds them all together? How does the very bottom communicate with the very top and vice versa? Why do I need to tell Linux what disks I have in my system? Windows and Mac OS both know when I stick a disk in my drive.
On the bottom layer, a daemon deals with hardware in an intelligent manner. It must be able to discover new hardware, find a proper kernel module for it, and let the programs at the top know when it can't initialize a device. Disks should automatically mount upon insertion, and if unreadable, a disk utility should be launched to initialize them.
Hotplug (udev) and hal can help a great deal with this, but users need to know what is happening on their system; that is how they will really learn GNU/Linux. There are already volume managers that help users automount devices and perform actions (gnome-volume-manager, ivman), but they should be integrated with notifications.
At the moment, when a user plugs a device into a running Dsslive system, it is automounted and the notification daemon pops up a window explaining what has just happened and reporting any errors.
Dsslive doesn't use fstab but a list of devices in /etc/pmount.allow; normally the user will use pmount instead of mount.
Future releases of Dsslive will focus on usability and standards, and the graphical installer will help the user easily partition the drives and install the system.
Once the main points of these goals have been achieved, a tool will be created to generate Dsslive from scratch using a GTK UI similar to synaptic.
Last modified January 6, 2006 6:25 pm