Kernel
In this section, we examine the basics of kernels and processes. These
are the fundamental building blocks of operating systems, and of sophisticated
services.
Processes
A process is a program in execution. A pure process
is said to run on a virtual processor and thus has the following properties:
-
Finite progress: over each period of time of length t, the probability
that the process executes is bounded below by some p>0.
-
Execution of a pure process is independent of any other process that is
executing on the system (and is, in particular, independent of the order
or "interleaving" of execution).
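The independence property can be observed on a real Unix system. The following is a minimal sketch, assuming a Unix-like host where Python's os.fork is available; the function name is ours. A child process changes a variable, and the parent's copy stays untouched because each process has its own address space.

```python
import os

def child_write_is_invisible_to_parent():
    """Fork a child that modifies a variable; return (parent's value, child's exit code)."""
    counter = [0]                      # lives in this process's private address space
    pid = os.fork()                    # the child receives its own copy of that space
    if pid == 0:
        # Child: the assignment below changes only the child's copy.
        try:
            counter[0] = 99
            code = 7 if counter[0] == 99 else 1
        finally:
            os._exit(code)             # leave immediately, skipping parent-only cleanup
    # Parent: wait for the child, then look at our own copy of the variable.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return counter[0], exit_code

if __name__ == "__main__":
    print(child_write_is_invisible_to_parent())
```

The exit code is the only information that flows back to the parent here, and it does so through an explicit kernel service (waitpid) rather than through memory.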
The definition of a pure process does not allow any interaction between processes,
which is not desirable. However, the amount of interaction needs to
be strictly controlled, to prevent errors in one process from being propagated
to unrelated processes. Such unconstrained error propagation occurs in
MS-DOS and MS Windows (but not Windows NT), enabling faulty programs to
corrupt working programs.
Hence, we begin from pure processes and enable carefully controlled
visibility of other processes. The most central form of inter-process visibility
is the file system, which is always available for sharing without any special
arrangements. For example, it is possible to use an editor to modify a program's
source code, save it, and then, without exiting the editor, invoke the
compiler on the source code file.
Other types of interaction can be controlled through both protection
mechanisms and explicit calls. Examples of these impurities are:
-
Signals: notification of asynchronous events (including control of some
processes by other processes).
-
Message passing: sending data directly to another process (not indirectly
through another process).
-
Shared memory: physical memory which is mapped into the address spaces of
multiple processes.
-
Semaphores
-
Sockets
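As an illustration of one of these explicit interactions, the sketch below passes a message between two processes. It assumes a Unix-like host and uses a pair of pipes as the channel, which is only one of several possible message-passing mechanisms; the function name is ours.

```python
import os

def send_message_to_child(message):
    """Send bytes to a forked child over one pipe; get them back upper-cased over another."""
    to_child_r, to_child_w = os.pipe()      # parent -> child channel
    to_parent_r, to_parent_w = os.pipe()    # child -> parent channel
    pid = os.fork()
    if pid == 0:
        # Child: close the ends it does not use, read one message, reply, exit.
        try:
            os.close(to_child_w)
            os.close(to_parent_r)
            data = os.read(to_child_r, 4096)
            os.write(to_parent_w, data.upper())
        finally:
            os._exit(0)
    # Parent: close the ends it does not use, send the message, wait for the reply.
    os.close(to_child_r)
    os.close(to_parent_w)
    os.write(to_child_w, message)
    os.close(to_child_w)
    reply = os.read(to_parent_r, 4096)
    os.close(to_parent_r)
    os.waitpid(pid, 0)                      # reap the child
    return reply

if __name__ == "__main__":
    print(send_message_to_child(b"hello"))
```

Neither process can see the other's memory; the data crosses only because both sides made explicit calls on descriptors that the kernel set up for them.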
Kernel
The kernel is not a process. Rather, it extends the base hardware to a virtual
machine, which hides various details of the low-level implementation.
The purpose of the kernel is to:
-
Implement the process abstraction.
-
Protect different users from each other and from the
outside world.
-
Deal with error conditions and architectural idiosyncrasies.
The Unix kernel is written almost entirely in C, with about 4,000-8,000
lines of assembler code. The kernel is the only part of the system
which can execute privileged instructions; these instructions have no
C equivalent and so are coded in assembly language.
Failures
Every process depends on the kernel, so a failure inside it is serious. If the
kernel detects a failure, its typical response is to panic, which causes
a reboot. There are two alternatives.
-
Ignore the error: this technique is almost never used. Although the computer
would reboot less often, the error may contaminate the system, allowing
unauthorized access, giving incorrect results, etc.
-
Correct the state of the kernel, thereby removing the effect of the error.
This is much more difficult, and may still allow errors to propagate.
The complexity of failed systems is much higher than that of working systems;
moreover, failures are relatively rare. Hence, trying to diagnose and fix
a problem on the fly is fraught with peril, and the recovery code is impossible
to test exhaustively. Instead, it is a better policy to fail fast (i.e., fail at
the earliest detection of erroneous state) and restart, rather than correct problems.
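The fail-fast policy can be sketched in a few lines. This is a toy model, not kernel code: KernelPanic, release_buffer, and the reference-count table are hypothetical names introduced for illustration. The point is that an impossible state stops execution at the moment it is detected, instead of being patched up or ignored.

```python
class KernelPanic(Exception):
    """Hypothetical stand-in for a kernel panic: stop now, restart from a clean state."""

def release_buffer(ref_counts, buf_id):
    """Drop one reference to a buffer, failing fast if the bookkeeping is inconsistent."""
    count = ref_counts.get(buf_id)
    if count is None or count <= 0:
        # Erroneous state detected: do not guess at a repair and do not continue.
        raise KernelPanic("buffer %r has bad reference count %r" % (buf_id, count))
    ref_counts[buf_id] = count - 1

if __name__ == "__main__":
    counts = {7: 1}
    release_buffer(counts, 7)          # legal: count goes from 1 to 0
    try:
        release_buffer(counts, 7)      # illegal: releasing an unreferenced buffer
    except KernelPanic as exc:
        print("panic:", exc)
```

The alternative of clamping the count at zero and carrying on would hide the bug that caused the extra release, which is exactly the kind of silent contamination the text warns about.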
System calls
Processes communicate with the kernel via system calls, which are
entry points into the kernel. The set of system calls in Unix can be found
in section 2 of the manual pages (see man -s 2 Intro in Solaris).
In addition, there are many library calls listed in section 3 of the manual;
library code invokes system calls whenever it needs kernel services. The code for
these library calls is linked with the user executable by ld;
by default the standard C library (libc) is linked in together with C code.
System calls themselves are implemented with trap instructions which
simultaneously enter the kernel at an indexed entry point and enable the
execution of privileged instructions.
A limited number of system calls is preferred to branching to arbitrary
locations in the kernel, since it restricts the number of entry points. Since
the kernel is charged with providing security, every one of these entry
points must be well guarded, checking that the process's parameters are legal
and that the process has permission to perform the required operations.
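The idea of a small, indexed set of guarded entry points can be modelled as follows. This is a toy simulation under our own assumptions, not how any real kernel is written: ToyKernel, its two call numbers, the file path, and the user ids are all hypothetical, and returning a negated errno value is simply a convention adopted for the sketch.

```python
import errno

class ToyKernel:
    """Toy model: a fixed, numbered table is the only way into the 'kernel'."""

    def __init__(self):
        # Hypothetical kernel state: a single file owned by user 100.
        self._files = {"/home/alice/notes": {"owner": 100, "data": b"hello"}}
        # The complete set of entry points, indexed by system call number.
        self._entry_points = {0: self._sys_getuid, 1: self._sys_read}

    def trap(self, uid, number, args=()):
        """Model of the trap: enter the kernel only at an indexed entry point."""
        handler = self._entry_points.get(number)
        if handler is None:
            return -errno.ENOSYS           # no such system call
        return handler(uid, args)

    def _sys_getuid(self, uid, args):
        if len(args) != 0:                 # guard: parameters must be legal
            return -errno.EINVAL
        return uid

    def _sys_read(self, uid, args):
        if len(args) != 1 or not isinstance(args[0], str):
            return -errno.EINVAL           # guard: parameters must be legal
        entry = self._files.get(args[0])
        if entry is None:
            return -errno.ENOENT
        if entry["owner"] != uid:          # guard: caller must have permission
            return -errno.EACCES
        return entry["data"]

if __name__ == "__main__":
    kernel = ToyKernel()
    print(kernel.trap(100, 1, ("/home/alice/notes",)))
    print(kernel.trap(200, 1, ("/home/alice/notes",)))
```

Because callers can reach the kernel only through trap, the number of places where parameters and permissions must be checked equals the size of the table.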
Superuser
In Unix, the operating system functions are split between system processes
and the kernel. Unix has two levels of users in the system: ordinary users
and the superuser. The difference between the two is that the superuser
is subject to far fewer restrictions on the system calls it makes. For
example, the superuser can modify any user's files, whereas ordinary users
can modify only their own files and the files for which they have been given
explicit access.
To be sure that these permissions are not abused, it is the responsibility
of each superuser process to check the appropriate conditions, since the kernel
is extremely permissive with the superuser. Most violations of security in Unix
systems are the result of superuser processes failing in this responsibility, not
of kernel problems.
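To make this responsibility concrete, here is a minimal sketch of the kind of check a superuser process might perform before modifying a file on behalf of an ordinary user. The function names are ours, and the check covers only the classic owner/group/other permission bits; access control lists and other extensions are ignored.

```python
import os
import stat
import tempfile

def user_may_write(st, uid, gids):
    """Would an ordinary user (uid, group ids gids) be allowed to write this file?"""
    if st.st_uid == uid:                        # owner class applies
        return bool(st.st_mode & stat.S_IWUSR)
    if st.st_gid in gids:                       # group class applies
        return bool(st.st_mode & stat.S_IWGRP)
    return bool(st.st_mode & stat.S_IWOTH)      # otherwise the "other" class applies

def demo():
    """Check a scratch file with mode 0640 for its owner, a group member, and a stranger."""
    with tempfile.NamedTemporaryFile() as f:
        os.chmod(f.name, 0o640)
        st = os.stat(f.name)
    stranger = st.st_uid + 1                    # any uid other than the owner's
    return (user_may_write(st, st.st_uid, set()),
            user_may_write(st, stranger, {st.st_gid}),
            user_may_write(st, stranger, set()))

if __name__ == "__main__":
    print(demo())
```

A process running as the superuser could write the file in all three cases, so only a check of this kind stands between a request made by the wrong user and the modification.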
Concurrency
The Unix kernel is designed so that it is always in the context of a single
process (after the first process, init, is created). The traditional
Unix kernel allows only one process to be executing in the kernel at a
time, and control is given up only voluntarily while in the kernel. Hence, there
is no need for semaphores when constructing a traditional Unix system. More modern
Unix systems allow multiple processes to be executing in the kernel (in
a multiprocessor system), but the kernel has to be modularized into subsystems
which are largely independent. Any interactions between subsystems must
then be guarded with semaphores.
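The role of a guarding semaphore can be sketched with threads standing in for processes executing in the kernel at the same time. This is an analogy under our own assumptions, not kernel code; the function name and parameters are ours.

```python
import threading

def guarded_increments(n_threads=4, n_increments=2000):
    """Have several threads update one shared counter, each update guarded by a semaphore."""
    counter = 0
    guard = threading.Semaphore(1)      # binary semaphore: one holder at a time

    def worker():
        nonlocal counter
        for _ in range(n_increments):
            with guard:                 # acquire before touching shared state, release after
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(guarded_increments())
```

With the guard in place the final count always equals n_threads * n_increments. The traditional kernel gets the same guarantee for free, since only one process runs in the kernel and it is never preempted there.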
Process failures
There are many conditions which can cause a process to terminate, such
as divide by zero, external termination via kill -9, etc. The
kernel cleans up after these processes as best it can. To make this possible,
the kernel keeps some process-related information for each process.
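External termination, and the record the kernel keeps about it, can be observed from a parent process. The sketch below assumes a Unix-like host; the function name is ours. It forks a child, terminates it with SIGKILL (the signal sent by kill -9), and then asks the kernel how the child ended.

```python
import os
import signal

def kill_child_and_report():
    """Fork a child, kill it with SIGKILL, and return (was_signaled, signal_number)."""
    pid = os.fork()
    if pid == 0:
        # Child: sleep until a signal arrives. SIGKILL cannot be caught,
        # so the cleanup below never runs when the parent kills us.
        try:
            while True:
                signal.pause()
        finally:
            os._exit(0)
    os.kill(pid, signal.SIGKILL)        # same effect as `kill -9 <pid>` from a shell
    _, status = os.waitpid(pid, 0)      # the kernel kept the termination status for us
    return os.WIFSIGNALED(status), os.WTERMSIG(status)

if __name__ == "__main__":
    print(kill_child_and_report())
```

The child gets no chance to clean up after itself, yet the parent can still learn what happened, because the termination status is held by the kernel rather than by the dead process.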
Kernel-based process structures
The kernel tracks two types of information on a per-process basis:
-
Information which is needed only when the process is running is kept in
the u-area. The u-area is part of the process and can be swapped
out when the process is not running. However, it exists in kernel space, not
user space, and hence can only be accessed by the kernel.
-
Information which is needed at all times is kept in the process table.
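The split between the two structures can be modelled with a small sketch. All class names and fields here are illustrative assumptions chosen for the example, not the layout of any real kernel: the process table entry stays resident, while the u-area can be detached when the process is swapped out.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

@dataclass
class UArea:
    """Needed only while the process runs (illustrative fields); may be swapped out."""
    cwd: str = "/"
    open_files: Dict[int, str] = field(default_factory=dict)

@dataclass
class ProcEntry:
    """Needed at all times (illustrative fields); always resident in the process table."""
    pid: int
    state: str = "runnable"
    pending_signals: Set[int] = field(default_factory=set)
    u_area: Optional[UArea] = None      # None while the process is swapped out

class ToyProcessTable:
    def __init__(self):
        self.entries = {}

    def create(self, pid):
        self.entries[pid] = ProcEntry(pid, u_area=UArea())
        return self.entries[pid]

    def swap_out(self, pid):
        """Detach the u-area; the table entry itself stays resident."""
        entry = self.entries[pid]
        saved, entry.u_area = entry.u_area, None
        entry.state = "swapped"
        return saved

    def post_signal(self, pid, signo):
        """Works even for a swapped-out process: it touches only the table entry."""
        self.entries[pid].pending_signals.add(signo)

if __name__ == "__main__":
    table = ToyProcessTable()
    table.create(42)
    table.swap_out(42)
    table.post_signal(42, 15)
    print(table.entries[42])
```

Posting a signal to a swapped-out process is a typical reason some information must be available at all times: the sender cannot wait for the target's u-area to be brought back in.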