Operating Systems 2009/10 Prof. Dr. Frank Bellosa
Philipp Kupferschmied
Tutorial-Assignment 1
Question 1.1: Linux Basics
a. What is a shell?
Solution:
A shell is a (text-based) program that allows a user to interact with the operating system.
The basic functionality a shell offers includes file system navigation (changing the current
directory, listing the contents of a directory, ...) and the ability to launch other programs.
b. How can you easily get help for a specific Linux tool/program? (No, “Use Google” is not
the answer to that question).
Solution:
View the appropriate man(ual) page, using man X, where X is the topic you want help with,
for example the name of a program.
c. Assume you want to edit a file named syscalls.c, but you have forgotten in which sub-
directory of our project it resides. What can you do?
Solution:
An easy way to locate that file is to use find. To do so, navigate to the base
directory of your project and run the following command: find . -name 'syscalls.c'
d. Assume you have a directory that contains multiple C source files in multiple
subdirectories. How can you search for all occurrences of the macro FOOBAR in these C files?
Solution:
find . -name '*.c' | xargs grep 'FOOBAR'
The find command recursively searches the current directory (named .) and all subdirectories
for files whose names match the pattern *.c, i.e., C source files. The standard output of find
is connected to the standard input of xargs by a pipe (represented by the |). xargs is a
program that reads items from the standard input, delimited by blanks or newlines, and
builds and executes command lines from them. By default, it appends as many items as fit
to a single command line. In our example, find will output lines such as ./src/foo/bar.c,
and xargs will consequently execute grep 'FOOBAR' ./src/foo/bar.c, followed by any further
file names that find printed. grep searches for the string FOOBAR in the specified files (if
you want to perform a case-insensitive search, use the -i flag). As every line printed by
find ends up as an argument to grep, all files will be searched for the pattern FOOBAR.
Note that our solution might lead to problems if filenames contain blanks or newlines. We
omitted parameters to deal with such situations for the sake of simplicity.
e. What steps are necessary to create an executable program from multiple C source files?
Solution:
Basically, two steps are performed:
(a) The source files are compiled individually, resulting in multiple object files (one per
source file). An object file already contains machine code, but also unresolved
references to symbols (e.g., functions) that are defined in other object files.
(b) The object files are linked to an executable binary. During linking, symbols are
resolved to the real addresses of the functions/variables.
When invoking gcc with multiple source files, e.g., “gcc main.c foo.c”, gcc not only com-
piles the individual source files, but also calls the linker ld appropriately.
You can prevent gcc from doing this by using the -c flag. In that case, gcc only produces
object files (*.o), which you can link manually using ld. Note that manual linking is not
a trivial task, as, in practice, you will have to link against multiple libraries and other
system-specific object files in order to get a correct executable.
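The two steps can be tried out with a small sketch like the following. The file names calc.c and square.c and the functions square and sum_of_squares are made up for illustration; both files are shown in one listing, and a main function that calls sum_of_squares would live in a further source file.

```c
/* Hypothetical example: two source files shown in one listing.
 *
 * Possible build steps (file names are assumptions):
 *   gcc -c calc.c     produces calc.o with an unresolved reference to square
 *   gcc -c square.c   produces square.o, which defines square
 * Linking calc.o, square.o and an object file containing main()
 * then resolves the reference to square.
 */

/* ---- calc.c ---- */
int square(int x);      /* declaration only; the definition is elsewhere */

int sum_of_squares(int a, int b)
{
    /* The compiler emits calls to the symbol "square" without
     * knowing its address; the linker fills that in later. */
    return square(a) + square(b);
}

/* ---- square.c ---- */
int square(int x)
{
    return x * x;
}
```

Compiling calc.c alone with -c succeeds even though square is only declared there, which is exactly the "not yet resolved" situation described in step (a).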
f. What is make?
Solution:
make is a program that eases building a program from source files. make takes the
information on how to build a specific program from a file called Makefile. A Makefile contains
one or more rules; each rule specifies how to build a specific target. A rule basically looks
like this (each command line must start with a tab character):
TARGET: dependencies
	command 1
	command 2
	...
	command n
make is also able to reduce the build overhead. For example, if a program consists of two
source files, but only one has changed since the last build, make will only recompile that
source file.
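make reaches this decision by comparing file modification times: a target is rebuilt if it does not exist or if one of its dependencies is newer. The following is only a simplified sketch of that comparison for a single dependency, not make's actual implementation; the function name needs_rebuild and its parameters are assumptions chosen for illustration.

```c
#include <time.h>

/* Hypothetical helper, not part of make.
 * target_exists: 0 if the target file is missing, non-zero otherwise.
 * target_mtime:  modification time of the target (ignored if missing).
 * dep_mtime:     modification time of one dependency.
 * Returns 1 if the target has to be rebuilt, 0 otherwise. */
int needs_rebuild(int target_exists, time_t target_mtime, time_t dep_mtime)
{
    if (!target_exists)
        return 1;
    /* difftime(a, b) yields a - b in seconds. */
    return difftime(dep_mtime, target_mtime) > 0.0;
}
```

With two source files, this check would be made once per object file, so only the object file whose source is newer gets recompiled before linking.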
Question 1.2: Bits & Bytes
a. Why might it be necessary to set or clear a single bit of an integer value?
Solution:
One reason is the interaction with hardware devices: the hardware might define control
words where each bit has a specific meaning. For example, setting (or clearing) write
permissions for a page of memory requires changing a single bit (this topic will be covered in
more detail in a later tutorial). Another reason is to save storage. For example, if you want
to implement a simple block-based memory management, a single bit suffices to indicate
whether a specific block is free or not.
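The block-management example could look roughly like the sketch below. It assumes 32 blocks tracked by one 32-bit word, where a set bit means "in use"; the names NUM_BLOCKS, alloc_block and free_block are made up for illustration.

```c
#include <stdint.h>

#define NUM_BLOCKS 32u  /* assumption: one 32-bit word tracks 32 blocks */

/* Convention used here: bit i set = block i in use, bit i clear = free. */

/* Finds a free block, marks it as used and returns its index.
 * Returns -1 if all blocks are in use. */
int alloc_block(uint32_t *bitmap)
{
    for (unsigned i = 0; i < NUM_BLOCKS; i++) {
        uint32_t mask = UINT32_C(1) << i;
        if ((*bitmap & mask) == 0) {
            *bitmap |= mask;
            return (int)i;
        }
    }
    return -1;
}

/* Marks block i as free again. Assumes i < NUM_BLOCKS. */
void free_block(uint32_t *bitmap, unsigned i)
{
    *bitmap &= ~(UINT32_C(1) << i);
}
```

The whole allocation state fits into four bytes here, whereas one int per block would need 32 times as much.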
b. How can you set the ith bit of a given integer value?
Solution:
result = value | (1 << i);
where | is the bit-wise OR
c. How can you clear the ith bit of a given integer value?
Solution:
result = value & ~(1 << i);
where & is the bit-wise AND and ~ is the bit-wise negation
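Setting and clearing can be wrapped into small helper functions, as in the following sketch. It uses unsigned values and the constant 1u, because shifting a signed 1 into the sign bit is undefined behavior in C; the function names are assumptions, and test_bit is added only to make the effect visible.

```c
/* All helpers assume that i is smaller than the number of bits
 * in an unsigned int. The names are chosen for illustration. */

/* Returns value with bit i set. */
unsigned set_bit(unsigned value, unsigned i)
{
    return value | (1u << i);
}

/* Returns value with bit i cleared. */
unsigned clear_bit(unsigned value, unsigned i)
{
    return value & ~(1u << i);
}

/* Returns 1 if bit i of value is set, 0 otherwise. */
int test_bit(unsigned value, unsigned i)
{
    return (int)((value >> i) & 1u);
}
```

Both operations leave all other bits untouched, and applying them to a bit that already has the desired state changes nothing.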
d. How can you retrieve the contents of bit 2 to bit 5 of a given integer value?
Solution:
result = value & (0xf << 2);
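The expression above leaves the four bits at their original positions (bits 2 to 5 of the result). If the field is wanted as a number between 0 and 15 instead, the value can be shifted down before masking. A minimal sketch; extract_bits and its parameters are made up for illustration.

```c
/* Returns bits lo..hi (inclusive) of value, shifted down to bit 0.
 * Assumes lo <= hi and that the field is narrower than an
 * unsigned int, so that the shift below is well defined. */
unsigned extract_bits(unsigned value, unsigned lo, unsigned hi)
{
    unsigned width = hi - lo + 1;
    unsigned mask = (1u << width) - 1u;  /* width ones, 0xf for width 4 */
    return (value >> lo) & mask;
}
```

For value = 0x14 (binary 10100), the masked-in-place form yields 0x14, while extract_bits(0x14, 2, 5) yields 5.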
Question 1.3: OS Basics
a. Explain the difference between policy and mechanism.
Solution:
A policy specifies which decisions are taken to reach a specific goal. Mechanisms are the
means used to carry out a given policy. For example, assume you are the manager of a
supermarket. On the one hand, your goal is to minimize personnel costs; on the other, you
don't want your customers to wait too long at the cash desk. Therefore, you specify the
following policy: As long as the length of the queue at the cash desk is below a specified
limit, only one cash desk will be open. If the queue becomes too long, another staff member
has to open a second cash desk. In order to call another staff member, you will probably
install a microphone that allows for making announcements. Consequently, the microphone is
a mechanism that can be used to implement the policy described above. In this example,
policy and mechanism are clearly separated: You don't need to install a new microphone if
you decide to open new cash desks not based on queue lengths, but on average waiting times
per customer. You might, however, have to install additional mechanisms to support a new
policy.
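In code, this separation often shows up as a fixed routine (the mechanism) that takes the decision rule (the policy) as a parameter. The following sketch models the supermarket example that way; all names and the limits 5 and 120 are assumptions made for illustration.

```c
/* Hypothetical state of the supermarket. */
struct store_state {
    int queue_length;      /* customers currently waiting */
    int avg_wait_seconds;  /* average waiting time per customer */
};

/* A policy decides whether a second cash desk is needed. */
typedef int (*policy_fn)(const struct store_state *);

/* Policy 1: based on queue length (limit of 5 is an assumed value). */
int queue_length_policy(const struct store_state *s)
{
    return s->queue_length > 5;
}

/* Policy 2: based on waiting time (limit of 120 s is an assumed value). */
int waiting_time_policy(const struct store_state *s)
{
    return s->avg_wait_seconds > 120;
}

/* Mechanism: stands in for the microphone announcement. */
void call_staff(int *open_desks)
{
    *open_desks = 2;
}

/* Applies whichever policy it is given, using the same mechanism.
 * Returns the number of open cash desks. */
int manage_desks(policy_fn policy, const struct store_state *s)
{
    int open_desks = 1;
    if (policy(s))
        call_staff(&open_desks);
    return open_desks;
}
```

Switching from queue_length_policy to waiting_time_policy changes only the argument passed to manage_desks; call_staff stays as it is. The second policy does need an extra measurement (avg_wait_seconds), which corresponds to the additional mechanism mentioned above.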
b. Enumerate the major tasks of an operating system.
Solution:
• Abstraction/Standardization: An operating system should hide hardware details from
applications/application programmers/users.
• Resource Management: Resources must be multiplexed in a “fair” way between
applications/users.
• Security/Protection: This point is closely related to resource multiplexing. Different
applications should not be able to disturb/manipulate each other. No private data of
one user should be exposed to other users, unless explicitly desired.
• Providing an execution environment for applications: This point can be seen as the
main goal of an operating system; the three points mentioned above are requirements
for fulfilling this goal.
Question 1.4: Hardware Basics
a. What are some of the differences between a processor running in privileged mode (also
called kernel mode) and user mode? Why are the two modes needed?
Solution:
In user mode, only the non-privileged processor instructions can be executed. Executing a
privileged processor instruction in user mode will lead to an exception of type privileged
instruction violation. It is up to the system programmer how to react to this exceptional
event. In most cases, the corresponding exception handler will abort the activity that
caused this exception.
In kernel mode, most processors allow the execution of all instructions. On many existing
systems that support only user and kernel mode you cannot prevent kernel code from
accessing every system or application entity (i.e., object in memory).
Some activities that control and manage the system as a whole may only be performed
in kernel mode. Thus kernel mode is necessary to prevent applications from tuning the
system to their own needs while negatively affecting other applications (e.g., by
occupying system resources as long as they need them without paying for their usage).
b. Describe the principle and the benefits of a memory hierarchy. How can memory hier-
archies provide both fast access and large capacity? What typical program behavior
coincides with the benefits of a memory hierarchy?
Solution:
A memory hierarchy creates the illusion of a larger high-speed memory (e.g., L1 cache) than
the system can offer in reality. When the L1 cache runs short, L1 cache lines that are no
longer needed and have been modified are written back to L2; lines that are no longer
needed and are unmodified are simply discarded. In both cases, new data can then replace
the freed cache lines. The same principle is used between L2 and L3 cache (if present),
cache and RAM, RAM and disk, etc.
Two principles of locality contribute to efficient cache usage:
temporal locality: Data words that have been accessed recently are likely to be accessed
again (e.g., loop counters). Consequently, it makes sense to spend some time getting
them into a cache on first access in order to speed up further accesses.
spatial locality: Most memory cells that are accessed within a short period of time are
close together (clustered) rather than being spread all over the address space.
Consequently, caching lines rather than mere data words makes sense.
The stack is a good example of these principles of locality: For most procedures, a
number of local variables residing in the procedures' respective frames on the stack
(clustered, spatial locality) are likely to be accessed, whereas most global variables remain
untouched. While executing, the local variables are also likely to be accessed more than
once (at least one write access and one read access, otherwise you could have omitted the
variable altogether).
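Array traversal is another common case where both kinds of locality can be seen. The sketch below sums the same matrix in two orders; the matrix size and the function names are assumptions for illustration. Both functions return the same result, but only the first one accesses neighbouring addresses in consecutive iterations.

```c
#define ROWS 64
#define COLS 64

/* Row by row: consecutive accesses touch adjacent ints, so every
 * loaded cache line is used completely (spatial locality). The
 * variables sum, r and c are reused in every iteration (temporal
 * locality). */
long sum_row_major(int m[ROWS][COLS])
{
    long sum = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            sum += m[r][c];
    return sum;
}

/* Column by column: consecutive accesses are COLS * sizeof(int)
 * bytes apart, so neighbouring accesses usually fall into
 * different cache lines. */
long sum_col_major(int m[ROWS][COLS])
{
    long sum = 0;
    for (int c = 0; c < COLS; c++)
        for (int r = 0; r < ROWS; r++)
            sum += m[r][c];
    return sum;
}
```

For matrices that do not fit into the cache, the row-by-row version is typically the faster one on cached systems, although the actual difference depends on the hardware and the compiler.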
c. Cache memory is divided into (and loaded in) blocks (also called cache lines). Why is a
cache divided into these cache lines? What might limit the size of a cache line?
Solution:
First of all, each cache is divided into sets of cache lines. Each set is used to cache specific
parts of the main memory, which are identified by a part of the virtual or physical address
currently being accessed. Depending on the internal cache architecture (fully associative,
n-way set associative, or direct mapped), each memory location can be mapped to any cache
line (i.e., there is only one set), to exactly one set (comprising 2 or more lines), or to exactly
one cache line. As multiple memory regions map to the same (set of) cache lines, the remaining
upper part of the address is stored along with the data to identify its location (this extra
label is called the tag).
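How an address is split into these parts can be sketched as follows. The sketch assumes the common layout in which the lowest bits select the byte within the line, the next bits select the set, and the rest forms the tag; it also assumes 32-bit addresses and power-of-two sizes. The struct and function names are made up for illustration.

```c
#include <stdint.h>

struct cache_addr {
    uint32_t tag;     /* upper address bits, stored along with the line */
    uint32_t set;     /* selects the set the address maps to */
    uint32_t offset;  /* byte position within the cache line */
};

/* Splits addr for a cache with 2^offset_bits bytes per line and
 * 2^set_bits sets. Assumes offset_bits + set_bits < 32.
 * set_bits == 0 models a fully associative cache (a single set). */
struct cache_addr split_address(uint32_t addr, unsigned offset_bits,
                                unsigned set_bits)
{
    struct cache_addr a;
    a.offset = addr & ((UINT32_C(1) << offset_bits) - 1);
    a.set = (addr >> offset_bits) & ((UINT32_C(1) << set_bits) - 1);
    a.tag = addr >> (offset_bits + set_bits);
    return a;
}
```

With 64-byte lines and 128 sets (offset_bits = 6, set_bits = 7), the address 0x12345678 splits into offset 0x38, set 0x59 and tag 0x91a2. Every other address with the same set bits competes for the same set and is told apart by its tag.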
Caches are organized in lines because memory accesses are most efficient when performed in
bursts: At the DRAM interface, one read operation requesting eight words is much faster
than eight read operations for one word each (the latter take about seven times as long).
The size of a cache line is limited by the amount of data that can be transferred efficiently
between the cache and the next lower memory level.