The Linux Registry
A proposed solution for Linux/Unix Configuration Nightmare
Avi Alkalay <>
Proposal Draft. Feb 2004

Why Does Linux Need a Registry?
Look at the /etc/fstab file. Now look at /etc/modules.conf, /etc/passwd, /etc/ssh/sshd_config and
/etc/httpd/conf/httpd.conf. I see two terrible problems here:
1. They don't share a common file format.
2. Their location in the filesystem may differ from one Linux distribution to another.
These two issues lead to other problems:
1. A system administrator must know all these formats.
2. A third-party Apache plugin, for example, must know where Apache is installed, then parse and
   edit its configuration file to include itself, in order to provide a clean and integrated
   installation. So software integration is a pain for the administrator and for ISVs
   (Independent Software Vendors).
3. The sysadmin must be aware of every file that was changed in order to make a complete backup.
4. A software developer must waste a lot of time writing the plumbing code for configuration file
   parsing etc.
5. It is impossible to make high-level system administration GUIs that work on any Unix or
   Linux distribution. And for those that are available (webmin, redhat-config-*, SuSE's YaST, etc.),
   keeping them updated and bug-free is so complex that good system administration
   practices can't rely on them, or can't use them in a heterogeneous environment (think of
   Red Hat and SuSE integration). And believe me, heterogeneous environments are very
   common in real-world businesses out there.
Other OSes solved this problem by centralizing their configurations and creating a framework to
access them. Configurations are then no longer represented by 'configuration files', but by
key-value pairs organized in a structured tree committed to some naming conventions.
To achieve multi-vendor/provider consistent software integration, and ease of administration
across heterogeneous Linux distributions, Linux needs a Registry.
What is the Registry
The Linux Registry is an alternative back-end for text configuration files.
Instead of each program having its own text configuration files, the Registry tries to provide a
universal, fast, consistent, robust, thread-safe and transactional infrastructure to store
configuration parameters through a key-value pair mechanism.
This way any software can read and save its configuration using a consistent API. Also, applications
can be aware of other applications' configurations, which makes application integration easy.
With the Linux Registry, configuration file syntax and handling will not have to be reworked for each
application.
Let's put it as a list of items:
•   It is a simple and consistent API to help software developers store and retrieve global and user-
    specific configuration parameters.
•   It is designed to be secure, lightweight and fast, so that even early boot-stage programs like
    /sbin/init can use it instead of the /etc/inittab file.
•   It tries to set distribution-independent naming standards to store things like hardware
    configuration, Samba properties, Apache configurations, user's session configuration, system's
    mime-types, parameters for kernel modules, and everything that is generally stored under /
•   It requires existing software to be changed to use its API. This will replace hundreds of
    pieces of configuration-file parsing code with clear key-value access methods from the Registry API.
•   It is inspired by the Microsoft Windows Registry.

What the Registry is Not
The Registry:
•   Is NOT a way to access SQL/relational databases.
•   Is NOT an alternative to network information systems like LDAP or NIS.
•   Is NOT a Webmin-like or other GUI tool to be used by end users.
•   Is NOT an additional software layer to edit/generate existing configuration files.
•   Does NOT know anything about the semantics of the data it stores.
•   Is NOT a network daemon in its essence.
Registry Convention for Key Names and
The keys are organized in a hierarchical tree, as in the picture below. It has two sub-trees:
•   system: contains all subsystems and global application configuration. Equivalent to /etc files.

•   user: The current user's configuration. Equivalent to dotfiles in a user's $HOME.

To access the value of a key, the developer must use the full key name. The notation for the
Registry hierarchy uses a dot (.) character as the delimiter. Here are some examples of key
names and possible values for them:
•   system.userdb.aviram.home: “/home/aviram”

•   system.userdb.aviram.fullName: “Avi Alkalay”
•   system.userdb.500: “system.userdb.aviram”

•   system.init.default: 3

•   system.init.entry04.execute: “/sbin/mingetty tty1”

•   system.fstab.entry02.device: “/dev/sda1”

•   system.fstab.entry02.mountPoint: “/usr”

•   system.httpd.documentRoot: “/var/www”

•   system.hardware.networkInterface0.type: “Ethernet”
•   system.hardware.networkInterface0.onBoot: 1 (True)

•   system.hardware.networkInterface0.IP: “”

•   system.kernelModules.aliases.eth0: “3com”

•   system.XFree86.screen0.driver: “/usr/X11R6/lib/modules/drivers/cirrus_drv.o”

•   user.environment.PATH: “/usr/bin:/bin:/usr/local/bin”, which is the PATH of the current user

•   user.shells.preferredBrowser: “/usr/bin/mozilla”
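
To make the dot notation concrete, here is a minimal sketch in C of how a key name could be taken apart. The function names (key_depth, key_parent, key_basename) are made up for this illustration and are not part of any defined Registry API; the sketch also assumes that no component of a key name contains a dot itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the dot-separated components of a key name.
 * "system.fstab.entry02.device" has 4 components. */
size_t key_depth(const char *name)
{
    assert(name != NULL);
    if (*name == '\0')
        return 0;
    size_t depth = 1;
    for (const char *p = name; *p != '\0'; p++)
        if (*p == '.')
            depth++;
    return depth;
}

/* Copy the parent of `name` (everything before the last dot) into `out`.
 * Returns 0 on success, -1 if the key has no parent or `out` is too small. */
int key_parent(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL);
    const char *last_dot = strrchr(name, '.');
    if (last_dot == NULL)
        return -1;                      /* a root such as "system" */
    size_t len = (size_t)(last_dot - name);
    if (len + 1 > out_size)
        return -1;
    memcpy(out, name, len);
    out[len] = '\0';
    return 0;
}

/* Return a pointer to the last component of a key name. */
const char *key_basename(const char *name)
{
    assert(name != NULL);
    const char *last_dot = strrchr(name, '.');
    return last_dot ? last_dot + 1 : name;
}
```

With these helpers, the parent of system.fstab.entry02.device is system.fstab.entry02, and its last component is device.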

Entries essential to the system, like system.init.*, user.environment.*, system.fstab.* etc., will
have a well-defined key hierarchy specification that is standardized across all Linux distributions. It
is the Registry Project's responsibility to coordinate the Linux community around this standardization.

Accessing Another User's Registry Data
If a user program wants to access another user's key-value pairs, the syntax is:
•   user:michele.environment.PATH
•   user:michele.shells.preferredBrowser

Of course, in this example, michele must set permissions on these keys to let other groups or
users access them for reading and/or writing. See the key meta information below.
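
A client library would have to recognize the optional user name in the first component of such a key. The following sketch shows one way to do it. KeyRoot and key_split_root are hypothetical names, and the rule that only "system" and "user" are valid roots is my reading of the two sub-trees described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Result of splitting a full key name (hypothetical structure). */
typedef struct {
    char root[16];      /* "system" or "user" */
    char owner[64];     /* user name after "user:", empty for the current user */
    const char *rest;   /* points into the input after the first dot, or NULL */
} KeyRoot;

/* Split "user:michele.environment.PATH" into root, owner and remainder.
 * Returns 0 on success, -1 if the name is malformed. */
int key_split_root(const char *name, KeyRoot *out)
{
    assert(name != NULL && out != NULL);

    size_t head_len = strcspn(name, ".");            /* first component */
    const char *colon = memchr(name, ':', head_len);
    size_t root_len = colon ? (size_t)(colon - name) : head_len;
    size_t owner_len = colon ? head_len - root_len - 1 : 0;

    if (root_len == 0 || root_len >= sizeof out->root ||
        owner_len >= sizeof out->owner)
        return -1;

    memcpy(out->root, name, root_len);
    out->root[root_len] = '\0';
    if (owner_len > 0)
        memcpy(out->owner, colon + 1, owner_len);
    out->owner[owner_len] = '\0';

    int is_user = strcmp(out->root, "user") == 0;
    int is_system = strcmp(out->root, "system") == 0;
    if (!is_user && !is_system)
        return -1;                   /* unknown sub-tree */
    if (colon != NULL && (is_system || owner_len == 0))
        return -1;                   /* "system:x" and "user:" are rejected */

    out->rest = name[head_len] == '.' ? name + head_len + 1 : NULL;
    return 0;
}
```

An empty owner means "the current user", so user.environment.PATH and user:michele.environment.PATH go through the same code path.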

Key Types
The Registry offers several types of keys:
•   String. UTF-8 encoded text of any length (as stored by the server). The client API is responsible
    for converting from UTF-8 to the program's charset.
•   Integer. An 8-byte integer.
•   Double. An 8-byte floating point number.
•   Binary. Any-length stream of bytes.
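
In C, these four types could be modeled as a tagged union. This is only a sketch: KeyType, KeyValue and value_size are names chosen for the illustration, not a defined Registry API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { KEY_STRING, KEY_INTEGER, KEY_DOUBLE, KEY_BINARY } KeyType;

typedef struct {
    KeyType type;
    union {
        const char *string;                  /* UTF-8, null-terminated */
        int64_t integer;                     /* 8-byte integer */
        double real;                         /* 8-byte floating point */
        struct { const void *bytes; size_t size; } binary;
    } as;
} KeyValue;

/* Compile-time check of the widths stated in the list above. */
_Static_assert(sizeof(int64_t) == 8, "Integer keys are 8 bytes");
_Static_assert(sizeof(double) == 8, "Double keys are 8 bytes");

/* Number of bytes needed to store the value; strings include their
 * terminating null byte. */
size_t value_size(const KeyValue *v)
{
    assert(v != NULL);
    switch (v->type) {
    case KEY_STRING:  return strlen(v->as.string) + 1;
    case KEY_INTEGER: return sizeof v->as.integer;
    case KEY_DOUBLE:  return sizeof v->as.real;
    case KEY_BINARY:  return v->as.binary.size;
    }
    return 0;
}
```

For example, system.init.default would be a KEY_INTEGER holding 3, and system.userdb.aviram.home a KEY_STRING holding "/home/aviram".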
Record Meta Information
Each key-value pair has a set of attributes that will ensure security and accountability:
•   Owner's user and group. This is a system user or group.
•   Filesystem-like access permissions for user, group and other.
•   Modification time. Last time a key was modified.
•   Record group. A record can be a member of a group of keys, so it is easy to access keys that
    are interrelated.
•   Owner package. Since configuration files are owned by the packages that installed them,
    each key can be owned by a package. Then package managers (rpm, deb) will be able to
    remove all keys installed by a package.
•   Description. Much like a configuration file comment. Not intended to be used in GUI
    applications, because it can't be internationalized.
So a complete key-value pair can be expressed in the following human-readable way:

Description=The driver module for current system's ATI board
time=Jan 25 2004 10:32:23 UTC

This representation is actually the format dumped and restored by the registry's command line tools.
It will also be possible to dump it to an XML format.
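
The "filesystem-like access permissions" attribute suggests a check similar to the one the kernel does for files. Here is a minimal sketch of it, assuming the usual rwx bit layout of mode_t. KeyMeta, KEY_READ, KEY_WRITE and key_access_allowed are hypothetical names, and the sketch only looks at the caller's primary group, not at supplementary groups.

```c
#include <assert.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical in-memory view of some per-key attributes. */
typedef struct {
    uid_t uid;       /* owner user */
    gid_t gid;       /* owner group */
    mode_t mode;     /* permission bits, e.g. 0640 */
    time_t mtime;    /* last modification time */
} KeyMeta;

enum { KEY_READ = 4, KEY_WRITE = 2 };

/* Return 1 if the caller (uid, gid) may perform `wanted` (a combination of
 * KEY_READ and KEY_WRITE) on the key, 0 otherwise. */
int key_access_allowed(const KeyMeta *meta, uid_t uid, gid_t gid, int wanted)
{
    assert(meta != NULL);
    if (uid == 0)
        return 1;                            /* root bypasses, as with files */

    mode_t bits;
    if (uid == meta->uid)
        bits = (meta->mode >> 6) & 7;        /* owner bits */
    else if (gid == meta->gid)
        bits = (meta->mode >> 3) & 7;        /* group bits */
    else
        bits = meta->mode & 7;               /* other bits */

    return (bits & (mode_t)wanted) == (mode_t)wanted;
}
```

With mode 0640, the owner can read and write the key, members of the owner group can only read it, and everybody else gets nothing.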
Registry Software Architecture
The Linux Registry has 2 parts:
•   Client API. A shared library that programs use to access the registry data. It has a set of
    functions like get(), set(), list(), etc., to manipulate the keys.
•   Server. A single daemon that accesses the registry data files, checks the user's permission to
    read or write the requested key, and delivers the answer.
Since the server must be available to early boot-stage programs like /sbin/init, it cannot be an
init.d daemon. The registry server executable is a set-user-id program that is executed by the
client API. It stays alive for some seconds to avoid a restart by the next requesting client. The
server is not a network service; it doesn't bind to any TCP port, etc. Since the registry is designed
to replace local configuration files, the client-server communication method is the fastest
possible, like shared memory or message queues.
The files that store the configuration are under /etc/registry/ with very restrictive access permissions.
A regular program can only access configurations through the set-user-id registry server.
For user specific configurations (like the $HOME/.dotfiles), registry files are stored under
$HOME/.registry/ directory.
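
The split between the two storage locations can be sketched as a small lookup on the key's root. key_storage_dir is a hypothetical helper; it only handles the current user's keys, and the caller is assumed to pass the home directory (for example from getenv("HOME")).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the directory that holds `key` into `out`.
 * Returns 0 on success, -1 for an unknown root or a too-small buffer. */
int key_storage_dir(const char *key, const char *home,
                    char *out, size_t out_size)
{
    assert(key != NULL && out != NULL);
    int n;

    if (strncmp(key, "system.", 7) == 0)
        n = snprintf(out, out_size, "/etc/registry/");
    else if (strncmp(key, "user.", 5) == 0 && home != NULL)
        n = snprintf(out, out_size, "%s/.registry/", home);
    else
        return -1;

    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

In the proposed design only the server would do this kind of lookup, since regular programs have no direct access to the files.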
The following diagram shows the client-server interaction.

              Program A                                          Program B
                get()    set()                                     get()    set()
                list()                                             list()

            Registry Client API                                Registry Client API

                         Programs run the Registry Server
                         Or connects to existing one if already running

                                   Registry Server
                                         setuid executable                           Registry Internal Files

Installed Files
The registry installs several files:
rwsr-xr-x   /sbin/registry           #   registry server
rw-r--r--   /lib/      #   client shared library
rwxr-xr-x   /bin/reget               #   tool to get keys in scripts
rwxr-xr-x   /bin/reset               #   tool to set keys in scripts
rwx------   /etc/registry/           #   Directory to save the system.* registry data
rwx------   $HOME/.registry/         #   The user.* part of the registry

Note that the server and client library are installed in /sbin and /lib. This is because they are too
essential to be installed under /usr, which may be unavailable or unmounted in early boot stages.
Technology Building Blocks for the Registry
To implement the Registry, a set of specific technologies is used to meet the following requirements:
1. Must be fast and reliable.
2. Must be as secure as files on the filesystem. A user can't access keys they have no permission for.
   And a user program must not interfere in the communication between another client and the server.
3. Must be available globally to all programs, at any time, even to the first program that the
   system runs: /sbin/init.
So for each of these requirements, there is a technical response:
1. Berkeley DB seems to be the answer for speed and reliability. Similar software (gconf2) uses XML files
   as the database back-end. It would be great to store the information in a very clean and
   human-readable way, but any XML parser is a memory eater. Berkeley DB is also used as RPM's
   back-end database. The GPL status of BDB does not matter for commercial applications,
   because they link only to the registry's client API, which knows nothing about BDB. BDB
   is used only by the registry server process. In fact, this back-end can be replaced while keeping
   compatibility with existing applications.
2. Using TCP or UDP connections, there is no way for the server to identify whom it is talking to.
   And we don't want protected information (like root's password) to be delivered to the wrong party.
   Also, we don't want somebody sniffing the client-server communication. The solution here is
   Message Queues, because their security model is similar to the filesystem's, and they are 100%
   managed directly by the kernel. Message Queues provide mechanisms for programs to know
   whom they are talking to.
3. At boot time, some tools may not be available because their filesystems are not mounted yet. So the
   libraries the Registry uses must all be on the / filesystem. This rules out the use of
   tools like C++ and XML. In current distributions, these libraries are usually under
   /usr/lib. So the Registry must use 100% fast ANSI C, using almost only system calls. I would
   prefer C++, but not for this project. Berkeley DB is an essential library, and in current
   Linux distributions it is installed under /lib.
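
To illustrate the second point, here is a minimal sketch that assumes System V message queues (the text does not say which kind of message queue is meant). It uses only the standard msgget, msgsnd, msgrcv and msgctl calls; RegistryRequest and the queue_* helpers are hypothetical names. The kernel keeps filesystem-like permissions for the queue and records the PID of the last sender, which the receiver can read with IPC_STAT. Note that with several concurrent senders that PID can already belong to a later message, so a real server would need something stronger.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <unistd.h>

/* Hypothetical request message; System V requires the leading long. */
typedef struct {
    long mtype;
    char key[128];
} RegistryRequest;

/* Create a private queue that only its owner can read and write. */
int queue_create(void)
{
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}

/* Send a request asking for the given key name. */
int queue_send_request(int qid, const char *key)
{
    assert(key != NULL);
    RegistryRequest req = { .mtype = 1 };    /* key[] starts zero-filled */
    strncpy(req.key, key, sizeof req.key - 1);
    return msgsnd(qid, &req, sizeof req.key, 0);
}

/* Receive one request, then ask the kernel who sent the last message
 * and which user owns the queue. */
int queue_receive_request(int qid, RegistryRequest *req,
                          pid_t *sender, uid_t *owner)
{
    if (msgrcv(qid, req, sizeof req->key, 0, 0) < 0)
        return -1;
    struct msqid_ds info;
    if (msgctl(qid, IPC_STAT, &info) < 0)
        return -1;
    *sender = info.msg_lspid;       /* PID of the last msgsnd() caller */
    *owner = info.msg_perm.uid;     /* effective owner of the queue */
    return 0;
}

/* Remove the queue from the system. */
int queue_destroy(int qid)
{
    return msgctl(qid, IPC_RMID, NULL);
}
```

A process that sends a request and receives it again from the same queue should see its own PID as the sender and its own effective UID as the queue owner.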

About Similar Software
There is other software that tries to solve this same configuration problem. A popular one
is GConf, the Gnome configuration infrastructure. GConf uses XML documents as back-ends,
stored in the user's home directory. XML-based software is a memory eater, as noted above. Also,
GConf does not seem to be concerned with access permissions, making it a good solution only for
personal use in desktop (high-level) systems.

Registry Adoption by Old System Software
The benefits of the Registry will be felt only when configurations like those under /etc move
into it. This is expected to take time, but the community should raise awareness to
speed up the process.
Example of Registry API Calls
A very simple example in C of how a client can access the Registry:
#include <registry.h>
#include <stdio.h>

int main(int argc, char **argv) {
     KeySet myKeys;

     registryGetKeysByGroup("My App",&myKeys);
     /* Although we'll access the registry later, let's free some resources now */

     /* Now use our keys... */
     Key *current;
     for (current=myKeys.start; current; current=current->next) {
          /* keyGetName() and keyGetString() are assumed accessor names,
             not confirmed parts of the API */
          printf("Key %s has value %s\n",
               keyGetName(current), keyGetString(current));
          keySetString(current,"New value");
     }

     /* Update some entries */

     /* Never forget to close the Registry */
     return 0;
}
Command Line Tools
The Registry will provide a set of basic command line tools to be used in scripts and essential
•   reget: To get registry values

•   reset: To set registry values

•   redump: To export the registry (or part of it) into a text file. Can be used for backup purposes

•   reload: To import a set of keys that may be generated by redump

Internal Database Layout
The Registry backend database is Berkeley DB. It has 3 files:
•   registry.db: The actual registry data
•   groups.idx: An index to find records by groups
•   folders.idx: An index to find records by parent folder

A registry record has the following layout, according to Berkeley DB key-value structure:
•   Null terminated key name (example: “system.fstab.entry02.mountPoint”)
•   Key data
The data part has the following layout:
•   Record type: u_int8_t (1 byte)
•   Record owner system uid: uid_t (4 bytes)
•   Record owner system gid: gid_t (4 bytes)
•   Record permissions: mode_t (4 bytes)
•   Record last modification time: time_t (4 bytes)
•   Size of the group part: u_int32_t (4 bytes)
•   Size of the description part: u_int32_t (4 bytes)
•   Size of the actual data part: u_int32_t (4 bytes)
•   Null-terminated group name: size from above
•   Null-terminated description: size from above
•   Actual data part: size from above
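
The fixed-size part of the data layout can be sketched as a pack/unpack pair. RecordHeader, header_pack, header_unpack and record_size are hypothetical names. The sketch assumes host byte order, and it assumes that the group and description sizes include their terminating null byte. Under these assumptions the header is 29 bytes, and an empty record (empty key name, empty group, empty description, no data) takes 1 + 29 + 1 + 1 = 32 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 1 byte type + seven 4-byte fields, as listed above. */
enum { RECORD_HEADER_SIZE = 1 + 7 * 4 };

typedef struct {
    uint8_t type;
    uint32_t uid, gid, mode, mtime;
    uint32_t group_size, description_size, data_size;
} RecordHeader;

/* Serialize the header into RECORD_HEADER_SIZE bytes (host byte order). */
void header_pack(const RecordHeader *h, uint8_t *out)
{
    assert(h != NULL && out != NULL);
    const uint32_t fields[7] = { h->uid, h->gid, h->mode, h->mtime,
                                 h->group_size, h->description_size,
                                 h->data_size };
    out[0] = h->type;
    for (size_t i = 0; i < 7; i++)
        memcpy(out + 1 + 4 * i, &fields[i], 4);
}

/* Read the header back from RECORD_HEADER_SIZE bytes. */
void header_unpack(const uint8_t *in, RecordHeader *h)
{
    assert(in != NULL && h != NULL);
    uint32_t fields[7];
    for (size_t i = 0; i < 7; i++)
        memcpy(&fields[i], in + 1 + 4 * i, 4);
    h->type = in[0];
    h->uid = fields[0];
    h->gid = fields[1];
    h->mode = fields[2];
    h->mtime = fields[3];
    h->group_size = fields[4];
    h->description_size = fields[5];
    h->data_size = fields[6];
}

/* Total size of one record: null-terminated key name plus the data part. */
size_t record_size(const char *key_name, const RecordHeader *h)
{
    assert(key_name != NULL && h != NULL);
    return strlen(key_name) + 1 + RECORD_HEADER_SIZE
         + h->group_size + h->description_size + h->data_size;
}
```

Copying field by field with memcpy avoids depending on how the compiler pads the struct in memory.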

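A graphical representation:
[Figure: one registry record, at least 32 bytes. It shows the null-terminated key name, then the
data part with its fixed-size fields (including last changed time, the two string sizes and the
data size), followed by the null-terminated group name and the remaining variable-size parts.]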
