Common Log File System

Shared by: xiang
posted:
11/3/2011

Common Log File System (CLFS) is a general-purpose logging subsystem that both kernel-mode and
user-mode applications can use to build high-performance transaction logs. It was introduced with
Windows Server 2003 R2 and is included in later Windows operating systems. CLFS can be used for
both data logging and event logging. TxF and TxR use CLFS to store transactional state changes
before they commit a transaction.



Overview

The job of CLFS, like that of any other transactional logging system, is to record the series of
steps required for some action so that they can either be played back accurately later, to commit
the transaction to secondary storage, or be undone if required. CLFS first marshals log records
into in-memory buffers and then writes them to log files on secondary storage (stable media, in
CLFS terminology) for persistence. Built-in policies control when data is flushed to stable media,
but a CLFS client application can override them and force a flush. CLFS allows customizable log
formats, expansion and truncation of logs according to defined policies, and simultaneous use by
multiple client applications. CLFS can store log files anywhere on the file system.[1]



CLFS defines a device driver interface (DDI) through which drivers for specific physical storage
systems plug in to the CLFS API. The CLFS driver implements the ARIES recovery algorithm; other
algorithms can be supported by using custom drivers.[1]



CLFS supports both dedicated logs and multiplexed logs. A dedicated log contains a single stream
of log records, whereas a multiplexed log contains multiple streams, each for a different
application. Even though a multiplexed log has multiple streams, the records of all streams are
flushed to the log sequentially, in a single batch. CLFS can allocate space for a set of log
records ahead of time (before the records are actually generated) to make sure the operation does
not fail for lack of storage space.[1]



A log record in a CLFS stream is first placed in a log I/O block in a buffer in system memory.
Periodically, blocks are flushed to stable storage devices. On the storage device, a log consists
of a set of containers, which are allocated contiguously, each containing multiple log I/O blocks.
New log records are appended to the present set. Each record is identified by a Log Sequence
Number (LSN), an increasing 64-bit number. The LSN and other metadata are stored in the record
header. The LSN encodes the identifier of the container, the offset of the block that holds the
record, and the sequence number of the record within that block; this information is used to
access the log record subsequently. However, the container identifiers are logical identifiers
that must be mapped to physical containers. CLFS itself performs the mapping.[2]



Introduction to the Common Log File System

The Common Log File System (CLFS) is a general-purpose logging service that can be used by

software clients running in user-mode or kernel-mode. This documentation discusses the CLFS

interface for kernel-mode clients. For information about the user-mode interface, see Common

Log File System in the Microsoft Windows SDK.








CLFS encapsulates all the functionality of the Algorithm for Recovery and Isolation Exploiting

Semantics (ARIES). However, the CLFS device driver interface (DDI) is not limited to

supporting ARIES; it is well suited to a variety of logging scenarios.



The primary job of any high-performance transactional log is to allow log clients to accurately
repeat history. CLFS does this by marshalling client log records into memory buffers, forcing
them to stable storage, and reading records back on request. Once a record has reached stable
storage, CLFS can read it back across system failures for as long as the storage media remains
intact.



CLFS supports dedicated logs and multiplexed logs. A dedicated log has a single stream of log

records that is used by all of the log's clients. A multiplexed log (also called a common log) has

several streams. Each stream has its own clients and its own memory buffers for marshalling log

records, but the records from all those buffers are multiplexed into a single queue and flushed to

a single log on stable storage. Multiplexing allows the I/O operations of several streams to be

consolidated.
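
As a rough illustration of the multiplexing described above, the following Python sketch models
per-stream marshalling buffers that drain into one shared queue and one log. The class and method
names (MultiplexedLog, write, flush) and the stream names are hypothetical and are not part of
the CLFS API; the real service is a kernel component whose internals are not described here.

```python
from collections import deque


class MultiplexedLog:
    """Toy model of a multiplexed (common) log; not the CLFS API."""

    def __init__(self):
        self.buffers = {}      # stream name -> records not yet flushed
        self.queue = deque()   # single queue shared by all streams
        self.stable = []       # stand-in for the one log on stable storage

    def write(self, stream, record):
        # Each stream marshals records into its own buffer.
        self.buffers.setdefault(stream, []).append(record)

    def flush(self):
        # Multiplex every stream's buffered records into the shared queue...
        for stream, pending in self.buffers.items():
            self.queue.extend((stream, record) for record in pending)
            pending.clear()
        # ...then drain the queue to stable storage in one batch.
        written = len(self.queue)
        while self.queue:
            self.stable.append(self.queue.popleft())
        return written


log = MultiplexedLog()
log.write("stream_a", "begin T1")
log.write("stream_b", "set key K")
log.write("stream_a", "commit T1")
flushed = log.flush()
```

The point of the model is only that several independent buffers end up in one sequential write
path, which is what lets the I/O of several streams be consolidated.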



When a client writes a record to a stream, it gets back a log sequence number (LSN) that

identifies the log record for future reference. The LSNs assigned to the records that are written to

a particular stream form an increasing sequence. That is, the LSN assigned to a record that is

written to a stream is always greater than the LSN assigned to the previous record written to that

same stream.



CLFS provides several services in addition to marshalling, flushing, and retrieving log records.

The following list describes some of those additional services.



• Space for a set of related log records can be reserved ahead of time. This means that a
  client can proceed with a transaction knowing that CLFS will be able to append to the log
  all of the records that describe the transaction.
• CLFS maintains a header for each log record. Clients can set certain fields in the header
  to create chains of linked records that they can later traverse in reverse order.
• CLFS flushes log records to stable storage according to its policy, but also allows clients
  to force a set of log records to stable storage.
• CLFS maintains metadata for a log and also for each stream of a multiplexed log. Clients
  can view metadata and set certain portions of the metadata.
• For each stream, CLFS maintains a base LSN and a last LSN that a client can use to
  delineate the active portion of the stream.
• For dedicated logs, CLFS maintains (at the client's request) an archive tail that the client
  can use to keep track of the portion of the log that has been archived.



Certain features of CLFS (for example, the previous LSN and undo-next LSN fields of a record

header) can be best understood by studying ARIES. For more information about ARIES, see the

following papers.



• C. Mohan, Don Haderle, Bruce Lindsay, Hamid Pirahesh, Peter Schwarz, ARIES: A
  Transaction Recovery Method Supporting Fine-Granularity Locking and Partial
  Rollbacks Using Write-Ahead Logging.
• C. Mohan, Repeating History Beyond ARIES.



CLFS Stable Storage




When you write a record to a Common Log File System (CLFS) stream, the record is placed in a

log I/O block (in a marshalling area) in volatile memory. Periodically, CLFS flushes log I/O

blocks from the marshalling area to stable storage such as a disk. On the stable storage device,

the log consists of a set of containers, each of which is a contiguous extent on the physical

medium. A collection of containers that forms the stable storage for a stream is called a log, or a

physical log.



The following figure illustrates a container.









[Figure: Containers, blocks, and records]



The preceding figure illustrates a container that holds three log I/O blocks. The first log I/O

block contains three records, the second contains five records, and the third contains two records.

As the figure suggests, the beginning of each log I/O block is always aligned with the beginning

of a sector on the stable storage medium. Note that log I/O blocks on stable storage vary in size.



CLFS uses a set of three numbers to locate a record in a log.



• The container identifier identifies the container that holds the record.
• The block offset gives the byte offset, within the container, of the beginning of the log I/O
  block that holds the record.
• The record sequence number identifies the record within the log I/O block.



The log sequence number (LSN) of a CLFS log record actually holds those three pieces of

information: container identifier, block offset, and record sequence number. However, the LSNs

given to log clients contain logical container identifiers that CLFS must map to physical

container identifiers before it accesses the records on stable storage.



CLFS uses logical container identifiers to give clients the view that log records are being written

to an ongoing sequence of containers, when in fact, the physical containers are being recycled.



Suppose a log has three containers, and a single client is writing CLFS records to the log. The

following scenario shows how a container could be recycled.



1. The client writes enough log records to fill all three containers.

2. The client sets the log base (by calling ClfsAdvanceLogBase or ClfsWriteRestartArea) to one
   of the records in container 2. By doing that, the client is saying that it no longer needs
   the records in container 1.

3. The client writes another record to the log and gets back the LSN of the newly written
   record. The logical container identifier in that LSN is 4. When records are flushed to
   stable storage, records that the client sees in logical container 4 will go to physical
   container 1.



The following figure illustrates the scenario; it shows how the client sequence of logical

containers is mapped to physical containers on stable storage.









[Figure: Logical and physical containers]
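
The recycling scenario above can be modelled in a few lines of Python. This is a simplified
sketch that assumes a fixed number of physical containers reused strictly in round-robin order
and 1-based identifiers, as in the three-container example. The function name
physical_container is hypothetical; the mapping CLFS actually maintains is internal and may
differ, for instance when containers are added or removed.

```python
def physical_container(logical_id, container_count):
    """Map a 1-based logical container id to a 1-based physical container.

    Assumption: containers are recycled strictly in round-robin order.
    """
    if logical_id < 1 or container_count < 1:
        raise ValueError("identifiers and counts must be positive")
    return (logical_id - 1) % container_count + 1


# Logical containers 1..6 on a log with three physical containers.
mapping = {logical: physical_container(logical, 3) for logical in range(1, 7)}
```

Under these assumptions logical container 4 lands on physical container 1, matching step 3 of
the scenario.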



The logical container identifier, block offset, and record sequence number are stored in an LSN

in such a way that the LSNs for a particular stream always form a strictly increasing sequence.

That is, the LSN (with logical container identifier) of a log record written to a stream is always

greater than the LSNs of the log records previously written to that same stream. LSNs, then,

serve a dual purpose: 1) they give the clients of a stream an ordered sequence of record

identifiers, and 2) they provide CLFS with the location of records on stable storage.



Given the LSN of a record, you can extract the logical container identifier, the block offset, and

the record sequence number by calling the following functions.



• ClfsLsnContainer
• ClfsLsnBlockOffset
• ClfsLsnRecordSequence



The logical container identifier is a 32-bit number, so there are 2^32 possible logical container

identifiers, and they are in the range 0x0 through 0xFFFFFFFF. A stream can have at most 2^32

logical containers.



The block offset is stored in 23 bits of the LSN, but ClfsLsnBlockOffset returns a 32-bit byte
offset. The block offset is always a multiple of 512 and is also aligned with the sector size of
the stable storage medium. For example, if the sector size is 1024 bytes, the block offset will
be a multiple of 1024.



The record sequence number is a 9-bit number, so there are 2^9 (512) possible record sequence

numbers, and they are in the range 0x0 through 0x1FF. A log I/O block can have at most 512

records.
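
To make the three field widths concrete, here is a minimal Python sketch that packs and unpacks
a 64-bit value with a 32-bit container identifier, a 23-bit block offset and a 9-bit record
sequence number. The text above gives only the widths; the bit positions used here (container
identifier in the high bits, record sequence number in the low bits) are an assumption, chosen
because that ordering keeps LSNs increasing within a stream. The Python functions are
illustrative stand-ins, not the kernel routines ClfsLsnContainer, ClfsLsnBlockOffset and
ClfsLsnRecordSequence.

```python
CONTAINER_SHIFT = 32        # assumed position: container id in the high 32 bits
OFFSET_MASK = 0xFFFFFE00    # 23 bits, read back directly as a byte offset
RECORD_MASK = 0x1FF         # 9 bits: record sequence numbers 0..511


def pack_lsn(container_id, block_offset, record_sequence):
    """Combine the three fields into one 64-bit integer (assumed layout)."""
    if not 0 <= container_id <= 0xFFFFFFFF:
        raise ValueError("container id must fit in 32 bits")
    if block_offset != block_offset & OFFSET_MASK:
        raise ValueError("block offset must be a multiple of 512 below 2**32")
    if not 0 <= record_sequence <= RECORD_MASK:
        raise ValueError("record sequence must be in 0..511")
    return (container_id << CONTAINER_SHIFT) | block_offset | record_sequence


def lsn_container(lsn):
    return lsn >> CONTAINER_SHIFT


def lsn_block_offset(lsn):
    # Masking off the low 9 bits leaves a byte offset that is a multiple of 512.
    return lsn & OFFSET_MASK


def lsn_record_sequence(lsn):
    return lsn & RECORD_MASK


lsn = pack_lsn(container_id=4, block_offset=1024, record_sequence=3)
```

Because the container identifier is the most significant field, any record in a later logical
container compares greater than every record in an earlier one, which is the ordering property
the text relies on.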










Backup Rotation Schedules

The best way to ensure that backups are done in a consistent and timely manner is to establish a
backup schedule. When creating a backup schedule, the ultimate goal is to guard against data loss
by being able to restore your entire system, or systems, quickly and efficiently. However,
disaster recovery is not the only consideration. Daily convenience also needs to be taken into
account. A good backup scheme should include an easy way to restore individual files that are
inadvertently deleted. Another consideration is the time each backup needs relative to the time
available. On the one hand, the more often you run a backup, the more current your most recent
copy is, and the less work is lost after a restore. On the other hand, each backup session takes
time and consumes backup media.



How often to back up usually depends on how intensively you work with your data. Most users back
up daily. If you use your computer frequently and create or change numerous files each day, you
might want to back up daily. If you do not use your computer frequently, backing up weekly may
work best. You should also synchronize your backup tasks with operational milestones (e.g.,
backing up data after receiving more than 100 orders, or at the close of trading hours).



One of the key elements of any backup schedule is a media rotation scheme that protects your
data at least once a day. The best rotation schedule is one that provides a long, deep and varied
history of file versions (as opposed to a Tape-A-Day scheme, for instance, which merely
overwrites the data from the previous day).



Here, we outline the most popular rotation schemes that are offered as configurable backup

patterns by most backup application software. Backup rotation schedules:



1. Round Robin

2. GFS (Grandfather-Father-Son)

3. Tower of Hanoi



All of these schemes give a great depth of file versions. Choose the one that works for you, or
customize one to your own needs. Then make sure to put it into place at all locations and across
all types of data on every platform. Once you have selected a backup rotation scheme, you can
schedule your backups to run whenever it is most convenient. If you use backup software, you will
typically receive a confirmation message after each backup stating that it completed. You can
relax: your files and data are backed up.
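
The text names the Tower of Hanoi scheme without defining it. As a hedged illustration, the
sketch below assumes the commonly described form of the scheme, in which the first media set is
reused every 2 sessions, the second every 4, the third every 8, and so on, with the last set
absorbing all remaining sessions. The function name hanoi_tape and the letters A, B, C are
illustrative choices, not terms from the text.

```python
def hanoi_tape(session, tape_count):
    """Return the 0-based media set to use for a 1-based backup session.

    Assumption: the set index is the number of trailing zero bits in the
    session number, capped at the last available set.
    """
    if session < 1 or tape_count < 1:
        raise ValueError("session and tape_count must be positive")
    trailing_zeros = (session & -session).bit_length() - 1
    return min(trailing_zeros, tape_count - 1)


# Eight consecutive sessions with three media sets labelled A, B and C.
schedule = "".join("ABC"[hanoi_tape(session, 3)] for session in range(1, 9))
```

With three sets this yields the repeating pattern A, B, A, C, so the least-used set always holds
the oldest retained version.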



Data Backup Methods

The four fundamental backup methods are full backup, incremental backup, differential backup,
and mirror backup.



To choose a data backup method, you must first weigh three factors: the capacity of your backup

media, the period of time available for your backup, and the level of urgency expressed by you

or other users when a file restoration is necessary.



For example, conducting full backups on a daily basis requires both a large amount of media and
a long period of time. But doing so makes restoration rapid and easy, because you need only one
backup set to pull data from. On the other hand, weekly full backups combined with daily
incremental backups conserve media and shorten the daily backup period, though data recovery
requires the last full backup and each subsequent incremental backup up to the most current one,
a process that can seem to take forever when users are waiting for a file. After you select a
backup method, you need to schedule backups and pick the most appropriate rotation scheme for
your organization and network needs. Ideally, this reduces media costs and extends the longevity
of your media, while ensuring that every file is protected.



Full Backup Method

A full backup usually includes your entire system and all its files. It is the basic backup
method, and all other methods build on it. In each full backup session all data is copied: for
example, all databases, file systems, and catalogs on a hard disk. It would be ideal to make full
backups all the time, because they are the most comprehensive and are self-contained. However,
the amount of time it takes to run full backups often prevents us from using this backup type.
Full backups are often restricted to a weekly or monthly schedule, although the increasing speed
and capacity of backup media is making overnight full backups a more realistic proposition. Full
backups, if you have the time to perform them, offer the best solution in data protection. In
effect, a single backup can provide the ability to completely restore all backed-up files.
However, you should be aware of a significant security issue. Each full backup contains an entire
copy of the data. If the backup media were illegally accessed or stolen, the hacker or thief
would have access to an entire copy of your data.



Advantages: restore is the fastest. Disadvantages: backing up is the slowest, the storage space

requirements are the highest (compared to incremental backups or differential backups).



Incremental Backup Method

Incrementals are usually done more often than full backups. This method is based on sequentially
refreshing a partial backup copy. Incremental backup provides a much faster method of backing up
data than repeatedly running full backups. During an incremental backup, only the files that have
changed since the most recent full, differential or incremental backup are included. That is
where it gets its name: each backup is an increment since the most recent backup. Backup levels
are used to distinguish between different types of backups. A level 0 is a full backup. A level 1
incremental backs up everything that has been modified since the last level 0. A level 2
incremental copies all the files that have been modified since the last level 1, and so on. The
advantage of lower backup times comes with a price: increased restore time. When restoring from
incremental backup, you need the most recent full backup as well as every incremental backup
you've made since the last full backup. For example, if you did a full backup on Friday and
incrementals on Monday, Tuesday and Wednesday, and the PC crashes Thursday morning, you would
need all four backup container files: Friday's full backup plus the incremental backups for
Monday, Tuesday and Wednesday. As a comparison, if you had done differential backups on Monday,
Tuesday and Wednesday, then to restore on Thursday morning you'd only need Friday's full backup
plus Wednesday's differential.



Advantages: backing up is the fastest, the storage space requirements are the lowest.

Disadvantages: restore is the slowest.
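
The Friday-to-Thursday example can be expressed as a small Python sketch that works out which
backup sets a restore needs. The function name restore_chain and the (label, kind) tuple format
are hypothetical and chosen for illustration only; real backup software tracks this in its own
catalog.

```python
def restore_chain(history):
    """Return the labels of the backups needed for a restore, oldest first.

    history is a chronological list of (label, kind) tuples, where kind is
    "full", "incremental" or "differential". Raises ValueError if the
    history contains no full backup.
    """
    # A restore always starts from the most recent full backup.
    last_full = max(i for i, (_, kind) in enumerate(history) if kind == "full")
    chain = [history[last_full][0]]
    for label, kind in history[last_full + 1:]:
        if kind == "incremental":
            # Every incremental since the previous backup is required.
            chain.append(label)
        elif kind == "differential":
            # A differential covers everything changed since the full backup.
            chain = [chain[0], label]
    return chain


with_incrementals = restore_chain(
    [("Fri", "full"), ("Mon", "incremental"),
     ("Tue", "incremental"), ("Wed", "incremental")]
)
with_differentials = restore_chain(
    [("Fri", "full"), ("Mon", "differential"),
     ("Tue", "differential"), ("Wed", "differential")]
)
```

The incremental history needs four sets and the differential history needs two, which is the
restore-time trade-off described above.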










Differential Backup Method

There is a significant, but sometimes confusing, distinction between differential backup and
incremental backup. Whereas incremental backup backs up all the files modified since the last
full or incremental backup, differential backup offers a middle ground by backing up all the
files that have changed since the last full backup (a file counts as changed if its content,
attributes or access permissions have changed). That is where it gets its name: it backs up
everything that's different since the last full backup. Restoring a differential backup is a
faster process than restoring an incremental backup because all you need is the last full and
last differential backup. Use differential backup if you have a reasonable amount of time to
perform backups. The upside is that only two backup container files are needed to perform a
complete restore. The downside is that if you run multiple differential backups after your full
backup, each one probably includes some files that were already included in earlier differential
backups but haven't been modified since. Differential backup is gaining in popularity because it
traps files at points in time, for example, prior to virus corruption.



Advantages: restore is faster than restoring from incremental backup, backing up is faster than a

full backup, the storage space requirements are lower than for full backup. Disadvantages:

restore is slower than restoring from full backup, backing up is slower than incremental backup,

the storage space requirements are higher than for incremental backup.
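
The difference between the methods comes down to which reference point a file's change time is
compared against. The sketch below is a simplified model that uses only modification times
(plain numbers standing in for timestamps); the function name files_to_back_up and its
parameters are hypothetical, and real tools may instead rely on archive attributes or catalogs
and may also detect attribute or permission changes.

```python
def files_to_back_up(mtimes, method, last_full, last_backup):
    """Select the files a backup run would copy.

    mtimes maps a file name to its last modification time. last_full is the
    time of the last full backup; last_backup is the time of the most recent
    backup of any kind.
    """
    if method == "full":
        return sorted(mtimes)
    if method == "differential":
        cutoff = last_full        # everything changed since the last full backup
    elif method == "incremental":
        cutoff = last_backup      # only what changed since the latest backup
    else:
        raise ValueError("unknown method: " + method)
    return sorted(name for name, mtime in mtimes.items() if mtime > cutoff)


# Full backup at time 2, latest (incremental or differential) backup at time 4.
mtimes = {"a.txt": 1, "b.txt": 3, "c.txt": 5}
differential = files_to_back_up(mtimes, "differential", last_full=2, last_backup=4)
incremental = files_to_back_up(mtimes, "incremental", last_full=2, last_backup=4)
```

Here the differential run copies b.txt again even though an earlier run already captured it,
while the incremental run copies only c.txt.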



Mirror Backup Method

A mirror backup is identical to a full backup, with some exceptions. A mirror backup is a
straight copy of the selected folders and files at a given instant in time. Mirror backup is the
fastest backup method because it copies files and folders to the destination without any
compression. While the other backup types collect all the files and folders being backed up each
time into a single compressed "container file", a mirror backup keeps all the individual files
separate in the destination. That is, the destination becomes a "mirror" of the source. However,
the increased speed has its drawbacks: it needs more storage space and it cannot be password
protected. It has the benefit that the backup files can be readily accessed using tools like
Windows Explorer.



Advantages: the fastest backup method. Disadvantages: it needs more storage space than any
other backup type, password protection is not possible, and it cannot track different versions
of files.
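
A straight, uncompressed copy of this kind can be sketched with Python's standard library. This
is a minimal illustration, assuming Python 3.8 or later for the dirs_exist_ok argument; the
function name mirror_backup is hypothetical. Note that shutil.copytree only adds and overwrites
files, so files deleted from the source are not removed from the destination, which a stricter
mirror might do.

```python
import shutil
from pathlib import Path


def mirror_backup(source, destination):
    """Copy source into destination file by file, without compression.

    Returns the relative paths of all files now present in destination.
    """
    # copytree copies each file individually (preserving timestamps where
    # it can), so the destination stays browsable with ordinary file tools.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    return sorted(
        path.relative_to(destination).as_posix()
        for path in Path(destination).rglob("*")
        if path.is_file()
    )
```

Calling mirror_backup("C:/data", "E:/mirror") (example paths only) would leave every file
individually readable under the destination folder.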











