OS File Systems


HARD HAT AREA - WHITE PAPER
OS File Systems
Longhorn Brings Changes For Windows
The upcoming Longhorn operating system will bring several new technologies to the next generation of Windows. Although many of the changes will be immediately visible and will affect the way you interact with Windows, many of the other changes will be in the background and will affect the way the operating system performs.

The work of the file system in Longhorn will be one of those important, behind-the-scenes changes. Every type of writeable storage disk, whether it is a hard drive platter, a CD, or a diskette, must use a file system. The file system performs several tasks related to data storage, including naming, storing, and retrieving data.

Before delving into the features of Longhorn’s file system, let’s look at the types of file systems Linux and Mac OS X use. Most of you are probably familiar with the current main Windows file systems: FAT/FAT16, FAT32, and NTFS. For a refresher, see the “Windows File Systems” sidebar.

Implicit Query
It also appears that as part of WinFS and Longhorn, Microsoft is seeking a Google-like user interface and search function that will make it easier to find a particular piece of data. Microsoft has created an application called IQ (Implicit Query) that would serve this function. (A version of IQ was part of a Microsoft presentation at Comdex in November 2003.) The key force driving IQ would be its ability to perform searches in the background without your prompts. The file structure and file organization in WinFS would make technology such as IQ possible. For example, if you’re working on a particular document (as shown below), IQ would scan your document for key text strings as you enter them and perform searches on those strings, making data available to you in the background if you need it, and maybe even before you realize you need it. Analysts say that in the long run, an idea such as IQ could make search engines obsolete, building into WinFS and the Longhorn OS the ability to search the Web automatically alongside the ability to search hard drives.

IQ searches your email program for any messages matching your To: line, just in case you need to look back at previous messages to John.
IQ would look through your hard drive for any email messages, calendar entries, Word documents, and other documents and items linked to the conference.
IQ would look for any documents, email messages, calendar items, and Web sites related to Bill Gates and Omaha.

Linux File Systems
Linux can use several file systems, but what many people refer to as the “native” Linux file system is ext2. Ext2, which first appeared in 1993, is short for The Second Extended File System. The first version (called ext) initially appeared in 1992 but is no longer part of the Linux kernel. A third version (called ext3) is backward-compatible with ext2 and essentially adds journaling to ext2. Ext3 is included in some 2.4.x kernels. However, ext2 is the foundation for the Linux file system.

Ext2 has its roots as a basic Unix file system. Ext2 uses many of the same principles as Windows OS file systems. It stores files in data blocks (called clusters in Windows file systems) and uses a hierarchical tree for its structure of directories, subdirectories, and files. Ext2 can handle long file names and partitions up to 4TB in size. In ext2, each file is represented by an inode, which includes a detailed description of the file, including the file type, access rights, size, and pointers to the appropriate data blocks.

Of course, Linux wouldn’t be Linux unless many developers were contributing to the project. Dozens of different file systems are under development or are available for use with Linux. Some of them include:

GFS. The Global File System works especially well with Linux cluster file systems. It provides better security than ext2, as well.

Microsoft file systems support. Some developers have created support for FAT32 and NTFS within Linux.

Minix. This file system format originated in Minix, which is a variation of Unix that sparked many features in
Linux. However, Minix probably won’t appear in future versions of the Linux kernel, as it has been replaced by ext2.

Read-only file systems. A few read-only file systems designed for boot disks are available for Linux, too. Cramfs is a read-only file system that can use compression. Romfs is a basic read-only file system that cannot use compression. Squashfs is a read-only file system still in development that will squeeze as much data as possible in the boot area.

Reiserfs. The Linux 2.4.x kernel makes Reiserfs available. The strength of Reiserfs lies in its ability to efficiently handle large numbers of small files.

Mac OS X File Systems
Newer Macintosh computers use the HFS (Hierarchical File System) Plus file system, which debuted in Mac OS 8.1. HFS Plus (also called HFS Extended and HFS+) is the primary file system for Mac OS X.

HFS Plus is a replacement for HFS, which was a long-time file system for Macintosh computers developed in the late 1980s. Apple decided to replace HFS because of problems it was having with larger hard drives. (In that regard, HFS is comparable to FAT/FAT16 from Windows computers, which was replaced by FAT32 and NTFS.)

Both HFS and HFS Plus use B-trees for cataloging the file system, just as NTFS does. In 2002, Apple added optional journaling features to HFS Plus for additional security. Even though Apple uses different terminology to describe its file system (Apple calls its data blocks allocation blocks, while Windows calls them clusters), many of the HFS Plus features are comparable to FAT32 and NTFS in cluster size and disk space allocation.

WinFS Data Model
The WinFS data model describes the concepts for data structure and organization within the file system. Within the WinFS data model are types, which describe the pieces of data. Each type has certain properties and fields that relate to it and describe it. For example, a type called “person” might have properties and fields such as “name” and “address” that relate to it. The types and their properties aid in data organization and data searches.

WinFS types have relationships, too, which are rules related to organizing and using the data. Within the relationship, a source type is required. A target type can be part of the relationship, but it isn’t required. (A relationship with a source type and no target type is called a dangling relationship.) Two major types of relationships exist in WinFS, called holding relationships and reference relationships.

Holding relationships. In a holding relationship, the source type controls the target type, and the relationship doesn’t end until the source type ends it. Dangling relationships are not possible in a holding relationship. While organizing data, WinFS might use a holding relationship with folders and documents, for instance. (More than one folder can have a holding relationship with a document at a time.) If a source-type folder is deleted, the target-type document remains available as long as one or more other source-type folders are pointing to it. However, if all source-type folders are deleted in the holding relationship, and the target-type document has no source-type relationships, it’s deleted.

One other important point about holding relationships: Cycles are not allowed, as shown in Figures 1 and 2. Any holding relationships that would create a cycle or a loop between two or more types with a holding relationship in place aren’t allowed.

Reference relationships. In a reference relationship, source types have no control over target types. No restrictions apply to these relationships, and dangling relationships are allowed. A reference relationship can involve cycles, too, as shown in Figure 3. A reference relationship might be used over a network, for example. If the network experiences a problem and the link between a source and target type is temporarily broken, a reference relationship could still exist for the other types that remain linked.

Longhorn & WinFS
WinFS (Windows File System) should debut in the upcoming Longhorn OS, which is the next major desktop version of Windows, expected to replace WinXP, probably in 2005 or 2006. (Although some say WinFS is short for Windows Future Storage, in a recent speech, Bill Gates called it Windows File System.) WinFS is a data storage system that will make information easier to find. The goal of WinFS is to push efficient file-storage capabilities to heights that most of us can scarcely imagine today.

Plans for WinFS involve using several types of technologies, including NTFS. Microsoft officials have said that WinFS won’t replace NTFS but will build on top of it, using NTFS and allowing the strengths of both technologies to work together. Industry experts say the ability of WinFS to work on top of NTFS should speed the acceptance of WinFS
Data Storage In Windows File Systems
Depending on the Windows operating system you’re using, you may have a choice between FAT/FAT16, FAT32, and NTFS as your file system. As the “Cluster Size” chart and this graphic show, each file system stores data a little differently, which can cause wasted space within the hard drive (also called slack space).

FAT/FAT16 vs. FAT32
On a 2GB hard drive or partition, FAT/FAT16 uses a default cluster size of 32KB, while FAT32 uses 4KB clusters. (Each small square represents 1KB; the bold-lined squares and rectangles represent a cluster.)

In the top example of a 31KB file (blue), both types of file systems have the same amount of slack space (peach). However, in the bottom example of a 40KB file, the FAT32 file system still has six 4KB clusters available as free space (white) and no slack space, while the FAT/FAT16 file system has 24KB of slack space in the second cluster.

FAT32 vs. NTFS
On a 20GB hard drive or partition, FAT32 uses a default cluster size of 16KB, while NTFS uses 4KB clusters. (Each small square represents 1KB; the bold-lined squares and rectangles represent a cluster.)

In the top example of a 29KB file (blue), both types of file systems have the same amount of slack space (peach). However, in the bottom example of a 50KB file, the NTFS file system still has three 4KB clusters available as free space (white) and 2KB of slack space, while the FAT32 file system has 14KB of slack space in the fourth cluster.
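The slack-space figures in this sidebar follow from simple rounding: a file always occupies a whole number of clusters, so the unused tail of its last cluster is wasted. A minimal Python sketch of that arithmetic is below; the function name slack_kb is an illustrative assumption, not part of any Windows API.

```python
def slack_kb(file_kb: int, cluster_kb: int) -> int:
    """Return the wasted space (in KB) when a file is stored in whole clusters."""
    # Round the file size up to a whole number of clusters (ceiling division).
    clusters_needed = -(-file_kb // cluster_kb)
    return clusters_needed * cluster_kb - file_kb


# The four cases from the sidebar: (file size in KB, cluster size in KB)
for file_kb, cluster_kb in [(40, 32), (40, 4), (50, 16), (50, 4)]:
    print(f"{file_kb}KB file, {cluster_kb}KB clusters: "
          f"{slack_kb(file_kb, cluster_kb)}KB slack")
```

Under these assumptions, the 40KB file wastes 24KB with 32KB clusters and nothing with 4KB clusters, and the 50KB file wastes 14KB with 16KB clusters and 2KB with 4KB clusters, which matches the sidebar.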
and Longhorn. If WinFS used a vastly different file system than is available in today’s Windows PCs, Longhorn might experience compatibility problems with today’s software packages. In addition, customers would probably be leery of making the switch until WinFS had proven its stability.

To make data easier to find, WinFS will use a few different APIs than NTFS, however. Look for XML technologies to appear in WinFS and provide the file system with high-end data-labeling capabilities. WinFS will also use relational and object-oriented APIs to access data. Each of these technologies will let the WinFS data model handle what Microsoft calls the three key components of a data storage platform: organization, searching, and sharing.

Organization. When organizing and presenting data, WinFS will follow a different path from NTFS, which uses a B-tree structure. WinFS will present data in what Microsoft calls a DAG (directed acyclic graph). Data organization will be far more flexible under WinFS, with the ability to organize by several methods, including relationships.

Searching. By organizing data in a DAG, WinFS opens a new world of possibilities for searching. Users will be able to search using multiple criteria, which is impossible in a tree structure. Microsoft says the WinFS search capabilities will be superior to the filtering capabilities of today’s search engines.

Sharing. Sharing data between users and between applications will be an easier process under WinFS. The technology in WinFS for sharing data will allow for a common security model in Longhorn and will work well with other types of technologies, such as peer-to-peer networking.
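To make the DAG idea concrete, here is a minimal Python sketch of the holding-relationship rules described in the “WinFS Data Model” sidebar: an item may be held by several folders, it disappears only when its last holder is deleted, and any link that would create a cycle is refused. The class and method names (HoldingStore, hold, delete) are hypothetical and do not reflect the actual WinFS API, which the article does not describe.

```python
class HoldingStore:
    """Toy model of holding relationships kept as a directed acyclic graph."""

    def __init__(self):
        self.items = set()
        self.holds = {}    # item -> set of items it holds (its targets)
        self.held_by = {}  # item -> set of items holding it (its sources)

    def add(self, name):
        self.items.add(name)
        self.holds.setdefault(name, set())
        self.held_by.setdefault(name, set())

    def _reaches(self, start, goal):
        # Depth-first walk along holding links from start, looking for goal.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.holds[node])
        return False

    def hold(self, source, target):
        # Refuse the link if target already leads back to source (a cycle).
        if self._reaches(target, source):
            raise ValueError("holding relationship would create a cycle")
        self.holds[source].add(target)
        self.held_by[target].add(source)

    def delete(self, name):
        if name not in self.items:
            return
        self.items.discard(name)
        for holder in self.held_by.pop(name):
            self.holds[holder].discard(name)
        for target in self.holds.pop(name):
            self.held_by[target].discard(name)
            if not self.held_by[target]:
                self.delete(target)  # last holder is gone, so the target goes too


store = HoldingStore()
for name in ("work", "travel", "report.doc"):
    store.add(name)
store.hold("work", "report.doc")
store.hold("travel", "report.doc")
store.delete("work")
print("report.doc" in store.items)  # still held by "travel"
store.delete("travel")
print("report.doc" in store.items)  # no holders remain
```

A reference relationship, by contrast, would skip both the cycle check and the cascading delete.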
The Hope For WinFS
Microsoft hopes WinFS can become a jack-of-all-trades file system, giving Microsoft a single storage system that can find stored information quickly while working well with a variety of applications. Microsoft says WinFS will be far more than a file system; it will also deal with nonfile data, such as personal contacts and email messages.

Microsoft officials say they expect WinFS will be able to simplify and streamline data organizing, searching, and sharing. This will be no small task; after all, application types currently play the key role in determining the storage of data. Because each type of application (whether it’s a database, an email server, a file system, or another application) follows a slightly different method for storing data, it can be a nightmare to retrieve and find data stored by different applications.

To improve the ability of Windows computers to search for data throughout all applications, WinFS will use metadata stored within each file. The metadata system, which will use XML technology, will let developers link specific pieces of data to each file that will aid in the search capabilities and write applications that allow for more focused searches. For example, instead of searching for a keyword in the file name of a Word document, users will have the option to search for a topic in the actual text of the Word document as part of a general file search. Developers will also be able to use the metadata system in WinFS to search all types of applications at the same time, which is a difficult task at best now.

Many industry experts say the improved search capabilities of WinFS can’t come quickly enough. As hard drive sizes soar beyond hundreds of gigabytes, improved search capabilities are vitally important. Without an improved search capability, efficiently organizing and using the huge amounts of data stored on a hard drive will be next to impossible.

Microsoft has many hopes for the future of WinFS. However, the reality, according to industry experts and analysts, is that making WinFS work as expected is going to be extremely difficult for Microsoft. In fact, developing WinFS may already have caused a delay in the eventual release of Longhorn, which initially was rumored to be early in 2005. The release is now expected to occur sometime later in 2005 or 2006.

by Kyle Schurman

Fragmentation In Windows File Systems
When the hard drive is empty, it’s easy for the file system to fill the clusters in order (Figure 1). However, as files are deleted and as new files are added, the file system might not be able to squeeze files into the available space, forcing it to split the file storage into different areas of the hard drive (Figure 2). Although splitting files across different areas of the hard drive doesn’t affect the file, it does affect system performance because the file system must take additional time to collect each portion of the file from the different areas of the hard drive. When the file system can retrieve an entire file from adjacent clusters, it can work faster. Running a defragmenting program will rearrange the clusters to try to place them closer together (Figure 3). (In the graphic, different files are represented by different colors; each square represents a cluster.)

Cluster Size
In a Windows file system, clusters are the smallest possible storage units on a hard drive. (You can think of a cluster as a drawer in a filing cabinet.) A file can extend over several clusters, but each cluster can only hold one file. Because of this rule, if a file or a portion of a file only occupies a small percentage of a cluster, the remainder of the cluster is wasted, empty space.

Clusters play a key role in hard drive and system performance. Most experts agree that a 4KB cluster size is best for balancing system performance with minimal wasted space. Clusters that are too small result in less wasted, empty space, but they aren’t as efficient in performance. Clusters that are too large have the opposite problems.

Cluster size is dependent on the size of partitions, on the size of hard drives, and (most importantly) on the file system in place. The chart below shows the default cluster sizes for varying sizes of partitions under each type of Windows file system.

FAT/FAT16
Partition/Hard Drive Size    Default Cluster Size
16MB to 127MB                2KB
128MB to 255MB               4KB
256MB to 511MB               8KB
512MB to 1,023MB             16KB
1,024MB to 2,047MB           32KB
2,048MB to 4,096MB           64KB

FAT32
Partition/Hard Drive Size    Default Cluster Size
0.5GB to 8GB                 4KB
8GB to 16GB                  8KB
16GB to 32GB                 16KB
32GB and larger              32KB

NTFS
Partition/Hard Drive Size    Default Cluster Size
0.5GB or less                512 bytes
0.5GB to 1GB                 1KB
1GB to 2GB                   2KB
2GB and larger               4KB
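The “Cluster Size” chart can be read as a simple lookup: find the first row whose upper bound covers the partition size. The Python sketch below encodes the chart under two assumptions that the chart itself leaves open: a size that falls exactly on a shared boundary (such as 8GB under FAT32) is assigned to the smaller row, and lower bounds are not checked. The names TABLES and default_cluster_bytes are illustrative only.

```python
GB = 1024  # partition sizes below are expressed in MB

# (upper bound of partition size in MB, default cluster size in bytes)
TABLES = {
    "FAT16": [(127, 2048), (255, 4096), (511, 8192),
              (1023, 16384), (2047, 32768), (4096, 65536)],
    "FAT32": [(8 * GB, 4096), (16 * GB, 8192),
              (32 * GB, 16384), (float("inf"), 32768)],
    "NTFS": [(512, 512), (1 * GB, 1024),
             (2 * GB, 2048), (float("inf"), 4096)],
}


def default_cluster_bytes(file_system: str, size_mb: float) -> int:
    """Look up the chart's default cluster size for a partition of size_mb."""
    for upper_mb, cluster_bytes in TABLES[file_system]:
        if size_mb <= upper_mb:
            return cluster_bytes
    raise ValueError(f"{file_system} does not cover a {size_mb}MB partition")


print(default_cluster_bytes("FAT32", 20 * GB))  # 20GB partition under FAT32
print(default_cluster_bytes("NTFS", 20 * GB))   # 20GB partition under NTFS
```

For a 20GB partition this returns 16,384 bytes for FAT32 and 4,096 bytes for NTFS, the 16KB and 4KB defaults quoted in the “Data Storage In Windows File Systems” sidebar.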
Windows File Systems
Microsoft currently uses two file systems with its newest Windows operating systems: FAT32 and NTFS. However, Microsoft’s file system technology started with FAT.

FAT/FAT16
In older versions of Windows (Win95 and older), most hard drives used the FAT file system. The latest, 16-bit version of FAT goes by the name FAT16. Win95 OSR2/98/XP users have the option of using FAT or FAT32. (WinXP users can also select NTFS.)

FAT can trace its roots to 1976, when it appeared in BASIC, and 1981, when it appeared in Microsoft’s first version of DOS. The initial FAT only worked with hard drives up to 32MB in size. As hard drives grew throughout the 1980s and early 1990s, FAT morphed in subsequent versions of DOS to allow for larger hard drives. By 1991, Microsoft released DOS 5.0, and it included a 16-bit FAT, which supported hard drives up to 2GB in size in Win95/98 and up to 4GB in size in WinNT. FAT has continued to use 16-bit technology. One of FAT’s biggest advantages is its ability to work with computers running a variety of Windows operating systems.

FAT works best with smaller hard drives because its simplistic design allows it to work quickly and provide good data-access times on a small hard drive. FAT’s simplistic design hampers its ability to work through a lot of data on larger hard drives, though, resulting in poor data-access times. FAT also makes inefficient use of hard drive space, which is a problem that’s compounded on larger hard drives.

Other companies now have the option of licensing Microsoft’s various versions of its FAT file system, thanks to a recent decision by the company. Microsoft says it wants to join a broad industry effort to make core technologies more available to all kinds of companies through licensing.

Because FAT is a popular format for exchanging media between computers and digital devices, and because it works well with many operating systems, Microsoft decided to make FAT one of the first technologies it’s offering under the new policy. By having FAT available for licensing, Microsoft hopes other companies will have an easier time making compatible products. You can find more information on Microsoft’s licensing policy at www.microsoft.com/mscorp/ip/.

FAT32
FAT32 is a 32-bit file system, which allows it to address more clusters than FAT/FAT16. FAT32 initially appeared in 1996 with the OSR2 release of Win95, which allowed for hard drives of larger than 2GB. (Win95 OSR2 was only available to computer manufacturers; Win98 was the first retail OS featuring FAT32.)

Microsoft attempted to make FAT and FAT32 as compatible as possible, and they’re fairly similar. However, a few significant differences can cause problems with running certain types of older software on FAT32 systems. For example, some hard drive compression software made for FAT will not run on FAT32.

FAT32’s biggest advantage over FAT/FAT16 is its ability to use smaller clusters on larger hard drives and partitions. (See the “Cluster Size” chart.) This feature greatly improved the performance of FAT32 over FAT, in most instances.

NTFS
Windows NT 3.1, which appeared in 1993, was the first to use NTFS. WinNT 4.0 users had the option of using FAT or NTFS. Microsoft’s release of the WinXP Home Edition operating system marked the first time Windows home users had the option of using NTFS instead of FAT or FAT32.

Microsoft designed NTFS to be a more reliable and secure option for network users in a corporate environment. NTFS technology gave system administrators the option of assigning permission to individual files within folders, which allows for flexible control of the network’s most secure files.

One area where NTFS shines is in its file organization. NTFS stores file attributes in its MFT (master file table), which is well organized and can hold a large amount of file attribute information. The MFT is similar to FAT’s file allocation table, but the MFT can store a more complex set of data about the files.

The MFT consists of metadata files, which NTFS uses to manage the file system. Some of the important MFT metadata files include:

Bad Cluster File. NTFS uses this metadata file to mark any malfunctioning clusters on the hard drive and avoid storing data there.

Cluster Allocation Bitmap. This map of all clusters on the partition helps the file system find any available clusters.

Log File. Whenever changes occur in the file system, NTFS records the change in the Log File. It doesn’t record the actual changes to the data; it just makes note of when the change occurred, which is useful to particular types of software, such as antivirus software.

MFT Mirror. The MFT Mirror is a copy of the first 16 MFT files, which are responsible for aspects of system operation. NTFS stores the MFT Mirror in a different area of the hard drive than the MFT, allowing the MFT Mirror to serve as a backup if the MFT suffers damage. NTFS reserves the first 12% of the hard drive for the MFT, and the 16 system operation files are at the beginning of that area. The MFT Mirror is in the middle of the other 88% of the hard drive.

Quota Table. Microsoft added this metadata file to the MFT in Win2000/XP. The Quota Table gives you control over the amount of hard drive space any directory can occupy. For example, this feature might be handy at home if you’re sharing a computer with the rest of your family; each person’s folder for personal files can have a limit on its size.

NTFS also uses B-trees when dealing with large folder structures. (A B-tree is a method of organizing and finding files in a database or on a hard drive.) The B-trees let NTFS precisely organize the attributes for the files and folders within the large folder structure, making for efficient searches. This folder structure wasn’t available with FAT.

See www.cpumag.com/cpufeb04/filesystems for information on NTFS search methods.
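The 2GB and 4GB limits quoted above for 16-bit FAT can be checked with a short calculation: the largest volume is roughly the number of addressable clusters multiplied by the cluster size. The Python sketch below assumes a full 2^16 addressable clusters and ignores the handful of reserved entries, so it is an approximation; the function name is illustrative.

```python
KB = 1024
GB = 1024 ** 3


def max_volume_bytes(entry_bits: int, cluster_bytes: int) -> int:
    """Approximate largest volume: addressable clusters times cluster size."""
    return (2 ** entry_bits) * cluster_bytes


# 16-bit FAT with the 32KB and 64KB clusters listed in the Cluster Size chart
print(max_volume_bytes(16, 32 * KB) // GB)  # size in GB with 32KB clusters
print(max_volume_bytes(16, 64 * KB) // GB)  # size in GB with 64KB clusters
```

With 32KB clusters the result is 2GB, and with 64KB clusters it is 4GB, which lines up with the Win95/98 and WinNT limits in the text. It also shows why a 16-bit FAT can only reach larger drives by growing its clusters, at the cost of more slack space.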
NTFS Search Methods
NTFS uses B-trees to aid in organizing and finding files on the
hard drive. (A B-tree diagram is shown below.) At right, you can
see that, in most instances, a tree-type search is faster than a
brute search, which goes from top to bottom in a list. Within the
NTFS tree search, each record specifies whether the file in ques-
tion is above or below the current position. As the search pro-
gresses, the number of files remaining to be searched continues
to be cut in half.
SOURCE: DIGIT-LIFE.COM, WHATIS.COM
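The difference between a brute search and a tree-type search can be illustrated by counting comparisons. The Python sketch below uses a binary search over a sorted list of file names as a simplified stand-in for a B-tree lookup; it is not how NTFS is implemented, and the file names are made up for the example.

```python
def brute_search(names, target):
    """Scan from top to bottom; return how many entries were examined."""
    for steps, name in enumerate(names, start=1):
        if name == target:
            return steps
    return None


def tree_search(sorted_names, target):
    """Halve the remaining range each step; return how many entries were examined."""
    low, high, steps = 0, len(sorted_names) - 1, 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sorted_names[middle] == target:
            return steps
        if sorted_names[middle] < target:
            low = middle + 1   # target sorts after the current position
        else:
            high = middle - 1  # target sorts before the current position
    return None


names = [f"file{i:04d}.txt" for i in range(1000)]  # already in sorted order
print(brute_search(names, "file0999.txt"))
print(tree_search(names, "file0999.txt"))
```

For the last of 1,000 names, the brute search examines all 1,000 entries, while the halving search needs no more than 10.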