KASARDIA – General Presentation and description
Table of Contents
1. Introduction and motivation
1.2 Project overview, client side
1.2.1 XMPP Overview
1.2.1.1 Jabber-Net library
1.2.2 Windows File System Drivers
1.2.2.1 File System API
1.2.2.2 Kernel Level API
1.2.2.3 Driver Based API
1.2.2.4 File System Filter driver
1.2.3 C# and .NET Framework
1.2.3.1 C#
1.2.3.2 .NET Framework
1.3 Project overview, server side
1.3.1 Ignite – OpenFire
1.3.2 PostgreSQL
1.3.3 XIFF API
1.3.4 ASP .NET
1.4 Definition of terms
1.4.1 System
1.4.2 Actor
1.4.3 Client entity
1.4.4 Server entity
1.4.5 Sync manager entity
1.4.6 Database manager entity
1.4.7 Storage manager entity
1.4.8 Backup and restore manager
1.4.9 File version
1.4.10 Web Interface
1.4.11 Sharing manager
1.4.12 VHD
1.4.13 Windows Client GUI
1.4.14 Windows Client driver
1.4.15 Windows client libraries
1.4.16 Windows mobile client GUI
1.4.17 Windows mobile FileSystem Watcher
2. Architecture and implementation
2.1 General Architectural design
2.2.1 XMPP Server Architecture
2.2.2 Server Side Layout
2.2.3 Server components layout
2.2.4 Packet Layout
2.3 Windows Mobile client architecture
2.3.1 Windows mobile architecture overview
2.3.2 The DFS file system
DFS Architecture
2.3.3 File system watcher
2.4.1 Windows desktop architecture overview
2.4.2 Windows desktop client file system driver
2.4.2.1 NTFS File System Overview
Overview
Internals
Limitations
Advantages of NTFS
Disadvantages of NTFS
2.4.2.2 Windows file system filter driver
2.4.2.3 File system filter driver internal implementation
2.4.2.3.1 Filespy.c
2.4.2.3.2 FsMonitorCreateEngine.c
2.4.2.3.3 FsIoImplementation.c
2.5.1 VHD Overview
2.5.1.1 VHD Background
2.5.1.2 Multiple types of VHDs
2.5.1.3 VHD infrastructure – primary components
2.5.1.4 Virtual Disk IO Data Flow
2.5.1.5 VHDs are disks
2.5.1.6 VHD Native Features
2.5.1.7 Remote mounting
2.5.1.8 VHD dismount and removal
2.5.1.9 VHD restrictions
2.5.1.10 Issues for filter writers
2.5.1.11 Resolving issues for filter writers
2.5.1.12 Routines for filter writers
1. Introduction and motivation
This project implements a mechanism for file synchronization and cloud backup between
multiple devices, independent of their hardware platform or software (operating system)
platform. The application monitors one or more file system paths; when it detects a
modification to a file, it syncs the file to the other devices in the mesh and also creates a
version of the modified file in the cloud. All of this is done over the XMPP protocol. The user
will be able to control any of his meshed devices from any computer (e.g. in an internet café)
that has an internet connection and a web browser, even when he is not in front of any of his
devices.
The completion of this project will make collaboration and proactive file synchronization
easier, and will keep all your important data centralized in the cloud. Chances are that you
have a lot of important material on your computer, such as financial documents, email, digital
photos, music and more. Unfortunately, computers are vulnerable to hard drive crashes, virus
attacks, theft and natural disasters, which can erase everything in an instant. Current
statistics show that one in every ten hard drives fails each year. The cost of recovering a
failed hard drive can exceed $7,500, and success is never guaranteed.
Living in an increasingly dynamic IT world, technology users expect proactive behavior from
their devices: they expect all their personal devices to be connected, synced and reachable,
even when they are not in front of them.
Figure 1 - Sync Picture
The picture above shows a typical mesh of three devices: one laptop, one desktop, and one
mobile phone. The globe symbolizes that the devices can be accessed from the web. As you can
see, all the devices are connected and in sync. If one device makes a modification, the
modification is sent to the other two devices, so they always stay in sync with one another.
With the help of virtualization, the Graviton project will allow you to manipulate virtual hard
disks that do not exist on your local storage, but rather in the cloud. You can create, mount,
assign drive letters to, or dismount virtual hard disks from the cloud. This results in a better
and far more cost-effective backup system. Graviton will make scheduled backups of your
important data to your personal virtual hard disks in the cloud. So rather than saving your data
on your device and eventually losing it, you can mount your virtual hard disk with a click and
copy all your data there, into the cloud. Virtual hard disks will also help computers boot from
the cloud.
Graviton is also collaboration software: it makes data sharing between registered users
possible. Users will be able to share any of the data on their personal computers with other
registered users, and set different permissions such as read-only, write, read-write, delete
and execute.
1.2 Project overview, client side
As discussed in the previous chapter, the Graviton service will be able to serve clients that
follow a certain in-house protocol, independent of their hardware platform (laptops, mobile
phones, desktops) and software platform (Microsoft Windows, Linux, Mac OS, etc.).
For this project specifically, the implemented client is for the Microsoft Windows operating
system, and the core of the syncing engine is a file system filter driver that sits above the
NTFS or FAT file system driver layer.
The GUI application is implemented in C#, and the libraries that link the driver with the GUI
are written in ANSI C / C++.
To make anything happen, the clients on each device first need to understand the XMPP protocol
(RFC 3920 and RFC 3921). To communicate with an XMPP server, I chose the Jabber-Net library for
easier access to, and greater control over, the protocol.
1.2.1 XMPP Overview
Extensible Messaging and Presence Protocol (XMPP) is an open, XML-based protocol
originally aimed at near-real-time, extensible instant messaging (IM) and presence information
(e.g., buddy lists), but now expanded into the broader realm of message oriented middleware.
It remains the core protocol of the Jabber Instant Messaging and Presence technology. Built to be
extensible, the protocol has been extended with features such as Voice over Internet Protocol and
file transfer signaling.
Unlike most instant messaging protocols, XMPP is an open standard. Like e-mail, it is an open
system where anyone who has a domain name and a suitable Internet connection can run his own
Jabber server and talk to users on other servers. The standard server implementations and many
clients are also free and open source software.
The Internet Engineering Task Force (IETF) formed an XMPP Working Group in 2002 to
formalize the core protocols as an IETF instant messaging and presence technology. The XMPP
WG produced four specifications which were approved by the IESG as Proposed Standards in
2004. RFC 3920 and RFC 3921 are now undergoing revisions in preparation for advancing them
to Draft Standard within the Internet Standards Process. The XMPP Standards Foundation
(formerly the Jabber Software Foundation) is active in developing open XMPP extensions.
However, no technology correctly implements the RFCs in full.
XMPP-based software is deployed on thousands of servers across the Internet and by 2003
was used by over ten million people worldwide, according to the XMPP Standards Foundation.[2]
Popular commercial servers include the Gizmo Project, Nimbuzz and Google Talk. Popular
client applications include the freeware clients offered by Google, Nimbuzz and the Gizmo
Project, multi-protocol instant messengers such as iChat and Pidgin (formerly Gaim), and free
dedicated clients such as Psi and Gajim. Google Talk provides XMPP gateways to its service.
Google Wave's federation protocol is an open extension to the XMPP protocol.
1.2.1.1 Jabber-Net library
Jabber-Net is a set of libraries for accessing Jabber functionality from .NET. It is written in
C#, but should be accessible from other .NET languages such as VB.NET. Components exist for
connecting to a Jabber server either as a client or as a component. As you explore, you'll find
there are some other goodies buried inside, like Trees, CommandLine processing, etc.
The library consists of .NET controls for sending and receiving Extensible Messaging and
Presence Protocol (XMPP) traffic, also known as Jabber. The library can handle client
connections, server component connections, presence, service discovery, and the like.
1.2.2 Windows File System Drivers
In computing, a file system (often also written as filesystem) is a method for storing and
organizing computer files and the data they contain to make it easy to find and access them.
File systems may use a data storage device such as a hard disk or CD-ROM and involve
maintaining the physical location of the files; they might provide access to data on a file
server by acting as clients for a network protocol (e.g., NFS, SMB, or 9P clients); or they may
be virtual and exist only as an access method for virtual data (e.g., procfs). A file system is
distinct from a directory service and from a registry.
More formally, a file system is a special-purpose database for the storage, organization,
manipulation, and retrieval of data.
1.2.2.1 File System API
A file system API is an application programming interface through which an operating system
interfaces with file system code. The operating system usually provides abstractions for
accessing different file systems transparently to userland programs, and in this sense it is
analogous to device driver APIs that provide abstracted access to hardware.
Some file system APIs may also include interfaces for maintenance operations, such as
creating or initializing a file system, verifying the file system for integrity, and defragmentation –
although these are more often implemented independently from the file system code.
Microsoft Windows NT, 2000, and XP provide their default file system API through file system
drivers for the NTFS, FAT and FAT32 file systems.
1.2.2.2 Kernel Level API
The API is "kernel-level" when the kernel not only provides the interfaces for filesystem
developers but is also the space in which the filesystem code resides.
It differs from the old scheme in that the kernel uses its own facilities to talk to the
filesystem driver and vice versa, as opposed to the kernel handling the filesystem layout and
the filesystem directly accessing the hardware.
It is not the cleanest scheme, but it avoids the major rewrites that the old scheme required.
With modular kernels it allows filesystems to be added like any other kernel module, even
third-party ones. With non-modular kernels, however, it requires the kernel to be recompiled
with the new filesystem code (and in closed-source kernels, this makes third-party filesystems
impossible). Unixes and Unix-like systems such as Linux have used this scheme.
There is a variation of this scheme used in MS-DOS (DOS 4.0 onward) and compatibles to
support CD-ROM and network filesystems. Instead of adding code to the kernel, as in the old
scheme, or using kernel facilities, as in the kernel-based scheme, it traps all calls to a file
and determines whether each should be redirected to the kernel's equivalent function or handled
by the specific filesystem driver; the filesystem driver then accesses the disk contents
"directly" using low-level BIOS functions.
1.2.2.3 Driver Based API
The API is "driver-based" when the kernel provides facilities but the filesystem code resides
totally external to the kernel (not even as a module of a modular kernel).
It is a cleaner scheme because the filesystem code is totally independent: it allows
filesystems to be created for closed-source kernels, and filesystems to be added to or removed
from the system while it is running.
Examples of this scheme are the Installable File Systems (IFSs) of Windows NT and OS/2.
1.2.2.4 File System Filter driver
A file system filter driver intercepts requests targeted at a file system or another file system
filter driver. By intercepting the request before it reaches its intended target, the filter driver can
extend or replace functionality provided by the original target of the request. Examples of file
system filter drivers include anti-virus filters, backup agents, and encryption products. To
develop file systems and file system filter drivers, use the IFS (Installable File System) Kit,
which is provided with the Windows Driver Kit (WDK).
Filter Manager and Minifilters Basics. The Filter Manager is a file system filter driver
provided by Microsoft that simplifies the development of third-party filter drivers and solves
many of the problems of the legacy filter driver model, such as the lack of control over load
order, which it addresses through assigned altitudes. A filter driver developed to the Filter
Manager model is called a minifilter. Every minifilter driver has an assigned altitude, which
is a unique identifier that determines where the minifilter is loaded relative to other
minifilters in the I/O stack. Altitudes are allocated and managed by Microsoft.
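The interception and ordering described above can be illustrated with a small user-mode
sketch. It is only a model of the idea, not driver code: the class and function names are
hypothetical, and integer altitudes stand in for the identifiers that Microsoft assigns. It
assumes that a filter with a higher altitude sits further from the file system and therefore
sees a request first.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Request:
    operation: str  # e.g. "write" or "delete"
    path: str


# A filter returns the request to pass it down the stack, or None to block it.
Filter = Callable[[Request], Optional[Request]]


class FilterStack:
    """Toy model of filters layered above a file system (hypothetical names)."""

    def __init__(self, file_system: Callable[[Request], str]) -> None:
        self._file_system = file_system
        self._filters: List[Tuple[int, str, Filter]] = []

    def attach(self, altitude: int, name: str, flt: Filter) -> None:
        self._filters.append((altitude, name, flt))
        # Assumption: higher altitude means further from the file system,
        # so the filter is called earlier on the way down.
        self._filters.sort(key=lambda entry: entry[0], reverse=True)

    def order(self) -> List[str]:
        return [name for _, name, _ in self._filters]

    def send(self, request: Request) -> str:
        current: Optional[Request] = request
        for _, name, flt in self._filters:
            current = flt(current)
            if current is None:
                return f"blocked by {name}"
        # No filter blocked the request, so it reaches its original target.
        return self._file_system(current)
```

In this model a sync monitor would simply record every request it sees and pass it on
unchanged, which is the role the filter driver plays in the syncing engine.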
1.2.3 C# and .NET Framework
1.2.3.1 C#
C# (pronounced "C Sharp") is a multi-paradigm programming language encompassing
functional, imperative, generic, object-oriented (class-based), and component-oriented
programming disciplines. It was developed by Microsoft within the .NET initiative and later
approved as a standard by Ecma (ECMA-334) and ISO (ISO/IEC 23270). C# is one of the
programming languages designed for the Common Language Infrastructure.
C# is intended to be a simple, modern, general-purpose, object-oriented programming
language. Its development team is led by Anders Hejlsberg, the designer of Borland's Turbo
Pascal. It has an object-oriented syntax based on C++. It was initially named Cool, which stood
for "C-like Object Oriented Language". However, in July 2000, when Microsoft made the project
public, the name of the programming language was given as C#. The most recent version of the
language is 3.0 which was released in conjunction with the .NET Framework 3.5 in 2007. The
next proposed version, 4.0, is in development.
1.2.3.2 .NET Framework
The Microsoft .NET Framework is a software framework that can be installed on computers
running Microsoft Windows operating systems. It includes a large library of coded solutions to
common programming problems and a virtual machine that manages the execution of programs
written specifically for the framework. The .NET Framework is a key Microsoft offering and is
intended to be used by most new applications created for the Windows platform.
The framework's Base Class Library provides a large range of features including user
interface, data and data access, database connectivity, cryptography, web application
development, numeric algorithms, and network communications. The class library is used by
programmers, who combine it with their own code to produce applications.
Programs written for the .NET Framework execute in a software environment that manages
the program's runtime requirements. Also part of the .NET Framework, this runtime environment
is known as the Common Language Runtime (CLR). The CLR provides the appearance of an
application virtual machine so that programmers need not consider the capabilities of the specific
CPU that will execute the program. The CLR also provides other important services such as
security, memory management, and exception handling. The class library and the CLR together
constitute the .NET Framework.
Version 3.0 of the .NET Framework is included with Windows Server 2008 and Windows
Vista. The current version of the framework can also be installed on Windows XP and the
Windows Server 2003 family of operating systems. A reduced version of the .NET Framework is
also available on Windows Mobile platforms, including smartphones as the .NET Compact
Framework. Version 4.0 of the framework was released as a public Beta on 20 May 2009.
1.3 Project overview, server side
Having looked at the client side, the components it needs and how they work individually, it is
now time to look at the server side.
The server is the most important component of all, as it needs to handle requests from
thousands of clients connected from all sorts of devices.
Like the client, the server has to support and implement the XMPP protocol discussed earlier in
chapter 1.2.1. For this, the server machine will run an XMPP server; I chose the server from
Ignite, called Openfire (discussed in chapter 1.3.1). Openfire works with multiple database
servers, from Microsoft SQL Server to MySQL, but our server will make use of PostgreSQL.
As mentioned earlier, the service is also provided through the web browser, so the server also
has to provide a web interface for the user. This is done using ASP.NET and an XMPP library for
Flex called the XIFF API.
1.3.1 Ignite – OpenFire
Openfire is a real time collaboration (RTC) server licensed under the Open Source GPL. It
uses the only widely adopted open protocol for instant messaging, XMPP (also called Jabber).
Openfire is easy to set up and administer, while offering solid security and performance. Most
administration of the server is done through a web interface, which runs on ports 9090 (HTTP)
and 9091 (HTTPS) by default. Administrators can connect from anywhere and edit the server's
settings, add and delete users, manage conference rooms, and so forth.
Openfire supports the following features:
- Web-based administration panel
- Plug-in interface
- Customizable
- SSL/TLS support
- User-friendly web interface and guided installation
- Database connectivity (i.e. embedded Apache Derby or other DBMS with JDBC 3 driver)
for storing messages and user details
- LDAP connectivity
- Platform independent, pure Java
- Full integration with Spark Jabber client
A proprietary extension to Openfire allows multiple server instances to work together in one
clustered environment.
1.3.2 PostgreSQL
PostgreSQL is an object-relational database management system (ORDBMS). It is released
under a BSD-style license and is thus free software. As with many other open-source programs,
PostgreSQL is not controlled by any single company; it is developed by a global community of
developers and companies.
1.3.3 XIFF API
XIFF is an Open Source Flash library for instant messaging and presence clients using the
XMPP (Jabber) protocol. XIFF includes an extension architecture that makes it easy to add
functionality for additional protocol extensions, or even your own special-needs extensions.
There are quite a few extensions already included in the library, giving it support for XML-RPC
over XMPP (XEP-0009), Multi-user conferencing (XEP-0045), Service browsing (XEP-0030),
and XHTML message support (XEP-0071).
1.3.4 ASP .NET
ASP.NET is a web application framework developed and marketed by Microsoft to allow
programmers to build dynamic web sites, web applications and web services. It was first released
in January 2002 with version 1.0 of the .NET Framework, and is the successor to Microsoft's
Active Server Pages (ASP) technology. ASP.NET is built on the Common Language Runtime
(CLR), allowing programmers to write ASP.NET code using any supported .NET language.
ASP.NET pages, known officially as "web forms", are the main building block for application
development. Web forms are contained in files with an ".aspx" extension; these files typically
contain static (X)HTML markup, as well as markup defining server-side Web Controls and User
Controls, where the developers place all the required static and dynamic content for the web
page. Additionally, dynamic code which runs on the server can be placed in a page within a
block <% -- dynamic code -- %>, which is similar to other web development technologies such as
PHP, JSP, and ASP. This practice is generally discouraged except for the purposes of data
binding, since it requires more calls when rendering the page.
1.4 Definition of terms
1.4.1 System
We will call the "system" the actual product under this specification.
1.4.2 Actor
An actor is an entity that plays a specific role in its relationship with the system: it
interacts with the system and makes use of the system's resources.
1.4.3 Client entity
The "client entity" is any device or PC that has a file system and can follow the server
protocol. A device connected to the server is recognized as a client if it can log in to the
server with a valid username and password. After that, it should be able to create XMPP IQ
requests that the server can process.
An entity that successfully logs in to the server but cannot send any valid XMPP IQ is still
recognized as a client, but the server will not be able to provide it with any functionality.
1.4.4 Server entity
The "server entity" is a component that handles client requests and syncs data according to a
set of rules. The server entity is actually a set of components that together offer the client
the functionality it requests. The server is composed of the XMPP server, the sync manager
(embedded in-house in the XMPP server), the database manager and the storage manager.
The server recognizes a client only after the entity connected to it has successfully logged
in; only then does it provide functionality for the client's requests. If the server does not
recognize a request, it sends an appropriate error message back to the client. The same happens
if a request type is recognized but the request cannot be fulfilled. A success message is sent
to the requesting client if the request was processed and, where applicable, the requested data
is returned to the client.
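The request handling rules above can be summarized in a minimal sketch. It is an illustration
under stated assumptions rather than the server's actual code: the class, the dictionary-based
replies and the reason strings ("not-authorized", "unknown-request", "cannot-fulfil") are all
hypothetical, and plain-text passwords are used only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Server:
    accounts: Dict[str, str]                     # username -> password (hypothetical store)
    handlers: Dict[str, Callable[[dict], dict]]  # request type -> handler function
    logged_in: Set[str] = field(default_factory=set)

    def login(self, username: str, password: str) -> bool:
        if self.accounts.get(username) == password:
            self.logged_in.add(username)
            return True
        return False

    def handle(self, username: str, request_type: str, payload: dict) -> dict:
        # An entity is only treated as a client after a successful login.
        if username not in self.logged_in:
            return {"type": "error", "reason": "not-authorized"}
        handler = self.handlers.get(request_type)
        if handler is None:
            # The request type is not recognized.
            return {"type": "error", "reason": "unknown-request"}
        try:
            data = handler(payload)
        except Exception as exc:
            # The request type is recognized but the request cannot be fulfilled.
            return {"type": "error", "reason": "cannot-fulfil", "detail": str(exc)}
        # Success: the reply carries the requested data, if any.
        return {"type": "result", "data": data}
```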
1.4.5 Sync manager entity
The "sync manager" is a server component that syncs data files among devices based on a set of
user rules. The sync manager is embedded in the XMPP server as a plug-in. It works through a
set of in-house XMPP stanza requests (e.g. an IQ UpdateFileRequest with the parameters update
size, update offset and buffer).
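As an illustration of such a request, the sketch below builds an IQ stanza carrying an
UpdateFileRequest and applies it to a file's content on the receiving side. The wire format is
an assumption: the namespace, the attribute names (path, size, offset) and the base64 encoding
of the buffer are hypothetical, since the actual in-house stanza layout is not given here.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical namespace for the in-house sync extension.
SYNC_NS = "urn:example:graviton:sync"


def build_update_file_request(iq_id: str, path: str, offset: int, data: bytes) -> str:
    """Serialize an IQ 'set' stanza that carries one partial file update."""
    iq = ET.Element("iq", {"type": "set", "id": iq_id})
    update = ET.SubElement(
        iq,
        "UpdateFileRequest",
        {"xmlns": SYNC_NS, "path": path, "size": str(len(data)), "offset": str(offset)},
    )
    # Binary data cannot travel as raw XML text, so the buffer is base64-encoded.
    update.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(iq, encoding="unicode")


def apply_update_file_request(content: bytes, stanza: str) -> bytes:
    """Return the file content after applying the update carried by the stanza."""
    iq = ET.fromstring(stanza)
    update = iq.find(f"{{{SYNC_NS}}}UpdateFileRequest")
    if update is None:
        raise ValueError("stanza carries no UpdateFileRequest payload")
    offset = int(update.get("offset"))
    data = base64.b64decode(update.text or "")
    if len(data) != int(update.get("size")):
        raise ValueError("declared size does not match buffer length")
    if offset > len(content):
        # Writing past the end of the file: pad the gap with zero bytes.
        content = content.ljust(offset, b"\0")
    return content[:offset] + data + content[offset + len(data):]
```

Sending only the changed range (offset plus buffer) rather than the whole file is what lets
small edits to large files be propagated cheaply.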
The sync manager is the most complicated part of the project to implement, and it also needs to
be the most efficient: it requires a small cache manager on the server so that some kinds of
updates work really fast.
The sync manager performs tasks such as ensuring that a "synced file" is on all the devices. A
file is considered synced if its newest version is on all the devices at the same time. Any
further modification to that file clears its "synced file" bit.
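The "synced file" rule can be modelled with a few lines of bookkeeping. This is a sketch under
assumptions, not the plug-in's implementation: the class and method names are hypothetical,
and versions are represented as simple counters per file path.

```python
from typing import Dict, Iterable, List, Set


class SyncState:
    """Tracks which devices hold the newest version of each file (hypothetical model)."""

    def __init__(self, devices: Iterable[str]) -> None:
        self.devices: Set[str] = set(devices)
        self.latest: Dict[str, int] = {}           # path -> newest version number
        self.held: Dict[str, Dict[str, int]] = {}  # path -> {device: version it holds}

    def record_modification(self, device: str, path: str) -> int:
        # A modification creates a new newest version that only this device holds,
        # which implicitly clears the "synced" state of the file.
        version = self.latest.get(path, 0) + 1
        self.latest[path] = version
        self.held.setdefault(path, {})[device] = version
        return version

    def record_delivery(self, device: str, path: str) -> None:
        # The newest version has been delivered to the given device.
        self.held.setdefault(path, {})[device] = self.latest[path]

    def pending_devices(self, path: str) -> List[str]:
        newest = self.latest.get(path)
        versions = self.held.get(path, {})
        return sorted(d for d in self.devices if versions.get(d) != newest)

    def is_synced(self, path: str) -> bool:
        # Synced means every device in the mesh holds the newest version.
        return path in self.latest and not self.pending_devices(path)
```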
The user can choose to sync a file manually across his devices if any problems or malfunctions
appear in the sync process.
1.4.6 Database manager entity
The database manager is the actor that deals with database and table management. The database
manager knows how the whole project is organized behind the scenes: how tables relate to one
another, which query returns the protected folders of a user, or a user's space quota.
The database manager sits on top of the PostgreSQL DBMS. The layer on top will be an API in the
form of a DLL-style library exposing all sorts of built-in queries. (E.g. LONG
GetUserFreeSpaceQuota (IN char *Username, OUT PLARGE_INTEGER UserQuota) puts the free space
quota of the specified user in UserQuota and returns the appropriate error code if an error
occurs.)
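A query of this kind might look like the sketch below. Everything about the schema is an
assumption: the users and files tables and their columns are hypothetical, and an in-memory
SQLite database stands in for PostgreSQL so that the example is self-contained. It also raises
an exception for an unknown user instead of returning an error code.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical tables: one row per user with a quota, one row per stored file.
    conn.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, quota_bytes INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE files ("
        "username TEXT NOT NULL REFERENCES users(username), "
        "path TEXT NOT NULL, "
        "size_bytes INTEGER NOT NULL)"
    )


def get_user_free_space_quota(conn: sqlite3.Connection, username: str) -> int:
    """Return the user's quota minus the total size of the user's stored files."""
    row = conn.execute(
        "SELECT u.quota_bytes - COALESCE(SUM(f.size_bytes), 0) "
        "FROM users u LEFT JOIN files f ON f.username = u.username "
        "WHERE u.username = ? "
        "GROUP BY u.username, u.quota_bytes",
        (username,),
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown user: {username}")
    return row[0]
```

Keeping such queries inside one library means the other server components never need to know
how the tables are related.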
1.4.7 Storage manager entity
The storage manager handles all storage-related issues on the server, such as total space, free
space, user folders, and VHD options and settings. Users can choose whether they want their
storage on a VHD or on the server's direct storage; this is also handled by the storage
manager.
The storage manager works closely with the database manager: every query and update it makes is
reflected in the databases. The storage manager also allows the user to extend his current
space quota or move to a lower one.
1.4.8 Backup and restore manager
As its name suggests, the backup and restore manager is a server component that maintains the
user's backup metadata, such as backup schedules, backed-up folders and file versions.
The backup and restore manager knows which server folder paths a given user uses for backups.
The restore manager helps the user choose which files he wants to restore from the server. The
user chooses the files to restore by date or by version (see chapter 1.4.9).
The restore manager lets the user choose how the files are restored: the user can either
download the files via HTTP or FTP, or have them downloaded automatically by the desktop
application and restored where needed.
The restore manager can restore files even if the user is not in front of any of his devices.
The user can log in from a web browser and download any file he needs to the computer he is in
front of.
The restore manager essentially offers the user interfaces for accessing his files. It is
tightly integrated with the sharing manager (see chapter 1.4.11).
1.4.9 File version
A "file version" of a file is any previous content of that file that is kept on the server. The
versioning that the server supports will vary with file size and version history. The user can
limit the number of versions the server keeps for a file. If the number of
versions exceeds this maximum, the server will automatically delete the oldest versions of
the file and shift the remaining ones, so that the latest version is always backed up and present
on the server.
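The pruning rule above can be sketched in a few lines. This is a minimal Python illustration that assumes versions are kept in a list ordered from oldest to newest; the function name and the list representation are hypothetical and say nothing about how the server actually stores versions.

```python
def add_version(versions, new_content, max_versions):
    """Append a new version and drop the oldest ones beyond max_versions.

    versions is a list ordered from oldest to newest; it is modified in place.
    """
    if max_versions < 1:
        raise ValueError("max_versions must be at least 1")
    versions.append(new_content)
    # Delete from the front, so the newest version is always kept.
    del versions[:-max_versions]
    return versions
```

Because the list is trimmed from the front, the remaining versions shift down by themselves and the most recent content is never the one that gets deleted.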
1.4.10 Web Interface
The web interface is the actor that lets the user easily, remotely and securely access backed-up
and stored data with the help of a web browser. It helps the user interact with his files when he
is not in front of one of his computers.
The user will be able to log in online and control any of his devices, performing actions on them
such as copying, deleting, renaming or modifying files.
Figure 2 shows the "Welcome" web interface. The connected user can see his devices and
which of them are online. If logged in from a new device, he can choose to add it
to his sync mesh. The user can see his storage and remotely control his devices.
The user will be able to perform file sharing operations via the web, and also access files
shared by others.
The user will also be able to register a new user, change his password or change settings from
the browser.
1.4.11 Sharing manager
A very important component of the service is the sharing manager. The user can choose to
share files on his devices with other registered users, or with unregistered users by making the
files public.
The sharing manager works closely with the restore manager, which offers the interfaces used to
reach the desired files.
The user will be able to select the users with whom he wants to share the files, and set
permissions such as read-only, read-write or write-only.
The public files that a user exposes will be available either through a web link or through an
anonymous login to an FTP server. The files will be available for download for a limited
period of time (default 30 days).
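A minimal sketch of how a share record could combine per-user permissions with the public download period is given below, in Python. The Share class, its fields and the can method are hypothetical names chosen for the example; only the 30-day default comes from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

DEFAULT_PUBLIC_LIFETIME = timedelta(days=30)  # default period from the text


@dataclass
class Share:
    path: str
    created: datetime
    # user name -> set of permissions, e.g. {"read"} or {"read", "write"}
    permissions: dict = field(default_factory=dict)
    public: bool = False
    lifetime: timedelta = DEFAULT_PUBLIC_LIFETIME

    def can(self, user, action, now):
        """Return True if user may perform action ("read" or "write") at time now."""
        if action in self.permissions.get(user, set()):
            return True
        if self.public and action == "read":
            # Public files are downloadable only for a limited period.
            return now < self.created + self.lifetime
        return False
```

Explicit permissions are checked first, so a registered user keeps his access after the public link has expired.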
1.4.12 VHD
A Virtual Hard Disk (VHD) is a file format containing the complete contents and structure
representing a Hard Disk Drive, and is used to store virtual operating systems and their
associated programs in a single file by various virtualization programs or a virtual machine. The
format was created by Connectix which was later acquired by Microsoft for Virtual PC. Since
June 2005 Microsoft has made the VHD Image Format Specification available to third parties
under the Microsoft Open Specification Promise.
Virtual Hard Disks allow multiple operating systems to reside on a single host machine. This
enables developers to test software on different operating systems without the cost or hassle of
actual hardware.
The ability to directly modify a virtual machine's hard disk from a host server supports many
applications, including:
- Moving files between a VHD and the host file system
- Backup and recovery
- Antivirus and security
- Image management and patching
- Disk conversion (physical to virtual, and so on)
- Life-cycle management and provisioning
1.4.13 Windows Client GUI
The "Windows Client GUI" is the main interface through which the user interacts with the application.
The user can indirectly control the "Sharing manager" (see 1.4.11), the "Backup and
restore manager" (see 1.4.8), the "Sync manager" (see 1.4.5) and the settings of the "Windows
Client driver" (see 1.4.14).
The Windows Client GUI (WCG) will also let the user administer his
contacts. He can add, delete or customize the details of any of his contacts. He can choose to
share files, see what others have shared with him, and collaborate.
Figure 3 shows a typical contacts tree with three groups, two offline contacts and one
online contact, and a search bar showing only the matched users.
As shown in Figure 3, a logged-in user can see his contact roster and search through his
contacts. He can also create contact groups to manage them more easily.
As Figure 1 indicates, the logged-in user can choose to manage any of his devices from the one
he is connected at.
The WCG is essentially a layer between the managers and services offered on the server and
the client trying to gain access to them.
1.4.14 Windows Client driver
The "Windows Client driver" component is one of the hardest to implement, and the one that
makes the difference between a normal file sync and collaboration application and Graviton.
The driver is a file system filter driver, which can be configured to watch certain file
system paths and look for any modifications to files. The driver will have a local backup folder
where it saves the files as they are modified. The driver will make backup copies and detect file
deletion, file modification (offset and length) and file creation.
The driver also provides callbacks, so the user-mode application can register to be called when a
certain update happens in the driver.
The user-mode application will send each update from the driver to the server, and the server
will route it to all the devices in the mesh. The driver will ignore any modifications made by the
user-mode application. This way the user-mode application can patch the files without having to
worry about the driver interpreting the patches as file modifications.
The trick is that the driver sends the user-mode application only the modified part of the file,
together with its offset and length. So if a huge file is partly modified, the driver will not
request a sync of the whole file, but only of the modified part. This way the client saves
bandwidth and upload time. Each such update can become one version of the file, if the user
chooses to version the file on every update. The user can configure how the versions are
made, according to the number of versions or the storage occupied.
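To make the partial update idea concrete, here is a small Python sketch of an update that carries only the modified bytes and their offset, and of how a receiving device could patch its copy. The FileUpdate structure and apply_update function are hypothetical; the real driver and its callback interface are not shown, and this sketch only models overwrites and appends, not truncation.

```python
from dataclasses import dataclass


@dataclass
class FileUpdate:
    """One change reported by the driver: only the modified bytes."""
    path: str
    offset: int
    data: bytes  # the length of the change is len(data)


def apply_update(content, update):
    """Return new file content with update.data written at update.offset."""
    buf = bytearray(content)
    if update.offset > len(buf):
        # Writing past the end of the file: pad the gap with zero bytes.
        buf.extend(b"\x00" * (update.offset - len(buf)))
    end = update.offset + len(update.data)
    buf[update.offset:end] = update.data
    return bytes(buf)
```

For a large file in which a few bytes changed, only those bytes travel over the network, which is where the bandwidth saving comes from.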
One last thing that needs to be mentioned is that the driver cannot work if
attached to a VHD, because it does not meet the concurrency requirements. A version of this
driver will be released that offers its functionality to the user only through a callback
interface, and an API will be exposed as a dynamic library to make direct calls to the driver and
change its settings at runtime.
1.4.15 Windows client libraries
The Windows client libraries consist of .NET dynamic link libraries or native Win32/Win64
libraries that the whole application uses to offer the client the functionality he expects. These
libraries expose functions to manipulate entities such as the sync manager (see 1.4.5), the backup
and restore manager (see 1.4.8) and the sharing manager (see 1.4.11).
A typical library of this kind helps the programmer easily integrate into the main application a
login box, a register box or even a contacts tree view showing which contacts are
online or offline.
Figure 4 - The login box is generated by a function call in the Login library. The library
also exposes functions that support login without a login form.
As shown in Figure 4, the programmer can make a login call from the login library
and the login box will automatically be shown by the API. For example, the login library class
has four constructors:
- public GaMiTechXMPPLoginUser()
- public GaMiTechXMPPLoginUser(string User, string Pass)
- public GaMiTechXMPPLoginUser(string User, string Pass, string ServerName, string NetworkHostName, int port)
- public GaMiTechXMPPLoginUser(JabberClient jClient)
and exposes the following public methods:
- public JabberClient Login()
- public JabberClient Login(ref GaMiTechLoginResult LogResult)
- public JabberClient LoginNoForm()
- public JabberClient LoginNoForm(bool RembemberCredentials)
Each of these functions will use the private variable of type JabberClient and try to log it in to
the server.
The class also exposes some public variables showing the last error code and the login status.
This way, all the programmer has to do is create a new instance of the class
GaMiTechXMPPLoginUser and call Login(). As you can see, there is also an option
to log in with no form. This is typically used when the user wants his
credentials remembered on the machine he is on, so that the application can log
him in automatically without showing the form.
The other libraries follow the model of GaMiTechXMPPLoginUser.
Basically, the Windows client libraries will speed up programming and offer new
ways of implementing the main application.
1.4.16 Windows mobile client GUI
As mentioned earlier, the application will also be able to keep mobile smartphones
and Pocket PCs in sync. All that is needed is that the mobile device has an operating system and
can connect to the internet. Of course, not all functionality is implemented in the mobile
version of the application, due to the limitations of mobile devices, but the necessary and most
important features will be there: file sync, sharing, backup and restore.
Figure 5 shows a device emulator emulating a Pocket PC running Windows Mobile 5.0.
The running application is a proof of concept showing that data is sent
from the mobile phone to a Linux server.
As suggested by Figure 5, the first mobile client I will implement will only support Windows
Mobile. The server protocol is, of course, platform independent, and clients can be written for
multiple platforms. I chose to start with the Windows Mobile client because of the ease with
which it can be implemented.
1.4.17 Windows mobile FileSystem Watcher
The Windows mobile FileSystem Watcher class is probably the biggest and most important
piece of code that this project needs for mobile devices.
Microsoft does not provide a FileSystemWatcher class in the .NET Compact
Framework. Yet even if the functionality on Windows Mobile devices is kept to a minimum,
a FileSystemWatcher class is a must. Luckily, there is an open source
community that runs projects extending the functionality of the .NET Compact Framework:
OpenNETCF is committed to open source projects that help the mobile and embedded
development community, whether for enterprise or commercial
development.
The only difference here between the Windows client and the Windows Mobile client is that
the FileSystemWatcher (on the mobile client) does not come close to the functionality offered
by the driver (in the Windows client). The compromise is small, though, because mobile
devices do not have large files and do not have such high I/O activity. The
conclusion is that the Windows Mobile client will only sync files in their entirety, not
parts of a file.
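Whole-file change detection of this kind can be illustrated with a simple snapshot comparison. The Python sketch below is a polling stand-in written for this document, not the OpenNETCF FileSystemWatcher class or its API. It records the size and modification time of every file under a folder and reports which files were created, deleted or modified between two snapshots.

```python
import os


def snapshot(root):
    """Map each file path under root to (size, modification time in ns)."""
    state = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            state[path] = (info.st_size, info.st_mtime_ns)
    return state


def diff_snapshots(old, new):
    """Return (created, deleted, modified) as sorted lists of paths."""
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, deleted, modified
```

A watcher like this only knows that a file changed, not which bytes changed, which is why the mobile client has to send the entire file.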
2. Architecture and implementation
2.1 General Architectural design
As discussed in the Introduction and motivation chapter, the main modules that the
application must implement to have minimum functionality are the following: a server using the
XMPP protocol as the main means of communication, a desktop client (on a Microsoft Windows
platform) and a mobile client (on a Microsoft Windows Mobile platform).
The server should also have some modules implemented, such as the sync manager, backup and
restore manager, database manager, device manager, sharing manager, storage manager, and the
web interface. A server with these modules implemented should be able to handle
any sync, upload, download or restore requests.
The desktop client should also have some modules implemented to offer the minimum
functionality: the Windows client GUI, the Windows client driver and the Windows client
libraries. With these implemented, a client should be able to correctly make requests to the
server and offer the user the functionality expected.
The mobile client should also have a minimum set of modules before
it can communicate with the server: the Windows mobile client GUI and the Windows mobile
client libraries (including libraries for file system watching and for the XMPP protocol).
2.2.1 XMPP Server Architecture
As discussed in the first chapter, we will use a server called OpenFire, which already implements
the XMPP protocol. Why I chose XMPP, what architectural benefits and drawbacks it
has, what the compromises are and how it helps me bring the application to its final
stage are the subjects of this second chapter.
First of all, what is Jabber, or an XMPP server?
Jabber enables you to provide built-in or client-based services based on an open,
asynchronous, extensible, decentralized, and secure XML protocol riding directly on TCP/IP to
provide real-time exchange of messages and presence information between two endpoints on the
open Internet or between the open Internet and a corporate intranet. Quite a mouthful, wasn't it?
Let us pick that statement to bits and explain it a bit more fully.
…Built-in Services…
A Jabber server is the combination of a message switch and service backplane, which hosts
slots for plug-in components. There are three ways to fill a plug-in slot in the Jabber backplane
and a fourth way to offer a service that looks like an ordinary client, explored in the next section
and shown in Figure 1.1. A server is responsible for providing the following minimal set of
services:
- Handling client connections and communicating directly with Jabber clients
- Communicating with other Jabber servers to give clients homed on other servers seamless
access to clients homed on this server
- Acting as a "plug-in manager" to manage and pass messages between components loaded
with the server
Library Modules
First, you can create a component by creating a library module and then linking it directly into
a Jabber server. When the server is started/restarted, the service becomes available to clients and
there is a protocol for presenting loaded services to clients. These are shown in Figure 1 as "lib
modules." They are intimately entwined with the server and incur the least run-time overhead of
the service connection methods, at the expense of requiring development in a C-language
environment.
TCP/IP Sockets
For those of us who have sworn off programming in C except in dire emergencies, it is also
possible to install a service across a TCP/IP socket connection and speak back and forth to the
Jabber server that way. If you want to write in Java or Python or Perl, you're free to do it that
way. As long as you write the correct XML messages back and forth, the server considers the
component "plugged in." A nice advantage here is that you can load-level your components across
a set of servers to create a naturally scalable set of services. These socket-based services can be
either the initiator of the connection (that is, the server waits for them to connect) or the target of
the connection (that is, the server attempts to connect to them when it starts). This provides
additional flexibility in that the server and its component services don't necessarily have to be
started all together.
High-Level Architecture
The Jabber server is made up of several components that interact by exchanging
messages on an internal message bus. OpenFire includes the components needed to get
an instant messaging server up and running, and more:
- Jabber Session Manager (JSM). The JSM manages the registration of new user
accounts, authenticates users, and manages presence information. It is by far the largest, most
complicated component shipped with OpenFire.
- c2s (Client-to-Server). c2s handles connections between the server and its clients. Its
main job is formatting and routing messages between clients and other components (almost
always JSM in the case of simple instant messaging).
- s2s (Server-to-Server). s2s handles connections between the server and other servers.
The protocol for s2s connections is slightly different from that of c2s connections, and this
component speaks that protocol.
- Xdb (XML Database). Xdb responds to messages to store and retrieve data. It's the
shared persistent storage mechanism for the Jabber server.
- Logger. Logger services receive messages from other components intended to track the
server's actions. Logged messages include things such as user logins and logouts, errors, and so
on. The standard OpenFire configuration includes two loggers: elogger for errors and rlogger for
all other system events.
- Dnsrv (DNS Service). Dnsrv resolves server names to IP addresses. Server names are
almost always DNS host names, so this is a straightforward function.
A common additional component is:
- Jabber User Directory (JUD). The JUD component provides services that enable clients
to publish their contact information and to query for other clients' information.
Each component has a Jabber ID (JID) that distinguishes it from other components. These
components exchange messages among themselves over a data bus that routes messages to the
appropriate component based on the type of message and the destination JID.
In addition to the <message>, <presence>, and <iq> messages that are exchanged between
clients and servers, server components also exchange three other types of messages, one of which
is <log>.
Finally, one last thing that needs to be mentioned apart from this general server
description is that the protocol supports multiple logins by the same user through the use of
login resources. With this, our client can log in from multiple devices, using a different login
resource on each. A login resource is a way of telling the server that you are logged in from a
certain location, a certain client, or a certain computer. If a resource with the same name is
already logged in, the server will disconnect it and bind the new connection to that resource. A
problem can appear if auto login is set on two clients that use the same resource name: neither
of them will stay connected, because each new login disconnects the other.
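The resource behaviour described above can be modelled with a small session table. This Python sketch is an illustration only: SessionTable and its methods are hypothetical names, and the code simulates the conflict rule stated in the text rather than using any real XMPP server or library API.

```python
class SessionTable:
    """Tracks which connection holds each full JID (user@server/resource)."""

    def __init__(self):
        self._sessions = {}      # full JID -> connection id
        self.disconnected = []   # connection ids dropped because of a conflict

    def bind(self, bare_jid, resource, connection_id):
        """Bind a connection to a resource, dropping any older holder of it."""
        full_jid = "%s/%s" % (bare_jid, resource)
        old = self._sessions.get(full_jid)
        if old is not None:
            # Same resource already logged in: the older session is dropped.
            self.disconnected.append(old)
        self._sessions[full_jid] = connection_id
        return full_jid

    def resources(self, bare_jid):
        """Return the resources currently logged in for a user, sorted."""
        prefix = bare_jid + "/"
        return sorted(j[len(prefix):] for j in self._sessions if j.startswith(prefix))
```

Giving every device its own resource name, for example one derived from the device ID, avoids the conflict, since two devices then never compete for the same full JID.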
Conclusions:
- The XMPP (Jabber) server is library- and module-based; it is easy to configure and manage.
- Sitting directly on top of the TCP/IP stack makes the Jabber server efficient and fast.
- Riding directly on TCP/IP also makes the protocol real-time.
- The protocol simulates peer-to-peer protocols: the clients can easily communicate
through XML stanzas, with the server as intermediary.
- The server supports multiple logins by the same user through JID resources,
so in our case each device is a resource, and the web interface will also have its own resource.
- The databases are very configurable.
- The server supports secure communication and secure logins.
As discussed in previous chapters, the server that supports the XMPP protocol, OpenFire,
is already implemented and functional. All I have to do in my application is write
some plug-ins that filter certain IQs and route them to the specified devices.
2.2.2 Server Side Layout
Before going into any implementation details, I should discuss how a user (ID) has
its special folder location implemented on the server, and how the server handles a user in
general. But first, let's have a look at the server's general architecture:
[Figure 6 diagram: network interface, web interface, database, disk, request caching, server network component, server sync service, restore service, a queue of instructions (Instruction 1, Instruction 2, Instruction 3), reschedule, dummy wait instructions]
Figure 6 - General server view
The general server view is intended to point out the server's components and how they
interact with each other. The server receives instructions from the network, or from the internet,
on its internet-connected interface. The server's network interface receives the packets and
turns them into instructions that the server's components understand. The server knows how
to route each of these instructions to the components it is made of. The server can
choose to make a database query, perform a sync operation, or restore some files to the user.
The managers offer the server the interfaces through which it can provide the requested
information to the user. This figure does not show how the server interacts with clients, but
rather how it works internally.
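The routing step described above can be sketched as a queue of decoded instructions and a table of handlers. The Python code below is an assumption-laden illustration: the Server class, the instruction kinds and the handler signatures are all invented for the example and do not describe the actual OpenFire plug-in interface.

```python
from collections import deque


class Server:
    """Routes queued instructions to the component registered for their kind."""

    def __init__(self):
        self._handlers = {}
        self._queue = deque()

    def register(self, kind, handler):
        """Attach a component (any callable) to one kind of instruction."""
        self._handlers[kind] = handler

    def receive(self, kind, payload):
        """Called by the network interface after decoding a packet."""
        self._queue.append((kind, payload))

    def run_pending(self):
        """Process queued instructions in arrival order and return the results."""
        results = []
        while self._queue:
            kind, payload = self._queue.popleft()
            handler = self._handlers.get(kind)
            if handler is None:
                results.append(("error", "unknown instruction: " + kind))
            else:
                results.append((kind, handler(payload)))
        return results
```

Keeping the network interface and the components apart in this way means a new manager can be added by registering one more handler, without touching the packet decoding.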
2.2.3 Server components layout
When a user sends a RegisterIQ to the server, our extension of the server will intercept
and process it. If the server successfully registers the user, our extension will then create the
user's so-called personal registered folder and the database entries related to that new
user.
The database layout on the server side will be described below:
Database structure:
Table T_User:
    UserID – primary key
    UserName
Table T_Device:
    UserID
    DeviceID
    Device description
Table T_File:
    FileID – primary key
    UserID – foreign key from T_User
    ParentFolderID – foreign key from T_Folder
    Path – path to the actual file (for versioned files: /versioning/folder/id_version.ext)
    FileName – file name, as seen by the user
    IsDeleted – files with IsDeleted = 1 don't appear in the current file list
    IsVersioned – files with IsVersioned = 0 have only one associated record in T_Version (with version = 1)
    maxVersion – maximum available version; used to generate a new version ID
Table T_Folder:
    FolderID
    UserID
    ParentFolderID
    Path – path to the actual folder on the server's disk
    Name – folder name, displayed to the user
    CheckoutStatus – whether the folder is opened for writing by the server
    IsVersioned – if IsDeleted = 1 and IsVersioned = 0, the row is permanently deleted. Used to keep a list of deleted versioned (monitored) folders
    IsDeleted
Table T_Version:
    FileID – foreign key from T_File
    FileVersion – part of the primary key, together with FileID
    CheckoutStatus – whether the file is opened for writing by the server
    Date – date at which the version was added
    Status – queued, pending or final
The database layout described above reflects the way the server handles a
user's files, folders, versions and devices.
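As a sketch, the layout above could be written as the following table definitions. The example uses Python's standard sqlite3 module as a stand-in for PostgreSQL so that it runs without a server. The column types, the constraints, the choice of DeviceID as primary key and the current_files helper are assumptions; the text only names the columns.

```python
import sqlite3

SCHEMA = """
CREATE TABLE T_User (
    UserID   INTEGER PRIMARY KEY,
    UserName TEXT NOT NULL UNIQUE
);
CREATE TABLE T_Device (
    DeviceID    INTEGER PRIMARY KEY,
    UserID      INTEGER NOT NULL REFERENCES T_User(UserID),
    Description TEXT
);
CREATE TABLE T_Folder (
    FolderID       INTEGER PRIMARY KEY,
    UserID         INTEGER NOT NULL REFERENCES T_User(UserID),
    ParentFolderID INTEGER REFERENCES T_Folder(FolderID),
    Path           TEXT NOT NULL,
    Name           TEXT NOT NULL,
    CheckoutStatus INTEGER NOT NULL DEFAULT 0,
    IsVersioned    INTEGER NOT NULL DEFAULT 0,
    IsDeleted      INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE T_File (
    FileID         INTEGER PRIMARY KEY,
    UserID         INTEGER NOT NULL REFERENCES T_User(UserID),
    ParentFolderID INTEGER REFERENCES T_Folder(FolderID),
    Path           TEXT NOT NULL,
    FileName       TEXT NOT NULL,
    IsDeleted      INTEGER NOT NULL DEFAULT 0,
    IsVersioned    INTEGER NOT NULL DEFAULT 0,
    maxVersion     INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE T_Version (
    FileID         INTEGER NOT NULL REFERENCES T_File(FileID),
    FileVersion    INTEGER NOT NULL,
    CheckoutStatus INTEGER NOT NULL DEFAULT 0,
    Date           TEXT NOT NULL,
    Status         TEXT NOT NULL CHECK (Status IN ('queued', 'pending', 'final')),
    PRIMARY KEY (FileID, FileVersion)
);
"""


def create_database():
    """Create an empty in-memory database with the tables above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def current_files(conn, user_id):
    """File names of a user's files that are not marked as deleted."""
    rows = conn.execute(
        "SELECT FileName FROM T_File WHERE UserID = ? AND IsDeleted = 0 "
        "ORDER BY FileName",
        (user_id,),
    )
    return [r[0] for r in rows]
```

The current_files query shows how the IsDeleted flag keeps deleted files out of the current file list while their rows, and therefore their versions, stay in the database.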
This is how the server stores the user in the databases, but users must also be stored on
disk. The server therefore creates a folder structure under the user's register (install) folder,
so it can read settings and device information and store files.
The layout of the folder structure is described below:
Folder structure (in root install folder):
User<ID> (ex: gabi.bercea@gmail.com):
    /storage – root folder for stored files
        List of files & folders, no restrictions
    /versioning – root folder for backup files. List of monitored folders.
        Folder<folderID> – name & ID of folder
            (List of sub files/subfolders)
            The name for each file will be <UniqueID>_<version>.<extension>
    /conf – configuration files for user
/conf – server configuration files
/temp – used to buffer & process client requests
The folder structure presented above is the general way in which the server stores a user on
disk. The server also makes a number of entries in different databases, as seen previously in the
database layout description.
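The naming rules of this folder structure can be expressed as a few path-building helpers. The Python sketch below is illustrative only: the function names are hypothetical, and the exact form of the Folder<folderID> directory name is an assumption, since the text does not spell out how the folder name and ID are combined.

```python
import posixpath


def user_root(install_root, user_id):
    """Root folder of one user inside the install folder."""
    return posixpath.join(install_root, user_id)


def user_folders(install_root, user_id):
    """The three per-user folders created at registration."""
    root = user_root(install_root, user_id)
    return [posixpath.join(root, sub) for sub in ("storage", "versioning", "conf")]


def versioned_file_path(install_root, user_id, folder_id, unique_id, version, extension):
    """Build <root>/<user>/versioning/Folder<folderID>/<UniqueID>_<version>.<extension>."""
    name = "%s_%d.%s" % (unique_id, version, extension)
    return posixpath.join(
        user_root(install_root, user_id), "versioning", "Folder%s" % folder_id, name
    )
```

Because the version number is part of the file name, every version of a file can live side by side in the same monitored folder.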
2.2.4 Packet Layout
The packet layout for the server is the second hardest thing to implement, after the
extra protocol for file synchronization itself.
I define a packet as a customized IQ, sent by a client or server to another client
or server to query or set a request. These IQs are very important, and the most difficult part to
implement and handle correctly on the server side.
Here are some XML IQ packet requests representing some example queries:
Query packet request example:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Edited by Gabi® -->
<QueryUserInfoRequest>
  <SecurityPass>
    <Value>er34543tRETE5435ert43%^%$^&amp;%$#$52342*&amp;(*&amp;y</Value>
  </SecurityPass>
  <UserName>
    <UserNameValue>xxx@yahoo.com</UserNameValue>
    <!-- Extra other info needed -->
    <Info1></Info1>
    <Info2></Info2>
  </UserName>
  <DeviceUniqueID>
    <IDValue>sdgdf$%$^%5467rdytr%^&amp;%&amp;%^&amp;65</IDValue>
  </DeviceUniqueID>
  <UserRequestType>
    <QuerySpaceStatisticsRequest>
    </QuerySpaceStatisticsRequest>
  </UserRequestType>
</QueryUserInfoRequest>
Before I discovered the XMPP protocol, I implemented my own XML packet
management, in which each packet had a fixed and a dynamic part in its layout. The dynamic part
was the part into which the query packet was inserted. The fixed part contained information about
the sender and receiver of the packet, such as the username or the packet type.
Here is the layout of a packet making a <QueryUserInfoRequest>:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Edited by Gabi® -->
<PacketRequest>
  <PacketSecurity>
    <SecurityPass>
      <Value>2136274896378946329872638759345349785634953789</Value>
    </SecurityPass>
    <!-- This will always be considered as the from id -->
    <UserName>
      <UserNameValue>xxx@yahoo.com</UserNameValue>
      <!-- Extra other info needed -->
      <Info1></Info1>
      <Info2></Info2>
    </UserName>
    <DeviceUniqueID>
      <IDValue>4893457394857394085703945734iuegf498</IDValue>
    </DeviceUniqueID>
  </PacketSecurity>
  <To>
    <IDValue>destination_ID@email.com</IDValue>
  </To>
  <PacketType>
    <PacketQueryUserInfoType>
      <!-- This info here will be according to each type of request needed -->
      <!-- Different request types have different layouts -->
      <UserInfoType>
        <QuerySpaceStatisticsRequest>
        </QuerySpaceStatisticsRequest>
      </UserInfoType>
    </PacketQueryUserInfoType>
  </PacketType>
  <ErrorCode>
    ErrorSuccess
  </ErrorCode>
  <TranzactionID>
    <IDValue>3049208349025849545j90482359475-23957234895023</IDValue>
  </TranzactionID>
  <TranzactionStatus>
    InProgress (or Started, Finished, etc.)
  </TranzactionStatus>
  <Buffer>
    Just an extra buffer
  </Buffer>
</PacketRequest>
As you can easily tell, each transaction carries in its packet all the information it needs to
complete asynchronously, and with the <TranzactionStatus> element the user always knows
the status of the current transaction.
The <TranzactionStatus> element was used when sending complex packets, such as the ones
needed to transfer a file. Here the <Buffer> element would come into play, because the <Buffer>
value would hold the contents read from the current file.
This is the way any client/server interaction would have taken place. The packets would be
backwards compatible because of the XML structure: if new fields appeared in the XML queries,
the old ones would not disappear, so a user choosing to use an older version of the
client could do so without having to worry about packet incompatibility.
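The backwards compatibility argument can be illustrated with a reader that extracts only the fields it knows and skips everything else. The Python sketch below uses the standard xml.etree.ElementTree module. The element names are written without spaces, since XML names cannot contain them, and FieldAddedInNewerClient and read_packet are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

PACKET = """
<PacketRequest>
  <To><IDValue>destination_ID@email.com</IDValue></To>
  <ErrorCode>ErrorSuccess</ErrorCode>
  <TranzactionStatus>InProgress</TranzactionStatus>
  <FieldAddedInNewerClient>ignored by old readers</FieldAddedInNewerClient>
</PacketRequest>
"""


def read_packet(xml_text):
    """Extract only the fields this reader knows; unknown elements are skipped."""
    root = ET.fromstring(xml_text)
    return {
        "to": root.findtext("To/IDValue", default="").strip(),
        "error": root.findtext("ErrorCode", default="").strip(),
        "status": root.findtext("TranzactionStatus", default="").strip(),
    }
```

An older reader written this way keeps working when newer clients add elements, and a missing element simply yields an empty value instead of a parsing failure.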
2.3 Windows Mobile client architecture
Probably one of the most important and most requested features these days is for applications to
run on mobile devices.
Because it is platform independent, the protocol can be supported on any device
that can parse XML streams. Mobile devices are no exception, and with the .NET Compact
Framework it is even easier to handle.
The Jabber-Net classes, released as open source by the community, also work on
Windows Mobile devices. Most of the library classes implemented for the
Windows desktop client also work on the Windows Mobile client, but some of them need to be
adjusted to be compatible with Windows Mobile restrictions.
In this chapter we will discuss what the Windows Mobile client architecture
looks like, how file system monitoring works, how requests are made and how they are
handled by the Windows Mobile client, what restrictions apply to it,
and what future improvements can be made.
Before we start discussing any of the above, we should recall that the Graviton server sync
protocol is platform independent thanks to its XML stream implementation. I chose to
implement a Windows Mobile client using the Microsoft .NET Framework only because it is easier
and faster to implement. Similar clients could be implemented for other mobile platforms,
such as Symbian, BlackBerry, Linux, Android and others.
To finish this introduction, here is the reason why a file system filter driver was not
implemented for mobile devices. Due to the variety of brands and their fairly homogeneous
distribution on the market, writing a file system filter driver for each device would imply a lot
of development overhead. Because of the mobile devices' behaviour and fairly low file system
activity, using user-mode filters to detect changes in the file system is enough to
offer very good functionality and the result the client expects.
2.3.1 Windows mobile architecture overview
A mobile client must be lightweight and has one critical need: to consume as little
battery as possible, so that the device lasts as long as possible per charge. The
Windows mobile architecture should therefore be really simple, and the Graviton
functionality should be implemented so that the extras present in the desktop
version are not carried over here.
The figure below illustrates the Windows mobile architecture:
Figure 7 - Windows mobile architecture overview
The figure above shows the Windows mobile architecture as implemented on the Windows
mobile client: how the managers interact with one another, which
libraries they need, and how the Windows mobile GUI sits on top of all these
small frameworks to offer the user the desired result.
First of all, let us look at the Windows mobile base libraries. I call them base libraries because
they are practically the core libraries from which all the managers and all the other libraries
evolve.
One of the most important base libraries is the Jabber-Net library, which we
discussed in a previous chapter. The Jabber-Net library is portable from the desktop platform
to the mobile platform. On top of it, a few more libraries are implemented so that
development can go faster: the login and register libraries, the contact
manager, and the presence and multiple login manager.
The login and register manager offers the programmer methods for creating new
users on the Graviton network or logging in existing ones. The login can be made using a
login form or, if the user is on one of his devices and has chosen to remember his username and
password, automatically, without any username or password prompt.
We will discuss the file system library in more detail, because it is the core library
on Windows Mobile devices. As you can see, it is made up of a FileSystemWatcher class, a
FilePathParser and a ShaddowCopy library. Together these classes help the sync or the
backup manager build the XML IQ packets sent to the server to either sync or back up
files on your mobile device. We will discuss how these libraries are implemented later,
because they must be the most battery-efficient ones: they carry the overhead
of permanently listening for changes in the file system and calling the correct manager to
handle those changes. But before talking about their implementation, we should take a look
at the file system that Windows Mobile devices use, which is DFS.
2.3.2 The DFS file system
Distributed File System (DFS) is a set of client and server services that allow an organization
utilizing Microsoft Windows servers to organize many distributed SMB file shares into a
distributed file system. DFS provides location transparency and redundancy to improve data
availability in the face of failure or heavy load by allowing shares in multiple different locations
to be logically grouped under one folder, or DFS root.
Microsoft's DFS is referred to interchangeably as 'DFS' and 'Dfs' by Microsoft and is
incompatible with the DCE Distributed File System, which held the 'DFS' trademark but was
discontinued in 2005.
The server component of Distributed File System was first introduced as an add-on to
Windows NT 4.0 Server, called "DFS 4.1", and was later included as a standard component of all
editions of Windows 2000 Server. Windows NT 4.0 Workstation includes client-side support for
DFS.
When a user accesses a share that exists off the DFS root, the user is really looking at a DFS
link and the DFS server transparently redirects them to the correct file server and share.
A DFS root can only exist on a server version of Windows, from Windows NT 4.0 and up, or
on a computer running Samba. The Enterprise and Datacenter Editions of Windows can host
multiple DFS roots on the same server.
There are two ways of implementing DFS on a server:
- Standalone DFS roots allow for a DFS root that exists only on the local computer, and
thus does not use Active Directory. A standalone DFS can only be accessed on the computer
on which it is created. It does not offer any fault tolerance and cannot be linked to any other
DFS. This is the only option available on Windows NT 4.0 Server systems.
- Domain-based DFS roots exist within Active Directory and can have their information
distributed to other domain controllers within the domain; this provides fault tolerance to DFS.
DFS roots that exist on a domain must be hosted on a domain controller. This is to ensure that
links with the same target get all their information replicated over the network. The file and root
information is replicated via the Microsoft File Replication Service (FRS).
DFS Architecture
The DFS service (Dfssvc.exe) is the core component of the DFS architecture and runs on root
servers and domain controllers. The primary functions of the DFS service include handling
referrals, managing namespaces, and communicating with the DFS driver (Dfs.sys).
The components of the DFS architecture on DFS clients and root servers are illustrated in the
following figure. In this figure, the DFS architecture of the domain controller is simplified to
show only the DFS object.
Figure 8 - DFS
The following figure illustrates the DFS architecture of a domain controller and a simplified
view of the DFS client and root server architecture. Note that domain controllers use DFS
architecture similar to root servers; this is because domain controllers play a role in referring
client computers to domain-based roots. It is also possible for domain controllers to host
namespaces and play the role of root server. In this case, the domain controller also hosts the
DFS metadata cache (regardless of namespace type) and the stand-alone DFS metadata in its
registry (for stand-alone namespaces).
Figure 9 - DFS
DFS is very complex, and understanding its architecture and implementation takes a lot of
study. As it is not the target of this thesis, readers who want more detail about the DFS
architecture can follow this link.
2.3.3 File system watcher
As discussed previously, the file system watcher class on the Windows Mobile platform
should be the most efficient one of all, because it has to watch, that is,
permanently monitor, the file system for changes. This process can result in very poor battery
performance for the mobile device, making the application useless.
The FileSystemWatcher class is implemented on the base layer of the DFS file system. The
wrapper that I had to implement on top of it basically handles some callbacks from the
FileSystemWatcher class.
The public FileSystemWatcher events that I had to monitor are the following:
- Changed: occurs when a file or directory in the specified Path is changed.
- Created: occurs when a file or directory in the specified Path is created.
- Deleted: occurs when a file or directory in the specified Path is deleted.
- Error: occurs when the internal buffer overflows.
- Renamed: occurs when a file or directory in the specified Path is renamed.
On each of these events, the Windows Mobile main application, connected to the file
system utilities class, should inform the server that one of the monitored paths was modified
in some manner: an entry was renamed, deleted or newly created, or an existing one was
modified.
The sync manager has callbacks registered so that it is informed when any of these events
happens. On each event, the sync manager creates an IQ for the server, informing it of the
modification that occurred. The server then routes that IQ to all the connected resources of
the current user, so the modification is reflected on all the devices.
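The event-to-IQ step described above can be sketched in portable C as follows. This is only an illustration under stated assumptions: the `<sync>` element, the `kasardia:sync` namespace and the `build_sync_iq` function are hypothetical names, not the project's actual in-house IQ schema; only the outer `<iq type='set' id='...'>` stanza shape comes from XMPP itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* File system events the watcher reports (mirrors the event list above). */
typedef enum { EV_CHANGED, EV_CREATED, EV_DELETED, EV_RENAMED } fs_event;

static const char *event_name(fs_event ev)
{
    switch (ev) {
    case EV_CHANGED: return "changed";
    case EV_CREATED: return "created";
    case EV_DELETED: return "deleted";
    case EV_RENAMED: return "renamed";
    }
    return "unknown";
}

/*
 * Format a sync IQ into buf. Returns the number of characters written,
 * or -1 if buf is too small or an argument would need XML escaping.
 * The <sync> payload and its namespace are hypothetical.
 */
int build_sync_iq(char *buf, size_t len, const char *id,
                  fs_event ev, const char *path)
{
    int n;

    assert(buf != NULL && id != NULL && path != NULL);

    /* This sketch does no XML escaping, so it rejects unsafe characters. */
    if (strpbrk(id, "<>&'\"") != NULL || strpbrk(path, "<>&'\"") != NULL)
        return -1;

    n = snprintf(buf, len,
                 "<iq type='set' id='%s'>"
                 "<sync xmlns='kasardia:sync' event='%s' path='%s'/>"
                 "</iq>",
                 id, event_name(ev), path);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

A watcher callback would call `build_sync_iq` once per event and hand the resulting string to the XMPP connection for sending.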
The difference between the mobile devices' sync manager and the desktop sync manager is
that the mobile sync manager works in a fire-and-forget manner, much like UDP: it does not
wait for any confirmation from the server that the sync succeeded. The server only
sends upload requests to the client if the sync request could not be handled correctly; otherwise
the Windows Mobile sync manager considers it correctly handled by the server.
A sync request on the Windows Mobile device is considered complete once the upload
process is finished, unlike on desktop computers, where a sync operation is more complex
and is considered done only when all the devices are in sync.
These are just a few of the tricks used on the Windows Mobile devices to make
communication with the server easier and faster.
2.4.1 Windows desktop architecture overview
The client that was the most important to me, and the one that I chose to implement
personally, is the Windows desktop client, including the file system filter driver.
The Windows desktop client is probably the most complex piece of code of them all, maybe
more complex than the server itself, because it has to correctly sync more entities together: the
driver, the user-mode application, the server and the other devices.
Unlike the mobile devices' simple behaviour, where a sync means only an upload or a
download request, the Windows desktop client makes more complicated syncs such as
incremental sync, supports VHD mounting from the cloud, and handles more interfaces for
restoring files than just the application, such as HTTP, FTP or others.
But probably the most important and most complex piece of code of the Windows desktop
client is the file system driver itself. What makes it delicate is the fact that the Windows OS has
a very high file system overhead, so the file system driver should not amplify it much; the
user should not even realize that the driver is there.
I chose to implement the driver without any file caching beyond the one offered by the
operating system, and I also split it into small driver managers, which I will talk about later.
In this chapter I also talk about a technology that is very popular these days, so-called VHD
mounting from the cloud: whether it is necessary or not, its advantages and drawbacks, and a
little about the VHD architectural design.
Before starting I should also mention that the implementation that I chose for
the Windows Mobile client is not unique. By this I mean that any application that follows the
XMPP protocol and the in-house IQs that the protocol understands can make the file sync
application work with the server, which actually makes it platform independent. The reason I
chose this implementation is that it is very compatible and well integrated with Windows, and
many classes I use here can be used in the Windows Mobile client.
Here I will describe the three big entities that make up the client:
the Windows client main application and GUI, the Windows client driver and the VHD manager.
First of all, let's have a look at the Windows desktop client architecture in the picture below:
Figure 10 - Desktop client architecture
As you can see from the picture above, the main difference between the Windows desktop
client and the Windows Mobile client is that the desktop client does not have a file
system watcher library; it has a file system driver instead, from which it takes updates
and processes them in each manager.
The file system driver is actually the engine of the whole client application: if the
driver works fast, the application works fast. I will talk later about the components of the driver
(update manager, message manager, filter manager, backup manager); for now I would
rather explain how the user-mode application interacts with it. Most of the configuration
of the driver is done dynamically, in real time, with the help of I/O control
codes. The driver processes each control code and modifies its internal lists and
managers accordingly. The I/O control codes are sent from the user-mode application using the
DeviceIoControl function, which sends a control code along with an input buffer and an output
buffer. The buffers are optional; they are not used for operations that do not require any
parameters, such as GMTECH_IOCTRL_START_FILTERING, a control
code that tells the driver to start filtering on the file system. As you can see, the driver is started
and stopped dynamically. For operations like GMTECH_IOCTRL_ADD_FILTER_PATH, an
input buffer is required to tell the driver which file system path to monitor. For example, the input
could be a UNICODE_STRING identifying the path (C:\Users\Gabi\Documents). The
driver will then know that the documents path should be monitored for changes.
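To illustrate the control-code mechanism, here is a small user-mode simulation in portable C. It is a sketch, not the driver's code: the macro reproduces the bit layout of the WDK CTL_CODE macro, but the function numbers (0x800 and up), the GMTECH_IOCTRL_STOP_FILTERING name, the `filter_state` structure and `handle_ioctl` are assumptions made for the example, and the path is a narrow string rather than a UNICODE_STRING.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same bit layout as the WDK CTL_CODE macro:
 * (device type << 16) | (access << 14) | (function << 2) | method. */
#define SKETCH_CTL_CODE(dev, func, method, access) \
    (((uint32_t)(dev) << 16) | ((uint32_t)(access) << 14) | \
     ((uint32_t)(func) << 2) | (uint32_t)(method))

#define SK_FILE_DEVICE_DISK_FILE_SYSTEM 0x00000008u
#define SK_METHOD_BUFFERED              0u
#define SK_FILE_ANY_ACCESS              0u

/* Hypothetical function numbers; the real driver's values are not shown. */
#define GMTECH_IOCTRL_START_FILTERING \
    SKETCH_CTL_CODE(SK_FILE_DEVICE_DISK_FILE_SYSTEM, 0x800, SK_METHOD_BUFFERED, SK_FILE_ANY_ACCESS)
#define GMTECH_IOCTRL_STOP_FILTERING \
    SKETCH_CTL_CODE(SK_FILE_DEVICE_DISK_FILE_SYSTEM, 0x801, SK_METHOD_BUFFERED, SK_FILE_ANY_ACCESS)
#define GMTECH_IOCTRL_ADD_FILTER_PATH \
    SKETCH_CTL_CODE(SK_FILE_DEVICE_DISK_FILE_SYSTEM, 0x802, SK_METHOD_BUFFERED, SK_FILE_ANY_ACCESS)

#define MAX_PATHS    8
#define MAX_PATH_LEN 260

/* Stand-in for the driver's internal lists. */
typedef struct {
    int  filtering;
    int  path_count;
    char paths[MAX_PATHS][MAX_PATH_LEN];
} filter_state;

/* Process one control code. Returns 0 on success, -1 on an unknown code
 * or an invalid input buffer. */
int handle_ioctl(filter_state *st, uint32_t code, const void *in, size_t in_len)
{
    assert(st != NULL);

    switch (code) {
    case GMTECH_IOCTRL_START_FILTERING:   /* no buffers needed */
        st->filtering = 1;
        return 0;
    case GMTECH_IOCTRL_STOP_FILTERING:    /* no buffers needed */
        st->filtering = 0;
        return 0;
    case GMTECH_IOCTRL_ADD_FILTER_PATH:   /* input buffer carries the path */
        if (in == NULL || in_len == 0 || in_len >= MAX_PATH_LEN ||
            st->path_count >= MAX_PATHS)
            return -1;
        memcpy(st->paths[st->path_count], in, in_len);
        st->paths[st->path_count][in_len] = '\0';
        st->path_count++;
        return 0;
    default:
        return -1;
    }
}
```

In the real client the user-mode side would pass the same code and buffer to DeviceIoControl, and the equivalent of `handle_ioctl` would run inside the driver's device control dispatch routine.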
The last but definitely not least of the managers is the VHD manager. I will discuss its
architecture, what it can do and how, later in this chapter. Here I only want to
mention a few details, so that no misconceptions appear. If the user mounts a
VHD, the file system filter driver will not monitor paths on the drive letters/partitions of the
VHD. I talked about this issue earlier, in the first chapter. The VHD is only used by the user
as secondary storage. He can imagine it as a personal, mobile external HDD that he can carry
anywhere and plug in. No sync and no backup is done there. The user can choose to back up
data there or to copy any files to it, but they will not be synced anywhere, versioned
or backed up. The VHD is to be considered a different component entirely, apart from all the
devices a user has: just an HDD, but a really portable one.
2.4.2 Windows desktop client file system driver
As mentioned before, the file system filter driver is probably the most complex and elaborate
piece of code of the entire application, not because of the way file system filters are
written on Windows platforms, but because of the way NTFS works and the overhead it
produces. I cannot go further and talk about the implementation of the file system filter until I
have described the NTFS file system a little.
2.4.2.1 NTFS File System Overview
NTFS is the standard file system of Windows NT, including its later versions Windows 2000,
Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7.
NTFS supersedes the FAT file system as the preferred file system for Microsoft's Windows
operating systems. NTFS has several improvements over FAT and HPFS (High Performance
File System) such as improved support for metadata and the use of advanced data structures to
improve performance, reliability, and disk space utilization, plus additional extensions such as
security access control lists (ACL) and file system journaling. The file system specification is a
trade secret, although it can be licensed commercially from Microsoft through their Intellectual
Property licensing program.
Overview
From a user's point of view, NTFS continues to organize files into directories, which, like
HPFS, are sorted. However, unlike FAT or HPFS, there are no "special" objects on the disk and
there is no dependence on the underlying hardware, such as 512 byte sectors. In addition, there
are no special locations on the disk, such as FAT tables or HPFS Super Blocks.
The goals of NTFS are to provide:
- Reliability, which is especially desirable for high end systems and file servers
- A platform for added functionality
- Support for POSIX requirements
- Removal of the limitations of the FAT and HPFS file systems
Internals
In NTFS, all file data—file name, creation date, access permissions, and contents—are stored
as metadata in the Master File Table. This abstract approach allowed easy addition of file system
features during Windows NT's development — an interesting example is the addition of fields
for indexing used by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names,
index names, etc.). This means UTF-16 codepoints are supported, but the file system does not
check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted
to those in the Unicode standard).
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this
allows faster file look up times in most cases. A file system journal is used to guarantee the
integrity of the file system metadata but not individual files' content. Systems using NTFS are
known to have improved reliability compared to FAT file systems.
The Master File Table (MFT) contains metadata about every file, directory, and metafile on an
NTFS volume. It includes filenames, locations, size, and permissions. Its structure supports
algorithms which minimize disk fragmentation. A directory entry consists of a filename and a
"file ID" which is the record number representing the file in the Master File Table. The file ID
also contains a reuse count to detect stale references. While this strongly resembles the W_FID
of Files-11, other NTFS structures radically differ.
Limitations
The following are a few limitations of NTFS:
Reserved File Names: Though the file system supports paths up to about 32767 Unicode
characters with each path component (directory or filename) up to 255 characters long, certain
names are unusable, since NTFS stores its metadata in regular (albeit hidden and for the most
part inaccessible) files; accordingly, user files cannot use these names. These files are all in the
root directory of a volume (and are reserved only for that directory). The names are: $MFT,
$MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap, $Boot, $BadClus, $Secure, $Upcase,
and $Extend; . (dot) and $Extend are both directories; the others are files.
Maximum Volume Size: In theory, the maximum NTFS volume size is 2^64 − 1 clusters.
However, the maximum NTFS volume size as implemented in Windows XP Professional is 2^32 − 1
clusters. For example, using 64 KiB clusters, the maximum NTFS volume size is 256 TiB minus
64 KiB. Using the default cluster size of 4 KiB, the maximum NTFS volume size is 16 TiB
minus 4 KiB. (Both of these are vastly higher than the 137 GiB limit lifted in Windows XP SP1.)
Because partition tables on master boot record (MBR) disks only support partition sizes up to 2
TiB, dynamic or GPT volumes must be used to create bootable NTFS volumes over 2 TiB.
Maximum File Size: Theoretical: 16 EiB minus 1 KiB (2^64 − 2^10 or 18,446,744,073,709,550,592
bytes). Implementation: 16 TiB minus 64 KiB (2^44 − 2^16 or 17,592,185,978,880 bytes).
Alternate Data Streams: Windows system calls may—or may not—handle alternate data
streams. Depending on the operating system, utility and remote file system, a file transfer might
silently strip data streams. A safe way of copying or moving files is to use the BackupRead and
BackupWrite system calls, which allow programs to enumerate streams, to verify whether each
stream should be written to the destination volume and to knowingly skip offending streams.
Maximum path length: An absolute path may be up to 32767 characters long; a relative path is
limited to 255 characters.
Date range: NTFS uses the same time reckoning as Windows NT: 64-bit timestamps with a
range from January 1, 1601 to May 28, 60056 at a resolution of ten million ticks per second.
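The arithmetic behind the volume-size and timestamp figures above can be checked with a few lines of C. This is a minimal sketch; the function and macro names are my own, and the timestamp helper deliberately handles only dates from 1970 onwards.

```c
#include <assert.h>
#include <stdint.h>

/* Largest NTFS volume in bytes for the implemented limit of
 * 2^32 - 1 clusters, given the cluster size in bytes. */
uint64_t ntfs_max_volume_bytes(uint64_t cluster_bytes)
{
    return (((uint64_t)1 << 32) - 1) * cluster_bytes;
}

/* NTFS timestamps count 100 ns ticks since January 1, 1601,
 * that is, ten million ticks per second. */
#define TICKS_PER_SECOND   10000000ULL
/* Ticks between January 1, 1601 and January 1, 1970. */
#define TICKS_1601_TO_1970 116444736000000000ULL

/* Convert an NTFS timestamp to whole seconds since January 1, 1970.
 * Timestamps before 1970 are not handled in this sketch. */
uint64_t ntfs_ticks_to_unix_seconds(uint64_t ticks)
{
    assert(ticks >= TICKS_1601_TO_1970);
    return (ticks - TICKS_1601_TO_1970) / TICKS_PER_SECOND;
}
```

With 64 KiB clusters the first function gives 2^48 − 2^16 bytes, which is the "256 TiB minus 64 KiB" quoted above; with 4 KiB clusters it gives 2^44 − 2^12 bytes, that is, 16 TiB minus 4 KiB.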
Advantages of NTFS
NTFS is best for use on volumes of about 400 MB or more. This is because performance
does not degrade under NTFS, as it does under FAT, with larger volume sizes.
The recoverability designed into NTFS is such that a user should never have to run any sort of
disk repair utility on an NTFS partition. For additional advantages of NTFS, see the following:
Microsoft Windows NT Server "Concepts and Planning Guide," Chapter 5, section titled
"Choosing a File System"
Microsoft Windows NT Workstation 4.0 Resource Kit, Chapter 18, "Choosing a File
System"
Microsoft Windows NT Server 4.0 Resource Kit "Resource Guide," Chapter 3, section
titled "Which File System to Use on Which Volumes".
Disadvantages of NTFS
It is not recommended to use NTFS on a volume that is smaller than approximately 400 MB,
because of the amount of space overhead involved in NTFS. This space overhead is in the form
of NTFS system files that typically use at least 4 MB of drive space on a 100 MB partition.
Currently, there is no file encryption built into NTFS. Therefore, someone can boot under MS-
DOS, or another operating system, and use a low-level disk editing utility to view data stored on
an NTFS volume.
It is not possible to format a floppy disk with the NTFS file system; Windows NT formats all
floppy disks with the FAT file system because the overhead involved in NTFS will not fit onto a
floppy disk.
For further discussion of NTFS disadvantages, see the following:
Microsoft Windows NT Server "Concepts and Planning Guide," Chapter 5, section titled
"Choosing a File System"
Microsoft Windows NT Workstation 4.0 Resource Kit, Chapter 18, "Choosing a File
System"
Microsoft Windows NT Server 4.0 Resource Kit "Resource Guide," Chapter 3, section
titled "Which File System to Use on Which Volumes"
Taking into consideration all the advantages, drawbacks, constraints and enhancements of the
NTFS file system, I started writing the file system filter driver.
Another thing to keep in mind before describing the driver in detail is that the filter must
be aware that it is a filter driver, and that there are drivers just like it above or below
it, which it must respect in its operations.
To read more details about the NTFS file system, which is complex and not open
source, you can follow this link and this.
2.4.2.2 Windows file system filter driver
The Windows client file system filter driver is also divided into managers, as the main
application is. The driver managers are different though, because in kernel mode you are in
another world. You do not depend on other applications as you normally do in user mode;
you depend on the operating system and, indirectly, on other applications, at a very abstract
and low level, which makes the file system driver harder to implement.
Before going into details, let's look at the general architecture:
Figure 11 - Driver architecture
First of all, let me define the managers, engines and other entities seen in the picture:
- The user-mode communication component is the manager in the driver that connects
user mode to the driver. The driver receives settings requests through I/O control codes in its
dispatch control routine, and they are parsed by this manager.
- The callbacks are actually the most important part of the driver. They are the
routines of the driver that are called when a certain request appears in the file system. The filter
can set callbacks for Create, SetInformation, QueryInformation, Read or Write
operations. Each of these is called in a different process and thread context, and can start in one
thread context and finish asynchronously in another. The filter must be very careful when
implementing these callbacks, because this is where most memory leaks appear, which can
end up filling all of the user's memory.
- The request parser manager is made up of small components: a path parser, an
extension parser and a size parser. The request parser manager should be the first to be
interrogated in the Create request callback, to see if the current request is of interest. If a
Create request does not address a file or folder in the protected path, then it should be passed to
the intermediate filters and ultimately to the NTFS file system, which commits it to the disk.
- The message manager is a component that interacts with the user and allows the
driver to send certain messages to the user, asking him different questions, such as whether a
file is synchronised or whether the internet connection is alive.
- The backup manager is used by the driver to create shadow copies of modified
files, deleted files, or other files that might need backup. For example, if a protected file is
opened with the SUPERSEDE flag, the backup manager pauses the create operation and first
makes a backup copy of the file before the file is superseded. If the supersede operation fails,
the backup manager deletes the backup file, as the file did not get modified: the supersede
is performed as part of the create operation, so a failed supersede means a failed create, no
FILE_OBJECTs or file HANDLEs are created, and the file stays unmodified.
- The internal list management takes care of the different lists that the driver uses to
arrange its internal data. Some of these lists are ActiveFilesList and FileObjectsList. For example,
the ActiveFilesList represents the file streams of the files on disk that are currently in use. Each
ActiveFilesList entry also has a FileObjectsList attached, representing all the open instances of
that file. A file on disk can be opened by different applications, more than once, which
is why this is necessary. An active files list entry has fields telling the managers if the current
stream is dirty, if it is a directory, its file size at create time, if it should ignore writes, or if the
file has the delete-on-close flag set. The FileObjectsList contains information about each handle
actually created, such as OriginalProcessContext, OriginalThreadContext, the FILE_OBJECT
itself, the granted ACCESS_MASK, the CreateOptions and the CreateDisposition.
- The worker threads and DPCs handle all the asynchronous operations that need to
be performed at runtime, such as a Write request to an update file or a
shadow copy request.
- The updates manager is also really important. It is an interface that
gives the user the possibility to read updates from the driver. The user interrogates the driver
periodically for updates, and the driver gives him one update per interrogation. The user
can reserve a certain update, so that only the thread that reserved it is able to
request the updates related to that chain of updates.
- The IO request creator is actually a small library that helps the programmer make
read, write and query information requests to the file system. It is the base, because
all the managers above need it to perform any operation on the file system.
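As an illustration of what the request parser checks in the Create callback, here is a user-mode sketch in portable C of a path parser and an extension parser. The function names, the narrow-string paths and the idea of a single ignored extension are assumptions made for the example; the real driver works on NT path names in kernel mode.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Path parser: is `path` equal to, or located below, the protected `root`?
 * The comparison is case-insensitive and stops at a component boundary,
 * so C:\Docs2 is not treated as being inside C:\Docs. */
int path_is_protected(const char *root, const char *path)
{
    size_t n, i;

    assert(root != NULL && path != NULL);
    n = strlen(root);
    if (n > 0 && root[n - 1] == '\\')
        n--;                          /* ignore a trailing backslash */
    if (strlen(path) < n)
        return 0;
    for (i = 0; i < n; i++) {
        if (tolower((unsigned char)root[i]) != tolower((unsigned char)path[i]))
            return 0;
    }
    return path[n] == '\0' || path[n] == '\\';
}

/* Extension parser: does `path` end with `ext` (for example ".tmp")? */
int has_extension(const char *path, const char *ext)
{
    size_t pl = strlen(path), el = strlen(ext), i;

    if (el == 0 || pl < el)
        return 0;
    for (i = 0; i < el; i++) {
        if (tolower((unsigned char)path[pl - el + i]) !=
            tolower((unsigned char)ext[i]))
            return 0;
    }
    return 1;
}

/* A Create request is of interest when it targets the protected path and
 * does not carry the ignored extension; anything else is passed down. */
int request_is_of_interest(const char *root, const char *ignored_ext,
                           const char *path)
{
    return path_is_protected(root, path) && !has_extension(path, ignored_ext);
}
```

A request for which `request_is_of_interest` returns 0 would simply be forwarded to the next driver in the stack without any bookkeeping.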
2.4.2.3 File system filter driver internal implementation
The internal driver implementation I chose is modular as seen from the above picture. The
modules the driver is made up of are:
- Filespy.c: main module.
- FsInitDriver.c: driver initialization routines.
- FsIoImplementation.c: module corresponding to the IO request creator.
- FsJobImplementation.c: module that the driver uses to implement certain driver internal
jobs.
- FsListUtils.c: module that handles list management.
- FSMonitorCreateEngine.c: module handling certain aspects of the create request, the
most important request of all.
- FSMonitorDriverUtils.c: module that holds a few driver utilities.
- Namelookup.c: module library that has the functions to query file names in the NT
manner from the file system.
2.4.2.3.1 Filespy.c
The driver's main module. This module has the DriverEntry routine and the most important
dispatch routines, like the Create, SetInformation or Write routine.
All the dynamics of the driver actually happen in Filespy.c. After DriverEntry returns a
STATUS_SUCCESS code, the filtering begins, but only if the driver is initialized. The driver
might not have all of its initialization parameters when DriverEntry runs; if it does not, it will
not start filtering.
In the DriverEntry routine, the filter initializes all of its internal lists and concurrency
structures, like spinlocks or resources, and also sets the callbacks. The driver is considered
initialized if there are some valid filtering paths and filtering constraints. The driver also
needs a backup path on the local storage to make temporary backups.
In DriverEntry the driver also initializes any memory allocators that it might need later,
like NameBufferLookasideList:
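ExInitializeNPagedLookasideList(&gFileSpyNameBufferLookasideList,
                                NULL,                     /* Allocate: default routine */
                                NULL,                     /* Free: default routine */
                                0,                        /* Flags */
                                FILESPY_LOOKASIDE_SIZE,   /* size of each entry */
                                FILESPY_NAME_BUFFER_TAG,  /* pool tag */
                                0);                       /* Depth (reserved) */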
After all the initialization is done, the filter driver creates its device objects. It needs
two types of device objects: a control device object, which the user uses to control
the driver through IOCTLs, and the devices that attach to the file system. The sample code
below shows how a control device object is created:
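RtlInitUnicodeString(&nameString, FILESPY_FULLDEVICE_NAME2);
Status = IoCreateDevice(DriverObject,
                        0,                             /* no device extension */
                        &nameString,                   /* device name */
                        FILE_DEVICE_DISK_FILE_SYSTEM,  /* device type */
                        FILE_DEVICE_SECURE_OPEN,       /* device characteristics */
                        FALSE,                         /* not exclusive */
                        &gControlDeviceObject);        /* receives the new device */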
gControlDeviceObject receives the device object that the IoCreateDevice function
creates. This is the device that will receive the IOCTLs from the user.
One other important step to be followed in DriverEntry is to initialize the dispatch
functions. See the sample code below:
DriverObject->MajorFunction[IRP_MJ_CREATE] = SpyCreate;
DriverObject->MajorFunction[IRP_MJ_WRITE] = SpyWrite;
DriverObject->MajorFunction[IRP_MJ_SET_INFORMATION] = SpySetInformation;
DriverObject->MajorFunction[IRP_MJ_CLEANUP] = SpyClose;
DriverObject->MajorFunction[IRP_MJ_CLOSE] = SpyClose;
DriverObject->MajorFunction[IRP_MJ_FILE_SYSTEM_CONTROL] = SpyFsControl;
The DriverObject sets all these dispatch functions to valid routines existing in the
driver; for example, each time a Create request is made, the SpyCreate routine is called.
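The dispatch mechanism itself is just a table of function pointers indexed by request type. The following user-mode C sketch shows the idea; the `REQ_*` codes, the `request` and `driver_object` types and the handler names are stand-ins invented for the example, not the kernel's IRP_MJ_* values or the driver's Spy routines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request codes standing in for the IRP_MJ_* major codes. */
enum { REQ_CREATE, REQ_WRITE, REQ_SET_INFORMATION, REQ_CLEANUP, REQ_CLOSE, REQ_MAX };

typedef struct {
    int major;       /* which kind of request this is */
    int handled_by;  /* filled in by the handler, for demonstration */
} request;

typedef int (*dispatch_fn)(request *);

static int on_create(request *r)    { r->handled_by = REQ_CREATE; return 0; }
static int on_write(request *r)     { r->handled_by = REQ_WRITE;  return 0; }
static int on_close(request *r)     { r->handled_by = REQ_CLOSE;  return 0; }
/* Default handler: the request is passed through untouched. */
static int pass_through(request *r) { r->handled_by = -1;         return 0; }

typedef struct {
    dispatch_fn major_function[REQ_MAX];
} driver_object;

/* Fill the table once at start-up, as DriverEntry does. */
void init_dispatch(driver_object *drv)
{
    int i;

    assert(drv != NULL);
    for (i = 0; i < REQ_MAX; i++)
        drv->major_function[i] = pass_through;
    drv->major_function[REQ_CREATE]  = on_create;
    drv->major_function[REQ_WRITE]   = on_write;
    drv->major_function[REQ_CLEANUP] = on_close;
    drv->major_function[REQ_CLOSE]   = on_close;
}

/* Route one request to its handler; returns -1 for an invalid code. */
int dispatch(driver_object *drv, request *r)
{
    if (r->major < 0 || r->major >= REQ_MAX)
        return -1;
    return drv->major_function[r->major](r);
}
```

Every request type that the filter does not care about falls through to the default handler, which is why the table is filled with `pass_through` before the specific entries are set.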
This is all the initialization that DriverEntry handles. The worker threads are also created
and the DPC objects initialized in DriverEntry.
2.4.2.3.2 FsMonitorCreateEngine.c
One of the most important modules of the driver is the one where the create requests are
handled. Because the file system driver gives access to a file only after a successful
CREATE request, this operation is one of the most delicate of all.
Before going any further, let's look at the ActiveFilesList structure:
typedef struct _ACTIVE_FILES_VALUE
{
    ULONG_PTR FsContext;
    PFILES_LIST Files;                     // file object streams that refer to the same file
    KSPIN_LOCK FileListEntryLock;
    EXTENSION_IGNORE_TYPE ExtensionIgnoreType;
    ERESOURCE StructSharing;
    BOOLEAN DeleteOnClose;
    BOOLEAN IgnoreWrites;
    BOOLEAN PeriodicalFlushingEnabled;
    PNAME_CONTROL FileName;
    CHAR DosFileName[1024 * sizeof(char)]; // will be sent directly to the user
    BOOLEAN Directory;
    BOOLEAN Dirty;
    LARGE_INTEGER OnCreateFileSize;
    BOOLEAN InLastCleanup;
    BOOLEAN BackupOnAfterCleanup;          // special flag, set on files that are
                                           // created/superseded/overwritten
} ACTIVE_FILES_VALUE, *PACTIVE_FILES_VALUE;
On each Create request, the filter driver analyzes the create parameters and decides if the
create represents a valid protected file. If so, that file is marked as an active file. The
active file is represented by the structure above. The ActiveFile structure is chained into
a list of such structures, representing all the opened files in memory.
Depending on the configuration, the driver might choose to set the IgnoreWrites flag or the
DeleteOnClose flag. If one of them is set, the file passes through the Write routine path, and
in the Cleanup routine the backup manager creates a shadow copy of the file.
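The life cycle of an active file entry can be sketched in user-mode C as follows. This is a simplified model under my own assumptions: a plain singly linked list without locking, an `open_count` field standing in for the attached FileObjectsList, and a cleanup policy (take a shadow copy when the last handle is closed on a dirty or delete-on-close file) that is only one plausible reading of the behaviour described above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct active_file {
    uintptr_t fs_context;      /* identifies the stream, like FsContext */
    int open_count;            /* stands in for the FileObjectsList length */
    int dirty;
    int ignore_writes;
    int delete_on_close;
    struct active_file *next;
} active_file;

/* Look up the entry for a stream, or NULL if the file is not active. */
active_file *active_find(active_file *head, uintptr_t fs_context)
{
    for (; head != NULL; head = head->next) {
        if (head->fs_context == fs_context)
            return head;
    }
    return NULL;
}

/* Successful Create: reuse the existing entry or push a new one.
 * Returns the entry, or NULL if memory could not be allocated. */
active_file *active_on_create(active_file **head, uintptr_t fs_context)
{
    active_file *f = active_find(*head, fs_context);

    if (f == NULL) {
        f = calloc(1, sizeof *f);
        if (f == NULL)
            return NULL;
        f->fs_context = fs_context;
        f->next = *head;
        *head = f;
    }
    f->open_count++;
    return f;
}

/* Write: mark the stream dirty unless writes are ignored for it.
 * Returns 1 if the write was recorded, 0 otherwise. */
int active_on_write(active_file *head, uintptr_t fs_context)
{
    active_file *f = active_find(head, fs_context);

    if (f == NULL || f->ignore_writes)
        return 0;
    f->dirty = 1;
    return 1;
}

/* Cleanup of one handle. When the last handle goes away the entry is
 * unlinked and freed. Returns 1 if a shadow copy should be taken. */
int active_on_cleanup(active_file **head, uintptr_t fs_context)
{
    active_file **pp = head;
    active_file *f;
    int need_backup;

    while (*pp != NULL && (*pp)->fs_context != fs_context)
        pp = &(*pp)->next;
    if (*pp == NULL)
        return 0;
    f = *pp;
    assert(f->open_count > 0);
    if (--f->open_count > 0)
        return 0;
    need_backup = f->dirty || f->delete_on_close;
    *pp = f->next;
    free(f);
    return need_backup;
}
```

In the driver the same operations would run under the spin lock or resource stored in the structure, since Create, Write and Cleanup can arrive on different threads.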
2.4.2.3.3 FsIoImplementation.c
The last module that I will describe is FsIoImplementation.c. In this module the driver
implements a small framework for making different FS requests. For example:
- NTSTATUS FsIoMakeCleanupRequest(IN OPTIONAL PIRP Irp,
    PNL_DEVICE_EXTENSION_HEADER NlExtensionHeader);
- NTSTATUS FsIoMakeCloseRequest(IN OPTIONAL PIRP Irp,
    IN PNL_DEVICE_EXTENSION_HEADER NlExtensionHeader);
- NTSTATUS FsIoMakeReadRequest(PDEVICE_OBJECT DeviceObjectHint,
    PFILE_OBJECT FileObject,
    PVOID Buffer, // kernel buffer
    PLARGE_INTEGER BufferLength,
    PLARGE_INTEGER Offset);
- NTSTATUS FsIoMakeReadRequestEx(PIRP ReadIrp,
    PDEVICE_OBJECT DeviceObjectHint,
    PFILE_OBJECT FileObject,
    PVOID Buffer, // kernel buffer
    PLARGE_INTEGER BufferLength,
    PLARGE_INTEGER Offset,
    OUT PIO_STATUS_BLOCK UserIosb);
- NTSTATUS FsIoMakeWriteRequestEx(PIRP WriteIrp,
    PDEVICE_OBJECT DeviceObjectHint,
    PFILE_OBJECT FileObject,
    PVOID Buffer, // kernel buffer
    PLARGE_INTEGER BufferLength,
    PLARGE_INTEGER Offset,
    OUT PIO_STATUS_BLOCK UserIosb);
- NTSTATUS WriteRequestIoCompletion(IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp,
    IN PKEVENT FinishEvent);
- NTSTATUS FsIoMakeWriteRequest(PDEVICE_OBJECT DeviceObjectHint,
    PFILE_OBJECT FileObject,
    PVOID Buffer, // kernel buffer
    PLARGE_INTEGER BufferLength,
    PLARGE_INTEGER Offset);
- NTSTATUS FsIoMakeQueryEndOfFileInformationRequest(PFILE_OBJECT FileObject,
    PDEVICE_OBJECT DeviceObjectHint,
    PLARGE_INTEGER EndOfFile);
- NTSTATUS FsIoMakeQueryStandardFileInformationRequest(PFILE_OBJECT FileObject,
    PDEVICE_OBJECT DeviceObjectHint,
    PFILE_STANDARD_INFORMATION FileStdInfo);
- NTSTATUS FsIoMakeSetFileAllocationInformationRequest(PFILE_OBJECT FileObject,
    PDEVICE_OBJECT DeviceObjectHint,
    PLARGE_INTEGER FileAllocation);
As seen from the function declarations above, the FsIoImplementation library
implements many functions that provide I/O utilities on top of the file system.
These functions are used by the BackupManager, SyncManager and the UpdatesManager to
create files, copy files from one location to another in kernel mode, or query basic information
about some files.
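To show how a manager can build a file copy out of offset-based read and write primitives like the ones declared above, here is a user-mode C sketch. The `mem_file` type and the `mem_read_at`, `mem_write_at` and `copy_file` functions are hypothetical; they replace the kernel read and write requests with operations on memory buffers so that the loop can be run anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-memory stand-in for a file reachable through a FILE_OBJECT. */
typedef struct {
    unsigned char *data;
    size_t size;       /* bytes currently valid */
    size_t capacity;   /* bytes available in data */
} mem_file;

/* Read up to len bytes at offset off; returns the number of bytes read,
 * which is 0 at or past the end of the file. */
size_t mem_read_at(const mem_file *f, void *buf, size_t len, size_t off)
{
    if (off >= f->size)
        return 0;
    if (len > f->size - off)
        len = f->size - off;
    memcpy(buf, f->data + off, len);
    return len;
}

/* Write len bytes at offset off; returns 0 on success, -1 if the data
 * would not fit into the file's capacity. */
int mem_write_at(mem_file *f, const void *buf, size_t len, size_t off)
{
    if (off > f->capacity || len > f->capacity - off)
        return -1;
    memcpy(f->data + off, buf, len);
    if (off + len > f->size)
        f->size = off + len;
    return 0;
}

/* Copy src to dst chunk by chunk; returns the number of bytes copied,
 * or -1 if a write fails. */
long copy_file(const mem_file *src, mem_file *dst, size_t chunk)
{
    unsigned char buf[512];
    size_t off = 0, n;

    assert(src != NULL && dst != NULL);
    if (chunk == 0 || chunk > sizeof buf)
        chunk = sizeof buf;
    while ((n = mem_read_at(src, buf, chunk, off)) > 0) {
        if (mem_write_at(dst, buf, n, off) != 0)
            return -1;
        off += n;
    }
    return (long)off;
}
```

The copy loop keeps a single running offset and never holds more than one chunk in memory, which is the property that matters when the same pattern is used for large files in kernel mode.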
2.5.1 VHD Overview
- Virtual Disk - A file or set of files that represents a disk as a block device.
- VHD - Acronym for "Virtual Hard Disk".
- Surface - To expose a VHD context as a disk to the local system. Similar to "mount" but
at the disk layer, not file system layer.
2.5.1.1 VHD Background
VHD format defined by Microsoft
- Document titled "Virtual Hard Disk Image Format Specification".
- Current file specification available on Microsoft web site.
- Conceived as a disk-in-a-file format for VM environments.
- Sector size is currently hardcoded to 512 bytes.
- The Microsoft VHD file format specifies a virtual machine hard disk that can reside on a
native host file system encapsulated within a single file. The format is used by
Virtual PC 2007, Virtual Server 2005 R2 and Hyper-V, and it will be used by future
versions of Microsoft Windows Server that include hypervisor-based virtualization technology.
Beyond that, the VHD format is broadly applicable, because it is agnostic to the virtualization
technology, host operating system, or guest operating system with which it is used.
- Customers and partners who invest in the VHD file format will have a clear path forward to
future Windows virtualization technologies. In addition, Microsoft plans to design its systems
management tools around the VHD file format for improved patching and manageability.
The ability to directly modify a virtual machine's hard disk from a host server supports many
applications that may be of interest to customers. These include:
- Moving files between a VHD and the host file system
- Backup and recovery
- Antivirus and security
- Image management and patching
- Disk conversion (physical to virtual, and so on)
2.5.1.2 Multiple types of VHDs
- Fixed VHD
o File is the size of the virtual disk plus small footer
o Max size restricted by the host file system
o Recommended for production use
- Dynamic VHD
o File is as large as the data written plus VHD metadata
o File grows on demand
o Not to be confused with VDS Dynamic Disk
o Currently limited to 2040 GB in size
- Differencing VHD
o File is a set of modified blocks relative to a parent image
o Parent of a differencing disk can be a fixed, dynamic, or differencing disk
(differencing chain)
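The three types can be told apart from the footer stored at the end of the VHD file. The following is a minimal sketch, not part of the project's code, assuming the footer layout given in the Virtual Hard Disk Image Format Specification: a 512-byte footer that starts with the cookie "conectix" and holds a big-endian disk type field at byte offset 60 (2 = fixed, 3 = dynamic, 4 = differencing). The function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VHD_FOOTER_SIZE 512u

/* Read a 32-bit big-endian value; all VHD footer fields are big-endian. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: returns the VHD type name for a 512-byte footer,
 * or NULL when the buffer does not start with the "conectix" cookie. */
const char *vhd_type_name(const unsigned char *footer)
{
    assert(footer != NULL);
    if (memcmp(footer, "conectix", 8) != 0)
        return NULL;
    switch (read_be32(footer + 60)) {   /* disk type field */
    case 2:  return "fixed";
    case 3:  return "dynamic";
    case 4:  return "differencing";
    default: return "unknown";
    }
}

/* A fixed VHD file is the virtual disk data followed by the footer. */
uint64_t vhd_fixed_file_size(uint64_t virtual_disk_bytes)
{
    return virtual_disk_bytes + VHD_FOOTER_SIZE;
}
```

A caller would read the last 512 bytes of the file into a buffer and pass that buffer to vhd_type_name.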
2.5.1.3 VHD infrastructure – primary components
- VDrvRoot.sys: Root enumerated bus driver surfaces Virtual HBA PDOs
- VhdMP.sys: Virtual HBA VHD driver loads on HBA PDO and is responsible for
performing VHD parsing and exposing disks to the system
- FsDepends.sys: File system dependency driver, manages PNP and volume relationships
between virtual and physical storage stacks
- VirtDisk.dll: User-mode DLL provides Win32 API for managing virtual disks
Figure 12 - VHD Infrastructure
2.5.1.4 Virtual Disk IO Data Flow
Figure 13 - VHD Data Flow
2.5.1.5 VHDs are disks
- VHDs appear to PnP as disks
o Driver installation
o Device interfaces
o Full visibility to applications
o Presence in diskmgmt.msc
- This is good for boot VHDs and deployment tools
- Must have SE_MANAGE_VOLUME privilege to surface a virtual disk
2.5.1.6 VHD Native Features
VHDs can contain multiple partitions.
Figure 14 - VHD Native features
2.5.1.7 Remote mounting
Remote VHDs are supported: a file such as Foo.VHD stored on \\server\share can be surfaced
as a local disk.
Figure 15 - Remote mounting
Not reliable where the network infrastructure doesn't have strong guarantees! When the network
connection is lost, the guest volume is torn down.
2.5.1.8 VHD dismount and removal
Parent Volume Dismount:
- FsDepends will dismount all surfaced VHDs on the volume and unsurface the VHDs in
proper order
- Guest file systems will flush and cleanup volume when dismount request is received
Parent Volume Surprise Removed:
- FsDepends will notify VhdMP for all surfaced VHDs on the host volume
- VhdMP simulates surprise removal for all surfaced disks which had parent volume
removed
2.5.1.9 VHD restrictions
- Pagefile/CrashDump/Hibernation files are not supported on VHDs
- Hibernate is disabled on VHD boot
- Pagefile by default goes on parent partition
- Crashdump uses this as well
- Attempts to create a pagefile on the VHD are failed by FsDepends
- UDFS is not supported as a host or guest file system on a virtual disk
- The VHD file cannot be encrypted, compressed, or sparse
2.5.1.10 Issues for filter writers
Holding driver-level locks while doing IO:
- Example: a driver-wide lock used to single-thread volume arrival / removal
- A deadlock occurs when the filter grabs the lock for a virtual disk and the resulting IO
causes the filter to attempt to grab the same lock for the host volume
Posting to global worker queues:
- When global worker queues are busy handling requests from virtual disks, work queue
items posted for host volumes may have to wait for the virtual ones to complete. But IO on a
virtual disk may be tied to host volume work item completion, which results in a deadlock.
2.5.1.11 Resolving issues for filter writers
Holding global locks while doing IO:
- Determine the maximum number of nesting levels
- Separate the locks and the data structures they protect so that there is one set per virtual
disk nesting level
- Acquire the lock that is specific to that level before doing IO
Posting to global worker queues:
- Allocate worker queues per virtual disk level.
- Post work items to the queue for that level.
These are not trivial solutions, but they are very workable.
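The per-level queue idea can be sketched in plain C. This is only a user-mode model under stated assumptions, not kernel code and not the project's implementation: the names post_work_item, drain_level and increment_counter are hypothetical, the maximum nesting level of 2 is taken from the Windows 7 behaviour described in 2.5.1.12, and a real filter driver would use kernel work queues and synchronization instead of these arrays.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NESTING_LEVEL 2   /* assumed maximum, see 2.5.1.12 */
#define QUEUE_CAPACITY    16

typedef void (*work_fn)(void *context);

struct work_item {
    work_fn fn;
    void   *context;
};

/* Fixed-size ring buffer of pending work items for one nesting level. */
struct level_queue {
    struct work_item items[QUEUE_CAPACITY];
    size_t head;
    size_t count;
};

/* One queue per level: 0 = host volume on a physical disk,
 * 1 and 2 = virtual disks at that nesting level. */
static struct level_queue g_queues[MAX_NESTING_LEVEL + 1];

/* Post a work item to the queue of the given level.
 * Returns 0 on success, -1 on a bad argument or a full queue. */
int post_work_item(unsigned level, work_fn fn, void *context)
{
    if (level > MAX_NESTING_LEVEL || fn == NULL)
        return -1;
    struct level_queue *q = &g_queues[level];
    if (q->count == QUEUE_CAPACITY)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->items[tail].fn = fn;
    q->items[tail].context = context;
    q->count++;
    return 0;
}

/* Run every item queued for one level; items of other levels are untouched,
 * so host volume work never waits behind virtual disk work. */
size_t drain_level(unsigned level)
{
    if (level > MAX_NESTING_LEVEL)
        return 0;
    struct level_queue *q = &g_queues[level];
    size_t ran = 0;
    while (q->count > 0) {
        struct work_item item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        assert(item.fn != NULL);
        item.fn(item.context);
        ran++;
    }
    return ran;
}

/* Sample work routine used for illustration: increments an int counter. */
void increment_counter(void *context)
{
    ++*(int *)context;
}
```

The same indexing by nesting level applies to locks: one lock per level, acquired before issuing IO at that level, so that IO on a virtual disk never needs the lock already held for its host volume.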
2.5.1.12 Routines for filter writers
FsRtlGetMaximumVirtualDiskNestingLevel
- Returns maximum number of nested virtual disk levels
- In Win 7, it will return one of the following:
o 0: No virtual disks may be mounted
o 1: A single level of virtual disks can be surfaced
o 2: A virtual disk can be nested in another virtual disk but cannot have a third level
nested within it
- Future versions of Windows may support a larger maximum nesting level.
- The limit can be modified by registry parameter and group policy but will not change
dynamically
FsRtlGetVirtualDiskNestingLevel
- Takes a Disk DeviceObject
- Returns current nested virtual disk nesting level for that disk
- In Win 7, the level returned means the following:
o 0: This disk is not a virtual disk
o 1: This disk is a virtual disk that resides on a physical disk
o 2: This disk is a virtual disk nested within a virtual disk that resides on a physical disk
- Returned value will not change dynamically
- Level returned is the level of the writable VHD. There may be a parent at a different
level.
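The level values listed above can be captured in a small helper. This sketch deliberately does not call the FsRtl routines themselves (their exact prototypes are not given in this document); it only interprets values that they are described as returning. The function names are hypothetical, and the rule in can_nest_further is an inference from the description of the maximum nesting level, not a documented API guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: describe a nesting level as reported for a disk. */
const char *describe_nesting_level(unsigned level)
{
    switch (level) {
    case 0:  return "not a virtual disk";
    case 1:  return "virtual disk on a physical disk";
    case 2:  return "virtual disk nested within a virtual disk";
    default: return "unknown nesting level";
    }
}

/* Inferred rule: a disk at `level` can host another virtual disk only if
 * that would not exceed the reported maximum nesting level. */
int can_nest_further(unsigned level, unsigned max_level)
{
    return level < max_level;
}
```

With a maximum of 2, a physical disk (level 0) and a first-level virtual disk (level 1) can each host a VHD, while a second-level virtual disk cannot.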
3. Conclusions
To close, I would like to draw a few conclusions about the main entities that took part
in this project: the server and the multiple types of clients. I end with an overall conclusion
about the project and some forecasts for its future.
The project was, and still is, in development because of its very complex architecture and
the many proposed functionalities.
By working on this project I learned that teamwork and communication are very important.
I also learned many new things and technologies, such as OpenFire server development,
Jabber-Net development, a better understanding of file system drivers, mobile device
development, and how to connect all of these together.
The biggest challenge of this project was determining which type of server to choose and
which paradigm was best among the wide variety of choices. After I decided that the XMPP
protocol was what I needed, I then had to decide which server implementation to use. Most of the
servers were, and are, open source, but I had to look up which community is more active and which
server is better updated and patched.
Then, on the client side, I had to figure out how to make the applications run smoothly
enough that the user would not even realize they are running. I first had to find a way to link
all the different technologies together as efficiently as possible. This is why I chose the .NET
Framework as the platform for developing the GUI, and Win32 and device driver
development for the core components. These then communicate through
shared memory, direct DLL function invocation or other IPC mechanisms.
Finally, let's not forget that the Graviton project also has a web interface through which
users can control their devices. I also had to research how to integrate the web application
so that it runs alongside all the different clients. The web application is seen as just another
client, albeit a special, abstract type of client: the user does not install any client program,
but instead uses an interface through which the other clients are controlled.
3.1 Server conclusions
As I said in the Conclusions chapter above, the server was the most difficult part of this
project, because there were so many choices and I had to find the one that best fits this type of
application.
As the XMPP protocol has gained more and more popularity, and many big companies now use it
for their in-house and public applications (e.g. Google Wave), I decided to give it a try for my
application.
The Graviton project needed a request-based server to which the clients send
different requests and from which they receive replies. The XMPP protocol is very close to what
this project needs, because the three main things the protocol is based on are Messages, Presence,
and IQs.
By Messages I mean that one client can send a message to another client. This enables
communication and better collaboration between clients. By Presence I mean that each client
can notify all the other clients in its contact list that it is online or offline. This also works
when more than one instance of a client is connected: if a client is connected from more than one
device, each device will know of the others' presence. By IQs I mean Info/Query
requests, from the client to the server and vice versa. Each IQ can have a different
format, depending on what requests are needed on each side. IQs can be getters or setters.
If a device wants to query another device, it sets the 'to' member in the IQ
structure and then fills in the rest as usual. The server automatically routes it to the
destination. In its IQ handler, the receiving client interprets the request and answers the sender
with a reply, or with an error if the request cannot be completed.
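To make the IQ structure concrete, here is a minimal sketch that builds an IQ stanza as text. It is an illustration only and not the project's code (the clients use the Jabber-Net library for this): the function name build_iq, the JID "user@example.com/phone" and the namespace "urn:example:files" are hypothetical, and the sketch assumes its arguments contain no characters that would need XML escaping. The four IQ types are the ones defined by the XMPP core specification.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The XMPP core specification defines exactly these four IQ types. */
static int is_valid_iq_type(const char *type)
{
    return strcmp(type, "get") == 0 || strcmp(type, "set") == 0 ||
           strcmp(type, "result") == 0 || strcmp(type, "error") == 0;
}

/* Hypothetical helper: writes an IQ stanza with an empty <query/> payload.
 * `to` may be NULL, in which case the stanza is addressed to the server.
 * Returns the stanza length, or -1 on an invalid type or a short buffer.
 * Assumption: no argument needs XML escaping. */
int build_iq(char *out, size_t out_size, const char *type,
             const char *to, const char *id, const char *query_ns)
{
    assert(out != NULL && type != NULL && id != NULL && query_ns != NULL);
    if (!is_valid_iq_type(type))
        return -1;

    int n;
    if (to != NULL)
        n = snprintf(out, out_size,
                     "<iq type='%s' to='%s' id='%s'><query xmlns='%s'/></iq>",
                     type, to, id, query_ns);
    else
        n = snprintf(out, out_size,
                     "<iq type='%s' id='%s'><query xmlns='%s'/></iq>",
                     type, id, query_ns);

    if (n < 0 || (size_t)n >= out_size)
        return -1;   /* encoding error or output truncated */
    return n;
}
```

Setting the 'to' attribute to another device's full JID is what lets the server route the request to that device; the receiving side answers with a stanza of type 'result' or 'error' carrying the same id.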
As you can see, this protocol is a good fit for this application, because most of Graviton's
requests are to upload or download files, or to perform file operations remotely, such as copy,
delete, rename or create.
Because the protocol is very extensible, it was easy to create new IQ requests beyond the ones
presented in the standard RFC. This helped me implement many of the application's functionalities,
such as remote device control and the backup and restore interfaces.
3.2 Client conclusions
There are many conclusions to draw on the client side, because there are
many client types running on different platforms, each with a different implementation.
The good part about the client development stage was that a client was easy to
deploy once the design of the server had been decided.
The easiest client to deploy is the Windows Mobile client, because of its lightweight
architecture and its smaller feature set compared to the Windows desktop client. On the other hand,
the hardest to implement and deploy is the Windows desktop client, or desktop clients in
general, because they involve a lot of overhead that has to be worked around, a complicated
architecture and many planned features.
The hardest thing to implement and integrate in the Windows desktop client is the
Windows file system filter driver, which monitors the NTFS and FAT32 file systems. The driver
supports both x64 and x86 platforms and all Windows versions from Windows 2000 to
Windows 7 and Windows Server 2008 R2. The driver is the piece of code that really gives
the application its elegance and uniqueness. The total number of lines written in the driver
exceeds the number written in the desktop client or in the server patches and plug-ins. This is
what makes it so complex and so hard to maintain and update.
3.3 Overall conclusions and future forecasts
Overall, this project has helped me understand a lot about how certain software works, how
to make good decisions when writing software, how to choose the best frameworks and
libraries, and how to find out which servers fit best and which clients can be
supported.
In the future, Graviton will be rebranded and eventually sold, and will hopefully be a good
competitor on the market despite all the existing competition. Graviton's strong points are its
good and stable server implementation using XMPP and the best synchronization possible, achieved
with the help of the file system filter and IQ requests.
References
- Windows NT File System Internals by OSR Classic reprints
- Windows NT Device Driver Development by OSR Classic reprints
- OSR Community forums
http://www.osronline.com/page.cfm?name=ListServer
- OpenFire server source and documentation
http://www.igniterealtime.org/
- XMPP namespace declaration and documentation
http://xmpp.org/registrar/namespaces.html
- Jabber-net library source and documentation
http://code.google.com/p/jabber-net/
- Documentation on .NET Framework and Compact Framework
http://microsoft.com
- Extra documentation on file systems, XMPP, kernel drivers on Wikipedia
http://en.wikipedia.org/wiki/Ntfs#Internals
http://en.wikipedia.org/wiki/File_system
http://en.wikipedia.org/wiki/Xmpp
http://en.wikipedia.org/wiki/Kernel_(computing)