# DIS-Lecture-Material-Full


Lecture Notes

Course 07017206 Distributed Information Systems

Akharin Khunkitti

Faculty of Information Technology
King Mongkut's Institute of Technology Ladkrabang

Contents
1. Course Overview and Introduction to Distributed Information Systems
2. Characterization, Concepts and Design Goals
3. Network and Internetworking
4. Inter-Process Communication
5. Remote Procedure Calling
6. Distributed Operating Systems
7. File service: A model and Case Studies
8. Name Services
9. Time and Coordination
10. Replication
11. Concurrency Control
12. Recovery and Fault Tolerance
13. Security
14. Distributed Systems Update


1. Course Overview and Introduction to Distributed Information Systems
Course Description
An introduction to distributed information systems and the technical and economic factors that have led
to their widespread use. Architectures used in general-purpose distributed systems, networks, inter-process
communication, remote procedure calling, distributed operating systems, file services, naming in distributed
systems, design of server programs, concurrency control, recovery and fault tolerance, replication and
synchrony, security, and case studies are included.

Course Objectives
To understand the fundamental principles of distributed information systems and to be able to apply them

Course Evaluation and Evaluation Criteria
During the semester (In-Class and Mid-Term Examination): 50 %, divided into
o Assignment                30 %
o Mid-Term Examination      20 %
Final Examination: 50 %

Reference Materials
[1] George Coulouris, Jean Dollimore and Tim Kindberg, Distributed Systems: Concepts and Design, Second
Edition, Addison-Wesley Publishing Company, England, 1994.
[2] Papers
[2.1] Ted Burghart, “Distributed Computing Overview”, Quoin Inc., June 1998. Available at
http://www.quoininc.com/quoininc/dist_comp.pdf
[2.2] Alfonso Fuggetta, Gian Pietro Picco and Giovanni Vigna, “Understanding Code Mobility”, IEEE
Transactions on Software Engineering, Vol. 24, No. 5, May 1998, pp. 342-361.


[2.3] Mario Baldi, Silvano Gai, and Gian Pietro Picco, “Exploiting Code Mobility in Decentralized and
Flexible Network Management”, Mobile Agents, Proceedings of the 1st International Workshop on
Mobile Agents 97 (MA'97), Berlin (Germany), 1997.
[3] Errol Simon, Distributed Information Systems: From Client/Server to Distributed Multimedia, The
McGraw-Hill Companies, London, 1996.


Assignment

Objectives
1. To have students study and research topics related to Distributed Information Systems more widely,
beyond what is taught in class
2. To practise researching and presenting, as well as answering questions and exchanging academic opinions
on Distributed Information Systems

Details
Find an academic or research article, for example from international journals, international conferences or
IEEE magazines (in English), or from other sources, that is related to Distributed Information Systems. The
article must explain the principles or working method of the topic it presents. Summarize it yourself in a
short report in Thai (1-3 pages) and include your own academic opinion on the article. If the article is
very long, you may cut, select or summarize some of its points to stay within the limit. Present the article
in class, with questions and an exchange of opinions among the students. The marks are allocated as follows:

Item                                                                   Weight
1. Report                                                                  5%
2. Presentation / understanding of the content / own opinion              15%
3. Discussion, questions and exchange of academic opinions                10%
Total                                                                     30%


Report
After researching and studying the article, prepare a short summary report in Thai, about 1-3 pages (A4)
long. It is due on the day of the presentation, before the presentation starts, with the original article
attached. The report must include a section giving your own academic opinion on the topic you researched.

Presentation
Each student must present the article studied, together with his or her own academic opinion on it. The
opinion must be based on study of the article and must show its purpose, its method and its main details
in brief. Presentations are held after the lecture in class. Each student has about 10 minutes to present
and about 5-10 more minutes for questions, discussion and exchange of academic opinions, for a total of
about 15-20 minutes per student. Students should prepare for the presentation and may hand out supporting
material to the audience (optional).
Presentations run from lecture (week) 8 onward, for 8 sessions in total, with about 3-7 students per
session (depending on the number of students in the class). Students should agree on the order of
presentation among themselves and submit the presentation schedule by week 2 of the course. If no
agreement is reached, the order will be assigned at random. If a student cannot present as scheduled after
the schedule has been submitted, the student should arrange to swap slots with another student and give
notice of the swap on the day of the presentation.

การโตตอบ ซักถาม และแลกเปลี่ยนความคิดเห็นทางวิชาการ
ั          ่
เพื่อใหนักศึกษามีการแลกเปลี่ยนความคิดเห็นทางวิชาการ และเขาใจในเรื่องที่นกศึกษาตนอืนนําเสนอ
จึงกําหนดใหนักศึกษาเขารวมฟงการนําเสนอของนักศึกษาคนอื่นๆทุกคน และหลังจากการนําเสนอเสร็จ ให
นักศึกษาที่เขารวมฟงตั้งคําถามทางวิชาการ เพื่อซักถาม โตตอบ และแลกเปลี่ยนความคิดเห็นทางวิชาการกับผูที่
นําเสนอนั้นๆ โดยจะมีการบันทึกจํานวนคําถาม/คําแนะนําทางวิชาการของนักศึกษาแตละคนไว แลวนํามา
เปรียบเทียบเปนคะแนนในสวนนี้ตอไป

นอกจากนี้ในการสอบปลายภาคอาจจะคัดเลือกเนื้อหาของ Assignment ไปใชในการสอบดวย


หัวขอบรรยายและกําหนดการบรรยาย 14 สัปดาห (Topics and Details of 14 Weeks)

สัปดาห (Week)                            หัวขอบรรยาย (Topic and Details)
1               Course Overview and Introduction to Distributed Information Systems
2               Characterization, Concepts and Design Goals (1 & 2)
3               Network and Internetworking (3)
4               Inter-Process Communication (4)
5               Remote Procedure Calling (5)
6               Distributed Operating Systems (6 & 18)
-               Mid-Term Exam. Period (No Lecture)
7               File service: A model and Case Studies (7 & 8)
8               Name Services (9)
9               Time and Coordination (10)
10               Replication (11)
11               Concurrency Control (13)
12               Recovery and Fault Tolerance (15)
13               Security (16)
14               Distributed Systems Update (Papers)


2. Characterization, Concepts and Design Goals
หัวขอ (Topics)
Definition of distributed systems
Distributed and centralized systems
Key characteristics
Design concepts and Goals
Key Design Goals
Basic Design Issues
User Requirements
Summary
Definition of a Distributed System
A collection of autonomous computers linked by a network, with software designed to produce an
integrated computing facility.

System Comparison
To make the idea easier to grasp, a computer system can be compared with the way people work. Put simply,
each computer corresponds to one person, so the operation of a distributed information system corresponds
to people working as a group or team to achieve the goal of a piece of work. Teamwork among people needs
several elements to be in place; in the same way, a distributed information system must be made up of
several parts in order to work.


Distributed and Centralized Systems

แตเดิมระบบคอมพิวเตอรจะทํางานในลักษณะแบบรวมศูนย (Centralized System) หรือเปรียบเทียบได
กับการทํางานของมนุษยในลักษณะทําทุกอยางดวยคนเพียงคนเดียว ทํางานเกิดขอจํากัดในการทํางาน เชนเมื่อมี
ภาระงานมากขึ้น จะไมสามารถตอบสนองหรือทํางานไดทันเวลาที่ตองการ หรือการทํางานนั้นจะตองใชระบบ
่
หรือคนที่มีความสามารถมาก ทําใหหายากหรือเสียคาใชจายสูง จึงมีการนําเอาเครืองคอมพิวเตอรหลายๆ เครื่อง
มาชวยทํางาน เพื่อใหสามารถตอบสนองตอภาระงานได แตการทํางานแบบรวมศูนยหรือทํางานเพียงคนเดียวก็
มีขอดีในแงของการจัดการที่งาย และสะดวกรวดเร็ว มีความปลอดภัยของขอมูลสูง กวาการทํางานแบบกระจาย
Examples of Distributed Systems
Distributed UNIX
Commercial Applications
Wide Area Network Applications - Internet
Multimedia Information Access and Conferencing Applications
Voice and video are continuous media: time-based data
Interactive applications are sensitive to delay; it should be less than 100 ms
Require network support for Quality of Service (QoS)

Key Characteristics of Distributed Information Systems
Resource Sharing
Openness
Concurrency
Scalability
Fault Tolerance
Transparency

Resource Sharing
The resources of a computer system comprise hardware, software and data
Resource sharing is controlled by a resource manager: a software module that controls and
manages a given set of resources
Models of resource management in the system
The Client-Server Model
The Object-based Model

The Client-Server Model
Server processes act as resource managers
Clients issue requests to servers

The Object-based Model
All shared resources are viewed in a uniform way by resource users
Object manager is the collection of procedures and data values that together characterize a class of
objects


Openness
Open systems are characterized by the fact that their key interfaces are published
Open distributed systems are based on the provision of a uniform inter-process communication mechanism
Open distributed systems can be constructed from heterogeneous hardware and software, possibly from
different vendors. But the conformance of each component to the published standard must be carefully
tested and certified if users are to be protected from responsibility for resolving system integration
problems.

Concurrency
Several processes exist in a single computer
Interleaved or parallel
Interleaved: execution alternates between processes
Parallel: processes execute at the same time (N processes executed simultaneously)
Concurrency opportunities
Users' view => considered in terms of the number of users
Processes' view: Client-Server => considered in terms of the number of processes


Scalability
Can operate at many different scales
The system and application software should not need to be changed when the scale of the system
increases
The system works at various sizes without changes to the system or application software, and its
scale of use can be adjusted both up and down

Fault Tolerance
Distributed systems should provide a high degree of availability
Fault tolerance is based on
Hardware redundancy: the use of redundant components
Software recovery: the design of programs to recover from faults
The system is designed to keep working by relying on both hardware and software

Transparency
Hiding the operations or changes that take place in the system, considered from the viewpoint of programs
and of users
The concealment from the user and the application programmer of the separation of components in
distributed systems
The International Organization for Standardization's Reference Model for Open Distributed Processing
(RM-ODP) identifies 8 forms of transparency
Transparency forms for RM-ODP
Access transparency
Location transparency
Concurrency transparency
Replication transparency
Failure transparency
Migration transparency
Performance transparency
Scaling transparency
Access + Location transparency = Network transparency


Design Concepts and Goals of Distributed Information Systems
Suggest ideas of conceptual and design knowledge of Distributed System
Assume the key characteristics are generic design requirements for Distributed Systems
resource sharing, openness, concurrency, scalability, fault tolerance and transparency
Development of DS results for adding new services
The Goals may be expressed in the form of guarantees
Key Design Goals
Categories of design goal for distributed services
Performance
Reliability
Scalability
Consistency
Security
Basic Design Issues
Discussion in some technical views
Design issues
Naming
Communication
Software structure
Workload allocation
Consistency maintenance
Design starts by assigning names to the system and its components, so that everyone refers to the same
thing by the same name. Communication between the components is then made possible. Once components
can invoke one another, the working structure of the system must be laid out, with responsibilities defined
and work divided so that the components can cooperate. Finally, once the system works as a whole, its
upkeep must be designed so that it continues to operate consistently.

Basic Design Issues - Naming
Terms:
name - interpreted by users or programs
identifier - interpreted by program only
A name is resolved to a communication identifier, e.g. a port number
Design considerations
Choice of name space for each type of resource
finite or infinite
flat or structured
Resource names must be resolvable to communication identifiers
translation in a name service
Names are always resolved relative to some context, inherit contextual information
Forms of names and identifiers
re-call
re-use
examples: /etc/passwd, kmitl.ac.th
The most important advantage of hierarchic name spaces is that each part of a name is resolved relative to a separate context
Naming schemes can be designed to protect resources from unauthorized access
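
Resolving a hierarchic name one context at a time can be sketched as follows. This is a minimal illustration, not a real name service: the name space `ROOT`, the helper names and the communication identifiers (host and port strings) are all hypothetical, and only the two example names from the notes are registered.

```python
# A context maps one name component either to another context (a nested
# dict) or to a communication identifier (a string). All entries below are
# made-up values for illustration only.
ROOT = {
    "etc": {"passwd": "fileserver:2049"},
    "th": {"ac": {"kmitl": "10.0.0.5:53"}},
}


def resolve(components, context):
    """Resolve a list of name components, each relative to its own context."""
    entry = context[components[0]]          # look up the first component here
    if len(components) == 1:
        if isinstance(entry, dict):
            raise KeyError("name refers to a context, not a resource")
        return entry                        # reached a communication identifier
    if not isinstance(entry, dict):
        raise KeyError("cannot resolve below a resource")
    return resolve(components[1:], entry)   # continue in the sub-context


def resolve_file_name(name):
    # File names such as /etc/passwd are read left to right.
    return resolve(name.strip("/").split("/"), ROOT)


def resolve_domain_name(name):
    # Domain names such as kmitl.ac.th are read right to left.
    return resolve(list(reversed(name.split("."))), ROOT)
```

Because each component is looked up in its own context, the same component name could appear in several contexts with different meanings without any clash.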

Basic Design Issues - Communication
Operations in the sending and receiving processes involve
transfer of data through a communication channel
synchronization, implicit programming primitives
The basic action is message passing: send and receive
synchronous or blocking: the sender sends and waits for a reply
asynchronous or non-blocking: the message is queued and the sender continues; receive normally blocks the receiving process
Communication models: client-server and group multicast
Client-server
transmission of a request from the client process to the server process
execution of the request by the server
transmission of a reply to the client


Group multicast
Sending a message to a process group, one-to-many
Can be motivated by
Locating an object
Fault tolerance
Multiple update

Basic Design Issues - Software Structure
Describes how the working structure of the system is organized
Compared with a centralized operating system, which is “monolithic”
Essential services from centralized OS
Basic resource management
memory allocation and protection
process creation and scheduling
peripheral device handling
User and application services
user authentication and access control, e.g. login
file management and file access facilities
clock facilities


Basic resource management for Distributed OS
memory allocation and protection
process creation and processor scheduling
inter-process communication *
peripheral device handling
Components of distributed operating system
Operating system kernel services
Open services
Support for distributed programming

Basic Design Issues - Workload Allocation
Describe by simple workstation-server model
Model
Processor pool model
Use of idle workstations
Shared-memory multiprocessor model


Basic Design Issues - Consistency Maintenance
Update consistency
Replication consistency
Cache consistency
Failure consistency
Clock consistency
User interface consistency
User Requirements
Functionality
Enhancements from single computer
Sharing over a network gives access to a greater variety of resources than a single system
Possible for utilization
Migration options to distributed computing
Move to an entirely new system designed specifically for distributed systems
Emulation: existing system over new system
Reconfigurability
Short-term changes
Medium-to-long-term evolution
Quality of Service
Performance
Reliability and availability
Security

Summary
Distributed system is a collection of autonomous computers linked by a network, with software designed
to produce an integrated computing facility
Distributed and centralized systems comparison
Key characteristics of Distributed systems;
Resource Sharing, Openness, Concurrency, Scalability, Fault Tolerance, Transparency
Distributed systems are designed to be services for
Quality of Services

Distributed system aims to achieve
high performance, reliability, scalability, consistency, security
In technical view, issues
Naming, communication, software structure, workload allocation, consistency maintenance
In user view, distributed system should provide
functionality, reconfigurability, quality of service


3. Network and Internetworking
Networks and Internetworking
Introduction
Network Technologies
Protocols
Technology case studies:
Ethernet
Token Ring
ATM
Protocol case study: Internet Protocol (IP)
Summary
Introduction
เปนการทบทวนความรูทางดานระบบเครือขายคอมพิวเตอร เพื่อใชงานในระบบสารสนเทศแบบกระจาย
Overview of computer networking for distributed system
Variety of hardware and software
Communication subsystem:
collection of hardware and software components that provide communication facilities for DS
Network performance parameter
Latency: delay in
software, network access and network itself
Data transfer rate: speed at which data can be transferred, in bits per second (bps)
Message transfer time = latency + length / data transfer rate
Total system bandwidth: measure of throughput
LAN: system bandwidth same as data transfer rate
WAN: several channels transfer
Key network QoS parameters
Bandwidth
Loss Rate
Delay
Delay Variation (Jitter)
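
As a worked instance of the formula message transfer time = latency + length / data transfer rate, the short sketch below plugs in illustrative numbers. The values (1 ms latency, a 1000-byte message, a 10 Mbps channel) are assumptions chosen for the example and do not come from the notes.

```python
def message_transfer_time(latency_s, length_bits, rate_bps):
    """Message transfer time = latency + length / data transfer rate."""
    return latency_s + length_bits / rate_bps


# Assumed example values: 1 ms latency, 1000-byte message, 10 Mbps channel.
latency_s = 0.001
length_bits = 1000 * 8
rate_bps = 10_000_000

# 0.001 s + 8000 bits / 10,000,000 bps = 0.001 s + 0.0008 s = 0.0018 s
example_time_s = message_transfer_time(latency_s, length_bits, rate_bps)
```

For short messages the latency term dominates, which is why latency matters as much as the raw data transfer rate in request-reply traffic.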


Network performance requirements
Response time in a distributed system should be no longer than in a centralized system
E.g. file access: takes 10 - 20 ms in a centralized system; in a DS it should be less than 10 ms, including
Reliability requirements
Usually high in communication; errors are often due to timing failures in software between sender
Types of Network
Local Area Network (LAN)
high speed, low latency, short distance
Ethernet, Token Ring, FDDI
Wide Area Network (WAN)
lower speed, higher latency, long distance
Tele-communication network (64Kbps-based), ISDN, B-ISDN, ATM
Metropolitan Area Network (MAN)
high speed, low latency, medium distance
Criteria
speed, latency, distance
Owner, User, Maintainer/Operator
Messages: logical units of information, sequences of data items of arbitrary length
Packets: data units of restricted length, consisting of a header and data, used to
allocate sufficient buffer to hold the largest possible packet
avoid undue delays while waiting for a communication channel to become free
Internetworks:
several networks linked together to provide common communication facilities
openness for distributed system
Gateways
Routers
Bridges
the result is a ‘virtual network’
Example: Internet (capital “I”)

Network Technologies
Wide Area Networks
collection of communication channels linked by packet-switching exchanges (PSE)
operate by store-and-forward
delay in PSE
route affect delay
ATM-based offer
high transfer rate
low latency

Local Area Networks
designed to let users working on connected computers share resources
Based on broadcast communication, shared channel
buses or rings topologies


Protocols
A well-known set of rules and formats to be used for communication between processes
A protocol is a specification, or agreement, for communication between processes. It consists of
a specification of the sequence of messages that must be exchanged
a specification of the format of the data in the messages
Protocols are divided into layers, which makes their operation easier to describe and lets each part be
developed so that a change has little effect on the others
Protocol layers
Protocol suites or protocol stack
a complete set of protocol layers

OSI Protocol Model


Characteristics of packet-based data exchange
A message is divided into several packets, which are then sent; at the destination the packets are
reassembled into the message (packet assembly: the task of dividing messages into packets before
transmission and reassembling them at the receiver)
The maximum packet size is fixed and is called the MTU (Maximum Transfer Unit)
Addressing: network address of computer and port number. Each packet carries addresses that identify the
sender and the receiver, giving the address of the machine and the port number used to reach a process
inside that machine
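
Packet assembly can be sketched in a few lines. This is a simplified illustration under stated assumptions: the packet is a plain dictionary, the header fields (`src`, `dst`, `seq`, `total`) are hypothetical, and `mtu` is treated as the maximum number of data bytes per packet, ignoring header size.

```python
def fragment(message, mtu, src, dst):
    """Divide a message (bytes) into packets of at most mtu data bytes."""
    chunks = [message[i:i + mtu] for i in range(0, len(message), mtu)] or [b""]
    return [
        # Each packet carries addressing information plus its position.
        {"src": src, "dst": dst, "seq": n, "total": len(chunks), "data": chunk}
        for n, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Rebuild the message at the receiver, whatever order packets arrived in."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    if len(ordered) != ordered[0]["total"]:
        raise ValueError("some packets are missing")
    return b"".join(p["data"] for p in ordered)
```

The `(address, port)` tuples passed as `src` and `dst` stand in for the machine address and port number that the notes describe; a 2500-byte message with `mtu=1000` yields three packets.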

Types of Transport Service
Connection-oriented
a ‘virtual connection’ is set up between a sending and a receiving process and is used for
transmission of a stream of data
example: TCP
Connectionless
individual messages, datagrams, are transmitted to specified destinations
example: UDP
Comparison


Technology Case Studies
Ethernet
CSMA/CD
Ethernet packet layout
Packet collision
Ethernet efficiency
Interconnected Ethernets: repeaters and bridges

Token Ring
Ring topology
based on token packet

Asynchronous Transfer Mode (ATM)
cell relay, 53 bytes
virtual connection
virtual channel (VC)
virtual path (VP)


Internet Protocol (IPv4)
TCP/IP - Transmission Control Protocol / Internet Protocol
Ports: TCP and UDP

Identify device
IPv4 has 32 bits
represented by Dot Notation in binary, decimal or hexadecimal
Unique => assigned by Network Information Center (NIC)
2 parts: Network and Host
Class A: Large networks (8-bit NET) [1-126]
Class B: Moderate networks (16-bit NET) [128-191]
Class C: Small networks (24-bit NET) [192-223]
Class D: Multicast (e.g. MBONE) [224-239]
Class E: Experimental uses [240-247]
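
The class of an IPv4 address follows from the leading bits of its first octet, which a small function can check. This is a sketch: the function name is hypothetical, and it treats every first octet from 240 to 255 as class E, which is an assumption that goes beyond the 240-247 range listed above.

```python
def address_class(address):
    """Return the classful IPv4 class letter for a dotted-decimal address."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError("first octet out of range")
    if first_octet < 128:
        return "A"      # leading bit 0 (0 and 127 are reserved values)
    if first_octet < 192:
        return "B"      # leading bits 10
    if first_octet < 224:
        return "C"      # leading bits 110
    if first_octet < 240:
        return "D"      # leading bits 1110, multicast
    return "E"          # assumption: all of 240-255 treated as class E
```

For example, `address_class("10.1.2.3")` gives `"A"` and `address_class("192.168.0.1")` gives `"C"`.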


Divide an IP network class into sub-networks
[Figure: address layout; the legend marks the Network part and the Host part]

Address resolution example: mapping a 32-bit IP address to a 48-bit Ethernet address, using ARP
IP routing
deliver IP packets to their destination
decide where to send each packet next
routing table
send to the nearest routers or gateways
Packets are received and forwarded hop by hop until they reach their destination


Domain Name System (DNS)
hierarchy, unique
client-server

Summary
Network performance parameters and requirements
Network technologies: LAN and WAN
Protocol: OSI layer model
Technology case studies
Ethernet, Token Ring, ATM
Protocol case study: Internet Protocol (IP)


4. Inter-Process Communication
InterProcess Communication
Introduction
Building blocks
Client-server communication
Group communication
Case study:
Interprocess communication in UNIX
Summary
Introduction
Distributed systems and applications are composed of collections of processes.
Their roles determine the patterns of communication.
Software elements and high-level protocols should support the main patterns of communication.
Based on message passing.
Building Blocks
The common data format for exchange between processes is unstructured: the data items are laid out
one after another in sequence
Mapping data structures and data items to messages, flat structures.
Convert to an agreed external data form.
If the communicating computers are of the same type, the conversion to external data form may be omitted. Computers
may negotiate.
Transmit data values in their native form together with an architecture identifier.
External data representation, example: SUN XDR


Marshalling: the process of taking a collection of data items and assembling them into a form suitable for transmission in a message. It converts data from the internal format used by a process (or program) into the common format used for communication as messages.
Unmarshalling: the process of disassembling a message on arrival to produce an equivalent collection of data items at the destination. It converts data from the common message format back into the internal format used by the process (or program).
Send and Receive operations: messages are sent and received through ports; a port number (PortID) identifies the process being communicated with.

Synchronous and Asynchronous Communication
Blocking and non-blocking
Timeout specifies an interval of time to wait for a reply
Communication    Blocking send    Blocking receive    Languages and systems
Synchronous      Yes              Yes                 occam
Asynchronous     No               Yes                 Mach, Chorus, BSD 4.x UNIX
Asynchronous     No               No                  Charlotte
Message Destination: specifies an ID denoting a message destination such as a port

Location-independent identifiers for destinations


Types of message destination

Reliability
Unreliable message: a single message transmitted from sender to recipient, without acknowledgement or retries, e.g. UDP.
A reliable delivery service may be constructed from an unreliable one by the use of acknowledgements, at the cost of:
(i) the need to store state information at source and destination
(ii) the need to transmit extra messages
(iii) possible latency for sender and recipient
Message identifier: a unique message identifier has two parts
a requestID, taken from an increasing sequence maintained by the sending process
an identifier for the sender process

Client-Server Communication
client side: DoOperation
server side: GetRequest, execute and SendReply


RPC exchange protocols

Operation concerns: the client-server protocol tries to cope with the operational and communication failures that may occur, namely
Delivery failures: failures in sending or receiving Request and Reply messages
Timeouts: to guard against such failures, a waiting period is set for a message, after which the message may be sent again
Discarding duplicate request messages: retransmission may cause a request to be received and processed more than once, so duplicates must be detected and discarded
History: a record of the operations already performed; it can hold results so that the saved Reply is sent again when a duplicate Request arrives
Multi-packet messages: a single message may be carried in several packets
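The interplay of message identifiers, retries, duplicate filtering and the history can be sketched as follows. This is an illustrative in-process simulation, not a network implementation: the class names, the `lose_first_reply` flag that stands in for a lost Reply, and the "add one" operation are all assumptions.

```python
class Server:
    """Hypothetical server that filters duplicates using a reply history."""

    def __init__(self):
        self.history = {}    # (sender_id, request_id) -> saved reply
        self.executions = 0  # how many times the operation really ran

    def handle(self, sender_id, request_id, value):
        message_id = (sender_id, request_id)
        if message_id in self.history:
            # Duplicate request: resend the saved reply, do not re-execute.
            return self.history[message_id]
        self.executions += 1
        reply = value + 1    # stand-in for the real operation
        self.history[message_id] = reply
        return reply


class Client:
    """Hypothetical client whose DoOperation retries after a timeout."""

    def __init__(self, sender_id, server, lose_first_reply=False):
        self.sender_id = sender_id
        self.server = server
        self.next_request_id = 0
        self.lose_first_reply = lose_first_reply

    def do_operation(self, value, max_tries=3):
        request_id = self.next_request_id  # increasing sequence per sender
        self.next_request_id += 1
        for attempt in range(max_tries):
            reply = self.server.handle(self.sender_id, request_id, value)
            if self.lose_first_reply and attempt == 0:
                continue  # simulate a lost reply: the timeout expires, retry
            return reply
        raise TimeoutError("no reply after retries")


server = Server()
client = Client("A", server, lose_first_reply=True)
print(client.do_operation(10), server.executions)
```

Even though the request is transmitted twice, the operation is executed once, because the second copy is answered from the history.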

Group Communication
Uses multicast messages
Multicast messages are very useful in distributed systems for:
Fault tolerance based on replicated services
Locating objects in distributed services
Better performance through replicated data
Multiple update
Atomic multicast: a transmitted message is either received by all of the processes that are members of the receiving group or else by none of them.
Reliable multicast: a message transmission method that makes a best effort to deliver to all members of a group but does not guarantee to do so.
Totally-ordered multicast: when several messages are transmitted to a group by totally-ordered multicast, the messages reach all of the members of the group in the same order.


Implementation of group communication
Loop to send to all members of the group
Efficiency: may be improved by support at the network level
Reliability: messages may be dropped, or the sending process may fail after transmitting only some of them.
Monitoring: failed processes are removed from groups
A reliable and atomic multicast: two techniques
Hold-back: atomicity and ordering of messages are handled by the communication handler
Negative acknowledgement: indicates missing messages
Totally-ordered atomic multicast: by using timestamps, a sequencer or a protocol
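One way to picture the sequencer and hold-back techniques together is the sketch below. It is a simplified single-process model under stated assumptions: the `Sequencer` and `Member` classes are hypothetical, message loss and process failure are not modelled, and `missing()` only reports which sequence numbers a negative acknowledgement would ask for.

```python
class Sequencer:
    """Assigns consecutive sequence numbers to multicast messages."""

    def __init__(self):
        self.next_seq = 0

    def stamp(self, payload):
        stamped = (self.next_seq, payload)
        self.next_seq += 1
        return stamped


class Member:
    """Group member that holds messages back until they are in order."""

    def __init__(self):
        self.expected = 0    # next sequence number to deliver
        self.held_back = {}  # seq -> payload waiting for earlier messages
        self.delivered = []

    def receive(self, stamped):
        seq, payload = stamped
        if seq < self.expected:
            return           # already delivered: ignore the duplicate
        self.held_back[seq] = payload
        while self.expected in self.held_back:
            self.delivered.append(self.held_back.pop(self.expected))
            self.expected += 1

    def missing(self):
        """Sequence numbers a negative acknowledgement would request."""
        if not self.held_back:
            return []
        return [s for s in range(self.expected, max(self.held_back))
                if s not in self.held_back]


sequencer = Sequencer()
messages = [sequencer.stamp(p) for p in ("a", "b", "c")]
first, second = Member(), Member()
for m in messages:
    first.receive(m)
for m in reversed(messages):  # same messages, arriving in a different order
    second.receive(m)
print(first.delivered, second.delivered)
```

Both members deliver the messages in the same order even though they arrive differently, which is the property that total ordering requires.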
Case Study: Interprocess Communication in UNIX
Pipes
One-way communication: data flows in one direction only
Redirect the output stream of one process to the input of another process
Datagram communication


Stream communication
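The one-way pipe from the case study above can be tried out with a short Python sketch that connects the output stream of one process to the input of another, much as a shell does for `producer | consumer`. The two child programs and the function name `run_pipeline` are made up for illustration.

```python
import subprocess
import sys

PRODUCER = "print('hello through a pipe')"
CONSUMER = "import sys; print(sys.stdin.read().upper(), end='')"


def run_pipeline() -> str:
    """Run producer | consumer and return what the consumer printed."""
    producer = subprocess.Popen([sys.executable, "-c", PRODUCER],
                                stdout=subprocess.PIPE)
    consumer = subprocess.Popen([sys.executable, "-c", CONSUMER],
                                stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    producer.stdout.close()  # so the consumer sees end-of-file
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode().strip()


print(run_pipeline())
```

Data flows only from the producer to the consumer; a reply in the other direction would need a second pipe.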

Summary
The required degree of reliability varies between applications, so protocols should be built to suit each application
There is a variety of types of message destination, e.g. ports, processes or objects
Acknowledgements may be used for reliability, e.g. in a request-reply protocol
Multicast messages are used for group communication
The socket concept may be used for interprocess communication


5. Remote Procedure Calling
Remote Procedure Calling
Introduction
Design issues
Implementation
Asynchronous RPC
Summary
Introduction
Local procedure calling invokes a procedure internally (locally).
A procedure call involves a caller and a callee, and the work is carried out on the callee's side. The client-server principle can therefore be applied: the caller of the procedure acts as the client and the callee acts as the server.
Remote procedure calling mechanisms build in a communication arrangement, e.g. client-server.
Server perspective: at the RPC level, a service may be viewed as a module with an interface that exports a set of procedures for operating on a data abstraction or resource.
Client perspective: a service provides the same facilities as a software module, enabling clients to import its procedures.
Features of remote procedure calls; main aspects of the semantics of RPC
The definition of a remote procedure specifies input and output parameters
Parameters provide a direct equivalent to parameters passed by value in conventional procedure calls, and are marked as input, output or both
A remote procedure is executed in a different execution environment from its caller and cannot access variables in the calling environment
It is meaningless for a process to pass addresses of memory locations, or their equivalent, to other processes
User package: a library of conventional procedures that can be used in application programs, written in a
special notation. E.g. location of a server.


Design Issues
Classes of RPC system
The RPC mechanism is integrated with a particular programming language that includes a
notation for defining interfaces
A special purpose interface definition language is used for describing the interfaces between
clients and servers
Interface definition language: a language for specifying how procedures are to be called
Specifies the characteristics of the procedures, including the names of the procedures and the types of their parameters, each marked as input, output or both
An interface contains a list of procedure signatures, that is, their names and the types of their input/output arguments
Exception handling
Any remote procedure call may fail because the client cannot contact a server, or because the server has failed or is too busy
Exception handling mechanism consists of two parts
raising of exceptions
handling procedures
Delivery guarantees
Main choices
Retry request message
Duplicate filtering
Retransmission of replies
The combination of choices leads to a variety of possible semantics for the reliability of RPC as seen by the caller, that is, what the outcome of an RPC call means when client-server communication is used


Transparency
To make remote procedure calls as much like local procedure calls as possible, with no distinction in syntax between a local and a remote procedure call.
To make the semantics of remote procedure calls like those of local procedure calls, including the results obtained.

Implementation
The software that supports remote procedure calling has three main tasks
Interface processing
Communication handling
Binding
Interface processing

กอนจะมีการเรียกใช Procedure ในฝง Remote จะตองทําการติดตั้ง Procedure นันๆ ในเครื่อง Server
หรือ Remote กอนใหพรอมเรียกใชงานได
Interface compiler - process interface definitions written in an interface definition language.
Generate a client stub procedure to correspond to each procedure signature in the interface
Generate a server stub procedure to correspond to each procedure signature in the interface
Use the signatures of the procedures in the interface to generate appropriate marshalling and
unmarshalling operations in each stub procedure
Generate procedure headings for each procedure in the service from the interface definition. The
programmer supplies the bodies of these procedures
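To make the roles of the stub procedures concrete, here is a hand-written sketch of what an interface compiler might generate for a single procedure. Everything in it is an assumption for illustration: the procedure `add`, the use of JSON as the marshalled form, and the `transport` function that stands in for request-reply communication over a network.

```python
import json


# Server side: the programmer supplies the procedure body.
def add(a, b):
    return a + b


PROCEDURES = {"add": add}  # dispatcher table: procedure name -> body


def server_stub(request: bytes) -> bytes:
    """Unmarshal the request, call the procedure, marshal the reply."""
    call = json.loads(request.decode("utf-8"))
    result = PROCEDURES[call["procedure"]](*call["args"])
    return json.dumps({"result": result}).encode("utf-8")


def transport(request: bytes) -> bytes:
    """Stand-in for sending a request and waiting for the reply."""
    return server_stub(request)


# Client side: the stub has the same signature as the remote procedure.
def remote_add(a, b):
    request = json.dumps({"procedure": "add", "args": [a, b]}).encode("utf-8")
    reply = transport(request)
    return json.loads(reply.decode("utf-8"))["result"]


print(remote_add(2, 3))
```

The caller of `remote_add` sees an ordinary procedure call; the marshalling and communication are hidden inside the stubs.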

Communication handling
Deals with communication between client and server programs, generally by using request-reply communication
Binding: locating the service
Specifies a mapping from a name to a particular object, usually identified by a communication ID.
Binding is essential to avoid having to compile server port identifiers into client programs
Binder: a service that maintains a table containing mappings from service names to server ports; an example of a name service
[Figure: the server sends Register to the binder, the client sends Lookup to the binder, and the client then sends its Request to the server]

Binder interfaces
servers use Register and Withdraw
clients use LookUp
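The binder interface can be sketched as a small table of mappings from service names to server ports. The class below is a minimal in-memory illustration; the service name and port number are invented, and a real binder would itself be reached by RPC.

```python
class Binder:
    """Table mapping service names to server ports."""

    def __init__(self):
        self.table = {}

    def register(self, service_name, port):
        """Called by a server to advertise a service."""
        self.table[service_name] = port

    def withdraw(self, service_name):
        """Called by a server to remove its service."""
        self.table.pop(service_name, None)

    def look_up(self, service_name):
        """Called by a client; returns None if the service is unknown."""
        return self.table.get(service_name)


binder = Binder()
binder.register("FileService", 5001)   # hypothetical name and port
print(binder.look_up("FileService"))
binder.withdraw("FileService")
print(binder.look_up("FileService"))
```

Because clients obtain the port at run time, a server can move to a different port without any client being recompiled.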

Locating the binder: the options
Run the binder on a computer with a well-known host address and compile this address into client programs. All client programs must be recompiled if the binder is relocated
Make the client and server operating systems responsible for supplying the current host address of the binder at run time, e.g. via an environment variable. Users need to be informed whenever the binder is relocated.
When a program starts executing, it uses a broadcast message to locate the binder, with a
specified port number.


Asynchronous RPC
Synchronous RPC: the caller invokes a procedure, waits for the result to come back, and then continues.
Asynchronous RPC: the caller can invoke several procedures without having to wait for each result first.
Remote procedure calls that do not receive replies are termed asynchronous.
Analysis of throughput for asynchronous RPC
Parallel requests to several servers
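The benefit of issuing requests to several servers in parallel can be sketched with a thread pool. This is an illustration only: `remote_call` is a hypothetical stand-in that sleeps instead of contacting a server, and unlike the strict definition above the caller here does collect the replies, but only after all requests have been issued.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SERVERS = ["s1", "s2", "s3"]  # hypothetical server names


def remote_call(server_name, delay=0.1):
    """Stand-in for an RPC: the delay plays the role of network latency."""
    time.sleep(delay)
    return f"reply from {server_name}"


def call_sequentially():
    # Each call waits for its reply before the next one starts.
    return [remote_call(s) for s in SERVERS]


def call_in_parallel():
    # All requests are issued first; replies are collected afterwards.
    with ThreadPoolExecutor(max_workers=len(SERVERS)) as pool:
        futures = [pool.submit(remote_call, s) for s in SERVERS]
        return [f.result() for f in futures]


print(call_in_parallel())
```

With three servers the sequential version waits for roughly three delays, while the parallel version waits for roughly one, which is the throughput argument for asynchronous calls.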

Summary
The advantages of using RPC are mainly related to similarity to conventional procedure calling. The
benefits of data abstraction and software modularity can be achieved by the appropriate design and use
of RPC interfaces.
Interface definitions are required in RPC systems with some languages. An interface compiler
produces client and server stub procedures and a server dispatcher.
A client that makes an RPC uses a binder to locate a server that has registered the service with that binder.
RPC's usefulness is limited to distributed applications modeled as client-server. Other models, such as process groups, should also be considered.


6. Distributed Operating Systems
Distributed Operating Systems
Introduction
The kernel
Naming and protection
Communication and invocation
Virtual memory
Summary
Introduction
Broad characterization of a distributed operating system (general characteristics of operating systems, both distributed and conventional)
It enables a distributed system to be conveniently programmed, so that it can be used to implement a wide range of applications.
Presenting applications with general, problem-oriented abstractions of resources in a distributed
system.
No clear dividing line between the distributed operating system and applications that run on top
of it.
Focus on the part of a distributed operating system that acts as an infrastructure for general, network-transparent resource management: the resources in the system are managed by defining abstractions of them, so that users or programs can use them in the form that the operating system provides
low-level resources: e.g. processors, memory, network interfaces and peripheral devices
higher-level resources: e.g. windows, e-mail messages, spreadsheets, ...
A distributed operating system provides:
Encapsulation: useful services that hide the details of management and implementation from clients, so that users or programs need not know how they work internally
Concurrent processing: resource managers are responsible for achieving concurrency transparency, so that users or programs can use resources at the same time
Protection: resources require protection from illegitimate accesses, so there are mechanisms that prevent unauthorized use of resources


A combination of client libraries, kernels and servers may be called upon to perform the following
Name resolution: the server or kernel that manages a resource has to be located, from the
resource’s identifier
Communication: operation parameters and results have to be passed to and from resource
managers, over a network or within a computer
Scheduling: when an operation is invoked, its processing must be scheduled within the kernel or
server
The Kernel
The kernel is a program whose code is executed with complete access privileges for the resources
The kernel can set up address spaces, each a collection of ranges of virtual memory, to protect processes from one another
A kernel process executes with the processor in supervisor (privileged) mode; the kernel arranges that other processes execute in user (unprivileged) mode
The processor's mode determines the access rights to the various parts of the processor at that time
A kernel may depend on the hardware: it is tied to, and changes with, the processor
Monolithic kernels and microkernels
Monolithic kernels: all functionality is combined into a single unit that cannot be separated; usually large
Microkernels: functionality is divided into parts, and the kernel contains only the basic kernel functions, which makes it small and modular

Microkernel appears as a layer between hardware and subsystems
Comparison:
microkernel: openness, modularity (structure), small size (and therefore fewer bugs)
monolithic kernel: efficiency


Architecture of a microkernel
Process manager: handle the creation and low-level operations upon processes
Memory manager: manage memory and cache
Supervisor: dispatching of interrupts, system call traps and exceptions

A thread is the operating system abstraction of an activity
An execution environment is the unit of resource management: a collection of local kernel-managed
resources to which its threads have access
An execution environment primarily consists of
thread synchronization and communication resources such as semaphores and communication
interfaces. (e.g. ports)
A region is an area of contiguous virtual memory, accessible by threads of the owning process
Each region is specified by
its extent (lowest virtual address and size)
whether it can be grown upwards or downwards


Shared memory region
The uses of shared regions
Libraries
Kernel
Data sharing and communication

Creation of a new process
Choice of target host: select the machine on which the process will run
Creation of an execution environment: set up the environment in which it will execute
may use copy-on-write, a technique in which the new environment initially just points to the existing memory (nothing is actually copied yet); a real copy is made only when data is written or modified
Creation of a thread within it: create the unit of execution

Multiple threads in server; improve throughput


Creating a new thread within an existing process is cheaper than creating a process
More importantly, switching to a different thread within the same process is cheaper than switching between threads belonging to different processes
Threads within a process may share data and other resources conveniently and efficiently
compared to separate processes
But, by the same token, threads within a process are not protected from one another

Naming and Protection
Names are used to identify resources
The identification of a resource in the system requires knowledge of:
a port or port group to reach the server that manages the resource
a service-specific identifier for the resource (unless a unique port is associated with it)

Reconfigurability: changes to resources and the issues to consider
Server relocation
Resource mobility
Location transparency
Migration transparency
Resource protection: protecting resources from unauthorized use
Protection domains, implemented using
Capabilities: determine whether a resource can be used or not
Access control lists: specify the details of the access allowed, e.g. can read but cannot write
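The difference between the two protection mechanisms can be sketched in a few lines: an access control list is stored with the resource and consulted by principal, while a capability is a token held by the client. The resource name, the users and the use of an HMAC signature to make capabilities hard to forge are all assumptions chosen for illustration, not a description of any particular system.

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(16)  # secret known only to the resource manager

# Access control list: kept with the resource, keyed by principal.
ACL = {"report.txt": {"alice": {"read", "write"}, "bob": {"read"}}}


def acl_allows(user, resource, operation):
    return operation in ACL.get(resource, {}).get(user, set())


def _sign(resource, rights):
    text = resource + ":" + ",".join(sorted(rights))
    return hmac.new(SERVER_KEY, text.encode(), hashlib.sha256).hexdigest()


def issue_capability(resource, rights):
    """Capability: a token naming a resource and the rights it grants."""
    return (resource, frozenset(rights), _sign(resource, rights))


def capability_allows(capability, resource, operation):
    cap_resource, rights, signature = capability
    genuine = hmac.compare_digest(signature, _sign(cap_resource, rights))
    return genuine and cap_resource == resource and operation in rights


cap = issue_capability("report.txt", {"read"})
print(acl_allows("bob", "report.txt", "write"),
      capability_allows(cap, "report.txt", "read"))
```

With an access control list the server must know who is asking; with a capability, possession of a genuine token is what grants access.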

Communication and Invocation
Questions concerning the communication provision in a distributed operating system
What basic communication primitives are supplied?
What quality of service guarantees are made?
Which protocols are supported?
How open is the communication implementation?
What steps are taken to make communication as efficient as possible?
Communication primitives: which forms of communication are available
Memory sharing
used for rapid communication
Quality of service: how good the service is
reliable or unreliable
latency and bandwidth
Protocols and openness: the specifications or standards used
TCP, UDP, IP


Virtual Memory
Virtual memory is the abstraction of single-level storage that is implemented, transparently, by a combination of primary memory and backing storage
In virtual memory systems, part of main memory is used as a cache of the contents of backing storage. This makes it possible to:
run programs whose associated code and data exceed the capacity of main memory
increase the level of multiprogramming by increasing the number of processes whose working code and data can be stored in main memory simultaneously
remove the concerns of physical memory limitation from programmers
External pagers
to receive and deal appropriately with data that have been purged by a kernel from its cache of
pages, as part of kernel’s page replacement policy
to supply page data as required by a kernel to complete its page fault handling
to impose consistency constraints determined by the underlying memory object abstraction

Summary
Kernel architecture: monolithic kernels and microkernels
Naming of resources and their protection
Communication and invocation
Virtual memory, external pager


7. File service: A model and Case Studies
File Service: A Model
Introduction
File service components
Design issues
Interfaces
Implementation techniques
Summary
Introduction
The file is an abstraction of permanent storage.
A file is a permanent store of data and a basic building block of information systems.
A file system is included in the operating system for the organization, storage, retrieval, naming, sharing and protection of files.
A file is defined as a sequence of similar-sized data items with functions.
File system modules:

Distributed file service requirements
A distributed file service is an essential and usually the most heavily-used service in a general-
purpose distributed system
The design of the file service should
support many transparency requirements
balance the flexibility and scalability gained from transparency against software complexity and performance
Forms of transparency partially or wholly addressed by most current file services
Access transparency
Location transparency
Concurrency transparency
Failure transparency
Performance transparency


Important requirements that affect the usefulness of a distributed file service
Hardware and operating system heterogeneity (openness)
Scalability
Forms of transparency needed when scaling to very large numbers of nodes, currently being developed
Replication transparency
Migration transparency
Important features not found in current file services, but expected in the future
Support for fine-grained distribution of data
Tolerance to network partitioning and detached operation

File Service Components
An example of a file service, structured as
A flat file service, which uses Unique File Identifiers (UFIDs)
A directory service, which maps text names to UFIDs

A client module (user package), which provides the programming interface
Design Issues
Flat file service: stores the data
files contain both data and attributes
Fault tolerance
servers can be stateless, so they can be restarted and restored
Directory service: assigns names to files
one-dimensional (non-hierarchic) or hierarchic directories
uses capabilities and access control lists for security
Client module
uses RPC

Interfaces
An example of a service interface, for invocation via RPC
list of procedures
brief explanation of each procedure
Flat file service: example procedures for the flat file service

Directory service: example procedures for the directory service


Implementation Techniques
File identifications
File groups
UFID
Space leaks
security
capabilities and access control
access mode and permission field
File representation
Caching
server cache
client cache

Summary
Requirements for a file service
File service model, an example, three components
a flat file service, a directory service and a client module
Design issues for each component
Service interfacing for each component, procedures
Some implementation techniques


8. Name Services
Name Services
Introduction
The SNS - a Simple Name Service Model
Discussion of the SNS and further design issues
Case studies: DNS, GNS and X.500
Summary
Introduction
การใหบริการชื่อในปจจุบนเรียกอีกอยางหนึ่งวา Directory Service
Names are used to refer to a wide variety of resources, e.g. computers, services, ports, users...
Names and attributes

A name service stores a database of bindings between a set of textual names and attributes for objects
The major operation is to resolve a name, that is, to look up attributes in the database for a given name
General name service requirements (the key characteristics of a name service)
To handle an essentially arbitrary number of names and to serve an arbitrary number of
High availability: the service is heavily used, so it must be highly available
Fault isolation
Tolerance of mistrust


The SNS - a name service model (an example)
Basic requirements: start by defining the basic requirements
the objects named are users, services, computers and groups
other types of named object may be integrated
the names are used only within the organization
since name lookups are frequent, they should be carried out efficiently
only authorized users may alter data held by the SNS, but all users may read all data stored by it (access control)
SNS names: defining the form of names, e.g. .gene, .cs.gene, .cs.distrib.gene, .phys.gene
name space
name components
prefix
naming domain
SNS data and operations: the data used to describe a name
data in form <Type, Value>
service operations - Bind, Lookup, Unbind
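The three service operations and the <Type, Value> form of the data can be illustrated with a small in-memory table. This is only a sketch: the class name, the choice of "computer" as a type and the address value are assumptions, and a real name service would partition the data across several name servers.

```python
class SimpleNameService:
    """In-memory table of bindings from textual names to <Type, Value>."""

    def __init__(self):
        self.bindings = {}

    def bind(self, name, type_, value):
        self.bindings[name] = (type_, value)

    def lookup(self, name):
        # Returns the <Type, Value> pair, or None if the name is not bound.
        return self.bindings.get(name)

    def unbind(self, name):
        self.bindings.pop(name, None)


sns = SimpleNameService()
sns.bind(".cs.gene", "computer", "192.0.2.10")  # illustrative type and value
print(sns.lookup(".cs.gene"))
sns.unbind(".cs.gene")
print(sns.lookup(".cs.gene"))
```

Resolving a name is then a single Lookup; the Type field tells the client how to interpret the Value it gets back.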

Name resolution: looking up a name
Name servers and navigation: the data is partitioned across several name servers to divide up the management
components
user agent (UA)
name server (NS)
method : iterative

Caching
performance
availability

Discussion of the SNS and further design issues
The name space: limitations in the structure of names, which lacks flexibility to accommodate change
Relative names
Merging
Heterogeneity
Customization
Reconstructing
Attribute-based naming: uses attributes as keys to provide naming services
Yellow pages services: attribute-based name services
White pages services: conventional name services
Replication: replicating the data and the service


Case study: DNS
The Internet Domain Name System (DNS) was designed to replace the original Internet naming scheme, /etc/hosts, because:
the original scheme does not scale to large numbers of computers
local organizations wish to administer their own naming systems
a general name service is needed, not only for looking up computer addresses
Domain names consist of one or more strings, called labels, separated by ‘.’
domain names may be computer names or others
the root, ‘.’, and top-level domains may have sub-domains:
US organizations, e.g. com, edu, gov, mil, net, org, ...
Countries, e.g. us, uk, fr, th, sg, jp, …
DNS queries
Host name resolution
Mail host location
Reverse resolution
Host information
Well known services
DNS name servers – Primary and Secondary (or Back-up)
DNS naming data are divided into zones; a zone contains
Attribute data for names in the domain
The names and addresses of authoritative name servers for the zone
The names and addresses of authoritative name servers for sub-zones (glue data)
Zone management parameters

In DNS, the user agent is called a resolver
The DNS architecture allows for both recursive and iterative navigation
Resource records are zone data
The Berkeley Internet Name Domain (BIND) is an implementation of the DNS in BSD UNIX

Case study: GNS (illustrates the advantage of a flexible name structure)
Global Name Service (GNS) was designed and implemented by DEC Systems Research Center
Naming
uses unique directory identifiers (DI)
values are stored in a value tree
<directory name, value name>
Directory tree is flexible for
Merging
Reconstructing

Case study: X.500 (illustrates the advantage of convenient information search)
The X.500 Directory service
standardized by the CCITT and ISO
supports white and yellow pages services
tree -> Directory Information Tree (DIT)
data -> Directory Information Base (DIB)
server -> Directory Service Agents (DSA)
client -> Directory User Agents (DUA)


Summary
Name services manage textual names and attributes
Main design issues
structure of name space; e.g. based on name or attributes, syntax, …
resolution model
management of bound names, e.g. divide into domains


how to manage reconfigurations of naming domains
merging and changing
interfaces supported by the service, operations, e.g. bind
Implementation
use of replication and caching, but beware of inconsistencies
Examples of name services, SNS, DNS, GNS and X.500


9. Time and Coordination
Time and Coordination
Introduction
Synchronizing physical clocks
Logical time and logical clocks
Distributed coordination
Summary
Introduction
Introduces some concepts and algorithms related to the timing and coordination of events occurring in distributed systems; time is the tool used to order events and to coordinate work
Synchronize clocks in different computers
external synchronization
internal synchronization
Depend upon developments
maintaining consistency of distributed data
checking authenticity of a request sent to a server
eliminating the processing of duplicate updates
Synchronizing physical clocks
Each computer contains its own physical clock
Clock output can be read by software and used to timestamp events
Clock resolution - the period between updates of the clock register - should be smaller than the time
interval between successive events, so that successive events receive different timestamps
Clocks may differ in drift (rate) and offset
Clock drift - clocks count time at different rates, due to physical variations, e.g. a drift rate of 10^-6 gives an
error of one second in every 1,000,000 seconds, or about 11.6 days
Offset - the difference between the readings of two clocks, which may be due to their drift rates
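The drift figure quoted above can be checked with a little arithmetic. The following is a minimal sketch in Python; the function names are illustrative and do not come from the lecture material.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_until_error(drift_rate, max_error_s):
    """Real time that elapses before a clock drifting at `drift_rate`
    (seconds of error per second) has accumulated `max_error_s` seconds."""
    return max_error_s / drift_rate


def days_until_error(drift_rate, max_error_s):
    """Same quantity expressed in days."""
    return seconds_until_error(drift_rate, max_error_s) / SECONDS_PER_DAY
```

With a drift rate of 10^-6, `days_until_error(1e-6, 1.0)` gives roughly 11.6 days, in line with the example.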

The most accurate physical clocks use atomic oscillators, accurate to about one part in 10^13; they are the basis of
International Atomic Time
The standard second has been defined as 9,192,631,770 periods of transition between the two hyperfine levels of the
ground state of Caesium-133, instead of by astronomical time
Coordinated Universal Time (UTC) is an international standard that is based on atomic time
UTC signals are synchronized and broadcast regularly from land-based radio stations and satellites
Accuracy of a received UTC signal is a function of both the accuracy of the source and its propagation delay,
which depends on distance and signal speed
Compensating for clock drift – the problem of setting the time
If the time at computer C is greater than UTC, it may be possible simply to set its clock back to UTC
This simple setting may confuse applications, e.g. the make utility, which relies on file timestamps
The solution is not to set the clock back, but to cause it to run slow for a period, until it is in accord
with the time service
Changing the time value directly (step or offset change) corrects the time quickly but may cause side
effects. Changing the rate at which the clock runs (clock rate change) causes no side effects, but it takes
longer to bring the time into agreement
Cristian’s method for synchronizing clocks

Process P requests the time from the time server in a message mr and receives the time value t in the reply; it could set its
clock to t + Ttrans, where Ttrans is the transmission time
Ttrans is subject to variation: Ttrans = min + x, x >= 0
Cristian suggested recording the total round-trip time, Tround
P estimates the time by setting its clock to t + Tround/2
The accuracy of the result is +/- (Tround/2 - min)
The best accuracy is obtained by taking the minimum Tround over several requests
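The estimate and its accuracy bound can be written down directly. This is a minimal sketch of Cristian's method in Python; the function names and the `(t_server, t_round)` sample format are assumptions made for illustration.

```python
def cristian_estimate(t_server, t_round, t_min=0.0):
    """Return (estimated time, accuracy bound) for one request.

    t_server: time value t returned by the time server
    t_round:  measured round-trip time Tround
    t_min:    minimum one-way transmission time, `min` in the notes
    """
    estimate = t_server + t_round / 2
    accuracy = t_round / 2 - t_min  # result is correct to +/- this value
    return estimate, accuracy


def best_of(samples, t_min=0.0):
    """Given several (t_server, t_round) samples, use the one with the
    smallest round-trip time, which gives the tightest accuracy bound."""
    t_server, t_round = min(samples, key=lambda s: s[1])
    return cristian_estimate(t_server, t_round, t_min)
```

For example, a reply of t = 100.0 s with a 20 ms round trip and a 5 ms minimum transmission time gives an estimate of 100.010 s, accurate to within 5 ms.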
The Berkeley algorithm
Developed for collections of computers running Berkeley UNIX
A master periodically polls the slaves for synchronization
The master estimates their local times by observing the round-trip times
It averages the values obtained, including its own clock reading. The average cancels out the tendencies of
individual clocks to run fast or slow
The master sends each slave the amount, positive or negative, by which to adjust its clock
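The averaging step can be sketched as follows. This Python sketch assumes the slave readings have already been corrected for round-trip delay, and it leaves out refinements such as discarding faulty clocks; the function name and dictionary layout are illustrative.

```python
def berkeley_adjustments(master_time, slave_times):
    """Compute the adjustment each machine should apply.

    master_time: the master's own clock reading
    slave_times: dict mapping slave name -> estimated clock reading
    Returns a dict mapping name -> signed adjustment, with the master
    listed under the (hypothetical) key "master".
    """
    readings = {"master": master_time, **slave_times}
    average = sum(readings.values()) / len(readings)
    # A positive value means "advance your clock", negative means "slow down".
    return {name: average - value for name, value in readings.items()}
```

With readings of 180, 205 and 170 minutes past midnight (3:00, 3:25, 2:50), the average is 185, so the adjustments are +5, -20 and +15 minutes respectively.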

The Network Time Protocol (NTP)
Defines an architecture for a time service and a protocol to distribute time information over the Internet
Primary servers are directly connected to a time source such as a radio clock receiving UTC
Secondary servers are synchronized to primary servers
The servers are connected in a logical hierarchy called a synchronization subnet, whose levels are called strata

Primary servers occupy stratum 1, at the root
Stratum 2 servers, secondary servers, are synchronized to primary servers
Stratum 3 servers are synchronized from stratum 2 servers, and so on
NTP servers synchronize with one another in one of 3 modes
Multicast mode; used on high-speed LANs, a server multicasts the time to other servers connected by
the LAN
Procedure-call mode; similar to Cristian’s algorithm, higher accuracy than multicast
Symmetric mode; used by master servers that supply time in LANs and by higher levels (lower
strata), highest accuracy, a pair of servers exchange time information
In all modes, messages are delivered unreliably, using UDP
For each pair of messages exchanged, the servers calculate an offset, oi, and a total transmission delay, di

A data filtering algorithm is applied to successive pairs <oi, di> to estimate the offset and the quality of that
estimate, a statistical quantity called the filter dispersion
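The notes do not show how oi and di are obtained. A minimal sketch in Python, assuming the usual four-timestamp exchange between servers A and B (A sends, B receives, B replies, A receives); the function names are illustrative, and choosing the pair with the smallest delay is a simplification of the real filtering algorithm.

```python
def offset_and_delay(t_send_a, t_recv_b, t_send_b, t_recv_a):
    """Return (offset, delay) for one message pair.

    t_send_a and t_recv_a are read from A's clock, t_recv_b and t_send_b
    from B's clock. The offset estimates how far B's clock is ahead of A's.
    """
    # Total time on the wire: elapsed time at A minus processing time at B.
    delay = (t_recv_a - t_send_a) - (t_send_b - t_recv_b)
    offset = ((t_recv_b - t_send_a) + (t_send_b - t_recv_a)) / 2
    return offset, delay


def best_offset(pairs):
    """Pick the offset from the (offset, delay) pair with the smallest delay."""
    return min(pairs, key=lambda p: p[1])[0]
```

If B's clock is 5 s ahead, each one-way trip takes 1 s and B spends 0.5 s processing, then `offset_and_delay(10.0, 16.0, 16.5, 12.5)` recovers an offset of 5.0 and a delay of 2.0.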
Logical time and logical clocks
Happened-before relation
HB1: If x and y are events in the same process p and x occurs before y, then x -> y
HB2: For any message m, send(m) -> rcv(m)
HB3: If x, y and z are events such that x -> y and y -> z, then x -> z

A logical clock is a monotonically increasing software counter, whose value need bear no particular
relationship to any physical clock
Each process p keeps its own logical clock, Cp, which it uses to timestamp events
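The notes do not list the rules for updating Cp. The sketch below assumes Lamport's usual rules: increment before each local event or send, and on receipt take the maximum of the local value and the message timestamp, then increment. The class and method names are illustrative.

```python
class LamportClock:
    """A per-process logical clock (software counter)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Call before a local event or before sending a message.
        The returned value is the event's timestamp."""
        self.time += 1
        return self.time

    def receive(self, message_time):
        """Call when a message stamped `message_time` arrives."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

If process p timestamps a local event, then a send, and process q receives that message, the three timestamps are strictly increasing, as the happened-before relation requires.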

Distributed coordination
Distributed processes often need to coordinate their activities
Locks may not be enough for synchronization
Distributed mutual exclusion and elections may be used
Distributed mutual exclusion
ME1:(safety) At most one process may execute in the critical section (CS) at a time
ME2:(liveness) A process requesting entry to the CS is eventually granted it
ME3:(ordering)          Entry to the CS should be granted in happened-before order
The central server algorithm
Employs a server that grants permission to enter a critical section

Processes send requests to the server and wait for a reply
The reply grants a token that signifies permission
When the process leaves the critical section, it releases the token by returning it to the server

A distributed algorithm using logical clocks
uses a logical token; a state variable in each process indicates its token state: RELEASED, WANTED or HELD
a process that wants to enter sets its state to WANTED, multicasts a request message to all the others, and waits
until all have replied before taking the token (permission) and setting its state to HELD
logical clock timestamps on the requests control the order in which they are granted
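The decision each process makes when a request arrives can be sketched as below. This assumes the algorithm is the one usually attributed to Ricart and Agrawala, with ties between equal clock values broken by process identifier; message passing is left out, and the class and method names are illustrative.

```python
RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.state = RELEASED
        self.request_stamp = None  # (logical clock value, pid) of own request
        self.deferred = []         # senders whose replies are being withheld

    def request(self, clock_value):
        """Start a request; the returned stamp is multicast to the others."""
        self.state = WANTED
        self.request_stamp = (clock_value, self.pid)
        return self.request_stamp

    def on_request(self, stamp, sender):
        """Return True to reply at once, False if the reply is deferred."""
        if self.state == HELD or (
            self.state == WANTED and self.request_stamp < stamp
        ):
            self.deferred.append(sender)
            return False
        return True

    def enter(self):
        """Call once replies have arrived from all other processes."""
        self.state = HELD

    def release(self):
        """Leave the critical section; returns the senders to reply to now."""
        self.state = RELEASED
        self.request_stamp = None
        pending, self.deferred = self.deferred, []
        return pending
```

When two processes request at the same logical time, the one with the lower identifier wins: the other replies immediately, while its own request is deferred until the winner releases.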

A ring-based algorithm
uses a logical ring, unrelated to the physical interconnections
a token message is passed from process to process in a single direction round the ring

the token message keeps passing around the ring when no process wants it
if a process wants to enter the critical section, it waits for the token and then keeps it until it exits the critical section
Example: Token Ring network => Question: what is the critical section in the operation of a
Token Ring network?

Election Algorithms
เปน Algorithm สําหรับเลือกตัวแทนขึ้นมา ในกลุมสมาชิกที่ทํางานอยู โดยการตรวจสอบและคนหาวา
้
โปรเซสใดที่มีหมายเลขประจําโปรเซสสูงที่สุด ณ เวลานัน จะไดรับการเลือกขึ้นมาทําหนาที่เปนตัวแทน
การกําหนดหมายเลขประจําโปรเซสเปรียบเสมือนการกําหนดลําดับการเปนตัวแทนของกลุมไวแลว ซึ่ง
จะกําหนดเอาไวกอนการเลือกตัวแทน

The bully election algorithm
members know the identities and addresses of other members
three types of messages
election - announces an election
answer - sent in reply to an election message
coordinator - announces the new coordinator

o a process sends an election message to the processes with higher identifiers
o it waits for an answer for a limited time; if none arrives, it considers itself the coordinator and sends a coordinator message to the processes with lower identifiers
o otherwise, it waits for a coordinator message to arrive from the new coordinator; if none arrives, it begins another election
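The outcome of the steps above can be simulated without real messages or timeouts. This is a simplified sketch in Python: it follows a single chain of take-overs, whereas in the real algorithm every higher process that answers starts its own election. The function name and arguments are illustrative.

```python
def bully_election(initiator, alive, all_ids):
    """Return the identifier of the elected coordinator.

    initiator: id of the process that starts the election
    alive:     set of ids of processes that are running (they answer)
    all_ids:   ids of all group members
    """
    candidate = initiator
    while True:
        # "election" goes to every process with a higher identifier;
        # only live processes send an "answer".
        answering = [p for p in all_ids if p > candidate and p in alive]
        if not answering:
            # No answer within the timeout: the candidate is the coordinator.
            return candidate
        # An answering process takes over and runs its own election.
        candidate = min(answering)
```

With members 1 to 6 and processes 4 and 6 crashed, an election started by process 2 ends with process 5 as coordinator.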

A ring-based election algorithm
suitable for processes arranged in a logical ring
elects the coordinator with the largest identifier

o     initially, every process is marked as a non-participant
o     to begin an election, a process marks itself as a participant and sends an election message containing its own id
o     a receiving process compares the id in the message with its own; if the arrived id is greater, it forwards the message
o     if the arrived id is smaller and the receiver is not a participant, it puts its own id in the message, forwards it
and marks itself as a participant
o     if the arrived id is smaller and the receiver is already a participant, it does not forward the message
o     if the received id is its own, then its id is the greatest and it becomes the coordinator; it marks itself as a non-
participant and sends an elected message
o     when a process receives an elected message, it marks itself as a non-participant and forwards the message
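The election phase of the ring algorithm can be simulated for a single initiator as follows. This is a sketch in Python under stated assumptions: identifiers are unique, no process fails during the election, and any process that forwards a message is marked as a participant. The function name and return value are illustrative.

```python
def ring_election(ids, initiator_index):
    """Simulate one election on a logical ring.

    ids: process identifiers in ring order (assumed unique)
    Returns (coordinator id, number of election messages sent).
    """
    n = len(ids)
    participant = [False] * n
    participant[initiator_index] = True
    msg_id = ids[initiator_index]      # id carried by the election message
    pos = (initiator_index + 1) % n    # next process on the ring
    messages = 1
    while True:
        own = ids[pos]
        if msg_id == own:
            # The process sees its own id: it holds the greatest identifier.
            return own, messages
        if msg_id > own:
            participant[pos] = True    # forward the message unchanged
        elif not participant[pos]:
            msg_id = own               # substitute own, larger id
            participant[pos] = True
        else:
            # Already a participant and the arriving id is smaller: discard.
            # This only arises when elections run concurrently.
            return None, messages
        pos = (pos + 1) % n
        messages += 1
```

On the ring `[3, 7, 2, 5]` with the election started by the process holding id 3, the process with id 7 is elected after five election messages; the elected message then makes one further trip round the ring.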

Summary
Described the importance of accurate time-keeping for distributed systems
Described algorithms for synchronizing clocks despite drift and the variability of message
delays
The happened-before relation, logical time and logical clocks
Described the need for processes to access shared resources under mutual exclusion
Three algorithms for distributed mutual exclusion: central server, distributed using logical clocks,
and ring-based
Election algorithms, bully and ring-based, elect a new master time server or lock server when the
previous one fails

10. Replication
Replication
Introduction
Basic architectural model
Consistency and request ordering
The gossip architecture
Process groups
Summary
Introduction
Replication is the maintenance of on-line copies of data and other resources
The motivations for replication
Performance enhancement
Enhanced availability
with n servers, each failing independently with probability p, the availability equals 1 - p^n
Fault tolerance
The chief requirement is replication transparency
The main issues are replication transparency, consistency and performance
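The availability formula is easy to explore numerically. The following is a small sketch in Python, assuming server failures are independent; the function names are illustrative.

```python
def availability(p, n):
    """Probability that at least one of n replicas is up, when each
    fails independently with probability p."""
    return 1 - p ** n


def replicas_needed(p, target):
    """Smallest n for which availability(p, n) reaches `target`."""
    if not (0 <= p < 1 and target < 1):
        raise ValueError("need 0 <= p < 1 and target < 1")
    n = 1
    while availability(p, n) < target:
        n += 1
    return n
```

For servers that are each down 5% of the time, two replicas give an availability of 99.75%, and three are needed to exceed 99.9%.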

Basic Architectural Model

Replica managers (RM) - processes that contain the replicas and perform operations upon them directly
Front Ends (FE) - may be separate processes or libraries linked into clients; they provide replication
transparency and communicate with replica managers on behalf of clients
Clients (C) - make a series of requests, each either read-only or an update

Approaches for replication
The gossip approach

The process group approach

The gossip architecture
the front end communicates with an individual replica manager for each operation
replica managers exchange gossip messages periodically in order to convey the updates

The primary copy model
an alternative to the gossip architecture; all front ends communicate with the same ‘primary’ server when
updating a data item
the primary server propagates the updates to the other servers, the slaves
front ends may read the item from a slave
if the primary server fails, one of the slaves can be promoted to act as primary

Multi-user editor architecture (Peers) or Multi-Master
multi-user collaboration, called groupware
each executing instance of the editor program holds a replica of the overall document state

Consistency and Request Ordering
Consistency requirements lead to ordering constraints for processing requests at different replica managers
Consistency can be controlled by controlling the order in which requests are processed
Types of request ordering
Totally ordered request processing
all replica managers process the requests in the same order
Causally ordered request processing
uses the notion of causal ordering, i.e. happened-before ordering
Sync-ordered request processing
a sync-ordered request forces the order of requests processed at replica managers to be ‘in
sync’ by flushing any outstanding requests

Implementing request ordering
hold-back: a received request is not processed by a replica manager until the ordering constraints
can be met
a request message is said to be stable at a replica manager if all prior requests have been
processed
implemented using a hold-back queue and a processing queue

Implementing total ordering
assign totally ordered identifiers to requests
two main methods for assigning identifiers to requests
using a sequencer
distributed agreement
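Hold-back and sequencer-assigned identifiers fit together naturally. The sketch below, in Python, assumes a sequencer has already attached consecutive sequence numbers starting at 0 to the requests; the class and attribute names are illustrative.

```python
import heapq


class HoldBackQueue:
    """Holds requests back until every earlier request has been processed."""

    def __init__(self):
        self.next_seq = 0     # sequence number of the next stable request
        self.held = []        # hold-back queue: min-heap of (seq, request)
        self.processed = []   # processing queue, in total order

    def receive(self, seq, request):
        """Accept a request that may have arrived out of order."""
        heapq.heappush(self.held, (seq, request))
        # A request becomes stable once all prior requests are processed.
        while self.held and self.held[0][0] == self.next_seq:
            _, stable = heapq.heappop(self.held)
            self.processed.append(stable)
            self.next_seq += 1
```

If request 1 arrives before request 0, it stays in the hold-back queue; as soon as request 0 arrives, both are moved to the processing queue in order. Every replica manager therefore processes requests in the same order, whatever the arrival order.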

Implementing causal ordering with vector timestamps
use ordered identifiers: vector timestamps, each a list of counts of update events, one for each of the
replica managers
compare the versions or timestamps of a data item held by different replica managers
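Comparing and merging vector timestamps can be sketched in a few lines of Python. Timestamps are represented here as equal-length lists with one count per replica manager; the function names are illustrative.

```python
def happened_before(a, b):
    """True if timestamp a is causally before timestamp b:
    every entry of a is <= the matching entry of b, and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a, b):
    """True if neither timestamp is causally before the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)


def merge(a, b):
    """Entry-wise maximum, used when one replica learns of another's updates."""
    return [max(x, y) for x, y in zip(a, b)]
```

For three replica managers, `[1, 0, 0]` precedes `[1, 1, 0]`, while `[1, 0, 0]` and `[0, 1, 0]` are concurrent and merge to `[1, 1, 0]`.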

The Gossip Architecture
Clients request operations which are processed initially by a front end
Front ends normally communicate with only a single replica manager at a time, although they are free to
communicate with others
Replica managers update one another by exchanging gossip messages, which contain the most recent updates they have received
They update in a lazy fashion, exchanging messages only occasionally

Process Groups
A process group is a collection of processes that co-operate towards a common goal

■ Group structures - are defined according to the pattern of communication
– peer group: all communication is directed from processes within the group to the group
– server group: all servers receive each request, but only one replies, e.g. client-server group
– subscription group: a group of processes that are sent the same information from a source;
members do not reply to the messages and process the information independently
– hierarchical group: a large group can be divided into sub-groups, arranged as a
root group, sub-groups and sub-groups of sub-groups
Group services
group membership management
multicast communication

The Process group and Gossip
The process group approach is represented by ISIS and is compared with the gossip model
Trade-off between amount of communication and timeliness of update delivery
For a large number of replica managers, the number of messages that must be transmitted and processed becomes a consideration

Summary
Replication is an important means of achieving good performance, high availability and fault
tolerance in distributed systems
A general replicated system consists of clients, front ends and replica managers
Consistency requires that all replicas process all updates from clients in consistent orders
Three types of ordering: total, causal and sync-ordered
Explained the concepts of the gossip and process group architectures and compared them

11. Concurrency Control
Concurrency Control
Introduction
Locks
Optimistic concurrency control
Timestamp ordering
Comparison of methods for concurrency control
Summary
Introduction
Transactions must be scheduled so that their effect on shared data is serially equivalent
การควบคุมการใชงานพรอมกัน (Concurrency Control) เปนการพยายามใหมีการใชงานทรัพยากรให
                           ้            ่
พรอมกันใหมากที่สุด เพื่อใหเกิดความคุมคาของการใชทรัพยากรนันๆ โดยทียังคงความถูกตองของการ
ทํางานไดดวย
Concurrency control for servers may be modeled in terms of Read and Write operations on the data
items

All of the concurrency control protocols are based on the criterion of serial equivalence
They are derived from rules for conflicts between operations

Locks
Locks are used to order transactions that access the same data items according to the order of arrival of
their operations at the data items
Two-phase locking: a transaction is not allowed to acquire any new locks after it has released a lock
in the first phase, the growing phase, new locks are acquired
in the second phase, the shrinking phase, locks are released
Agreed convention: while a resource is locked, no other party may use it
A simple exclusive lock, used for both read and write, reduces concurrency
Concurrency is improved by a ‘many readers/single writer’ scheme – operations are divided into Read and Write
There are read locks and write locks
A write operation must wait for all other locks on the item to be released, whereas a read operation waits only for a write lock and not for other read operations
read locks are sometimes called shared locks – they define the conflict rule for access

Strict two-phase locking: any locks applied during the progress of a transaction are held until the
transaction commits or aborts

Lock implementation
the granting of locks is implemented by a separate module of the server program, called the lock
manager
the lock manager maintains a table of locks for data items; each entry in the table of locks includes
the transaction identifiers of the transactions that hold the lock
an identifier for a data item
a lock type
a condition variable
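The lock table and the many readers/single writer rule can be sketched as below. This Python sketch is non-blocking: where a real lock manager would wait on the entry's condition variable, `try_lock` simply returns False. The class and method names are illustrative.

```python
READ, WRITE = "read", "write"


class LockManager:
    def __init__(self):
        # data item identifier -> (lock type, set of holding transactions)
        self.table = {}

    def try_lock(self, trans, item, lock_type):
        """Return True if the lock is granted, False if trans must wait."""
        entry = self.table.get(item)
        if entry is None:
            self.table[item] = (lock_type, {trans})
            return True
        held_type, holders = entry
        if holders == {trans}:
            # The sole holder may re-lock, or promote a read lock to write.
            if lock_type == WRITE:
                self.table[item] = (WRITE, holders)
            return True
        if held_type == READ and lock_type == READ:
            holders.add(trans)  # read locks are shared
            return True
        return False            # conflict with another transaction's lock

    def unlock_all(self, trans):
        """Release every lock held by trans (at commit or abort)."""
        for item in list(self.table):
            _, holders = self.table[item]
            holders.discard(trans)
            if not holders:
                del self.table[item]
```

Releasing all locks together at commit or abort, as `unlock_all` does, corresponds to strict two-phase locking.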

deadlock is a state in which each member of a group of transactions is waiting for some other
member to release a lock

Deadlock can be detected by using a wait-for graph
A wait-for graph can be used to represent the waiting relationships between current transactions at a
server

A deadlock can be resolved by aborting a transaction in the cycle, which breaks the cycle

Deadlock prevention: simple but not very good
lock all of the data items used by a transaction when it starts
Deadlock detection: by finding cycles in the wait-for graph
edges are added and removed by the lock manager during lock and unlock operations
the graph can be represented by a table of
(transaction, waits-for transaction) pairs
cycles are detected in this table
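Cycle detection in the wait-for graph can be done with a depth-first search. This is a sketch in Python, assuming the table is held as a dictionary from each transaction to the set of transactions it waits for; the function name is illustrative.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None.

    waits_for: dict mapping transaction -> set of transactions it waits for
    """
    visiting, done = set(), set()
    path = []

    def visit(t):
        visiting.add(t)
        path.append(t)
        for u in waits_for.get(t, ()):
            if u in visiting:
                # Back edge: the cycle is the part of the path from u onward.
                return path[path.index(u):]
            if u not in done:
                cycle = visit(u)
                if cycle:
                    return cycle
        visiting.discard(t)
        done.add(t)
        path.pop()
        return None

    for t in list(waits_for):
        if t not in done:
            cycle = visit(t)
            if cycle:
                return cycle
    return None
```

Once a cycle is returned, the lock manager can pick one transaction in it to abort, which breaks the cycle.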
Timeouts
Lock timeouts are a commonly used method for resolving deadlocks
There are many problems with the use of timeouts
the worst is that transactions are sometimes aborted because their locks time out even though there is no deadlock
in an overloaded system, the number of transactions timing out will increase and
transactions taking a long time can be penalized
it is hard to decide on an appropriate length for a timeout

Increasing concurrency in locking schemes
Two-version locking – analyzes access to resources at a finer level
a transaction writes tentative versions of data items while others read the committed versions
a transaction commits by using commit locks

Increasing concurrency in locking schemes
Hierarchic locks – partition the resources more finely
improve granularity
children

Optimistic Concurrency Control
Optimistic concurrency control allows transactions to proceed until they are ready to commit,
whereupon a check is made to see whether they have performed conflicting operations on data items
Each transaction has the following phases
Read phase: each transaction works on its own tentative versions
Validation phase: check for conflicts with other transactions, then commit or abort
Write phase: the tentative versions are made permanent
Validation of transactions

Backward validation
checks the transaction undergoing validation against other preceding overlapping transactions,
those that entered the validation phase before it
Forward validation
checks the transaction undergoing validation against other later transactions, which are still
active
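The notes do not spell out what is compared during validation. The sketch below assumes the usual read-set/write-set formulation: backward validation compares the validating transaction's read set with the write sets of earlier overlapping transactions, and forward validation compares its write set with the read sets of active ones. The function names are illustrative.

```python
def backward_validate(read_set, earlier_write_sets):
    """Pass if nothing this transaction read was written by an
    overlapping transaction that entered validation before it."""
    return all(read_set.isdisjoint(ws) for ws in earlier_write_sets)


def forward_validate(write_set, active_read_sets):
    """Pass if nothing this transaction wrote has been read by a
    transaction that is still active."""
    return all(write_set.isdisjoint(rs) for rs in active_read_sets)
```

A transaction that fails backward validation has to be aborted, since the earlier transactions have already committed. With forward validation there is a choice, because the conflicting transactions are still active.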

Timestamp Ordering
Timestamp ordering uses timestamps to order transactions that access the same data items according to
their starting times
The conflict rule is based on the timestamp ordering rule
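A simplified form of the timestamp ordering rule is sketched below in Python. It is an assumption-laden illustration: it keeps only one read timestamp and one write timestamp per data item and ignores tentative versions, which a full implementation would track. The class names are illustrative.

```python
class Abort(Exception):
    """Raised when an operation arrives too late and must be rejected."""


class Item:
    def __init__(self):
        self.value = None
        self.read_ts = 0    # largest timestamp that has read the item
        self.write_ts = 0   # timestamp of the latest write

    def read(self, ts):
        # A read is too late if a later transaction has already written.
        if ts < self.write_ts:
            raise Abort("read too late")
        self.read_ts = max(self.read_ts, ts)
        return self.value

    def write(self, ts, value):
        # A write is too late if a later transaction has read or written.
        if ts < self.read_ts or ts < self.write_ts:
            raise Abort("write too late")
        self.write_ts = ts
        self.value = value
```

Here the timestamp is the transaction's starting time, so a transaction that started earlier but operates later on the same item is aborted.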

Comparison of Methods for Concurrency Control
The timestamp ordering method is similar to two-phase locking
Timestamp ordering decides the serialization order statically, whereas two-phase locking decides
the serialization order dynamically
Optimistic concurrency control is relatively efficient when there are few conflicts

Summary
Operation conflicts form a basis for the derivation of concurrency control protocols
Protocols must not only ensure serializability but also allow for recovery by using strict executions
When a server receives a request for an operation, it may choose to
(i) execute it immediately, (ii) delay it, or (iii) abort it
Strict two-phase locking uses the first two strategies and aborts only in case of deadlock; its drawback is the
possibility of deadlock
Timestamp ordering uses all three; it is advantageous for read-only transactions, but transactions must be aborted
when they arrive too late
Optimistic concurrency control allows transactions to proceed without any form of checking until they
are completed; transactions are validated before being allowed to commit
Concurrency control is carried out according to conflict rules
Concurrency can be increased by
analyzing and dividing accesses more finely
analyzing and partitioning resources more finely

12. Recovery and Fault Tolerance
Recovery and Fault Tolerance
Introduction
Transaction recovery
Fault tolerance
Hierarchical and group masking of faults
Summary
Introduction
Most fault-tolerant applications are either
transaction based; suitable for long-lived shared data
process control based; for real-time applications and applications that need to run correctly
The factor that distinguishes them is recovery time
The recovery of transactions is concerned with ensuring failure atomicity in the presence of occasional
server failures
Transaction Recovery
A server records its committed data in a recovery file or files
A recovery manager ensures the effects of transactions on a server’s data items can be recovered when a
server is restarted after a failure
The task of a recovery manager
save data items in permanent storage for committed transactions
restore the server’s data items after a crash
reorganize the recovery file to improve the performance of recovery
reclaim storage space (in the recovery file)
An intentions list is kept for each of a server’s currently active transactions and records the data items that the
transaction has altered - when a transaction is committed,
the server uses that transaction’s intentions list to identify the data items affected

Logging
Logging - the recovery file represents a log containing the history of all the transactions performed by a
server
A recovery manager restores a server’s data items by reading the recovery file backward
A recovery manager reorganizes its recovery file for faster processing and reduced use of space
Checkpointing writes the current committed values of a server’s data items to a new recovery file
it prepares a new recovery file
it reduces the number of transactions to be dealt with during recovery
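Reading the recovery file backward can be sketched as follows. The record format used here, tuples of the form `("write", transaction, item, value)` and `("commit", transaction)`, is a hypothetical one chosen for illustration, as is the function name.

```python
def recover(log):
    """Rebuild committed data item values from a log held in write order.

    Reading backward means each transaction's commit record is seen
    before its writes, and the first committed value found for an item
    is its most recent one.
    """
    committed = set()
    restored = {}
    for record in reversed(log):
        if record[0] == "commit":
            committed.add(record[1])
        elif record[0] == "write":
            _, trans, item, value = record
            if trans in committed and item not in restored:
                restored[item] = value
    return restored
```

Writes by a transaction with no commit record, for example one that was active when the server crashed, are ignored.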

The shadow versions technique uses a map to locate versions of data items in a file called the version store
Versions of any data items changed by a transaction are appended to the version store
These new, as yet tentative, versions are called shadow versions

Two-Phase Commit Protocol Transaction Recovery
Recovery of the two-phase commit protocol in servers
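uses status values, including done and uncertain, in the recovery file
a coordinator operates with workers
phase 1: the coordinator, once prepared to commit, asks every worker to vote yes (commit) or no (abort)
phase 2: the coordinator collects the votes and tells the workers the decision; a worker that voted yes waits, uncertain,
for that decision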
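The two phases and the status values can be sketched in Python as below. This is a failure-free illustration: there are no timeouts, crashes or real messages, and the status strings other than done and uncertain are assumptions, as are the class and function names.

```python
class Worker:
    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.status = "active"

    def vote(self):
        """Phase 1: a yes vote leaves the worker uncertain of the outcome."""
        self.status = "uncertain" if self.can_commit else "aborted"
        return self.can_commit

    def decide(self, decision):
        """Phase 2: an uncertain worker adopts the coordinator's decision."""
        if self.status == "uncertain":
            self.status = decision


def two_phase_commit(workers):
    """Run the protocol; returns (decision, coordinator status history)."""
    log = ["prepared"]
    votes = [w.vote() for w in workers]          # phase 1: collect all votes
    decision = "committed" if all(votes) else "aborted"
    log.append(decision)
    for w in workers:                            # phase 2: announce decision
        w.decide(decision)
    log.append("done")
    return decision, log
```

Recording each status change in the recovery file is what allows a coordinator or worker to resume the protocol at the right point after a crash.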

Fault Tolerance
Software and hardware of computer systems may fail from time to time
Two important points in the presence of faulty components
services sometimes fail, but it is hard for a service to detect whether another computer has really failed
services may fail in a variety of independent ways

A set of servers running in different computers can be combined for joint execution, which is less
likely to fail than any individual component
a service can take advantage of the availability of multiple computers to mask failures
A designer needs to be aware of the range of possible failures
Failure semantics - a description of the ways in which a service may fail
A fault-tolerant service operates according to its specification in the presence of faults in other services
on which it depends
A server masks a failure in a service on which it depends, either by hiding it altogether or by converting it
to one of the faults it is allowed to exhibit
Classes of failure
Omission failure - fails to respond
Response failure - responds incorrectly

Timing failure - responds too late or too early
Performance failure - responds too late
Crash failure - repeated omission failure

Types of server failure, by function
Fail-stop; when it is about to fail, the server changes to a state that permits other servers to detect that a failure has
occurred, and then stops
Byzantine failure; arbitrary failure, faulty computers may behave maliciously, e.g. reply with incorrect values
based on the Byzantine Generals Problem
some of the generals may be loyal and some traitors
they vote for a plan of action

Message Originators and Authentication

message originators may
2N+1 servers can tolerate N bad servers

Hierarchical and Group Masking of Faults
Hierarchical masking: a situation in which a server depends on lower-level services
the server at the higher level attempts to mask the faults at the lower level, which may then be entirely hidden
when a lower-level failure cannot be masked, it is converted to a higher-level exception
in general, at each level, failures are either hidden entirely or passed on as exceptions to the level above
Group masking: a service is implemented as a group of servers, running on different computers
if some of the servers fail, the remaining ones can continue
group masking hides the failure of individual members by a group management mechanism
a group is t-fault tolerant if it performs correctly so long as no more than t of its members fail at the
same time as one another
Fail-stop; t+1 servers can mask up to t member failures
Byzantine failure with authentication of the sender; 2t+1 servers can mask up to t member
failures
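One way to see why 2t+1 members suffice in the authenticated case is majority voting on the replies: with at most t faulty members, any value reported by at least t+1 members must have come from a correct one. The Python sketch below illustrates this; the voting scheme and the function name are assumptions, not something the notes specify.

```python
from collections import Counter


def masked_reply(replies, t):
    """Return the value backed by at least t+1 identical replies from a
    group of at least 2t+1 members, or None if no value has that support."""
    if len(replies) < 2 * t + 1:
        raise ValueError("need at least 2t+1 replies to mask t failures")
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= t + 1 else None
```

With three members and one faulty reply, `masked_reply([42, 42, 7], 1)` still yields 42. For fail-stop members no voting is needed, since any reply that arrives is correct, which is why t+1 members are enough.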

closely synchronized group: all members execute all of the requests immediately after receiving
them
loosely synchronized group: one server, primary, is used so long as it performs correctly and other
servers, backup or stand-by, are available to take over when it fails
the primary/stand-by servers arrangement can mask crash failures, but cannot be used for
Byzantine failure
Summary
Transaction-based applications have strong requirements for the longevity and integrity of information stored by
transactional services, but do not usually require immediate response
Transaction recovery is based on recovery files, which may be logs or shadow versions
Fault tolerance is described in terms of failure semantics for classes of failure
Up to N fail-stop failures can be masked by a group of N+1 servers
In the Byzantine generals model, N faulty servers can be masked by a group of 2N+1 sender-authenticated servers

13. Security
Security
Introduction
Cryptography
Authentication and key distribution
Case study: Kerberos
Digital signatures
Summary
Introduction
Security mechanisms must be employed to implement the security policies
Principal refers to the agents accessing the information or resources
In the security model, a principal has a name, e.g. username, and a secret key, e.g. password
Threats: the purpose of a security system is to restrict access to information and resources to principals
that are authorized to have access; security threats to computer systems:
Leakage: the acquisition of information by unauthorized recipients
Tampering: the unauthorized alteration of information, including programs
Resource stealing: the use of facilities without authorization
Vandalism: interference with the proper operation of a system without gain to the perpetrator
Methods of attack
Eavesdropping; obtaining copies of messages without authority
Masquerading; sending or receiving messages using the identity of another principal without their
authority
Message tampering; intercepting messages and altering their contents before passing them on to
the intended recipient
Replaying; storing messages and sending them at a later time
Infiltration - most attacks are launched by one of the legitimate users of a system
Virus: a program that attaches itself to a legitimate host program, installs itself in the target environment and
replicates itself
Worm: a program that exploits facilities for running processes remotely, e.g. Internet Worm
Trojan horse: a program that is offered to the users as performing a useful function, but has a
second function hidden in it, e.g. spoof login


Client-Server Scenarios

Security requirements for client-server systems
Secure the channels of communication used, to avoid eavesdropping
Design clients and servers to view each other with mutual suspicion, and to perform appropriate
message exchanges
servers must be satisfied that clients act on behalf of the principals that they claim to represent
clients must be satisfied that the servers providing particular services are the authentic
servers for those services
Ensure that communication is fresh in order to avoid security violations through the replay of
messages
Security mechanisms for distributed systems are based on
Cryptography
Authentication
Access control
Cryptography - encryption of messages plays three major roles
concealing private information
supporting mechanisms for authenticating communication between pairs of principals
implementing digital signatures


Authentication mechanisms
in centralized systems, a user’s identity can be authenticated by a password check
in distributed systems, authentication means that the identities of servers and clients are reliably
established, typically by means of an authentication service
Access control mechanisms - ensuring that access to information resources and hardware resources is
available only to the subset of users that are currently authorized to do so
Cryptography
A message is encrypted by the sender applying some rule to transform it from plain text to cipher text
To avoid the need to devise new rules, the encryption and decryption transformations are defined in
two parts, a function and a key
The function defines an encryption algorithm
A text M encrypted with key K is written {M}K
Keys can be either secret keys or public keys
Keys may be distributed by a secure service, a key distribution service
Secret-key encryption
also called private-key encryption
uses one key for both encryption and decryption
Data Encryption Standard (DES) is the most widely-used secret-key encryption method
uses a 56-bit key to encrypt and decrypt 64-bit blocks
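
The defining property of secret-key encryption, one key used in both directions, can be shown with a toy cipher. This is not DES (which is not in Python's standard library) and is not secure; it is a sketch that XORs the data with a keystream derived from the key using SHA-256, and the function names are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])

def crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))
```

Applying crypt twice with the same key returns the plain text, which is why both parties must hold the key and keep it secret.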

Public-key encryption
uses two keys, one for encryption and another for decryption
RSA (Rivest, Shamir and Adleman) is a widely-used public-key encryption method
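
A textbook-sized RSA sketch shows how the two keys relate. The primes p = 61 and q = 53 and the exponent e = 17 are illustrative assumptions; real keys are thousands of bits long and real systems add padding, which is omitted here.

```python
# Toy RSA key generation (requires Python 3.8+ for pow(e, -1, phi)).
p, q = 61, 53                # small primes, for illustration only
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: e * d == 1 (mod phi)

def encrypt(m: int) -> int:
    """Encrypt with the public key (e, n); m must be smaller than n."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Decrypt with the private key (d, n)."""
    return pow(c, d, n)
```

Anyone may know (e, n) and encrypt, but only the holder of d can decrypt.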

Key distribution is an important issue: keys should be transmitted over secure channels, e.g. a different physical
communication channel

Comparison of secret- and public-key cryptography
Security: with suitable keys and algorithms both methods are secure enough for all normal
purposes
Convenience: public-key method can be more convenient to implement because a secret channel is
not required to distribute the keys, but authenticated communication is required
Performance: secret-key methods are much faster

Authentication and Key Distribution - Secret keys
Key distribution based on an authentication server

Authentication and Key Distribution - Public keys


Case Study: Kerberos
Based on a secret-key protocol
Developed at MIT to provide a range of authentication and security facilities for use in the Project Athena campus network
Security objects in Kerberos
Ticket: a token issued to a client by the Kerberos ticket-granting service for presentation to a
particular server
Authenticator: a token constructed by a client and sent to a server to prove the identity of the user;
it can be used only once, and contains the client’s name and a timestamp, encrypted in the session key
Session key: a secret key randomly generated by Kerberos and issued to a client for use when
communicating with a particular server
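
The "used only once" and timestamp properties of an authenticator can be sketched as a freshness check on the server side. Encryption in the session key is deliberately omitted, and the class names and the 300-second acceptance window are assumptions made for illustration, not details taken from the Kerberos specification.

```python
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class Authenticator:
    client: str        # client's name
    timestamp: float   # time at which the client built the authenticator

class Server:
    def __init__(self, max_age: float = 300.0):
        self.max_age = max_age   # hypothetical acceptance window, in seconds
        self.seen = set()        # authenticators already presented

    def accept(self, auth: Authenticator, now: float = None) -> bool:
        now = time.time() if now is None else now
        if abs(now - auth.timestamp) > self.max_age:
            return False         # stale: possibly a replayed old message
        if auth in self.seen:
            return False         # already used once: replay
        self.seen.add(auth)
        return True
```

A second presentation of the same authenticator is rejected, which is how replay of a captured message is defeated.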


Application of Kerberos
Servers protected by Kerberos require a ticket from each client at the start of every client-server
interaction
Login with Kerberos: the login program sends the user’s name to the Kerberos authentication service (AS),
which replies with a session key and a ticket
Accessing typical servers with Kerberos: the client requests a ticket for the service from the
ticket-granting service
Digital Signatures
Handwritten signatures are widely used as an authentication technique for conventional documents, but are not
applicable to computer-based documents
An electronic document or message M can be ‘signed’ by a principal A by encrypting a copy of M in a key
KA and attaching it to a plain-text copy of M and A’s identifier: <M, A, {M}KA>
To reduce the size of digital signatures for potentially large documents, a digest function D is used to
produce a characteristic value that uniquely identifies the message to be signed, similar to a checksum; D(M) is
different from D(M’) for all likely M and M’
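
The signed triple <M, A, {D(M)}KA> can be sketched with Python's standard library. Two substitutions are assumed here: SHA-256 plays the role of the digest function D, and HMAC with a shared secret key stands in for "encrypting the digest in KA", so this corresponds to the secret-key variant in which the verifier also holds the key. Function names are hypothetical.

```python
import hashlib
import hmac

def digest(message: bytes) -> bytes:
    """D(M): a fixed-size characteristic value of the message."""
    return hashlib.sha256(message).digest()

def sign(message: bytes, principal: str, key: bytes):
    """Return the triple <M, A, tag>, where tag is keyed on D(M)."""
    tag = hmac.new(key, digest(message), hashlib.sha256).digest()
    return (message, principal, tag)

def verify(signed, key: bytes) -> bool:
    """Recompute the tag from M and compare it with the attached one."""
    message, _principal, tag = signed
    expected = hmac.new(key, digest(message), hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)
```

The tag has a fixed size regardless of the document length, and any change to M makes verification fail.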
Digital signatures with public keys

Digital signatures with secret keys


Summary
Security threats: leakage, tampering, resource stealing and vandalism
Security mechanisms for distributed systems are based on cryptography, authentication and access
control
Two classes of cryptography: secret-key and public-key
Keys should be distributed over secure channels
Authentication of principals depends upon the existence of a trusted third party
Kerberos, an example of an authentication service, uses tickets to grant access to services on servers
Digital signatures can be used to authenticate the origins of electronic documents


14. Distributed Systems Update
Distributed Systems Update
Introduction
Distributed Architectures
–Host-Terminal, Client-Server, Middleware, Agent-based
Mobile Computations (Mobile Code)
–Mobile Code Technologies
  System Environments
  Mobility Mechanisms
  Software System Constitution
–Mobile Code Applications
Conclusion
Introduction: Objectives
Distributed systems are widely used and increasingly significant in applications.
Discuss the concepts, architectures and evolution of distributed systems.
To gain a clearer view of distributed systems
Discuss a current concept
–Mobile Code or Code Mobility

Introduction: Foundations
Network communications
Software system developments and concepts
–Interfaces and Programming
  APIs – Application Programming Interfaces
  Terminal/User Interfaces
  RPC – Remote Procedure Calling
–Services


Distributed Architecture
Host-Terminal (not truly distributed)
Client-Server
Middleware
Agent-based
Host-Terminal Architecture
Host – Process/Execute jobs
Terminal – Input/Output (I/O), User Interface
–Dumb terminal
–Terminal emulator running on a workstation
Applications run on a host, no communication API
Allows programs written without consideration for distribution to be used remotely without modification.
Client-Server Architecture
Servers – open channel and wait for others
Clients – initiate contact to servers
Peers – act as both a client and a server
Applications run on both client and server
–degree of distribution between client and server
Clients identify and communicate directly with servers.
Middleware Architecture
The concept of “middleware” assumes a functional layer between clients and servers;
–provides services, e.g. location and alias resolution, authentication and transaction semantics.
–allows clients to interact with a generic abstraction of a server rather than with a specific host and/or process.
–allows applications to be developed to a standardized API without knowledge of the location or implementation of external functionality.
–A strength is implementation hiding.


Middleware Architecture: Examples
DCE – Distributed Computing Environment, by OSF
Distributed Objects
Java – by Sun
DCOM – Distributed Component Object Model, by Microsoft
CORBA – Common Object Request Broker Architecture, by Object Management Group (OMG)

Agent-based Architecture
An agent is a software component that is able to
–achieve a goal by performing actions, and
–react to events in a dynamic environment.
Agents are well supported by object-oriented technologies.

Mobile Computations
Terms
–Mobile code, or code mobility, is defined as the capability to dynamically change the bindings between code fragments and the location where they are executed.
–Mobile Code Systems (MCS) are systems that use migration techniques.
Mobile Code Technologies
–Mobile Code Environments
–Mobility Mechanisms
–Software System Constitution
–Comparison
Mobile Code Applications

Mobile Code Environments
[Figure: True Distributed System. On every host, hardware is layered under a core operating system and a network operating system; a single true distributed system layer spans all hosts, and components run on top of it.]
[Figure: Mobile Code Systems (MCS). On every host, hardware is layered under a core operating system and a network operating system; each host has its own computational environment on top, and components run inside the computational environments.]


Mobile Code Environments
Computational Environments (CE)
–Executing Units (EU)
  code segment
  state
  –execution state (stack and instruction pointer)
  –data space
–Resources
[Figure: a computational environment containing executing units and resources; each executing unit consists of a code segment, an execution state and a data space.]

Mobility Mechanisms: Classification by Code and State
Code and Execution State Mobility
–Strong mobility (strong MCS) – migration of both code and execution state of an EU to a different CE.
  Migration – suspend EU, transmit to destination, and resume.
  Remote cloning – create a copy of EU at remote CE.
–Weak mobility (weak MCS) – code transfers across different CEs, no migration of execution state.
  Code fetching – fetch code and execute.
  Code shipping – ship code to another CE.


Mobility Mechanisms: Classification by Data Space
Resource binding forms
–By identifier: the EU is bound to a uniquely identified resource
–By value: the value of the resource must not change as a result of migration; the resource is accessed locally
–By type: the weakest form; any resource of the required type will do, regardless of its value or identity
Resources are either
–not transferable
–transferable: has two forms, fixed or free
Classes of data space management mechanisms upon migration
–Resource relocation
–Binding reconfiguration

Mobility Mechanisms: Data Space Relationships
Bindings, Resources and Data space management mechanisms
–Binding removal: the binding is discarded upon migration
–By move: for resources bound by identifier; the resource is transferred together with the binding
–By copy: for resources bound by value; a copy of the resource is transferred to the destination and the binding is modified to refer to the copy
–Re-binding: for resources bound by type; the binding is voided and re-established to an existing resource of the same type after migration to the destination
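
The relationship between binding form and data space management mechanism can be sketched as a single decision function. Everything here is a simplified model: the function name, the string labels for binding forms and the use of Python object types to stand for resource types are assumptions for illustration.

```python
import copy

def migrate_binding(form, resource, destination_resources):
    """Return the resource the EU is bound to after migrating to the destination CE."""
    if form == "identifier":
        # By move: the very same resource travels with the EU.
        return resource
    if form == "value":
        # By copy: the EU is re-bound to a copy created for the destination.
        return copy.deepcopy(resource)
    if form == "type":
        # Re-binding: pick any resource of the same type that already exists there.
        for candidate in destination_resources:
            if type(candidate) is type(resource):
                return candidate
        return None  # binding removal: nothing suitable, the binding is discarded
    raise ValueError(f"unknown binding form: {form}")
```

Returning None models binding removal; a real system would also have to check whether the resource is transferable at all before moving or copying it.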


Summary of bindings, resources and data space management mechanisms relationships

Software System Constitution: Basic Architectural Concepts
Components are constituents of a software architecture.
–code components: the know-how to perform a computation.
–resource components: data or devices used during the computation.
–computational components: active executors capable of carrying out a computation, as specified by the know-how.
Interactions are events that involve two or more components, e.g. messages exchanged among
computational components.
Sites host components and support the execution of computational components.

[Figure: two sites, SA and SB, each hosting a computational component together with code components (know-how) and resources; interactions connect the components.]

Client-Server
Remote Evaluation
Code on Demand
Mobile Agent


Client-Server (CS)
Client A wants services from servers.
Server B has code and resources.
[Figure: (1) A at site SA sends a request to B at site SB; (2) B executes its code on the resource; (3) B returns the result to A.]

Remote Evaluation (REV)
A wants services.
A has code, and B has resources.
[Figure: (1) A at site SA sends the code to B at site SB; (2) B executes it on the resource; (3) B returns the result to A.]

Code on Demand (COD)
A wants services.
A has resources, and B has code.
[Figure: (1) A at site SA sends a code request to B at site SB; (2) B sends the code to A; (3) A executes it on its local resource.]

Mobile Agent (MA)
A wants services.
A has code, and B has resources.
[Figure: (1) A migrates with its code from site SA to site SB; (2) A executes at SB on the resource.]


Comparison
                     Before                      After
Paradigm             SA           SB             SA                   SB
Client-Server        A            know-how       A                    know-how
                                  resource                            resource
                                  B                                   B*
Remote Evaluation    know-how     resource       A                    know-how (moved)
                     A            B                                   resource
                                                                      B*
Code on Demand       resource     know-how       resource             B
                     A            B              know-how (moved)
                                                 A*
Mobile Agent         know-how     resource       -                    know-how (moved)
                     A                                                resource
                                                                      A* (moved)
* marks the component that executes the code; (moved) marks a component that has been moved.
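
The four paradigms differ only in where the know-how starts and where it is executed, which a small simulation can make concrete. The Site class and the function names are hypothetical; code is modelled as a Python callable and "moving" it is modelled as passing or reassigning that callable, with no real network involved.

```python
class Site:
    def __init__(self, name, code=None, resource=None):
        self.name = name
        self.code = code          # know-how: a callable, or None
        self.resource = resource  # data the code operates on, or None

    def execute(self, code):
        """Run the given know-how against this site's resource."""
        return code(self.resource)

def client_server(sa, sb):
    # SB owns both code and resource; SA only sends a request and gets the result.
    return sb.execute(sb.code)

def remote_evaluation(sa, sb):
    # SA ships its code to SB, which executes it on SB's resource.
    return sb.execute(sa.code)

def code_on_demand(sa, sb):
    # SA fetches the code from SB and executes it on SA's own resource.
    return sa.execute(sb.code)

def mobile_agent(sa, sb):
    # The agent leaves SA with its code and continues executing at SB.
    sb.code, sa.code = sa.code, None
    return sb.execute(sb.code)
```

In this model only mobile_agent changes which site owns the code afterwards, matching the "-" entry for SA in the After column of the comparison.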

Mobile Code Applications
Distributed Information Retrieval
Active Documents
Workflow Management and Cooperation
Active Network

Conclusion
Distributed Architectures
–host/terminal, client/server, middleware, agent-based
Code Mobility
–Computational Environment, EU, execution state, data space
–Basic Concepts:
  components – code, resource, computational
  interactions
  sites
–mobility of code, resources and bindings
–Applications


References
[1] Ted Burghart, “Distributed Computing Overview”, Quoin Inc., June 1998. Available at
http://www.quoininc.com/quoininc/dist_comp.pdf
[2] Alfonso Fuggetta, Gian Pietro Picco and Giovanni Vigna, “Understanding Code Mobility”, IEEE
Transactions on Software Engineering, Vol. 24, No. 5, May 1998, pp. 342-361.
[3] Mario Baldi, Silvano Gai and Gian Pietro Picco, “Exploiting Code Mobility in Decentralized and
Flexible Network Management”, Proceedings of the 1st International Workshop on Mobile
Agents (MA'97), Berlin, Germany, 1997.

*****************************************************

Attachments

Papers
Distributed Computing Overview

QUOIN
1208 Massachusetts Avenue, Suite 3
Cambridge, Massachusetts 02138

tel:     617.492.6461
fax:     617.492.6461
email:   info@quoininc.com
web:     www.quoininc.com
Distributed Computing Overview                                                          June, 1998

Introduction
The rise of networked workstations and fall of the centralized mainframe has been the most
dramatic change in the last two decades of information technology. This shift has put more
processing power in the hands of the end-user and distributed hardware resources throughout the
enterprise. No longer the domain of raised floors and data centers, processing power now resides
on desktops, workgroup servers, and minis. This shift first involved hardware; the current
challenge is to develop the software infrastructure to make use of these now distributed
resources.
As networks of computing resources have become prevalent, the concept of distributing related
processing among multiple resources has become increasingly viable and desirable. Over the
years, several methods have evolved to enable this distribution, ranging from simplistic data
sharing to advanced systems supporting a multitude of services. This paper presents an
overview of the means used to enable distribution of computing work, covering core concepts and
popular implementations of those concepts. The objective is to educate the audience as to the
technologies available and their strengths and weaknesses.
This white paper was developed by senior technical staff at QUOIN, a software development and
consulting firm that specializes in distributed objects and business components.

The Common Foundation
Network Communication
Underlying all distributed computing architectures is the notion of communication between
computers. Although basic, this shows how close, conceptually, some common distribution
architectures are to their underlying communication facilities. The combination of hardware and
system-level software that enables computers to communicate is frequently referred to as the
transport layer. When several computers are connected to one another through a common
transport layer, they can be considered a network of computers.
Much as a piece of information can be wrapped, addressed and sent through the postal service,
networks generally operate on packets, which are analogous to the package you might send
through the mail. Like the mail package, the network packet has ‘from’ and ‘to’ addresses and
contains some information, such as a message. Also like the mail message, the receiver may or
may not elect or be compelled to acknowledge receipt of the packet.
If either the mail or network message exceeds certain size limits, it may need to be broken up into
separate parts and reassembled upon arrival at its destination. However, these physically
separate packets can be treated as single logical packets. The transport layer, addressing
semantics, packet sequencing, data formatting and a host of other defined components make up a
communications protocol. These pre-defined protocols are what allow computer systems to
properly interpret the packets received from other systems.
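
The splitting and reassembly described above can be sketched in a few lines. The packet layout used here (a dictionary with a sequence number, a total count and a payload) is an assumption for illustration and does not correspond to any particular protocol.

```python
def fragment(message: bytes, max_payload: int):
    """Split a message into packets that each carry at most max_payload bytes."""
    total = (len(message) + max_payload - 1) // max_payload
    return [
        {"seq": i, "total": total,
         "payload": message[i * max_payload:(i + 1) * max_payload]}
        for i in range(total)
    ]

def reassemble(packets):
    """Rebuild the logical message, whatever order the packets arrived in."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    if not ordered or len(ordered) != ordered[0]["total"]:
        raise ValueError("missing packets")
    return b"".join(p["payload"] for p in ordered)
```

The sequence numbers are what let the receiver treat physically separate packets as one logical packet, even when they arrive out of order.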

© 1998 QUOIN Inc.                          Ted Burghart                                     Page 2

Synchronous and Asynchronous Transmission
Just like the mail package, the sender’s interest in the subsequent receipt of the packet and
actions taken in response to it varies. There are cases where the sender isn’t concerned about
when, or perhaps if, the packet arrived at its destination. There are other cases where the sender
wants confirmation that the packet arrived, but doesn’t need such confirmation to continue its
task. There are also cases where the sender cannot continue until it receives a response from the
receiver.
Synchronous modes of operation are those where the sender needs a response from the receiver
before it can continue. Modes of operation where the sender doesn’t require a response from the
receiver, at least not before it can continue, are considered asynchronous. This distinction is
generally one of the primary factors in determining a given communication protocol’s suitability
for a given task.
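
The difference between the two modes can be sketched with Python's standard thread pool. The receiver function and the "ack:" reply format are hypothetical; the point is only that the synchronous sender blocks for the reply, while the asynchronous sender gets a handle it may consult later.

```python
from concurrent.futures import ThreadPoolExecutor

def receiver(packet: str) -> str:
    """Stand-in for the remote side: acknowledges whatever it receives."""
    return f"ack:{packet}"

def send_sync(packet: str) -> str:
    # Synchronous: the sender cannot continue until the response is available.
    return receiver(packet)

def send_async(executor: ThreadPoolExecutor, packet: str):
    # Asynchronous: returns at once with a future; the sender carries on and
    # may collect the confirmation later, or never.
    return executor.submit(receiver, packet)
```

A caller that needs confirmation but not immediately would keep the future and call its result() method only when the confirmation actually matters.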
Clients, Servers and Peers
The terms Client/Server and Peer-to-Peer have come to be associated with specific attributes of
distributed computing. In fact, clients, servers and peers are just roles played by the participants
in a communication protocol. These roles can change constantly within a communications
session. Note that these participants are actually threads of execution, which may exist on the
same system or even within the same process as the thread of execution with which they are
communicating.
When a thread of execution opens a communication channel and waits for another thread to
contact it, it can generally be considered a server. The thread that initiates the communication by
contacting the server is generally considered a client. Peer is a general term used to refer to a
thread that is able to act as both a client and a server.
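
The roles can be seen in a minimal loopback example using Python's socket module. The server thread opens a channel and waits; the client thread initiates contact. The upper-casing "service" and the function names are assumptions for illustration, and both threads run in the same process, as the text notes is possible.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Server role: wait for one client, read its message, reply in upper case."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

def run_demo() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server_thread = threading.Thread(target=serve_once, args=(listener,))
    server_thread.start()
    # Client role: initiate the communication by contacting the server.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello")
        reply = client.recv(1024)
    server_thread.join()
    listener.close()
    return reply
```

A peer, in these terms, would be a thread that both listens as serve_once does and opens outgoing connections as the client part of run_demo does.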
APIs - Application Programming Interfaces
Core communications facilities are generally provided by Operating System (OS) and network
requester APIs. These are groups of functions called by a program to accomplish the actual
transmission and receipt of bytes of data between systems. In general, these low-level
components provide limited abstraction of the underlying communication session, leaving
communicants to provide all logical services such as addressing and data conversion.

[Figure 1: Direct Network Communication. Client logic on Host ‘A’ and server logic on Host ‘B’ each sit on top of an OS/Network API; the two APIs are joined by a network connection.]


Terminal Interfaces
The oldest form of distributed computing isn’t generally recognized as such - logging in to a host
system from a dumb terminal or terminal emulator running on a workstation. While not terribly
elegant, this method has proven itself very effective.
A number of protocols exist for this type of communication, among them Telnet, rsh and rexec.
The concept and implementation are simple; the client acts much like a directly connected
terminal, but with some additional facilities allowing it to communicate through a remote
connection. Each time a key is pressed, the client sends a packet containing a code identifying
the key to the server. The server, in turn, sends back packets containing data to be displayed by
the client. While generally limited to textual interfaces, server-based applications are able to use
color and extended keys to enhance client interface functionality.
Among the benefits of terminal interfaces is the fact that in many cases they don’t require the
application to be written using a communication API at all, allowing programs which were
written without consideration for distribution to be used remotely without modification.
Messages
Next in the evolution of distributed computing comes the concept of the message, a packet of
data that is labeled with the information it contains. This allows an intermediate processing layer
at the server to route the message, or the data it contains, to the receiver appropriate to it.
Messaging systems can operate quite naturally in an asynchronous architecture. Because
message-based communication is well suited to intermediate routing, these features can be
combined to provide a level of abstraction to the communication framework itself. Messages
may be deposited into a queue by the server/router, from which they’re retrieved and acted upon
by one or more logical processors. These processors may not respond to the messages at all or
may respond directly to the client. However, to maintain the abstraction, they can send a
message back to the server, through another queue, to be routed back to the client.
Message-based architectures are also able to operate synchronously. Generally in this mode, the
server/router passes the message to the processor, which passes a response back to the server to
be returned to the client. Another hybrid mode is available, however, in which the server behaves
asynchronously as described above and only the client behaves synchronously. This
combination of behaviors allows the server to gain the efficiency of asynchronous operation
while the client benefits from the procedural simplicity and safety of synchronous processing.
This basic architecture of client communication with a server which dispatches messages based
on their content will be seen to underlie many of the following distribution models.
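
The dispatch-by-label idea can be sketched with one queue per message label. The Router class, the dictionary message format and the label names are hypothetical; a real messaging system would add persistence, acknowledgements and a reply path.

```python
import queue

class Router:
    """Server/router: deposits each message into the queue matching its label."""

    def __init__(self):
        self.queues = {}

    def register(self, label: str) -> queue.Queue:
        """Create the queue a processor will read for the given label."""
        q = queue.Queue()
        self.queues[label] = q
        return q

    def route(self, message: dict) -> None:
        # The sender does not wait for processing: the message is only queued.
        self.queues[message["label"]].put(message)
```

Because route returns as soon as the message is queued, the router behaves asynchronously; a client that wants synchronous behaviour would block on a reply queue of its own.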
RPC - Remote Procedure Call
The concept behind RPCs is simple - to make what appears to be a normal procedure call from
within a process and have its execution actually carried out within some other process, possibly
on a remote system. Various implementations of RPC protocols have been developed with the
common goal of reducing the complexity of communicating between processes through
implementation hiding.


The core concept of RPC mechanisms is that of serializing function call data into a sequential
stream and reconstructing it on the receiving end of the connection. This behavior takes place
synchronously, mirroring the semantics of traditional procedural programming. The RPC client
process makes a call to what appears to be a standard function, known as a stub. However,
rather than executing locally, the parameters passed to the function are packaged and transmitted
to a remote execution environment, where they are passed to the real implementation of the
function. Upon completion of the function’s execution, its return value is serialized and passed
back to the client stub, which returns it to the caller.
This behavior can be built upon the synchronous messages described previously.
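
A stub of this kind can be sketched with JSON as the serialization format. The procedure table, the request layout and the transport argument are assumptions for illustration; here the transport is a direct function call standing in for a synchronous message exchange over a network.

```python
import json

def add(a, b):
    """The real implementation, living in the remote execution environment."""
    return a + b

PROCEDURES = {"add": add}

def server_dispatch(request_bytes: bytes) -> bytes:
    """Reconstruct the call from the byte stream, run it, serialize the result."""
    request = json.loads(request_bytes)
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()

def make_stub(proc: str, transport):
    """Build a client stub that looks like an ordinary local function."""
    def stub(*args):
        request = json.dumps({"proc": proc, "args": list(args)}).encode()
        reply = transport(request)           # blocks until the reply arrives
        return json.loads(reply)["result"]
    return stub
```

For example, remote_add = make_stub("add", server_dispatch) gives the caller a function that is invoked as remote_add(2, 3), with the serialization hidden behind the stub.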

Client/Server
As mentioned above, the terms client and server really refer to generic roles played by the
participants in a communication session. However, the term Client/Server has become common
in its usage describing a higher level, though conceptually similar, architecture. The common
interpretation of the term denotes a system where significant processing is done on the client,
which also submits operations to the server for execution. In this type of architecture,
synchronous operation is generally assumed, wherein the client waits for confirmation that the
operation has been carried out before proceeding.
Database Protocols
The X/Open Call Level Interface (CLI) [CLI 96] specification provides an interface to
Relational Database Management Systems (RDBMS) using Structured Query Language (SQL)
[SQL 92]. Microsoft's Open Database Connectivity (ODBC) API [ODBC 96] is the best
known implementation of the CLI standard. Sun Microsystems’ Java Database Connectivity
(JDBC) API [JDBC 98] is a new implementation of the CLI standard specifically for Java
applications.
CLI and the architecture it supports are perhaps the most commonly envisioned usage of
client/server computing, allowing applications written using the standard to operate in most cases
without regard for the database to which they're connected. The drawback to this lowest-
common-denominator approach, of course, is that it does not provide access to some of the more
advanced features that differentiate RDBMS products.
The API presented by the specification ranges in appearance from a thinly-veiled messaging
interface to an RPC interface. The message-like components of the interface expose a hybrid
synchronous/asynchronous mode of operation wherein initial results are returned synchronously
while processing may continue asynchronously at the server. This allows the client to continue
processing as soon as the server is able to provide an initial set of results; further results are
queued by the server and returned to the client as they are requested. The RPC components are
used for control purposes and operate in a strictly synchronous manner.
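
The pattern of receiving an initial set of results and requesting the rest later can be illustrated with Python's built-in sqlite3 module. This is an analogy only: Python's DB-API is not an implementation of the X/Open CLI, SQLite runs in-process rather than on a server, and the table and function names are hypothetical.

```python
import sqlite3

def fetch_in_batches():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO item (name) VALUES (?)",
                     [("a",), ("b",), ("c",)])
    # Submitting the statement is a call-and-return operation.
    cursor = conn.execute("SELECT name FROM item ORDER BY id")
    first_batch = cursor.fetchmany(2)   # work can start on the initial results
    remaining = cursor.fetchall()       # further results are requested later
    conn.close()
    return first_batch, remaining
```

The same application code would run against another database by swapping the connection, which is the portability benefit, and the lowest-common-denominator cost, that the text describes.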


Middleware
Unlike client/server architectures where the client identifies and communicates directly with the
server, the concept of ‘middleware’ assumes a functional layer between the client and server.
This layer may provide services to the communicants such as location and alias resolution,
authentication and transaction semantics. Other behaviors associated with middleware include
time synchronization and translation between data formats.

[Figure 2: Middleware Architecture. A client process and several logic implementations run on separate hosts; they are connected through a middleware layer, which provides a set of services.]

This additional layer allows clients to interact with a generic abstraction of a server rather than
with a specific host and/or process. Various services are provided through abstracted layers as
well, blurring the distinction between services provided by the middleware and functionality
added by servers. These abstractions allow applications to be developed to a standardized API
without knowledge of the location or implementation of external functionality. This
implementation hiding is one of the middleware model's strengths, although it makes it difficult
for the client to determine what performance it can expect from any given logic implementation.
DCE - Distributed Computing Environment
OSF¹ DCE formalizes many of the concepts described here with a group of related specifications
[DCE 96]. The DCE RPC specification is among the most widely implemented in the industry,
providing consistent behavior across heterogeneous execution environments. The DCE
architecture also defines thread, time, authentication and security, directory and naming services.
Because DCE is supported by an industry consortium composed of many major operating
system vendors, its standards enjoy widespread support across major computing platforms.
DCE core functionality is included in almost every available variant of UNIX, and as PC
operating systems have become more advanced, DCE core service support has become more
common in them as well. These standards are based on procedural programming methods in the
'C' programming language, however, which limits their applicability to multilingual and
object-oriented deployments.

¹ The Open Software Foundation (OSF) has been renamed The Open Group (OG); however, DCE is
trademarked as "OSF DCE".

Reliable Messaging
On their surface, reliable messaging architectures such as IBM's MQSeries and Microsoft's
MSMQ appear much like the queued message frameworks described earlier. Beneath this
surface, however, they differ greatly in their implementation.
In order to provide reliable delivery of asynchronous messages, a 'store and forward' model is
used wherein a message to be sent by a process is passed synchronously to a middleware layer
that stores the message, and any addressing information it may contain, to a persistent storage
mechanism before returning control to the sending process. Once the message has been stored in
this manner, the middleware can use a variety of methods to try to get the message to its intended
recipient, while the sender continues its processing.
The reliability of the architecture comes from the concept that the current holder of the message
does not destroy its persistent copy of the message until it has received confirmation from a
subsequent receiver that the message has been safely stored by it. Because each link in the
communication chain stores the message until it knows it has been forwarded successfully, the
original sender can proceed with its processing assured that its message will get through to its
destination. Because of the asynchronous nature of this architecture, the sender must request
confirmation of receipt of the message (or confirmation must be a specified action upon receipt of
the particular message) if it needs to know when the message arrived or other details of its
handling.
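The store-and-forward rule described above (persist before acknowledging, delete only after the next holder confirms) can be sketched as follows. This is a minimal illustration, not how MQSeries or MSMQ are implemented: the class `Hop`, its method names, and the one-JSON-file-per-message storage layout are all assumptions made for the example.

```python
import json
import os
import tempfile


class Hop:
    """One link in the chain: persists each message before confirming receipt."""

    def __init__(self, store_dir, next_hop=None):
        self.store_dir = store_dir
        self.next_hop = next_hop
        os.makedirs(store_dir, exist_ok=True)

    def _path(self, msg_id):
        return os.path.join(self.store_dir, msg_id + ".json")

    def accept(self, msg_id, body):
        """Write the message to persistent storage, then return the confirmation."""
        with open(self._path(msg_id), "w") as f:
            json.dump({"id": msg_id, "body": body}, f)
            f.flush()
            os.fsync(f.fileno())
        return True

    def pending(self):
        """IDs of messages this hop still holds."""
        return sorted(n[:-5] for n in os.listdir(self.store_dir) if n.endswith(".json"))

    def forward_all(self):
        """Hand stored messages to the next hop; delete a copy only once confirmed."""
        if self.next_hop is None:
            return
        for msg_id in self.pending():
            with open(self._path(msg_id)) as f:
                msg = json.load(f)
            try:
                confirmed = self.next_hop.accept(msg["id"], msg["body"])
            except OSError:
                confirmed = False  # keep our copy and retry on a later pass
            if confirmed:
                os.remove(self._path(msg_id))


root = tempfile.mkdtemp()
receiver = Hop(os.path.join(root, "receiver"))
relay = Hop(os.path.join(root, "relay"), next_hop=receiver)

relay.accept("m1", "hello")  # the sender's synchronous call returns once stored
held_before = relay.pending()
relay.forward_all()          # forwarding happens while the sender does other work
held_after = relay.pending()
held_by_receiver = receiver.pending()
```

At every moment at least one hop holds a persistent copy, which is the property the text identifies as the source of the architecture's reliability.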

Distributed Objects
Object distribution architectures build upon the middleware concept by encapsulating data within
functional interfaces to objects. Like well-designed procedural APIs, implementation details are
hidden from the user of the object. Unlike traditional APIs, however, object architectures limit
access to the invocation of methods defined for the object. Furthermore, methods are invoked on
the objects indirectly, via references to the objects, eliminating the need for local instances of the
objects.

Figure 3: Distributed Object Architecture (diagram: an object client issues object interface calls to local and remote object implementations through a distribution/services layer)

This near-complete implementation hiding allows distributed object architectures to support
location, platform, and programming language transparency. Such transparency is not without its
costs, however, which has prompted the designers of some distributed object architectures to
forgo some neutrality in exchange for perceived improvements in performance, applicability to
specific tasks and/or ease of use.
Java RMI - Remote Method Invocation
Sun's Java, while relatively new to the computing industry, has gained a great deal of acceptance
due to its platform neutrality, safety and object-oriented design. Java was designed from the
ground up as a complete execution environment, rather than just another programming language;
it is therefore able to provide a consistent and abstract interface regardless of the underlying
platform.
This platform independence is accomplished through the use of a Java Virtual Machine (JVM)
that emulates a computing platform itself. The JVM is provided for each actual combination of
hardware and operating system upon which Java is to run. Because Java programs all appear, at
the application level, to be running on the same computing platform, communication between
Java applications is made significantly easier.
Java RMI [RMI 97] provides a language-specific architecture allowing Java-to-Java distributed
applications to be built easily. The main advantage to using Java RMI when designing a pure
Java distributed system is that the Java object model can be taken advantage of whenever
possible. Of course, this precludes using Java RMI in multilingual environments. Java's inherent
platform independence, however, still allows deployment in heterogeneous environments.
DCOM - Distributed Component Object Model
Microsoft's core object distribution protocol is DCOM [DCOM 98], an extension of
Microsoft's COM [COM 95] integration architecture, permitting interaction between objects
executing on separate hosts in a network.
COM began as a way to let client programs link to object implementations dynamically, i.e., at
run time, incorporating them into a single address space. The implementations were packaged in
Dynamic Link Libraries (DLLs). COM is essentially an integration scheme, adopting the
structure of C++ virtual function tables for binary compatibility. These virtual function tables,
commonly known as 'vtables', consist of a table of function addresses (in C language terms) or
the equivalent. COM interfaces are presented to clients as pointers to vtables, thereby hiding the
details of the implementation. This makes COM binaries independently replaceable, as long as
they implement the same interfaces as their predecessors.
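The idea of an interface as a pointer to a table of function addresses can be mimicked in Python, purely as an analogy. Real COM vtables are arrays of native function pointers; here a tuple of closures plays that role, and the two-slot interface, the `InterfacePointer` class, and the `make_square_*` functions are hypothetical names invented for the sketch.

```python
# Slot positions form the binary contract between client and implementation
# (a hypothetical two-method interface, invented for this example).
SLOT_NAME, SLOT_AREA = 0, 1


def make_square_v1(side):
    """First 'binary': returns a table of functions, analogous to a vtable."""
    def name():
        return "square"

    def area():
        return side * side

    return (name, area)


def make_square_v2(side):
    """Replacement 'binary': different internals, same slots in the same order."""
    cache = {}

    def name():
        return "square"

    def area():
        if "area" not in cache:
            cache["area"] = side ** 2
        return cache["area"]

    return (name, area)


class InterfacePointer:
    """What the client holds: only a reference to the function table."""

    def __init__(self, vtable):
        self._vtable = vtable

    def call(self, slot, *args):
        # Dispatch by position, much as a C++ virtual call indexes the vtable.
        return self._vtable[slot](*args)


def client(iface):
    """Client code knows slot positions only, never the implementation."""
    return iface.call(SLOT_NAME), iface.call(SLOT_AREA)


result_v1 = client(InterfacePointer(make_square_v1(3)))
result_v2 = client(InterfacePointer(make_square_v2(3)))
```

Because `client` depends only on slot positions, the second implementation can replace the first without any change to the client, which is the replaceability property the text attributes to COM binaries.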
By taking the path of least resistance and adopting the C++ vtable model of binary integration,
COM achieved both replaceable binaries and the efficiency of in-process method invocation,
equivalent to a C++ virtual function call. These benefits naturally come at some cost. Because
an interface is presented as a pointer to a single vtable, interfaces cannot be defined using multiple
inheritance, which would imply multiple vtables and therefore multiple pointers per interface. In
addition, because there is no intermediary service to dispatch function calls, any programming
language besides C++ must go through some contortions to work with COM.

In order to address the rising need for distribution of objects across multiple hosts (i.e. multiple
physical address spaces), Microsoft developed DCOM as an extension to COM. As an
extension rather than a separate architecture, DCOM inserts a stub interface between the calling
application and the actual implementation of that interface. In this manner the architecture
strongly resembles an RPC-based model, although the implementation is still based on a binary
integration scheme, rather than a more abstract model.
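The role of a stub placed between the caller and the real implementation can be sketched as below. This is a generic proxy/skeleton illustration rather than DCOM itself: the JSON marshaling is an assumption made for brevity and is not how DCOM encodes calls, a direct function call stands in for the network, and `Calculator`, `Stub` and `Skeleton` are hypothetical names.

```python
import json


class Calculator:
    """The actual implementation, living in the 'server' address space."""

    def add(self, a, b):
        return a + b


class Skeleton:
    """Server side: unmarshals a request and dispatches it to the real object."""

    def __init__(self, target):
        self._target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class Stub:
    """Client side: looks like the object, but marshals every method call."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. remote methods.
        def remote_call(*args):
            request = json.dumps({"method": name, "args": list(args)})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]

        return remote_call


skeleton = Skeleton(Calculator())
calc = Stub(transport=skeleton.handle)  # an in-process call stands in for the network
total = calc.add(2, 3)
```

The caller writes `calc.add(2, 3)` exactly as it would against a local object; only the stub knows that the call crosses an address-space boundary.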
COM+ is the recently announced successor to COM-based models incorporating a new
generation of technology. COM+ extends COM with multiple inheritance, a new runtime, and
language extensions that will make it easier to build COM objects in a variety of programming
languages. As of this writing, a COM+ specification has not been published, making it difficult
to assess its architecture and any tradeoffs that may have been (or will be) made to facilitate its
implementation.
CORBA - Common Object Request Broker Architecture
CORBA is a standard maintained by the Object Management Group (OMG) for the distribution
of objects across heterogeneous networks. Designed as a platform-neutral infrastructure for
inter-object communication, it has gained widespread acceptance.
The OMG is a consortium of more than 760 companies formed to create a standard architecture
for distributed object computing. The goal of the OMG² is to combine object and distributed
systems technologies into an interoperability architecture that supports integration of existing
and future computing platforms. The result of that effort is the Object Management Architecture
(OMA); CORBA specifies the Object Request Broker (ORB) underlying the OMA. The ORB
provides the base architecture [CORBA 97] as well as a number of services [COSS 97] such as
security, transactions and messaging.
CORBA allows applications to use a common interface, defined in an Interface Definition
Language (IDL), across multiple platforms and development tools. OMG IDL is designed to be
platform and language-neutral; data and call format conversions are handled transparently by the
ORB. All interfaces to CORBA objects, and the data types used in those interfaces, are
specified in IDL. This common definition allows applications to operate on objects without
concern for the manner in which the object is implemented.
As viewed by the client, a CORBA object is entirely opaque, in that the object's implementation
and location are unknown to the application using it. Generally, a CORBA client will know only
how to find or create the objects it needs through interfaces to well known objects such as query
mechanisms and factories. It is likely that the client will not know where or how even these well
known objects are implemented, but will instead be able to locate them by name only through the
CORBA Naming Service [COSS 97].
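The pattern of locating a well known object by name and then using it as a factory can be sketched as follows. This is a plain-Python analogy, not the CORBA Naming Service API: the `NamingService` class, the name "Bank/AccountFactory", and the account classes are all hypothetical, and everything runs in one process.

```python
class NamingService:
    """Maps human-readable names to object references."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, obj_ref):
        if name in self._bindings:
            raise KeyError("name already bound: " + name)
        self._bindings[name] = obj_ref

    def resolve(self, name):
        return self._bindings[name]


class _InMemoryAccount:
    """One possible implementation; the client never names this class."""

    def __init__(self, owner):
        self._owner = owner
        self._balance = 0

    def deposit(self, amount):
        self._balance += amount
        return self._balance


class AccountFactory:
    """A 'well known object' whose only job is to create other objects."""

    def create(self, owner):
        return _InMemoryAccount(owner)


# Server side: publish the factory under an agreed name.
naming = NamingService()
naming.bind("Bank/AccountFactory", AccountFactory())

# Client side: knows only the name and the public interface.
factory = naming.resolve("Bank/AccountFactory")
account = factory.create("alice")
balance = account.deposit(25)
```

The client code depends on a name and on the methods `create` and `deposit`, never on where or how the objects are implemented, which is the opacity the text describes.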
In the case of any of these CORBA objects the client knows only what the object's public
interface is and can therefore access the object's functionality. CORBA also provides some
capabilities for runtime object interface identification and invocation through its Interface
Repository (IR) and Dynamic Invocation Interface (DII). While these have the potential to allow
(almost) complete runtime configuration of access to CORBA objects, in practice there may be
very few cases where such capabilities are actually workable due to semantic issues.

² Microsoft participates, but does not submit technology to the OMG. Instead, Microsoft promotes its own
Windows-centric Distributed Component Object Model (DCOM).

Conclusions
Many of the above concepts are shared across distribution architectures, and many architectures
are built upon each other.
Some technologies, however old, still offer compelling reasons to use them. Certainly, no
mainframe or UNIX system administrator would be willing to give up text-mode terminal
interfaces for remote administration over dial-up connections. Similarly, implementing high-
volume or low-overhead communications is frequently best done using low-level operating
system and network interfaces.
However, where ease and cost of deployment are larger factors, standardized architectures are
generally a good choice. In this case, the architectures can be divided between asynchronous and
synchronous, with these most commonly being further divided between procedural and object-
oriented methods.
For asynchronous communication, message-based architectures will frequently be most
appropriate. Even where assured delivery³ is required, full store-and-forward reliable messaging
may not be necessary, especially when the general reliability of current computing
platforms is considered.
Where synchronous, procedural programming is used, DCE RPC will generally be a good choice.
This proven, widely available framework offers a C language interface that will be reasonably
easy to use in C and C++ applications. For systems written in other languages or where DCE
RPC support is incomplete, synchronous messaging may be an appropriate solution.
Last, but not least, comes the subject of object-oriented distribution. Certainly, of the three
frameworks discussed, CORBA provides the greatest flexibility with its language and platform
neutrality. There are, of course, some costs associated with this neutrality, both in deployment
and runtime overhead. These costs may be reduced by design trade-offs. Of course, a less
flexible alternative may become much more costly if it doesn't support some future requirement.
One of the biggest factors in favor of Microsoft's COM/DCOM solutions is their installed base:
virtually every PC running Windows has some level of COM support built in. Most 32-bit
versions of Windows have DCOM support as well, providing a compelling argument for its use
in Windows-only environments. Nearly all Windows development tools also provide fairly
easy-to-use wrappers for integrating COM components into the applications they generate.
The case for using Java RMI is simply that it's so easy to do. Because it only supports Java
objects it fits directly into the Java model with minimal impact on development resources.
However, since CORBA support is being built into the Java environment, RMI doesn't have the
advantage of being assured a larger installed base. Regardless, for pure Java distributed systems,
the ease with which RMI-based systems can be deployed is compelling.

³ Of course, without suitable precautions against component failures, even the most reliable architectures may be
exposed to data loss or corruption.

Proponents of all of these architectures can be found easily and in great numbers, and it is
impossible to state that one alone is best for all distributed systems. It should also be noted that
while many of the most widely used implementations of these architectures have been addressed
here, there are many more which have not. In particular, reliable messaging is a rapidly growing
field.
An extensive comparison of COM and CORBA can be found in [Quoin 98]. Comparisons of,
and documentation for, all of the mentioned architectures can be found in a number of locations
on the World Wide Web, as well.
Distributed computing is becoming more prevalent every day. The architectures discussed here,
and many others, are able to solve a range of problems that could not even be considered a few
years ago. We recommend careful identification of present and future needs, as well as current
competencies, before deciding to deploy any of them.

References
[CLI 96] "Information Technology - Database Languages - SQL - Call-Level Interface", ANSI/ISO/IEC 9075-3-1996 (ITI/NCITS October 1996)
[COM 95] "The Component Object Model Specification" (Microsoft Corporation, Digital Equipment Corporation, October 1995)
[CORBA 97] "The Common Object Request Broker: Architecture and Specification, Version 2.1" (Object Management Group, et al, August 1997)
[COSS 97] "CORBAservices: Common Object Services Specification" (Object Management Group, et al, July 1997)
[DCE 96] R. Salz; "DCE 1.2 Contents Overview", Open Group RFC 63.3 (The Open Group, October 1996)
[DCOM 98] N. Brown, C. Kindel; "Distributed Component Object Model Protocol" (Microsoft Corporation, January 1998)
[JDBC 98] G. Hamilton, R. Cattell; "JDBC™: A Java SQL" (Sun Microsystems, Inc., February 1998)
[ODBC 96] "ODBC 3.0 Programmer's Reference" (Microsoft Corporation, October 1996)
[RMI 97] "Java Remote Method Invocation" (Sun Microsystems, Inc., December 1997)
[SQL 92] "Information Technology - Database Languages - SQL", ISO/IEC 9075:1992 (ISO, November 1992), ANSI X3.135-1992 (ANSI, October 1992)
[Quoin 98] O. Tallman, J. Kain; "COM versus CORBA: A Decision Framework" (Quoin Inc., January 1998)

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 24, NO. XX, XXXXX 1998

Understanding Code Mobility
Alfonso Fuggetta, Gian Pietro Picco, and Giovanni Vigna
Abstract: The technologies, architectures, and methodologies traditionally used to develop distributed applications exhibit a variety of limitations and drawbacks when applied to large scale distributed settings (e.g., the Internet). In particular, they fail in providing the desired degree of configurability, scalability, and customizability. To address these issues, researchers are investigating a variety of innovative approaches. The most promising and intriguing ones are those based on the ability of moving code across the nodes of a network, exploiting the notion of mobile code.
As an emerging research field, code mobility is generating a growing body of scientific literature and industrial developments. Nevertheless, the field is still characterized by the lack of a sound and comprehensive body of concepts and terms. As a consequence, it is rather difficult to understand, assess, and compare the existing approaches. In turn, this limits our ability to fully exploit them in practice, and to further promote the research work on mobile code. Indeed, a significant symptom of this situation is the lack of a commonly accepted and sound definition of the term "mobile code" itself.
This paper presents a conceptual framework for understanding code mobility. The framework is centered around a classification that introduces three dimensions: technologies, design paradigms, and applications. The contribution of the paper is twofold. First, it provides a set of terms and concepts to understand and compare the approaches based on the notion of mobile code. Second, it introduces criteria and guidelines that support the developer in the identification of the classes of applications that can leverage off of mobile code, in the design of these applications, and, finally, in the selection of the most appropriate implementation technologies. The presentation of the classification is intertwined with a review of the state of the art in the field. Finally, the use of the classification is exemplified in a case study.
Keywords: Mobile code, mobile agent, distributed appli[...]

A. Fuggetta and G. Vigna are with Dipartimento di Elettronica e Informazione, Politecnico di Milano, P.za Leonardo da Vinci, 32, I-20133, Italy. E-mail: {fuggetta,vigna}@elet.polimi.it.
G.P. Picco is with Dipartimento di Automatica e Informatica, Politecnico di Torino, C.so Duca degli Abruzzi 24, I-10129, Italy. E-mail: picco@polito.it.

I. Introduction
Computer networks are evolving at a fast pace, and this evolution proceeds along several lines. The size of networks is increasing rapidly, and this phenomenon is not confined just to the Internet, whose tremendous growth rate is well-known. Intra- and inter-organization networks experience an increasing diffusion and growth as well, fostered by the availability of cheap hardware and motivated by the need for uniform, open, and effective information channels inside and across the organizations. A side effect of this growth is the significant increase of the network traffic, which in turn triggers research and industrial efforts to enhance the performance of the communication infrastructure. Network links are constantly improved, and technological developments lead to increased computational power on both intermediate and end network nodes.
The increase in size and performance of computer networks is both the cause and the effect of an important phenomenon: networks are becoming pervasive and ubiquitous. By pervasive, we mean that network connectivity is no longer an expensive add-on. Rather it is a basic feature of any computing facility, and, in perspective, also of many products in the consumer electronics market (e.g., televisions). By ubiquitous, we refer to the ability of exploiting network connectivity independently of the physical location of the user. Developments in wireless technology free network nodes from the constraint of being placed at a fixed physical location and enable the advent of so-called mobile computing. In this new scenario, mobile users can move together with their hosts across different physical locations and geographical regions, still being connected to the net through wireless links.
Another important phenomenon is the increasing availability of easy-to-use technologies accessible also to naive users (e.g., the World Wide Web). These technologies have triggered the creation of new application domains and even new markets. This is changing the nature and role of networks, and particularly of the Internet. They cannot be considered just plain communication technologies. Nowadays, modern computer networks constitute innovative media that support new forms of cooperation and communication among users. Terms like "electronic commerce", or "Internet phone" are symptomatic of this change.
However, this evolution path is not free of obstacles and [...]ing size of networks raises a problem of scalability. Most results that are significant for small networks are often inapplicable when scaled to a world-wide network like the Internet. For instance, while it might be conceivable to apply a global snapshot algorithm to a LAN, its performance is unacceptable in an Internet setting. Wireless connectivity poses even tougher problems [1], [2]. Network nodes may move and be connected discontinuously, hence the topology of the network is no longer defined statically. As a consequence, some of the basic tenets of research on distributed systems are undermined, and we need to adapt and extend existing theoretical and technological results to this new scenario. Another relevant issue is the diffusion of network services and applications to very large segments of our society. This makes it necessary to increase the customizability of services, so that different classes of users are enabled to tailor the functionality and interface of a service according to their specific needs and preferences. Finally, the dynamic nature of both the underlying communication infrastructure and the market requirements demand increased flexibility and extensibility.
There have been many attempts to provide effective answers to this multifaceted problem. Most of the proposed approaches, however, try to adapt well-established models and technologies within the new setting, and usually take for granted the traditional client-server architecture. For example, CORBA [3] integrates remote procedure calls (RPCs) with the object-oriented paradigm. It attempts to combine the benefits of the latter in terms of modularity and reuse, with the well-established communication mechanism of the former. However, this approach does not ensure the degree of flexibility, customizability, and reconfigurability needed to cope with the challenging requirements discussed so far.
A different approach originates in the promising research area exploiting the notion of mobile code. Code mobility can be defined informally as the capability to dynamically change the bindings between code fragments and the location where they are executed [4]. The ability to relocate code is a powerful concept that originated a very interesting range of developments. However, despite the widespread interest in mobile code technology and applications, the field is still quite immature. A sound terminological and methodological framework is still missing, and there is not even a commonly agreed term to qualify the subject of this research¹. In addition, the interest demonstrated by markets and media, due to the fact that mobile code research is tightly bound to the Internet, has added an extra level of noise, by introducing hypes and sometimes unjustified expectations. In the next section we present the main differences between mobile code and other related approaches, and the motivations and main contributions of this paper.

¹ Hereafter, we use interchangeably the terms code mobility and mobile code, although other authors prefer different terms such as mobile computations, mobile object systems, or program mobility.

II. Motivations and Approach
Code mobility is not a new concept. In the recent past, several mechanisms and facilities have been designed and implemented to move code among the nodes of a network. Examples are remote batch job submission [5] and the use of PostScript [6] to control printers. The research work on distributed operating systems has followed a more structured approach. In that research area, the main problem is to support the migration of active processes and objects (along with their state and associated code) at the operating system level [7]. In particular, process migration concerns the transfer of an operating system process from the machine where it is running to a different one. Migration mechanisms handle the bindings between the process and its execution environment (e.g., open file descriptors and environment variables) to allow the process to seamlessly resume its execution in the remote environment. Process migration facilities have been introduced at the operating system level to achieve load balancing across network nodes. Therefore, most of these facilities provide transparent process migration: the programmer has neither control nor visibility of the migration process. Other systems provide some form of control over the migration process. For example, in Locus [8] process migration can be triggered either by an external signal or by the explicit invocation of the migrate system call. Object migration makes it possible to move objects among address spaces, implementing a finer grained mobility with respect to process-level migration. For example, Emerald [9] provides object migration at any level of granularity ranging from small, atomic data to complex objects. Emerald does not provide complete transparency since the programmer can determine objects locations and may request explicitly the migration of an object to a particular node. An example of system providing transparent migration is COOL [10], an object-oriented extension of the Chorus operating system [11]. COOL is able to move objects among address spaces without user intervention or knowledge.
Process and object migration address the issues that arise when code and state are moved among the hosts of a loosely coupled, small scale distributed system. However, they are insufficient when applied in larger scale settings. Nevertheless, the migration techniques discussed so far have been taken as a starting point for the development of a new breed of systems providing enhanced forms of code mobility. These systems, often referred to as Mobile Code Systems (MCSs), exhibit several innovations with respect to existing approaches:
Code mobility is exploited on an Internet-scale. Distributed systems providing process or object migration have been designed having in mind small-scale computer networks, thus assuming high bandwidth, small predictable latency, trust, and, often, homogeneity. Conversely, MCSs are conceived to operate in large scale settings where networks are composed of heterogeneous hosts, managed by different authorities with different levels of trust, and connected by links with different bandwidths (e.g., wireless slow connections and fast optical links).
Programming is location aware. Location is a pervasive abstraction that has a strong influence on both the design and the implementation of distributed applications. Mobile code systems do not paper over the location of application components, rather, applications are location-aware and may take actions based on such knowledge.
Mobility is under programmer's control. The programmer is provided with mechanisms and abstractions that enable the shipping and fetching of code fragments (or even entire components) to/from remote nodes. The underlying run-time support provides basic functionalities (e.g., data marshaling, code check-in, and security), but does not have any control over migration policies.
Mobility is not performed just for load balancing. Process and object migration aim at supporting load balancing and performance optimization. Mobile code systems address a much wider range of needs and requirements, such as service customization, dynamic extension of application functionality, autonomy, fault tolerance, and support for disconnected operations.
To cope with this variety of requirements and needs, industrial and academic researchers have proposed a number of MCSs. This lively and sometimes chaotic research activity has generated some confusion about the semantics of
FUGGETTA, PICCO, AND VIGNA: UNDERSTANDING CODE MOBILITY                                                                             3

mobile code concepts and technologies.

A first problem is the unclear distinction between implementation technologies, specific applications, and paradigms used to design these applications. In an early and yet valuable assessment of code mobility [12], the authors analyze and compare issues and concepts that belong to different abstraction levels. Similarly, in a recent work about autonomous objects [13], mechanisms like REV [14] and RPC [15] are compared to the Echo distributed algorithms [16], to applications like "intelligent e-mail" and Web browsers, and to paradigms for structuring distributed applications, like mobile agents. We argue that these different concepts and notions cannot be compared directly. It is as inappropriate and misleading as trying to compare the emacs editor, the fork UNIX system call, and the client-server design paradigm.

There is also confusion about terminology. For instance, several systems [17], [18] claim to be able to move the state of a component along with its code. This assertion is justified by the availability of mechanisms that allow the programmer to pack some portion of the data space of an executing component before the component's code is sent to a remote destination. Indeed, this is quite different from the situation where the run-time image of the component is transferred as a whole, including its execution state (i.e., program counter, call stack, and so on). In the former case, it is the programmer's task to rebuild the execution state of a component after its migration, using the data transferred with the code. Conversely, in the latter case this task is carried out by the run-time support of the MCS. Another terminological confusion stems from the excessive overload of the term "mobile agent". This term is used with different and somewhat overlapping semantics in both the distributed systems and artificial intelligence research communities. In the distributed system community the term "mobile agent" is used to denote a software component that is able to move between different execution environments. This definition actually has different interpretations. For example, while in Telescript [19] an agent is represented by a thread that can migrate among different nodes carrying its execution state, in TACOMA [17] agents are just code fragments associated with initialization data that can be shipped to a remote host. They do not have the ability to migrate once they have started their execution. On the other hand, in the artificial intelligence community the term "agent" denotes a software² component that is able to achieve a goal by performing actions and reacting to events in a dynamic environment [20]. The behavior of this component is determined by the knowledge of the relationships among events, actions, and goals. Moreover, knowledge can be exchanged with other agents, or increased by some inferential activity [21]. Although mobility is not the most characterizing aspect of these entities [22], there is a tendency to blend this notion of intelligent agent with the one originating from distributed systems and thus assume implicitly that a mobile agent is also intelligent (and vice versa). This is actually generating confusion since there is a mix of concepts and notions that belong to two different layers, i.e., the layer providing code mobility and the one exploiting it. Finally, there is no definition or agreement about the distinguishing characteristics of languages supporting code mobility. In [23], Knabe lists the essential characteristics of a mobile code language. They include support for manipulating, transmitting, receiving, and executing "code-containing objects". However, there is no discussion about how to manage the state of mobile components. Other contributions [24], [12] consider only the support for mobility of both code and state, without mentioning weaker forms of code mobility involving code migration alone, as we discuss later on in the paper.

Certainly, confusion and disagreement are typical of a new and still immature research field. Nevertheless, research developments are fostered not only by novel ideas, mechanisms, and systems, but also by a rationalization and conceptualization effort that re-elaborates on the raw ideas, seeking a common and stable ground on which to base further endeavors. Research on code mobility is not an exception. The technical concerns raised by performance and security of MCSs are not the only factors hampering full acceptance and exploitation of mobile code. A conceptual framework is needed to foster understanding of the multifaceted mobile code scenario. It will enable researchers and practitioners to assess and compare different solutions with respect to a common set of reference concepts and abstractions, and go beyond it. To be effective, this conceptual framework should also provide valuable information to application developers, actually guiding the evaluation of opportunities for exploitation of code mobility during the different phases of application development.

These considerations provide the rationale for the classification presented in this paper. The classification introduces abstractions, models, and terms to characterize the different approaches to code mobility proposed so far, highlighting commonalities, differences, and applicability. The classification is organized along three dimensions that are of paramount importance during the actual development process: technologies, design paradigms, and application domains. Mobile code technologies are the languages and systems that provide mechanisms enabling and supporting code mobility. Some of these technologies have already been mentioned and are discussed in greater detail in the next section. Mobile code technologies are used by the application developer in the implementation stage. Design paradigms are the architectural styles that the application designer uses in defining the application architecture. An architectural style identifies a specific configuration for the components of the system and their mutual interactions. Client-server and peer-to-peer are well-known examples of design paradigms. Application domains are classes of applications that share the same general goal, e.g., distributed information retrieval or electronic commerce. They play a role in defining the application requirements. The expected benefits of code mobility in a number of application domains are the motivating force behind this research field.

² In this paper we ignore the implications of broader notions of agent which are not restricted to the software domain.
4                                             IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 24, NO. XX, XXXXX 1998

Our classification will break down in a vertical distinction among these three layers, as well as in a horizontal distinction among the peculiarities of the various approaches found in the literature. Section III presents a general model and a classification of the mechanisms provided by mobile code technologies. The classification is then used to survey and characterize several MCSs. Section IV presents mobile code design paradigms and discusses their relationships with mobile code technologies. Section V discusses the advantages of the mobile code approach and presents some application domains that are supposed to benefit from the use of some form of code mobility. Finally, in Section VI we exemplify the use of the classification by applying it to a case study in the network management application domain.

III. Mobile Code Technologies

Mobile code technologies include programming languages and their corresponding run-time supports. At first glance, these technologies provide quite different concepts and primitives. For this reason, the first part of this section introduces some reference abstractions, and then seeks out and classifies the different mechanisms that allow an application to move code and state across the nodes of a network. We are concerned here only with the issues strictly related to mobility. Other aspects of mobile code technology are indeed relevant, such as security or strategies for translation and execution. On-going work is defining a similar framework for these aspects as well. In the second part of the section (Section III-C), the classification of mobility mechanisms is used to characterize the features provided by several existing MCSs. The classification accommodates several technologies found in the literature. The set of technologies considered is not exhaustive, and is constrained by space and by the focus of the paper. However, the reader may actually verify the soundness of the classification by applying it to other MCSs not considered here, like the ones described in [25], [26], [27]. Also, the reader interested in a more detailed analysis of the linguistic problems posed by the introduction of mobility in programming languages can refer to [28], [29].

A. A Virtual Machine for Code Mobility

Traditional distributed systems can be accommodated in the virtual machine shown on the left-hand side of Figure 1. The lowest layer, just upon the hardware, is constituted by the Core Operating System (COS). The COS can be regarded as the layer providing the basic operating system functionalities, such as file system, memory management, and process support. No support for communication or distribution is provided by this layer. Non-transparent communication services are provided by the Network Operating System (NOS) layer. Applications using NOS services address explicitly the host targeted by communication. For instance, socket services can be regarded as belonging to the NOS layer, since a socket must be opened by specifying explicitly a destination network node. The NOS, at least conceptually, uses the services provided by the COS, e.g., memory management. Network transparency is provided by the True Distributed System (TDS) layer. A TDS implements a platform where components, located at different sites of a network, are perceived as local. Users of TDS services do not need to be aware of the underlying structure of the network. When a service is invoked, there is no clue about the node of the network that will actually provide the service, and even about the presence of a network at all. As an example, CORBA [3] services can be regarded as TDS services since a CORBA programmer is usually unaware of the network topology and always interacts with a single well-known object broker. At least in principle, the TDS is built upon the services provided by the underlying NOS.

Technologies supporting code mobility take a different perspective. The structure of the underlying computer network is not hidden from the programmer, rather it is made manifest. In the right-hand side of Figure 1 the TDS is replaced by Computational Environments (CEs) layered upon the NOS of each network host. In contrast with the TDS, the CE retains the "identity" of the host where it is located. The purpose of the CE is to provide applications with the capability to dynamically relocate their components on different hosts. Hence, it leverages the communication channels managed by the NOS and the low-level resource access provided by the COS to handle the relocation of code, and possibly of state, of the hosted software components.

We distinguish the components hosted by the CE in executing units (EUs) and resources. Executing units represent sequential flows of computation. Typical examples of EUs are single-threaded processes or individual threads of a multi-threaded process. Resources represent entities that can be shared among multiple EUs, such as a file in a file system, an object shared by threads in a multi-threaded object-oriented language, or an operating system variable. Figure 2 illustrates our modeling of EUs as the composition of a code segment, which provides the static description for the behavior of a computation, and a state composed of a data space and an execution state. The data space is the set of references to resources that can be accessed by the EU. As explained later on, these resources are not necessarily co-located with the EU on the same CE. The execution state contains private data that cannot be shared, as well as control information related to the EU state, such as the call stack and the instruction pointer. For example, a Tcl interpreter P_X executing a Tcl script X can be regarded as an EU where the code segment is X; the data space is composed of variables containing the handles for files and references to system environment variables used by P_X; the execution state is composed of the program counter and the call stack maintained by the interpreter, along with the other variables of X.

[Figure 1: layered diagram over six hosts. Left (traditional system): Components, on top of a single True Distributed System spanning the hosts, on top of a Network Operating System, a Core Operating System, and the Hardware of each host. Right (MCS): Components, on top of one Computational Environment per host, on top of a Network Operating System, a Core Operating System, and the Hardware of each host.]

Fig. 1. Traditional systems vs. MCSs. Traditional systems, on the left hand side, may provide a TDS layer that hides the distribution from the programmer. Technologies supporting code mobility, on the right hand side, explicitly represent the location concept, thus the programmer needs to specify where (i.e., in which CE) a computation has to take place.

[Figure 2: a Computational Environment containing an Executing Unit and a Resource. The Executing Unit is made of a code segment, an execution state (stack and instruction pointer), and a data space.]

Fig. 2. The internal structure of an executing unit.

B. Mobility Mechanisms

In conventional systems, each EU is bound to a single CE for its entire lifetime. Moreover, the binding between the EU and its code segment is generally static. Even in environments that support dynamic linking, the code linked belongs to the local CE. This is not true for MCSs. In MCSs, the code segment, the execution state, and the data space of an EU can be relocated to a different CE. In principle, each of these EU constituents might move independently. However, we will limit our discussion to the alternatives adopted by existing systems.

The portion of an EU that needs to be moved is determined by composing orthogonal mechanisms supporting mobility of code and execution state with mechanisms for data space management. For this reason, we will analyze them separately. Figure 3 presents a classification of mobility mechanisms.

B.1 Code and Execution State Mobility

Existing MCSs offer two forms of mobility, characterized by the EU constituents that can be migrated. Strong mobility is the ability of an MCS (called strong MCS) to allow migration of both the code and the execution state of an EU to a different CE. Weak mobility is the ability of an MCS (called weak MCS) to allow code transfer across different CEs; code may be accompanied by some initialization data, but no migration of execution state is involved.

Strong mobility is supported by two mechanisms: migration and remote cloning. The migration mechanism suspends an EU, transmits it to the destination CE, and then resumes it. Migration can be either proactive or reactive. In proactive migration, the time and destination for migration are determined autonomously by the migrating EU. In reactive migration, movement is triggered by a different EU that has some kind of relationship with the EU to be migrated, e.g., an EU acting as a manager of roaming EUs. The remote cloning mechanism creates a copy of an EU at a remote CE. Remote cloning differs from the migration mechanism because the original EU is not detached from its current CE. As in migration, remote cloning can be either proactive or reactive.

Mechanisms supporting weak mobility provide the capability to transfer code across CEs and either link it dynamically to a running EU or use it as the code segment for a new EU. Such mechanisms can be classified according to the direction of code transfer, the nature of the code being moved, the synchronization involved, and the time when code is actually executed at the destination site. As for direction of code transfer, an EU can either fetch the code to be dynamically linked and/or executed, or ship such code to another CE. The code can be migrated either as stand-alone code or as a code fragment. Stand-alone code is self-contained and will be used to instantiate a new EU on the destination site. Conversely, a code fragment must be linked in the context of already running code and eventually executed. Mechanisms supporting weak mobility can be either synchronous or asynchronous, depending on whether the EU requesting the transfer suspends or not until the code is executed. In asynchronous mechanisms, the actual execution of the code transferred may take place either in an immediate or deferred fashion. In the first case, the code is executed as soon as it is received, while in a deferred scheme execution is performed only when a given condition is satisfied, e.g., upon first invocation of a portion of the code fragment or as a consequence of an application event.
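The weak mobility variants described above can be illustrated with a small single-process simulation. This is a minimal sketch and not the API of any MCS discussed in the paper: the names `ComputationalEnvironment`, `receive`, `fire_event`, and `ship` are hypothetical, the "network" is an ordinary method call, and security is ignored. It models code shipping of stand-alone code with initialization data in the synchronous, asynchronous immediate, and asynchronous deferred styles. Because everything runs in one thread, "the sender does not wait" is approximated by simply not returning the result to the sender.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ComputationalEnvironment:
    """Toy CE (hypothetical): receives code as text and runs it in a new EU."""
    name: str
    deferred: List[Tuple[str, Dict[str, Any]]] = field(default_factory=list)
    results: List[Any] = field(default_factory=list)

    def _run(self, source: str, init_data: Dict[str, Any]) -> Any:
        # Stand-alone code instantiates a new EU: a fresh namespace seeded only
        # with the initialization data. No execution state travels with the
        # code, which is what makes this weak (not strong) mobility.
        namespace: Dict[str, Any] = dict(init_data)
        exec(source, namespace)            # the shipped code must define main()
        result = namespace["main"]()
        self.results.append(result)
        return result

    def receive(self, source: str, init_data: Dict[str, Any],
                immediate: bool) -> Optional[Any]:
        if immediate:
            return self._run(source, init_data)      # executed on arrival
        self.deferred.append((source, init_data))    # executed on a later event
        return None

    def fire_event(self) -> List[Any]:
        """Application event (assumed trigger) that runs all deferred code."""
        pending, self.deferred = self.deferred, []
        return [self._run(src, data) for src, data in pending]


def ship(ce: ComputationalEnvironment, source: str,
         init_data: Dict[str, Any], mode: str) -> Optional[Any]:
    """Ship stand-alone code to `ce` using one of three synchronization modes."""
    if mode == "synchronous":
        # The sender is suspended until the code has been executed.
        return ce.receive(source, init_data, immediate=True)
    if mode == "asynchronous-immediate":
        ce.receive(source, init_data, immediate=True)
        return None                        # the sender does not see the result
    if mode == "asynchronous-deferred":
        ce.receive(source, init_data, immediate=False)
        return None
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    ce = ComputationalEnvironment("B")
    code = "def main():\n    return n * 2\n"
    print(ship(ce, code, {"n": 21}, "synchronous"))
    ship(ce, code, {"n": 5}, "asynchronous-deferred")
    print(ce.fire_event())
```

A code fragment, as opposed to stand-alone code, would be linked into the namespace of an already running EU instead of a fresh one; code fetching would reverse the direction, with the destination asking the source for the text.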
Mobility mechanisms
    Code and execution state management
        Strong mobility
            Migration
                Proactive
                Reactive
            Remote cloning
                Proactive
                Reactive
        Weak mobility
            Code shipping
                Stand-alone code
                    Synchronous
                    Asynchronous
                        Immediate
                        Deferred
                Code fragment
                    Synchronous
                    Asynchronous
                        Immediate
                        Deferred
            Code fetching
                Stand-alone code
                    Synchronous
                    Asynchronous
                        Immediate
                        Deferred
                Code fragment
                    Synchronous
                    Asynchronous
                        Immediate
                        Deferred
    Data space management
        Binding removal
        Network reference
        Re-binding
        By copy
        By move

Fig. 3. A classification of mobility mechanisms.
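Since the classification in Fig. 3 is built from independent choices, its leaves can be enumerated mechanically. The sketch below does this under the assumption that the dimensions combine freely, as the figure draws them; the constant names are illustrative only and do not come from the paper.

```python
from itertools import product

# Strong mobility: mechanism x who decides when and where to move.
STRONG = list(product(("migration", "remote cloning"),
                      ("proactive", "reactive")))

# Weak mobility: direction x nature of the code x synchronization/timing.
WEAK = list(product(("code shipping", "code fetching"),
                    ("stand-alone code", "code fragment"),
                    ("synchronous",
                     "asynchronous immediate",
                     "asynchronous deferred")))

# Data space management mechanisms are listed flat in the figure.
DATA_SPACE = ("binding removal", "network reference",
              "re-binding", "by copy", "by move")

if __name__ == "__main__":
    print(len(STRONG), "strong mobility variants")
    print(len(WEAK), "weak mobility variants")
    for variant in WEAK:
        print(" / ".join(variant))
```

Enumerating the combinations this way makes it easy to tag a surveyed system with the subset of variants it supports, for instance as a set of tuples drawn from `STRONG` and `WEAK`.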

B.2 Data Space Management

Upon migration of an EU to a new CE, its data space, i.e., the set of bindings to resources accessible by the EU, must be rearranged. This may involve voiding bindings to resources, re-establishing new bindings, or even migrating some resources to the destination CE along with the EU. The choice depends on the nature of the resources involved, the type of binding to such resources, as well as on the requirements posed by the application.

We model resources as a triple Resource = ⟨I, V, T⟩, where I is a unique identifier, V is the value of the resource, and T is its type, which determines the structure of the information contained in the resource as well as its interface. The type of the resource determines also whether the resource is transferrable or not transferrable, i.e., whether, in principle, it can be migrated over the network or not. For example, a resource of type "stock data" is likely to be transferrable, while a resource of type "printer" probably is not. Transferrable resource instances can be marked as free or fixed. The former can be migrated to another CE, while the latter are associated permanently with a CE. This characteristic is determined according to application requirements. For instance, even if it might be conceivable to transfer a huge file or an entire database over the network, this might be undesirable for performance reasons. Similarly, it might be desirable to prevent transfer of classified resources, even independently of performance considerations.

Resources can be bound to an EU through three forms of binding, which constrain the data space management mechanisms that can be exploited upon migration. The strongest form of binding is by identifier. In this case, the EU requires that, at any moment, it must be bound to a given uniquely identified resource. Binding by identifier is exploited when an EU requires to be bound to a resource that cannot be substituted by some other equivalent resource.

A binding established by value declares that, at any moment, the resource must be compliant with a given type and its value cannot change as a consequence of migration. This kind of binding is usually exploited when an EU is interested in the contents of a resource and wants to be able to access them locally. In this case, the identity of the resource is not relevant, rather the migrated resource must have the same type and value as the one present on the source CE.

The weakest form of binding is by type. In this case, the EU requires that, at any moment, the bound resource is compliant with a given type, no matter what its actual value or identity are. This kind of binding is exploited typically to bind resources that are available on every CE, like system variables, libraries, or network devices. For example, if a roaming EU needs to access the local display of a machine to interact with the user through a graphical interface, it may exploit a binding with a resource of type "display". The actual value and identifier of the resource are not relevant, and the resource actually bound is determined by the current CE. Note that it is possible to have different types of binding to the same resource. In the example above, suppose that the roaming EU, in addition to interacting with the local user through the display, needs to report progress back to the user that "owns" the EU. This is accomplished by creating, at startup, a binding by identifier to the display of the owner and a binding by type to the same resource. As we will explain shortly, after the first migration the bindings will be reconfigured so that the binding by identifier will retain its association with the owner's display, while the binding by type will be associated with the display on the destination CE.

The above discussion highlights two classes of problems that must be addressed by data space management mechanisms upon migration of an EU: resource relocation and binding reconfiguration. The way existing mechanisms tackle these problems is constrained both by the nature of the resources involved and the forms of binding to such resources. These relationships are analyzed hereafter and summarized in Table I.

Let us consider a migrating executing unit U whose data space contains a binding B to a resource R. A first general mechanism, which is independent of the type of binding or resource, is binding removal. In this case, when U migrates, B is simply discarded. If access to bound resources must be preserved, different mechanisms must be exploited.

If U is bound to R by identifier, two data space management mechanisms are suitable to preserve resource identity. The first is relocation by move. In this case, R is transferred, along with U, to the destination CE and the binding is not modified (Figure 4a). Clearly, this mechanism can be exploited only if R is a free transferrable resource. Otherwise, a network reference mechanism must be used. In this case, R is not transferred and once U has reached its target CE, B is modified to reference R in the source CE. Every subsequent attempt of U to access R through B will involve some communication with the source CE (Figure 4b). The creation of inter-CE bindings is often not desirable because it exposes U to network problems (e.g., partitioning or delays) and makes it difficult to manage state consistency since the data space is actually distributed over the network. On the other hand, moving away a resource from its CE may cause problems to other EUs that own bindings to the moved resource. This latter situation may be managed in different ways. A first approach is to apply binding removal, i.e., to void bindings to the resource moved (see top of Figure 4a). Subsequent attempts to access the resource through such bindings will raise an exception. A second approach is to retain the bindings to the resource at its new location by means of network references (see bottom of Figure 4a).

If B is by value and R is transferrable, the most convenient mechanism is data space management by copy because the identity of the resource is not relevant. In this case, a copy R′ of R is created, the binding to R is modified to refer to R′, and then R′ is transferred to the destination CE along with U (see Figure 4c). Management by move satisfies the requirements posed by bindings by value but, in some cases, may be less convenient. In fact, in this case R would be removed from the source CE and other EUs owning bindings to R would have to cope with this event. If R cannot be transferred, the use of the network reference mechanism is the only viable solution, with the drawbacks described previously.

If U is bound to R by type, the most convenient mechanism is re-binding. In this case B is voided and re-established after migration of U to another resource R′ on the target CE having the same type as R (Figure 4d). Re-binding exploits the fact that the only requirement posed by the binding is the type of the resource, and avoids resource transfers or the creation of inter-CE bindings. Clearly, this mechanism requires that, at the destination site, a resource of the same type as R exists. Otherwise, the other mechanisms can be used depending on the type and characteristics of the resource involved.

The existing MCSs exploit different strategies as far as data space management is concerned. The nature of the resource and the type of binding are often determined by the language definition or implementation, rather than by the application programmer, thus constraining the mechanisms exploited. For instance, files are usually considered a fixed unique resource, and migration is usually managed by voiding the corresponding bindings, although files in principle could be migrated along with an EU. Replicated resources are often provided as built-in to provide access to system features in a uniform way across all CEs. The next section will provide more insights about mobility mechanisms in existing MCSs.

C. A Survey of Mobile Code Technologies

Currently available technologies differ in the mechanisms they provide to support mobility. In this section we apply the classification of mobility mechanisms presented so far to a number of existing MCSs.

C.1 Agent Tcl

Developed at Dartmouth College, Agent Tcl [30] provides a Tcl interpreter extended with support for strong mobility. In Agent Tcl, an EU (called agent) is implemented by a Unix process running the language interpreter. Since EUs run in separate address spaces, they can share only resources provided by the underlying operating system, like files. Such resources are considered as not transferrable. The CE abstraction is implemented by the operating system and the language run-time support. In Agent Tcl, EUs can jump to another CE, fork a new EU at a remote CE, or submit some code to a remote CE. In the first case, a proactive migration mechanism enables movement of a whole Tcl interpreter along with its code and execution state. In the second case, a proactive remote cloning mechanism is implemented. In both cases, bindings in the data space of a migrating EU are removed. In the third case, a code shipping mechanism for stand-alone code is exploited to perform remote execution of a Tcl script in a newly created EU at the destination CE. This mechanism is asynchronous and immediate. A copy of the variables belonging to the execution state of the EU invoking the submit may be passed as parameters of this operation in order to migrate these variables together with the Tcl script.

C.2 Ara

Developed at the University of Kaiserslautern, Ara [24] is a multi-language MCS that supports strong mobility. Ara EUs, called agents, are managed by a language-independent system core plus interpreters for the languages supported (at the time of writing C, C++, and Tcl). The core and the interpreters constitute the CE, whose services are made accessible to agents through the place abstraction. Mobility is supported through proactive migration,
8                                                                 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 24, NO. XX, XXXXX 1998

                 Free Transferrable      Fixed Transferrable     Fixed Not Transferrable
By Identifier    By move                 Network reference       Network reference
                 (Network reference)
By Value         By copy                 By copy                 (Network reference)
                 (By move,               (Network reference)
                 Network reference)
By Type          Re-binding              Re-binding              Re-binding
                 (Network reference,     (Network reference,     (Network reference)
                 By copy, By move)       By copy)
TABLE I
Bindings, resources and data space management mechanisms.
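
Table I can be read as a lookup from a pair (binding type, resource kind) to a mechanism. The following is a minimal Python sketch of that reading, assuming that the unparenthesized entry in each cell is the preferred mechanism and the parenthesized entries are admissible alternatives. The names MECHANISMS and choose_mechanism are hypothetical and do not belong to any of the systems surveyed here.

```python
# Illustrative encoding of Table I; every name below is hypothetical.
# Key: (binding type, resource kind) -> (preferred mechanism, alternatives).
MECHANISMS = {
    ("identifier", "free transferrable"): ("by move", ["network reference"]),
    ("identifier", "fixed transferrable"): ("network reference", []),
    ("identifier", "fixed not transferrable"): ("network reference", []),
    ("value", "free transferrable"): ("by copy", ["by move", "network reference"]),
    ("value", "fixed transferrable"): ("by copy", ["network reference"]),
    # The table lists network reference only in parentheses here;
    # it is the sole option, so it is stored as the preferred entry.
    ("value", "fixed not transferrable"): ("network reference", []),
    ("type", "free transferrable"): ("re-binding", ["network reference", "by copy", "by move"]),
    ("type", "fixed transferrable"): ("re-binding", ["network reference", "by copy"]),
    ("type", "fixed not transferrable"): ("re-binding", ["network reference"]),
}


def choose_mechanism(binding, resource, rule_out=()):
    """Return the first admissible mechanism that has not been ruled out."""
    preferred, alternatives = MECHANISMS[(binding, resource)]
    for mechanism in [preferred, *alternatives]:
        if mechanism not in rule_out:
            return mechanism
    raise LookupError("no admissible mechanism left")
```

For example, choose_mechanism("value", "free transferrable") selects management by copy, and ruling out "by copy" falls back to management by move, mirroring the order in which the text discusses the options.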

[Figure 4 panels: (a) By move, (b) Network reference, (c) By copy, (d) Re-binding. Each panel shows the source CE and the destination CE, with resource R (and R' where applicable), before and after migration.]
Fig. 4. Data space management mechanisms. For each mechanism, the configuration of bindings before and after migration of the grayed EU is shown.
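
The before and after configurations of the four mechanisms can be mimicked on a toy model. The sketch below is only an illustration under simplifying assumptions: a CE is a dictionary of named resources, a binding is a pair (hosting CE, resource name), and a binding whose hosting CE differs from the CE of the EU stands in for a network reference. The classes CE and Binding and the four functions are hypothetical names introduced here.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class CE:
    """A computational environment holding named resources."""
    name: str
    resources: dict = field(default_factory=dict)


@dataclass
class Binding:
    """A binding from an EU to the resource `key` hosted in `ce`."""
    ce: CE
    key: str


def by_move(binding, src, dst):
    # The resource leaves the source CE and the binding follows it.
    dst.resources[binding.key] = src.resources.pop(binding.key)
    return Binding(dst, binding.key)


def by_copy(binding, src, dst):
    # A copy R' is created at the destination; the original R stays put.
    dst.resources[binding.key] = copy.deepcopy(src.resources[binding.key])
    return Binding(dst, binding.key)


def network_reference(binding, src, dst):
    # Nothing moves; the migrated EU keeps a now-remote reference.
    return Binding(src, binding.key)


def re_binding(binding, src, dst):
    # The old binding is voided and re-established to a resource
    # of the same type that already exists at the destination.
    wanted = type(src.resources[binding.key])
    for key, value in dst.resources.items():
        if type(value) is wanted:
            return Binding(dst, key)
    raise LookupError("no resource of the required type at the destination")
```

Each function returns the binding the EU holds after migrating from src to dst, so the effect on the source CE (resource removed, kept, or untouched) can be inspected directly.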

and data space management is simplified by the fact that agents cannot share anything but system resources, whose bindings are always removed upon migration.

C.3 Facile
Developed at ECRC in Munich, Facile [31] is a functional language that extends the Standard ML language with primitives for distribution, concurrency, and communication. The language has been extended further in [23] to support weak mobility. Executing units are implemented as threads that run in Facile CEs, called nodes. The channel abstraction is used for communication between threads. Channels can be used to communicate any legal value of the Facile language. In particular, functions may be transmitted through channels since they are first-class language elements. Communication follows the rendez-vous model: both the sender and the receiver are blocked until communication takes place. For this reason, mobility mechanisms can be regarded as supporting both code shipping and code fetching, depending on whether an EU is a sender or a receiver. In addition, the programmer can specify whether the function transmitted has to be considered as stand-alone code or as a code fragment. When the function has been transferred, the communication channel is closed, and the receiver EU is free to evaluate the function received or [...]chronous and supports both immediate and deferred execution. As for data space management, this always takes place by copy, except for special variables called ubiquitous values. They represent resources replicated in each Facile node and are always accessed with bindings by type, exploiting a re-binding mechanism.

C.4 Java
Developed by Sun Microsystems, Java [32] has triggered most of the attention and expectations on code mobility. The original goal of the language designers was to provide a portable, clean, easy-to-learn, and general-purpose object-oriented language, which has been subsequently re-targeted by the growth of the Internet. The Java compiler translates Java source programs into an intermediate, platform-independent language called Java Byte Code. The byte code is interpreted by the Java Virtual Machine (JVM), the CE implementation. Java provides a programmable mechanism, the class loader, to retrieve and link dynamically classes in a running JVM. The class loader is invoked by the JVM run-time when the code currently in execution contains an unresolved class name. The class loader actually retrieves the corresponding class, possibly from a remote host, and then loads the class in the JVM. At this point, the corresponding code is executed. In
FUGGETTA, PICCO, AND VIGNA: UNDERSTANDING CODE MOBILITY                                                                     9

explicitly by the application, independent of the need to execute the class code. Therefore Java supports weak mobility using mechanisms for fetching code fragments. Such mechanisms are asynchronous and support both immediate and deferred execution. In both cases, the code loaded is always executed from scratch and has neither execution state nor bindings to resources at the remote host, so no data space management is needed.

One of the key success factors of Java is its integration with World Wide Web technology. Web browsers have been extended to include a JVM. Java classes called applets can be downloaded along with HTML pages to allow for active presentation of information and interactive access to a server. From the viewpoint we took in our classification, we regard this as a particular application of mobile code technology. However, it can be argued also that the combination of a Web browser and a JVM is so frequent that it can be regarded as a technology per se, conceived explicitly for the development of Web applications. From this perspective, the presence of a JVM is hidden and its mechanisms are used to provide a higher-level layer where browsers constitute the CEs and applets are EUs executing concurrently within them. In this context, the downloading of applets can be regarded as a mechanism provided by the browser to support fetching of stand-alone code.

C.5 Java Aglets
The Java Aglets API (J-AAPI) [33], developed by IBM Tokyo Research Laboratory in Japan, extends Java with support for weak mobility. Aglets [34], the EUs, are threads in a Java interpreter which constitutes the CE. The API provides the notion of context as an abstraction of the CE. The context of an aglet provides a set of basic services, e.g., retrieval of the list of aglets currently contained in that context or creation of new aglets within the context. Java Aglets provides two migration primitives. The dispatch primitive performs code shipping of stand-alone code (the code segment of the aglet) to the context specified as parameter. The mechanism is asynchronous and immediate. The symmetrical primitive retract performs code fetching of stand-alone code, and is used to force an aglet to come back to the context where retract is executed, with a synchronous, immediate mechanism. In both cases, the aglet is re-executed from scratch after migration, although it retains the value of its object attributes which are used to provide an initial state for its computation. The attribute values may contain references to resources, which are always managed by copy. Finally, being based on Java, the Aglets API supports Java mechanisms as well.

C.6 M0
Implemented at the University of Geneva, M0 [35] is a stack-based interpreted language that implements the concept of messengers. Messengers (representing EUs) are sequences of instructions that are transmitted among platforms (representing CEs) and executed unconditionally upon receipt. Messengers [36], in turn, can submit the code of other messengers to remote platforms. Resources are always considered transferrable and fixed, and the submitting messenger may copy them in the message containing the submitted code to make them available at the destination CE. Therefore, M0 is a weak MCS providing shipping of stand-alone code (whose execution is asynchronous and immediate), and data space management is by copy.

C.7 Mole
Developed at the University of Stuttgart, Mole [37], [38] is a Java API that supports weak mobility. Mole agents are Java objects which run as threads of the JVM, which is abstracted into a place, the Mole CE. A place provides access to the underlying operating system through service agents which, unlike user agents, are always stationary. Shipping of stand-alone code is provided with an asynchronous, immediate mechanism. The code and data to be sent are determined automatically upon migration using the notion of island [39]. An island is the transitive closure over all the objects referenced by the main agent object. Islands, which are generated automatically starting from the main agent object, cannot have object references to the outside; inter-agent references are symbolic and become void upon migration. Hence, data space management by move is exploited.

C.8 Obliq
Developed at DEC, Obliq [40] is an untyped, object-based, lexically scoped, interpreted language. Obliq allows for remote execution of procedures by means of execution engines which implement the CE concept. A thread, the Obliq EU, can request the execution of a procedure on a remote execution engine. The code for such a procedure is sent to the destination engine and executed there by a newly created EU. The sending EU suspends until the execution of the procedure terminates. Thus, Obliq supports weak mobility using a mechanism for synchronous shipping of stand-alone code. Obliq objects are transferrable fixed resources, i.e., they are bound for their whole lifetime to the CE where they are created even if in principle they could be moved across CEs. When an EU requests the execution of a procedure on a remote CE, the references to the local objects used by the procedure are automatically translated into network references.

C.9 Safe-Tcl
Initially developed by the authors of the Internet MIME standard, Safe-Tcl [41] is an extension of Tcl [42] conceived to support active e-mail. In active e-mail, messages may include code to be executed when the recipient receives or reads the message. Hence, in Safe-Tcl there are no mobility or communication mechanisms at the language level; they must be achieved using some external support, like e-mail. Rather, mechanisms are provided to protect the recipient's CE, which is realized following a twin interpreter scheme. The twin interpreter consists of a trusted interpreter, which is a full-fledged Tcl interpreter, and an untrusted interpreter, whose capabilities have been restricted severely, so

that one can execute code of uncertain origin without being damaged. The owner of the interpreter may decide to export procedures which are guaranteed to be safe from the trusted interpreter to the untrusted one. Presently, most of the fundamental features of Safe-Tcl have been included in the latest release of Tcl/Tk, and a plug-in for the Netscape browser has been developed, allowing Safe-Tcl scripts to be included in HTML pages [43], much like Java applets.

C.10 Sumatra
Sumatra [44], developed at the University of Maryland, is a Java extension designed expressly to support the implementation of resource-aware mobile programs, i.e., programs which are able to adapt to resource changes by exploiting mobility. Sumatra provides support for strong mobility of Java threads, which are Sumatra EUs. Threads are executed within execution engines, i.e., dynamically created interpreters which extend the abstract machine provided by the JVM with methods that embody proactive migration mechanisms, proactive remote cloning, and shipping of stand-alone code with synchronous, immediate execution. Threads or stand-alone code can be migrated separately from the objects they need. The object-group abstraction is provided to represent dynamically created object aggregates that determine the unit of mobility as well as the unit of persistency. Objects belonging to a group must be explicitly checked in and out, and thread objects cannot be checked in an object-group. The rationale for the absence of an automatic mechanism is to give the programmer the ability to modify dynamically the granularity of the unit of mobility. Data space management in an object-group is always by move; bindings to migrated objects owned by EUs in the source CE are transformed into network references.

C.11 TACOMA
In TACOMA [17] (Tromsø And COrnell Mobile Agents), the Tcl language is extended to include primitives that support weak mobility. Executing units, called agents, are implemented as Unix processes running the Tcl interpreter. The functionality of the CE is implemented by the Unix operating system plus a dedicated run-time supporting agent check-in and check-out. Code shipping of stand-alone code is supported by mechanisms providing both synchronous and asynchronous immediate execution. Initialization data for the new EU are encapsulated in a data structure called briefcase, while resources in the CE are contained in stationary data structures called cabinets. Upon migration, data space management by copy can be exploited to provide the new EU with a resource present within the source CE cabinet. In version 1.2, the system has been extended to support a number of interpreted languages, namely Python, Scheme, Perl, and C.

C.12 Telescript
Developed by General Magic, Telescript [19] is an object-oriented language conceived for the development of large distributed applications. Security has been one of the driving factors in the language design, together with a focus on strong mobility. Telescript employs an intermediate, portable language called Low Telescript, which is the representation actually transmitted among engines, the Telescript CEs. Engines are in charge of executing agents and places, which are the Telescript EUs. Agents can move by using the go operation, which implements a proactive migration mechanism. A send operation is also available which implements proactive remote cloning. Places are stationary EUs that can contain other EUs. Data space management is ruled by the ownership concept which associates each resource with an owner EU. Upon migration, this information is used to determine automatically the set of objects that must be carried along with the EU. Data space management always exploits management by move for the migrating EU. Bindings to migrated resources owned by other EUs in the source site are always removed.

IV. Design Paradigms
Mobile code technologies are only one of the ingredients needed to build a software system. Software development is a complex process where a variety of factors must be taken into account: technology, organization, and methodology. In particular, a very critical issue is the relationship between technology and methodology. This relationship is often ignored or misinterpreted. Quite often, researchers and practitioners tend to believe that a technology inherently induces a methodology. Thus "it is sufficient to build good development tools and efficient languages". This is particularly evident in a critical phase of software development: software design. The goal of design is the creation of a software architecture, which can be defined as the decomposition of a software system in terms of software components and interactions among them [45]. Software architectures with similar characteristics can be represented by architectural styles [46] or design paradigms, which define architectural abstractions and reference structures that may be instantiated into actual software architectures. A design paradigm is not necessarily induced by the technology used to develop the software system; it is a conceptually separate entity. This distinction is not merely philosophical: the evolution of programming languages has clearly emphasized the issue. It is even possible for a modular system to be built using an assembly language, and at the same time, the adoption of sophisticated languages such as Modula-2 does not guarantee per se that the developed system will be really modular. Certainly, specific features of a language can be particularly well-suited to guarantee some program property, but a "good" program is not just the direct consequence of selecting a "good" language.

Traditional approaches to software design are not sufficient when designing large scale distributed applications that exploit code mobility and dynamic reconfiguration of software components. In these cases, the concepts of location, distribution of components among locations, and migration of components to different locations need to be taken explicitly into account during the design stage. As stated in [47], interaction among components residing on the same host is remarkably different from the case where

components reside on different hosts of a computer network in terms of latency, access to memory, partial failure, and concurrency. Trying to paper over differences between local and remote interactions can lead to unexpected performance and reliability problems after the implementation phase.

It is therefore important to identify reasonable design paradigms for distributed systems exploiting code mobility (see footnote 3), and to discuss their relationships with the technology that can be used to implement them. It is also important to notice that each of the languages mentioned in the previous section embodies mechanisms that can be used to implement one or more design paradigms. On the other hand, the paradigms themselves are independent of a particular technology, and could even be implemented without using mobile technology at all, as described in the case study presented in [49].

A. Basic Concepts
Before introducing design paradigms we present some basic concepts that are an abstraction of the entities that constitute a software system, such as files, variable values, executable code, or processes. In particular, we introduce three architectural concepts: components, interactions, and sites.

Components are the constituents of a software architecture. They can be further divided into code components, that encapsulate the know-how to perform a particular computation, resource components, that represent data or devices used during the computation, and computational components, that are active executors capable of carrying out a computation, as specified by a corresponding know-how. Interactions are events that involve two or more components, e.g., a message exchanged between two computational components. Sites host components and support the execution of computational components. A site represents the intuitive notion of location. Interactions among components residing at the same site are considered less expensive than interactions taking place among components located in different sites. In addition, a computation can be actually carried out only when the know-how describing the computation, the resources used during the computation, and the computational component responsible for execution are located at the same site.

Design paradigms are described in terms of interaction patterns that define the relocation of and coordination among the components needed to perform a service. We will consider a scenario where a computational component A, located at site S_A, needs the results of a service. We assume the existence of another site S_B, which will be involved in the accomplishment of the service.

We identify three main design paradigms exploiting code mobility: remote evaluation, code on demand, and mobile agent. These paradigms are characterized by the location of components before and after the execution of the service, by the computational component which is responsible for execution of code, and by the location where the computation of the service actually takes place (see Table II).

The presentation of the paradigms is based on a metaphor where two friends, Louise and Christine, interact and cooperate to make a chocolate cake. In order to make the cake (the results of a service), a recipe is needed (the know-how about the service), as well as the ingredients (movable resources), an oven to bake the cake (a resource that can hardly be moved), and a person to mix the ingredients following the recipe (a computational component responsible for the execution of the code). To prepare the cake (to execute the service) all these elements must be co-located in the same home (site). In the following, Louise will play the role of component A, i.e., she is the initiator of the interaction and the one interested in its final effects.

A.1 Client-Server (CS)
Louise would like to have a chocolate cake, but she doesn't know the recipe, and she does not have at home either the required ingredients or an oven. Fortunately, she knows that her friend Christine knows how to make a chocolate cake, and that she has a well supplied kitchen at her place. Since Christine is usually quite happy to prepare cakes on request, Louise phones her asking: "Can you make me a chocolate cake, please?". Christine makes the chocolate cake and delivers it back to Louise.

The client-server paradigm is well-known and widely used. In this paradigm, a computational component B (the server) offering a set of services is placed at site S_B. Resources and know-how needed for service execution are hosted by site S_B as well. The client component A, located at S_A, requests the execution of a service with an interaction with the server component B. As a response, B performs the requested service by executing the corresponding know-how and accessing the involved resources co-located with B. In general, the service produces some sort of result that will be delivered back to the client with an additional interaction.

A.2 Remote Evaluation (REV)
Louise wants to prepare a chocolate cake. She knows the recipe but she has at home neither the required ingredients nor an oven. Her friend Christine has both at her place, yet she doesn't know how to make a chocolate cake. Louise knows that Christine is happy to try new recipes, therefore she phones Christine asking: "Can you make me a chocolate cake? Here is the recipe: take three eggs...". Christine prepares the chocolate cake following Louise's recipe and delivers it back to her.

In the REV paradigm (see footnote 4), a component A has the know-how necessary to perform the service but it lacks the resources required, which happen to be located at a remote site S_B. Consequently, A sends the service know-how to a computational component B located at the remote site. B, in turn,

Footnote 3. The reader interested in the original formulation of the paradigms described here is directed to [4]. A case study centered around a formalization of these paradigms using the UNITY notation is also provided in [48].

Footnote 4. Hereafter, by "remote evaluation" we will refer to the design paradigm presented in this section. Although it has been inspired by work on the REV system [14], they have to be kept definitely distinct. Our REV is a design paradigm, while the REV system is a technology that may be used to actually implement an application designed using the REV paradigm.

                     Before                          After
                     S_A          S_B                S_A                  S_B
Client-Server        A            know-how           A                    know-how
                                  resource                                resource
                                  B                                       B *
Remote Evaluation    know-how     resource           A                    know-how (moved)
                     A            B                                       resource
                                                                          B *
Code on Demand       resource     know-how           resource             B
                     A            B                  know-how (moved)
                                                     A *
Mobile Agent         know-how     resource           -                    know-how (moved)
                     A                                                    resource
                                                                          A * (moved)
TABLE II
Mobile code paradigms. This table shows the location of the components before and after the service execution. For each paradigm, the computational component marked with * is the one that executes the code. Components marked (moved) are those that have been moved.
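
The rows of Table II can also be replayed as a small state transition. The Python sketch below is a toy model rather than a description of any real system: each site is a set of component names, a paradigm moves components between sites, and the service is taken to run at the site where know-how, resource, and a computational component are co-located. The function name run_paradigm and the string labels are hypothetical.

```python
def run_paradigm(paradigm):
    """Relocate components as in Table II; return (executor, site, sites)."""
    # Initial placement: the "before" columns of Table II.
    before = {
        "CS": {"SA": {"A"}, "SB": {"know-how", "resource", "B"}},
        "REV": {"SA": {"know-how", "A"}, "SB": {"resource", "B"}},
        "COD": {"SA": {"resource", "A"}, "SB": {"know-how", "B"}},
        "MA": {"SA": {"know-how", "A"}, "SB": {"resource"}},
    }
    sites = {name: set(items) for name, items in before[paradigm].items()}

    def move(item, src, dst):
        sites[src].remove(item)
        sites[dst].add(item)

    if paradigm == "REV":
        move("know-how", "SA", "SB")  # A ships the know-how to B
    elif paradigm == "COD":
        move("know-how", "SB", "SA")  # A fetches the know-how from B
    elif paradigm == "MA":
        move("know-how", "SA", "SB")  # the agent migrates with its know-how
        move("A", "SA", "SB")
    # In CS nothing is relocated: only a request and a result are exchanged.

    # The service runs where know-how, resource and an executor are co-located.
    for site, items in sites.items():
        executors = items & {"A", "B"}
        if {"know-how", "resource"} <= items and executors:
            return executors.pop(), site, sites
    raise RuntimeError("components are not co-located")
```

Calling run_paradigm("COD"), for instance, reports that A executes the service at SA after the know-how has been fetched, whereas run_paradigm("MA") leaves SA empty because the whole computational component has migrated.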

executes the code using the resources available there. An additional interaction delivers the results back to A.

A.3 Code on Demand (COD)
Louise wants to prepare a chocolate cake. She has at home both the required ingredients and an oven, but she lacks the proper recipe. However, Louise knows that her friend Christine has the right recipe and she has already lent it to many friends. So, Louise phones Christine [...] tells her the recipe and Louise prepares the chocolate cake at home.

In the COD paradigm, component A is already able to access the resources it needs, which are co-located with it at S_A. However, no information about how to manipulate such resources is available at S_A. Thus, A interacts with a component B at S_B by requesting the service know-how, which is located at S_B as well. A second interaction takes place when B delivers the know-how to A, which can subsequently execute it.

A.4 Mobile Agent (MA)
Louise wants to prepare a chocolate cake. She has the right recipe and ingredients, but she does not have an oven at home. However, she knows that her friend Christine has an oven at her place, and that she is very happy to lend it. So, Louise prepares the chocolate batter and then goes to Christine's home, where she bakes the cake.

In the MA paradigm, the service know-how is owned by A, which is initially hosted by S_A, but some of the required resources are located on S_B. Hence, A migrates to S_B carrying the know-how and possibly some intermediate results. After it has moved to S_B, A completes the service using the resources available there. The mobile agent paradigm is different from other mobile code paradigms since the associated interactions involve the mobility of an existing computational component. In other words, while in REV and COD the focus is on the transfer of code between components, in the mobile agent paradigm a whole computational component is moved to a remote site, along with its state, the code it needs, and some resources required to perform the task.

B. Discussion and Comparison
The mobile code design paradigms introduced in the previous sections define a number of abstractions for representing the bindings among components, locations, and code, and their dynamic reconfiguration. Our initial experience in applying the paradigms [50], [49] suggests that these abstractions are effective in the design of distributed applications. Furthermore, they are fairly independent of the particular language or system in which they are eventually implemented.

Mobile code paradigms model explicitly the concept of location. The site abstraction is introduced at the architectural level in order to take into account the location of the different components. Following this approach, the type of interaction between two components is determined by both components' code and location. Introducing the concept of location makes it possible to model the cost of the interaction between components at the design level. In particular, an interaction between components that share the same location is considered to have a negligible cost when compared to an interaction involving communication through the network.

Most well-known paradigms are static with respect to code and location. Once created, components cannot change either their location or their code during their lifetime. Therefore, the type of interaction and its quality (local or remote) cannot change. Mobile code paradigms overcome these limits by providing component mobility. By changing their location, components may change dynamically the quality of interaction, reducing interaction costs. To this end, the REV and MA paradigms allow the execution of code on a remote site, encompassing local interactions with components located there. In addition, the COD paradigm enables computational components to retrieve code from other remote components, providing a flexible way to extend dynamically their behavior and the types of interaction they support.

Flexibility and dynamicity are useful, but it is not clear when these paradigms should be used, and how one can

choose the right paradigm in designing a distributed application. In our opinion there is no paradigm that is the best in absolute terms. In particular, the mobile code paradigms we described do not necessarily prove to be better suited for a particular application than the traditional ones. The choice of the paradigms to exploit must be made on a case-by-case basis, according to the specific type of application and to the particular functionality being designed within the application. For each case, some parameters that describe the application behavior have to be chosen, along with some criteria to evaluate the parameter values. For example, one may want to minimize the number of interactions, the CPU costs, or the generated network traffic. In addition, a model of the underlying distributed system should be adopted to support reasoning about the criteria. For each paradigm considered, an analysis should be carried out to determine which paradigm optimizes the chosen criteria. This phase cannot take into account all the characteristics and constraints, which will probably be fully understood only after the detailed design, but it should provide hints about the most reasonable paradigm to follow in the design. A case study that provides guidelines on how such analysis can be carried out is given in Section VI.

Once an application has been designed, developers are faced with the choice of a suitable technology for its implementation. Even if technologies are somewhat orthogonal with respect to paradigms, some technologies are better suited to implement applications designed according to particular paradigms. For example, one can implement an application designed following the REV paradigm with a technology that allows EUs to exchange just messages. In this case, the programmer has the burden of translating the code to be shipped to the remote site into the data format used in message payloads. Moreover, the receiving EU has to explicitly extract the code and invoke an interpreter in order to execute it. A mobile code technology providing mechanisms for code shipping would be more convenient and would manage marshaling, shipping, and remote interpretation tasks at the system level.

A common case is represented by the use of a weak MCS that allows for code shipping for implementing applications designed following the MA paradigm [51]. In this case, the architectural concept of a moving component must be implemented using a technology that does not preserve the execution state upon migration. Therefore the programmer has to build explicitly some appropriate data structures that allow for saving and restoring the execution state of the component in case of migration. Upon migration, the EU has to pack such data structures and send them along with the code to the remote location; then the original EU terminates. When the new EU is started on the remote CE to execute the code, it must explicitly use the encoded representation of the component's state to reconstruct, at the program level, the component's execution state. If a strongly mobile technology is used, the component can be directly mapped into a migrating EU and mobility is reduced to a single instruction. Therefore the programmer is set free from handling the management of the component's state and can concentrate on the problem to solve. A case study that analyzes these relationships in detail can be found in [49].

V. Mobile Code Applications

At the time of writing, applications exploiting code mobility can still be considered as relegated to a niche, at least if compared to traditional client-server based applications. This is a consequence of the immaturity of technology (mostly as far as performance and security [52] are concerned) and of the lack of suitable methodologies for application development. Nevertheless, the interest in mobile code is not motivated by the technology per se, but rather by the benefits that it is supposed to provide by enabling new ways of building distributed applications and even of creating brand new applications. The advantages expected from the introduction of mobile code into distributed applications are particularly appealing in some specific application domains. This fact has sometimes led to identifying entire application classes with terms like "mobile agent systems" or "Internet agents" that refer more to how the applications are structured than to the functionality they implement. Therefore, in order to understand mobile code it is important to distinguish clearly between an application (e.g., a system to control a remote telescope) and the paradigm used to design it (e.g., the REV paradigm to identify control modules that are sent to the remote telescope) or the technology used to implement it (e.g., Java Aglets).

Hence, the purpose of this section is to provide the reader both with a grasp of the key benefits which mobile code is expected to bring, and with a non-exhaustive review of application domains which are being identified by researchers in the field as suitable for the exploitation of mobile code. This completes our conceptual framework and provides the reader with a path from the problem to the implementation, spanning application, design, and technology issues. Section VI will show an example of how our framework can be applied in the network management application domain.

A. Key Benefits of Mobile Code

A major asset provided by code mobility is that it enables service customization. In conventional distributed systems built following a CS paradigm, servers provide an a-priori fixed set of services accessible through a statically defined interface. It is often the case that this set of services, or their interfaces, are not suitable for unforeseen client needs. A common solution to this problem is to upgrade the server with new functionality, thus increasing both its complexity and its size without increasing its flexibility. The ability to request the remote execution of code, conversely, helps increase server flexibility without permanently affecting the size or complexity of the server. In this case, in fact, the server actually provides very simple and low-level services that seldom need to be changed. These services are then composed by the client to obtain a customized high-level functionality that meets the specific client's needs.
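
The service customization idea described above can be illustrated with a minimal Python sketch. It is not taken from any of the systems discussed in this paper: the names `TinyServer`, `evaluate`, `CLIENT_CODE`, and `total_over` are hypothetical, and the use of `exec` stands in for whatever code-shipping mechanism a real mobile code technology would provide. The server offers only two low-level primitives; the client ships a routine that composes them into a higher-level service, and only the final result travels back.

```python
# Hypothetical in-memory "server": only two low-level services are built in.
class TinyServer:
    def __init__(self, records):
        self._records = dict(records)

    def keys(self):
        """Low-level primitive 1: list the available keys."""
        return sorted(self._records)

    def get(self, key):
        """Low-level primitive 2: read one value."""
        return self._records[key]

    def evaluate(self, source, entry_point):
        """Remote-evaluation entry point: run client-supplied code locally.

        The empty __builtins__ only keeps the example small; it is NOT a
        security boundary and must not be treated as one.
        """
        namespace = {"__builtins__": {}}
        exec(source, namespace)
        # The shipped routine receives the primitives and composes them.
        return namespace[entry_point](self.keys, self.get)


# Code "shipped" by the client: a custom service the server never offered.
CLIENT_CODE = """
def total_over(keys, get):
    total = 0
    for k in keys():
        v = get(k)
        if v > 10:
            total = total + v
    return total
"""

server = TinyServer({"a": 5, "b": 20, "c": 12})
result = server.evaluate(CLIENT_CODE, "total_over")
print(result)  # 32: the sum of the values greater than 10
```

The server itself is never extended: a different client could ship a different routine against the same two primitives, which is the flexibility argument made in the text. Security, which the paper names as an open issue, is deliberately left out of the sketch.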

Mobile code is proving useful in supporting the last phases of the software development process, namely, deployment and maintenance. Software engineering addressed the problem of minimizing the work needed to extend an application and to keep track of the changes in a rational way, by emphasizing design for change and the provision of better development tools. In a distributed setting, however, the action of installing or rebuilding the application at each site still has to be performed locally and with human intervention. Some products, notably some Web browsers, already use some limited form of program downloading to perform automatic upgrade over the Internet. Mobile code helps in providing more sophisticated automation for the installation process. For instance, a scheme could be devised where installation actions (that, by their nature, can usually be automated) are coded in a mobile program roaming across a set of hosts. There, the program could analyze the features of the local platform and perform the correct configuration and installation steps. Pushing these concepts even further, let us suppose that a new functionality is needed by an application, say, a new dialog box must be shown when a particular button is pushed on the user interface. In a distributed application designed with conventional techniques, the new functionality needs to be introduced by reinstalling or patching the application at each site. This process could be lengthy and, even worse, if the functionality is not fundamental for the operation of the application there is no guarantee that it will actually be used. In this respect, the ability to request on demand the dynamic linking of the code fragment implementing the new functionality provides several benefits. First, all changes would be centralized in the code server repository, where the last version is always present and consistent. Moreover, changes would not be performed proactively by an operator on each site; rather, they could be performed reactively by the application itself, which would automatically request the new version of the code from the central repository. Hence, changes could be propagated in a lazy way, concentrating the upgrade effort only where it is really needed.

Mobile code concepts and technology also embody a notion of autonomy of application components. Autonomy is a useful property for applications that use a heterogeneous communication infrastructure where the nodes of a network may be connected by a variety of physical links with different performance. These differences must be taken into account from the design stage. For instance, recent developments in mobile computing showed that low-bandwidth and low-reliability communication channels require new design methodologies for applications in a mobile setting [1], [2]. In networks where some regions are connected through wireless links while others are connected through conventional links the design becomes complex. It is important to cope with frequent disconnections and avoid the generation of traffic over the low-bandwidth links as much as possible. The CS paradigm has a single alternative to achieve this objective: to raise the granularity level of the services offered by the server. This way, a single interaction between client and server is sufficient to specify a high number of lower level operations, which are performed locally on the server without involving communication over the physical link. Nevertheless, this solution may be impossible to achieve in certain cases given the specific application requirements. In any case, it leads to increased complexity and size, as well as reduced flexibility of the server. Code mobility overcomes these limits because it allows for specifying complex computations that are able to move across a network. This way, the services that need to be executed by a server residing in a portion of the network reachable only through an unreliable and slow link could be described in a program. This program should pass once through the wireless link and be injected into the reliable network. There, it could execute autonomously and independently. In particular, it would not need any connection with the node that sent it, except for the transmission of the final results of its computation.

Autonomy of application components brings improved fault tolerance as a side-effect. In conventional client-server systems, the state of the computation is distributed between the client and the server. A client program is made of statements that are executed in the local environment, interleaved with statements that invoke remote services on the server. The server contains (copies of) data that belong to the environment of the client program, and will eventually return a result that has to be inserted into the same environment. This structure leads to well-known problems in the presence of partial failures, because it is very difficult to determine where and how to intervene to reconstruct a consistent state. The action of migrating code, and possibly sending back the results, is not immune from this problem. In order to determine whether the code has been received and avoid duplicates or lost mobile code, an appropriate protocol must be in place. However, the action of executing code that embodies a set of interactions that should otherwise take place across the network is actually immune from partial failure. An autonomous component encapsulates all the state involving a distributed computation, and can be easily traced, checkpointed, and possibly recovered locally, without any need for knowledge of the global state.

Another advantage that comes from the introduction of code mobility in a distributed application is data management flexibility and protocol encapsulation. In conventional systems, when data are exchanged among components belonging to a distributed application, each component owns the code describing the protocol necessary to interpret the data correctly. However, it is often the case that the "know-how" related to the data changes frequently or is determined case by case according to some external condition, thus making it impractical to hard-wire the corresponding code into the application components. Code mobility enables more efficient and flexible solutions. For example, if protocols are only seldom modified and are loosely coupled with data, an application may download the code that implements a particular protocol only when the data involved in the computation need a protocol unknown to the application. Instead, if protocols are tightly
coupled with the data they accompany, components could exchange messages composed of both the data and the code needed to access and manage such data.

B. Application Domains for Mobile Code

The following review of application domains for mobile code serves two purposes. First, we want to describe some of the domains which are expected to exploit in the near future the benefits described previously, in order to provide the reader with an idea of the applicability of the concepts presented so far. Second, we want to point out that some concepts which are often associated tout court with code mobility are not mobile code approaches per se; rather, they are examples of the exploitation of mobile code in a given application domain.

B.1 Distributed Information Retrieval

Distributed information retrieval applications gather information matching some specified criteria from a set of information sources dispersed in the network. The information sources to be visited can be defined statically or determined dynamically during the retrieval process. This is a wide application domain, encompassing very diverse applications. For instance, the information to be retrieved might range from the list of all the publications of a given author to the software configuration of hosts in a network. Code mobility could improve efficiency by migrating the code that performs the search process close to the (possibly huge) information base to be analyzed [53]. This type of application has often been considered "the killer application" motivating a design based on the MA paradigm. However, analysis of the network traffic in some typical cases showed that, depending on the parameters of the application, the CS paradigm can sometimes still be the best choice [4].

B.2 Active Documents

In active document applications, traditionally passive data, like e-mail or Web pages, are enhanced with the capability of executing programs which are somewhat related to the document contents, enabling enhanced presentation and interaction. Code mobility is fundamental for these applications since it enables the embedding of code and state into documents and supports the execution of the dynamic contents during document fruition. A paradigmatic example is represented by an application that uses graphic forms to compose and submit queries to a remote database. The interaction with the user is modeled by using the COD paradigm, i.e., the user requests the active document component from the server and then performs some computation using the document as an interface. This type of application can be easily implemented by using a technology that enables fetching of remote code fragments. A typical choice is a combination of WWW technology and Java applets.

B.3 Advanced Telecommunication Services

Support, management, and accounting of advanced telecommunication services like videoconference, video on demand, or telemeeting require a specialized "middleware" providing mechanisms for dynamic reconfiguration and user customization, which are benefits provided by code mobility. For example, the application components managing the setup, signalling, and presentation services for a videoconference could be dispatched to the users by a service broker. Examples of approaches exploiting code mobility can be found in [54] and [55]. A particular class of advanced telecommunication services are those supporting mobile users. In this case, as discussed earlier, autonomous components can provide support for disconnected operations [56].

B.4 Remote Device Control and Configuration

Remote device control applications are aimed at configuring a network of devices and monitoring their status. This domain encompasses several other application domains, e.g., industrial process control and network management. In the classical approach, monitoring is achieved by periodically polling the resource state. Configuration is performed using a predefined set of services. This approach, based on the CS paradigm, can lead to a number of problems [57]. Code mobility could be used to design and implement monitoring components that are co-located with the devices being monitored and report events that represent the evolution of the device state. In addition, the shipment of management components to remote sites could improve both performance and flexibility [50], [58]. A case study focused on the application of our taxonomy to the network management application domain is presented in Section VI.

B.5 Workflow Management and Cooperation

Workflow management applications support the cooperation of persons and tools involved in an engineering or business process. The workflow defines which activities must be carried out to accomplish a given task as well as how, where, and when these activities involve each party. A way to model this is to represent activities as autonomous entities that, during their evolution, are circulated among the entities involved in the workflow. Code mobility could be used to provide support for mobility of activities that encapsulate their definition and state. For example, a mobile component could encapsulate a text document that undergoes several revisions. The component maintains information about the document state, the legal operations on its contents, and the next scheduled step in the revision process. An application of these concepts can be found in [59].

B.6 Active Networks

The idea of active networks has been proposed recently [60], [61] as a means to introduce flexibility into networks and provide more powerful mechanisms to "program" the network according to applications' needs. Although some interpret the idea of active networks without any relation to code mobility [62], most of the approaches rely on it. They can be classified along a spectrum delimited by two extremes represented by the programmable switch and the capsule approaches [60]. The programmable switch approach is basically an instantiation of the COD paradigm, and aims at providing dynamic extensibility of network devices through dynamic linking of code. On the other hand, the capsule approach proposes to attach to every packet flowing in the network some code describing a computation that must be performed on packet data, at each node. Clearly, active networks aim at leveraging the advantages provided by code mobility in terms of deployment and maintenance, customization of services, and protocol encapsulation. As an example, in this scenario a multiprotocol router could download on demand the code needed to handle a packet corresponding to an unknown protocol, or even receive the protocol together with the packet. The work described in [63] is an example of an active network architecture exploiting the COD paradigm.

B.7 Electronic Commerce

Electronic commerce applications enable users to perform business transactions through the network. The application environment is composed of several independent and possibly competing business entities. A transaction may involve negotiation with remote entities and may require access to information that is continuously evolving, e.g., stock exchange quotations. In this context, there is the need to customize the behavior of the parties involved in order to match a particular negotiation protocol. Moreover, it is desirable to move application components close to the information relevant to the transaction. These problems make mobile code appealing for this kind of application. Actually, Telescript [64] was conceived explicitly to support electronic commerce. For this reason, the term "mobile agent" is often associated with electronic commerce. Another application of code mobility to electronic commerce can be found in [65].

VI. A Case Study in Network Management

The purpose of this section is to illustrate how the classification we presented so far can be used to guide the software engineer through the design and implementation phases of the application development process. To this end, we focus on the typical functionality required of a network management application, i.e., the polling of management information from a pool of network devices. Current protocols are based on a centralized client-server paradigm that exhibits several drawbacks [57], discussed in Section VI-A. The identification and evaluation of alternative solutions will be discussed in the remainder of this section.

The suggested development process proceeds as follows. Given an application whose requirements have already been specified, the first step is to determine if the mobile code approach is suited to meet the application needs, that is, whether we have to use code mobility at all. This early evaluation is performed on the basis of the discussion at the beginning of Section V. The second step involves identifying the suitable paradigms for the design of the application at hand. This is done informally and qualitatively, as in the case described in Section VI-B. Then, the tradeoffs among the various paradigms must be analyzed for each application functionality whose design could involve code mobility. To achieve this, in Section VI-C we build a model of the application functionality that enables quantitative analysis of the tradeoffs, along the lines of [4]. Finally, after the suitable paradigms have been chosen, the technology for implementation has to be selected by examining the tradeoffs highlighted in Section IV, e.g., trading ease of programming for lightweight implementation. This will be discussed in Section VI-D.

We chose network management as the application domain for our case study because, although it is often indicated as the ideal testbed for code mobility, efforts in this direction are still in their early stages [58], [60]. The results illustrated in the remainder of this section represent the preliminary achievements of on-going work on the subject [50], [66].

A. The Problem: Decentralizing Network Traffic

The world of network management research can be split roughly into two worlds: management of IP networks, where the Simple Network Management Protocol [67] proposed by IETF is the dominant protocol, and management of ISO networks, based on the Common Management Information Protocol [68]. Both protocols are based on a CS paradigm where a network management station (the client) polls information from agents (the servers; see footnote 5) residing on the network devices. Each agent is in charge of managing a management information base (MIB; see footnote 6), a hierarchical base of information that stores the relevant parameters of the corresponding device. In this setting, all the computation related to management, e.g., statistics, is delegated to the management station. Polling is performed using very low level primitives, basically get and set of atomic values in the MIB. This fine-grained CS interaction is often called micro management, and leads to the generation of intense traffic and computational overload on the management station. This centralized architecture is particularly inefficient during periods of heavy congestion, when management becomes important. In fact, during these periods the management station increases its interactions with the devices and possibly uploads configuration changes, thus increasing congestion. In turn, congestion, as an abnormal status, is likely to trigger notifications to the management station which worsen network overload. Due to this situation, access to devices in the congested area becomes difficult and slow.

5. Despite the name, management agents are conventional programs which cannot move and in general do not exhibit a great deal of intelligence.
6. MIB is actually the term used for information bases in SNMP only. CMIP uses the term management information tree (MIT) database instead. Hereinafter, we will ignore the difference for the sake of simplicity.
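
To make the micro management problem of Section VI-A concrete, the following minimal Python sketch simulates a management station polling a pool of agents one atomic value at a time. It is only an illustration under stated assumptions: the classes `Agent` and `Station` are hypothetical, no real SNMP library or wire format is involved, the variable names are borrowed from the standard interfaces MIB purely as labels, and each get is assumed to cost exactly one request and one response message.

```python
class Agent:
    """Stationary agent holding a flat MIB (hypothetical, in-memory)."""

    def __init__(self, mib):
        self.mib = dict(mib)

    def get(self, oid):
        # Low-level primitive: return a single atomic value.
        return self.mib[oid]


class Station:
    """Management station that polls agents and counts network messages."""

    def __init__(self):
        self.messages = 0  # requests plus responses crossing the network

    def poll(self, agent, oids):
        values = {}
        for oid in oids:
            # Assumption: one request and one response per atomic value.
            self.messages += 2
            values[oid] = agent.get(oid)
        return values


# Four devices, three variables each (values are made up).
agents = [
    Agent({"ifInOctets": 100 * i, "ifOutOctets": 50 * i, "ifInErrors": i})
    for i in range(1, 5)
]
oids = ["ifInOctets", "ifOutOctets", "ifInErrors"]

station = Station()
snapshots = [station.poll(agent, oids) for agent in agents]

# All statistics are computed centrally, on the management station.
total_errors = sum(snapshot["ifInErrors"] for snapshot in snapshots)
print(station.messages, total_errors)  # 24 messages to obtain one number (10)
```

Under these assumptions the message count grows with the product of the number of devices and the number of polled values, and every value crosses the network even though the station is only interested in one aggregate.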

These problems have been addressed by IETF and ISO with modifications of their management architecture. For instance, SNMPv2 [69] introduced hierarchical decentralization through the concept of proxy agents. A proxy agent is responsible for the management of a pool of devices (towards which it acts as a client) on behalf of the network management station (towards which it acts as a server). Another protocol derived from SNMP, called Remote MONitoring (RMON) [70], assumes the existence of stand-alone dedicated devices called probes. Each probe hosts an agent able to monitor "global" information flowing through links rather than information "local" to a device. Although these decentralization features improve the situation, experimentation showed that they do not provide the level of decentralization needed to cope with large networks.

As discussed in Section V, network management applications may overcome some of these limits by taking advantage of the benefits of the mobile code approach, such as dynamicity in service deployment and customization, autonomy, and fault tolerance.

B. Identifying the Design Paradigms

In this section, we analyze if and how the mobile code design paradigms described in Section IV can provide a suitable alternative to the CS paradigm fostered by SNMP, and thus help in solving the problems depicted above.

The rationale for the management architecture proposed in SNMP and CMIP, which provides very low-granularity primitives, is to keep the agents on the devices small and easily implementable, keeping all the complexity on the management station. Nevertheless, as we described earlier, this is going to dramatically increase congestion and decrease performance. For instance, tables are often used to store information in devices. To search for a value in a table using a CS approach, either the table has to be transferred to the management station and searched there for the desired value, or the agent has to be modified to provide a new search service. Neither solution is desirable: the first transfers the whole table over the network, while the second increases the size of the agent as a larger number of routines are implemented, maybe without a substantial payoff if the routines are used only now and then.

The REV paradigm could be used to pack together the set of SNMP operations describing the search and send them to the device holding the table for local interaction (see footnote 7). After execution, only the target value should be sent back, thus performing semantic compression of data. Intuitively, this solution is likely to save bandwidth at least for big tables and small routines. As an aside, this solution provides a desirable side-effect: it raises the level of abstraction of the operations available to the network manager. One could envision a scenario where the manager builds her own management procedures upon lower level primitives, stores them on the management station, and invokes their remote evaluation on the appropriate device whenever needed.

On the other hand, the capability to retain the state across several hops implicit in an MA design adds a new dimension to the benefits achievable through an REV design: autonomy. In the REV paradigm each remote evaluation on a device must be initiated explicitly by the management station. In the MA paradigm, the management station can exploit the capability of a mobile component to retain its state and delegate to it the retrieval of information from a specified pool of devices. Thus, it can also delegate to it the decision about when and where to migrate, according to its current state. Whether this actually improves traffic load is still unclear at this point, because the state of the mobile component is likely to grow from hop to hop. This issue will be analyzed later. Nevertheless, some other advantages which can determine the choice of the MA paradigm independently of the issue of traffic are worth mentioning. For instance, let us consider a scenario where the pool of devices to be managed resides in a LAN, and assume that the management station is connected to the managed devices by a long-haul link, likely to be unreliable and slow. In this case the mobile component, once injected into the LAN, can collect information about all the managed devices without any need to be connected with the management station. Even if the state of the mobile component increases during this operation, bandwidth is assumed to be cheaper within the LAN than on the long-haul link. In addition, mobile components could have the capability to operate even when network level routing is disrupted. If the management station does not have network level connectivity with a node to be managed, it can provide its mobile component with a route calculated from historical routing information and send it to the first hop on the route. Whenever the mobile component resumes execution on an intermediate hop, it tries to reach one of the next hops towards the target node using its internal route, until it reaches the target and performs the management task.

The COD paradigm gives only a partial solution to the problem, as it provides the capability to extend dynamically the set of services offered by a device. This is convenient if many identical queries have to be performed on a device: once the code to perform the SNMP queries locally is installed, it can be remotely invoked by the management station. On the other hand, if a few different queries have to be performed, COD does not help that much: either the REV or the MA paradigm needs to be exploited. In the following, we will focus our discussion only on these two paradigms.

7. We assume the presence of a run-time support for mobile code on network devices. This assumption could have been considered unrealistic only a couple of years ago. Today, some device manufacturers have already announced support for Java in the next releases of their systems.

C. Evaluating the Design Tradeoffs

The previous section has emphasized from an informal and qualitative viewpoint several advantages of mobile code. Nevertheless, as we pointed out in Section IV, mobile code design paradigms are not good per se. Rather, their application must be carefully analyzed on a case-by-case basis,
taking into account traditional paradigms as well. In      holds, where variables in the right hand side depend on
this section, we exemplify this concept by comparing for-       the SNMP protocol and those in the left hand side depend
mally and quantitatively di erent solutions to the problem      on the particular network con guration and functionality.
of polling device data with respect to the tra c they gen-      The formula above proves the intuition that REV is con-
erate.                                                          venient when a set of SNMP instructions can be \packed"
The scenario we assume is the following. A network man-      e ciently into mobile code, e.g., by exploiting loops. Nev-
agement station retrieves management data from a pool of        ertheless, the formula gives a quanti cation about when to
devices, e.g., the load on every network interface of each      use a paradigm rather than the other.
device. Data retrieval is conceptually a single query on the       Although overall tra c is an important parameter to op-
device, but is actually implemented by several SNMP in-         timize, we pointed out earlier that one of the key bene ts of
structions. Table III shows a set of parameters needed to       mobile code is that it enables decentralized network man-
model this scenario. Such parameters de ne an oversimpli-       agement, reducing the load on the management station.
ed model. For instance, CPU time is considered an in nite     With the CS and REV paradigms, the expression for the
resource and, even more important, the network is consid-       tra c around the management workstation coincides with
ered uniform, with no di erence in bandwidth or latency         the expression for the overall tra c. Instead, an MA de-
among the links|a heavy assumption for network manage-          sign involves the management station only when the mobile
ment. Finally, in real protocols like TCP the overhead h ac-    component is injected into the network and when it comes
tually depends on the payload size. Nevertheless, our goal      back to the station, giving the expression
here is to illustrate some guidelines to evaluate the trade-
o s among paradigms: a quantitative comparison among                             TMA = 2(h + CMA ) + rQN:
Mgm

paradigms, encompassing a precise characterization of net-      In other words, the tra c around the workstation is dimin-
work management functionalities and an accurate model of        ished, that is T = TCS ; TMA > 0, when
Mgm
network protocols, can be found in 66].
A design exploiting the CS paradigm fostered by SNMP                               CMA < (2h + i)
would lead to an overall tra c described by the expression                            QN         2
TCS = (2h + i + r)QN:                         assuming QN 1. Again, this provides quantitative ev-
idence for the fact that improvement of tra c increases
In fact, due to the SNMP architecture, each of the Q in-        with the number of nodes being managed autonomously by
structions implementing the query has to be sent separately     the mobile component and with the number of instructions
on each of the N nodes and returns a single result r which is   that can be packed e ciently into the component code.
collected and subsequently elaborated by the management            It is worth noting that small changes in the model can
station.                                                        modify slightly the tradeo . For example, if semantic com-
Exploitation of the REV paradigm assumes that the set        pression of data is performed, e.g., because the manage-
of Q SNMP instructions representing the query are em-           ment station is interested only in the maximum among the
bedded in mobile code sent to each device and executed          Q values retrieved on each the device, the expression for
remotely. If the management station is interested in all        the tra c in the MA case becomes
the results returned by each SNMP instruction (which are
shipped altogether) and we assume a value CREV for the                      TMA = (h + CMA )(N + 1) rN (N + 1)
0

2
size of the code sent, the expression for the tra c is
and can even become linear, that is
TREV = (2h + CREV + rQ)N:
TMA = (h + CMA )(N + 1) + rN
00

Finally, in a design based on the MA paradigm the code
encapsulating the query can move autonomously among             when semantic compression can be performed across all the
the network devices retaining its state, which is growing       devices (e.g., because the management station is interested
as long as the mobile component collects information. The       in nding the maximum value among all the devices). This
expression for the overall tra c, assuming a value CMA for      would make MA a candidate even in absence of congestion
the size of the mobile component, is then                       around the management station.
The analysis just carried out evidences that, as far as
N
TMA = (h + CMA )(N + 1) + rQN (2 + 1) :                code mobility is concerned, REV and MA are the design
paradigms one may want to exploit in designing a polling
This analysis shows that the MA paradigm is never con-        functionality for a network management application. Nev-
venient, at least as far as overall network tra c is con-       ertheless, if the actual values of the application param-
cerned. On the other hand, assuming that 2h CREV ,              eters are in a certain range, it still desirable to use a CS
REV is more convenient than CS if                               paradigm. Hence, the choice of the paradigm is constrained
by the actual values for the parameters of the application.
CREV < (2h + i)                              As a nal remark, it should be pointed out that, al-
Q                                        though network tra c is a key parameter in the context
FUGGETTA, PICCO, AND VIGNA: UNDERSTANDING CODE MOBILITY                                                                    19

parameter   unit          description
N           node          number of managed network devices
Q           instruction   number of SNMP instructions needed to perform a single device query
i           bit           size of an SNMP instruction
h           bit           message header and other auxiliary data encapsulating message content
r           bit           average size of the result of an SNMP instruction

TABLE III
Parameters modeling a simple network management data retrieval functionality.
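
As a rough illustration, the traffic expressions of Section VI.C can be evaluated numerically from the parameters of Table III. The sketch below is not part of the original analysis: the class and function names and the sample values are hypothetical, and the MA state term assumes that the mobile component carries k times rQ bits of results after visiting k devices, which sums to rQ N(N+1)/2 over the whole trip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PollingModel:
    """Parameters of Table III (all sizes in bits)."""
    N: int  # number of managed network devices
    Q: int  # SNMP instructions needed for a single device query
    i: int  # size of an SNMP instruction
    h: int  # message header and auxiliary data
    r: int  # average size of the result of an SNMP instruction


def traffic_cs(m: PollingModel) -> int:
    # Each of the Q instructions is sent to, and answered by, each of the N devices.
    return (2 * m.h + m.i + m.r) * m.Q * m.N


def traffic_rev(m: PollingModel, c_rev: int) -> int:
    # One code message out and one reply carrying all Q results, per device.
    return (2 * m.h + c_rev + m.r * m.Q) * m.N


def traffic_ma(m: PollingModel, c_ma: int) -> int:
    # N + 1 hops; assumption: after visiting k devices the agent carries k * r * Q bits.
    state = sum(k * m.r * m.Q for k in range(1, m.N + 1))
    return (m.h + c_ma) * (m.N + 1) + state


def station_traffic_ma(m: PollingModel, c_ma: int) -> int:
    # The station only sees the agent leaving and returning with all results.
    return 2 * (m.h + c_ma) + m.r * m.Q * m.N


def rev_beats_cs(m: PollingModel, c_rev: int) -> bool:
    # Approximate condition from the text, valid when 2h is much smaller than C_REV.
    return c_rev / m.Q < 2 * m.h + m.i


def ma_relieves_station(m: PollingModel, c_ma: int) -> bool:
    # Approximate condition from the text, valid when Q * N is much larger than 1.
    return c_ma / (m.Q * m.N) < (2 * m.h + m.i) / 2


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to exercise the formulas.
    model = PollingModel(N=10, Q=20, i=100, h=200, r=32)
    print("CS overall traffic:  ", traffic_cs(model))
    print("REV overall traffic: ", traffic_rev(model, c_rev=2000))
    print("MA overall traffic:  ", traffic_ma(model, c_ma=3000))
    print("MA traffic at station:", station_traffic_ma(model, c_ma=3000))
```

With these sample values REV generates less overall traffic than CS, while MA trades higher overall traffic for a much lower load around the management station, which is the qualitative behavior the expressions describe.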

of network management, in other applications it might be completely irrelevant and other factors may be predominant, e.g., CPU usage. In these cases, the same approach based on quantitative analysis can be put in place.

D. Choosing the Implementation Technology

In principle, design paradigms and the technology used for their implementation are orthogonal, as discussed at the end of Section IV. Nevertheless, we have already pointed out that this is true only partially, and that an inappropriate technology may put an unnecessary burden on the programmer, at least as far as code mobility is concerned. In particular, we showed how a strong MCS is the natural choice for implementing an MA design. Mobility is reduced to a single instruction, and the migrating EU can be mapped directly to a roaming component in the higher-level design. Conversely, a weak MCS constrains the programmer to manage explicitly the execution state, which degrades programmer productivity, program readability, and ease of debugging. In the context of our case study, however, there is an additional drawback. The formulas we derived in the previous section show how the size of the transferred code is a key parameter in the expressions of network traffic. Implementing an MA design with a weak MCS is likely to end up creating bigger code (because of the explicit management of execution state), thus reducing the benefits potentially achievable.

Nevertheless, the final choice might be influenced by other considerations as well. The analysis described in the previous section aims at identifying the best paradigm to design a single functionality within an application. Of course, an application is composed of several functionalities, each with its own peculiarities that may lead to completely different designs. For instance, suppose we want to implement a network management application that provides, among others, a first functionality to determine the most loaded interface on a given path, a second functionality that determines all the parameters for a given interface, and a third one that allows the manager to set a given value in a device's MIB. The analysis carried out earlier tells us that in the first case we may want to take advantage of the opportunity to perform global semantic compression, and exploit the MA paradigm. In the second, we may want to follow the REV paradigm to save bandwidth, since MA would be overshooting. In the third one, CS will suffice.

The choice of the technology used to implement the application must take into account not only which is the best technology to implement a given functionality designed following a certain design paradigm, but also how the technology fits the global application development. For example, let us suppose that we are faced with the choice between a strong MCS that does not provide good support for stand-alone code shipping and a weak MCS that provides it. In the context depicted above, the first functionality is likely to be used less frequently than the others: in this case, we may want to sacrifice the traffic optimization achievable with the strong MCS and use the weak one, to obtain better support in the key functionalities and keep the uniformity of the development tools.

VII. Conclusions

Mobile code is a promising solution for the design and implementation of large scale distributed applications, since it overcomes many of the drawbacks of the traditional client-server approach. However, most research efforts in this field have been focused on the development of mobile code technologies, and little attention has been paid so far to the formulation of a sound conceptual framework for code mobility.

In this paper we proposed a conceptual framework structured along three classes of concepts: applications, design paradigms, and technologies. Applications are the solutions to specific problems. Paradigms guide the design of applications. Technologies support application development. We surveyed each of these concepts and pointed out features, advantages, and disadvantages of existing approaches and proposals. The purpose of the framework presented in this work is to foster progress towards a common understanding of the issues and contributions in the area of code mobility. The framework will be a useful guideline to practitioners, who can use it to exploit the potential of the different mobile code concepts and technologies.

Certainly, the work presented in this paper needs to be incrementally enriched and revised, taking into account experiences, results, and innovations as they emerge from the research activity. In particular, we need to improve our understanding of the properties and weaknesses of the existing design paradigms. We also need to consolidate a detailed conceptual framework for mobile code languages that makes it possible to compare them as we do for traditional programming languages [71]. Another issue is the development of models enabling formal reasoning and verification. Finally, we need to further explore the relatively
unknown world of applications and problems that can benefit from the adoption of technology and methodology based on mobile code. Nonetheless, we believe that the concepts presented in this paper can be instrumental in the creation of a mature and comprehensive background for the evolution and further diffusion of mobile code applications and techniques.

Acknowledgments

We wish to thank Mario Baldi, Antonio Carzaniga, Gianpaolo Cugola, and Carlo Ghezzi. Without the lively discussions with them and the related work developed jointly, this paper would have never been written.

References

[1] G.H. Forman and J. Zahorjan, "The Challenges of Mobile Computing," IEEE Computer, vol. 27, no. 4, pp. 38-47, 1994.
[2] T. Imielinsky and B.R. Badrinath, "Wireless Computing: Challenges in Data Management," Comm. of the ACM, vol. 37, no. 10, pp. 18-28, 1994.
[3] Object Management Group, CORBA: Architecture and Specification, Aug. 1995.
[4] A. Carzaniga, G.P. Picco, and G. Vigna, "Designing Distributed Applications with Mobile Code Paradigms," in Proc. of the 19th Int. Conf. on Software Engineering (ICSE'97), R. Taylor, Ed. 1997, pp. 22-32, ACM Press.
[5] J.K. Boggs, "IBM Remote Job Entry Facility: Generalize Subsystem Remote Job Entry Facility," IBM Technical Disclosure Bulletin 752, IBM, Aug. 1973.
[6] Adobe Systems Incorporated, PostScript Language Reference Manual, Addison-Wesley, 1985.
[7] M. Nuttall, "Survey of systems providing process or object migration," Tech. Rep. Doc 94/10, Dept. of Computing, Imperial College, May 1994.
[8] G. Thiel, "Locus operating system, a transparent system," Computer Communications, vol. 14, no. 6, pp. 336-346, 1991.
[9] E. Jul, H. Levy, N. Hutchinson, and A. Black, "Fine-grained Mobility in the Emerald System," ACM Trans. on Computer Systems, vol. 6, no. 2, pp. 109-133, Feb. 1988.
[10] R. Lea, C. Jacquemont, and E. Pillevesse, "COOL: System Support for Distributed Object-Oriented Programming," Comm. of the ACM, vol. 36, no. 9, pp. 37-46, Nov. 1993.
[11] M. Rozier, V. Abrossimov, F. Armand, I. Boule, M. Gien, M. Guillemont, F. Herrmann, C. Kaiser, P. Leonard, S. Langlois, and W. Neuhauser, "Chorus Distributed Operating Systems," Computing Systems, vol. 1, pp. 305-379, Oct. 1988.
[12] C.G. Harrison, D.M. Chess, and A. Kershenbaum, "Mobile Agents: Are they a good idea?," In Vitek and Tschudin [73], pp. 25-47, Also available as IBM Technical Report.
[13] L. Bic, M. Fukuda, and M. Dillencourt, "Distributed Computing Using Autonomous Objects," IEEE Computer, Aug. 1996.
[14] J.W. Stamos and D.K. Gifford, "Implementing Remote Evaluation," IEEE Trans. on Software Engineering, vol. 16, no. 7, pp. 710-722, July 1990.
[15] A. Birrell and B. Nelson, "Implementing Remote Procedure Calls," ACM Trans. on Computer Systems, vol. 2, no. 1, pp. 29-59, Feb. 1984.
[16] E.J.H. Chang, "Echo Algorithms: Depth Parallel Operations on General Graphs," IEEE Trans. on Software Engineering, July 1982.
[17] D. Johansen, R. van Renesse, and F.B. Schneider, "An Introduction to the TACOMA Distributed System - Version 1.0," Tech. Rep. 95-23, Dept. of Computer Science, Univ. of Tromsø and Cornell Univ., Tromsø, Norway, June 1995.
[18] A.S. Park and S. Leuker, "A Multi-Agent Architecture Supporting Services Access," In Rothermel and Popescu-Zeletin [72], pp. 62-73.
[19] J.E. White, "Telescript Technology: Mobile Agents," in Software Agents, J. Bradshaw, Ed. AAAI Press/MIT Press, 1996.
[20] P. Maes, "Agents that Reduce Work and Information Overload," Comm. of the ACM, vol. 37, no. 7, July 1994.
[21] M. Genesereth and S. Ketchpel, "Software Agents," Comm. of the ACM, vol. 37, no. 7, July 1994.
[22] M. Wooldridge and N.R. Jennings, "Intelligent Agents: Theory and Practice," Knowledge Engineering Review, vol. 10, no. 2, June 1995.
[23] F.C. Knabe, Language Support for Mobile Agents, Ph.D. thesis, Carnegie Mellon Univ., Pittsburgh, PA, USA, Dec. 1995, Also available as Carnegie Mellon School of Computer Science Technical Report CMU-CS-95-223 and European Computer Industry Centre Technical Report ECRC-95-36.
[24] H. Peine and T. Stolpmann, "The Architecture of the Ara Platform for Mobile Agents," In Rothermel and Popescu-Zeletin [72], pp. 50-61.
[25] D. Wong, N. Paciorek, T. Walsh, J. DiCelie, M. Young, and B. Peet, "Concordia: An Infrastructure for Collaborating Mobile Agents," In Rothermel and Popescu-Zeletin [72], pp. 86-97.
[26] M. Fukuda, L. Bic, M. Dillencourt, and F. Merchant, "Intra-Inter-Object Coordination with MESSENGERS," in 1st Int. Conf. on Coordination Models and Languages (COORDINATION'96), 1996.
[27] J. Kiniry and D. Zimmerman, "A Hands-On Look at Java Mobile Agents," IEEE Internet Computing, vol. 1, no. 4, pp. 21-30, 1997.
[28] D. Volpano, "Provably-Secure Programming Languages for Remote Evaluation," ACM Computing Surveys, vol. 28A, Dec. 1996, Participation statement for ACM Workshop on Strategic Directions in Computing Research.
[29] G. Cugola, C. Ghezzi, G.P. Picco, and G. Vigna, "Analyzing Mobile Code Languages," In Vitek and Tschudin [73], pp. 93-111.
[30] R.S. Gray, "Agent Tcl: A transportable agent system," in Proc. of the CIKM Workshop on Intelligent Information Agents, Baltimore, Md., Dec. 1995.
[31] B. Thomsen, L. Leth, S. Prasad, T.-M. Kuo, A. Kramer, F.C. Knabe, and A. Giacalone, "Facile Antigua Release programming guide," Tech. Rep. ECRC-93-20, European Computer Industry Research Centre, Munich, Germany, Dec. 1993.
[32] Sun Microsystems, "The Java Language: An Overview," Tech. Rep., Sun Microsystems, 1994.
[33] D.B. Lange, "Java Aglets Application Programming Interface (J-AAPI)," IBM Corp. White Paper, Feb. 1997.
[34] D.B. Lange and D.T. Chang, "IBM Aglets Workbench - Programming Mobile Agents in Java," IBM Corp. White Paper, Sept. 1996.
[35] C. Tschudin, An Introduction to the M0 Messenger Language, Univ. of Geneva, Switzerland, 1994.
[36] C. Tschudin, "OO-Agents and Messengers," in ECOOP'95 Workshop W10 on Objects and Agents, Aug. 1995.
[37] M. Straßer, J. Baumann, and F. Hohl, "Mole - A Java Based Mobile Agent System," in Special Issues in Object-Oriented Programming: Workshop Reader of the 10th European Conf. on Object-Oriented Programming ECOOP'96, M. Mühlhäuser, Ed. July 1996, pp. 327-334, dpunkt.
[38] J. Baumann, F. Hohl, N. Radouniklis, K. Rothermel, and M. Straßer, "Communication Concepts for Mobile Agent Systems," In Rothermel and Popescu-Zeletin [72], pp. 123-135.
[39] J. Hogg, "Island: Aliasing Protection in Object-Oriented Languages," in Proc. of OOPSLA '91, 1991.
[40] L. Cardelli, "A language with distributed scope," Computing Systems, vol. 8, no. 1, pp. 27-59, 1995.
[41] N. Borenstein, "EMail With A Mind of Its Own: The Safe-Tcl Language for Enabled Mail," Tech. Rep., First Virtual Holdings, Inc, 1994.
[42] J. Ousterhout, Tcl and the Tk Toolkit, Addison-Wesley, 1995.
[43] J. Ousterhout, J. Levy, and B. Welch, "The Safe-Tcl Security Model," Tech. Rep., Sun Microsystems, Nov. 1996, Reprinted in [52].
[44] A. Acharya, M. Ranganathan, and J. Saltz, "Sumatra: A Language for Resource-aware Mobile Programs," In Vitek and Tschudin [73], pp. 111-130.
[45] M. Shaw and D. Garlan, Software Architecture: Perspective on an Emerging Discipline, Prentice Hall, 1996.
[46] G. Abowd, R. Allen, and D. Garlan, "Using Style to Understand Descriptions of Software Architecture," in Proc. of SIGSOFT'93: Foundations of Software Engineering, Dec. 1993.
[47] J. Waldo, G. Wyant, A. Wollrath, and S. Kendall, "A Note on Distributed Computing," In Vitek and Tschudin [73], Also available as Sun Microsystems Laboratories Technical Report TR-94-29.
[48] G.P. Picco, G.-C. Roman, and P.J. McCann, "Expressing Code Mobility in Mobile UNITY," in Proc. of the 6th European Software Engineering Conf. held jointly with the 5th ACM SIGSOFT Symp. on the Foundations of Software Engineering (ESEC/FSE '97), M. Jazayeri and H. Schauer, Eds., Zurich, Switzerland, Sept. 1997, vol. 1301 of LNCS, pp. 500-518, Springer.
[49] C. Ghezzi and G. Vigna, "Mobile Code Paradigms and Technologies: A Case Study," In Rothermel and Popescu-Zeletin [72], pp. 39-49.
[50] M. Baldi, S. Gai, and G.P. Picco, "Exploiting Code Mobility in Decentralized and Flexible Network Management," In Rothermel and Popescu-Zeletin [72], pp. 13-26.
[51] K.A. Bharat and L. Cardelli, "Migratory Applications," Tech. Rep. 138, Digital Equipment Corporation, Systems Research Center, Feb. 1996.
[52] G. Vigna, Ed., Mobile Agents and Security, LNCS State-of-the-Art Survey. Springer, 1998.
[53] P. Knudsen, "Comparing two Distributed Computing Paradigms - a Performance Case Study," M.S. thesis, Univ. of Tromsø, 1995.
[54] A. Limongiello, R. Melen, M. Roccuzzo, A. Scalisi, V. Trecordi, and J. Wojtowicz, "ORCHESTRA: An Experimental Agent-based Service Control Architecture For Broadband Multimedia Networks," GLOBAL Internet '96, Nov. 1996.
[55] T. Magedanz, K. Rothermel, and S. Krause, "Intelligent Agents: An Emerging Technology for Next Generation Telecommunications?," in INFOCOM'96, San Francisco, CA, USA, Mar. 1996.
[56] R.S. Gray, D. Kotz, S. Nog, D. Rus, and G. Cybenko, "Mobile agents for mobile computing," in Proc. of the 2nd Aizu Int. Symp. on Parallel Algorithms/Architectures Synthesis, Fukushima, Japan, Mar. 1997.
[57] Y. Yemini, "The OSI Network Management Model," IEEE Communications, pp. 20-29, May 1993.
[58] G. Goldszmidt and Y. Yemini, "Distributed Management by Delegation," in Proc. of the 15th Int. Conf. on Distributed Computing, June 1995.
[59] T. Cai, P. Gloor, and S. Nog, "DataFlow: A Workflow Management System on the Web using transportable Agents," Tech. Rep. TR96-283, Dept. of Computer Science, Dartmouth College, Hanover, NH, 1996.
[60] D.L. Tennenhouse, J.M. Smith, W.D. Sincoskie, D.J. Wetherall, and G.J. Minden, "A Survey of Active Network Research," IEEE Communications, vol. 35, no. 1, pp. 80-86, Jan. 1997.
[61] Y. Yemini and S. da Silva, "Towards Programmable Networks," in IFIP/IEEE Int. Workshop on Distributed Systems: Operations and Management, L'Aquila, Italy, Oct. 1996.
[62] S. Bhattacharjee, K.L. Calvert, and E.W. Zegura, "An Architecture for Active Networking," in High Performance Networking (HPN'97), Apr. 1997.
[63] D.J. Wetherall, J. Guttag, and D.L. Tennenhouse, "ANTS: A Toolkit for Building and Dynamically Deploying Network Protocols," Tech. Rep., MIT, 1997, Submitted for publication to IEEE OPENARCH'98.
[64] J.E. White, "Telescript Technology: The Foundation for the Electronic Marketplace," Tech. Rep., General Magic, Inc., 1994, White Paper.
[65] M. Merz and W. Lamersdorf, "Agents, Services, and Electronic Markets: How do they Integrate?," in Proc. of the Int'l Conf. on Distributed Platforms. IFIP/IEEE, 1996.
[66] M. Baldi and G.P. Picco, "Evaluating the Tradeoffs of Mobile Code Design Paradigms in Network Management Applications," in Proc. of the 20th Int. Conf. on Software Engineering, R. Kemmerer, Ed., 1998, To appear.
[67] J.D. Case, M. Fedor, M.L. Schoffstall, and C. Davin, "Simple Network Management Protocol," RFC 1157, May 1990.
[68] OSI, "ISO 9595 Information Technology, Open System Interconnection, Common Management Information Protocol Specification," 1991.
[69] J.D. Case, K. McCloghrie, M. Rose, and S. Waldbusser, "Structure of Management Information for version 2 of the Simple Network Management Protocol," RFC 1902, Jan. 1996.
[70] S. Waldbusser, "Remote Network Monitoring Management Information Base," RFC 1757, Feb. 1995.
[71] C. Ghezzi and M. Jazayeri, Programming Language Concepts, John Wiley & Sons, 3rd edition, 1997.
[72] K. Rothermel and R. Popescu-Zeletin, Eds., Mobile Agents: 1st International Workshop MA '97, vol. 1219 of LNCS. Springer, Apr. 1997.
[73] J. Vitek and C. Tschudin, Eds., Mobile Object Systems: Towards the Programmable Internet, vol. 1222 of LNCS, Springer, Apr. 1997.

Alfonso Fuggetta is Associate Professor of Software Engineering at Politecnico di Milano. He is also Senior Researcher at CEFRIEL, a research and education institute established in Milano by universities, the regional council of Lombardy, and several major IT industries. His research interests are in workflow and process modeling and support, technologies and methods for distributed and mobile systems, and requirement engineering. He is a member of IEEE, IEEE Computer Society, and ACM. More info can be found at http://www.elet.polimi.it/~fuggetta.

Gian Pietro Picco holds a Dr.Eng. degree in electronic engineering from Politecnico di Milano, Italy, and a Ph.D. degree in computer engineering from Politecnico di Torino, Italy. The subject of his recent Ph.D. dissertation and of his current research is understanding, evaluating, formalizing, and exploiting code mobility in the context of large-scale distributed systems. Prior to that, he published work in software process modeling, object-oriented databases, and robotics. He is presently a visiting researcher at Washington University, St. Louis, USA, where he is investigating the relationships between mobile code and mobile computing. He is a member of IEEE, IEEE Computer Society, and ACM. More info can be found at http://www.polito.it/~picco.

Giovanni Vigna received the Dr.Eng. degree in electronic engineering in 1994 and the Ph.D. degree in computer engineering in 1998 from Politecnico di Milano, Italy. His Ph.D. dissertation focused on mobile code technologies and design paradigms, with an emphasis on security issues. He authored several publications on mobile code and he is editor of a special issue of the LNCS on mobile code and security. He is currently with University of California, Santa Barbara, as a post doc researcher. His research interests include mobile code, WWW engineering, electronic commerce, network security, and intrusion detection. He is a member of IEEE, IEEE Computer Society, and ACM. More info can be found at http://www.elet.polimi.it/~vigna.

