
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 8, No. 4, July 2010
LDCP+: An Optimal Algorithm for Static Task
Scheduling in Grid Systems
Negin Razavi, Safieh Siadat, Amir Masoud Rahmani
Islamic Azad University, Science and Research Branch, Tehran, Iran
n.razavi@srbiau.ac.ir, s.siadat@srbiau.ac.ir, rahmani@srbiau.ac.ir
Abstract. After a computational job has been designed and realized as a set of tasks, an optimal assignment of these tasks to the processing elements of a given architecture must be determined. In a Grid system, with heterogeneous processing elements and data transfer times between them, finding an assignment of tasks to processing elements that optimizes performance and efficiency is especially important. This paper presents a heuristic algorithm named LDCP+, which improves on the Longest Dynamic Critical Path (LDCP) algorithm presented by Mohammad I. Daoud and Nawwaf Kharma in 2007. LDCP+ is a list-based algorithm: it assigns each task a priority for its execution. Its main features are task duplication, the use of idle time on processing elements, and an optimized version of the priority assignment method used in LDCP. LDCP assumes that the computation costs of tasks are monotonic. LDCP+ removes this restriction, and for non-monotonic computation costs it achieves the minimum total finish time when compared with other scheduling algorithms such as HEFT and CPOP.

Keywords: Grid; Static task scheduling; Longest Dynamic Critical Path.

I. INTRODUCTION
A Grid system is a group of connected computers that can execute parallel programs over a high-speed interconnection. The efficiency of parallel programs in Grid systems depends on the methods used to schedule tasks on the available processing elements. The interconnection between processing elements causes an overhead whenever two tasks assigned to processing elements of distinct computers transfer data. Task scheduling in distributed heterogeneous systems is more complex still, because each task can have a different execution time on each processing element. Scheduling algorithms for a Grid system should therefore consider the execution time of each task on every processing element; even one incorrect decision can restrict the system performance to that of the slowest processing element [2].

There are two kinds of scheduling algorithms: static and dynamic. In static scheduling algorithms, all information needed for scheduling, such as the structure of the parallel application, the execution times of individual tasks and the communication costs between tasks, must be known in advance. In dynamic task scheduling algorithms, this information is not known in advance.

Among the different scheduling algorithms, HEFT is an algorithm for heterogeneous distributed computing systems that consists of two phases: first, cost computation for each task and task selection; second, processor selection. In the task selection phase, HEFT sets the computation cost of each task to its mean value, which may limit the ability of the algorithm to compute task priorities precisely. The CPOP algorithm has the same two phases as HEFT but uses different strategies for assigning priorities to tasks and for selecting processors. Both algorithms have been cited as optimal in terms of total finish time.

In this paper we present LDCP+, a heuristic list-based algorithm (an optimized version of the Longest Dynamic Critical Path algorithm) for static task scheduling in Grid systems with a limited number of processors, and we compare its scheduling results with those of CPOP, HEFT and LDCP for performance evaluation.

II. RELATED WORKS
Static task scheduling for Grid systems is, in general, known to be an NP-complete problem [4, 7, 9], and most static scheduling algorithms are heuristic [1, 2, 3, 4, 7]. One of the most important classes of heuristic algorithms is the class of list-based algorithms [6]. In such algorithms each task is assigned a priority, and three steps (task selection, processor selection and status update) are repeated until all tasks are scheduled. In the task selection phase, the unscheduled task with the highest priority is selected. In the processor selection phase, the selected task is assigned to the processor that minimizes a predefined cost criterion, such as the schedule length. Finally, in the status update phase, the status of the system is updated. Examples of list-based algorithms are: Heterogeneous Earliest Finish Time (HEFT) [9], Critical Path on a Processor (CPOP) [9], Critical Path on a Cluster (CPOC) [5], Dynamic
335 http://sites.google.com/site/ijcsis/
ISSN 1947-5500
Level Scheduling (DLS) [8], Modified Critical Path (MCP) [10], Mapping Heuristic (MH) [3], Dynamic Critical Path (DCP), and Longest Dynamic Critical Path (LDCP) [2].

III. PROBLEM DEFINITION
In static task scheduling for a Grid system, the execution precedence between tasks is represented by a Directed Acyclic Graph (DAG). Each DAG is given by a tuple $(T, E)$, where $T$ is a set of $n$ tasks and $E$ is a set of $e$ edges. Each $t_i \in T$ represents a task, and each $e_{i,j} = (t_i, t_j) \in E$ represents the execution precedence between the two tasks connected by the edge $e_{i,j}$.

If $(t_i, t_j) \in E$, then the execution of task $t_j \in T$ cannot start before task $t_i$ finishes its execution. For the edge $(t_i, t_j)$, the source task $t_i$ is a parent of the sink task $t_j$, and $t_j$ is a child of $t_i$. A task with no parents is called an entry task and a task with no children is called an exit task. Associated with each edge $(t_i, t_j)$ is a value $d_{i,j}$ that represents the amount of data to be transmitted from task $t_i$ to task $t_j$; in some cases it also represents the minimum time that task $t_j$ must wait before starting after task $t_i$ finishes its execution.

A Grid system is represented by a set $P$ of $m$ processors, a set $T$ of $n$ tasks and an $n \times m$ computation cost matrix $W_{n \times m}$. Each element $w_{i,k} \in W$, $1 \le i \le n$, $1 \le k \le m$, represents the execution time (computation cost) of task $t_i$ on processor $p_k$. We make the same assumption as LDCP that all processors are fully connected and that communication between processors takes place via independent communication units [2], so task execution and data transfer can proceed in parallel. As in LDCP, the data transfer rate between any two processors on the network is assumed to be fixed and constant. Communication costs are represented by an $n \times n$ matrix $D_{n \times n}$. An element $d_{i,j} \in D$ is zero if the two tasks $t_i$ and $t_j$ are scheduled on the same processor, and is equal to the (non-zero) communication cost otherwise. A task can start its execution on a processor only when all data from its parents are available on that processor. The goal of our algorithm is to assign tasks to processors in a way that minimizes the total finish time, or schedule length.

A. Definition 1
Schedule Length: The maximum execution time of the processors, or equivalently the finish time of the final task after scheduling, is called the schedule length. Fig. 1 shows a DAG and a computation cost matrix for two processors. The schedule length is computed in Fig. 2 and is equal to 23.

(a) [DAG drawing: tasks t0, t1, t2, t3; edge communication costs 2, 1, 3, 5]

(b)
Task   p0   p1
t0     6    5
t1     6    4
t2     9    8
t3     7    3

Figure 1. An example of (a) a DAG and (b) its computation cost matrix.

[Gantt chart: p0 runs t0, t2 (time marks 0, 6, 15); p1 runs t1, idle, t3 (time marks 0, 4, 15, 20, 23)]

Figure 2. Schedule length of the DAG of Fig. 1 on two processors.

(a) [DAGP0: node costs n0 = 6, n1 = 6, n2 = 9, n3 = 7; edge costs 2, 1, 3, 5]
(b) [DAGP1: node costs n0 = 5, n1 = 4, n2 = 8, n3 = 3; edge costs 2, 1, 3, 5]

Figure 3. Task computation times on each processor, obtained from the cost matrix in Fig. 1.

Assigning task priorities: In a Grid system, the efficiency of list-based scheduling algorithms depends on the method used to assign priorities to tasks.

In our proposed algorithm LDCP+, a task is given a high priority if selecting it at a given scheduling step leads to the minimum schedule length. LDCP+ relies on some basic definitions from the LDCP algorithm; because LDCP+ is an optimization of LDCP, we present them here as well.

B. Definition 2
Critical Path: For a given DAG, the Critical Path (CP) is the path from an entry task to an exit task for which the sum of the computation costs of tasks and the communication costs of edges is maximal.

IV. LDCP: LONGEST DYNAMIC CRITICAL PATH
A. Definition 3
Longest Dynamic Critical Path: Given a DAG with $n$ tasks and $e$ edges and a Grid system with $m$ processors, a Dynamic Critical Path (DCP) during a particular scheduling step is a path of tasks and edges from an entry task to an exit task. The LDCP is the longest DCP, where communication costs between tasks scheduled on the same processor are taken to be zero and the execution constraints are preserved.

Fig. 3 shows two dynamic critical paths. The first, in Fig. 3(a), is composed of tasks $t_0$, $t_2$ and $t_3$ on processor $p_0$ and has a length of 29. The second, in Fig. 3(b), is composed of tasks $t_0$, $t_2$ and $t_3$ on processor $p_1$ and has a length of 23. At the first scheduling step, the LDCP is therefore composed of tasks $t_0$, $t_2$ and $t_3$, with a length of 29.
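The path lengths 29 and 23 quoted for Fig. 3 can be checked with a small sketch that enumerates every entry-to-exit path and sums computation and communication costs. The computation costs come from Fig. 1(b). The edge list is an assumption read off the Fig. 1(a) drawing: only the costs of $t_0 \to t_2$ (2) and $t_2 \to t_3$ (5) are implied by the quoted lengths. The function names are illustrative and are not taken from the LDCP or LDCP+ papers. Nothing is scheduled yet at the first step, so no edge cost is zeroed.

```python
from typing import Dict, List, Tuple

# Computation costs from Fig. 1(b): COST[task] = (time on p0, time on p1).
COST: Dict[str, Tuple[int, int]] = {
    "t0": (6, 5), "t1": (6, 4), "t2": (9, 8), "t3": (7, 3),
}

# ASSUMPTION: edges as read from the Fig. 1(a) drawing. Only t0->t2 (2) and
# t2->t3 (5) are confirmed by the path lengths given in the text.
EDGES: Dict[Tuple[str, str], int] = {
    ("t0", "t2"): 2, ("t1", "t2"): 1, ("t0", "t3"): 3, ("t2", "t3"): 5,
}


def all_paths(edges: Dict[Tuple[str, str], int]) -> List[List[str]]:
    """Enumerate every path from an entry task to an exit task."""
    succ: Dict[str, List[str]] = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        succ.setdefault(v, [])
    has_parent = {v for _, v in edges}
    paths: List[List[str]] = []

    def walk(path: List[str]) -> None:
        last = path[-1]
        if not succ[last]:          # exit task reached
            paths.append(path)
        for nxt in succ[last]:
            walk(path + [nxt])

    for task in succ:
        if task not in has_parent:  # entry task
            walk([task])
    return paths


def path_length(path: List[str], proc: int) -> int:
    """Sum of computation costs on `proc` plus communication costs of the edges."""
    comp = sum(COST[t][proc] for t in path)
    comm = sum(EDGES[(a, b)] for a, b in zip(path, path[1:]))
    return comp + comm


def dynamic_critical_path(proc: int) -> Tuple[int, List[str]]:
    """Longest entry-to-exit path in the DAG of processor `proc`."""
    return max(((path_length(p, proc), p) for p in all_paths(EDGES)),
               key=lambda item: item[0])


dcp = [dynamic_critical_path(k) for k in (0, 1)]
ldcp_length, ldcp_path = max(dcp, key=lambda item: item[0])
print(dcp, ldcp_length, ldcp_path)
```

Path enumeration is exponential in the worst case and is used here only because the example graph has three paths; a recursive rank such as URank in Section V gives the same lengths in linear time.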
V. LDCP+: THE PROPOSED ALGORITHM
In the LDCP+ algorithm, each scheduling iteration consists of the following three phases:

1. Task selection phase
2. Processor selection phase
3. Status update phase

These three phases are carried out for each task until the last input task has been selected for scheduling.

A. Task Selection Phase
LDCP+ selects a set of tasks that play the main role in determining the schedule length. The first step of this phase requires the DAG of each processor.

1) Definition 4
Directed Graph: Given a DAG with $n$ tasks and $e$ edges and a Grid system with $m$ processors $(p_0, p_1, \ldots, p_{m-1})$, $DAGP_k$ is the directed acyclic graph that corresponds to processor $p_k$. The computation cost of each task on processor $p_k$ is shown as a number on the corresponding node of $DAGP_k$.

$DAGP_0$ is shown in Fig. 3(a) and $DAGP_1$ is shown in Fig. 3(b). These figures correspond to the DAG and the Grid system of Fig. 1. Throughout this paper, $t_i$ refers to the $i$-th task in the directed acyclic graph and the node $n_i$ identifies task $t_i$ in $DAGP_k$. The number associated with this node is the computation cost of task $t_i$ on processor $p_k$. In each $DAGP_k$, every node is assigned a number called its UpwardRank (URank). URanks are used to determine task priorities in $DAGP_k$.

2) Definition 5
URank: The UpwardRank of the $i$-th node $n_i$ in $DAGP_k$ is defined recursively as

$$URank_k(n_i) = w_{i,k} + \max_{n_l \in succ_k(n_i)} \{ c_k(n_i, n_l) + URank_k(n_l) \} \tag{1}$$

where $succ_k(n_i)$ is the set of immediate successors of node $n_i$, $c_k(n_i, n_l)$ is the communication cost between nodes $n_i$ and $n_l$ in $DAGP_k$, and $w_{i,k}$ is the computation cost of $t_i$ on processor $p_k$.

3) Definition 6
URankSet: Each element of URankSet is defined as

$$\max \left\{ \sum_{k=0}^{m-1} URank_k(n_i) \right\} \tag{2}$$

where $URank_k(n_i)$ is the URank of $n_i$ in $DAGP_k$.

4) Definition 7
KeyNode: The KeyNode is the node that has the maximum URank in URankSet. The task corresponding to this node is the task selected for scheduling.

5) Definition 8
KeyNodeSet: This set contains the KeyNodes that are selected for scheduling. In the first scheduling iteration it can contain several tasks, but in all other iterations it contains only one task.

6) Definition 9
Least Execution Time (LET): The Least Execution Time is defined as

$$\min \{ processTime(p_k) + w_{i,k} + d_{i,j} \} \tag{3}$$

where $processTime(p_k)$ is the time at which the last task scheduled on processor $p_k$ finishes its execution, $w_{i,k}$ is the computation time of task $t_i$ on processor $p_k$, and $d_{i,j}$ is the communication time between $t_i$ and $t_j$. If both $t_i$ and $t_j$ are scheduled on processor $p_k$, the communication cost between them is taken to be zero.

After URankSet has been computed, the task selected for scheduling is the task corresponding to the KeyNode in URankSet. In the first iteration, to obtain the minimum execution time on the available processing elements, if the number of entry tasks is less than or equal to the number of processors, all entry tasks are taken as KeyNodes. Otherwise, as many tasks as there are processors, those with the largest URanks, are selected as KeyNodes and placed in KeyNodeSet. In later iterations, KeyNodeSet contains exactly one KeyNode.

B. Processor Selection Phase
In this phase, the selected task is assigned to a processor so as to obtain the minimum schedule length in each scheduling iteration. LDCP+ proceeds as follows. As mentioned above, in the first scheduling iteration KeyNodeSet can contain more than one KeyNode. To improve on the LDCP algorithm, LDCP+ evaluates the distinct permutations of the tasks whose KeyNodes are in KeyNodeSet over the different processors, and the permutation with the minimum average execution time on the processors becomes the first assignment of tasks to processors. This average execution time is obtained from

$$\min \left\{ \frac{\sum_{k=0}^{m-1} w_{i,k}}{m} \right\} \tag{4}$$

where $i$ is the index of a task corresponding to a KeyNode, $w_{i,k} \in W$ and $m$ is the number of processors. In later iterations, the only KeyNode in KeyNodeSet is selected to be scheduled.

1) Definition 10
Idle Space: When there is a gap on a processor between the start time of a task and the end time of the previous task, that time interval is called an idle space.
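As an illustration of Definitions 5 to 8, the sketch below evaluates the URank recursion of (1) on the example of Fig. 1 and Fig. 3 and then picks a KeyNode. Several points are assumptions rather than statements from the paper: the edge list is read off the Fig. 1(a) drawing, each URankSet element is taken to be the sum over processors of the node's URanks (one possible reading of (2)), and when there are more entry tasks than processors the first KeyNodeSet is filled from the entry tasks only. All identifiers are illustrative.

```python
from functools import lru_cache
from typing import Dict, Tuple

M = 2  # number of processors in the Fig. 1 example

# Node costs from Fig. 3: COST[node] = (cost in DAGP0, cost in DAGP1).
COST: Dict[str, Tuple[int, int]] = {
    "n0": (6, 5), "n1": (6, 4), "n2": (9, 8), "n3": (7, 3),
}

# ASSUMPTION: successor lists and edge costs as read from the Fig. 1(a) drawing.
SUCC: Dict[str, Dict[str, int]] = {
    "n0": {"n2": 2, "n3": 3},
    "n1": {"n2": 1},
    "n2": {"n3": 5},
    "n3": {},
}


@lru_cache(maxsize=None)
def urank(node: str, k: int) -> int:
    """Equation (1): node cost plus the most expensive way down to an exit node."""
    w = COST[node][k]
    if not SUCC[node]:              # exit node: URank is its own cost
        return w
    return w + max(c + urank(nxt, k) for nxt, c in SUCC[node].items())


# ASSUMED reading of (2): one URankSet value per node, summed over processors.
urank_set = {n: sum(urank(n, k) for k in range(M)) for n in COST}

# Definition 7: the KeyNode has the largest value in URankSet.
key_node = max(urank_set, key=urank_set.get)

# First iteration (Definition 8): all entry nodes if they fit on the
# processors, otherwise (ASSUMPTION) the M entry nodes with the largest values.
entries = [n for n in COST if all(n not in s for s in SUCC.values())]
key_node_set = sorted(entries, key=urank_set.get, reverse=True)[:M]

print(urank_set, key_node, key_node_set)
```

With these inputs the URank of the entry node n0 is 29 in DAGP0 and 23 in DAGP1, which matches the dynamic critical path lengths quoted in Section IV.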
2) Definition 11
Replacement Ability: A task can be placed in an idle space when all parents of that task have finished before the start time of the task. If any of its parents has been scheduled on a different processor, the time required to transfer data between the processors must be taken into account.

If there is a processor with an idle space and the selected task can be placed in that space (replacement ability), that processor is selected. At the end of this phase, the LDCP+ algorithm uses the duplication process to decrease the schedule length. After the processor has been selected, if the selected task has a parent scheduled on a different processor and the selected processor has an idle space before the start time of the selected task, the duplication process is applied in that idle space (subject to the replacement ability).

3) Definition 12
Duplication Process: The duplication process repeats the execution of a task on other processors.

C. Status Update Phase
After the task has been selected and assigned to a processor, the URank corresponding to the selected task is deleted from URankSet. The finish time of the selected processor is updated once the task has been assigned to it. The selected task is deleted from the list of unscheduled tasks.

    Establish DAGP_k for all processors in the system, where 0 <= k <= m - 1
    Calculate URanks for all DAGP_k
    Compute the URankSet
    While there are unscheduled tasks in the task list do
        Find the KeyNode(s) in the URankSet
        Put the KeyNode(s) in KeyNodeSet
        If (it is the first step of scheduling) then
            Choose the processors which have the minimum permutation;
        Else
            If (there is any processor with idle time and the task has the replacement ability) then
                Select the processor;
            Else
                Compute the finish time of the selected task on every processor;
                Find and select the processor that minimizes the finish time of the selected task;
            End if
            Duplicate the parent(s) of the selected task if needed;
        End if
        Assign the selected task to the selected processor;
        Update the selected processor time;
        Update the URankSet;
        Update the unscheduled task list;
    End while

Figure 4. LDCP+ algorithm
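The idle-space rule of Definitions 10 and 11 can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the data layout, the function names and the example task are hypothetical, the first processor whose idle space fits is taken, and when no idle space fits the task is appended after the last task on the processor that gives the earliest finish time (an assumed fallback in the spirit of the "minimizes the finish time" step of Fig. 4). Duplication is not modelled.

```python
from typing import Dict, List, Optional, Tuple

Slot = Tuple[float, float]          # (start, finish) of an already scheduled task
Parent = Tuple[float, int, float]   # (finish time, processor, transfer cost)


def idle_spaces(busy: List[Slot]) -> List[Slot]:
    """Gaps between consecutive tasks on one processor (Definition 10)."""
    ordered = sorted(busy)
    return [(prev_end, nxt_start)
            for (_, prev_end), (nxt_start, _) in zip(ordered, ordered[1:])
            if nxt_start > prev_end]


def data_ready_time(parents: List[Parent], proc: int) -> float:
    """Time at which the data of every parent is available on `proc`.
    The transfer cost is added only for parents on another processor."""
    return max((fin if p == proc else fin + cost for fin, p, cost in parents),
               default=0.0)


def fit_in_idle_space(busy: List[Slot], duration: float,
                      ready: float) -> Optional[float]:
    """Earliest start inside an idle space, or None if the task does not fit
    (no replacement ability, Definition 11)."""
    for gap_start, gap_end in idle_spaces(busy):
        start = max(gap_start, ready)
        if start + duration <= gap_end:
            return start
    return None


def select_processor(schedule: Dict[int, List[Slot]], cost: Dict[int, float],
                     parents: List[Parent]) -> Tuple[int, float]:
    """Return (processor, start time) for the selected task."""
    # 1) Prefer a processor whose idle space can host the task.
    for proc, busy in schedule.items():
        start = fit_in_idle_space(busy, cost[proc], data_ready_time(parents, proc))
        if start is not None:
            return proc, start
    # 2) ASSUMED fallback: append after the last task, pick the earliest finish.
    best: Optional[Tuple[int, float, float]] = None
    for proc, busy in schedule.items():
        process_time = max((end for _, end in busy), default=0.0)
        start = max(process_time, data_ready_time(parents, proc))
        if best is None or start + cost[proc] < best[2]:
            best = (proc, start, start + cost[proc])
    assert best is not None, "schedule must contain at least one processor"
    return best[0], best[1]


# Illustrative state: p0 is busy on [0, 6] and [6, 15]; p1 is busy on [0, 4]
# and [20, 23]. The hypothetical task has one parent that finished at time 6
# on p0 with transfer cost 3.
schedule = {0: [(0, 6), (6, 15)], 1: [(0, 4), (20, 23)]}
parents = [(6, 0, 3)]
short_task = select_processor(schedule, {0: 5, 1: 5}, parents)
long_task = select_processor(schedule, {0: 20, 1: 20}, parents)
print(short_task, long_task)
```

In this example the short task fits into the idle space of p1 once its parent's data has arrived, while the long task fits in no gap and is appended on p0.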
The LDCP+ algorithm is given in Fig. 4.

VI. CASE STUDY
In this section, the execution results of the CPOP, HEFT and LDCP+ algorithms are compared for a non-monotonic computation cost matrix. Fig. 5 shows a Grid system composed of three single-processor computers (m = 3), ten tasks (n = 10), a non-monotonic computation cost matrix and a DAG with communication costs assigned to its edges. Fig. 5 also presents the schedules produced for this DAG by the HEFT, CPOP and LDCP+ algorithms.

The execution results of LDCP and LDCP+ are compared for a monotonic computation cost matrix. Fig. 6 shows a Grid system with m = 2 and n = 11, a monotonic computation cost matrix and a DAG with communication costs assigned to its edges. Fig. 6 also shows the schedules produced for this DAG, whose cost matrix is given in Fig. 6(b), by the LDCP and LDCP+ scheduling algorithms.

Fig. 5(a): [DAG drawing with 10 tasks]

Fig. 5(b): computation cost matrix
Task   p1   p2   p3
1      14   16   9
2      13   19   18
3      11   13   19
4      13   8    17
5      12   13   10
6      13   16   9
7      7    15   11
8      5    11   14
9      18   12   20
10     21   7    16

Fig. 5(c), (d): [Gantt charts of tasks n1 to n10 on processors p1, p2, p3; time axis from 0 to 90]
Fig. 5(e): [Gantt chart of the LDCP+ schedule]

Fig. 5(f): task execution sequence in the LDCP+ algorithm
Step   Selected task   Selected processor
1      n1              p1
2      n4              p2
3      n3              p1
4      n2              p3
5      n5              p2
6      n6              p3
7      n9              p2
8      n7              p1
9      n8              p1
10     n10             p2

Figure 5. Scheduling results for the HEFT, CPOP and LDCP+ algorithms. (a) A graph with 10 tasks. (b) Graph cost matrix. (c) HEFT schedule with a schedule length of 80. (d) CPOP schedule with a schedule length of 89. (e) LDCP+ schedule with a schedule length of 68. (f) Task execution sequence in the LDCP+ algorithm. Duplicated tasks:

Fig. 6(a): [DAG drawing with 11 tasks]

Fig. 6(b): computation cost matrix
Task   p0   p1
t1     4    6
t2     15   22.5
t3     4    6
t4     13   19.5
t5     10   15
t6     7    10.5
t7     8    12
t8     4    6
t9     12   18
t10    6    9
t11    9    13.5

Fig. 6(c): task execution sequence in the LDCP algorithm
Step   Selected task   Selected processor
1      t2              p0
2      t1              p1
3      t4              p1
4      t9              p0
5      t5              p0
6      t3              p0
7      t7              p1
8      t6              p1
9      t8              p0
10     t10             p0
11     t11             p0

Fig. 6(d): [Gantt chart of the LDCP schedule]

Fig. 6(e): task execution sequence in the LDCP+ algorithm
Step   Selected task   Selected processor
1      t2              p0
2      t1              p1
3      t4              p1
4      t9              p0
5      t5              p0
6      t3              p0
7      t7              p1
8      t6              p1
9      t8              p0
10     t11             p0
11     t10             p1

Fig. 6(f): [Gantt chart of the LDCP+ schedule]

Figure 6. Scheduling results for the LDCP and LDCP+ algorithms. (a) A graph with 11 tasks. (b) Graph cost matrix. (c) Task execution sequence in the LDCP algorithm. (d) LDCP schedule with a schedule length of 64. (e) Task execution sequence in the LDCP+ algorithm. (f) LDCP+ schedule with a schedule length of 61.5.

VII. CONCLUSION AND FUTURE WORK
In Grid systems, task scheduling is an important problem in the optimization of heterogeneous distributed systems. In this paper a new heuristic scheduling algorithm, named LDCP+, is proposed. It improves on the LDCP algorithm and attains better schedule lengths by improving three phases: task selection, processor selection and status update. LDCP+ can schedule tasks in Grid systems with either a monotonic or a non-monotonic cost matrix. Using the duplication process, optimizing the assignment of priorities to tasks and using the idle spaces of processors result in a better schedule length than other scheduling algorithms. In real-time environments, the assignment of resources such as processors at a specific time is very important. Further work can be done to obtain algorithms with lower computation cost for such environments.

REFERENCES
[1] S. Bansal, P. Kumar, and K. Singh. An improved duplication strategy for scheduling precedence constrained graphs in multiprocessor systems. IEEE Transactions on Parallel and Distributed Systems 14(6), pages 533-544, 2003.
[2] M. I. Daoud and N. N. Kharma. A high performance algorithm for static task scheduling in heterogeneous distributed computing systems. Journal of Parallel and Distributed Computing 68(4), pages 399-409, 2008.
[3] H. El-Rewini and T. G. Lewis. Scheduling parallel program tasks onto arbitrary target machines. Journal of Parallel and Distributed Computing 9(2), pages 138-153, 1990.
[4] E. Ilavarasan, P. Thambidurai, and R. Mahilmannan. Performance effective task scheduling algorithm for heterogeneous computing system. 4th International Symposium on Parallel and Distributed Computing, 0:28-38, 2005.
[5] J. Kim, J. Rho, J.-O. Lee, and M.-C. Ko. CPOC: Effective static task scheduling for grid computing. In Proceedings of the 2005 International Conference on High Performance Computing and Communications, pages 477-486, 2005.
[7] Y.-K. Kwok and I. Ahmad. Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Computing Surveys 31(4), pages 406-471, 1999.
[8] Y.-K. Kwok and I. Ahmad. Dynamic critical-path scheduling: An effective technique for allocating task graphs to multiprocessors. IEEE Transactions on Parallel and Distributed Systems 7(5), pages 506-521, 1996.
[9] G. C. Sih and E. A. Lee. A compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures. IEEE Transactions on Parallel and Distributed Systems 4(2), pages 175-187, 1993.
[10] H. Topcuoglu, S. Hariri, and M.-Y. Wu. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distributed Systems 13(3), pages 260-274, 2002.