A Budget Constrained Scheduling of Workflow Applications on Utility Grids using Genetic Algorithms
Jia Yu and Rajkumar Buyya
Grid Computing and Distributed Systems Laboratory, Dept. of Computer Science and Software Engineering, The University of Melbourne, Australia
Content
- Introduction: Utility Grids, Problem overview, Genetic Algorithms
- Proposed Work
- Experiment Results
- Related work
- Conclusion and future work
Utility Computing and Utility Grids
Utility Computing
- New service provisioning model.
- Provides computing services such as servers, storage and applications.
- Pay-per-use.
Utility Grids
- Grid computing provides a global infrastructure for resource sharing and integration.
- Enables users to consume utility services transparently over a secure, shared, scalable and standard worldwide network environment.
Community Grids vs. Utility Grids
                Community Grids                 Utility Grids
Availability    Best effort                     Advanced reservation
QoS             Best effort                     Contract/SLA
Pricing         Not considered / free access    Usage, QoS level, market supply and demand
Workflow Scheduling
Scheduling on Community Grids
- Minimize the execution time, ignoring other factors such as the monetary cost of resource access and users' QoS satisfaction levels.
Scheduling on Utility Grids
- Optimize performance under the most important QoS constraints imposed by users.
- Minimize execution cost while meeting a specified deadline.
- Minimize execution time while meeting a specified budget.
Genetic Algorithms
- Random search method based on the principle of evolution.
- Exploitation of the best solutions from past searches.
- Exploration of new regions of the solution space.
- Allows a high-quality solution to be derived from a large search space.
Genetic Algorithms
- A GA maintains a population of individuals that evolves over generations.
- Each individual in the search space of the problem represents a solution.
- The quality of an individual is determined by a fitness function.
[Flowchart: Start -> Initialize the population of possible solutions -> Generate offspring solutions by genetic operators -> Evaluate the fitness of each individual in the population -> Select the fittest solutions in the population -> Terminated? If yes: Stop. If no: return to generating offspring.]
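The loop described on this slide can be sketched in a few lines of Python. This is a generic illustration, not the authors' implementation: the function names, the fixed generation count used as the termination test, the keep-the-best-half selection, and the toy bit-string problem are all assumptions made for the example. Lower fitness is treated as better.

```python
import random

def run_ga(init_individual, fitness, make_offspring,
           pop_size=20, generations=50, seed=0):
    """Generic GA loop (illustrative sketch; lower fitness is better)."""
    rng = random.Random(seed)
    # Initialize the population of possible solutions.
    population = [init_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):  # assumed termination test: fixed count
        # Generate offspring solutions by genetic operators.
        offspring = [
            make_offspring(rng.choice(population), rng.choice(population), rng)
            for _ in range(pop_size)
        ]
        # Evaluate the fitness of every candidate, then keep the fittest.
        candidates = sorted(population + offspring, key=fitness)
        population = candidates[:pop_size]
    return min(population, key=fitness)

# Hypothetical toy problem: minimize the number of 1 bits in a 16-bit string.
def toy_init(rng):
    return [rng.randint(0, 1) for _ in range(16)]

def toy_fitness(bits):
    return sum(bits)

def toy_offspring(a, b, rng):
    cut = rng.randrange(1, len(a))      # one-point crossover
    child = a[:cut] + b[cut:]
    i = rng.randrange(len(child))       # flip one bit as mutation
    child[i] = 1 - child[i]
    return child

best = run_ga(toy_init, toy_fitness, toy_offspring)
print(toy_fitness(best))
```

Because the fittest candidates always survive into the next generation, the best fitness in the population never gets worse from one generation to the next.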
Proposed Work
Existing GAs
- Schedule dependent tasks in homogeneous multiprocessor systems.
- Minimize execution time or maximize system throughput.
Our work
- Schedule dependent tasks in heterogeneous environments.
- Minimize execution time while meeting users' budget.
Application Model
Directed Acyclic Graph (DAG)
- There is no cycle in the graph.
- A task cannot be executed until all of its parent tasks are completed.
[Figure: example DAG with tasks A, B, C and D]
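The two DAG rules can be checked mechanically. The sketch below uses the four task names from the figure, but the edges (A before B and C, both before D) are an assumption, since the figure's arrows are not recoverable here. It relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter, CycleError

# task -> set of parent tasks (edges are hypothetical)
parents = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

def is_acyclic(parents):
    """Return True if the dependency graph contains no cycle."""
    try:
        TopologicalSorter(parents).prepare()
        return True
    except CycleError:
        return False

def ready_tasks(parents, completed):
    """Tasks not yet run whose parent tasks have all completed."""
    return sorted(
        task for task, ps in parents.items()
        if task not in completed and ps <= completed
    )

print(is_acyclic(parents))          # the example graph has no cycle
print(ready_tasks(parents, {"A"}))  # tasks unlocked once A has finished
```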
Construction of a Genetic Algorithm
- Representation of an individual in the population.
- Determination of the fitness function.
- Design of genetic operators.
Problem encoding
[Figure: a workflow of tasks T0 to T7 and its schedule over time on services S1 to S4]
Two-dimensional strings
S1: T0-T2-T7
S2: T1
S3: T3-T5
S4: T4-T6
One-dimensional string
T0(1)-T2(1)-T7(1)-T1(2)-T3(3)-T5(3)-T4(4)-T6(4)
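The relation between the two encodings in this example can be sketched as follows. The slide's one-dimensional string is consistent with concatenating the per-service task sequences in service order, so that is what the sketch does. Whether the authors' conversion applies any further ordering rule is not stated on the slide, and the function names are hypothetical.

```python
# Two-dimensional encoding from the slide: service number -> ordered tasks
two_d = {1: ["T0", "T2", "T7"], 2: ["T1"], 3: ["T3", "T5"], 4: ["T4", "T6"]}

def to_one_dimensional(two_d):
    """Flatten to a list of (task, service) pairs, in service order."""
    return [(task, service)
            for service in sorted(two_d)
            for task in two_d[service]]

def to_two_dimensional(one_d):
    """Rebuild the per-service task sequences from (task, service) pairs."""
    result = {}
    for task, service in one_d:
        result.setdefault(service, []).append(task)
    return result

def format_one_dimensional(one_d):
    """Render pairs in the slide's notation, e.g. T0(1)-T2(1)."""
    return "-".join(f"{task}({service})" for task, service in one_d)

one_d = to_one_dimensional(two_d)
print(format_one_dimensional(one_d))
```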
Fitness function
Cost-fitness: encourages the formation of solutions that satisfy the budget constraint.
$F_{cost}(I) = \frac{c(I)}{B}$
where $c(I)$ is the sum of the task execution cost and data transmission cost of $I$, and $B$ is the budget of the workflow.
Time-fitness: encourages the GA to choose individuals with the earliest completion time in the current population.
$F_{time}(I) = \frac{t(I)}{maxTime}$
where $t(I)$ is the completion time of $I$ and $maxTime$ is the largest completion time of the current population.
Fitness function
$F(I) = \begin{cases} F_{cost}(I), & \text{if } F_{cost}(I) > 1 \\ F_{time}(I), & \text{otherwise} \end{cases}$
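A minimal sketch of the three fitness definitions follows. It assumes that the comparison lost from the slide is $F_{cost}(I) > 1$ (the individual exceeds the budget) and that a lower fitness value is better. The function and parameter names are hypothetical.

```python
def cost_fitness(cost, budget):
    """F_cost(I) = c(I) / B; a value above 1 means the budget is exceeded."""
    return cost / budget

def time_fitness(completion_time, max_time):
    """F_time(I) = t(I) / maxTime; at most 1 within the current population."""
    return completion_time / max_time

def fitness(cost, completion_time, budget, max_time):
    """Combined fitness (assumed: lower is better).

    Over-budget individuals are ranked by cost; the rest by time.
    """
    f_cost = cost_fitness(cost, budget)
    if f_cost > 1:
        return f_cost
    return time_fitness(completion_time, max_time)

# Hypothetical individuals with a budget of G$3000 and maxTime of 6000
over_budget = fitness(cost=3600, completion_time=3000, budget=3000, max_time=6000)
within_budget = fitness(cost=2900, completion_time=6000, budget=3000, max_time=6000)
print(over_budget, within_budget)
```

Under these assumptions an individual that meets the budget never scores worse than one that exceeds it, because $F_{time}(I) \le 1 < F_{cost}(I)$ for any over-budget individual.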
Genetic operators
Selection
- Retain the fittest individuals in the population as successive generations evolve.
Crossover
- Produce new individuals by combining two existing individuals.
Mutation
Crossover
Before crossover
parent1: S1: T0-T2-T7, S2: T1, S3: T3-T5, S4: T4-T6
parent2: S1: T0-T1, S7: T2-T7, S8: T3, S9: T4-T6, S10: T5
Crossover
Randomly select crossover window
parent1: T0(1)-T2(1)-T7(1)-T1(2)-T3(3)-T5(3)-T4(4)-T6(4)
parent2: T0(1)-T1(1)-T2(7)-T7(7)-T3(8)-T4(9)-T6(9)-T5(10)
After crossover
offspring1: S1: T0-T2-T1, S4: T4-T6, S7: T7, S8: T3, S10: T5
offspring2: S1: T0-T7, S2: T1, S3: T3-T5, S7: T2, S9: T4-T6
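The crossover example can be reproduced with a short sketch. It assumes that the window is a contiguous slice of the first parent's one-dimensional string and that the two parents exchange service assignments for the tasks inside that window, with each offspring keeping its own parent's task order. The window position used here (the third to sixth entries of parent1) is inferred from the slide's offspring, and the function names are hypothetical.

```python
parent1 = [("T0", 1), ("T2", 1), ("T7", 1), ("T1", 2),
           ("T3", 3), ("T5", 3), ("T4", 4), ("T6", 4)]
parent2 = [("T0", 1), ("T1", 1), ("T2", 7), ("T7", 7),
           ("T3", 8), ("T4", 9), ("T6", 9), ("T5", 10)]

def crossover(p1, p2, start, end):
    """Swap service assignments for the tasks in p1[start:end]."""
    window = {task for task, _ in p1[start:end]}
    service1, service2 = dict(p1), dict(p2)
    child1 = [(t, service2[t] if t in window else s) for t, s in p1]
    child2 = [(t, service1[t] if t in window else s) for t, s in p2]
    return child1, child2

def group_by_service(individual):
    """Convert (task, service) pairs back to per-service task lists."""
    grouped = {}
    for task, service in individual:
        grouped.setdefault(service, []).append(task)
    return dict(sorted(grouped.items()))

offspring1, offspring2 = crossover(parent1, parent2, 2, 6)
print(group_by_service(offspring1))
print(group_by_service(offspring2))
```

With this window (tasks T7, T1, T3 and T5) the sketch yields the same service groupings as the offspring shown on the slide.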
Mutation Operations
Mutation operations
- Allow an offspring to obtain features that are not possessed by either parent.
Swapping mutation
- Aims to change the execution order of tasks in an individual that compete for the same time slot.
Replacing mutation
- Aims to re-allocate an alternative service to a task in an individual.
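The two mutation operations might be sketched as below, on the two-dimensional encoding. This is a simplification under stated assumptions: the swap picks any two tasks queued on the same service and does not check task dependencies, which a real scheduler would need to do. The `alternatives` table of candidate services per task is hypothetical.

```python
import random

def swapping_mutation(schedule, rng):
    """Swap the execution order of two tasks queued on the same service."""
    mutated = {s: list(tasks) for s, tasks in schedule.items()}
    candidates = [s for s, tasks in mutated.items() if len(tasks) >= 2]
    if candidates:
        tasks = mutated[rng.choice(candidates)]
        i, j = rng.sample(range(len(tasks)), 2)
        tasks[i], tasks[j] = tasks[j], tasks[i]
    return mutated

def replacing_mutation(schedule, alternatives, rng):
    """Re-allocate one randomly chosen task to an alternative service."""
    mutated = {s: list(tasks) for s, tasks in schedule.items()}
    task = rng.choice([t for tasks in mutated.values() for t in tasks])
    current = next(s for s, tasks in mutated.items() if task in tasks)
    options = [s for s in alternatives[task] if s != current]
    if options:
        mutated[current].remove(task)
        mutated.setdefault(rng.choice(options), []).append(task)
    return mutated

schedule = {1: ["T0", "T2", "T7"], 2: ["T1"], 3: ["T3", "T5"], 4: ["T4", "T6"]}
# Hypothetical: every task can run on any of the four services.
alternatives = {t: [1, 2, 3, 4] for tasks in schedule.values() for t in tasks}

swapped = swapping_mutation(schedule, random.Random(1))
moved = replacing_mutation(schedule, alternatives, random.Random(1))
```

Both functions return a new schedule and leave the input untouched, so a GA can apply them to copies of selected individuals.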
Schedule refinement
[Figure: Gantt charts of a schedule (a) before refinement and (b) after refinement; the rescheduled tasks are T1, T3 and T5]
(a) Before refinement: T0 0-1878; T1 0-2450 (G$300); T2 0-4450; T3 1878-2050 (G$150); T5 2050-2650 (G$180); T4 4450-5166; T6 5166-5666
(b) After refinement: T0 0-1878; T1 0-4440 (G$200); T2 0-4450; T3 1878-3050 (G$100); T5 3050-5000 (G$100); T4 4450-5166; T6 5166-5666
Experiments
GridSim experiment environment
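[Figure: GridSim experiment environment with a Workflow System, Grid Services and the GIS (Grid Index System)]
1. Grid Services register with the GIS: register(service type)
2. The Workflow System queries the GIS: query(type A)
3. The GIS returns the service list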
Experiments
Applications
[Figure: two workflow applications, each with tasks numbered 1 to 15 and task lengths between 150000 and 900000 shown in parentheses]
Balanced structure: stages of align_warp, reslice, softmean, slicer and convert tasks.
Unbalanced structure: SignalP, COILS2, SEG, PROSITE, TMHMM, PSI-BLAST, Prospero, HMMer, BLAST, IMPALA, PSI-PRED, 3D-PSSM, SCOP, Genome Summary and Summary tasks.
Experiments
Each service type represents a different kind of service. There are 15 types of services, each supported by 10 different service providers with different processing capabilities.
Table I. Service speed and corresponding price for executing a task.
Service ID    Processing Time (sec)    Cost (G$)
1             1200                     300
2             600                      600
3             400                      900
4             300                      1200
Table II. Transmission bandwidth and corresponding price.
Bandwidth (Mbps)    Cost/sec (G$/sec)
100                 1
200                 2
512                 5.12
1024                10.24
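To illustrate how the two price tables feed into the cost of a schedule, the sketch below encodes them as lookup tables and computes execution and transmission costs. The cost model for data transfer (transfer time multiplied by the per-second price) and the data size used are assumptions for the example; the slides do not state them.

```python
import math

# Table I: service ID -> (processing time in sec, cost in G$)
SERVICES = {1: (1200, 300), 2: (600, 600), 3: (400, 900), 4: (300, 1200)}
# Table II: bandwidth in Mbps -> cost per second in G$
BANDWIDTH_PRICE = {100: 1.0, 200: 2.0, 512: 5.12, 1024: 10.24}

def execution_cost(service_ids):
    """Total price of running one task on each listed service."""
    return sum(SERVICES[s][1] for s in service_ids)

def transmission_cost(data_megabits, bandwidth_mbps):
    """Assumed model: transfer time (sec) times the per-second price."""
    seconds = data_megabits / bandwidth_mbps
    return seconds * BANDWIDTH_PRICE[bandwidth_mbps]

print(execution_cost([1, 2, 3, 4]))
print(transmission_cost(1000, 512))  # hypothetical 1000 megabit transfer
```

Two regularities follow from the tabulated values. In Table I the product of processing time and cost is the same for every service (360000), so halving the time doubles the price. In Table II the price per second is proportional to the bandwidth, so under the assumed model the cost of moving a given amount of data does not depend on the link chosen; only the transfer time does.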
Evolution of execution time and cost during 100 generations.
Evolution of execution time and cost in response to different refinement rate when budget is G$3000.
Heuristics compared
Greedy time
- Assigns a planned budget to each task in the workflow, based on the average estimated execution costs of the tasks and the total budget of the workflow.
- Assigns each task to the service that can complete it at the earliest time within its assigned sub-budget.
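A minimal sketch of the Greedy time idea follows. Several details are assumptions: the sub-budget of a task is taken to be the total budget scaled by the task's share of the summed average costs, "earliest time" is simplified to the shortest processing time (queueing on a service is ignored), and the fallback to the cheapest service when nothing fits is invented for the example. The four tasks and their names are hypothetical; the service options reuse the time and cost pairs of Table I.

```python
def distribute_budget(avg_costs, budget):
    """Split the budget in proportion to each task's average cost."""
    total = sum(avg_costs.values())
    return {task: budget * cost / total for task, cost in avg_costs.items()}

def greedy_time(task_services, budget):
    """task_services: task -> list of (processing_time, cost) options."""
    avg_costs = {
        task: sum(cost for _, cost in options) / len(options)
        for task, options in task_services.items()
    }
    sub_budgets = distribute_budget(avg_costs, budget)
    schedule = {}
    for task, options in task_services.items():
        affordable = [o for o in options if o[1] <= sub_budgets[task]]
        if affordable:
            # Tuples compare by processing time first, so min() is fastest.
            schedule[task] = min(affordable)
        else:
            # Assumed fallback: cheapest option when nothing fits.
            schedule[task] = min(options, key=lambda o: o[1])
    return schedule

table_one = [(1200, 300), (600, 600), (400, 900), (300, 1200)]
tasks = {name: table_one for name in ["T0", "T1", "T2", "T3"]}

plan = greedy_time(tasks, budget=3000)
print(plan)
```

With a budget of G$3000 over four identical tasks, each sub-budget is G$750, so every task is placed on the 600-second service costing G$600.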
Related Work
Time optimization algorithms
- Min-Min: vGrADS, Pegasus
- HEFT: ASKALON
- GRASP: Pegasus
- Simulated Annealing: ICENI
- Genetic Algorithms: ASKALON
Genetic algorithms in multiprocessor systems
Heuristics
- E. Tsiakkouri et al., "Scheduling Workflows with Budget Constraints", the CoreGRID Workshop on Integrated Research in Grid Computing, Nov. 28-30, 2005.
Conclusion and Future Work
Budget constrained workflow scheduling
- Minimize execution time while meeting the user's budget
Genetic algorithms
- Fitness function
- Crossover and mutation
Future work
- Different negotiation models
- Run-time rescheduling
- Other QoS constraints
Thank you. Any questions?