
Evaluating Provider Reliability in Risk-aware Grid Brokering
Iain Gourlay

Outline
• AssessGrid background
• Problem Statement
• Basic Reliability
• Analysis of behaviour
• Stationarity Problem
• Weighted Reliability
• Simulations and Results
• What if a provider is unreliable?
• Alternative: Bayesian Inference
• Summary and Conclusions

AssessGrid Background
• AssessGrid addresses Risk Management in the Grid.
• This is a necessity in the drive towards commercialisation of Grid technology…
  - The goal is to move beyond best-effort, using Service Level Agreements (SLAs) to specify an agreed-upon level of service. However,
  - For resource providers, offering an SLA with service guarantees and penalties is a business risk!
  - For end-users, agreeing to an SLA is a business risk!
• A large part of AssessGrid is concerned with supporting providers with tools and methods to:
  - Monitor and collect useful data.
  - Assess the risk associated with accepting an SLA request, based on this data.

What is risk?
• Risk is “Hazard, danger, exposure to mischance or peril” (Oxford English Dictionary).
• Risk Management is a discipline that addresses the possibility that future events may cause adverse effects.
  - Economics, Operations Research, Engineering, Gambling, …
• In Risk Management, risk is quantified with two parameters:
  Risk = Probability of Occurrence × Impact
• Grid computing: the event is SLA failure!

Scenario
[Figure: scenario]

Role of the Broker
• Key role: finding and negotiating with providers on behalf of end-users.
• Broker can also act as an independent party:
  - Providers may have motivation to lie!
  - Providers may have unidentified problems in their infrastructure.
• Here, we assume the broker is independent and honest.
• Broker can give a second opinion on risk assessments.
• Broker can agree its own SLAs (virtual provider).

Problem statement: What do we mean by reliability?
• A provider makes an SLA offer:
  - This includes an estimate of the Probability of Failure (PoF).
• Each time an offer is accepted, the details are stored in a database, including:
  - Final status (Success/Fail)
  - Offered PoF
• The problem is: given a provider’s past data, can their risk assessments be considered reliable?

What is reliable?
• Considering only systematic errors!
• Assume s SLAs in the database for the same provider.
  - Offered PoFs: $p_i$, $i = 0, \dots, s-1$
• Assume number of fails ~ $N(\mu_F, \sigma_F^2)$
• We define a reliable provider as one that does not systematically underestimate or overestimate the PoF, so that:
  $(1-\alpha) F_{pred} \le \mu_F \le (1+\alpha) F_{pred}$

Is it normal?
[Figure: Normal vs Binomial, p=0.05, n=50. x-axis: number of fails; y-axis: f(number of fails); series: Binomial, Normal.]

Is it normal? (2)
[Figure: Normal vs Binomial, p=0.05, n=100. x-axis: number of fails; y-axis: f(number of fails); series: Binomial, Normal.]

Basic Reliability: Identifying Systematic Errors
Using the provider’s offered PoFs:
  $F_{pred} = \sum_{i=0}^{s-1} p_i$, $\sigma_0 = \sqrt{\sum_{i=0}^{s-1} p_i (1 - p_i)}$
The evaluation is based on the following measure:
  $R = \frac{F_{pred} - \mathrm{Fails}}{\sigma_0}$

Basic Reliability: Identifying Systematic Errors (2)
[Figure: Distribution of R for Reliable Providers. x-axis: R; y-axis: f(R); curves: Perfect Provider, Overestimates PoF, Underestimates PoF.]

Basic Reliability: Identifying Systematic Errors (3)
We note that
  $R \sim N\left(\frac{F_{pred} - \mu_F}{\sigma_0}, \frac{\sigma_F^2}{\sigma_0^2}\right)$
and recall the condition,
  $(1-\alpha) F_{pred} \le \mu_F \le (1+\alpha) F_{pred}$
leading to
  $|R_{observed}| \le \frac{\alpha F_{pred}}{\sigma_0} + z \frac{\sigma_F}{\sigma_0}$

Analysis: How does the measure behave?
Simple Example:
• m SLAs in database.
• Offered PoF is constant, p.
• There is a systematic overestimation/underestimation of the PoF, such that:
  $p_{fail} = p_1$

Analysis (2)
[Figure: probability density and cumulative probability of R, with 95.35% marked.]
[Equation: $P_{unreliable}$ in terms of $z$, $\sigma_0 = \sqrt{mp(1-p)}$ and $\sigma_F = \sqrt{m p_1 (1 - p_1)}$; the full expression is not recoverable from the extracted text.]

Stationarity Problem
• Conditions are not static!
  - Example: 60 red balls in a bag. 40 blue balls in the same bag.
    You try to estimate the number of red balls by taking a ball out and replacing it, repeating this 50 times. Someone is secretly removing a red ball and replacing it with a blue one after every sample.
    E(red) ≈ 17.5 draws out of 50, yet the number of reds left in the bag is 10!

Stationarity Problem (2)
A provider’s behaviour could change as a consequence of a variety of factors, e.g.
• A provider’s infrastructure is updated.
• A provider’s risk assessment methodology or model parameterisation may change.
• A provider’s policy may change, for example due to economic considerations.

Weighted Reliability
• Use a weighted average, ensuring more recent SLAs have a larger influence.
• A total of mk SLAs is split into k categories, with the kth consisting of the most recent SLAs.
  $R_W = \frac{\sum_{i=1}^{k} \omega_i R_{m_i}}{\sum_{j=1}^{k} \omega_j}$
Here, $R_{m_i}$ is the basic measure R over the ith category, and the weights satisfy $\omega_i > \omega_{i-1}$.

Simulations
• A database of SLAs is generated:
  - Each SLA object has an offered PoF, true PoF and final status.
• Reliability is computed.
• The process is repeated 10000 times for each scenario.
• Simple case considered here:
  - Offered PoF is fixed and true PoF is fixed.
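
The simulation procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the AssessGrid implementation: the function names, the default of 1000 runs, and the acceptance rule |R| ≤ αF_pred/σ0 + z (which assumes σ_F ≈ σ0) are assumptions made here for the sake of the example.

```python
import math
import random


def basic_reliability(offered_pofs, fails):
    """Basic measure R = (F_pred - fails) / sigma_0, from offered PoFs only."""
    f_pred = sum(offered_pofs)
    sigma_0 = math.sqrt(sum(p * (1.0 - p) for p in offered_pofs))
    return (f_pred - fails) / sigma_0


def found_unreliable(offered_pofs, fails, alpha=0.1, z=2.0):
    """Flag the provider when |R| exceeds alpha * F_pred / sigma_0 + z.

    Assumption: sigma_F is approximated by sigma_0, so the z term is
    not rescaled by sigma_F / sigma_0.
    """
    f_pred = sum(offered_pofs)
    sigma_0 = math.sqrt(sum(p * (1.0 - p) for p in offered_pofs))
    threshold = alpha * f_pred / sigma_0 + z
    return abs(basic_reliability(offered_pofs, fails)) > threshold


def simulate(n_slas, offered_pof, true_pof, runs=1000, alpha=0.1, z=2.0, seed=0):
    """Estimate Prob(found unreliable) for fixed offered and true PoF."""
    rng = random.Random(seed)  # seeded so repeated calls give the same estimate
    offered = [offered_pof] * n_slas
    flagged = 0
    for _ in range(runs):
        # Each SLA fails independently with the true PoF.
        fails = sum(rng.random() < true_pof for _ in range(n_slas))
        if found_unreliable(offered, fails, alpha, z):
            flagged += 1
    return flagged / runs


if __name__ == "__main__":
    print("perfect provider:   ", simulate(2000, 0.05, 0.05, runs=200))
    print("unreliable provider:", simulate(2000, 0.05, 0.10, runs=200))
```

With a fixed offered PoF, a provider whose true PoF matches the offer should be flagged only rarely, while one whose true PoF is double the offer should be flagged almost always once enough SLAs have accumulated.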
Results
[Figure: "Perfect" Provider, PoF=0.05, alpha=0.1, z=2. x-axis: Number of SLAs; y-axis: Prob(found unreliable); series: Simulation, Analytic.]

Results (2)
[Figure: Unreliable Provider, offered PoF=0.05, true PoF=0.07, alpha=0.1, z=2. x-axis: Number of SLAs; y-axis: Prob(found unreliable); series: Simulation, Analytic.]

Results (3)
[Figure: Unreliable Provider, offered PoF=0.05, true PoF=0.03, alpha=0.1, z=2. x-axis: Number of SLAs; y-axis: Prob(found unreliable); series: Weighted, Basic, Moving Average.]

Results (4)
[Figure: Provider unreliable > 500 jobs: true PoF = 2 × offered PoF (= 0.05). x-axis: Number of Jobs; y-axis: Prob(unreliable); series: Basic, Moving Average, Weighted.]

Results (5)
[Figure: Provider unreliable > 500 jobs: true PoF = 1.5 × offered PoF (= 0.05). x-axis: Number of Jobs; y-axis: Prob(unreliable); series: Basic, Moving Average, Weighted.]

What if the provider is unreliable?
• Discrete approximation: when an SLA offer is received with an offered PoF of p, estimate the PoF by looking at the failure rate for all SLAs with an offered PoF of ~p. Then:
    if |reliability measure| < threshold:
        believe provider
    else:
        PoF estimate = numFails(PoF ~ p) / numSLAs(PoF ~ p)
• Use all SLAs with an offered PoF within x% of the offered PoF in the current SLA.

Weighted Average risk assessment
• Split km SLAs into k categories.
• Compute the estimated PoF, $p_i$, for each category, i = 0, …, k−1.
  $p_{br} = \frac{\sum_{i=0}^{k-1} \omega_i p_i}{\sum_{j=0}^{k-1} \omega_j}$

Never Trust Doctors
• You are tested for a disease, which 2% of the population has.
• The test never gives a false negative.
• If you are clear, there is still a 5% chance of a false positive.
• You test positive.
• What is the probability you have the disease?

Alternative Approach: Bayesian Inference
• The provider offers a linguistic risk assessment, e.g.
  the failure probability is:
  - “extremely low”: <1%
  - “very low”: 1-5%
  - “low”: 5-10%
  - “medium”: 10-20%
  - “high”: 20-30%
  - “very high”: 30-50%
  - “extremely high”: >50%
• If the broker/end-user requests the exact PoF value, this can be provided.

Alternative Approach: Bayesian Inference (2)
• The broker does not consider the provider’s reliability directly. Instead it takes the following approach:
  - Having received a linguistic risk assessment for a new SLA, the broker first computes a prior distribution for the PoF, given the linguistic category, by considering data across all other providers.
  - The broker computes a posterior distribution, based on the failure rate observed in past SLAs from the same provider with the same linguistic risk assessment.
  - The broker returns an object which contains:
    • (PoF_broker, confidence)

Alternative Approach: Bayesian Inference (3)
[Equation: expression for E(p) in terms of $v_i$, $v_{curr}$, $s_i$, $s_{curr}$ and N; not recoverable from the extracted text.]

Summary/Conclusions
• A detailed analysis has been carried out for a method to identify providers who are systematically unreliable.
• The stationarity problem has been addressed.
  - Weighted Average
  - Results indicate good performance relative to the basic measure and the moving average.
• This can be extended to other measures for “non-systematic” errors.
• A Bayesian approach has been considered and is also promising.
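
The question posed on the “Never Trust Doctors” slide can be worked through with Bayes’ rule, which is the same prior-to-posterior update the Bayesian broker relies on. The sketch below is illustrative only; the function and parameter names are chosen here and do not come from the slides.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' rule."""
    # Total probability of a positive test: true positives + false positives.
    evidence = prior * sensitivity + (1.0 - prior) * false_positive_rate
    return prior * sensitivity / evidence


# 2% prevalence, no false negatives (sensitivity 1.0), 5% false positives.
p = posterior_given_positive(prior=0.02, sensitivity=1.0, false_positive_rate=0.05)
print(f"P(disease | positive) = {p:.3f}")  # prints 0.290
```

Even with a test that never misses the disease, a positive result leaves the probability of having it at about 29%, because false positives among the healthy 98% outnumber the true positives.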
