__________________________________________________________________________
Grid and Cloud Computing
Architecture and Services
_________________________________________________________________________
Mark Erlenmeyer
Software Development & Management, MS Thesis, Rochester Institute of Technology, May 2009
Erlenmeyer 1
ABSTRACT
In this paper, we examine the evolution of grid and cloud computing and their associated services. To do this effectively, we discuss hosting infrastructures separately from services, attempt to clarify their definitions, and finally bring them back together for analysis. That analysis must include a consideration of advanced virtualization and grid computing operations, because these are essential elements of a cloud computing environment. We will show that cloud computing is grid computing offering services at a higher level of abstraction. We will develop the concept of Grid Services, which provide customers with the ability to access and use the resources within a grid computing infrastructure. Platform-as-a-Service (PaaS) is commonly associated with cloud computing; we will show that this association is incorrect, and that PaaS is a service offered within the scope of grid computing. Computing-as-a-Service, Storage-as-a-Service, and Network-as-a-Service are also grid services. The term cloud computing should not focus on the underlying middleware, hardware, storage, or network resources, because those resources are hidden from the consumer of cloud services. Software-as-a-Service (SaaS) is generally associated with cloud computing, and is currently the primary cloud service offering. The conclusion we draw is that the only significant difference between grid and cloud computing pertains to what services are offered and how customers use those services. The difference is one of levels of abstraction, because the basic architectural requirements for grids and clouds are the same.
CONTENTS
ABSTRACT
INTRODUCTION
  Business Climate Driving Cloud Computing and Software-as-a-Service
    Economy
    Need for deployment of solutions quickly
    Operational costs
    Shift in the software industry
    Challenges for small and medium business
    Government regulations
    Technology shifts and application architecture
    Services on the Internet
    Software as a product
    Software as a service
  Virtualization
  Grid Computing
  Cloud Computing
    Virtualization of applications
  Summary
LITERATURE REVIEW
SURVEY OF INDUSTRY SOLUTIONS
DISCUSSION
  Virtualization
    Abstraction and pooling
  Grid Computing
    Virtual Organizations
    Grid Computing Architecture
    Resource Broker
    Resource Discovery in Current Grids
    Centralized Grid Management – Scheduling
    Distributed Data Management
    Performance Forecasting
    Automation
  Grid Services
    Platform as a Service
  Cloud Computing
  Cloud Services
    Software-as-a-Service
  Service Provider Outsourcing
  Grid and Cloud Service provider responsibilities
CONCLUSION
About the Author
  Copyright
  Contact
REFERENCES
INTRODUCTION
Business Climate Driving Cloud Computing and Software-as-a-Service
Economy
The current condition of the global economy (2009) has caused most companies to look for ways to reduce expenses. Revenue and income have dropped, so in order for a business to survive, it must focus all of its resources on its core business objectives, not on IT. Hosting an IT infrastructure in-house is very expensive, and its upkeep often involves significant capital expenditures. The subscription-based Software-as-a-Service (SaaS) model is treated as an expense item in business accounting, rather than as a capital expense (Lamont, 2009). Because SaaS consists of common services, there are significant cost reductions for all tenants who benefit from those services (Nassar & Vridhachalam, 2008). Further, the larger the grid or cloud in which the services are delivered, the lower the cost should be for each customer subscribing to those services.

Need for deployment of solutions quickly
The SaaS deployment model allows a business to have a solution up and running quickly, within minutes in some cases. Custom applications developed and hosted in-house can take a year or more to deploy. Within that timeframe, the requirements, the business climate, or both may have changed significantly. Corporate executives in non-IT organizations, such as banks and insurance companies, want to ensure their focus is on their business goals, not on IT operations and software development.
Despite the advances in Agile software development methodologies, the time it takes to develop and host IT applications that support a core business function can still put a company at risk: market share may be lost while competitors get solutions up and running quickly by taking advantage of SaaS offerings.

Operational costs
Organizations seeking to reduce the cost of computing equipment and network infrastructure are focused on consolidation and on improved efficiency through more effective equipment utilization. Studies have shown that a high percentage of equipment is significantly underutilized (Singh, Korupolu, & Mohapatra, 2008). A primary benefit of equipment and infrastructure consolidation is the reduced physical space requirement. Consolidation can reduce not only the cost of leasing the space, but also operational costs such as energy and other utilities (Weiss, 2007). Having fewer IT assets to manage also enables an organization to be more agile in responding to business requirements.

Shift in the software industry
When companies take advantage of SaaS offerings, their need to purchase commercial software products is reduced or eliminated. This trend has caused companies that develop and sell software products to shift some of their focus to services offered in cloud computing environments that they have constructed. Large companies such as IBM, Oracle, Amazon, and Sun have taken this approach. Smaller software producers are finding it increasingly difficult to compete because they do not have the hosting infrastructure to provide SaaS. In many cases, they may be forced to outsource SaaS hosting to one of the larger companies that has an established cloud computing infrastructure.
Challenges for small and medium business
As small companies expand, their IT requirements become more complex. Medium businesses require many services that they are not likely to be able to support in-house, such as HR, CRM, and online sales. Even if they have an existing hosting infrastructure, it becomes progressively more difficult for them to recruit and manage an IT staff to support the infrastructure and the applications running on it (Boulton, 2008). SaaS offerings are attractive to these small and medium businesses because they allow them to grow their business without having to devote resources to IT. For example, a seemingly simple process such as scanning email for viruses and malware is processor intensive. Small and medium businesses are not likely to have the computing resources necessary to scan all incoming email and data; however, they now have the option of subscribing to or purchasing email services from a SaaS provider (Prince, 2008).

Government regulations
Governments have instituted laws mandating how business information is collected, stored, and made accessible. E-mail, financial information, employee records, confidential corporate information, and customer personal information must be protected. Internal audit reports must be produced on demand. Many small companies do not have the storage capacity to manage and protect that growing body of information (Preimesberger, 2007; govtracc.us, 2007). Platform services such as Storage-as-a-Service allow companies to extend their existing hosting infrastructure without having to devote additional capital.
Technology shifts and application architecture
Over the decades, custom-developed IT solutions that support a company’s core business have become increasingly complex and difficult to support. Legacy applications that were not expected to have long life spans have endured well beyond their projected sunset dates. Many were developed during the 1990s and were not designed to accommodate four-digit year data; these applications had to be replaced or significantly redesigned prior to the year 2000. This gave corporations an opportunity to review their entire enterprise application inventory so that tactical and strategic changes could be made.

As IT architects analyzed legacy applications, they realized that tightly coupled components inhibited their ability to adapt the applications in response to quickly changing business requirements. Loose coupling, by contrast, enabled architects and software designers to adapt individual components rather than modify an entire code base for each revision. Component development delivers functionality through a small number of clearly defined interfaces. This black-box approach allows components to “advertise” their services so that any other component can request those services. It enables the software developer to focus on the functionality a component offers, not on the code behind the interfaces. As the Services Architecture model continued to evolve, standards developed that provide clear direction on how interfaces should be designed and documented (Jacobs, 2005).

Services on the Internet
The services architecture is effective not only within the enterprise’s IT infrastructure: web services offered on the Internet by independent companies such as Yahoo, Amazon, and Google can be leveraged by any web application developer who requires one or more of those services.
Because a web service is generally not an application in itself, IT architects and application designers use web services when constructing applications that are deployed on the web.

Software as a product
Off-the-shelf software products can be purchased and deployed by any IT organization supporting a company; however, as companies focus more and more on their business objectives, IT’s role as an enabler continues to change. It is costly for non-IT companies to operate their own computing environments. It also takes significant time and funding to deploy custom applications, and even off-the-shelf software. Delays and unnecessary cost have led to the loss of marketing and sales opportunities for companies, as well as difficulty in meeting customers’ expectations with regard to support.

Software as a service
Software products have evolved to the point where they can support processing across many servers in an efficient and cost-effective manner. Companies producing these software products have sensed the shift in consumer focus and are now offering their products as services to be accessed through the web. Services based on deployed software can be shared among many customers (or tenants) (Gaw, 2008). Rather than supporting its own hosting infrastructure, a company may request the provisioning of such a service for use on demand. The process may be automated, including registration, authentication, and payment or billing.
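The automated, multi-tenant provisioning described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the class and method names (SharedService, register, use, invoice) and the flat per-unit price are hypothetical and are not drawn from any provider discussed in this paper.

```python
from dataclasses import dataclass, field
import secrets


@dataclass
class Tenant:
    name: str
    token: str             # credential issued at registration
    usage_units: int = 0   # metered use, billed later


@dataclass
class SharedService:
    """One deployed solution shared by many tenants (hypothetical model)."""
    price_per_unit: float
    tenants: dict = field(default_factory=dict)

    def register(self, name: str) -> str:
        """Automated provisioning: create the tenant and return its token."""
        if name in self.tenants:
            raise ValueError(f"tenant {name!r} already registered")
        token = secrets.token_hex(8)
        self.tenants[name] = Tenant(name, token)
        return token

    def use(self, name: str, token: str, units: int = 1) -> None:
        """Authenticate the caller, then meter the request."""
        tenant = self.tenants.get(name)
        if tenant is None or not secrets.compare_digest(tenant.token, token):
            raise PermissionError("unknown tenant or bad token")
        tenant.usage_units += units

    def invoice(self, name: str) -> float:
        """Pay-as-you-go bill for one tenant."""
        return self.tenants[name].usage_units * self.price_per_unit
```

Every tenant calls the same deployed instance; only the registration record and the usage counter are tenant-specific, which is what allows the cost of the solution to be spread across subscribers.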
Virtualization
Much of the deployed IT infrastructure is underutilized, which adds unnecessary capital and operating expense. Virtualization enables IT operations management to dynamically adjust like and unlike resources within a grid computing environment.
A large cluster of servers does not necessarily constitute a grid. In order to manage those resources, virtualization is required so that resources are aggregated, and expanded or pulled back as the computational, storage, or bandwidth requirements of applications change. Managing virtualization is not simply a matter of placing administrators at consoles and having them provision application space and adjust physical resources as needed. Virtualization implies a level of automation, not only with regard to resource sharing, but in provisioning new deployments as well. Virtualization can be performed at two levels:

Server virtualization uses software that allows a server to be logically partitioned so that applications have a dedicated space in which to execute. The partitions and underlying resources are continually balanced, based on the needs of the applications running on the server (How does virtualization work? n.d.).

Pooling of resources is the aggregation of multiple like resources (such as servers), and their on-demand allocation and re-allocation. This allows applications to use resources beyond a single server, based on their current computing requirements, and to have those resources made available on demand.

Our analysis will include the pooling of resources, which is required for grid computing.
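The pooling level can be made concrete with a minimal sketch, assuming an abstract "compute unit" and a simple first-fit placement. The names Server and ResourcePool are hypothetical; real virtualization software balances resources far more dynamically than this.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    capacity: int   # abstract compute units
    used: int = 0

    @property
    def free(self) -> int:
        return self.capacity - self.used


class ResourcePool:
    """Presents several like servers as one allocatable resource."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.grants = {}   # application name -> {server name: units}

    def free_units(self) -> int:
        return sum(s.free for s in self.servers)

    def allocate(self, app: str, units: int) -> dict:
        """Grant `units` to `app`, spilling across servers when needed."""
        if units > self.free_units():
            raise RuntimeError("pool exhausted")
        grant = self.grants.setdefault(app, {})
        remaining = units
        for server in self.servers:
            take = min(server.free, remaining)
            if take:
                server.used += take
                grant[server.name] = grant.get(server.name, 0) + take
                remaining -= take
            if remaining == 0:
                break
        return dict(grant)

    def release(self, app: str) -> None:
        """Return everything held by `app` to the pool."""
        by_name = {s.name: s for s in self.servers}
        for name, units in self.grants.pop(app, {}).items():
            by_name[name].used -= units
```

The application asks the pool for capacity, not a particular server; a request larger than any single server is satisfied by drawing on several, which is the property that distinguishes pooling from server virtualization alone.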
Grid Computing
A company hosting its own IT infrastructure must plan for worst-case scenarios: the risk of outages, or of failure when demand is highest. It must also be prepared for normal spikes in activity; however, much of the time, the hosting infrastructure is underutilized. Other organizations have extreme computing requirements but do not have the computing environment to support them. Collaboration between like organizations has enabled the pooling of their computing resources, resulting in large computing grids to which each organization can have full access.

Grid computing builds on the pooling-of-resources model by adding administrative and operational functions that manage the discovery, dynamic allocation, de-allocation, and balancing of pooled resources; the monitoring and reporting of grid status; and the scheduling and deployment of jobs that are to execute (Berry, Djaoui, Grimshaw, Horn, Maciel, Siebenlist, et al., 2005). When hundreds or thousands of servers are clustered and managed as a single massive computing resource, the processing power available for use is enormous. Mega-computing grids have been created to support scientific and government communities that have significant computing and data storage requirements. Grids are also being used globally for commercial purposes. Large computing grids have even been built using thousands of small home computers (What is the grid? n.d.).

Grids allow customers to pay only for the processing capability, storage, and network bandwidth they use, so customers must be able to use either a small or a significant portion of the available resources. This implies that applications running on the virtualized servers may need to utilize the capability of many of them concurrently.
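As a rough illustration of the scheduling and pay-for-use ideas above, the following sketch spreads one job over the least-loaded nodes and charges only for the node-hours consumed. The even split, the single hourly rate, and the names Node, schedule, and utility_charge are simplifying assumptions made for this example; they do not describe any particular grid scheduler.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float = 0.0   # hours of work already queued on this node


def schedule(job_hours: float, parallelism: int, nodes: list) -> dict:
    """Split a job evenly over the `parallelism` least-loaded nodes."""
    if not 1 <= parallelism <= len(nodes):
        raise ValueError("parallelism must be between 1 and the node count")
    chosen = heapq.nsmallest(parallelism, nodes, key=lambda n: n.load)
    share = job_hours / parallelism
    for node in chosen:
        node.load += share
    return {node.name: share for node in chosen}


def utility_charge(placement: dict, rate_per_hour: float) -> float:
    """Charge only for the node-hours actually consumed."""
    return sum(placement.values()) * rate_per_hour
```

A customer running a small job and a customer occupying most of the grid pay by the same formula; only the number of node-hours differs.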
Cloud Computing
Cloud computing requires that a grid already be established, because the foundation of a cloud computing environment is a grid. Clouds have the same operational requirements as grids, and customer expectations are the same for both with regard to security, data protection, isolation, performance, and availability. A cloud is simply a grid that is used in a more abstract context. Rather than focusing on platform middleware, server hardware, network, and storage resources, cloud computing is the offering of services without exposing the grid and how it is managed. The Open Cloud Manifesto describes the important characteristics of cloud computing; those characteristics are also found in grid computing (Open cloud manifesto, 2009; Jha, Merzky & Fox, n.d.).

Virtualization of applications
Just as physical computing resources can be virtualized and offered as a single resource, applications can be virtualized so that they appear as a single software service to customers. Cuomo describes this concept as “virtualization of applications” (Cuomo, 2008). SaaS is an example of one or more deployed solutions that appear to customers as a single software service. The SaaS offerings that customers receive are generally based on deployed software products, which customers share using a subscription or “pay-as-you-go” model.
Summary
We have considered the change in business climate and the technology landscape that fostered the development of grid and cloud computing. A grid cannot function without advanced virtualization; virtualization of pooled resources is the foundation of grid computing. A cloud cannot exist without a grid as its foundation; therefore, by extension, a cloud must also be supported by virtualization (Raichura & Vayanippetta, 2009).

Server virtualization is concerned with creating virtual servers on one physical device; however, grid computing requires more advanced virtualization. The pooling of resources enables applications to use varying amounts and combinations of resources that extend beyond a single server (Kourpas, 2006).

Grids offer services at the infrastructure level. These services include Platform-as-a-Service, Computing-as-a-Service, Storage-as-a-Service, and Network-as-a-Service. The services are offered using the utility model. Grid resources are accessible only via grid services that expose them as part of their service. Customers may have significant “hands-on” access to the resources they are paying for (Oracle grid computing, 2008).

Cloud and grid computing have similar basic requirements: flexible scaling of resources, resiliency, availability, performance, security, data privacy, etc. What differentiates clouds from grids is how they are used, not how they are constructed. A cloud is a grid used in a more abstract way. SaaS is the primary cloud offering. It provides services on demand in an environment in which a common solution is deployed and used by many different customers under a subscription or pay-as-you-go model. Provisioning is automated and rapidly executed. Customers purchase or subscribe to a service, not simply computing power, storage, or a development environment (Gaw, 2008).

Applying characteristics of grids to clouds blurs their definitions; definitions should be clear and understandable (Weiss, 2007). Grid computing developed many years ago, but the term cloud computing is relatively new, and no formal definition has been given to it. Marketing specialists have taken advantage of the opportunity to use “cloud” as a buzzword in order to promote the idea that they are leading this new application of technology.

For the remainder of our analysis, we define these important terms as follows:

Virtualization:
  Server virtualization: the dynamic division of a single computing platform (server) so that each deployed application can execute in its own dedicated partition or “virtual server.” The virtualization software, or “hypervisor”, manages all access to hardware resources by each partition’s operating system (Crosby & Brown, 2006; How does virtualization work? n.d.).
  Pooling: the aggregation of multiple like hardware devices (such as servers and storage devices), so that they are viewed by applications as a single resource. Advanced virtualization software allows applications to use computing and storage that go beyond a single device. Pooling is required for grid computing to work.

Grid: A fabric of physical devices that is managed through the abstraction and pooling of resources, and supported by internal grid services as defined by the Globus Alliance (Foster, Kesselman & Tuecke, n.d.).

Internal Grid Services: services that support grid computing requirements for resource discovery, performance monitoring, scheduling, performance forecasting, billing and payment, etc.

Grid Computing: the architecture (grid) and internal grid services that form the platform on which external grid services are offered.

External Grid Services: The offering of grid resources as a service that consumers use as a single resource. The services involve computing, platform, storage, and network. The consumer pays for the services using the utility model. Platform-as-a-Service is an example of an external grid service.

Cloud: A grid in which abstracted application services are made available (Cuomo, 2008). A cloud cannot exist without a grid as its foundation. Grids and clouds differ only in the types of services they offer, and how customers access those services.
Software-as-a-Service: The offering of services based on a common solution that is shared among many customers. The services are based on a subscription or a pay-as-you-go model. These services are used by customers who have no need to understand the underlying hosting infrastructure and how it is managed. The operating environment and underlying grid resources are provisioned on demand (Jha, Merzky & Fox, n.d.).

Based on our definitions, we conclude that virtualization, grid, and cloud are terms that apply to infrastructure. Computing-as-a-Service, Platform-as-a-Service, Storage-as-a-Service, and Network-as-a-Service are grid services. Software-as-a-Service is offered as a cloud service. Cloud services make no reference to physical resources or underlying infrastructure. Platform-as-a-Service is incorrectly associated with cloud computing by many companies; it is a grid service by definition.

After considering relevant literature on these topics, we will examine current grid and cloud offerings and the services that are provided through them. A further discussion of virtualization, grids, grid services, clouds, and Software-as-a-Service will follow. The conclusion we draw will support the above definitions of these terms, and clarify what they mean for service providers and their customers.
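The classification that follows from these definitions can be summarized as a small lookup. This is only a restatement of the position argued in this paper; the names Tier, SERVICE_TIER, and exposes_infrastructure are hypothetical, and other authors would assign Platform-as-a-Service differently.

```python
from enum import Enum


class Tier(Enum):
    GRID = "grid service"
    CLOUD = "cloud service"


# Classification as argued in this paper; many vendors place PaaS in the cloud.
SERVICE_TIER = {
    "Computing-as-a-Service": Tier.GRID,
    "Platform-as-a-Service": Tier.GRID,
    "Storage-as-a-Service": Tier.GRID,
    "Network-as-a-Service": Tier.GRID,
    "Software-as-a-Service": Tier.CLOUD,
}


def exposes_infrastructure(service: str) -> bool:
    """Grid services expose underlying resources; cloud services hide them."""
    return SERVICE_TIER[service] is Tier.GRID
```

For example, exposes_infrastructure("Platform-as-a-Service") is true under these definitions, while the same call for "Software-as-a-Service" is false.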
LITERATURE REVIEW
There are many papers and proceedings that discuss SaaS, cloud computing, grid computing, and virtualization. Several of the most useful references are summarized in this section. References that both support and conflict with the definitions and conclusions presented in this paper are reviewed.

Grid vs Cloud
McEvoy and Schulze discuss the limitations of grid computing and how cloud computing addresses those limitations. The authors believe that a flaw in grid computing is that it “exposes too much detail of the underlying implementation, thus making the application development more complex, difficult interoperability and scaling” (Mc Evoy & Schulze, 2008). Rather than being a flaw, this exposure is a characteristic of grid computing. When looking for a solution at a higher and more abstract level, grids are not satisfactory, and that is where cloud computing plays a role. McEvoy and Schulze describe cloud computing as providing “a narrower interface while abstracting away implementation details” (Mc Evoy & Schulze, 2008). This necessarily reduces the flexibility of the operating environment, because generic shared solutions are deployed that have less access to the underlying resources managed in the grid.

Grid
Kourpas describes a grid, not simply as a set of physical resources, but primarily “as the method with which broad sets of resources are accessed and combined”, and “the way in which IT resources dynamically interact to address changing business requirements” (Kourpas, 2006). Kourpas identifies five business areas in which grid services play an important role: 1) Business analytics, 2) Engineering and design, 3) Research and development, 4) Government development, and 5) Enterprise optimization (Kourpas, 2006). Additionally, Kourpas outlines the evolution of grid computing by showing how virtualization capabilities have advanced. An important first generation of virtualization involves “the logical joining of like resources” (Kourpas, 2006). This is effectively an advanced clustering solution. A second generation of virtualization brings together resources of different types such as application servers, storage, databases, and file systems. All of these resources are managed as a single unit through virtualization. The final area in which virtualization has developed involves the bringing together of grids across organizational and company boundaries. Many technologists regard this grid/virtualization architecture as cloud computing. More accurately, it should be referred to as inter-grid (Dias de Assunção, Buyya & Venugopal, n.d.).

Foster, Kesselman, and Tuecke detail the architecture of a grid from the perspective of the Globus Alliance. They break the architecture down into layers that provide specific grid services. These services are offered within the grid to support grid operations, in contrast to external grid services such as Platform-as-a-Service, which are offered to external customers. The Grid Fabric includes the hardware resources that make up the physical layer in the grid architecture. The Connectivity Layer contains the protocols that are used to support communication with physical resources. The Resource Layer adds to the Connectivity Layer other protocols, APIs, and tools for managing user authentication, usage monitoring, accounting, and payments. Above the Resource Layer, the Collective Layer provides the features that are required for grid operations, such as directory services, monitoring, and workload allocation (Foster, Kesselman & Tuecke, n.d.). Applications are deployed within the grid according to what is available for a specific Virtual Organization (the group of participants in a grid organization and their collective assets).
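The layering summarized above can be written down as an ordered structure. The sketch below records only the responsibilities named in this review; the identifiers GRID_LAYERS, layer_of, and layers_below are hypothetical, and the list is not a complete rendering of the architecture described by Foster, Kesselman, and Tuecke.

```python
# Layers ordered from the physical bottom of the grid upward, with the
# responsibilities attributed to each in the summary above.
GRID_LAYERS = (
    ("Fabric", ("hardware resources",)),
    ("Connectivity", ("communication protocols",)),
    ("Resource", ("user authentication", "usage monitoring",
                  "accounting and payments")),
    ("Collective", ("directory services", "monitoring",
                    "workload allocation")),
)


def layer_of(responsibility: str) -> str:
    """Return the layer that owns a given responsibility."""
    for name, duties in GRID_LAYERS:
        if responsibility in duties:
            return name
    raise KeyError(responsibility)


def layers_below(layer: str) -> list:
    """Return the layers a given layer builds on, bottom first."""
    names = [name for name, _ in GRID_LAYERS]
    return names[:names.index(layer)]
```

Read this way, each layer depends only on those beneath it: layers_below("Resource") yields the Fabric and Connectivity layers, and workload allocation is found in the Collective layer.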
Cloud Computing Jha, Merzky, and Fox also describe clouds as providing a higher-level of abstraction through which services are delivered to the customer. They agree with the premise that the difference between clouds and grids is simply the complexity of the interface through which the services are delivered, and the extent to which underlying resources are revealed. The higher-level cloud interfaces restrict services to off-the-shelf software, deployed as a generic shared platform. The authors present the notion of “Usage Modes” which “describe the usage patterns of the system and the system’s internal properties that support these patterns respectively” (Jha, Merzky & Fox, n.d.). Like McEvoy and Schulze, the authors state that grids are flawed in that they expose too many details of the underlying hosting infrastructure; however, grid services such as Platform-as-a-Service, and Computing-as-a-Service require the customer to understand, to a degree, the underlying hardware, OS, and middleware, if they are to use those resources in their own development activities, or when those resources are used by their custom applications. This is where cloud computing addresses the authors’ perceived flaw – it hides those details from the customer who will use services that are more specific, yet generic, via non-custom software solutions usually involving off-the-shelf software. The authors also argue that “grids are systems which … coordinate resources that are not subject to centralized control” (Jha, Merzky & Fox, n.d.). However, without centralized control, a grid is nothing more than a virtualized cluster of resources -- centralized management is a grid requirement. The authors point to the more flexible interfaces grids have as “mostly programmatic” (Jha, Merzky & Fox, n.d.). This makes them useful to application developers who require Platform-as-a-Service in order to develop and test their solutions, and to
organizations requiring additional computing or storage resources for their own custom applications to use when expected workloads will surpass normal levels.
Blurring the lines
Perilli describes cloud computing in a similar manner to other authors. Her definition focuses on the on-demand, pay-per-use services delivered in a shared infrastructure that is managed as a virtual resource. Although there is significant overlap in the definitions of cloud and grid computing, the author makes the boundary between them even more difficult to discern. Perilli describes a cloud as an “infrastructure for hosting applications or data storage, a development platform, or even an application you can get on-demand…” (Perilli, 2009). This description more appropriately describes grid services because customers should never think of cloud computing in terms of infrastructure, data storage, or as a development platform. Perilli’s paper more accurately describes the benefits of virtualization and grid computing, which is inconsistent with the paper’s title, The Benefits of Virtualization and Cloud Computing (Perilli, 2009). Because the paper was written recently (March 10, 2009), we can see that confusion over the definitions of grid and cloud computing persists. Vaquero et al. attempt to develop a definition of the terms cloud and grid. They define cloud as “a large pool of easily usable and accessible virtualized resources” (such as hardware, development platforms, and/or services). The definition continues to describe how those resources are managed, through virtualization (Vaquero, Rodero-Merino, Caceres & Lindner, 2009). Unfortunately, this definition can easily apply to grids as well. What differentiates clouds from grids is how they are used, not necessarily how they are constructed. Messinger and Piech discuss grid computing, virtualization, and SOA in order to support the concept of an “Application Grid.” While not directly specifying what kinds of applications might
be deployed in such an application grid, the authors include a section that discusses SOA and how processing in a grid environment enables applications to function with the performance and availability required, due to the dynamic allocation and removal of resources even while an application is running (Messinger & Piech, 2009). Messinger and Piech identify four characteristics of application grids (and any grid for that matter) as “clustering, adjusting, metering, and automating” (Messinger & Piech, 2009). Of course, these functions must exist in order to support cloud computing; however, they operate in the underlying grid that supports the cloud and its services.
Standardization
Standards for managing grids are neither fully developed nor universally supported. Underlying technologies may have been standardized, such as Ajax, REST, and Atom; however, these have nothing directly to do with grid or cloud computing. The Open Grid Forum is actively promoting the standardization of grid computing, and draft publications are under development. Participants in the effort include Microsoft, Oracle, HP, IBM, Fujitsu, Nortel, NetApp, SAS, and many others. It is clear that standards are essential, and a sense of urgency is felt due to the speed at which grid computing is proliferating (Berry, Djaoui, Grimshaw, Horn, Maciel, Siebenlist, et al., 2005). The Globus Alliance is a working group of organizations that is developing processes, tools, and standards for global grid architectures. The Globus Toolkit is fairly comprehensive and supports most conceptual requirements for grid computing on a global basis (Globus Toolkit, n.d.). The Toolkit is implemented by a number of commercial suppliers such as Univa UD (Univa UD, n.d.). Other organizations working on standards, tools and processes include the
Open Grid Forum (Open Grid Forum, n.d.), and the Open Science Grid organization that is working on the Open Grid Services Architecture (OGSA) (Open Science Grid, n.d.).
Service Management
IBM has identified a number of value segments important to service management in grid computing environments. These value segments include: 1) discovery of resources, 2) monitoring, 3) security, and 4) integration of business service management as it pertains to support, planning, and lifecycle management (IBM service management software, n.d.). Although these value segments are readily supported in the IT industry, standards for each are not consistent across vendors’ products. In a separate discussion, IBM includes additional requirements for grid computing: 1) an information architecture, and 2) business continuity and resiliency. In order to effectively monitor and manage grid resources, information must be collected and analyzed on-demand, or as an administrative function. The data is collected, stored, and processed by automated virtualization management in the grid. Administrators will use the information to make decisions about how the physical resources in the grid will be managed. Customers who have applications running in the grid will also require information regarding the applications’ performance and availability (IBM infrastructure management, n.d.).
Data Ownership and Protection
Shroff highlights the concern that most customers considering SaaS have with regard to intellectual capital collected and stored within the cloud computing environment. The author identifies “control over data” as the root of the concern, and describes how customers can manage or host data that is used by a SaaS either by having dedicated resources (in the underlying grid), or within the customer’s own hosting infrastructure (Shroff, 2008). Hybrid grids/clouds address this issue by providing SaaS support across the boundary between service
provider and the customer’s internal network. Nanneman shows how this can be done using Informatica technology to architect a solution where customers are able to keep sensitive data on premises while they subscribe to SaaS from a service provider (Nanneman, 2009). Wenger states that many businesses have a critical need for SaaS that provides data protection and retention services that customers may not be able to support in-house (Wenger, 2008).
Utility Computing
Fontecilla identifies the elements of cloud computing as: 1) grid computing, 2) utility computing, and 3) virtualization technologies (Fontecilla, 2009). This supports our premise that cloud computing encompasses grid computing; however, utility computing and virtualization are elements of grid computing, and of cloud computing only by extension.
SURVEY OF INDUSTRY SOLUTIONS
SaaS Showplace
The SaaS Showplace is a directory service that lists over one thousand SaaS providers. Categories of providers include accounting/financial, call center, CRM, e-learning, HR, messaging, and collaboration services. Most of the services listed are truly SaaS; however, some are actually grid services, such as web development platforms. The Showplace is an effective source for SaaS in just about any industry or business application (SaaS Showplace, n.d.).
AutoDesk (SaaS)
AutoDesk has long been regarded as the premier supplier of Computer-Aided-Design (CAD) software. The company now offers a SaaS that provides not only the CAD design platform, but a project management service and collaboration platform as well. Teams that are geographically dispersed are able to manage a design or architecture project from end to end using the service. From the development of a proposal/bid to the operation of a fully completed project, customers will find everything they need using the SaaS rather than spending capital and taking resources away from their core business to run an IT shop (Collaboration project management, n.d.).
Aria CRM (SaaS)
Aria Systems provides a complete customer life-cycle management SaaS. Its CRM platform provides both a SaaS hosted solution and grid-level services, including an API for accessing web services that Aria offers. These services include customer information, billing, communications and payment platforms. Aria’s payment services are also able to interface with third-party systems. Although Aria provides a common CRM platform, user interfaces can be
tailored to meet the needs of its corporate customers who want consistency with the look-and-feel of their own websites, or tailored forms and reports. Aria’s offering is clearly SaaS; however, the use of their API implies that an application deployed elsewhere is using the services through a programmed interface. This more clearly falls into the category of a platform service. Aria makes no mention of grid or cloud computing, and it is not clear what the underlying infrastructure is. We make the case that SaaS can be offered on non-grid/cloud hosting infrastructures (Better Billing, n.d.).
Acclaris (SaaS)
Acclaris provides HR and employee benefits services as either a complete package, or as individual services a la carte. Acclaris claims to provide SaaS; however, the details of their hosted solution do not fit the description of grid computing. HR and benefits inherently require significant database management and customized forms. In addition, there is no dynamic, on-demand provisioning of services. For these reasons, the offering loosely fits the requirements for SaaS, and is more of a hosted solution that provides the services a customer subscribes to (Acclaris, n.d.).
EuroGrid (Grid)
EuroGrid provides classic computational grid services supporting meteorology, biomolecular modeling, research, and computer aided engineering. For each category of service, EuroGrid outlines who the providers of computing power are, and what systems they make available. The mission of EuroGrid is to provide a massive computing environment that is managed as a single platform (EuroGrid, n.d.).
TeraGrid (Grid)
The TeraGrid is a well-known example of a scientific supercomputing grid made possible by funding from the National Science Foundation. There are eleven scientific organizations and universities that provide computing resources to the grid. TeraGrid does not provide higher-level services that would be used for business operations. From an architectural perspective, the TeraGrid is a massive complex of computing resources that are shared and coordinated to support specific projects. The software packages available on the TeraGrid are used by scientists and engineers as a platform to build their own applications that require massive processing power (Coordinated TeraGrid software and services, n.d.). SuperLU is an example of a TeraGrid platform that uses high performance computers in a cluster to solve nonsymmetric systems of linear equations (SuperLU, n.d.). TeraGrid now provides an option for research organizations to build their own “gateways,” which are sets of tools, applications, and data that scientists can use for research and modeling (TeraGrid Science Gateways, n.d.).
Oracle CRM on Demand (SaaS)
As a leader in CRM, Oracle provides a CRM hosted service that loosely fits the description of a SaaS. Like Acclaris’ HR and benefits services, there are significant data requirements that must be addressed before a CRM solution can be provisioned. In addition, a great deal of user-interface customizability is available. This puts Oracle’s CRM on Demand in the category of a tailored SaaS. Because subscription-based payment is available, CRM on Demand is clearly a SaaS once fully provisioned (Oracle announces Oracle® CRM On Demand release 16, 2009).
Citrix - gotomeeting.com (SaaS)
GoToMeeting is a classic example of SaaS. Customers can provision a meeting place online in a matter of minutes. Participants do not have to register to join a meeting. Customers are billed on a pay-as-you-go basis. There are no details regarding the underlying hosting infrastructure – SaaS customers do not have a need to know. GoToMeeting is another example of a SaaS that could be supported by a standard clustered hosting environment (GoToMeeting, n.d.).
IBM (SaaS)
LotusLive is an entry point for a collection of collaboration services such as e-meetings, email, forums, and social bookmarking. Like GoToMeeting, the IBM Sametime Unyte SaaS allows companies of any size to quickly provision an e-meeting space that will accommodate any number of participants in any geography. LotusLive services are true SaaS, based on a standard clustered hosting infrastructure (LotusLive, n.d.).
VMware vCloud (Grid)
As a leader in hardware and software virtualization, VMware is well positioned to provide a powerful grid computing platform as a service. The offering’s name, which includes the term “cloud,” is misleading. vCloud is clearly a grid that customers can leverage for the development and hosting of their own SaaS solutions. vCloud’s computing grid is supported by many partners who provide a variety of hosting services and PaaS (VMware announces vCloud initiative for enterprise-class cloud computing, n.d.).
Amazon EC2 (Grid)
EC2 is a grid environment in which companies can develop applications that they will offer as SaaS, effectively transforming the grid into a cloud for Amazon customers (Amazon Elastic Compute Cloud (Amazon EC2), n.d.). Like VMware’s vCloud, Amazon’s EC2 provides a grid services platform on which developers can design and build web-based applications or SaaS. Developers are able to obtain direct access to the resources their applications are using, which is in contrast to the strict definition of a grid, where all resources are managed as a single entity (either like resources or unlike resources). Amazon’s offering loosely fits the description of grid computing.
3Tera (Grid)
The 3Tera model provides a grid hosting environment in which their customers can deploy their own SaaS. The grid resources are managed as a single entity and customers can use the grid services on a utility basis. The 3Tera grid provides both the development and the hosting infrastructures to support a customer’s SaaS offerings (Cloud computing without compromise, n.d.).
GoGrid (Grid)
GoGrid is clearly a grid in which developers can create web applications and SaaS on the platform of their choice. Like EC2, developers are able to access their platforms directly. GoGrid is presented as a cloud-computing offering, which it clearly is not (Control in the cloud, n.d.).
Sun’s Open Cloud Platform
Sun has entered the cloud computing business with an advanced method of managing “virtual data centers” that can be provisioned for individual customers (Cohen, 2009). Sun’s
definition of the term cloud is a significant departure from the common definitions for cloud and grid computing architectures. While most definitions of the term cloud focus on services delivered through software, Sun has created yet another niche market by providing entire data center solutions within a virtualized grid environment. These virtual data centers include all of the resources that a company would have available in its own in-house hosted data center. This allows customers to deploy their custom applications in a Sun-hosted infrastructure. Sun refers to this offering as SunCloud. Because the virtual data center is intended to be administered by the customer, who can use GUI tools to manage virtual resources, this offering may be appropriately called Platform-as-a-Service. Interestingly, Sun, in its Guide to Getting Started With Cloud Computing, outlines the features commonly included in the definition of cloud computing. The business benefits of cloud computing are mentioned, such as IT efficiency, business agility and reduced capital expenditures. Sun speaks of cloud computing as “the abstraction of computer resources” and then describes the grid architecture that underlies the cloud offering (Jha, Merzky & Fox, n.d.).
DISCUSSION
Virtualization
Virtualization supports grid computing through the pooling and sharing of resources, and the dynamic allocation and reallocation of those resources (Breiter & Gupta, 2008).
Abstraction and pooling
Virtualization must support the pooling of like resources from a variety of hardware suppliers. For example, computing hardware may include brands such as IBM, Sun, and Hewlett-Packard. These may be running different operating systems, such as IBM z/OS and AIX, or Sun OS, Linux, and other forms of the Unix operating system. These servers running unlike operating systems must still be aggregated and made available as a single computing resource. Similarly, storage virtualization must also support the pooling of unlike file systems from different manufacturers. Technology that enables the pooling of computing resources and their dynamic allocation enables an organization to better utilize computing equipment and adjust applications’ computing requirements on-demand. Attaining the correct balance between having too little resource available for all applications and having underutilized resources is difficult to achieve in small computing environments. As resources in large data centers are managed through virtualization, better precision in striking this balance is achieved because the pool of resources and the number of applications being hosted is much larger.
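The pooling and dynamic allocation described above can be illustrated with a minimal sketch. It is not drawn from any virtualization product; the class names (`Server`, `ComputePool`), the host names, and the first-fit allocation policy are assumptions made purely for illustration. The point it shows is that servers with unlike operating systems are presented to the consumer as one pool of CPUs.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str       # hypothetical host name
    os: str         # operating system; hidden from consumers of the pool
    cpus: int       # CPUs contributed to the pool
    used: int = 0   # CPUs currently allocated

    @property
    def free(self) -> int:
        return self.cpus - self.used


@dataclass
class ComputePool:
    servers: list = field(default_factory=list)

    def capacity(self) -> int:
        """Total CPUs in the pool, regardless of vendor or OS."""
        return sum(s.cpus for s in self.servers)

    def free_cpus(self) -> int:
        return sum(s.free for s in self.servers)

    def allocate(self, cpus: int) -> dict:
        """Reserve `cpus` CPUs first-fit across servers; return per-server grants."""
        if cpus > self.free_cpus():
            raise RuntimeError("pool exhausted")
        grants = {}
        for s in self.servers:
            take = min(s.free, cpus)
            if take:
                s.used += take
                grants[s.name] = take
                cpus -= take
            if cpus == 0:
                break
        return grants

    def release(self, grants: dict) -> None:
        """Return previously granted CPUs to the pool."""
        by_name = {s.name: s for s in self.servers}
        for name, n in grants.items():
            by_name[name].used -= n


pool = ComputePool([
    Server("ibm-1", "AIX", 8),
    Server("sun-1", "SunOS", 4),
    Server("hp-1", "Linux", 4),
])
grants = pool.allocate(10)   # spans two unlike servers
```

An application that asks for ten CPUs never learns that the grant spans an AIX host and a SunOS host; releasing the grant returns the capacity to the shared pool for other applications.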
Grid Computing
Virtualization is required for grid computing; however, there is more to grid computing than the logical pooling of like and unlike physical resources. Grid computing implies that there are administrative and operational functions that manage the discovery and dynamic allocation, deallocation, and balancing of pooled resources, and the scheduling and deployment of jobs that are to execute (Berry et al., 2009). Functions for monitoring and reporting grid status are also required (Messinger & Piech, 2009).
Virtual Organizations
When individuals, companies, and education and research institutions come together to share resources for a purpose, a Virtual Organization (VO) is formed (Walker, n.d.). The VO also includes all of the physical resources that are to be shared within the grid. There are many working examples of VOs in operation that are based on the architecture, standards, and services defined by the Globus Alliance (The Globus Alliance, n.d.). Obstacles prevent some providers from completely giving themselves over to full participation in the grid community. The required sharing includes direct access to a participant’s physical computing, network, and data resources, which raises significant security and privacy concerns (Roxburgh, Pawlikowski & McNickle, n.d.; Foster, Kesselman & Tuecke, n.d.). Managing financial and legal relationships between grid participants is also a challenge. By definition, grids are built upon heterogeneous resources that are centrally managed; however, grids that are unlike in their composition can be difficult to manage, sometimes due to differences in ownership and the level of support for standardized interfaces between them (Gibbins & Buyya, n.d.). Despite this, significant progress has been made in the standardization of grid architecture, and the major management functions that support its physical resources – as
these further develop and concerns over security and cost recovery are addressed, the computing power of grids and the number of participants will increase dramatically. Specifications for a complete grid computing architecture and its services are available and continue to be developed by the Globus Alliance. Not all of the tools in the Globus Toolkit are required in order for a computing environment to be considered a grid. The advanced inter-grid forms of provisioning and scheduling functions are not implemented in many cases (Coordinated TeraGrid software and services, n.d.). There are also varying degrees of automation that are supported in established grid environments.
Grid Computing Architecture
The Globus Alliance defines the Open Grid Services Architecture and associated standards for grid computing. As explained by Foster, Kesselman, and Tuecke, the architecture is multilayer, and built upon a standard network “fabric” of physical resources (Foster, Kesselman & Tuecke, n.d.). The “Connectivity Layer” defines the communications protocols necessary to interact with resources in the virtualized Fabric Layer. Above the Connectivity Layer, the “Resource Layer” provides the tools needed for security, monitoring, management, authentication, and financial transactions. The authors describe the “Collective Layer” as providing the components required for the implementation of the required grid functions, such as discovery of resources, scheduling, monitoring, and other services needed for collaboration between resources (Foster, Kesselman & Tuecke, n.d.). Applications are able to access any or all of these layers depending on which services are to be leveraged, as figure 1 shows.
Figure 1: Open Grid Services Architecture
(Diagram layers, top to bottom: Applications, Collective, Resource, Connectivity, Fabric.)
The Globus Toolkit identifies five advanced grid services that may be implemented to support a high level of automation and centralized management:
o Resource discovery
o Centralized management and scheduling
o Performance forecasting
o Monitoring and reporting
o Automated provisioning and reallocation of resources
Resource Broker
As figure 2 shows, at lower levels within the grid architecture, applications may be given direct access to a set of virtualized resources when continuous availability is required, or when access to specific features of a middleware product or OS is needed. When applications are able to operate asynchronously, resources may be assigned as needed by a resource broker that discovers available resources and schedules applications to use them (Foster, Kesselman & Tuecke, n.d.).
Figure 2: Applications access resources directly or through a resource broker
(Diagram layers, top to bottom: Applications; Resource Broker; Virtualization; Processor, Middleware, Disc Storage, Network Bandwidth.)
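The two access paths in figure 2 can be sketched in a few lines. This is an illustrative model only, not the Globus Toolkit API: the `Resource` and `ResourceBroker` classes, their method names, and the first-available matching rule are all assumptions. `bind_direct` models an application that needs continuous access to a specific resource, while `submit` and `schedule` model asynchronous work that the broker places wherever capacity is discovered.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    kind: str               # e.g. "processor" or "storage"
    available: bool = True


class ResourceBroker:
    """Hypothetical broker: discovers free resources and assigns queued jobs."""

    def __init__(self, resources):
        self.resources = list(resources)
        self.queue = []         # jobs waiting for a resource: (job, kind)
        self.assignments = {}   # job -> resource name

    def discover(self, kind):
        """Return the currently available resources of the requested kind."""
        return [r for r in self.resources if r.kind == kind and r.available]

    def bind_direct(self, job, name):
        """Direct access: pin a job to one named resource."""
        for r in self.resources:
            if r.name == name and r.available:
                r.available = False
                self.assignments[job] = r.name
                return r
        raise LookupError(f"{name} is not available")

    def submit(self, job, kind):
        """Brokered access: queue the job until a matching resource is free."""
        self.queue.append((job, kind))

    def schedule(self):
        """Assign queued jobs to discovered resources; unmatched jobs stay queued."""
        waiting = []
        for job, kind in self.queue:
            free = self.discover(kind)
            if free:
                free[0].available = False
                self.assignments[job] = free[0].name
            else:
                waiting.append((job, kind))
        self.queue = waiting
        return dict(self.assignments)


broker = ResourceBroker([
    Resource("cpu-1", "processor"),
    Resource("cpu-2", "processor"),
    Resource("disk-1", "storage"),
])
broker.bind_direct("database", "cpu-1")   # needs continuous availability
broker.submit("batch-a", "processor")     # can run asynchronously
broker.submit("batch-b", "processor")
placed = broker.schedule()
```

In this run the pinned database job holds `cpu-1`, the broker places `batch-a` on `cpu-2`, and `batch-b` remains queued until a processor is released.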
Resource Discovery in Current Grids
In theory, centrally managed grids are able to discover available resources in any interconnected grid, and schedule their use based on customer demand; however, in many cases, the discovery of resources is limited to the grid in which a resource broker is functioning. Two primary components of the Globus Toolkit that enable the required discovery and resource management functions are the Monitoring and Discovery System, and Grid Resource Allocation and Management.
Monitoring and Discovery System (MDS)
The MDS monitors resource usage and publishes the data for access by other services in the grid. The data includes a list of discovered resources and their status, and information about the local job schedulers (Tang & Zhang, 2006). The Globus Toolkit allows resources to be discovered when added to the grid; they are also tracked for performance and availability, and allocated to the appropriate jobs based on the jobs’ priorities (Globus Toolkit, n.d.).
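The publish-and-query role that the MDS plays can be modeled with a small registry. The sketch below is a conceptual stand-in and does not use or reproduce any real MDS interface: the `DiscoveryRegistry` class, its fields, and the staleness rule are assumptions chosen for illustration. Resources publish their status, load, and local scheduler; other grid services query for resources that are up and recently seen.

```python
import time


class DiscoveryRegistry:
    """Hypothetical stand-in for a monitoring and discovery service."""

    def __init__(self, stale_after=60.0):
        self.stale_after = stale_after   # seconds before a report is ignored
        self.reports = {}                # resource name -> latest report

    def publish(self, name, status, load, scheduler, now=None):
        """Record the latest status report for a resource."""
        self.reports[name] = {
            "status": status,        # "up" or "down"
            "load": load,            # utilization between 0.0 and 1.0
            "scheduler": scheduler,  # name of the local job scheduler
            "seen": time.time() if now is None else now,
        }

    def query(self, max_load=1.0, now=None):
        """Names of resources that are up, fresh, and under max_load, least loaded first."""
        now = time.time() if now is None else now
        live = [
            (r["load"], name)
            for name, r in self.reports.items()
            if r["status"] == "up"
            and now - r["seen"] <= self.stale_after
            and r["load"] <= max_load
        ]
        return [name for _, name in sorted(live)]


registry = DiscoveryRegistry(stale_after=60.0)
registry.publish("node-a", "up", 0.7, "pbs", now=100.0)
registry.publish("node-b", "up", 0.2, "lsf", now=100.0)
registry.publish("node-c", "down", 0.0, "pbs", now=100.0)
registry.publish("node-d", "up", 0.1, "pbs", now=0.0)   # report is too old
candidates = registry.query(now=120.0)
```

The explicit `now` argument exists only so the example is deterministic; a scheduler consuming such a registry would typically rely on the default clock.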
Grid Resource Allocation and Management (GRAM)
The GRAM manages the scheduling and submission of jobs to be executed by grid resources. A hierarchy of schedulers is involved. GRAM fills a super-scheduling role and hands off jobs to local schedulers discovered by MDS (Tang & Zhang, 2006). The Globus Toolkit provides software tools that support the management of security, access control, and VO administration. A set of data management tools provides access to shared data using a resource broker. Information service tools participate in the collection of resource data so that the execution tools can broker access to resources through job submission. There are several toolkit distributions designed for specific VOs or projects (Globus Toolkit, n.d.).
Centralized Grid Management – Scheduling
In order for peer-to-peer grid interoperability to be effectively managed, an automated InterGrid job scheduling and resource indexing function must exist. Ranjan, Harwood, and Buyya describe the superscheduling role, which has responsibility for determining what resources are available that can meet the requirements of a job presented for processing. Within individual grids, resource brokers discover and schedule workload (Ranjan, Harwood & Buyya, n.d.).
Distributed Data Management
In some cases, applications have dedicated persistent storage (databases). It is far more common for an organization to build shared databases which many applications will access. When many organizations participate in a VO, databases are spread across many networks in the grid. In an inter-grid arrangement, databases are scattered across the entire collection of grids, as figure 3 shows. Data synchronization, currency, and overall data quality become significant issues, especially when important scientific or governmental computing takes place. This places
a requirement on VO administration to tightly control membership, the resources that are included, what data is stored, where it is located, how it is accessed, and who is authorized to use it. Inter-grid metadata is critical, and the data architecture must be well defined, documented, and understood by all of the participants in the VO. Differences between server hardware, physical storage media, and database management software compound the problems that must be addressed (Pierre, Schütt, Domaschka & Coppola, 2009).
Figure 3: Resource broker management of inter-grid resources
(Diagram labels: Database, Resource Broker, Resource Broker/Scheduler.)
Performance Forecasting
If resources are allowed to go online and offline frequently, it is difficult to effectively estimate performance and availability of resources, and ensure an acceptable quality-of-service. As performance and utilization data is collected for each resource, improved forecasting is possible; however, the enormous amount of data that is captured needs to be stored for further analysis. It is not only a grid management problem; the data management challenges are just as significant (Roxburgh, Pawlikowski & McNickle, n.d.; Pandey & Buyya, n.d.).
Automation
Automated provisioning of resources is required for grid management to perform effectively. It is based on the availability of resources, their record of performance, and the complexity of the jobs that are to be executed. There are additional challenges to address when implementing an automated provisioning process; software product licensing may need to be managed. If computing resources must be provisioned, it is likely that sufficient storage must be made available also. Finally, security and access control must be managed automatically as well (Haynos, 2005; What is grid computing?, 2008).
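To make the link between forecasting and automated provisioning concrete, the following sketch forecasts utilization from collected samples and derives how many servers to add. It is an assumption-laden illustration, not a method taken from the cited sources: exponential smoothing is one common forecasting choice among many, and the function names, the smoothing factor `alpha`, and the `target` utilization are all hypothetical.

```python
import math


def forecast_utilization(samples, alpha=0.5):
    """Exponentially weighted moving average of utilization samples.

    The final smoothed value serves as the forecast for the next interval.
    """
    if not samples:
        raise ValueError("need at least one sample")
    estimate = samples[0]
    for x in samples[1:]:
        estimate = alpha * x + (1 - alpha) * estimate
    return estimate


def servers_to_add(samples, current_servers, target=0.75, alpha=0.5):
    """Servers to provision so that forecast utilization is at or below `target`."""
    predicted = forecast_utilization(samples, alpha)
    demand = predicted * current_servers    # forecast load in server-equivalents
    needed = math.ceil(demand / target)
    return max(0, needed - current_servers)


# Utilization climbing over three intervals on a ten-server pool.
rising = [0.5, 0.75, 1.0]
extra = servers_to_add(rising, current_servers=10)
```

A fuller provisioning process would also have to account for the points raised above, such as licensing, matching storage, and access control, which this sketch deliberately leaves out.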
Grid Services
Grid Services fall into two categories: 1) the services defined by the Globus Alliance for the internal operations inside grids, and between grids, and 2) services offered to external customers who wish to utilize grid resources. Several of the internal grid operations services (resource discovery, scheduling, performance forecasting, monitoring, and reporting) were considered in the previous sections. In this section, we will consider the various services that consumers will require.
Most grid service providers offer services in several contexts and for a variety of purposes (Foster, Kesselman & Tuecke, n.d.). The resources that may be sought by customers include:
o Computational
o Storage
o Network
o Code repositories
o Catalogs
In the following section we will consider the packaged service, Platform-as-a-Service (PaaS), which combines some or all of the above resources. This service is selected because it is generally associated with cloud computing – we will see why it should not be.
Platform as a Service
Similar resources on virtualized grid infrastructures are pooled and managed as a single entity. For example, storage devices are in a storage pool, and server hardware is in a computing pool. At a higher level, pooling takes place at the operating system and middleware levels. Service providers make available pooled resources of a certain type (or a combination of types) to consumers who require a platform upon which they can develop, test, or deploy their applications. PaaS allows customers to request virtualized OS and middleware resources. In this model, the customer may have access to one or more levels in the software stack or infrastructure. Service providers also offer Computing-as-a-Service, Storage-as-a-Service, and Network-as-a-Service.
Figure 4: Offering grid services to consumers
(Diagram layers, top to bottom: Consumer; Computing as a Service, Platform as a Service, Storage as a Service, Network as a Service; Virtualization; Processor, Middleware, Disc Storage, Network Bandwidth.)
Cloud Computing
There is a tremendous amount of interest in cloud computing because it represents the next step in the evolution of computing services. Most of the excitement is driven by marketing professionals who see the potential for increased revenue through the sale of products and services that are “cloud-friendly.” While there is disagreement about how the term cloud should be defined, from an architectural perspective, a cloud is simply a grid on which software is offered as a service. These services are used by customers who have no need to understand the underlying grid infrastructure and how it is managed. A customer purchases services on-demand or through a subscription process. The operating environment and underlying grid resources are provisioned on-demand.
Cloud services are pre-packaged and generally require no special customization or configuration for a prospective customer. Because a customer has no need to manage resources on the underlying grid, they are provided only with a software platform that they use for a specific business function. In most cases, the platform is simply a product that has been deployed in a shared environment that many customers can use as needed. Because the service is based on software, it is referred to as Software-as-a-Service.
Cloud Services
Software-as-a-Service
Cuomo describes SaaS as an “attempt to virtualize the application” (Cuomo, 2008). SaaS providers advertise the services they offer, and do not speak of them as applications. A virtual application environment, or SaaS, allows customers to benefit from the service without being concerned about the applications or products involved in delivering the service. Examples of SaaS include on-demand business functions such as HR and CRM, social computing environments (social bookmarking, forums, chat, blogs), e-meetings and webcasts, email, and document management. By definition, SaaS does not have to be supported in a cloud computing environment. Our analysis has shown that SaaS has no direct reference to underlying infrastructure; however, SaaS is usually discussed in the context of cloud computing.
Figure 5: SaaS offered in a cloud requires grid services as an operational foundation
(Diagram, titled “SaaS in the Cloud”; layers, top to bottom: Software as a Service; Grid; Virtualization; Processor, Middleware, Disc Storage, Network Bandwidth.)
Service Provider Outsourcing
Players in the industry of cloud computing are identified as: 1) cloud service (SaaS) providers, and 2) cloud enablers that provide grid hosting environments in which companies can develop their own SaaS. Vendors providing grid and/or cloud services may not be consistently providing SLA documentation for customers; however, they are more likely to do so when interacting with other vendors. One company will frequently purchase services from another when additional capacity is needed. Figure 6 shows one relationship between vendors. In this example, Vendor B is purchasing platform services (PaaS) from Vendor A so that it can provide SaaS to its own customers.
The relationship between vendors could also be as follows:
o Vendor B purchases computing or storage resources from Vendor A due to planned campaigns which will generate a significant workload increase on its own hosting infrastructure.
o Vendor B leverages Vendor A’s PaaS while it migrates its own software products to additional OS or middleware platforms, or as it develops its own SaaS.
[Figure 6 diagram: a Customer consumes Software as a Service from Vendor B (SaaS); Vendor B in turn draws on Vendor A (Grid Services), which offers Computing as a Service, Platform as a Service, Storage as a Service, and Network as a Service over Virtualization and the underlying resources (Processor, Middleware, Disc Storage, Network Bandwidth)]
Figure 6: In the cloud, a SaaS provider purchases PaaS from a grid service provider
The example shows the interaction between vendors who may share a grid, or may be on separate grids. There is tremendous flexibility in how providers and consumers of services can interact on today’s Internet. As standards for interoperability between grids are developed, more options will become available.
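The vendor relationship in Figure 6 can be illustrated with a small model. This is only a sketch under stated assumptions: the class names `GridProvider` and `SaaSProvider`, and their methods, are hypothetical and do not correspond to any vendor's API. The service names are taken from the figure labels.

```python
from dataclasses import dataclass, field

# Service names copied from the labels in Figure 6.
GRID_SERVICES = {
    "Computing as a Service",
    "Platform as a Service",
    "Storage as a Service",
    "Network as a Service",
}


@dataclass
class GridProvider:
    """Hypothetical vendor exposing grid services over virtualized resources."""
    name: str
    offered: set = field(default_factory=lambda: set(GRID_SERVICES))

    def purchase(self, service: str) -> str:
        # Refuse requests for anything this provider does not sell.
        if service not in self.offered:
            raise ValueError(f"{self.name} does not offer {service}")
        return f"{service} from {self.name}"


@dataclass
class SaaSProvider:
    """Hypothetical vendor that sells SaaS and may buy grid services upstream."""
    name: str
    upstream: list = field(default_factory=list)

    def buy_from(self, provider: GridProvider, service: str) -> None:
        # Record the grid service this vendor depends on.
        self.upstream.append(provider.purchase(service))

    def customer_view(self) -> list:
        # The customer sees only the software service, never the grid below it.
        return [f"Software as a Service from {self.name}"]


vendor_a = GridProvider("Vendor A")
vendor_b = SaaSProvider("Vendor B")
vendor_b.buy_from(vendor_a, "Platform as a Service")

print(vendor_b.customer_view())
print(vendor_b.upstream)
```

The point of the sketch is the asymmetry between the two views: Vendor B knows which grid services it buys, while its customer is exposed only to the SaaS layer.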
Grid and Cloud Service Provider Responsibilities
Service providers are expected to manage security and data privacy within the SaaS environment. Frequently, the guarantee that customers receive with regard to security and privacy takes the form of a Service Level Agreement (SLA) or another form of contract for long-term usage. On-demand SLAs are less comprehensive and usually apply only to the current session that the customer is paying for. When a company arranges to purchase SaaS on a long-term basis, more extensive negotiation is necessary. Customers expect a level of service that enables them to meet their business objectives, whether they are purchasing grid services or SaaS. This includes the protection of their data and the availability and performance of the services they subscribe to. Many businesses are hesitant to subscribe to services for functions that are mission-critical or where poor performance or availability is unacceptable. Part of the challenge is the inability of customers to directly monitor the services they purchase (Broberg, Venugopal & Buyya, 2007).
Features that should be supported in any SLA, and which are common to both grid and cloud computing environments, are:
o Automatic allocation and de-allocation of resources (CPU, storage, network bandwidth, etc.)
o Manage the performance and availability of services by monitoring network bandwidth, storage, and processor utilization.
o Provide a process for developing legal agreements (contracts, SLAs, etc.) between the service provider and its customers, especially when sensitive personal or confidential business information is being handled. Identify and document accountability if confidential information is compromised.
o Regularly audit security performance and provide reports to customers.
Manage grid resources so that acceptable quality of service (QoS) is maintained, performance and availability are ensured, and the offering is scalable.
o Ensure that capacity is available for all customers who require additional on-demand resources
o Provide disaster recovery support if required by customers
o Ensure the portability of customer applications and/or data
o Provide billing and payment functions that support pay-as-you-go, utility, and subscription-based resource usage
o Implement a problem management process so that customer complaints are handled quickly
o Provide a process for customers to obtain support through a help desk or other support interfaces
o Provide usage reporting and/or site analytics so customers can analyze the value of the services they purchase (Wong, 2008).
Manage security and data privacy, so that confidential data is not stolen or shared inadvertently, through:
o Physical access control
o Logical access control
o Physical storage location of data and the movement of the data outside of that location
o Management of shared databases and file systems
o Ensuring and managing data privacy on all interfaces presented to customers via the services it offers
o Ensuring the protection of personal data in all virtualized platforms (server, storage, network) (Lindquist & Tapio, 2008; Collier, Plassman & Pegah, 2007)
o Ensuring network security via firewalls and intrusion detection methods
o Ensuring anti-virus scanning is performed on all vulnerable resources
o Use of encryption to protect data while stored or transmitted
o Secure physical and logical access to network resources
o Isolation of clients in virtualized server partitions
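Three of the SLA features listed above (automatic allocation and de-allocation of resources, availability management, and pay-as-you-go billing) lend themselves to a short illustration. The following is a minimal sketch, not a description of any provider's implementation: `ResourcePool`, `pay_as_you_go_charge`, `meets_availability`, the rate, and the availability target are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Hypothetical shared pool of one resource type (e.g. CPU cores)."""
    capacity: int
    allocated: dict = field(default_factory=dict)  # customer -> units held

    def available(self) -> int:
        return self.capacity - sum(self.allocated.values())

    def allocate(self, customer: str, units: int) -> bool:
        """Grant units on demand; refuse if the pool cannot cover them."""
        if units <= 0 or units > self.available():
            return False
        self.allocated[customer] = self.allocated.get(customer, 0) + units
        return True

    def release(self, customer: str, units: int) -> None:
        """Return units to the pool when the customer's workload shrinks."""
        remaining = max(self.allocated.get(customer, 0) - units, 0)
        if remaining:
            self.allocated[customer] = remaining
        else:
            self.allocated.pop(customer, None)


def pay_as_you_go_charge(unit_hours: float, rate_per_unit_hour: float) -> float:
    """Utility-style billing: the charge depends only on metered usage."""
    return round(unit_hours * rate_per_unit_hour, 2)


def meets_availability(uptime_minutes: int, total_minutes: int, target: float) -> bool:
    """Compare measured availability with the target written into the SLA."""
    return uptime_minutes / total_minutes >= target


pool = ResourcePool(capacity=16)
first = pool.allocate("customer-1", 10)   # granted: 16 units are free
second = pool.allocate("customer-2", 8)   # refused: only 6 units are free
pool.release("customer-1", 4)             # customer-1 scales down to 6 units
third = pool.allocate("customer-2", 8)    # granted: 10 units are free

charge = pay_as_you_go_charge(unit_hours=8 * 3, rate_per_unit_hour=0.10)
sla_ok = meets_availability(uptime_minutes=43170, total_minutes=43200, target=0.999)
print(first, second, third, pool.available(), charge, sla_ok)
```

A subscription-based plan would replace the metered charge with a flat fee per period; the allocation logic would be unchanged, which is consistent with the observation that these features are common to both grid and cloud environments.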
CONCLUSION
It is difficult to talk about grid computing, virtualization, cloud computing, and Software-as-a-Service in separate contexts because they are so closely connected. Because some of these terms are relatively new, their definitions have not been agreed upon. Complicating the matter, many technology companies are using cloud computing and SaaS as buzzwords in their marketing efforts. As we have seen, some companies completely misapply them as a result of their efforts to define the terms in their own way. Grid computing is not new technology and has been in use for many years. Cloud computing is a term that has overtaken grid, even though the foundation for cloud computing is the grid. Through our analysis, we have determined that:
o A grid cannot function without advanced pooling of resources, in combination with internal grid services such as resource discovery, performance monitoring, scheduling, job submission, and performance forecasting.
o A cloud cannot exist without a grid as its foundation -- in reality, they are the same. Current definitions of cloud computing highlight the invisibility of the underlying grid infrastructure, and the inability of customers to deploy their own applications in the cloud. So, grid and cloud differ only in the types of services they offer, and how customers access those services.
o By definition, Software-as-a-Service can be offered regardless of what its underlying infrastructure and resources are. This allows for the option of SaaS being offered on a standard clustered hosting environment. In most instances, SaaS is offered on a cloud computing platform.
As the industry moves toward the development of solid standards for grids and clouds and the services they offer, there will continue to be disagreement on the definitions of the terms. Because the infrastructure in these computing environments is basically the same, the differences lie in how they are used. Hopefully, those who have the privilege of developing the standards will focus on that aspect of grid and cloud computing.
About the Author
Mark Erlenmeyer has been an IT professional for over eighteen years in various roles including software engineering, solution design, and enterprise architecture. He is currently a senior IT architect at IBM Corporation. Mr. Erlenmeyer’s areas of expertise include human resources IT, service center development, and manufacturing logistics systems. Mr. Erlenmeyer received a Bachelor of Science degree from the Rochester Institute of Technology (RIT), with majors in telecommunications and computer science. This paper was developed as his MS thesis which enabled him to receive a Master of Science degree from RIT. His area of study is software development and management.
Copyright
This paper is copyrighted and may be used as a reference, but not reproduced without express written approval by the author.
Contact
Mr. Erlenmeyer can be contacted via email: merlenmeyer@nc.rr.com
REFERENCES
Acclaris. (n.d.). Retrieved May 18, 2009, from http://www.acclaris.com/
Amazon Elastic Compute Cloud (Amazon EC2). (n.d.). Retrieved May 18, 2009, from Amazon Web site: http://aws.amazon.com/ec2/
Berry, D., Djaoui, A., Grimshaw, A., Horn, B., Maciel, F., Siebenlist, F., et al. (2005, January 29). The open grid services architecture, version 1.0. Retrieved May 18, 2009, from Grid Forum Web site: http://www.gridforum.org/documents/GWD-I-E/GFD-I.030.pdf
Better Billing. (n.d.). Retrieved May 18, 2009, from Aria Systems Web site: http://www.ariasystems.com/product
Boulton, C. (2008, May 5). Lotus chief picks on the cloud. eWeek, 25(14), 6-6. Retrieved March 17, 2009, from Academic Search Complete database.
Breiter, G., & Gupta, P. (2008). Cloud computing and virtualization. IBM.
Broberg, J., Venugopal, S., & Buyya, R. (2007, August 4). Market-oriented grids and utility computing: The state-of-the-art and future directions. Retrieved May 18, 2009, from Univ of Melbourne Web site: http://www.gridbus.org/reports/MarketGridUtilitySurvey2007.pdf
Cloud computing without compromise. (n.d.). Retrieved May 18, 2009, from 3tera Web site: http://www.3tera.com/?_kk=vcloud&_kt=100c365e-3c8e-4a87-9a75-d8b8b42817e8&gclid=CJS0u-uS8ZkCFR0Sagodg1tzRw
Cohen, R. (2009, April 14). Sun announces open cloud platform & API. Retrieved May 18, 2009, from Ulitzer Web site: http://cloudcomputing.ulitzer.com/node/883729
Collaborative Project Management. (n.d.). Retrieved May 18, 2009, from Autodesk Web site: http://usa.autodesk.com/adsk/servlet/index?siteID=123112&id=9682500
Collier, G., Plassman, D., & Pegah, M. (2007). Virtualization’s next frontier: Security. In Conference on User Services (pp. 34-36). ACM. Retrieved May 17, 2009, from ACM Digital Library database.
Control in the cloud. (n.d.). Retrieved May 18, 2009, from GOGRID Web site: http://www.gogrid.com/
Coordinated TeraGrid software and services. (n.d.). Retrieved May 18, 2009, from TeraGrid Web site: http://www.teragrid.org/userinfo/software/ctss.php
Crosby, S., & Brown, D. (2006). The virtualization reality. Computer Architecture, 4(10), 31-41. Retrieved May 17, 2009, from ACM Digital Library database.
Cuomo, G. (2008, April 4). Rainmaking. Message posted to http://www.ibm.com/developerworks/blogs/page/gcuomo?entry=rainmaking
Dias de Assunção, M., Buyya, R., & Venugopal, S. (n.d.). InterGrid: A case for internetworking islands of grids. Retrieved May 18, 2009, from Univ of Melbourne Web site: http://www.gridbus.org/papers/InterGrid.pdf
EuroGrid. (n.d.). Retrieved May 18, 2009, from http://www.eurogrid.org/
Fontecilla, R. (2009, April). Cloud computing: A transition methodology. Cloud Computing, 3(2), 14. Retrieved May 17, 2009, from Sys Con Web site: http://www2.sys-con.com/cloud/pdf/CCJournal_2-2_spread.pdf
Foster, I., Kesselman, C., & Tuecke, S. (n.d.). The anatomy of the grid. Retrieved May 18, 2009, from Globus Alliance Web site: http://www-unix.globus.org/alliance/publications/papers/anatomy.pdf
Gaw, P. (2008, July 25). What’s the difference between cloud computing and SaaS? Retrieved May 18, 2009, from Web 2.0 Journal Web site: http://web2.sys-con.com/node/612033
Gibbins, H., & Buyya, R. (n.d.). Gridscape II: An extensible grid monitoring portal architecture and its integration with Google Maps. Retrieved May 17, 2009, from Univ of Melbourne Web site: http://www.gridbus.org/papers/GridscapeII-IJPEDS-Journal.pdf
Globus Alliance. (n.d.). Retrieved May 18, 2009, from The Globus Alliance Web site: http://www.globus.org/
Globus Toolkit. (n.d.). Retrieved May 18, 2009, from http://www.globus.org/toolkit/
GoToMeeting. (n.d.). Retrieved May 18, 2009, from Citrix Web site: https://www2.gotomeeting.com/?Portal=gotomeeting.com
GovTrack.us. S. 495--110th Congress (2007): Personal Data Privacy and Security Act of 2007, GovTrack.us (database of federal legislation) (accessed May 18, 2009).
Haynos, M. (2005, January 11). Perspectives on grid: Using automation effectively within a grid infrastructure. Retrieved May 18, 2009, from IBM Web site: http://www.grid.org/system/files/gr-automation-ltr.pdf
How does virtualization work? (n.d.). Virtualization basics. Retrieved May 18, 2009, from VMware Web site: http://vmware.com/technology/virtualization.html
IBM infrastructure management. (n.d.). Retrieved May 18, 2009, from http://www-03.ibm.com/systems/dynamicinfrastructure/solutions/?met=contentspace
IBM service management software. (n.d.). Retrieved May 18, 2009, from https://www-01.ibm.com/software/tivoli/solutions/it-service-management/
Jacobs, D. (2005, August). Enterprise software as a service. Enterprise Distributed Computing, 3(6), 36-42. Retrieved May 17, 2009, from ACM Digital Library database.
Jha, S., Merzky, A., & Fox, G. (n.d.). Using clouds to provide grids higher-levels of abstraction and explicit support for usage modes. Retrieved May 18, 2009, from Open Grid Forum Web site: http://ogf.org/OGF_Special_Issue/cloud-grid-saga.pdf
Kourpas, E. (2006, June). Grid computing: Past, present and future. Retrieved May 18, 2009, from IBM Web site: http://www-03.ibm.com/grid/pdf/innovperspective.pdf
Lamont, J. (2009, January). SaaS: flexible, efficient & affordable. KM World, 18(1), 10-11. Retrieved May 18, 2009, from Academic Search Complete database.
Lindquist, J., & Tapio, J.-M. (2008). Protecting privacy with protocol stack virtualization. In Workshop on Privacy in the Electronic Society (pp. 65-74). ACM. Retrieved May 17, 2009, from ACM Digital Library database.
LotusLive. (n.d.). Retrieved May 18, 2009, from IBM Web site: http://lotuslive.com
Mc Evoy, G., & Schulze, B. (2008). Using clouds to address grid limitations. In Workshop on Middleware for Grid Computing (article 11). Retrieved May 17, 2009, from ACM Digital Library database.
McLaughlin, B. D., Sr. (2009, March 31). Navigate the cloud computing labyrinth. In Web development. Retrieved May 18, 2009, from IBM Web site: http://www.ibm.com/developerworks/web/library/wa-cloudflavor/index.html?ca=dgr-jw22CCLabyrinth&S_TACT=105AGX59&S_CMP=grsitejw22
Messinger, A., & Piech, M. (2009, April 11). Why an application grid? Retrieved May 18, 2009, from Sys Con Web site: http://soa.sys-con.com/node/905532
Nanneman, D. (2009, January). SaaS INTEGRATION SOLUTIONS: Is IT keeping up with your SaaS Integration needs? CRM Magazine, 13(1), 9-9. Retrieved May 18, 2009, from Academic Search Complete database.
Nassar, T., & Vridhachalam, M. (n.d.). Software as a Service: Build a web-delivered SaaS framework for forms and workflow-driven applications. Retrieved December 9, 2008, from IBM developerWorks Web site: http://download.boulder.ibm.com/ibmdl/pub/software/dw/architecture/ar-saasframe/ar-saasframe-pdf.pdf
Open cloud manifesto. (2009). Retrieved May 18, 2009, from http://www.opencloudmanifesto.org/opencloudmanifesto1.htm
Open Grid Forum. (n.d.). Retrieved May 18, 2009, from http://ogf.org/
Open Science Grid. (n.d.). Retrieved May 18, 2009, from http://www.opensciencegrid.org/
Oracle announces Oracle® CRM On Demand release 16. (n.d.). Retrieved May 18, 2009, from Oracle Web site: http://www.oracle.com/us/corporate/press/017837_EN
Oracle grid computing. (2008, May). Retrieved May 18, 2009, from Oracle Web site: http://docs.oraclewhitepapers.com/oraclewhitepapers/oracle-grid-computing-322/?sub_id=DP4y6TrHHgycA
Pandey, S., & Buyya, R. (n.d.). Scheduling of scientific workflows on data grids. Retrieved May 18, 2009, from Univ of Melbourne Web site: http://www.gridbus.org/papers//WorkflowDataGrids-TCSC-DocSymp2008.pdf
Perilli, W. (2009, March 10). The benefits of virtualization and cloud computing. Retrieved May 18, 2009, from Sys Con Web site: http://cloudcomputing.sys-con.com/node/870217
Pierre, G., Schütt, T., Domaschka, J., & Coppola, M. (2009). Highly available and scalable grid services. In Workshop on Dependable Distributed Data Management (pp. 18-20). ACM. Retrieved May 19, 2009, from ACM Digital Library database.
Preimesberger, C. (2007, November 27). Saving the data. In Data storage. Retrieved May 17, 2009, from eWeek Web site: http://www.eweek.com/c/a/Data-Storage/Saving-the-Data/
Prince, B. (2008, June 9). Malware beware. eWeek, 25(18), 29-29. Retrieved March 17, 2009, from Academic Search Complete database.
Raichura, B., & Vayanippetta, V. (2009, April). ISV strategy for revenue & customer growth online software distribution store on the cloud. Cloud Computing Journal, 2(2), 5. Retrieved May 17, 2009, from Sys Con Web site: http://www2.sys-con.com/cloud/pdf/CCJournal_2-2_spread.pdf
Ranjan, R., Harwood, A., & Buyya, R. (n.d.). Peer-to-peer-based resource discovery in global grids: A tutorial. Retrieved May 17, 2009, from Univ of Melbourne Web site: http://www.gridbus.org/papers/p2p-grid-resource-discovery2008.pdf
Roxburgh, A., Pawlikowski, K., & McNickle, D. C. (n.d.). Grid computing: The current state and future trends. Retrieved May 17, 2009, from Univ of Canterbury Web site: http://nzcsrsc08.canterbury.ac.nz/research/reports/TechReps/2004/tr_0401.pdf
SaaS Showplace. (n.d.). Retrieved May 18, 2009, from http://www.saas-showplace.com/
Shroff, G. (2008). Dev 2.0: Model driven development in the cloud. In Symposium on Foundations of Software Engineering (p. 283). Retrieved May 17, 2009, from ACM Digital Library database.
Singh, A., Korupolu, M., & Mohapatra, D. (2008). Server-storage virtualization: Integration and load balancing in data centers. In ACM/IEEE conference on Supercomputing. Piscataway, NJ, USA: IEEE Press. Retrieved May 17, 2009, from ACM Digital Library database.
SuperLU. (n.d.). Retrieved May 18, 2009, from Univ of California (Berkeley) Web site: http://crd.lbl.gov/~xiaoye/SuperLU/
Tang, J., & Zhang, M. (2006). An agent-based peer-to-peer grid computing architecture: Convergence of grid and peer-to-peer computing. In Australasian workshops on Grid computing and e-research (pp. 33-39). Australian Computer Society. Retrieved May 17, 2009, from ACM Digital Library database.
TeraGrid Science Gateways. (n.d.). Retrieved May 18, 2009, from TeraGrid Web site: http://www.teragrid.org/gateways/#
Univa UD. (n.d.). Retrieved May 18, 2009, from http://www.univaud.com/index.php
Vaquero, L. M., Rodero-Merino, L., Caceres, J., & Lindner, M. (2009, January). A break in the clouds: Towards a cloud definition. Computer Communication Review, 39(1), 50-55. Retrieved May 17, 2009, from ACM Digital Library database.
VMware announces vCloud initiative for enterprise-class cloud computing. (n.d.). Retrieved May 18, 2009, from VMware Web site: http://www.vmware.com/company/news/releases/vcloud_vmworld08.html
Walker, D. W. (n.d.). The grid, virtual organizations, and problem-solving environments. Retrieved May 17, 2009, from IEEE Web site: csdl.computer.org/comp/proceedings/cluster/2001/1116/00/11160445.pdf
Weiss, A. (2007). Computing in the clouds. Cloud Computing, 11(4), 16-25. Retrieved May 17, 2009, from ACM Digital Library database.
Wenger, A. (2008, September). DATA PROTECTION WITH SAAS. Communications News, 45(9), 30-30. Retrieved May 18, 2009, from Academic Search Complete database.
What is grid computing? (2008, May). Retrieved May 18, 2009, from Oracle Web site: http://docs.oraclewhitepapers.com/oraclewhitepapers/oracle-grid-computing-322/?sub_id=DP4y6TrHHgycA
What is the grid? (n.d.). Retrieved May 18, 2009, from Gridipedia Web site: http://www.gridipedia.eu/aboutgrid.html
Wong, H. (2008, November). Compute in the cloud, not the fog. Cloud Computing, 1(1), 10. Retrieved May 17, 2009, from Sys Con Web site: http://cloudcomputing.sys-con.com/node/740239