Elysium Grid Computing 2010


									                         Elysium Technologies Private Limited
                         ISO 9001:2008 A leading Research and Development Division
                         Madurai | Chennai | Kollam | Ramnad | Tuticorin | Singapore

 Abstract                                 Grid Computing                                                  2010 - 2011

01     Virtual resources allocation for workflow-based applications distribution on a cloud infrastructure

            Cloud computing infrastructures provide resources on demand for tackling the needs of large-scale distributed
            applications. Determining the amount of resources to allocate for a given computation is, however, a difficult
            problem. This paper introduces and compares four automated resource allocation strategies relying on the expertise
            that can be captured in workflow-based applications. The evaluation of these strategies was carried out on the
            Aladdin/Grid’5000 test bed using a real application from the area of medical image analysis. Experimental results
            show that optimized allocation can help find a tradeoff between the amount of resources consumed and application
            makespan.

02     Unibus-managed Execution of Scientific Applications on Aggregated Clouds

            Our on-going project, Unibus, aims to facilitate provisioning and aggregation of multifaceted resources from resource
            providers and end-users’ perspectives. To achieve that, Unibus proposes (1) the Capability Model and mediators
            (resource drivers) to virtualize access to diverse resources, and (2) soft and successive conditioning to enable
            automatic and user-transparent resource provisioning. In this paper we examine the Unibus concepts and prototype in
            a real situation of aggregation of two commercial clouds and execution of benchmarks on aggregated resources. We
            also present and discuss the benchmark results.

03     TrustStore: Making Amazon S3 Trustworthy with Services Composition

            The enormous amount of data generated in daily operations and the increasing demands for data accessibility across
            organizations are pushing individuals and organizations to outsource their data storage to cloud storage services.
            However, the security and privacy of the outsourced data go beyond the data owners' control. In this paper, we
            propose a service composition approach to preserve privacy for data stored in an untrusted storage service. A virtual
            file system, called TrustStore, is prototyped to demonstrate this concept. It allows users to utilize an untrusted
            storage service provider while preserving the confidentiality and integrity of the data. We deployed the prototype
            with Amazon S3 and evaluated its performance.

04     Towards Energy Aware Scheduling for Precedence Constrained Parallel Tasks in a Cluster with DVFS

            Reducing energy consumption in high-end computing brings benefits such as reduced operating costs, increased
            system reliability, and a smaller environmental footprint. This paper aims to develop scheduling heuristics and to present
            application experience for reducing power consumption of parallel tasks in a cluster with the Dynamic Voltage
            Frequency Scaling (DVFS) technique. In this paper, formal models are presented for precedence-constrained parallel
            tasks, DVFS enabled clusters, and energy consumption. This paper studies the slack time for non-critical jobs, extends
            their execution time and reduces the energy consumption without increasing the task’s execution time as a whole.
            Additionally, a Green Service Level Agreement is also considered in this paper. By increasing task execution time within
            an affordable limit, this paper develops scheduling heuristics to reduce the energy consumption of a task's execution and
            discusses the relationship between energy consumption and task execution time. Models and scheduling heuristics
            are examined with a simulation study. Test results justify the design and implementation of proposed energy aware
            scheduling heuristics in the paper.
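The slack-reclamation idea in this abstract can be sketched with a first-order DVFS model; the frequency bounds and the energy formula below are illustrative assumptions, not taken from the paper:

```python
def reclaim_slack(exec_time, slack, f_max=1.0, f_min=0.4):
    """Stretch a non-critical task into its slack by lowering the clock.

    Assumes the task's cycle count is fixed (time scales as 1/f) and that
    dynamic energy ~ V^2 * f * time with V roughly proportional to f,
    i.e. energy ~ f^2 * cycles -- a common first-order DVFS model.
    """
    # Lowest frequency that still fits the task into exec_time + slack,
    # clamped to the processor's minimum P-State.
    f = max(f_min, f_max * exec_time / (exec_time + slack))
    new_time = exec_time * f_max / f          # execution time at the lower frequency
    energy_ratio = (f / f_max) ** 2           # dynamic energy relative to running at f_max
    return f, new_time, energy_ratio
```

With `exec_time=2` and `slack=2` the frequency halves and the modelled dynamic energy drops to a quarter, while the task still finishes inside its slack.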

     #230, Church Road, Anna Nagar, Madurai 625 020, Tamil Nadu, India
     Phone: +91 452-4390702, 4392702, 4390651
     Website: www.elysiumtechnologies.com, www.elysiumtechnologies.info
     Email: info@elysiumtechnologies.com

05    TOPP goes Rapid—The OpenMS Proteomics Pipeline in a Grid-enabled Web Portal

           Proteomics, the study of all the proteins contained in a particular sample, e.g., a cell, is a key technology in current
           biomedical research. The complexity and volume of proteomics data sets produced by mass spectrometric methods
           clearly suggests the use of grid-based high-performance computing for analysis. TOPP and OpenMS are open-source
           packages for proteomics data analysis; however, they do not provide support for Grid computing. In this work we
           present a portal interface for high-throughput data analysis with TOPP. The portal is based on Rapid [1], a tool for
           efficiently generating standardized portlets for a wide range of applications. The web-based interface allows the
           creation and editing of user-defined pipelines and their execution and monitoring on a Grid infrastructure. The portal
           also supports several file transfer protocols for data staging. It thus provides a simple and complete solution to high-
           throughput proteomics data analysis for inexperienced users through a convenient portal interface.

06    Topology Aggregation for e-Science Networks

          We propose several algorithms for topology aggregation (TA) to effectively summarize large-scale networks. These TA
          techniques are shown to perform significantly better for path requests in e-Science, which may consist of simultaneous
          reservation of multiple paths and/or simultaneous reservation for multiple requests. Our extensive simulations
          demonstrate the benefits of our algorithms in terms of both accuracy and performance.

07    The Lightweight Approach to Use Grid Services with Grid Widgets on Grid WebOS

           To bridge the gap between computing grid environment and users, various Grid Widgets are developed by the Grid
           development team in the National Center for High performance Computing (NCHC). These widgets are implemented to
            provide users with seamless and scalable access to Grid resources. Currently, this effort integrates the de facto Grid
            middleware, a Web-based Operating System (WebOS), and an automatic resource allocation mechanism to form a
            virtual computer in a distributed computing environment. With the capability of automatic resource allocation and the feature of
           dynamic load prediction, the Resource Broker (RB) improves the performance of the dynamic scheduling over
           conventional scheduling policies. With this extremely lightweight and flexible approach to acquire Grid services, the
           barrier for users to access geographically distributed heterogeneous Grid resources is largely reduced. The Grid
           Widgets can also be customized and configured to meet the demands of the users.

08    The Failure Trace Archive: Enabling Comparative Analysis of Failures in Diverse Distributed Systems

           With the increasing functionality and complexity of distributed systems, resource failures are inevitable. While
           numerous models and algorithms for dealing with failures exist, the lack of public trace data sets and tools has
           prevented meaningful comparisons. To facilitate the design, validation, and comparison of fault-tolerant models and
           algorithms, we have created the Failure Trace Archive (FTA) as an online public repository of availability traces taken
           from diverse parallel and distributed systems. Our main contributions in this study are the following. First, we describe
           the design of the archive, in particular the rationale of the standard FTA format, and the design of a toolbox that
           facilitates automated analysis of trace data sets. Second, applying the toolbox, we present a uniform comparative
           analysis with statistics and models of failures in nine distributed systems. Third, we show how different interpretations
           of these data sets can result in different conclusions. This emphasizes the critical need for the public availability of
           trace data and methods for their analysis.


09    Supporting OFED over Non-InfiniBand SANs

            The Open Fabrics Enterprise Distribution (OFED) is open-source software committed to providing a common
            communication stack to all RDMA-capable System Area Networks (SANs). It supports high-performance MPIs and
            legacy protocols for the HPC and data-centre communities. Currently, it supports InfiniBand (IB) and the Internet Wide
            Area RDMA Protocol (iWARP). This paper presents a technique to support the OFED software stack over non-IB
            RDMA-capable SANs. We propose the design of a Virtual Management Port (VMP) to enable the IB subnet
            management model. Integration of VMP with the IB-Verbs interface driver avoids hardware and OFED modifications
            and enables the connection manager that is mandatory to run user applications. The performance evaluation shows
            that VMP is lightweight.

10    SciCloud: Scientific Computing on the Cloud

           SciCloud is a project studying the scope of establishing private clouds at universities. With these clouds, researchers
           can efficiently use the already existing resources in solving computationally intensive scientific, mathematical, and
           academic problems. The project established a Eucalyptus based private cloud and developed several customized
           images that can be used in solving problems from mobile web services, distributed computing and bio-informatics
           domains. The poster demonstrates the SciCloud and reveals two applications that are benefiting from the setup along
            with our research scope and results in scientific computing.

11    Scalable Communication Trace Compression

            Characterizing the communication behavior of parallel programs through tracing can help understand an application’s
            characteristics, model its performance, and predict behavior on future systems. However, lossless communication
            traces can get prohibitively large, causing programmers to resort to a variety of other techniques. In this paper, we
            present a novel approach to lossless communication trace compression. We augment the Sequitur compression
            algorithm to employ it in communication trace compression of parallel programs. We present optimizations to reduce
            the memory overhead, reduce the size of the generated trace files, and enable compression across multiple
            processes in a parallel program. The evaluation shows improved compression and reduced overhead over other
            approaches, with up to three orders of magnitude improvement for the NAS MG benchmark. We also observe that,
            unlike existing schemes, the trace file sizes and the memory overhead incurred are less sensitive to, if not
            independent of, the problem size for the NAS benchmarks.
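As a rough illustration of grammar-based trace compression, here is a simplified digram-replacement scheme (closer to Re-Pair than to the Sequitur variant the paper augments; the event names are invented):

```python
from collections import Counter

def compress(seq):
    """Repeatedly replace the most frequent adjacent pair with a fresh
    nonterminal until no pair occurs twice; returns the compressed
    sequence and the grammar rules that reproduce the original."""
    rules, next_id, seq = {}, 0, list(seq)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair
        # Rewrite the sequence, replacing each occurrence of the pair.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules
```

A repetitive MPI-like trace such as `["send", "recv"] * 4` collapses to two symbols plus two rules; real traces repeat whole communication phases, which is what makes grammar-based schemes effective.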

12    Runtime Energy Adaptation with Low-Impact Instrumented Code in a Power-scalable Cluster System

           Recently, improving the energy efficiency of high performance PC clusters has become important. In order to reduce
           the energy consumption of the microprocessor, many high performance microprocessors have a Dynamic Voltage and
            Frequency Scaling (DVFS) mechanism. This paper proposes a new DVFS method called the Code-Instrumented
            Runtime (CI-Runtime) DVFS method, in which a combination of voltage and frequency, called a P-State, is
            managed in the instrumented code at runtime. The proposed CI-Runtime DVFS method achieves better energy savings
            than the interrupt-based runtime DVFS method, since it selects the appropriate P-State in each defined region based
            on the characteristics of program execution. Moreover, the proposed CI-Runtime DVFS method is more practical than
            the static DVFS method, since it does not require exhaustive profiles for each P-State. The method consists of two parts.
           In the first part of the proposed CI-Runtime DVFS method, the instrumented codes are inserted by defining regions that
            have almost the same characteristics. The instrumented code must be inserted at the appropriate point, because the
            performance of the application decreases greatly if the instrumented code is called too many times in a short period. A
           method for automatically defining regions is proposed in this paper. The second part of the proposed method is the
            energy adaptation algorithm, which is used at runtime. Two types of DVFS control algorithms, energy adaptation with
            estimated energy consumption and energy adaptation with only performance information, are compared. The
            proposed CI-Runtime DVFS method was implemented on a power-scalable PC cluster. The results show that the
            proposed CI-Runtime with energy adaptation using estimated energy consumption could achieve an energy saving of
            14.2%, which is close to the optimal value, without obtaining exhaustive profiles for every available P-State setting.

13    Rigel: A Scalable and Lightweight Replica Selection Service for Replicated Distributed File System

            Replicated distributed file systems are designed to store large files reliably across many machines, which raises the
            problem of selecting the nearest replica for each client. In this paper, we propose Rigel, a Network Coordinates (NC)
            based nearest-replica selection service, an effective infrastructure for selecting the nearest replica for clients in a
            scalable and lightweight way. Our simulation results demonstrate that Rigel can reduce the read latency between
            clients and replicas by at least 20% when compared to the replica selection strategy in the Hadoop Distributed File System.
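The selection step itself can be illustrated with synthetic network coordinates (the 2-D Euclidean embedding and the replica names below are assumptions for the sketch; real NC systems such as Vivaldi produce the coordinates):

```python
import math

def nearest_replica(client_coord, replica_coords):
    """Pick the replica whose network coordinate is closest to the client's.

    In network-coordinate systems, Euclidean distance in the embedding
    approximates round-trip latency, so no direct probing is needed."""
    return min(replica_coords,
               key=lambda r: math.dist(client_coord, replica_coords[r]))
```

For example, with `{"dc-east": (0, 0), "dc-west": (50, 40)}` a client embedded near the origin is routed to `dc-east` without measuring either link.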

14    Polyphony: A Workflow Orchestration Framework for Cloud Computing

            Cloud Computing has delivered unprecedented compute capacity to NASA missions at affordable rates. Missions like
            the Mars Exploration Rovers (MER) and Mars Science Lab (MSL) are enjoying the elasticity that enables them to
            leverage hundreds, if not thousands, of machines for short durations without making any hardware procurements. In
            this paper, we describe Polyphony, a resilient, scalable, and modular framework that efficiently leverages a large set of
            computing resources to perform parallel computations. Polyphony can employ resources on the cloud, excess
            capacity on local machines, as well as spare resources at a supercomputing center, and it enables these resources
            to work in concert to accomplish a common goal. Polyphony is resilient to node failures, even if they occur in the
            middle of a transaction. We conclude with an evaluation of a production-ready application built on top of
            Polyphony to perform image-processing operations on images from around the solar system, including Mars, Saturn,
            and Titan.

15    Policy-based Management of QoS in Service Aggregations

           We present a policy-centered QoS meta-model which can be used by service providers and consumers alike to
           express capabilities, requirements, constraints, and general management characteristics relevant for SLA
           establishment in service aggregations. We also provide a QoS assertion model which is generic, domain-independent
           and conforming to the WS-Policy syntax and semantics. Using these two models, assertions over acceptable and
           required values for QoS properties can be expressed across the different service layers and service roles.

16    Planning Large Data Transfers in Institutional Grids

           In grid computing, many scientific and engineering applications require access to large amounts of distributed data.
           The size and number of these data collections has been growing rapidly in recent years. The costs of data
           transmission take a significant part of the global execution time. When communication streams flow concurrently on
           shared links, transport control protocols have issues allocating fair bandwidth to all the streams, and the network
           becomes suboptimally used. One way to deal with this situation is to schedule the communications in a way that will
           induce an optimal use of the network. We focus on the case of large data transfers that can be completely described at
           the initialization time. In this case, a plan of data migration can be computed at initialization time, and then executed.
            However, this computation phase must take a small time compared to the actual execution of the plan. We
            propose a best-effort solution that computes an approximate communication plan based on uniform random
            sampling of possible schedules. We show the effectiveness of this approach both theoretically and through simulations.

17    Overdimensioning for Consistent Performance in Grids

            Grid users may experience inconsistent performance due to specific characteristics of grids, such as fluctuating
            workloads, high failure rates, and high resource heterogeneity. Although studied extensively, consistent performance
            remains largely an unsolved problem. In this study we use overdimensioning, a simple but cost-ineffective solution, to
            address the performance inconsistency problem in grids. To this end, we propose several overdimensioning
            strategies, and we evaluate these strategies through simulations with workloads consisting of Bag-of-Tasks
            applications. We find that although overdimensioning is a simple solution, it is a viable way to provide consistent
            performance in grids.

18    On the use of machine learning to predict the time and resources consumed by Applications

           Most datacenters, clouds and grids consist of multiple generations of computing systems, each with different
           performance profiles, posing a challenge to job schedulers in achieving the best usage of the infrastructure. A useful
            piece of information for scheduling jobs, typically not available, is the extent to which applications will use available
            resources once they are executed. This paper comparatively assesses the suitability of several machine learning
            techniques for predicting the spatiotemporal utilization of resources by applications. Modern machine learning
            techniques able to handle large numbers of attributes are used, taking into account application- and system-specific
            attributes (e.g., CPU microarchitecture, size and speed of memory and storage, input data characteristics, and input
            parameters). The work also extends an existing classification tree algorithm, called Predicting Query Runtime (PQR),
            to the regression problem by allowing the leaves of the tree to select the best regression method for the data at each leaf.
           The new method (PQR2) yields the best average percentage error, predicting execution time, memory and disk
           consumption for two bioinformatics applications, BLAST and RAxML, deployed on scenarios that differ in system and
           usage. In specific scenarios where usage is a non-linear function of system and application attributes, certain
           configurations of two other machine learning algorithms, Support Vector Machine and k-nearest neighbors, also yield
           competitive results. In addition, experiments show that the inclusion of system performance and application-specific
           attributes also improves the performance of machine learning algorithms investigated.
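One of the competitive learners mentioned, k-nearest neighbours, is simple enough to sketch directly (the attribute vectors and training data below are invented for illustration; real deployments would normalize features first):

```python
import math

def knn_predict(train, query, k=3):
    """Predict a job's resource usage (e.g. runtime in seconds) as the
    mean over the k training jobs closest to it in attribute space.

    `train` is a list of (attribute_vector, observed_usage) pairs."""
    ranked = sorted(train, key=lambda ex: math.dist(ex[0], query))
    return sum(usage for _, usage in ranked[:k]) / k
```

A query whose attributes (say, input size and CPU speed) match past small jobs inherits their runtimes, which is exactly the "similar jobs behave similarly" assumption behind this family of predictors.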

19    On the Origin of Services - Using RIDDL for Description, Evolution and Composition of RESTful Services

            WSDL as a description language serves as the foundation for a host of technologies ranging from semantic annotation
            to composition and evolution. Although WSDL is well understood and in widespread use, it has shortcomings that are
            partly imposed by the way the SOAP protocol works and is used. Cloud computing fostered the rise of
            Representational State Transfer (REST), a return to arguably simpler but more flexible ways to expose services solely
            through the HTTP protocol. For RESTful services, many established achievements have to be rethought and
            reapplied. We perceive that one of the biggest hurdles is the lack of a dedicated and simple yet powerful language
            to describe RESTful services. In this paper we introduce RIDDL, a flexible and extensible XML-based language
            that not only allows services to be described but also covers the basic requirements of service composition and
            evolution, providing a clean foundation for further developments.

20    Identification, modelling and prediction of non-periodic bursts in workloads

           Non-periodic bursts are prevalent in workloads of large scale applications. Existing workload models do not predict
           such non-periodic bursts very well because they mainly focus on repeatable base functions. We begin by showing the
            necessity to include bursts in workload models by investigating their detrimental effects in a petabyte-scale distributed
            data management system. This work then makes three contributions. First, we analyse the accuracy of five existing
           prediction models on workloads of data and computational grids, as well as derived synthetic workloads. Second, we
            introduce a novel averages-based model to predict bursts in arbitrary workloads. Third, we present a novel metric,
            mean absolute estimated distance, to assess the prediction accuracy of the model. Using our model and metric, we
            show that burst behaviour in workloads can be identified, quantified and predicted independently of the underlying
            base functions. Furthermore, our model and metric are applicable to arbitrary kinds of burst prediction for time series.
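An averages-based burst detector in the spirit of this model can be sketched as follows (the window length and threshold factor are illustrative parameters, not the paper's exact formulation):

```python
def moving_average(xs, w):
    """Trailing moving average; early points use the partial window."""
    return [sum(xs[max(0, i - w + 1):i + 1]) / (i - max(0, i - w + 1) + 1)
            for i in range(len(xs))]

def find_bursts(xs, w=3, factor=2.0):
    """Flag indices whose value exceeds `factor` times the trailing
    average of the preceding points: a toy non-periodic burst detector."""
    avg = moving_average(xs, w)
    return [i for i, x in enumerate(xs) if i > 0 and x > factor * avg[i - 1]]
```

On an arrival-rate series like `[1, 1, 1, 10, 1, 1]` only the spike at index 3 is flagged; a periodic base function with the same mean would raise the trailing average and suppress false positives.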

21    A Categorisation of Cloud Computing Business Models

           This paper reviews current cloud computing business models and presents proposals on how organisations can
           achieve sustainability by adopting appropriate models. We classify cloud computing business models into eight types:
           (1) Service Provider and Service Orientation; (2) Support and Services Contracts; (3) In-House Private Clouds; (4) All-
            In-One Enterprise Cloud; (5) One-Stop Resources and Services; (6) Government Funding; (7) Venture Capital; and (8)
           Entertainment and Social Networking. Using the Jericho Forum’s ‘Cloud Cube Model’ (CCM), the paper presents a
           summary of the eight business models. We discuss how the CCM fits into each business model, and then based on
           this discuss each business model’s strengths and weaknesses. We hope adopting an appropriate cloud computing
           business model will help organizations investing in this technology to stand firm in the economic downturn.

22    A Fair Decentralized Scheduler for Bag-of-tasks Applications on Desktop Grids

            Desktop Grids have become very popular nowadays, with projects that include hundreds of thousands of computers.
           Desktop grid scheduling faces two challenges. First, the platform is volatile, since users may reclaim their computer at
           any time, which makes centralized schedulers inappropriate. Second, desktop grids are likely to be shared among
           several users, thus we must be particularly careful to ensure a fair sharing of the resources. In this paper, we propose
            a decentralized scheduler for bag-of-tasks applications on desktop grids, which ensures a fair and efficient use of the
            resources. It aims to provide a similar share of the platform to every application by minimizing their maximum stretch,
            using completely decentralized algorithms and protocols.
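The max-stretch objective the scheduler minimizes is easy to state concretely (the numbers in the usage note are invented):

```python
def stretch(release, completion, alone_time):
    """Stretch of an application: time it spent in the shared system
    divided by the time it would have taken with the platform to itself."""
    return (completion - release) / alone_time

def max_stretch(apps):
    """Fairness objective: the worst stretch over (release, completion,
    alone_time) triples; minimizing it equalizes slowdown across users."""
    return max(stretch(r, c, a) for r, c, a in apps)
```

An application released at time 0, finished at 10, that would have taken 5 alone has stretch 2; a fair scheduler prefers a plan where every application is slowed by a similar factor over one where a small job is starved behind a large one.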

23    A Heuristic Query Optimization Approach for Heterogeneous Environments

            In a rapidly growing digital world, the ability to discover, query and access data efficiently is one of the major
            challenges we are struggling with today. Google has done a tremendous job of enabling casual users to easily and
            efficiently search for Web documents of interest. However, a comparable mechanism to query data stocks located in
            distributed databases is not yet available. Therefore our research focuses on the query optimization of distributed
            database queries, considering a huge variety of different infrastructures and algorithms. This paper introduces a novel
            heuristic query optimization approach based on a multi-layered blackboard mechanism. Moreover, a short evaluation
            scenario confirms our finding that even small changes in the structure of a query execution tree (QET) can lead to
            significant performance improvements.

24    A Proximity-Based Self-Organizing Framework for Service Composition and Discovery

           The ICT market is experiencing an important shift from the request/provisioning of products toward a service-oriented
           view where everything (computing, storage, applications) is provided as a network-enabled service. It often happens
           that a solution to a problem cannot be offered by a single service, but by composing multiple basic services in a
           workflow. Service composition is indeed an important research topic that involves issues such as the design and
           execution of a workflow and the discovery of the component services on the network. This paper deals with the latter
            issue and presents an ant-inspired framework that facilitates collective discovery requests, issued to search a network
            for all the basic services that will compose a specific workflow. The idea is to reorganize the services so that the
            descriptors of services that are often used together are placed in neighboring peers. This helps a single query to find
           multiple basic services, which decreases the number of necessary queries and, consequently, lowers the search time
           and the network load.

25    A Realistic Integrated Model of Parallel System Workloads

           Performance evaluation is a significant step in the study of scheduling algorithms in large-scale parallel systems
           ranging from supercomputers to clusters and grids. One of the key factors that have a strong effect on the evaluation
           results is the workloads (or traces) used in experiments. In practice, several researchers use unrealistic synthetic
           workloads in their scheduling evaluations because they lack models that can help generate realistic synthetic
            workloads. In this paper we propose a full model to capture the following characteristics of real parallel system
            workloads: 1) long-range dependence in the job arrival process, 2) temporal and spatial burstiness, 3) bag-of-tasks
           behaviour, and 4) correlation between the runtime and the number of processors. Validation of our model with real
           traces shows that our model not only captures the above characteristics but also fits the marginal distributions well. In
           addition, we also present an approach to quantify burstiness in a job arrival process (temporal) as well as burstiness in
           the load of a trace (spatial).

26    Applying software engineering principles for designing Cloud@Home

           Cloud computing is the “new hot” topic in IT. It combines the maturity of Web technologies (networking, APIs,
           semantic Web 2.0, languages, protocols and standards such as WSDL, SOAP, REST, WS-BPEL, WS-CDL, IPSEC, etc.),
           the robustness of geographically distributed computing paradigm (Network, Internet and Grid computing) and self-
           management capabilities (Autonomic computing), with the capacity to manage quality of services by monitoring,
            metering, quantifying and billing computing resources and costs (Utility computing). These have made it possible and
            cost-effective for businesses, small and large, to completely host data- and application centers virtually... in the Cloud.
           Our idea of Cloud proposes a new dimension of computing, in which everyone, from single users to communities and
           enterprises, can, on one hand, share resources and services in a transparent way and, on the other hand, have access
           to and use such resources and services adaptively to their requirements. Such an enhanced concept of Cloud,
           enriching the original one with Volunteer computing and interoperability challenges, has been proposed and
           synthesized in Cloud@Home. The complex infrastructure implementing Cloud@Home has to be supported by an
           adequate distributed middleware able to manage it. In order to develop such a complex distributed software, in this
           paper we apply software engineering principles such as rigor, separation of concerns and modularity. Our idea is,
           starting from a software engineering approach, to identify and separate concerns and tasks, and then to provide both
           the software middleware architecture and the hardware infrastructure following the hw/sw co-design technique widely
           used in embedded systems. In this way we want to primarily identify and specify the Cloud@Home middleware
           architecture and its deployment into a feasible infrastructure; secondly, we want to propose the development process
           we follow, based on hardware/software co-design, in distributed computing contexts, demonstrating its effectiveness
            through Cloud@Home.

27    Cache Performance Optimization for Processing XML-based Application Data on Multi-core Processors

           There is a critical need to develop new programming paradigms for grid middleware tools and applications to harness
           the opportunities presented by emerging multi-core processors. Implementations of grid middleware and applications
           that do not adapt to the programming paradigm when executing on emerging processors can severely impact the
           overall performance. In this paper we focus on the utilization of the L2 cache, which is a critical shared resource on
           Chip Multiprocessors. The access pattern of the shared L2 cache, which is dependent on how the application
           schedules and assigns processing work to each thread, can either enhance or undermine the ability to hide memory
           latency on a multi-core processor. None of the current grid simulators and emulators provides feedback and fine-
           grained performance data that is essential for a detailed analysis. In this paper, using the feedback from an emulation

 #230, Church Road, Anna Nagar, Madurai 625 020, Tamil Nadu, India
 (: +91 452-4390702, 4392702, 4390651
 Website: www.elysiumtechnologies.com,www.elysiumtechnologies.info
 Email: info@elysiumtechnologies.com
                    Elysium Technologies Private Limited
                    ISO 9001:2008 A leading Research and Development Division
                    Madurai | Chennai | Kollam | Ramnad | Tuticorin | Singapore

           framework, we present performance analysis and provide recommendations on how processing threads can be
           scheduled on multi-core nodes to enhance the performance of a class of grid applications that requires processing of
           large-scale XML data. In particular, we discuss the gains associated with the use of the adaptations we have made to
           the Cache-Affinity and Balanced-Set scheduling algorithms to improve L2 cache performance, and hence the overall
           application execution time.
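As a toy illustration of the cache-affinity idea, threads working on the same data partition can be assigned to cores that share an L2 cache, so data one thread loads can be reused by its partner. The core topology and partition names below are invented for the example and are not from the paper:

```python
# Pairs of cores assumed to share an L2 cache (e.g. a dual-socket,
# dual-core layout); real topology would come from the platform.
L2_DOMAINS = [(0, 1), (2, 3)]

def affinity_schedule(partitions):
    """Map each data partition to both cores of one L2 domain, round-robin,
    so the threads processing a partition share a cache."""
    schedule = {}
    for i, part in enumerate(partitions):
        schedule[part] = L2_DOMAINS[i % len(L2_DOMAINS)]
    return schedule

plan = affinity_schedule(["xml-part-A", "xml-part-B"])
```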

28    Cluster Computing as an Assembly Process

           This poster will present a coordination language for distributed computing and will discuss its application to cluster
           computing. It will introduce a programming technique of cluster computing whereby application components are
           completely dissociated from the communication/coordination infrastructure (unlike MPI-style message passing), and
            there is no shared memory either, whether virtual or physical (unlike OpenMP). Cluster computing is thus presented
           as something that happens as late as the assembly stage: components are integrated into an application using a new
            form of network glue: Single-Input, Single-Output (SISO) asynchronous, nondeterministic coordination.

      Data Injection at Execution Time in Grid Environments using Dynamic Data Driven Application System for
29    Wildland Fire Spread Prediction

           In our research work, we use two Dynamic Data Driven Application System (DDDAS) methodologies to predict wildfire
           propagation. Our goal is to build a system that dynamically adapts to constant changes in environmental conditions
            when a hazard occurs and under strict real-time deadlines. For this purpose, we are building a parallel wildfire
            prediction method that can assimilate real-time data injected into the prediction process at execution time. In this
            paper, we propose a strategy for data injection in distributed environments.

      D-Cloud: Design of a Software Testing Environment for Reliable Distributed Systems Using Cloud
30    Computing Technology

           In this paper, we propose a software testing environment, called D-Cloud, using cloud computing technology and
            virtual machines with a fault injection facility. The importance of high dependability in software systems has recently
            increased; however, exhaustive testing of software systems is becoming expensive and time-consuming, and, in many
            cases, sufficient software testing is not possible. In particular, it is often difficult to test parallel and
           distributed systems in the real world after deployment, although reliable systems, such as high availability servers, are
           parallel and distributed systems. D-Cloud is a cloud system which manages virtual machines with fault injection
           facility. D-Cloud sets up a test environment on the cloud resources using a given system configuration file and
            executes several tests automatically according to a given scenario. In this scenario, D-Cloud enables fault-tolerance
            testing by injecting device faults through the virtual machines. We have designed the D-Cloud system using Eucalyptus
            software and an XML-based description language for the system configuration and the fault-injection scenario. We
            found that D-Cloud allows a user to easily set up and test a distributed system on the cloud and effectively reduces
            the cost and time of testing.
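The abstract only states that the system configuration and fault-injection scenario are written in XML; the element and attribute names below are hypothetical, sketching what such a scenario and a minimal parser for it might look like:

```python
import xml.etree.ElementTree as ET

# Hypothetical scenario description; D-Cloud's actual schema is not
# given in the abstract.
SCENARIO = """
<scenario name="disk-fault-test">
  <machine id="node1" image="server.img"/>
  <fault target="node1" device="disk" type="io-error" at="120"/>
  <fault target="node1" device="nic" type="packet-loss" at="300"/>
</scenario>
"""

def parse_faults(xml_text):
    """Return (target, device, type, time) tuples sorted by injection time."""
    root = ET.fromstring(xml_text)
    faults = [(f.get("target"), f.get("device"), f.get("type"), int(f.get("at")))
              for f in root.findall("fault")]
    return sorted(faults, key=lambda f: f[3])

faults = parse_faults(SCENARIO)
```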

31    Design and Implementation of an efficient Two-level Scheduler for Cloud Computing Environment

            Cloud computing focuses on delivery of reliable, fault-tolerant and scalable infrastructure for hosting Internet based
           application services. Our work presents the implementation of an efficient Quality of Service (QoS) based meta-
           scheduler and Backfill strategy based light weight Virtual Machine Scheduler for dispatching jobs. The user centric
           meta-scheduler deals with selection of proper resources to execute high level jobs. The system centric Virtual Machine
           (VM) scheduler optimally dispatches the jobs to processors for better resource utilization. We also present our
           proposals on scheduling heuristics that can be incorporated at data center level for selecting ideal host for VM


            creation. The implementation can be further extended at the host level, using an inter-VM scheduler for adaptive load
            balancing in a cloud environment.

32    Discovering Piecewise Linear Models of Grid Workload

           Despite extensive research focused on enabling QoS for grid users through economic and intelligent resource
           provisioning, no consensus has emerged on the most promising strategies. On top of intrinsically challenging
           problems, the complexity and size of data has so far drastically limited the number of comparative experiments. An
           alternative to experimenting on real, large, and complex data, is to look for well-founded and parsimonious
           representations. This study is based on exhaustive information about the gLite-monitored jobs from the EGEE grid,
           representative of a significant fraction of e-science computing activity in Europe. Our main contributions are twofold.
           First we found that workload models for this grid can consistently be discovered from the real data, and that limiting
           the range of models to piecewise linear time series models is sufficiently powerful. Second, we present a
            bootstrapping strategy for building more robust models from the limited samples at hand.
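The restriction to piecewise linear models can be illustrated with a minimal two-segment least-squares fit that scans candidate breakpoints; the paper's model class and discovery procedure are considerably richer than this sketch:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b, sse)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    b = my - a * mx
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse

def fit_two_segments(ys):
    """Scan breakpoints and return (breakpoint, total squared error) for the
    split that minimizes the combined fitting error of the two segments."""
    xs = list(range(len(ys)))
    best = None
    for k in range(2, len(ys) - 2):          # each segment needs >= 2 points
        _, _, e1 = fit_line(xs[:k], ys[:k])
        _, _, e2 = fit_line(xs[k:], ys[k:])
        if best is None or e1 + e2 < best[1]:
            best = (k, e1 + e2)
    return best

# A series that rises then falls should break near the peak.
series = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]
split, err = fit_two_segments(series)
```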

33    Dynamic Auction Mechanism for Cloud Resource Allocation

           We propose a dynamic auction mechanism to solve the allocation problem of computation capacity in the environment
            of cloud computing. The truth-telling property holds when we apply a second-price auction mechanism to the resource
            allocation problem. Thus, the cloud service provider (CSP) can assure reasonable profit and efficient allocation of its
            computation resources. When the numbers of users and resources are large enough, potential problems of the
            second-price auction mechanism, including revenue variation, are mitigated since the law of large numbers holds.
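The second-price rule at the heart of this mechanism is easy to sketch: the highest bidder wins but pays only the second-highest bid, which is what makes truthful bidding a dominant strategy. This is a generic single-item illustration, not the paper's full dynamic mechanism:

```python
def second_price_auction(bids):
    """bids: dict mapping user -> bid. Returns (winner, price paid).
    The winner is the highest bidder; the price is the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]                      # second-highest bid
    return winner, price

# u2 wins the capacity but pays u1's bid, not its own.
winner, price = second_price_auction({"u1": 5.0, "u2": 8.0, "u3": 3.0})
```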

34    Dynamic Job-Clustering with Different Computing Priorities for Computational Resource Allocation

            The diversity of job characteristics, such as the unstructured/unorganized arrival of jobs and priorities, can lead to
            inefficient resource allocation. Therefore, the characterization of jobs is an important aspect worthy of investigation:
            it enables judicious resource allocation decisions that achieve two goals (performance and utilization) and improve
            resource availability.

35    Dynamic Resource Pricing on Federated Clouds

           Current large distributed systems allow users to share and trade resources. In cloud computing, users purchase
           different types of resources from one or more resource providers using a fixed pricing scheme. Federated clouds, a
            topic of recent interest, allow different cloud providers to share resources for increased scalability and reliability.
            However, users and providers of cloud resources are rational and maximize their own interest when consuming and
            contributing shared resources. In this paper, we present a dynamic pricing scheme suitable for rational users' requests
            containing multiple resource types. Using simulations, we compare the efficiency of our proposed strategy-proof
            dynamic scheme with fixed pricing, and show that user welfare and the percentage of successful requests are increased
            by using dynamic pricing.


36    Dynamic TTL-Based Search In Unstructured Peer-to-Peer Networks

           Resource discovery is a challenging issue in unstructured peer-to-peer networks. Blind search approaches, including
           flooding and random walks, are the two typical algorithms used in such systems. Blind flooding is not scalable
           because of its high communication cost. On the other hand, the performance of random walks approaches largely
           depends on the random choice of walks. Some informed mechanisms use additional information, usually obtained
           from previous queries, for routing. Such approaches can reduce the traffic overhead but they limit the query coverage.
           Furthermore, they usually rely on complex protocols to maintain information at each peer. In this paper, we propose
           two schemes which can be used to improve the search performance in unstructured peer-to-peer networks. The first
           one is a simple caching mechanism based on resource descriptions. Peers that offer resources send periodic
           advertisement messages. These messages are stored into a cache and are used for routing requests. The second
           scheme is a dynamic Time-To-Live (TTL) enabling messages to break their horizon. Instead of decreasing the query
            TTL by 1 at each hop, it is decreased by a value v such that 0 < v < 1. Our aim is not only to redirect queries towards the
           right direction but also to stimulate them in order to reliably discover rare resources. We then propose a Dynamic
           Resource Discovery Protocol (DRDP) which uses the two previously described mechanisms. Through extensive
            simulations, we show that our approach achieves a high success rate while incurring low search traffic.
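The dynamic TTL idea is straightforward to illustrate: decrementing the TTL by a fractional v per hop lets a query with the same initial TTL survive more hops than classic flooding, breaking its usual horizon:

```python
def hops_reached(ttl, decrement):
    """Number of hops a query survives when its TTL drops by `decrement`
    at each forwarding peer (the query dies once TTL reaches 0)."""
    hops = 0
    while ttl > 0:
        ttl -= decrement
        hops += 1
    return hops

classic = hops_reached(5, 1.0)    # standard per-hop decrement of 1
extended = hops_reached(5, 0.5)   # dynamic TTL with v = 0.5
```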

37    Energy Efficient Allocation of Virtual Machines in Cloud Data Centers

           Rapid growth of the demand for computational power has led to the creation of large-scale data centers. They
           consume enormous amounts of electrical power resulting in high operational costs and carbon dioxide emissions.
           Moreover, modern Cloud computing environments have to provide high Quality of Service (QoS) for their customers
           resulting in the necessity to deal with power-performance trade-off. We propose an efficient resource management
           policy for virtualized Cloud data centers. The objective is to continuously consolidate VMs leveraging live migration
           and switch off idle nodes to minimize power consumption, while providing required Quality of Service. We present
           evaluation results showing that dynamic reallocation of VMs brings substantial energy savings, thus justifying further
            development of the proposed policy.
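The consolidation objective can be sketched as a bin-packing problem: place VMs on as few hosts as possible so idle hosts can be switched off. The first-fit-decreasing heuristic below is a common baseline, not the paper's policy, which additionally accounts for live migration and QoS thresholds; the loads are illustrative CPU shares:

```python
def consolidate(vm_loads, host_capacity):
    """First-fit decreasing bin packing: sort VM loads descending and place
    each on the first host with spare capacity, opening a new host only
    when needed. Returns a list of per-host VM load lists."""
    hosts = []
    for load in sorted(vm_loads, reverse=True):
        for host in hosts:
            if sum(host) + load <= host_capacity:
                host.append(load)
                break
        else:
            hosts.append([load])
    return hosts

# Six VMs (loads in percent of one host) consolidate onto two full hosts,
# letting the remaining nodes be powered down.
placement = consolidate([30, 20, 50, 40, 10, 50], host_capacity=100)
```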

38    Enhanced Paxos Commit for Transactions on DHTs

           Key/value stores which are built on structured overlay networks often lack support for atomic transactions and strong
           data consistency among replicas. This is unfortunate, because consistency guarantees and transactions would allow a
           wide range of additional application domains to benefit from the inherent scalability and fault-tolerance of DHTs. The
           Scalaris key/value store supports strong data consistency and atomic transactions. It uses an enhanced Paxos
           Commit protocol with only four communication steps rather than six. This improvement was possible by exploiting
           information from the replica distribution in the DHT. Scalaris enables implementation of more reliable and scalable
           infrastructure for collaborative Web services that require strong consistency and atomic changes across multiple

39    Expanding the Cloud: A component-based architecture to application deployment on the Internet

           Cloud Computing allows us to abstract distributed, elastic IT resources behind an interface that promotes scalability
           and dynamic resource allocation. The boundary of this cloud sits outside the application and the hardware that hosts
           it. For the end user, a web application deployed on a cloud is presented no differently to a web application deployed on
           a stand-alone web server. This model works well for web applications but fails to cater for distributed applications
           containing components that execute both locally for the user and remotely using non-local resources. This research
           proposes extending the concept of the cloud to encompass not only server-farm resources but all resources


           accessible by the user. This brings the resources of the home PC and personal mobile devices into the cloud and
            promotes the deployment of highly distributed component-based applications with fat user interfaces. This promotes
            the use of the Internet itself as a platform. We compare this to the standard Web 2.0 approach and show the benefits
            that deploying fat-client component-based systems provide over classic web applications. We also describe the
            benefits that expanding the cloud provides to component migration and resource utilisation.

40    FaReS: Fair Resource Scheduling for VMM-Bypass InfiniBand Devices

           In order to address the high performance I/O needs of HPC and enterprise applications, modern interconnection
           fabrics, such as InfiniBand and more recently, 10GigE, rely on network adapters with RDMA capabilities. In virtualized
           environments, these types of adapters are configured in a manner that bypasses the hypervisor and allows virtual
           machines (VMs) direct device access, so that they deliver near-native low-latency/high bandwidth I/O. One challenge
           with the bypass approach is that it causes the hypervisor to lose control over VM-device interactions, including the
           ability to monitor such interactions and to ensure fair resource usage by VMs. Fairness violations, however, permit low
           priority VMs to affect the I/O allocations of other higher priority VMs and more generally, lack of supervision can lead
           to inefficiencies in the usage of platform resources. This paper describes the FaReS system-level mechanisms for
           monitoring VMs’ usage of bypass I/O devices. Monitoring information acquired with FaReS is then used to adjust VMM-
           level scheduling in order to improve resource utilization and/or ensure fairness properties across the sets of VMs
           sharing platform resources. FaReS employs a memory introspection-based tool for asynchronously monitoring VMM-
           bypass devices, using InfiniBand HCAs as a concrete example. FaReS and its very low overhead

41    File-Access Characteristics of Data-intensive Workflow Applications

           This paper studies five real-world data intensive workflow applications in the fields of natural language processing,
           astronomy image analysis, and web data analysis. Data intensive workflows are increasingly becoming important
           applications for cluster and Grid environments. They open new challenges to various components of workflow
           execution environments including job dispatchers, schedulers, file systems, and file staging tools. Their impacts on
           real workloads are largely unknown. Understanding characteristics of real-world workflow applications is a required
           step to promote research in this area. To this end, we analyse real-world workflow applications focusing on their file
           access patterns and summarize their implications to schedulers and file system/staging designs.

42    Fine-Grained Profiling for Data-Intensive Workflows

           Profiling is an effective dynamic analysis approach to investigate complex applications. ParaTrac is a user-level
           profiler using file system and process tracing techniques for data intensive workflow applications. In two respects
           ParaTrac helps users refine the orchestration of workflows. First, the profiles of I/O characteristics enable users to
           quickly identify bottlenecks of underlying I/O subsystems. Second, ParaTrac can exploit fine-grained data-processes
           interactions in workflow execution to help users understand, characterize, and manage realistic data-intensive
            workflows. Experiments on thoroughly profiling the Montage workflow demonstrate that ParaTrac scales to tracing
            events of thousands of processes and is effective in guiding fine-grained workflow scheduling or workflow management
            system improvements.

43    Framework for Efficient Indexing and Searching of Scientific Metadata

           A seamless and intuitive data reduction capability for the vast amount of scientific metadata generated by experiments
           is critical to ensure effective use of the data by domain specific scientists. The portal environments and scientific


           gateways currently used by scientists provide search capability that is limited to the predefined pull-down menus and
           conditions set in the portal interface. Currently, data reduction can only be effectively achieved by scientists who have
           developed expertise in dealing with complex and disparate query languages. A common theme in our discussions with
            scientists is that data reduction capability, similar to web search in terms of ease of use, scalability, and
           freshness/accuracy of results, is a critical need that can greatly enhance the productivity and quality of scientific
           research. Most existing search tools are designed for exact string matching, but such matches are highly unlikely
           given the nature of metadata produced by instruments and a user’s inability to recall exact numbers to search in very
           large datasets. This paper presents research to locate metadata of interest within a range of values. To meet this goal,
           we leverage the use of XML in metadata description for scientific datasets, specifically the NeXus datasets generated
           by the SNS scientists. We have designed a scalable indexing structure for processing data reduction queries. Web
            semantics and ontology based methodologies are also employed to provide an elegant, intuitive, and powerful
            free-form, query-based data reduction interface to end users.

      Handling Recoverable Temporal Violations in Scientific Workflow Systems: A Workflow Rescheduling
44    Based Strategy

           Due to the complex nature of scientific workflow systems, the violations of temporal QoS constraints often take place
           and may severely affect the usefulness of the execution’s results. Therefore, to deliver satisfactory QoS, temporal
           violations need to be recovered effectively. However, such an issue has so far not been well addressed. In this paper,
           we first propose a probability based temporal consistency model to define the temporal violations which are
           statistically recoverable by light-weight exception handling strategies. Afterwards, a novel Ant Colony Optimisation
           based two-stage workflow local rescheduling strategy (ACOWR) is proposed to handle detected recoverable temporal
           violations in an automatic and cost-effective fashion. The simulation results demonstrate the excellent performance of
            our handling strategy in reducing both local and global temporal violation rates.

45    High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand

           GridFTP, designed by using the Globus XIO framework, is one of the most popular methods for performing data
           transfers in the Grid environment. But the performance of GridFTP in WAN is limited by the relatively low
           communication bandwidth offered by existing network protocols. On the other hand, modern interconnects such as
            InfiniBand, with many advanced communication features such as zero-copy protocol and RDMA operations, can
           greatly improve communication efficiency. In this paper, we take on the challenge of combining the ease of use of the
           Globus XIO framework and the high performance achieved through InfiniBand communication, thereby natively
           supporting GridFTP over InfiniBand-based networks. The Advanced Data Transfer Service (ADTS), designed in our
           previous work, provides the low-level InfiniBand support to the Globus XIO layer. We introduce the concepts of I/O
           staging in the Globus XIO ADTS driver to achieve efficient disk based data transfers. We evaluate our designs in both
           LAN and WAN environments using microbenchmarks as well as communication traces from several real-world
           applications. We also provide insights into the communication performance with some in-depth analysis. Our
           experimental evaluation shows a performance improvement of up to 100% for ADTS-based data transfers as opposed
            to TCP- or UDP-based ones in LAN and high-delay WAN scenarios.

       Linear Combinations of DVFS-Enabled Processor Frequencies to Modify the Energy-Aware Scheduling
46    Algorithms

           The energy consumption issue in distributed computing systems has become quite critical due to environmental
           concerns. In response to this, many energy-aware scheduling algorithms have been developed primarily by using the
           dynamic voltage-frequency scaling (DVFS) capability incorporated in recent commodity processors. The majority of
           these algorithms involve two passes: schedule generation and slack reclamation. The latter is typically achieved by
           lowering processor frequency for tasks with slacks. In this paper, we revisit this energy reduction technique from a
           different perspective and propose a new slack reclamation algorithm which uses a linear combination of the maximum
           and minimum processor frequencies to decrease energy consumption. This algorithm has been evaluated based on
           results obtained from experiments with three different sets of task graphs: 1,500 randomly generated task graphs, and


           300 task graphs of each of two real-world applications (Gauss-Jordan and LU decomposition). The results show that
            the amount of energy saved by the proposed algorithm is 13.5%, 25.5% and 0.11% for the random, LU decomposition and
            Gauss-Jordan task graphs, respectively; the corresponding figures for the reference DVFS-based algorithm are 12.4%,
            24.6% and 0.1%.
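The linear-combination idea can be sketched as follows: rather than running a task at one reduced frequency, split its allotted time between f_max and f_min so the required cycles finish exactly when the slack ends. The frequencies and cycle counts below are illustrative, not taken from the paper's experiments:

```python
def split_frequencies(cycles, deadline, f_max, f_min):
    """Return (time at f_max, time at f_min) such that `cycles` processor
    cycles complete exactly at `deadline`. Solves
        t_hi * f_max + (deadline - t_hi) * f_min = cycles
    for t_hi."""
    t_hi = (cycles - deadline * f_min) / (f_max - f_min)
    if not 0 <= t_hi <= deadline:
        raise ValueError("deadline unreachable with this frequency pair")
    return t_hi, deadline - t_hi

# A task needing 3e9 cycles with a 2 s slot on a 2 GHz / 1 GHz processor:
# run 1 s at 2 GHz and 1 s at 1 GHz, filling the slack exactly.
t_hi, t_lo = split_frequencies(3e9, 2.0, 2e9, 1e9)
```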

      23.97GHz CMOS Distributed Voltage Controlled Oscillators with Inverter Gain Cells and Frequency Tuning
47    by Body Bias and MOS Varactors Concurrently

            Tunable VCOs operating around 24GHz in 0.18µm CMOS are reported. Simple CMOS inverters are used as gain stages,
            and tuning is achieved with a novel method using both body bias and MOS varactors concurrently, with the two
            approaches compared for performance. The novel tuning method allows for a wider tuning range than a single method.
            Here, forward body bias (FBB) tuning of p-FETs has 9-10 times higher tuning bandwidth than MOS varactor tuning when
            the latter is connected in series (before the output collection point), but equal or nearly equal tuning when the
            varactor pair is connected in parallel (to the drain transmission line). Six monolithically integrated novel
            distributed voltage-controlled oscillators (D-VCOs) with a novel gain cell comprising a CMOS inverter are designed.
            The top-layer metal is used for coplanar waveguide (CPW) on-chip inductors. The first D-VCO, OSC-1, has 3 stages of
            the gain cell and oscillates at 23.97GHz; the second D-VCO, OSC-2, has 4 stages and oscillates at 18.64GHz; both
            K-band oscillators use body-bias variation of p-FETs for wide frequency tuning. For further tuning beyond body-bias
            tuning, MOS varactors are added in series to OSC-1 and OSC-2, resulting in designs OSC-3 and OSC-4 respectively,
            and in parallel, resulting in designs OSC-3a and OSC-4a. OSC-3 oscillates at 23.53GHz and OSC-4 at 18.09GHz. OSC-3a
            oscillates at 22.79GHz with 340MHz tuning by each of the two techniques (doubling the tuning bandwidth, as total
            tuning is 680MHz). OSC-4a oscillates at 17.77GHz (yielding a Ku-band VCO from a K-band design for substantial
            design reuse) with 240MHz tuning by FBB and 200MHz tuning by the varactor pair (total tuning of 440MHz). The phase
            noise is reported at 1MHz offset from the carrier; for example, it is -102.4dBc/Hz for the 18.64GHz D-VCO. These
            oscillators emit very low power in the 2nd and 3rd harmonics.

