

Optimum Utilization of Lab Resources Using Grid Computing
Harshit Kharbanda, Anand Narayan

Abstract: In this era of cloud computing, grid computing is slowly becoming obsolescent, but there are still certain areas where grid computing can yield profound results. The resources in a university lab are not properly utilized; moreover, the programs that are run in the university labs do not require many resources. It is proposed to create a local grid in the labs so that the lab resources can be used efficiently for running big and unwieldy applications.

We propose to create a grid manager with an algorithm designed by us, which will split the computer resources between all the buildings of the university. The grid will be formed over the intranet, thereby making it fast and efficient. The algorithm in the grid manager gives the grid administrator the flexibility to divide the resources in the pool as required between various schools.

Comparing the performance and resource utilization of the labs before and after the grid with the grid manager will yield contrasting results.

I. INTRODUCTION

The university lab computers are not utilized properly by the students, and lab resources are being wasted. Moreover, all the computers in the labs are always in standby mode, thereby wasting power. Conservation of power is one of the prime goals of the Indian government, and wastage of power in this era of power shortage is not acceptable. Using the concept of grid computing, optimum utilization of the resources can be achieved. University labs, which are not known for their capability to handle cumbersome and unwieldy tasks, can be used for this very purpose. Many research areas in the universities require processing capabilities that the labs do not offer; using this concept, processing power can be distributed according to need.

II. AIM OF THE PAPER

The aim of the paper is to connect the computers in the labs to a local grid and to propose an algorithm for the grid manager that can utilize the resources at hand effectively.

1. All the lab computers will be connected in a local grid using the intranet, thereby making the communication faster and more efficient.
2. A grid manager would be set up; the algorithm designed would be implemented by the grid manager. The grid manager will use the available resources to the maximum level.
3. The grid manager will have a grid administrator whose job would be to distribute the resources among the different labs; the administrator will be able to assign more resources to a particular lab for a particular time, thereby increasing the processing capability of that lab manifold.

III. SCENARIO AFTER IMPLEMENTATION OF THE GRID

Once the grid is set up with the grid manager, the lab computers will never stay idle. Power consumption will be reduced, and labs will be used to run cumbersome

Research work involving image processing and pattern matching could easily be undertaken by the university. Such research work requires a lot of processing power, which otherwise only supercomputers can offer. Supercomputers use multiple cores and are highly expensive; setting up this grid obviates the need for supercomputers.

University labs will be used not only for normal classes; students would also want to use the immense capabilities of the computers offered to them in the labs. Classes can be held to teach demanding applications such as Adobe Photoshop, which is one of the many applications that require a lot of processing capability.

The university can use this grid for its own purposes, such as data mining and database query processing in optimum time.
Why Grid Computing?

Computer science has grown to such an extent that intensive applications which require the processing power of a supercomputer have become very common. Processors that can accommodate this level of computing are very costly. So a mechanism of using the resources at hand had to be

What is Grid Computing?

Grid computing is the combination of the resources available at hand; the combined processing power is used to tackle applications with high processing demands.

Grid computing is distributed cluster computing. Its size may vary depending upon the resources available at hand. Grid computing also leverages the concept of parallel

Grids have the advantage that we can pool resources from computers belonging to different individuals or organisations. Grids can also include computers which are geographically far from each other.

Public systems, those crossing administrative domains (including different departments in the same organisation), often need to run on heterogeneous systems, using different operating systems and hardware architectures. With many languages, there is a trade-off between investment in software development and the number of platforms that can be supported (and thus the size of the resulting network). Cross-platform languages can reduce the need to make this trade-off, though potentially at the expense of high performance on any given node (due to run-time interpretation or lack of optimisation for the particular platform).

Various middleware projects have created generic infrastructure to allow diverse scientific and commercial projects to harness a particular associated grid, or for the purpose of setting up new grids.

The middleware can be seen as a layer between the hardware and the software. On top of the middleware, a number of technical areas have to be considered, and these may or may not be middleware independent. Example areas include SLA management, trust and security, virtual organisation management, license management, portals and data management. These technical areas may be taken care of in a commercial solution, though the cutting edge of each area is often found within specific research projects examining the

IV. CPU-SCAVENGING

Creating a grid from the unused resources in a network of participants, no matter the geographic distance, is the process of CPU-scavenging or shared computing or cycle stealing or

Typically this technique uses the instruction cycles that would otherwise be wasted at night, during lunch, or even in the scattered seconds throughout the day when the computer is waiting for user input or slow devices.

Volunteer computing projects use the CPU-scavenging model almost exclusively.

In practice, participating computers also donate some supporting amount of disk storage space, RAM, and network bandwidth, in addition to raw CPU power. Since nodes are likely to go "off-line" from time to time, as their owners use their resources for their primary purpose, this model must be designed to handle such contingencies.

Grid computing is similar to cluster computing, but there are a number of distinct differences. In a grid, there is no centralised management; computers in the grid are independently controlled, and can perform tasks unrelated to the grid at the operator's discretion. The computers in a grid are not required to have the same operating system or hardware. Grids are also usually loosely connected, often in a decentralised network, rather than contained in a single location, as computers in a cluster often are.

Grid computing is a predecessor of cloud computing. The term cloud computing became popular around 2007. Grid computing is a layer in the stack of a
V. FACTS AND ADVANTAGES OF THE GRID

Grids make research projects possible that formerly were impractical or unfeasible due to the physical location of vital resources.

Using a grid, researchers in Great Britain, for example, can conduct research that relies on databases across Europe, instrumentation in Japan, and computational power in the United States. Making resources available in this way exposes students to the tools of the profession, facilitating new possibilities for research and instruction, particularly at the undergraduate level.

VI. SOME CHALLENGES TO THE GRID

Security is a major problem when it comes to grid computing. Imagine a major IT firm sharing its resources with a student for research purposes. The firm cannot know whether a black hat hacker will get into its resources and steal important information. Similarly, directors of research projects will be reluctant to take advantage of the potential of the grid without assurances that the integrity of the project, its data, and its participants will be protected.

Another challenge facing grids is the complexity of building middleware structures that can knit together collections of resources to work as a unit across network connections that often span oceans and continents. Scheduling the availability of IT resources connected to a grid can also present new challenges to organisations that manage those resources. Increasing standardisation of protocols addresses some of the difficulty in creating smoothly functioning

VII. FUTURE OF THE GRID

Today the number of functional grids is small. But as the speed of networks grows, so will the number of grids, especially in the research community and higher education. New opportunities will arise in the computing domain that expose students to the tools and applications directly related to their studies and reduce the time taken to compute highly complex, data-intensive jobs. While there are obvious advantages to solving a problem through a grid, certain day-to-day applications like image editing with Photoshop also will be one of the trade-offs for the Grid

VIII. GRID ENGINE

We need a master to manage all the resources. Which resource is to be used at which time, and which resource is to be kept idle at which time: all such decisions are taken by the Grid Engine.

The Grid Engine manages the resources in the grid and handles accepting, scheduling, dispatching, and managing the remote and distributed execution of large numbers of standalone, parallel or interactive user jobs. It also manages and schedules the allocation of distributed resources such as processors, memory, disk space, and software licenses.

With the help of a grid manager we can optimise file transmission within the grid; certain grid engines use a protocol called the Grid File Transfer Protocol (GridFTP).

Servers which host files to be downloaded work on the mirroring mechanism, and these servers work on grids. Certain grid management tools help with mirroring technology; the best known is GDMP (Grid Data Mirroring Package). This replication tool also provides a simple interface to Mass Storage Systems (MSS).

Grid managers also manage the security of the grid. This is called the GSI (Grid Security Infrastructure). It is an overall security package for messaging as well as file transfer.

GridFTP allows for secure file transmission from one server to another. Since large amounts of data are stored not on disks but on Mass Storage Systems (MSS), the grid management tools have to provide a mechanism for transferring files between storage elements and MSS.

With GDMP the grid administrator can control file transfers via local catalogues and automatically replicate back-ends for actions that are performed on a scheduled basis.

IX. THE ALGORITHM WITHIN THE GRID MANAGER

The grid manager is the most vital part of the grid. As explained above, the entire efficiency of the grid depends on the grid manager. The algorithm designed by us resides
inside the grid and uses the resources in the best possible way.

X. THE ALGORITHM

The algorithm assumes that all the resources available to it have equal processing capabilities. This is specifically true of university labs, where all the computers in a particular lab have the same processing capabilities. Hence this algorithm is designed for lab-like situations and yields the best results when the conditions are similar to those in university labs.

The algorithm has two parts:

Part I

In the first part of the algorithm it is assumed that no nodes are occupied; that is, all the terminals are empty and nobody is logged into the lab.

  1. Assume that there are a total of n nodes in the lab available for use. When the first user logs into a terminal, he is assigned F = x*n resources to carry out his work.
  2. Subsequently the number of resources available in the pool is n - (x*n).
  3. When another user enters the lab and logs into a terminal, he is assigned x*(n-F) resources to carry out his work.
  4. The priority number is decided based on the time for which the user has been logged onto his machine; the longer he has been logged in, the lower his priority number.
  5. This procedure continues, and the function F is called recursively, as long as the number of nodes occupied (logged in) is less than or equal to (14%)*n.

  x is:
  30% for n = 60;
  30% + y for n = n - z;
  30% + (y/i) for n = n - z/i;

  Where:  i = 2, 4, 6, 8, ...
          y = 5%
          z = 10

Part II

Part I of the algorithm works only while the number of occupied nodes is less than or equal to (14%*n). Once the number of occupied nodes becomes more than (14%*n), Part II of the algorithm has to be applied.

In Part II of the algorithm the pool does not consist of all the nodes available; in fact, the pool consists of the total number of nodes available for use minus the number of nodes in use. This methodology guarantees that every node which is logged in has at least its own processing power. Later, depending on the priority numbers assigned to the nodes, the remaining resources are divided by the algorithm:

  1. Assume the total number of nodes available for use is n and k nodes are logged in (occupied), where k > (14%)*n.
  2. The pool of resources will now be n-k, after allocating 1 resource to each node itself; that is, each node is allocated its own processing power in the beginning so that it can keep working even at minimal speed.
  3. Priority number 1 will now be assigned F = x*k more nodes for its processing; consequently it will have 1 + (x*k) nodes assigned to it to complete its job.
  4. Priority number 2 will be assigned x*(k-F) nodes to carry out its work, giving it 1 + (x*(k-F)) resources to complete its job. The algorithm will subsequently call F recursively for the other nodes logged on.
  5. Whenever a new node logs on to the server, Part II of the algorithm is reconstructed and started again, so that every node is given the minimum processing power to carry out its computation.
  6. The x used in Part II of the algorithm is the same as the x used in Part I.

XI. ASSIGNMENT OF PRIORITY NUMBER

The priority number of a node is assigned in accordance with the time for which it has been logged onto the server. The less time the computer has been logged onto the server, the higher its priority. The reasoning for this is that the more time a computer offers its resources to the pool, the higher its priority should be, simply because it has been contributing to the grid for longer.

XII. QUEUING

All the nodes, with the number of nodes assigned to them, are stored in a queue. This is done to manage the priority numbers of the computers efficiently. When a node with higher priority than the nodes already in the queue logs in, it is first assigned the number of nodes according to the number of its queue entry. Once it has been assigned the number of nodes, its position within the queue is swapped: each node in the queue is shifted, and the new node is placed in the queue according to its priority number. Hence this simple and very efficient method of queuing the nodes allows
proper priority assignment to the nodes and avoids discrepancies.

XIII. FEASIBILITY

The creation of a grid within a particular university is not a very cumbersome task. It requires good networking skills. The size of the grid depends upon the problem at hand. For a university, all the nodes can be used to form a local grid over the intranet, which makes the grid efficient and fast. Usually a grid involves a dedicated server, which in our case will be the grid manager. This server, like all servers, will have an administrator, whose task has already been described.

XIV. USE AND NEED

University labs are not considered a great source of computational power; much research work is hindered by the unavailability of this very computational power. The formation of the grid will help the university in numerous ways, some of them being:

     [1] Students' perspective towards university labs will change entirely; students would want to come to the labs to try out programs involving considerable processor consumption.
     [2] The grid formed in the university will encourage PhD students to undertake demanding research work. For instance, in the fields of biotechnology and bioinformatics, pattern matching is involved in almost all research work; the Perl language, which has extremely powerful regular expressions, is widely used for this purpose. This pattern matching requires immense processing power, which many universities are unable to provide because it is otherwise available only from a supercomputer, which is very expensive. Image processing is another field that can gain from this grid. These are only a few fields; once the grid is formed, research work will find its own uses of the
     [3] The university can use the grid for query processing of its huge database. It has been shown that when query processing is done using MPI libraries and the query is split, query processing is optimized and yields results faster. Instead of using MPI libraries, the local grid can be used for query processing.

This list consists of only a few uses which are very prominent and easily visible; once the grid is formed, more uses will surface. Computational power is of utmost importance and everybody wants to finish their jobs faster; the grid applied in the labs with the grid manager allows people to do just that.
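
As an illustration of the allocation rule described in Section X, the following Python sketch computes the shares handed out in Part I and Part II. It is one possible reading of the text and not the authors' implementation. The function names are hypothetical, and shares are left fractional because the paper does not say how they are rounded. The rule for the third and later users (each receives x times what is left) is an assumption extrapolated from steps 1 to 3. Part II follows the formulas literally (F = x*k, then x*(k-F), and so on) and does not cap the total at the pool size n-k.

```python
def successive_shares(base, x, count):
    """Give each successive user x times whatever is left of `base`."""
    remaining = float(base)
    shares = []
    for _ in range(count):
        share = x * remaining
        shares.append(share)
        remaining -= share
    return shares


def part_one(n, x, logged_in):
    """Part I: shares of the n lab nodes, valid while logged_in <= 14% of n."""
    if logged_in > 0.14 * n:
        raise ValueError("Part I applies only while occupied nodes <= 14% of n")
    return successive_shares(n, x, logged_in)


def part_two(n, x, k):
    """Part II: each occupied node keeps itself (1) plus an extra share.

    The list is ordered by priority number, starting with priority number 1.
    """
    if k <= 0.14 * n:
        raise ValueError("Part II applies only once occupied nodes > 14% of n")
    return [1 + extra for extra in successive_shares(k, x, k)]
```

For n = 60 and x = 30%, `part_one(60, 0.30, 3)` gives shares of roughly 18, 12.6 and 8.82 nodes, and `part_two(60, 0.30, 10)` starts with roughly 4.0 and 3.1.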

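
The queue of Section XII can be sketched with a sorted list. This is a minimal illustration only. It assumes that a smaller priority number is served first, and it leaves the derivation of the priority number from the login time to the caller. The class name, method names and node identifiers are hypothetical.

```python
import bisect


class NodeQueue:
    """Keeps (priority_number, node_id, assigned_nodes) entries sorted by priority number."""

    def __init__(self):
        self._entries = []

    def add(self, priority_number, node_id, assigned_nodes):
        # insort places the entry by priority number and shifts the later
        # entries one position back, which mirrors the shifting in Section XII.
        bisect.insort(self._entries, (priority_number, node_id, assigned_nodes))

    def order(self):
        """Return the node identifiers in queue order."""
        return [node_id for _, node_id, _ in self._entries]
```

A node that logs in with priority number 1 after nodes with priority numbers 2 and 3 ends up at the front of the queue, and the other two entries move back by one position.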