Tenant Partitioning and Load Balancing in Stratos

{Service, Tenant, and Instance aware} Load Balancing

                                        Feb 2011
Tenants, Services, and Instances
          A single instance may host multiple services.
          Sticky sessions – many products need sessions to be sticky.
Architecture – Shared Nothing

✔ Parallel array of load balancers (LBs), load balanced by DNS round robin
✔ Each LB node can route any message.

✔ Every LB keeps the full routing state.

✔ LB logic can be described as two functions:

         ✔   (service, tenant, {other factors .. partitions}) → {node set}
         ✔   {node set} → node
✔   Hashmap in each load balancer, as an efficient routing table:
         ✔   {Service, Tenant, Partition*} → concatenated as a string key → relevant nodes.
         ✔   Clean the hashmap when a node has failed or shut down.
         ✔   The client IP remains the same for a single request, hence the client IP is used as the routing parameter.
         ✔   Sticky routing: tenant ID → node ID : {tenant ID, client IP} → node ID
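The routing table described above can be sketched roughly as follows. This is a minimal illustration, not the Stratos implementation: the class name `RoutingTable`, the function `pick_node`, the `|` key delimiter, and the SHA-256 modulo selection are all assumptions made for the example.

```python
import hashlib


class RoutingTable:
    """Per-LB routing table: concatenated string key -> list of node IDs."""

    def __init__(self):
        self._table = {}

    @staticmethod
    def make_key(service, tenant, *partitions):
        # Concatenate {service, tenant, partition*} into one string key.
        # The "|" delimiter is an assumption of this sketch.
        return "|".join((service, tenant) + partitions)

    def add_node(self, service, tenant, node, *partitions):
        key = self.make_key(service, tenant, *partitions)
        nodes = self._table.setdefault(key, [])
        if node not in nodes:
            nodes.append(node)

    def node_set(self, service, tenant, *partitions):
        # Function 1: (service, tenant, partitions) -> node set
        key = self.make_key(service, tenant, *partitions)
        return list(self._table.get(key, []))

    def remove_node(self, node):
        # Clean the table when a node has failed or shut down.
        for key in list(self._table):
            remaining = [n for n in self._table[key] if n != node]
            if remaining:
                self._table[key] = remaining
            else:
                del self._table[key]


def pick_node(nodes, client_ip):
    # Function 2: node set -> node, using the client IP as the parameter.
    if not nodes:
        raise LookupError("no nodes available for this key")
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]
```

With this sketch, `pick_node(table.node_set("esb", "acme.com"), "10.0.0.7")` returns the same node for the same client IP as long as the node set is unchanged. Plain modulo selection reshuffles many clients when the node set changes, which is the problem robust or consistent hashing is meant to reduce.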
✔   Sticky sessions - use something unique about the session:
             the session ID as the argument to a hash function that always maps requests from that
               session to the same node.
         ✔   This is not done by keeping state in the LB - session data replication across nodes
               does not scale.
         ✔   A hash function maps requests in the same session to the same node.
                  • Robust hash routing - handles server additions and failures.
                  • Consistent hashing (used by Cassandra, Dynamo, Akamai)
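To illustrate why consistent hashing copes with server additions and failures, here is a small sketch of a hash ring. It is a generic textbook-style version under assumptions of this example (the `HashRing` name, SHA-256 as the hash, 100 virtual points per node), not the variant any particular product uses.

```python
import bisect
import hashlib


def _hash(value):
    # 64-bit integer derived from SHA-256; the hash choice is an assumption.
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring mapping session IDs to nodes."""

    def __init__(self, nodes=(), replicas=100):
        self._replicas = replicas  # virtual points per node
        self._points = []          # sorted hash points on the ring
        self._owner = {}           # hash point -> node ID
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self._replicas):
            point = _hash(f"{node}#{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        # Drop only the points owned by the removed node.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, session_id):
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the session's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(session_id)) % len(self._points)
        return self._owner[self._points[idx]]
```

When a node is removed, only the sessions that were mapped to it move to another node; sessions on the surviving nodes keep their mapping, and no per-session state is stored in the LB.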
✔ For the first request in a session, pick a node from the hashmap and send
   the request to that node. The LB also rewrites the SESSION_ID in the response,
   appending the selected node to the session ID.
✔ For any subsequent request, find the node by parsing the session ID.
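The session ID rewriting above might look like the following sketch. The `.` separator and both function names are assumptions for illustration; the slides do not specify the actual encoding, and this version requires node IDs that do not contain the separator.

```python
SEPARATOR = "."  # assumed delimiter between session ID and node ID


def tag_session_id(session_id, node_id):
    """Rewrite the response session ID by appending the selected node."""
    if SEPARATOR in node_id:
        raise ValueError("node ID must not contain the separator")
    return f"{session_id}{SEPARATOR}{node_id}"


def node_from_session_id(session_id):
    """Parse the node out of a rewritten session ID; None if it is untagged."""
    base, sep, node_id = session_id.rpartition(SEPARATOR)
    if not sep or not base or not node_id:
        return None
    return node_id
```

For example, a backend session ID `A1B2C3` served by `node-2` would go back to the client as `A1B2C3.node-2`, and any LB in the array can recover `node-2` from it without shared session state.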
Tenant Partitioning
  A tenant may have multiple partitions.
       Tenants define the partitions.
       Geographies are handled through the partitions.
    When a new instance is added, notify all the LB instances.
  Autoscaling (starting and terminating instances based on the
load) is moved out of the LB.
 A service interface is added to the LB, so that external components
such as the Autoscaler can assign nodes to the LB.
      Key in the load balancer algorithm.
      Whether the tenant has been loaded onto a node.
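One way to picture the service interface and the "notify all LB instances" step is the sketch below. The class and method names (`LoadBalancer.assign_node`, `Autoscaler.instance_started`, and so on) are hypothetical, and direct method calls stand in for whatever remote interface the LB would really expose.

```python
class LoadBalancer:
    """One LB in the shared-nothing array; holds its own copy of the routes."""

    def __init__(self, name):
        self.name = name
        self.routes = {}  # (service, tenant, partition) -> set of node IDs

    # Hypothetical service interface for external components.
    def assign_node(self, service, tenant, partition, node):
        self.routes.setdefault((service, tenant, partition), set()).add(node)

    def unassign_node(self, node):
        for key in list(self.routes):
            self.routes[key].discard(node)
            if not self.routes[key]:
                del self.routes[key]

    def is_tenant_loaded(self, service, tenant, partition):
        # Whether any node currently serves this tenant partition.
        return bool(self.routes.get((service, tenant, partition)))


class Autoscaler:
    """External component that starts and stops instances, outside the LB."""

    def __init__(self, load_balancers):
        self.load_balancers = list(load_balancers)

    def instance_started(self, service, tenant, partition, node):
        # Notify every LB so that each one can route to the new node.
        for lb in self.load_balancers:
            lb.assign_node(service, tenant, partition, node)

    def instance_terminated(self, node):
        for lb in self.load_balancers:
            lb.unassign_node(node)
```

Because every LB receives the same update, any of them can route a request for the tenant partition right after the instance is announced.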
Tenant Partitioning
    Tenants are loaded on demand and assigned to a cluster.
    State replication scope
  Unloading unused tenants.
  Notifying all the load balancers when a tenant is loaded,
    by adding the load balancers to a group communication
    group and publishing a message to the group when a tenant is loaded.
    Keeping track of open slots and assigning tenants - bin packing?
  Data partitioning - data about tenants
       Round Robin
    Always running an additional node to quickly load a new tenant.
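The slide raises bin packing only as a question. As one possible reading, the sketch below tracks open slots per node and places tenants first-fit. The `SlotAllocator` name, the fixed slot count per node, one slot per tenant, and the spare-node check are all assumptions of this example.

```python
class SlotAllocator:
    """First-fit assignment of tenants to nodes with a fixed number of slots."""

    def __init__(self, slots_per_node):
        self.slots_per_node = slots_per_node
        self.tenants_by_node = {}  # node ID -> list of loaded tenants

    def add_node(self, node):
        self.tenants_by_node.setdefault(node, [])

    def open_slots(self, node):
        return self.slots_per_node - len(self.tenants_by_node[node])

    def assign(self, tenant):
        # A tenant that is already loaded stays where it is.
        for node, tenants in self.tenants_by_node.items():
            if tenant in tenants:
                return node
        # First fit: the first node (in insertion order) with an open slot.
        for node, tenants in self.tenants_by_node.items():
            if len(tenants) < self.slots_per_node:
                tenants.append(tenant)
                return node
        return None  # no open slot; the caller would start another node

    def unload(self, tenant):
        # Unload an unused tenant and free its slot.
        for tenants in self.tenants_by_node.values():
            if tenant in tenants:
                tenants.remove(tenant)
                return True
        return False

    def needs_spare_node(self):
        # True when no node is completely empty, i.e. no idle spare is running.
        return all(self.tenants_by_node.values())
```

With two slots per node and nodes `node-1` and `node-2`, the first two tenants land on `node-1` and the third on `node-2`. At that point `needs_spare_node()` reports that an additional empty node should be started so a new tenant can be loaded quickly.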
Thank you ..