SETraining-Optimizations

The Envelope/Payload pattern



When to use it
How to implement it
Some optimizations
Problem Introduction
MyEntry (an Entry)
   public Long attr1;
   public Long attr2;
   public Long attr3;
   …
   public Long attrN;
Memory Utilization & Performance



[Charts: "Memory Utilization (Embedded)" plots the number of objects in a 64m JVM against the number of attributes; "Write Period (Remote)" plots load time (ms) against the number of attributes per entry. Series: Raw Attributes.]




An entry with many attributes uses a lot of memory inside the space. This is a critical issue at PayPal. Memory usage is proportional to the number of attributes.
Envelope/Payload Pattern – reducing the attributes exposed to the space

Envelope (an Entry)
   public Long attr1;
   public Serializable payload;

MyEntry (Serializable)
   public Long attr1;
   public Long attr2;
   public Long attr3;
   …
   public Long attrN;
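
The refactoring in the diagram can be sketched in plain Java. This is a minimal illustration rather than the deck's actual code: the class and field names follow the slide, but the space's Entry interface is left out so the sketch compiles on its own, and it assumes attr1 is the only attribute the space needs to see. The constructor and EnvelopeDemo are hypothetical additions.

```java
import java.io.Serializable;

// Payload: all N attributes live here, hidden from the space.
class MyEntry implements Serializable {
    private static final long serialVersionUID = 1L;
    public Long attr1;
    public Long attr2;
    public Long attr3;
    // further attributes up to attrN are elided, as on the slide
}

// Envelope: the only class whose attributes the space sees.
// Assumption: in a real deployment it would also implement the space's
// Entry interface, which is omitted to keep the sketch self-contained.
class Envelope {
    public Long attr1;            // copied out of the payload so it stays visible
    public Serializable payload;  // opaque to the space

    public Envelope() { }         // public no-arg constructor

    public Envelope(MyEntry entry) {
        this.attr1 = entry.attr1;
        this.payload = entry;
    }
}

class EnvelopeDemo {
    public static void main(String[] args) {
        MyEntry entry = new MyEntry();
        entry.attr1 = 1L;
        entry.attr2 = 2L;
        entry.attr3 = 3L;

        Envelope envelope = new Envelope(entry);
        // The space would index two attributes instead of N.
        System.out.println("attr1 exposed to the space: " + envelope.attr1);
        System.out.println("attr3 read from the payload: " + ((MyEntry) envelope.payload).attr3);
    }
}
```

However many attributes MyEntry grows to, the envelope keeps exposing only attr1 and payload.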
Raw Entry vs Envelope



[Charts: "Memory Utilization (Embedded)" plots the number of objects in a 64m JVM against the number of attributes; "Write Period (Remote)" plots load time (ms) against the number of attributes per entry. Series: Raw Attributes, Simple Envelope.]

Fewer attributes means better memory utilization.

The envelope's payload is serialized and deserialized a lot; performance is inversely proportional to the number of attributes in this class.
Serialization differences
•   So why does the serialization of the payload cost so much, yet we don’t see
    that with the raw attribute entry?
•   This is because of the different serialization modes – the raw attribute entry
    uses GigaSpaces light serialization (serialization-type 0), whereas the
    simple envelope/payload uses full Java serialization.
•   Solution – reduce the serialization of the envelope and payload by
    implementing Externalizable
Reduce serialization costs through Externalizable
Envelope (an Entry)
   public Long attr1;
   public Serializable payload;

MyEntry (Externalizable)
   public Long attr1;
   public Long attr2;
   public Long attr3;
   …
   public Long attrN;
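
To illustrate why Externalizable reduces the cost, the sketch below serializes the same three values twice with standard java.io: once through default Java serialization and once through hand-written writeExternal/readExternal. The class names SerializablePayload, ExternalizablePayload and SerializationUtil are hypothetical, the attributes are assumed to be non-null, and the numbers it prints say nothing about GigaSpaces itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Payload using default Java serialization: field metadata and boxed
// Long objects are written to the stream.
class SerializablePayload implements Serializable {
    private static final long serialVersionUID = 1L;
    public Long attr1, attr2, attr3;
}

// Same data, but the class writes its own compact format.
class ExternalizablePayload implements Externalizable {
    public Long attr1, attr2, attr3;

    public ExternalizablePayload() { }   // Externalizable needs a public no-arg constructor

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // Assumption: attributes are never null; otherwise a null flag is needed.
        out.writeLong(attr1);
        out.writeLong(attr2);
        out.writeLong(attr3);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        attr1 = in.readLong();
        attr2 = in.readLong();
        attr3 = in.readLong();
    }
}

class SerializationUtil {
    static byte[] toBytes(Object o) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(o);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializablePayload s = new SerializablePayload();
        s.attr1 = 1L; s.attr2 = 2L; s.attr3 = 3L;
        ExternalizablePayload x = new ExternalizablePayload();
        x.attr1 = 1L; x.attr2 = 2L; x.attr3 = 3L;
        System.out.println("default serialization bytes: " + toBytes(s).length);
        System.out.println("externalizable bytes:        " + toBytes(x).length);
    }
}
```

The Externalizable version writes three raw long values plus a class descriptor with no field list, which is why its stream is smaller and cheaper to produce.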
Externalizable results



[Charts: "Memory Utilization (Embedded)" plots the number of objects in a 64m JVM against the number of attributes; "Write Period (Remote)" plots load time (ms) against the number of attributes per entry. Series: Raw Attributes, Simple Envelope, SimpleEnvelopeExternalizable.]

Much improved performance, although there is still a cost to pay for serialization to and from objects.
Reduce representation – minimize deserialization costs inside space

ByteArrayEnvelope (an Entry, Externalizable)
   public Long attr1;
   public byte[] payload;
   private MyEntry myEntry;
   public MyEntry getMyEntry()

MyEntry
   public Long attr1;
   public Long attr2;
   public Long attr3;
   …
   public Long attrN;
   public byte[] pack()
   public void unpack(byte[])

Serialization writes myEntry out as a byte array (payload). Deserialization restores only the payload bytes. myEntry is reconstructed only when the client calls getMyEntry().
Externalizing the envelope
•   writeExternal
     – Pack the myEntry into the payload attribute
     – Set myEntry to null
     – Write out only the attr1 and payload attributes
•   readExternal
     – Read in only attr1 and payload
     – IGNORE myEntry (i.e. leave it null)
•   getMyEntry
     – Set myEntry to an initial MyEntry object
     – Unpack() payload into myEntry
     – Set payload to null
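
The three steps above can be sketched as follows. This is an illustrative sketch, not the deck's code: it assumes three non-null Long attributes, uses a ByteBuffer for pack() and unpack(), and leaves out the space's Entry interface so it compiles standalone. The null check in writeExternal and the ByteArrayEnvelopeDemo helper are additions made here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.nio.ByteBuffer;

class MyEntry {
    public Long attr1, attr2, attr3;   // attrN elided, as on the slide

    public byte[] pack() {
        return ByteBuffer.allocate(3 * Long.BYTES)
                .putLong(attr1).putLong(attr2).putLong(attr3).array();
    }

    public void unpack(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        attr1 = buf.getLong();
        attr2 = buf.getLong();
        attr3 = buf.getLong();
    }
}

class ByteArrayEnvelope implements Externalizable {
    public Long attr1;
    public byte[] payload;
    private MyEntry myEntry;

    public ByteArrayEnvelope() { }

    public ByteArrayEnvelope(MyEntry entry) {
        this.attr1 = entry.attr1;
        this.myEntry = entry;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        if (myEntry != null) {          // addition: skip if already packed
            payload = myEntry.pack();   // pack myEntry into payload
            myEntry = null;             // set myEntry to null
        }
        out.writeLong(attr1);           // write only attr1 and payload
        out.writeInt(payload.length);
        out.write(payload);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        attr1 = in.readLong();          // read only attr1 and payload
        payload = new byte[in.readInt()];
        in.readFully(payload);
        // myEntry is deliberately left null
    }

    public MyEntry getMyEntry() {
        if (myEntry == null && payload != null) {
            MyEntry entry = new MyEntry();
            entry.unpack(payload);      // rebuild the object on demand
            myEntry = entry;
            payload = null;
        }
        return myEntry;
    }
}

class ByteArrayEnvelopeDemo {
    // Simulates a trip through a remote space with plain Java serialization.
    static ByteArrayEnvelope roundTrip(ByteArrayEnvelope envelope) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(envelope);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                return (ByteArrayEnvelope) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MyEntry entry = new MyEntry();
        entry.attr1 = 1L; entry.attr2 = 2L; entry.attr3 = 3L;
        ByteArrayEnvelope copy = roundTrip(new ByteArrayEnvelope(entry));
        System.out.println("still packed after read: " + (copy.payload != null));
        System.out.println("attr3 after getMyEntry(): " + copy.getMyEntry().attr3);
    }
}
```

Until getMyEntry() is called, the receiving side holds only attr1 and a byte array, so the space itself never pays for rebuilding MyEntry.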
ByteArrayEnvelope measurements



[Charts: "Memory Utilization (Embedded)" plots the number of objects in a 64m JVM against the number of attributes; "Write Period (Remote)" plots load time (ms) against the number of attributes per entry. Series: Raw Attributes, Simple Envelope, SimpleEnvelopeExternalizable, ByteArrayEnvelope.]
Embedded Gotcha! Be careful


ByteArrayEnvelope
   public Long attr1;
   public byte[] payload;
   private MyEntry myEntry;
   public MyEntry getMyEntry()

If this is used with an embedded space, no serialization occurs: the payload is null AND myEntry is null.

Make myEntry public, and ensure only one of payload or myEntry is non-null at a time.
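
A minimal sketch of the fix, under stated assumptions: the embedded space is modeled here by a hypothetical copyPublicFields helper that copies public fields by reflection without serializing anything, which is one way to read the slide but is not confirmed by it. SafeByteArrayEnvelope and packForWire are also hypothetical names; packForWire stands in for what writeExternal would do on the remote path.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

class MyEntry {
    public Long attr1, attr2;   // attrN elided

    public byte[] pack() {
        return ByteBuffer.allocate(2 * Long.BYTES).putLong(attr1).putLong(attr2).array();
    }

    public void unpack(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        attr1 = buf.getLong();
        attr2 = buf.getLong();
    }
}

class SafeByteArrayEnvelope {
    public Long attr1;
    public byte[] payload;   // non-null only after the serialization path ran
    public MyEntry myEntry;  // public, so a field-by-field copy carries it along

    public SafeByteArrayEnvelope() { }

    public SafeByteArrayEnvelope(MyEntry entry) {
        this.attr1 = entry.attr1;
        this.myEntry = entry;
    }

    // Remote path: switch to the packed form, keeping only one field non-null.
    public void packForWire() {
        if (myEntry != null) {
            payload = myEntry.pack();
            myEntry = null;
        }
    }

    // Works for both paths: unpack if packed, otherwise return the live object.
    public MyEntry getMyEntry() {
        if (myEntry == null && payload != null) {
            MyEntry entry = new MyEntry();
            entry.unpack(payload);
            myEntry = entry;
            payload = null;
        }
        return myEntry;
    }
}

class EmbeddedCopy {
    // Hypothetical stand-in for an embedded space: copies public fields only.
    static <T> T copyPublicFields(T src, T dst) {
        try {
            for (Field f : src.getClass().getFields()) {
                f.set(dst, f.get(src));
            }
            return dst;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MyEntry entry = new MyEntry();
        entry.attr1 = 5L; entry.attr2 = 6L;
        SafeByteArrayEnvelope stored = copyPublicFields(
                new SafeByteArrayEnvelope(entry), new SafeByteArrayEnvelope());
        System.out.println("embedded copy keeps myEntry: " + (stored.getMyEntry() != null));
    }
}
```

With myEntry private, the same field copy would carry neither payload nor myEntry, which is the failure the slide warns about.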
Envelope/Payload optimizations
•   Issue – poor memory utilization in the space caused by a large number of
    attributes
•   Solution – refactor the entry into envelope/payload, reducing the number of
    attributes exposed to the space
•   Issue – poor performance due to payload deserialization
•   Solution – change envelope/payload to use a byte array representation, and
    only deserialize to the real payload in the client
Distributed Queries in a Partitioned Workspace
•   With an IWorker, you have to write the requests to all the spaces – this
    needs a special piece of code that knows how to write to every partition
•   Replace the IWorker with an IFilter that triggers on a read/takeMultiple – now
    there's no special "all partitions" code because the read/takeMultiple is
    naturally performed against all partitions by the proxy!
•   Example on the Wiki:
    http://192.168.10.6:8080/display/TECHF/Custom+Query+Pattern (Custom
    Query Pattern)
IWorker Distributed Query
[Diagram: the Master's Query Distributor exchanges messages with an IWorker in each of three partitions, in four numbered steps (1 to 4) per partition. Callout: WARNING, complex code!]
IFilter Distributed Query (Custom Query Pattern)

[Diagram: the Master issues a single operation (step 1) that reaches an IFilter in each of three partitions.]
Here’s how simple this is:

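 // Template selecting the entries to aggregate: Entity objects with flags <= 5
 SQLQuery template = new SQLQuery(Entity.class.getName(),"flags <= 5");
 // SumTask belongs to the Custom Query Pattern example on the wiki (not shown here)
 SumTask sumTask = new SumTask("accountNumber", template);
 // A single readMultiple through the proxy is performed against all partitions
 Entry[] results = space.readMultiple(sumTask,null,Integer.MAX_VALUE);
 // Combine the returned results on the client
 sumTask.executeOnClient(results);
 System.out.println(" ---> Total Sum = " + sumTask.getTotalSum());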
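
The idea behind the snippet above, one query reaching every partition followed by a client-side combine, can be simulated in plain Java without any space API. Everything in this sketch is hypothetical: the source does not show SumTask or say whether each partition returns partial sums or matching entries, so the sketch simply assumes partial sums. Entity, SumQuerySketch and its methods are illustrative names only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Entity class used in the query.
class Entity {
    public final long accountNumber;
    public final int flags;

    Entity(long accountNumber, int flags) {
        this.accountNumber = accountNumber;
        this.flags = flags;
    }
}

class SumQuerySketch {
    // Role of the per-partition filter: sum the matching entries locally.
    static long sumInPartition(List<Entity> partition, int maxFlags) {
        long sum = 0;
        for (Entity e : partition) {
            if (e.flags <= maxFlags) {
                sum += e.accountNumber;
            }
        }
        return sum;
    }

    // Role of executeOnClient: combine one result per partition.
    static long combineOnClient(List<Long> partialSums) {
        long total = 0;
        for (long partial : partialSums) {
            total += partial;
        }
        return total;
    }

    // Role of the proxy: the same request goes to every partition.
    static long query(List<List<Entity>> partitions, int maxFlags) {
        List<Long> partials = new ArrayList<>();
        for (List<Entity> partition : partitions) {
            partials.add(sumInPartition(partition, maxFlags));
        }
        return combineOnClient(partials);
    }

    public static void main(String[] args) {
        List<Entity> p1 = Arrays.asList(new Entity(100, 1), new Entity(200, 9));
        List<Entity> p2 = Arrays.asList(new Entity(300, 5), new Entity(400, 6));
        System.out.println(" ---> Total Sum = " + query(Arrays.asList(p1, p2), 5));
    }
}
```

The caller never enumerates partitions itself; in the real pattern that fan-out is the proxy's job, which is what removes the special "all partitions" code.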

				