Supercharged Rabbit: Resource Management at High Speed in Erlang

Matthew Sackman, LShift

R M
• Resource management is important in any dæmon process. • Especially important when clients have the potential to

overwhelm the server by flooding the server with lots of data which the server is expected to hang on to and process.
• Rabbit is essentially expected to cope with a DDoS attack. • Erlang presents some unique challenges and solutions to

measuring and managing resources.
• Mainly manage memory usage, but additionally manage

usage of file descriptors.

AMQP and RabbitMQ

What is RabbitMQ?
• An AMQP broker.
• Written entirely in Erlang.
• Only about 17k lines of Erlang.

What is AMQP?
• Protocol for a dynamically configurable message routing platform.
• Written entirely by committee. Several times. Ongoing.
• Only about 14k lines of text. Sometimes inconsistent.
• Several interoperable implementations available. Rabbit is the most interoperable implementation.

AMQP K C

AMQP K C

AMQP K C

AMQP K C

AMQP K C

AMQP K C

AMQP K C

AMQP K C

AMQP K C

AMQP K C

AMQP K C

R M
W    ?

O     
• Memory is limited. Clients can send unlimited messages to

Rabbit.
• Memory is nearly unlimited. Computers often have lots of

swap. But this is a dumb technique and avoiding swap usage is beneficial.
• Disk space is often plentiful. Being able to force messages out

to disk in an efficient manner allows Rabbit to achieve no per-message RAM cost

R M
W    ?

O     
• Memory is limited. Clients can send unlimited messages to

Rabbit.
• Memory is nearly unlimited. Computers often have lots of

swap. But this is a dumb technique and avoiding swap usage is beneficial.
• Disk space is often plentiful. Being able to force messages out

to disk in an efficient manner allows Rabbit to achieve no per-message RAM cost sort of.

• Prioritise newly created queues because they are likely to see use soon (temporal locality)?
• Prioritise short queues; long queues may not be able to fit in RAM anyway, so why try?
• Prioritise fast queues: making a fast queue go via disk will likely slow it down, and if publish rate stays the same, the queue will suddenly grow, destabilising the system?
• What’s the goal here anyway?

Goal
• Rabbit must always be able to eventually accept a message published to it.
• Can delay it by flow control if necessary.
• Can achieve this most efficiently by minimising the number of messages held by Rabbit.
• This implies stopping queues from getting long.

Invariants
• Messages can only be (re)written to disk a bounded number of times.
• Can operate in a mode which ensures no per-message RAM cost.

Memory
• A queue has a number of messages in it, and each message has a size.
• But, a single message can be routed to multiple queues, at which point the queues share the same message. So ten queues all sharing the same messages take (approx) the same memory as 1 queue.
• Erlang is garbage collected (generational, and per process). It is often reluctant to return freed space to the OS.

What can we measure?
• Total memory used by the Erlang VM.
• Rates. Specifically, ingress and egress rates of queues (msgs/sec).
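
The point about shared messages can be illustrated with a small sketch. It is written in Python rather than Erlang, and the function names are hypothetical, not taken from Rabbit. Summing sizes queue by queue counts a shared message once per queue, whereas counting each distinct message once gives the memory actually held.

```python
def naive_queue_bytes(queue):
    """Sum payload sizes as if the queue owned every message outright."""
    return sum(len(msg) for msg in queue)


def shared_bytes(queues):
    """Count each distinct message object once, however many queues hold it."""
    unique = {id(msg): msg for queue in queues for msg in queue}
    return sum(len(msg) for msg in unique.values())


# One 1000-byte message routed to ten queues: every queue holds the same object.
message = bytes(1000)
queues = [[message] for _ in range(10)]

naive_total = sum(naive_queue_bytes(q) for q in queues)  # counts the message ten times
actual_total = shared_bytes(queues)                      # counts it once
```

Here the per-queue sum overestimates by a factor of ten, which is why a queue measuring its own memory in isolation gives a misleading figure.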

RMQ ..  

N    
• All messages always held in RAM. • When RAM is exhausted, raise the channel.flow flag and

prevent clients from publishing.
• Maybe crash if Erlang can’t actually allocate enough memory

to do its garbage collection.
• Use snapshot and deltas when writing to disk. Periodically

rewrite snapshot.

RMQ ..  

N    
• All messages always held in RAM. • When RAM is exhausted, raise the channel.flow flag and

prevent clients from publishing.
• Maybe crash if Erlang can’t actually allocate enough memory

to do its garbage collection.
• Use snapshot and deltas when writing to disk. Periodically

rewrite snapshot.
• Needless to say, many users of Rabbit are keen for this

behaviour to change.
• Failed both invariants.

First attempt
• Either a queue holds messages in RAM, or it sends them all via disk.
• Difficulty here was that on the transition, writing a few million messages out to disk takes time.
• Also used Mnesia:
  • disc_copies mode uses Ets (memory) and disk_log: disk_log does the snapshot and deltas thing too. Failed invariant.
  • disc_only_copies mode uses Dets (disk). Very slow, and Dets has internal 32-bit pointers so can’t go beyond 2GB.

First attempt (continued)
• Queues attempted to measure their own memory usage, ignoring sharing.
• A central process allocated tokens to queues and a queue could only be evicted to disk if it either was idle and another queue needed additional space, or if the queue itself wanted to grow and there were no tokens available.
• Major failing with this scheme is inaccuracy of measurement.
• Also no attempt to penalise massive but very slow queues: these should probably be operating off the disk all the time.
• Became clear it was impossible to reasonably come up with an ordering of importance of queues. Just too many factors.

Subsequent attempt
• A queue can have each message in one of three states: completely in RAM; message on disk but index entry (pointer to message) in RAM; message and index on disk.
• Queues can maintain very fine-grained control of the numbers of messages in memory, and can make smooth transitions.
• Existing messages can be pushed to disk-only mode but are only brought back into RAM for delivery to a consumer.
• All disk structures created specifically for Rabbit.
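
The three message states can be sketched as follows. This is Python rather than Erlang, and every name is hypothetical. It also assumes that a queue demotes its newest messages first, since those at the head are delivered soonest; the slides do not say which messages are chosen. Disk I/O is left out, so a state change here stands in for the actual write.

```python
from collections import deque
from enum import Enum


class State(Enum):
    IN_RAM = 1        # message body and index entry both in RAM
    INDEX_IN_RAM = 2  # body on disk, index entry (pointer) in RAM
    ON_DISK = 3       # body and index entry both on disk


class SketchQueue:
    def __init__(self):
        self.entries = deque()  # [msg_id, State] pairs, oldest first

    def publish(self, msg_id):
        self.entries.append([msg_id, State.IN_RAM])

    def count(self, state):
        return sum(1 for _, s in self.entries if s is state)

    def reduce_ram_to(self, target):
        """Demote the newest in-RAM messages until at most `target` remain in RAM."""
        excess = self.count(State.IN_RAM) - target
        for entry in reversed(self.entries):
            if excess <= 0:
                break
            if entry[1] is State.IN_RAM:
                entry[1] = State.INDEX_IN_RAM  # a real queue would write the body out here
                excess -= 1


q = SketchQueue()
for msg_id in range(10):
    q.publish(msg_id)
q.reduce_ram_to(4)  # the four oldest stay in RAM, the six newest are demoted
```

A second, analogous step would move index entries to `State.ON_DISK`. Because each message changes state individually, the queue can approach any target count gradually instead of switching wholesale between RAM and disk.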

Subsequent attempt (continued)
• Queues periodically measure their ingress and egress rate, and know how many messages they have in RAM. Divide RAM message count by average rate to get duration.
• Durations reported to a central process. This averages the durations, and scales by the fraction of memory used over the total memory available. This gives the desired average duration were we using all the memory we should be using. This value sent back to queues.
• Queues take this desired duration, multiply by their rates, to get the target number of messages they should hold in RAM. They adjust themselves based on this target.
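
The calculation described above can be sketched as three small functions. This is Python rather than Erlang, and the names are hypothetical. One assumption is labelled in the code: "scales by the fraction of memory used" is read here as dividing the average duration by used/limit, so that being over the limit shrinks the desired duration. Queues with a zero rate are not handled.

```python
def queue_duration(msgs_in_ram, ingress_rate, egress_rate):
    """Seconds of traffic represented by the messages a queue holds in RAM."""
    avg_rate = (ingress_rate + egress_rate) / 2  # simple average, msgs/sec; must be non-zero
    return msgs_in_ram / avg_rate


def desired_duration(durations, memory_used, memory_limit):
    """Central process: average the reported durations, then rescale to the limit.

    Assumption: the scaling is a division by the fraction of memory in use.
    """
    average = sum(durations) / len(durations)
    return average / (memory_used / memory_limit)


def target_ram_count(duration, ingress_rate, egress_rate):
    """Queue side: messages the queue should hold in RAM for the given duration."""
    return int(duration * (ingress_rate + egress_rate) / 2)


# Two queues report in while the VM uses 1500 MB against a 1000 MB limit.
fast = queue_duration(20_000, ingress_rate=1200, egress_rate=800)  # 20.0 seconds
slow = queue_duration(100, ingress_rate=10, egress_rate=10)        # 10.0 seconds
desired = desired_duration([fast, slow], memory_used=1500, memory_limit=1000)

fast_target = target_ram_count(desired, 1200, 800)
slow_target = target_ram_count(desired, 10, 10)
```

With these figures the desired duration is 10 seconds, so the fast queue's target is 10,000 messages and the slow queue's is 100.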

Subsequent attempt (continued)
• Outcome is that the same duration results in a high target for a fast-moving queue, which is thus implicitly given lots of RAM; for a slow-moving queue, the duration results in a low target, and thus implicitly very little RAM.
• Queues periodically reporting their duration, combined with the central process remeasuring the fraction of memory used, results in a feedback loop that adapts smoothly and consistently to changing circumstances.
• Ingress and egress rates used (simple average) to avoid difficulties with queues with publishers and no consumers. Several other corner cases discovered and dealt with.
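
A toy simulation gives a feel for the feedback loop. It is a Python sketch under strong assumptions that are not from the talk: every message has the same fixed size, all memory in use is message memory, nothing is published or consumed between rounds, and queues only ever evict to meet a lower target (they do not pull messages back into RAM). The constants and names are hypothetical.

```python
MSG_BYTES = 1_000          # assumed fixed message size
MEMORY_LIMIT = 10_000_000  # assumed budget: room for 10,000 such messages

# name -> [average rate in msgs/sec, messages currently in RAM]
queues = {"fast": [1000.0, 30_000.0], "slow": [10.0, 5_000.0]}


def rebalance(queues):
    """One round: queues report durations, the centre replies, queues retarget."""
    durations = [in_ram / rate for rate, in_ram in queues.values()]
    used = sum(in_ram for _, in_ram in queues.values()) * MSG_BYTES
    desired = (sum(durations) / len(durations)) / (used / MEMORY_LIMIT)
    for q in queues.values():
        q[1] = min(q[1], desired * q[0])  # evict down to the target, never reload
    return desired


for _ in range(5):
    rebalance(queues)

total_in_ram = sum(in_ram for _, in_ram in queues.values())
```

Under these assumptions the loop settles within a few rounds at about 10,000 messages in RAM in total, split roughly 9,900 to 99 in proportion to the rates. The massive but slow queue ends up holding almost nothing in RAM without any explicit ordering of queue importance.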

Subsequent attempt (continued)
• The message store itself is the same as in the first attempt, and guarantees that messages can only be rewritten (due to garbage collection of files) a bounded number of times.
• There are some in-RAM structures that contain information about files on disk that contain messages. These amount to a per-message RAM cost. Small though: e.g. 100 bytes per 16,384 messages.
• At twice Rabbit’s maximum current speed, you could not fill 4GB RAM with these structures in less than 7000 hours (approx 10 months). Question of whether it’s worth the extra code to solve this.
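
The figures above can be cross-checked with a few lines of arithmetic (Python, taking 4GB as 4 × 2^30 bytes, which is an assumption). The calculation works backwards from the slide's numbers to the publish rate they imply.

```python
BYTES_PER_ENTRY = 100    # from the slide: about 100 bytes ...
MSGS_PER_ENTRY = 16_384  # ... per 16,384 messages
RAM_BYTES = 4 * 2**30    # 4GB, assumed to mean 4 GiB
HOURS = 7000

# Messages that must pass through before the structures occupy all of RAM.
msgs_to_fill = RAM_BYTES / BYTES_PER_ENTRY * MSGS_PER_ENTRY

# Sustained rate needed to do that in 7000 hours, in msgs/sec.
implied_rate = msgs_to_fill / (HOURS * 3600)

days = HOURS / 24
```

This gives about 7 × 10^11 messages and an implied rate of roughly 28,000 msgs/sec for "twice maximum speed". 7000 hours is about 292 days, a little under ten months.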

File handles
• We would like a couple of file handles per queue. However, whilst Windows has a limit around 16M, most Linuxes have a limit of 1024 by default, and OS X has a default limit of 256!
• Would be easy if we were happy with one central process to allocate.
• However, whilst allocation would be easy, revocation is tricky.
• Thus take a very soft approach.

File handles (continued)
• Every process reports when a file handle is opened or closed to a central process.
• The central process is also told the timestamp at which the least recently used file handle was used per process.
• Central process calculates the average timestamp and when the limit is reached, asks the processes to close any file handles older than this.
• This doesn’t stop processes from opening file handles: you can set the limit to 0 and Rabbit will still work very happily.
• But if you do so, each process will open a file handle, use it, and then immediately be asked to close it again.
• Also account for network sockets.
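
The soft limit described above might look something like this. It is a Python sketch with hypothetical names, not Rabbit's code. It assumes each process reports its open-handle count together with the timestamp of its least recently used handle, and that "the limit is reached" means the total is at or above the limit.

```python
class HandleCoordinator:
    """Central bookkeeping for a soft limit on open file handles."""

    def __init__(self, limit):
        self.limit = limit
        self.open_counts = {}  # process id -> number of open handles
        self.oldest_use = {}   # process id -> timestamp of its least recently used handle

    def report(self, pid, open_count, oldest_timestamp=None):
        """A process reports its current handle count and LRU timestamp."""
        if open_count == 0:
            self.open_counts.pop(pid, None)
            self.oldest_use.pop(pid, None)
        else:
            self.open_counts[pid] = open_count
            self.oldest_use[pid] = oldest_timestamp

    def close_older_than(self):
        """Return the cutoff to send to all processes, or None while under the limit."""
        if sum(self.open_counts.values()) < self.limit or not self.oldest_use:
            return None
        return sum(self.oldest_use.values()) / len(self.oldest_use)


coordinator = HandleCoordinator(limit=4)
coordinator.report("p1", 2, oldest_timestamp=100.0)
coordinator.report("p2", 1, oldest_timestamp=200.0)
under_limit = coordinator.close_older_than()  # three handles open: no request yet

coordinator.report("p3", 1, oldest_timestamp=300.0)
cutoff = coordinator.close_older_than()       # four handles open: average timestamp
```

Nothing here refuses an open. The coordinator only asks processes to close handles last used before the cutoff, which is what makes the approach soft.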

Results
• Rabbit does very smoothly evict messages to disk in a sensible and justifiable way that seeks to minimise the impact of exhausting its available RAM.
• Queues can seamlessly grow to many times the size of the available memory, and very few pauses are observed.
• Millions of queues can be used (requires some tuning of the Erlang VM: raise the process limit) and file handles are automatically shared in such a way as to keep open the most recently used file handles.
• When necessary, Rabbit can completely saturate the bandwidth of a hard disk, maintaining > 70MB/s if fed enough data. In many cases, due to aggressive caching, it writes much less than one would expect. E.g. a message is delivered and acknowledged before it’s necessary to write it out to disk. Very common in short queues.

Erlang: The Good, the Bad, and the Ugly

The Good
• Use of unification makes checking return codes very easy. Results in code that fails fast and is very defensive.
• Writing concurrent programs is easy and much safer than with pthreads.
• Debugging tools are excellent: easy to dive in to a live system and examine the state of various processes.
• Libraries reasonably complete and fairly well written.
• Wide platform support, though some platforms feel slightly like second-class citizens.
• Erlang becoming more popular. E.g. ejabberd, couchdb etc.

The Bad
• Quality of libraries is variable. Some bizarre API issues. E.g. Ets and Dets share a lot of API, but have different return codes.
• Interfacing with C is more painful than it could be.
• Missing language features: implementation of records feels incomplete; language desperately needs Haskell-style type classes and monads: bored of passing around state.
• Some modules buggy, and continue to be buggy, e.g. SSL.

The Ugly
• Type support is poor. Only recently gained ability to export types from a module!
• Assigning to a _Throwaway actually declares and assigns to that variable!
• Can’t do maths. In particular doesn’t understand infinity or NaN.
• Lack of community improving the language. See e.g. Hackage/Cabal, Ruby Gems etc.

I E

S  ’ ,    
• gen_server2: Add support for messages with priority; support

automatic hibernation of processes. Drain message queue so that calls are much more efficient when receiving replies.
• os_mon, memsup: System memory allocation monitoring is a

bad idea: e.g. in OS X it’s all but impossible to know how much free memory there is. Thus don’t bother. Instead limit to a fraction of total installed memory.
• supervisor: If there’s a race when terminating a child of a

supervisor, all processes but the first will get back an error. This should not be the case: the goal is that the child has been terminated.

I E

S  ’ ,    
• file: Whilst the file module offers some caching, it does not

try to optimise file accesses: e.g. eliminate duplicate seeks, syncs etc. Also no manual control over flushing write buffers. Our file_handle_cache only supports appending to a file, and a single writer with multiple readers, but offers much finer control and far more aggressive caching and elimination of operations.
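
To illustrate the kind of saving meant by eliminating duplicate syncs, here is a minimal Python sketch of an append-only writer. It is not Rabbit's file_handle_cache and does not mirror its API; the class and attribute names are hypothetical. It covers only write buffering, explicit flushing and skipping redundant syncs, and leaves out readers and seek elimination.

```python
import os
import tempfile


class AppendCache:
    """Append-only writer that buffers writes and skips redundant syncs."""

    def __init__(self, path):
        self.file = open(path, "ab")
        self.buffer = []         # chunks appended but not yet handed to the OS
        self.needs_sync = False  # True once data has been appended since the last sync
        self.syncs = 0           # number of real fsync calls, for illustration

    def append(self, data):
        self.buffer.append(data)
        self.needs_sync = True

    def flush(self):
        """Manually push buffered chunks to the OS as a single write."""
        if self.buffer:
            self.file.write(b"".join(self.buffer))
            self.buffer.clear()
            self.file.flush()

    def sync(self):
        if not self.needs_sync:  # nothing new since the last sync: skip the fsync
            return
        self.flush()
        os.fsync(self.file.fileno())
        self.needs_sync = False
        self.syncs += 1

    def close(self):
        self.flush()
        self.file.close()


path = os.path.join(tempfile.mkdtemp(), "store.dat")
cache = AppendCache(path)
cache.append(b"hello ")
cache.append(b"world")
cache.sync()
cache.sync()  # no data appended since the previous sync, so no fsync is issued
cache.close()

with open(path, "rb") as f:
    contents = f.read()
```

Restricting the writer to appends is what keeps the bookkeeping this small: there is never a seek on the write path, and one flag is enough to know whether a sync would do any work.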

Conclusions
• Resource management is always tricky, and made worse when it’s difficult to measure the resource you’re trying to manage.
• Having clear goals as to what the behaviour is you wish to see helps a lot.
• Erlang can make very efficient use of resources.
• Managing resources in Erlang isn’t substantially different from other languages, though the ease of communication between processes helps: naturally lends itself to the server-with-many-clients model.
• Limiting the amount of communication is key to scaling well: what works well for 10 clients and one server may not work well for 10 million clients and one server.

Thank you. Questions?


				