Designing Enterprise Drupal

DESIGNING ENTERPRISE DRUPAL ENVIRONMENTS
How to scale Drupal server infrastructure
                       INTRODUCTIONS

• Jason Burnett (jason@neospire.net)
   NeoSpire Director of Infrastructure
               SO…FIRST THINGS FIRST
• Prerequisites that you’ll need (or at least want).
   – A good, reliable network with plenty of capacity
   – At least one expert Systems Administrator
• You don’t necessarily need this:
            OKAY, LET’S TALK STACKS
Default Stack       Performance Stack
                    Varnish
Apache              Apache
                    APC
PHP                 PHP
Drupal              Pressflow
                    Memcached
                    Solr
MySQL               MySQL
         FIRST, DON’T USE DRUPAL
                          (WELL, SORTA)

Pressflow

• Drop-in replacement for Drupal 6.x
• Support for database replication
• Support for reverse proxy caching
• Optimization for MySQL
• Optimization for PHP 5
• Available at: http://fourkitchens.com/pressflow-makes-drupal-scale

Stack diagram: Apache, PHP, Drupal, MySQL
AFTER PRESSFLOW, IT’S ALL ABOUT CACHE

Varnish

• Varnish is a reverse proxy cache
• Caches content based on HTTP headers
• Uses kernel-based virtual memory
• Watch out for cookies, authenticated users
• Great Config http://lb.cm/ZyR
   – Thanks quicksketch!
• Available at http://varnish-cache.org/

Stack diagram: Varnish, Apache, PHP, Pressflow, MySQL
                          HTTP PIPELINE
Apache Configuration

NameVirtualHost *:8080
Listen 8080

<VirtualHost *:8080>
[…]
</VirtualHost>

Varnish Configuration

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}
                           HTTP LOGGING
• The varnishncsa daemon handles Varnish logging
• Default Apache logs will always show 127.0.0.1, because every request reaches Apache from Varnish on localhost
• Define a new log format that uses X-Forwarded-For

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_proxy

CustomLog /var/log/apache2/access.log combined_proxy
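
To illustrate what the combined_proxy format puts on each log line, here is a minimal Python sketch that pulls the original client address back out. The regular expression, the `client_ip` helper and the sample line are assumptions made for illustration, not part of the original configuration. It also assumes that X-Forwarded-For may carry several comma-separated addresses, with the left-most being the client.

```python
import re

# Regex mirroring the combined_proxy LogFormat: the first field is the
# X-Forwarded-For header, which may hold several comma-separated addresses.
COMBINED_PROXY = re.compile(
    r'^(?P<xff>.*?) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def client_ip(log_line):
    """Return the left-most X-Forwarded-For address, or None if unavailable."""
    match = COMBINED_PROXY.match(log_line)
    if match is None:
        return None
    first = match.group("xff").split(",")[0].strip()
    # Apache writes "-" when the header is missing.
    return first if first and first != "-" else None


# Hypothetical log line in the combined_proxy format.
SAMPLE = ('203.0.113.7 - - [03/Dec/2011:10:00:00 +0000] '
          '"GET /node/1 HTTP/1.1" 200 512 "-" "curl/7.21"')

if __name__ == "__main__":
    print(client_ip(SAMPLE))
```

Keep in mind that X-Forwarded-For is supplied by the client side of the connection, so it is only as trustworthy as the proxy that sets it.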
                      CACHING WITH COOKIES
sub vcl_recv {
    // Remove has_js and Google Analytics __* cookies.
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
    // Remove a ";" prefix, if present.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    // Remove empty cookies.
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }
}

sub vcl_hash {
    // Include cookie in cache hash
    if (req.http.Cookie) {
        set req.hash += req.http.Cookie;
    }
}
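
To see what the cookie clean-up in vcl_recv does to a request, the same three steps can be replayed outside Varnish. This is a rough Python sketch that reuses the two regular expressions from the VCL. The `strip_tracking_cookies` name and the sample cookie strings are assumptions for illustration, and Python's regex engine is assumed to behave like Varnish's for these simple patterns.

```python
import re

# Same patterns as the VCL above.
TRACKING_COOKIES = re.compile(r"(^|;\s*)(__[a-z]+|has_js)=[^;]*")
LEADING_SEMICOLON = re.compile(r"^;\s*")


def strip_tracking_cookies(cookie_header):
    """Return the cleaned Cookie header, or None if nothing is left."""
    # Step 1: remove has_js and Google Analytics __* cookies.
    cleaned = TRACKING_COOKIES.sub("", cookie_header)
    # Step 2: remove a ";" prefix, if present.
    cleaned = LEADING_SEMICOLON.sub("", cleaned, count=1)
    # Step 3: an empty header is dropped entirely (unset in the VCL).
    return None if cleaned.strip() == "" else cleaned


if __name__ == "__main__":
    print(strip_tracking_cookies("has_js=1; __utma=123; SESSabc=xyz"))
    print(strip_tracking_cookies("has_js=1; __utmz=5"))
```

Requests that carry only tracking cookies end up with no Cookie header at all, so they can share one cached copy. Requests that keep a session cookie are hashed separately by vcl_hash.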
                                  BASIC SECURITY
// Define the internal network subnets
acl internal {
     "127.0.0.0"/8;
     "10.0.0.0"/8;
}

sub vcl_recv {

    […]

    // Do not allow outside access to cron.php
    if (req.url ~ "^/cron\.php(\?.*)?$" && !client.ip ~ internal) {
          set req.url = "/404-cron.php";
    }
}
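
The cron.php rule is a subnet check followed by a URL rewrite. As a minimal sketch of that logic, the following Python uses the standard `ipaddress` module with the two subnets from the ACL. The function names and the sample addresses are hypothetical, chosen only for illustration.

```python
import ipaddress
import re

# The two internal subnets from the "internal" ACL above.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("10.0.0.0/8"),
]
CRON_URL = re.compile(r"^/cron\.php(\?.*)?$")


def is_internal(client_ip):
    """True if the client address falls inside one of the internal subnets."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in INTERNAL_NETWORKS)


def rewrite_url(url, client_ip):
    """Send outside requests for cron.php to a path that does not exist."""
    if CRON_URL.match(url) and not is_internal(client_ip):
        return "/404-cron.php"
    return url


if __name__ == "__main__":
    print(rewrite_url("/cron.php", "8.8.8.8"))
    print(rewrite_url("/cron.php", "10.1.2.3"))
```

Rewriting to a missing path, rather than returning an error from the cache, lets the backend produce its normal 404 page.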
             VARNISH IS SUPER-FAST
• Able to handle many more connections than
  Apache
• Needs a large number of file handles, set in /etc/security/limits.conf:

*         soft nofile       131072
*         hard nofile       131072
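
One way to confirm that the limits.conf values actually reached a running process is to read the limit back. The sketch below uses Python's standard `resource` module (Unix only). The helper names are assumptions for illustration, and the check reports on the Python process itself, so it would need to run as the same user and in the same kind of session as Varnish to be meaningful.

```python
import resource

REQUIRED_NOFILE = 131072  # value from the limits.conf lines above


def limit_is_sufficient(limit, required=REQUIRED_NOFILE):
    """True if an RLIMIT value is unlimited or at least `required`."""
    return limit == resource.RLIM_INFINITY or limit >= required


def check_current_process():
    """Return (soft_ok, hard_ok) for this process's open-file limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return limit_is_sufficient(soft), limit_is_sufficient(hard)


if __name__ == "__main__":
    soft_ok, hard_ok = check_current_process()
    print("soft limit ok:", soft_ok, "hard limit ok:", hard_ok)
```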
            APACHE OPTIMIZATIONS
Apache

• Tune Apache to match your hardware
• Setting MaxClients too high is asking for trouble
• Every application is different
• A good starting point is the total amount of memory allocated to Apache divided by 40MB
• One of the areas that will need to be monitored and updated on an ongoing basis

Stack diagram: Varnish, Apache, PHP, Pressflow, MySQL
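
The starting-point rule (memory allocated to Apache divided by 40MB) is simple arithmetic. As a small sketch, the function below applies it. The function name and the 4096MB example figure are assumptions for illustration, and the result is only a first guess to refine through monitoring.

```python
def starting_max_clients(apache_memory_mb, per_process_mb=40):
    """Rule-of-thumb MaxClients: memory for Apache divided by 40MB per process."""
    if apache_memory_mb < 0 or per_process_mb <= 0:
        raise ValueError("memory sizes must be positive")
    # Round down so the estimate never exceeds the memory budget.
    return apache_memory_mb // per_process_mb


if __name__ == "__main__":
    # Hypothetical server with 4096MB set aside for Apache.
    print(starting_max_clients(4096))
```

With 4096MB set aside for Apache this gives 102, which would then be adjusted as real per-process memory use becomes known.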
            STILL ALL ABOUT CACHE
• APC Opcode Cache

APC

• APC is an Opcode cache
• Officially supported by PHP
• Prevents unnecessary PHP parsing and compiling
• Reduces load on Memory and CPU

Stack diagram: Varnish, Apache, APC, PHP, Pressflow, MySQL
  ALLOCATE ENOUGH MEMORY FOR APC


php.ini
extension=apc.so
apc.shm_size=120
apc.ttl=300

sysctl.conf
kernel.shmmax=134217728
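
The two settings are linked: the kernel's shared memory ceiling has to be at least as large as the segment APC asks for. Here is a quick arithmetic check, assuming (as in older APC releases) that a bare `apc.shm_size` number is read as megabytes. The helper name is hypothetical.

```python
MIB = 1024 * 1024


def shmmax_covers_apc(shmmax_bytes, apc_shm_size_mb):
    """True if kernel.shmmax is large enough for APC's shared memory segment."""
    return shmmax_bytes >= apc_shm_size_mb * MIB


if __name__ == "__main__":
    # Values from the php.ini and sysctl.conf settings above.
    print(134217728 // MIB, "MiB available for a single segment")
    print(shmmax_covers_apc(134217728, 120))
```

134217728 bytes is 128MiB, which leaves a little headroom above the 120MB APC segment.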
               EVEN MORE CACHE
• Memcached

Memcached

• Memcached is a distributed memory object caching system
• Reduces load on database
• Simple key/value datastore

Stack diagram: Varnish, Apache, APC, PHP, Pressflow, Memcached, MySQL
                            MEMCACHED BINS
/usr/bin/memcached -d -p 11211 -u nobody -m 64
/usr/bin/memcached -d -p 11212 -u nobody -m 16
/usr/bin/memcached -d -p 11213 -u nobody -m 128
/usr/bin/memcached -d -p 11214 -u nobody -m 128
/usr/bin/memcached -d -p 11215 -u nobody -m 16
/usr/bin/memcached -d -p 11216 -u nobody -m 8
/usr/bin/memcached -d -p 11217 -u nobody -m 8
/usr/bin/memcached -d -p 11218 -u nobody -m 8
/usr/bin/memcached -d -p 11219 -u nobody -m 8
/usr/bin/memcached -d -p 11220 -u nobody -m 8
/usr/bin/memcached -d -p 11221 -u nobody -m 8
/usr/bin/memcached -d -p 11222 -u nobody -m 8
/usr/bin/memcached -d -p 11223 -u nobody -m 8
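
When running this many instances it helps to know how much memory they can claim in total. The sketch below parses the command lines above and adds up the `-m` values. The parsing helper is an assumption for illustration and is not a tool that ships with memcached.

```python
import shlex

# The thirteen command lines from the slide above.
COMMANDS = """
/usr/bin/memcached -d -p 11211 -u nobody -m 64
/usr/bin/memcached -d -p 11212 -u nobody -m 16
/usr/bin/memcached -d -p 11213 -u nobody -m 128
/usr/bin/memcached -d -p 11214 -u nobody -m 128
/usr/bin/memcached -d -p 11215 -u nobody -m 16
/usr/bin/memcached -d -p 11216 -u nobody -m 8
/usr/bin/memcached -d -p 11217 -u nobody -m 8
/usr/bin/memcached -d -p 11218 -u nobody -m 8
/usr/bin/memcached -d -p 11219 -u nobody -m 8
/usr/bin/memcached -d -p 11220 -u nobody -m 8
/usr/bin/memcached -d -p 11221 -u nobody -m 8
/usr/bin/memcached -d -p 11222 -u nobody -m 8
/usr/bin/memcached -d -p 11223 -u nobody -m 8
"""


def parse_instance(command):
    """Return (port, memory_mb) from one memcached command line."""
    args = shlex.split(command)
    port = int(args[args.index("-p") + 1])
    memory_mb = int(args[args.index("-m") + 1])
    return port, memory_mb


INSTANCES = [parse_instance(line) for line in COMMANDS.strip().splitlines()]
TOTAL_MB = sum(memory_mb for _, memory_mb in INSTANCES)

if __name__ == "__main__":
    print(len(INSTANCES), "instances,", TOTAL_MB, "MB in total")
```

The thirteen instances add up to 416MB, which has to fit alongside Apache and APC if everything runs on one server.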
        DRUPAL MEMCACHED MODULE
$conf['cache_inc'] = 'sites/all/modules/memcache/memcache.db.inc';

$memcache_servers = array(
  '127.0.0.1',
);

foreach ($memcache_servers as $ip) {
  $conf['memcache_servers'][$ip . ':11211'] = 'default';
  $conf['memcache_servers'][$ip . ':11212'] = 'block';
  $conf['memcache_servers'][$ip . ':11213'] = 'content';
  $conf['memcache_servers'][$ip . ':11214'] = 'filter';
  $conf['memcache_servers'][$ip . ':11215'] = 'menu';
  $conf['memcache_servers'][$ip . ':11216'] = 'mollom';
  $conf['memcache_servers'][$ip . ':11217'] = 'page';
  $conf['memcache_servers'][$ip . ':11218'] = 'views';
  $conf['memcache_servers'][$ip . ':11219'] = 'views_data';
  $conf['memcache_servers'][$ip . ':11220'] = 'sessions';
  $conf['memcache_servers'][$ip . ':11221'] = 'users';
  $conf['memcache_servers'][$ip . ':11222'] = 'path_source';
  $conf['memcache_servers'][$ip . ':11223'] = 'path_dest';
}

$conf['memcache_bins'] = array(
  'sessions' => 'sessions',
  'cache' => 'default',
  'cache_block' => 'block',
  'cache_content' => 'content',
  'cache_filter' => 'filter',
  'cache_menu' => 'menu',
  'cache_mollom' => 'mollom',
  'cache_page' => 'page',
  'cache_views' => 'views',
  'cache_views_data' => 'views_data',
  'cache_users' => 'users',
  'cache_pathsrc' => 'path_source',
  'cache_pathdst' => 'path_dest',
);
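
The two arrays form a two-step lookup: a Drupal cache table maps to a named bin, and the bin maps to a memcached port. The Python sketch below restates the settings above as dictionaries to make that routing explicit. The `port_for` helper is hypothetical, and the fallback of unmapped tables to the `default` bin is an assumption about the module's behaviour, not something the configuration above states.

```python
# Drupal cache table -> bin name (from $conf['memcache_bins']).
MEMCACHE_BINS = {
    "sessions": "sessions",
    "cache": "default",
    "cache_block": "block",
    "cache_content": "content",
    "cache_filter": "filter",
    "cache_menu": "menu",
    "cache_mollom": "mollom",
    "cache_page": "page",
    "cache_views": "views",
    "cache_views_data": "views_data",
    "cache_users": "users",
    "cache_pathsrc": "path_source",
    "cache_pathdst": "path_dest",
}

# Bin name -> memcached port (from $conf['memcache_servers']).
BIN_PORTS = {
    "default": 11211, "block": 11212, "content": 11213, "filter": 11214,
    "menu": 11215, "mollom": 11216, "page": 11217, "views": 11218,
    "views_data": 11219, "sessions": 11220, "users": 11221,
    "path_source": 11222, "path_dest": 11223,
}


def port_for(table):
    """Return the memcached port that would serve a given cache table."""
    # Assumption: tables without an explicit bin fall back to "default".
    bin_name = MEMCACHE_BINS.get(table, "default")
    return BIN_PORTS[bin_name]


if __name__ == "__main__":
    print(port_for("cache_views"))
    print(port_for("cache_form"))
```

A check like this is a cheap way to catch a bin that has been named in one array but has no server in the other.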
     BUT WHAT ABOUT SEARCH?
Solr

• Better than native Drupal search
• Built on standard application server
   – You can decide what J2EE server to use
• Flexibility allows fault tolerance
• Configurable through the Drupal Solr module

Stack diagram: Varnish, Apache, APC, PHP, Pressflow, Memcached, Solr, MySQL
   CAN OTHER STACKS WORK TOO?
• Yes, there are some different technologies and
  strategies that do the same thing (nginx, Cassandra,
  eAccelerator, etc.).
• Arguments can be made both for and against each of them
• This stack is what we have used in production and feel
  is the most stable and enterprise-ready
• We’re always refining our stack too. So far, this is what
  we like best. (Currently testing the Comanche web server)
• Project Mercury uses the same stack
                          BACK TO STACKS
• The beauty of this performance stack…
  …is that it can be installed entirely on a single server, and that server will perform well.

But what if one server isn’t good enough?

Stack diagram: Varnish, Apache, APC, PHP, Pressflow, Memcached, Solr, MySQL
                                SCALE APART
• Because these services are modular, we can separate server roles

Services: Varnish, Apache, APC, PHP, Pressflow, Memcached, Solr, MySQL
Server roles: Varnish Server, Web/App Server, Database Server
   YOU CAN DO THIS A COUPLE WAYS
• Another example:

Services: Varnish, Apache, APC, PHP, Pressflow, Memcached, Solr, MySQL
Server roles: Web/App Server, Memcached/Solr Server, Database Server
                HOW WE LIKE TO DO IT
• This is our standard separation

Services: Varnish, Apache, APC, PHP, Pressflow, Memcached, Solr, MySQL
Server roles: Web/App Server, Database Server
SCALE FURTHER: LOAD-BALANCING

Load-balanced Architecture

            Load Balancer(s)
Web/App Server        Web/App Server
            Database Server

• Multiple web servers can be load-balanced for greater capacity, using the same database
• Single points of failure are apparent
• Load balancing utilizing LVS
            FILE SYNCHRONIZATION
• NFS

NFS

• NFS allows multiple web/app servers to seamlessly serve the same content
• User uploaded content is instantly available to all web servers
• Any code changes only need to be made in one location

Stack diagram: Varnish, Apache, APC, PHP, Pressflow, Memcached, Solr, MySQL, NFS
                            LOGS
• Syslog

Syslog

• Drupal can log to syslog, reducing load on the database
• Sending logs to a central location allows for easy review

Stack diagram: Varnish, Apache, APC, PHP, Pressflow, Memcached, Solr, MySQL, NFS, Syslog
  FAULT TOLERANCE IS IMPORTANT NOW
High Availability Architecture

                Load Balancer(s)
    Web/App Server        Web/App Server
MySQL / NFS Server        MySQL / NFS Server

• Now that we’re scaling out with more capacity, we’re probably really scared of the DB failing
• MySQL circular replication
• NFS-HA
• Solr fault tolerance
• All managed by Heartbeat
    MYSQL CIRCULAR REPLICATION

        MySQL Server 1             MySQL Server 2




• Circular replication is the method by which we synchronize data
• There are 2 IP addresses (master and slave)
• Heartbeat is used to automatically fail over the addresses when necessary
                         NFS HA USING DRBD

          NFS Server 1                       NFS Server 2



• Data synchronization handled with DRBD
• Distributed Replicated Block Device (DRBD)
   – Essentially RAID1 over the network
   – Only one NFS server is able to access the data at a time, which is
     why we have the IP management
• IP management is handled by Heartbeat automatically
                                                          SOLR

          Solr Server 1                      Solr Server 2



• Data synchronization handled with DRBD
• Distributed Replicated Block Device (DRBD)
   – Essentially RAID1 over the network
   – Only one Solr server is able to access the data at a time, which is
     why we have the IP management
• IP management is handled by Heartbeat automatically
                                              SKY’S THE LIMIT

Load Balancer(s)
Web Server (x8)
NFS Server (x2)      MySQL Server (x2)
Solr Server (x2)
       OTHER THINGS TO CONSIDER
• Drush
• Monitoring
   – Availability
   – Core updates
   – Module updates
   – Munin
• CDN
                 LESSONS WE’VE LEARNED
…things we’ve picked up from experience

• Conntrack tables
   – Disable all the iptables connection tracking modules unless you need them
• NTP
   – Time synchronization is extremely important on any system that utilizes Heartbeat
• Load Testing
   – Load test your solutions and make sure you can achieve your goal
  Thanks a bunch!
QUESTIONS?

				