Startup Scalability Strategies
• Keeping score: What to measure?
• Discerning the difference: What to focus on?
• Finding your path: Which way to go?
• Choosing your architecture: How to partition?
• Walking the line: How to balance?
• Building your team: Who to hire?
• Thinking ahead: What about the future?
• Offloading scalability: Is it for me?
Keeping Score: What to Measure?
distribution of data
disk response time
writes per second
transactions per second
queries per shard
client response time
cache utilization
disk saturation
thread thrashing
memory/IO contention
cache hit ratio
connections per shard
exceeding high or low water marks
reads per second
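One item above, exceeding high or low water marks, can be sketched as a simple threshold check over sampled metrics. This is a minimal illustration only: the metric names, the `WaterMark` type, and every threshold value are assumptions made for the example, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass
class WaterMark:
    """Acceptable range for one metric (illustrative values only)."""
    low: float
    high: float


# Hypothetical thresholds; real values depend on the workload.
WATER_MARKS = {
    "cache_hit_ratio": WaterMark(low=0.90, high=1.00),
    "connections_per_shard": WaterMark(low=0, high=500),
    "disk_response_ms": WaterMark(low=0, high=20),
}


def check_water_marks(sample, marks=WATER_MARKS):
    """Return (metric, value, reason) for every metric outside its range."""
    alerts = []
    for name, value in sample.items():
        mark = marks.get(name)
        if mark is None:
            continue  # no threshold configured for this metric
        if value < mark.low:
            alerts.append((name, value, "below low water mark"))
        elif value > mark.high:
            alerts.append((name, value, "above high water mark"))
    return alerts
```

Calling `check_water_marks({"cache_hit_ratio": 0.72, "disk_response_ms": 35})` would flag both metrics, one for each kind of breach.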
Performance != High Availability != Scalability
Performance: Ability to process or execute a task relative to the time and resources used
High Availability: Ability of a system to ensure a certain degree of operational continuity
Scalability: Ability to handle growing amounts of traffic gracefully, or ability to be readily enlarged
Pick any two!!
Choose performance if you don’t care about 24/7 availability or accommodating high traffic.
Choose high availability for a site that must be available 24/7.
Choose scalability for a high-traffic website.
Vertical or Horizontal?
Vertical
• Aka scaling up
• Adding resources to a node
– Getting a bigger server
– Using faster CPUs
• Servers that are twice as fast can cost more than twice as much
Horizontal
• Aka scaling out
• Adding more nodes
• Cost efficient
– Commodity hardware
– Increased management complexity
• “More complex” programming model
– Right foundation
• Throughput and latency between nodes
How to Partition?
• Functional Partitioning
• Key based partitioning
– (users ending in 01 go to server 1)
• Range based partitioning
– (records ranging from 2M to 4M go to server 8)
• Directory server based partitioning
– (no pre-defined partitioning scheme, instead a lookup is required)
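The three data-driven schemes above can be sketched as small routing functions. This is a rough illustration under stated assumptions: numeric IDs, a modulo rule standing in for "users ending in 01", servers 7 and 9 as neighbours of the slide's server 8, and an in-memory dict standing in for a directory server. None of these details come from the talk.

```python
import bisect


def key_based(user_id: int, num_buckets: int = 100) -> int:
    """Key based: route on the trailing digits of a numeric ID.

    With 100 buckets, an ID ending in 01 lands in bucket (server) 1.
    """
    return user_id % num_buckets


# Range based: exclusive upper bound of each server's range, sorted.
# Server 8 owns 2M up to (not including) 4M, as in the slide's example;
# servers 7 and 9 are hypothetical neighbours.
RANGE_UPPER_BOUNDS = [2_000_000, 4_000_000, 6_000_000]
RANGE_SERVERS = [7, 8, 9]


def range_based(record_id: int) -> int:
    """Range based: find the first range whose upper bound exceeds the ID."""
    idx = bisect.bisect_right(RANGE_UPPER_BOUNDS, record_id)
    if idx == len(RANGE_SERVERS):
        raise KeyError(f"no server owns record {record_id}")
    return RANGE_SERVERS[idx]


# Directory based: no fixed rule, every key is looked up explicitly.
# A plain dict stands in for the directory server here.
DIRECTORY = {"alice": 3, "bob": 1}


def directory_based(key: str, directory=DIRECTORY) -> int:
    """Directory based: the mapping itself is data and can change freely."""
    return directory[key]
```

The trade-off shows in the code: the first two schemes need no lookup but fix the placement rule in advance, while the directory costs a lookup per request and lets individual keys be moved without changing the rule.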
How to balance?
• Balance is easier if the foundation is right
• Use agile methodologies
• Technical debt is expensive
• Technical mortgage is a KILLER!
Before and After
• What to do before you get big?
– Lay the right foundation
– Ability to Shard / Partition
– Decouple components
– Effectively cache
– Have a plan in place
• What can wait after things start to grow?
– Now focus on micro optimizations
– Acquiring and upgrading hardware
– Performance optimizations and OS tuning
– Implementing High Availability and Disaster recovery
– Use CDN
What skills / hires are most crucial to dealing with scalability?
• Go Asynchronous
• Go Stateless
• “Best IO is No IO”
– Cache effectively using shared cache and monitor utilization
• Decouple as much as possible
• Build using APIs
– Easy to scale development and deployment and open up your service
• Virtualize/Abstract everything
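The advice to cache effectively and monitor utilization can be sketched as a read-through cache that counts its own hits and misses. This is a minimal sketch, not a production design: the `MonitoredCache` class and its `loader` callback are hypothetical names, and an in-process dict stands in for a real shared cache.

```python
class MonitoredCache:
    """Read-through cache that tracks its own hit ratio."""

    def __init__(self, loader):
        self._loader = loader  # called on a miss, e.g. a database query
        self._store = {}       # stand-in for a shared cache service
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached value, loading and storing it on a miss."""
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._loader(key)
        self._store[key] = value
        return value

    @property
    def hit_ratio(self):
        """Fraction of lookups served from the cache (0.0 if none yet)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Every hit is one IO that never happens, so a falling `hit_ratio` is an early sign that the backing store is about to take more load.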
How to blow up?
Can scalability be outsourced? aka Can Cloud fix Twitter?
• Amazon
• Google AppEngine
• Rackspace
• AppNexus
• 10Gen
• Other providers?
Things to take away
• Focus on scalability, the rest will follow
• Horizontal is better, Vertical is costly
• Go Asynchronous
• Architect so you don’t have to rearchitect
• Choose two out of Consistency, Availability and Partition-Tolerance
• Measure utilization first, then performance
• Choose the right infrastructure & invest in right skills
• Notes/Tips: http://mashraqi.com/2008/09/startonomicsstartup-scalability.html
• Personal blog: http://mashraqi.com
• Twitter: http://twitter.com/mashraqi
• MySQL Blog: http://mysqldatabaseadministration.blogspot.com
• Email: firstname.lastname@example.org
Putting the Fun in Functional:
New Trends in Game Design for Web 2.0 Startups Amy Jo Kim, ShuffleBrain http://ShuffleBrain.com