Moving out of the Garage
Scaling for Startups
AKA why scaling is fun
Why are you here?
• I want to conquer an incredibly technically challenging problem that no one has solved
– Others have attempted to solve it and failed (or succeeded but I can do it better)
– I see something that no one has tried to solve
• I want to build a business possibly based on technology that fills a gap in the market
– Technology is core, product is centric
• I want to leverage technology to build a non-tech-based, revenue-generating business
– Technology is awesome, and it is going to support my business.
The Path
Day 1: aka the last easy day of your life
(or at least for a while)
G2
Day X
myspace.com : a place for moving
0-20M active users in 2 years
0-100M active users in 4 years
Laying Foundations
• Lay groundwork to prepare for 3 critical growth phases
– Scaling the technology
– Scaling the team
– Scaling the revenues/efficiency
• Decisions made in first 90 days create lasting impressions that can be felt for years
Scaling the Technology
• Scale up vs. scale out is no longer a question
– Unless you just founded a bank, don’t scale up
• Partition Data
– Decide early on, and make the right decision
    • Range Based
    • Mod Based
    • Mod’ed Ranges (if I had to do it over)
• Concentrate on write management
Scaling the Technology
• Don’t solve problems you don’t have
– Reuse as many existing solutions as possible
– What are your goals?
    • Make $$?
    • Build Cool Technology?
    • Both?
• Fail Fast
– Admit failure
• Don’t Double Down on bad decisions
– Walk away from failure
Infrastructure Decisions
Easiest
Leverage existing publicly available scaling solutions
– Replication
– Sharding
– Memcache
– Hardware load balancers
– NAS/SAN
Harder
Leverage public solutions when possible; when not, develop proprietary internal scaling solutions
– Myspace DFS
– MyCache
– Transaction Manager
– Dspace map/reduce
Most Difficult
Leverage public and internal solutions
– Without negatively impacting developer productivity
– Without wasting time
– Without wasting money
Decouple the User from the Authoritative Disks
Overflow
Cache
Relational Data Store Flat Data Store
SAN/NAS Virtualization Layer
Authoritative Primary Arrays
Queue
Overflow
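One way to read the diagram above is that users only ever touch the cache and the queue, while the authoritative store is updated in the background. The sketch below is a minimal illustration of that reading, not the architecture the slides describe in detail: the class name `DecoupledStore`, its methods, and the in-memory dict standing in for the authoritative arrays are all assumptions, and the "Overflow" boxes are not modeled.

```python
from collections import deque


class DecoupledStore:
    """Toy model: reads hit a cache, writes go through a queue (hypothetical names)."""

    def __init__(self):
        self.cache = {}             # what the user-facing tier reads from
        self.write_queue = deque()  # pending writes, drained in the background
        self.authoritative = {}     # stand-in for the authoritative primary arrays

    def write(self, key, value):
        # The user sees the new value immediately via the cache;
        # the authoritative copy is updated later by drain().
        self.cache[key] = value
        self.write_queue.append((key, value))

    def read(self, key):
        # Serve from cache when possible; fall back to the authoritative store.
        if key in self.cache:
            return self.cache[key]
        value = self.authoritative.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def drain(self, max_items=None):
        # Apply queued writes in order; returns how many were applied.
        written = 0
        while self.write_queue and (max_items is None or written < max_items):
            key, value = self.write_queue.popleft()
            self.authoritative[key] = value
            written += 1
        return written


store = DecoupledStore()
store.write("user:1", "profile-v1")
value_seen_by_user = store.read("user:1")  # served from cache before any disk write
applied = store.drain()                    # background step touches the disks
```

The point of the split is that a slow or busy authoritative tier delays `drain()`, not the user's request.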
Range Partitions
Users 0–1 Million
Users 1–2 Million
Users 2–3 Million
New User Pipe
• Infinitely Scalable
• Newest Ranges Create Hot Spots
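A minimal sketch of range partitioning as drawn above, assuming sequential integer user ids and the slide's one-million-user ranges. The function name `range_partition` and the constant `RANGE_SIZE` are illustrative, not from the original deck.

```python
RANGE_SIZE = 1_000_000  # users per range, matching the 1M buckets on the slide


def range_partition(user_id: int, range_size: int = RANGE_SIZE) -> int:
    """Return the zero-based index of the range partition holding user_id."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return user_id // range_size


# New sign-ups receive the highest ids, so they all land on the newest range.
newest_ids = range(2_000_000, 2_000_005)
partitions_hit = {range_partition(uid) for uid in newest_ids}
```

Adding capacity is just adding another range, which is why the scheme scales without limit; the cost is that the newest (and usually most active) users all share the last partition.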
Mod Partitions
Mod 1
Mod 2
Mod 3
New User Pipe
• Eliminates Hot Spots
• Difficult to add new hardware
• Scalable only to a certain point
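A minimal sketch of mod partitioning, assuming integer user ids; `mod_partition` is a hypothetical name, and the partitions are numbered from 0 here rather than from 1 as on the slide. The second half illustrates why adding hardware is hard: changing the modulus reassigns most existing users.

```python
from collections import Counter


def mod_partition(user_id: int, num_partitions: int) -> int:
    """Return the zero-based partition for user_id under simple modulo placement."""
    return user_id % num_partitions


ids = range(9000)

# With 3 partitions, consecutive ids are spread evenly, so there is no hot spot.
before = {uid: mod_partition(uid, 3) for uid in ids}
load = Counter(before.values())

# Growing from 3 to 4 partitions changes the home of most existing users.
after = {uid: mod_partition(uid, 4) for uid in ids}
moved = sum(1 for uid in ids if before[uid] != after[uid])
moved_fraction = moved / len(ids)
```

In this toy run three quarters of the users would have to be migrated to add a single partition.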
Mod/Range Combo Partitioning
Users 0–1 Million / Mod 1
Users 0–1 Million / Mod 2
Users 0–1 Million / Mod 3
Users 1–2 Million / Mod 1
Users 1–2 Million / Mod 2
Users 1–2 Million / Mod 3
New User Pipe
• Eliminates Hot Spots
• Infinitely Scalable
• Adding additional hardware is easy
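A minimal sketch of the combined scheme, assuming integer user ids: the range picks a group of servers, and a modulo picks the server inside that group. The names `combo_partition` and `mods_by_range` are illustrative, and letting each range carry its own mod count is an assumption about how new hardware would be added.

```python
RANGE_SIZE = 1_000_000  # users per range, as on the slide


def combo_partition(user_id: int, mods_by_range: dict, range_size: int = RANGE_SIZE):
    """Return (range_index, mod_index), both zero-based, for user_id."""
    range_index = user_id // range_size
    mods = mods_by_range[range_index]  # number of mod partitions in this range
    return range_index, user_id % mods


# Two ranges, three mod partitions each, as in the diagram.
layout = {0: 3, 1: 3}
old_user_home = combo_partition(5, layout)
new_user_home = combo_partition(1_000_004, layout)

# Adding hardware: open a new range with its own mod count.
# Users in existing ranges keep their placement, so nothing migrates.
layout[2] = 4
old_user_home_after = combo_partition(5, layout)
newest_user_home = combo_partition(2_000_006, layout)
```

New sign-ups within a range are spread across that range's mod partitions, which removes the hot spot, while growth happens by appending ranges rather than by changing an existing modulus.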
Scaling the Organization
• The first 25 people you hire will define the success of your company
– Don’t hire fast, hire smart
• Manage your burn, not your timeframe
– Do you have competitors trying to do the same?
– Are you second to market?
    • Sprint
– Is it new? Is no one else thinking about this?
    • Marathon
• Be smart about your stealth phase
– Countless failures from coming out too early
– Countless failures from coming out too late
We Want To Code
Scaling the Organization
• #1 Priority – Minimize ramp time
– Counterpart to technology’s “fail fast”
• Abstract core technologies from front end development groups
– Data Access Layer
– Cache
– Queues
– Etc.
• Create vertical product partitions with horizontal skillset partitions
Scaling Profit
• Technology is a cost center
– Manage profit by managing expenses
• Bucket Scaling Model
• Calculate yearly cost of user
– Inverse LUV
• Use commodity gear
– No SAN/NAS unless absolutely necessary
• Leverage CDN
• The Cloud?
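The "calculate yearly cost of user" bullet above comes down to simple arithmetic. The sketch below is one possible reading of it, not the model the slides use: the function names are hypothetical, and the dollar figures and user count are made-up placeholders chosen only to show the calculation.

```python
def yearly_cost_per_user(yearly_infrastructure_cost: float, active_users: int) -> float:
    """Spread the yearly technology spend evenly across active users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return yearly_infrastructure_cost / active_users


def yearly_margin_per_user(yearly_revenue_per_user: float, cost_per_user: float) -> float:
    """What is left per user after technology cost (other costs ignored)."""
    return yearly_revenue_per_user - cost_per_user


# Placeholder numbers for illustration only.
cost = yearly_cost_per_user(1_000_000, 4_000_000)  # dollars per user per year
margin = yearly_margin_per_user(1.00, cost)
```

Tracking this number over time shows whether each hardware or hosting decision makes a user cheaper or more expensive to serve.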
True? False?
The Cloud
• I love a good buzzword
– Cloud Computing
– Economies of scale?
– Little to no SLA. Now you own my data
• Consumers eat the cloud
– Email (circa…how long ago?)
– Photos
– Interests
• The cloud has existed for consumers for the last 15 years
• As a business, unless you are doing something that requires huge volatile processing power, rent your servers.
• Don’t handshake your data
– Your business is your data
• Build your own cloud
– GlusterFS
– MaxiScale
Questions?