Chapter Three: Google Technology









Chapter Three:



Google Technology









“Apart from the problems of scaling traditional search techniques to data of this

magnitude, there are new technical challenges involved with using the additional

information present in hypertext to produce better search results.... Fast crawling

technology is needed to gather the Web documents and keep them up to date.

Storage space must be used efficiently to store indices and, optionally, the

documents themselves. The indexing system must process hundreds of gigabytes of

data efficiently. Queries must be handled quickly, at the rate of hundreds to

thousands per second.” – Sergey Brin and Lawrence Page, 1997¹

In the beginning, there was BackRub, the service that became Google. Today, Google is most

closely associated with its PageRank algorithm. PageRank is a voting algorithm weighted for

importance. The indicator of a Web page’s importance is the number of pages that link to a

particular page.

Messrs. Brin and Page soon added another factor which voted for the importance of a Web

page. This factor was the number of people who click on a Web page. The more clicks on a Web

page, the more weight that Web page was given. Over time, still other factors have been added

to the PageRank algorithm; for example, the frequency with which content on a page is

changed.
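
The link-voting idea can be made concrete with a small sketch. The Python below is an illustration, not Google’s implementation: it uses the simplified power-iteration form of PageRank with a damping factor of 0.85 (treated here as an assumption) and a four-page toy web invented for the example. It considers links only; the click and freshness signals mentioned above are not modelled.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", and "c" links back to "a".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy web, "c" ends up with the highest score because three pages vote for it, and "a" outranks "b" because its single inbound link comes from the important page "c". That is the sense in which the votes are weighted for importance.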

Google’s PageRank technology is closely allied with Internet search. Voting algorithms are

less effective in enterprise search, for instance. The attention given to Google and its search

technology dominates popular thinking about the company. Google search is like a nova. The



1. From “The Anatomy of a Large-Scale Hypertextual Web Search Engine,”
www-db.stanford.edu/~backrub/google.html





The Google Legacy 55










luminescence makes it difficult for the observer to see other aspects of the phenomenon

clearly or easily.

Radiance aside, Google is a technology company.2 Some of that technology, as described in
technical papers such as the earliest one, “The Anatomy of a Large-Scale Hypertextual Web
Search Engine,” is demanding. Later papers such as “MapReduce: Simplified Data
Processing on Large Clusters” can be a slow read.3 Because Google is technology, explaining
what Google does in an easily digestible form is difficult. The diagram below provides an
unauthorized snapshot of Google’s computing framework.









[Diagram: Google’s computing framework, with regions labeled a, b, c and d]









Important Google technologies that underlie this diagram of the Googleplex

include: [a] modifications to Linux to permit large file sizes and other functions so

as to accelerate the overall system; [b] a distributed architecture that allows

applications and scaling to be “plugged in” without the type of hands-on set-up

other operating systems require; [c] a technical architecture that is similar at every

level of scale; [d] a Web-centric architecture that allows new types of applications

to be built without a programming language limitation.







2. The annex to this monograph contains a listing of more than 60 Google patents. The list is

not all-inclusive; however, it does provide the patent number and a brief description for some of

Google’s most important patents. The PageRank patent belongs to the trustees of Stanford

University. Google’s patent efforts have focused on systems and methods for relevance,

advertising, and other core foci of the company. Google is creating a patent fence to protect its

interests.

3. Jeff Dean, former Alta Vista researcher and a Google senior engineer, has been an

advocate of MapReduce. His most recent papers are available on his Web page at
http://labs.google.com/people/jeff/.
















Google’s technology has emerged from a series of continuous improvements or what Japanese

management consultants call kaizen. Each Google technical change may be inconsequential to

the average user of Google. But when taken as a whole, Google’s “technological advantage”

comes from Google’s incremental innovations, clever adaptations of research-computing

concepts, and Byzantine tweaks to Linux. Some day, a historian of technology will be able to

identify, from the hundreds of improvements that Google has engineered in the last nine years,

one or two that stand with PageRank as of major importance. Critics of Google will see that

the company has grafted to its core technology processes from many different sources.

To illustrate, the structure of Google’s data centers and the messages passed to and from these

data centers are in many ways a variant of grid computing.4 Google’s ability to read data from

many computers simultaneously is reminiscent of BitTorrent’s technology.5 Google’s use of

commodity or “white box” hardware in its data centers is an indication of Google’s hacker

ethos. The use of memory and discs to store multiple copies of data comes from the frontiers

of computing.

Google’s approach to technology, then, is eclectic and in many ways represents a building

block approach to large-scale systems. Google benefits from that eclecticism in several ways.

First, Google’s computational framework delivers sizzling performance from low-cost

hardware. Second, Google worked around the bottlenecks of such operating systems as

Solaris, Windows Advanced Server, and off-the-shelf Linux. Third, Google took good

programming ideas from other languages, implementing new functions and libraries to

eliminate most of the manual coding required to parallelise an application across Google’s

servers.6
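
The programming model behind that third point can be sketched in a few lines. The example below is a single-process illustration of the map and reduce pattern named in the MapReduce paper; the function names are hypothetical, and nothing here reproduces Google’s library, which distributes the map and reduce calls across many machines and handles failures along the way.

```python
from collections import defaultdict


def map_words(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: combine all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(documents, mapper, reducer):
    """Sequential stand-in for the framework: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for document in documents:
        for key, value in mapper(document):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


counts = run_mapreduce(
    ["the web is big", "the index is bigger"], map_words, reduce_counts
)
```

The programmer writes only `map_words` and `reduce_counts`. Everything in `run_mapreduce` is the part a framework can parallelise on the programmer’s behalf, which is where the saving in manual coding comes from.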

According to Jeff Dean, one of Google’s senior engineers, “Google engineering is sort of

chaotic.”7 This is neither surprising nor necessarily a negative. The Googleplex is a toy box

for engineers and programmers. The tools are sophisticated. The challenges of the problems

and peers make Google “the place to be” for the best and brightest technical talent in the

world. The nature of creativity combined with Google’s approach to innovation makes it

difficult to predict the next big thing from Google.

Before reviewing selected parts of Google’s technology in somewhat more detail, the diagram

“Google’s Computing Framework” provides an overview of the Googleplex and some of its

technologies. These will be touched upon in this section.







4. Grid computing is applying resources from many computers in a network to a single problem

or application. Google uses grid-like technology in its distributed computing system.

5. BitTorrent is a peer-to-peer file distribution tool written by programmer Bram Cohen in

2001. The reference implementation is written in Python and is released under the MIT License.

6. Google has anywhere from 100,000 to 165,000 or more servers. Servers are organized into

clusters. Clusters may reside within one rack or across multiple racks of servers. Some Google

functions are distributed across data centers.

7. From Dr Dean’s speech at the University of Washington in October 2003. See
http://www.uwtv.org/programs/displayevent.asp?rid=2459.














PageRank requires a lot of computing horsepower to work. When Google got

underway in 1996, Messrs. Brin and Page had limited computing horsepower. In order to

make PageRank work, they had to figure out how to get the PageRank algorithm to run on

garden-variety computers available to them.

From the beginning – and this is an important issue with regard to Google’s almost-certain

collision course with Microsoft – Google had to solve both software engineering and

hardware engineering issues to make Google Search viable. In fact, when discussing Google

technology, it is important to keep in mind that PageRank is important only because it can run

quickly in the real world, not in a sterile computer lab illuminated with the blue glow of

supercomputers.

The figure “Google’s Fusion: Hardware and Software Innovations” shows that Google’s

technology framework has two areas of activity. There is the software engineering effort that

focuses on PageRank and other applications. Software engineering, as used here, means

writing code and thinking about how computer systems operate in order to get work done

quickly. Quickly means the sub one-second response times that Google is able to maintain

despite its surging growth in usage, applications and data processing.

Google’s Fusion: Hardware and Software Innovations







The Google phenomenon comes from the fusion occurring when PageRank’s software and hardware
engineering interact. Google’s technology delivers supercomputer applications for mass markets.









The other effort focuses on hardware. Google has refined server racks, cable placement,

cooling devices, and data center layout. The payoff is lower operating costs and the ability to

scale as demand for computing resources increases. With faster turnaround and the
















elimination of such troublesome jobs as backing up data, Google’s hardware innovations give

it a competitive advantage few of its rivals can equal as of mid-2005.

PageRank with its layering of additional computations added over the years is a software

problem of considerable difficulty. The Google system must find Web pages and perform

dozens, if not hundreds, of analyses of those Web pages. Consider the links pointing to a Web

page. Google must keep track of them for more than eight billion Web pages. For a single Web

page with one link pointing to it, the problem is trivial. One link equals one pointer. But what

happens when a site has 10,000 links pointing to it? The problem becomes many times larger

and more computationally demanding. Some of these links are likely to come from sites that

have more traffic than others. Some of the links may come from sites that have spoofed

Google for fun or profit. The calculations to sort out the “value” of each of these links add to
the computational work associated with PageRank. Keeping track of these factors is a big job.
Sizing up different factors against one another for a single page can be hard without a
calculator to help. Take the same task and apply it to a couple of billion Web pages, and the
computing task becomes one for a supercomputer.

Yet this task is everyday stuff for Google and its PageRank process. Users do not give much

thought to what technology underpins a routine query or the 300 million queries Google

handles each day. In a single second, Google’s technology handles around 3,500 queries in

dozens of languages from users worldwide.
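
The daily volume quoted above can be converted into an average per-second rate with simple arithmetic. The sketch below assumes queries are spread evenly across the day, which real traffic is not, so the result is an average rather than a peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400 seconds in a day

queries_per_day = 300_000_000         # daily figure quoted in the text
average_qps = queries_per_day / SECONDS_PER_DAY

print(f"average load: about {average_qps:,.0f} queries per second")
```

At 300 million queries a day the average works out to a little under 3,500 queries per second, and peak-hour load would be higher still.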

Google’s technology cannot be separated from search. Search was the prime mover in the

Google universe. Once Messrs. Brin and Page were able to fiddle with a limited number of

commodity computers and make their PageRank algorithm work, Google was headed down a

road that it still follows.

The software requires a suitable hardware and network infrastructure in which to operate.

Without Google’s hardware and software, there would be no Google. Hardware and software

are inextricably linked at Google. With each new advance in software, Google’s engineers

must make correspondingly significant advances in hardware. And when hardware engineers

come up with an advance, the software engineers greedily use that advance to up the

functionality of their software.

What Google owns is its own snappy, turbocharged supercomputer, interesting software tools,

and several thousand people trying to figure out what else the Googleplex can do. Some of the

tinkerers come at the problem from bits and bytes, writing code, and weaving applications out

of the available functions. The result is a brilliant product.

Others come at the problem from the soldering iron and screwdriver angle. These engineers

look for ways to build hardware and physical systems that can perform the calculations needed

to make PageRank work. Google’s approach to data centers, the racks in the data centers, and

the devices in the racks in the data centers is as clever as the company’s search system. The

hardware has to be more than clever. The hardware has to work 24x7, under continuous load,

and in locations from Switzerland to Beijing. The synergy between software and hardware is

perhaps one of Google’s major accomplishments.
















How Google Is Different from MSN and Yahoo

Google’s technology is simultaneously just like other online companies’ technology, and very

different. A data center is usually a facility owned and operated by a third party where

customers place their servers. The staff of the data center manage the power, air conditioning

and routine maintenance. The customer specifies the computers and components. When a data

center must expand, the staff of the facility may handle virtually all routine chores and may

work with the customer’s engineers for certain more specialized tasks.

Before looking at some significant engineering differences between Google and two of its

major competitors, review this list of characteristics for a Google data center.

1 Google operates about two dozen data centers, although no one outside Google

knows the exact number or their locations. They come online and automatically, under

the direction of the Google File System, start getting work from other data centers.

These facilities, sometimes filled with 10,000 or more Google computers, find one

another and configure themselves with minimal human intervention.

2 The hardware in a Google data center can be bought at a local computer store. Google

uses the same types of memory, disc drives, fans and power supplies as those in a

standard desktop PC.

3 Each Google server comes in a standard case called a pizza box with one important

change: the plugs and ports are at the front of the box to make access faster and easier.

4 Google racks are assembled for Google to hold servers on their front and back sides.

This effectively allows a standard rack, normally holding 40 pizza box servers, to hold

80.

5 A Google data center can go from a stack of parts to online operation in as little as 72

hours, unlike more typical data centers that can require a week or even a month to get

additional resources online.

6 Each server, rack and data center works in a way that is similar to what is called “plug

and play.” Like a mouse plugged into the USB port on a laptop, Google’s network of data

centers knows when more resources have been connected. These resources, for the most

part, go into operation without human intervention.

Several of these factors are dependent on software. This overlap between the hardware and

software competencies at Google, as previously noted, illustrates the symbiotic relationship

between these two different engineering approaches. At Google, from its inception, Google

software and Google hardware have been tightly coupled. Google is not a software company

nor is it a hardware company. Google is, like IBM, a company that owes its existence to both

hardware and software. Unlike IBM, Google has a business model that is advertiser supported.

Technically, Google is conceptually closer to IBM (at one time a hardware and software

company) than it is to Microsoft (primarily a software company) or Yahoo! (an integrator of

multiple software systems).


















Software and hardware engineering cannot be easily segregated at Google. At MSN and Yahoo,

hardware and software are more loosely coupled. Two examples will illustrate these

differences.

Microsoft – with some minor excursions into the Xbox game machine and peripherals –

develops operating systems and traditional applications. Microsoft has multiple operating

systems, and its engineers are hard at work on the company’s next generation of operating

systems. Microsoft does not design or make its own hardware. Its operating systems are coded,

for example, for processors that evolved from the Intel chips for personal computers. Recently

Microsoft embarked on a new path with its game machine, the Xbox 360. The new Xbox uses

a processor from IBM’s family of PowerPC chips also used in the Macintosh computer, the

Sony PlayStation 3, and Nintendo’s next-generation game machines. Microsoft’s applications run on

Microsoft operating systems, although versions of Microsoft Office and Internet Explorer run

on Apple’s Macintosh.

In addition, Microsoft buys hardware from various suppliers to run its online systems. Most of

these suppliers, not surprisingly, are certified by Microsoft. Examples include Microsoft’s use

of Dell Computers. Microsoft’s engineers use these machines in configurations required by the

Microsoft operating systems and applications. For example, Microsoft servers often require a

load balancing feature. Microsoft implements its load balancing via software. When more

performance is required, Microsoft upgrades the hardware, adds memory, or shifts to higher-

speed hard drive technology instead of recoding the operating system itself to deliver higher

performance as Google does. Once a function is released to customers, Microsoft’s engineers

focus on stamping out bugs. Re-engineering a software application for higher performance is

not typically a priority.

Several observations are warranted:

1 Unlike Google, Microsoft does not focus on performance as an end in itself. As a result,

Microsoft gets performance the way most computer users do. Microsoft buys or

upgrades machines. Microsoft does not fiddle with its operating systems and their

subfunctions to get that extra time slice or two out of the hardware.

2 Unlike Google, Microsoft has to support many operating systems and invest time and

energy in making certain that important legacy applications such as Microsoft Office or

SQL Server can run on these new operating systems. Microsoft has a boat anchor tied to

its engineers’ ankles. The boat anchor is the need to ensure that legacy code works in

Microsoft’s latest and greatest operating systems.

3 Unlike Google, Microsoft has no significant track record in designing and building

hardware for distributed, massively parallelised computing. The mice and keyboards

were a success. Microsoft has continued to lose money on the Xbox, and the sudden

demise of Microsoft’s entry into the home network hardware market provides more

evidence that Microsoft does not have a hardware competency equal to Google’s.


















In terms of technology, Google has the hardware and software engineering expertise to build

applications rapidly, perform computationally-intensive applications quickly, and deliver

high-reliability services from low-cost, commodity hardware.

Yahoo! operates differently from both Google and Microsoft. Yahoo! is in mid-2005 a direct

competitor to Google for advertising dollars. Yahoo! has grown through acquisitions. In

search, for example, Yahoo acquired 3721.com to handle Chinese language search and

retrieval. Yahoo bought Inktomi to provide Web search. Yahoo bought Stata Labs in order to

provide users with search and retrieval of their Yahoo! mail. Yahoo! also owns

AllTheWeb.com, a Web search site created by FAST Search & Transfer. Yahoo! owns the

Overture search technology used by advertisers to locate key words to bid on. Yahoo! owns

Alta Vista, the Web search system developed by Digital Equipment Corp. Yahoo! licenses

InQuira search for customer support functions. Yahoo has a jumble of search technology;

Google has one search technology.

Historically Yahoo has acquired technology companies and allowed each company to operate

its technology in a silo. Integration of these different technologies is a time-consuming,

expensive activity for Yahoo. Each of these software applications requires servers and systems

particular to each technology. The result is that Yahoo has a mosaic of operating systems,

hardware and systems. Yahoo!’s problem is different from Microsoft’s legacy boat-anchor

problem. Yahoo! faces a Balkan-states problem.

There are many voices, many needs, and many opposing interests. Yahoo! must invest in

management resources to keep the peace. Yahoo! does not have a core competency in

hardware engineering for performance and consistency. Yahoo! may well have considerable

competency in supporting a crazy-quilt of hardware and operating systems, however. Yahoo!

is not a software engineering company. Its engineers make functions from disparate systems

available via a portal.

Google also acquires technology. A good example is Picasa. The photo management software

runs on the user’s Windows PC.

The program has been integrated with several of Google’s network-centric applications:

1 Gmail. The user’s images can be uploaded and sent via email to friends, colleagues and

family. A Picasa user without a Gmail account is able to register and receive a user

name and password. The Gmail account can also be used, if the user wishes, for other

Google services, including Fusion, which is Google’s personalized portal, and the

search history function, which saves a registered user’s Google queries for later

reference.

2 Blog Publishing. The user can post pictures to a Google property, Blogger.com. The

image publishing function is simplified to one or two clicks. Posting images on some

Web log systems is beyond the expertise of many computer users.

3 Image Printing. The user can send images to online photo processing services.


















[Screenshot: the Picasa interface, with three callouts: one-click access to functions performed
on the user’s local computer; recently viewed images; and one-click access to network services
available as part of the user’s virtual application.]







In sharp contrast to Yahoo’s approach, Google integrated the Picasa application into the

Googleplex. The “hooks” are painless to the user.8 Google has bundled into one free

application point-and-click solutions to make management of digital still images intuitive and

fluid. Yahoo!’s acquisitions, in general, are not woven into a seamless experience with other

Yahoo! services. Consider the 3721.com search system. That service remains a separate

Chinese language operation available from mostly non-English Yahoo pages. Google

constructs an application using some code on the user’s PC and other software running on the

Googleplex somewhere on the Internet.

These three companies, different in structure and technical focus, are on a collision course.

Like vessels in America’s Cup, each is going toward the same goal, but subject to forces

difficult for their helmsmen to control. Even though there is market space among the three,





8. Picasa requires a download. The installation process is smooth. Indexing speed was about

five times faster than ACDSee’s image management program, a competitive product. With

Picasa, Google’s technologists demonstrate a rapid, trouble-free installation and an intuitive

interface.














collisions are inevitable. The figure below provides an overview of the mid-2005 technical

orientation of Google, Microsoft and Yahoo.









MSN, and by extension Microsoft Corporation, has a core competency in software. The

company has grown from its operating system roots to provide a range of products for mobile

devices, desktop and notebook computers, and enterprise-class servers. Looking forward, the

company’s Dot Net technology is Microsoft’s framework for virtual applications. In some

ways, Dot Net is a less-open version of the AJAX technology that Google uses in the Google

Maps and Gmail products. Microsoft has expended great effort to push Windows downward to

mobile devices and outward to network-centric computers in an effort to increase revenue. For

Microsoft to continue to be the dominant force in software in the future, the company must be

able to capture a commanding share of the market for network-centric applications. However,

Microsoft’s weak point (whether real or perceived) is its products’ vulnerability to security

breaches. Patch after patch, problem after problem, then promise after promise have done little

to bolster the firm’s credibility for delivering secure systems and software. Looking forward

over the next 12 to 18 months, Microsoft’s prospects hinge on security, cost and its developer

community. The growth of open source alternatives is hard proof that die-hard Microsoft

users are willing to shift for security, cost savings and functionality. Microsoft has weaknesses

that can be attacked by Google and other competitors.

Yahoo’s situation is typical of many American organizations. Most large US corporations are a

hotch-potch of different systems, incompatible architectures and a Tower of Babel of data

formats. For Yahoo to deliver specific markets to its advertisers, Yahoo must integrate

information from disparate systems and be able to segment and deliver ads to those users

efficiently. Yahoo is now spending money to break down the walls of its data silos and

integrate its user data. If Yahoo cannot deliver narrowly segmented markets, advertisers

may abandon Yahoo for services that offer more targeted marketing opportunities. After years

of flirting with becoming a New Age America Online, Yahoo is beginning to behave like a

traditional media company.














MSN and Yahoo! are becoming ad-supported versions of general-interest portals like Yahoo,

America Online and Tiscali. In contrast, Google is focusing on applications that tie users to its

Googleplex. The company’s focus on hardware and software engineering gives it a cost and

performance advantage over MSN and Yahoo, among others competing in Web search.

Google’s high-performance, homogeneous Googleplex means that the company does not

struggle with some integration, performance and cost issues that bedevil MSN and Yahoo.

Google may not be doing everything right from a computer science point of view. Compared

to MSN or Yahoo, Google is doing less wrong than these two aggressive competitors.



The Technology Precepts

Google’s technology uses concepts and techniques from the leading edge of computer science.

Most of these innovations are difficult to explain to engineers steeped in traditional approaches

to massively distributed, highly parallelized computing. The eclectic footnotes and references

in the earlier BackRub paper have been sharpened in Google’s later technical presentations.

Readers without a first-hand understanding of NOW-Sort, River, and BAD-FS are unlikely to

craft dinner conversation from Google’s explanations of the influence of these research

computing demonstrations.9

For the purposes of this monograph and understanding the nature of Google’s technology, five

precepts thread through Google’s technical papers and presentations. The following snapshots

are extreme simplifications of complex, yet extremely fundamental, aspects of the

Googleplex.



Cheap Hardware and Smart Software

Google’s use of commodity hardware for high-demand, 24x7 systems has existed as a core

precept since 1996. Most of its competitors’ online systems combine branded hardware from

IBM, Sun Microsystems, Hewlett-Packard, and Dell Computers with specialized peripherals.

The operating systems in use are a combination of Unix and Microsoft operating systems with

some Linux and open source components.

Google approaches the problem of reducing the costs of hardware, set up, burn-in and

maintenance pragmatically. A large number of cheap devices using off-the-shelf commodity

controllers, cables and memory reduces costs. But cheap hardware fails.

In order to minimize the “cost” of failure, Google conceived of smart software that would

perform whatever tasks were needed when hardware devices fail. A single device or an entire

rack of devices could crash, and the overall system would not fail. More important, when such

a crash occurs, no full-time systems engineering team has to perform technical triage at 3 a.m.
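
What “smart software” means in practice can be shown with a toy scheduler that routes work around a dead machine. This is a minimal sketch under stated assumptions: the worker names, the `execute` callback and the use of `ConnectionError` to signal a crashed device are all hypothetical and are not drawn from Google’s papers.

```python
def run_with_failover(tasks, workers, execute):
    """Run each task on the first worker that does not fail.

    Workers that raise ConnectionError are marked as failed and skipped
    for all later tasks, so no operator has to step in.
    """
    results = {}
    failed = set()
    for task in tasks:
        for worker in workers:
            if worker in failed:
                continue
            try:
                results[task] = execute(worker, task)
                break
            except ConnectionError:
                failed.add(worker)
        else:
            # Reached only when every worker was skipped or failed.
            raise RuntimeError(f"no healthy worker left for {task!r}")
    return results, failed


def flaky_execute(worker, task):
    """Hypothetical executor in which the machine "box-1" has crashed."""
    if worker == "box-1":
        raise ConnectionError("box-1 is down")
    return f"{task} done on {worker}"


results, failed = run_with_failover(
    ["index", "crawl"], ["box-1", "box-2"], flaky_execute
)
```

Both tasks complete on "box-2" even though "box-1" never responds. The failure costs one retry rather than an outage, which is the trade the cheap-hardware precept depends on.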



9. See, for example, Andrea C. Arpaci-Dusseau et al., “High-Performance Sorting on Networks
of Workstations,” in Proceedings of the 1997 ACM SIGMOD International Conference on
Management of Data, Tucson, Arizona, May 1997, or John Bent et al., “Explicit Control in a
Batch-Aware Distributed File System,” in Proceedings of the 1st USENIX Symposium on
Networked Systems Design and Implementation, March 2004.














The focus on low-cost, commodity hardware and smart software is part of the Google culture.

In one presentation at a December 2004 technical conference, a Google spokesman joked that

anyone in the room could buy the same hardware that Google uses at Fry’s Electronics, a

retail chain with stores in Palo Alto and other cities in California.



Logical Architecture

Google’s technical papers do not describe the architecture of the Googleplex as self-similar.

Google’s technical papers provide tantalizing glimpses of an approach to online systems that

makes a single server share features and functions of a cluster of servers, a complete data

center, and a group of Google’s data centers.
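
One way to picture a design that looks the same at every level is a recursive structure in which a server, a cluster, a data center and the whole system all answer the same questions. The sketch below is an interpretation, not something Google’s papers describe: the `Node` class, the fan-out of three and the 80 GB disk size are invented for illustration.

```python
class Node:
    """A server, a cluster, a data center, or the whole system."""

    def __init__(self, name, children=None, disk_gb=0):
        self.name = name
        self.children = children or []
        self.disk_gb = disk_gb  # only meaningful for a leaf (a single server)

    def server_count(self):
        # A leaf is one server; any larger unit is the sum of its parts.
        if not self.children:
            return 1
        return sum(child.server_count() for child in self.children)

    def capacity_gb(self):
        if not self.children:
            return self.disk_gb
        return sum(child.capacity_gb() for child in self.children)


def build(name, fanout, depth, disk_gb=80):
    """Build a tree whose every level has the same shape as the one below it."""
    if depth == 0:
        return Node(name, disk_gb=disk_gb)
    children = [build(f"{name}.{i}", fanout, depth - 1, disk_gb) for i in range(fanout)]
    return Node(name, children)


# Three levels: clusters inside data centers inside the whole system.
googleplex = build("plex", fanout=3, depth=3)
```

Because every level exposes the same two methods, code that asks a single server for its capacity can ask a data center the same question without modification.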

The diagram below shows a representation of the Googleplex’s tightly organized, highly

regular organization of files, servers, clusters, and more than two dozen data centers in a stable

organizational pattern.10

[Diagram: a triangular fractal with five callouts]
The Googleplex is a larger instance of the organization of a single pizza box server.
A data centre uses the same design and is composed of racks.
A single Google cluster embodies the same organizing principle as a single pizza box server.
A single replicated Google file reflects the controlling organizing principle.
A single Google pizza box server.









The diagram illustrates that Google’s technical infrastructure is similar at every level in the

Googleplex. The collection of servers running Google applications on the Google version of

Linux is a supercomputer. The Googleplex can perform mundane computing chores like

taking a user’s query and matching it to documents Google has indexed. Furthermore, the

Googleplex can perform side calculations needed to embed ads in the results pages shown to

users, execute parallelized, high-speed data transfers like computers running state-of-the-art

storage devices, and handle necessary housekeeping chores for usage tracking and billing.





10. The illustration is a Sierpinski triangle, chosen because it conveys how each component

in Google’s infrastructure replicates other larger combinations of servers and data centers. The

overall structure – in this illustration an equilateral triangle – expresses the stability of the

Google approach to its system. This famous fractal connotes how Google scales without

altering the micro or macro structure of the Googleplex.
















What is of interest is that Google does this with low-cost commodity hardware running on

Google’s version of Linux. Google has infused the Googleplex with logic that allows software

to handle data recovery, to streamline messages passed from server to server, and to grab

additional computing resources in order to complete a job quickly. When Google needs to add

processing capacity or additional storage, Google’s engineers plug in the needed resources.

Due to self-similarity, the Googleplex can recognize, configure and use the new resource.

Google has an almost unlimited flexibility with regard to scaling and accessing the capabilities

of the Googleplex. Unlike a collection of different building materials, Google’s approach

delivers a homogeneous computing system.
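The self-similar idea described above can be pictured as a structure in which every level, from a single pizza box to a data center, answers the same questions in the same way. The following is a minimal sketch of that idea only, not Google's implementation. The class name, the capacity figures and the helper function are hypothetical and were chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One level of a self-similar hierarchy (server, rack, cluster, data center).

    Names and capacity numbers are hypothetical; only the notion that every
    level exposes the same interface comes from the text.
    """
    name: str
    own_capacity: int = 0                      # non-zero only for a single server
    children: List["Node"] = field(default_factory=list)

    def capacity(self) -> int:
        # A server reports its own capacity; any larger unit is the sum of
        # its parts, computed by the same method at every level.
        return self.own_capacity + sum(c.capacity() for c in self.children)

    def add(self, child: "Node") -> None:
        # "Plugging in" a resource is the same operation at every level.
        self.children.append(child)


def build_rack(name: str, servers: int, per_server: int) -> Node:
    """Assemble a rack out of identical servers."""
    rack = Node(name)
    for i in range(servers):
        rack.add(Node(f"{name}-server-{i}", own_capacity=per_server))
    return rack


data_center = Node("dc-1")
data_center.add(build_rack("rack-a", servers=40, per_server=2))
print(data_center.capacity())   # 80
data_center.add(build_rack("rack-b", servers=40, per_server=2))
print(data_center.capacity())   # 160
```

Because a rack, a cluster and a data center are all the same kind of object in this sketch, adding capacity never requires a special case for the level at which it is added.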

A good example is bringing a new rack of 40 or more pizza box servers online and creating

one of the many types of servers Google uses.11 Servers, according to the fractal architecture,

consist of two or more clusters of pizza boxes. A cluster allows data to be replicated and work

shared among pizza boxes with spare capacity. A rack is assembled and then Google’s pizza

box servers are “plugged in.” Cables are attached among the pizza boxes and the rack is then

plugged into a network hub. An engineer turns on the power, and the other devices become

aware of the new rack’s resources. Master servers – Google’s term for the pizza box that is in

charge of one or more clusters – instruct other servers to copy data to the new cluster and begin

using the clusters to do work.

In Google’s self-similar architecture, the loss of an individual device is irrelevant. In fact, a

rack or a data center can fail without data loss or taking the Googleplex down. The Google

operating system ensures that each file is written three to six times to different storage devices.

When a copy of that file is not available, the Googleplex consults a log for the location of the

copies of the needed file. The application then uses that replica of the needed file and

continues with the job’s processing. Redundancy and other engineering tweaks to Linux give

the Googleplex ways to eliminate or reduce the bottlenecks associated with traditional online

computer systems’ operation. The Google technical recipe includes distributed computing,

optimized file handling, and embedded logic to make the servers working on tasks smarter.

This architecture allows Google to expand its computational capacity, its storage and its

supported applications with an ease and price point rivals cannot easily match. According to

Jeff Dean, one of Google’s senior engineers, “At Google, everything is about scale.”12
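The replication behaviour described above, in which a file is written several times and a log is consulted when one copy is unavailable, can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the Google File System. The class `ReplicaLog`, its methods and the device names are hypothetical.

```python
from typing import Dict, List, Set


class ReplicaLog:
    """Toy stand-in for the log that records where copies of a file live."""

    def __init__(self) -> None:
        self.locations: Dict[str, List[str]] = {}
        self.failed: Set[str] = set()

    def write(self, filename: str, devices: List[str], copies: int = 3) -> None:
        # The text says each file is written three to six times; enforcing
        # that range here is a choice made for this sketch.
        if not 3 <= copies <= 6:
            raise ValueError("copies must be between 3 and 6")
        chosen = devices[:copies]
        if len(set(chosen)) < copies:
            raise ValueError("need a distinct device for every copy")
        self.locations[filename] = chosen

    def mark_failed(self, device: str) -> None:
        self.failed.add(device)

    def locate(self, filename: str) -> str:
        # Return the first replica that sits on a device still alive.
        for device in self.locations[filename]:
            if device not in self.failed:
                return device
        raise LookupError(f"all replicas of {filename} are unavailable")


log = ReplicaLog()
log.write("index-0001", ["disk-a", "disk-b", "disk-c"])
log.mark_failed("disk-a")
print(log.locate("index-0001"))   # disk-b
```

The job that asked for the file never learns that a device died. It simply receives the location of a surviving copy and carries on.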



Speed and Then More Speed

Google Search is fast with most results coming back to the user in less than one second. In

commercial data centers, speed has traditionally been achieved by buying high-end,

high-performance hardware from manufacturers such as Sun Microsystems and using

advanced storage devices connected to the servers by exotic fibre optics.





11.Data centers use computer cases that are shaped like the boxes used to hold pizzas. The

term pizza boxes has been appropriated by engineers to describe one of the standard form

factors for servers housed in rack mounts in data centers.

12.Statement made at the University of Washington, October 2004














Not Google. Google uses commodity pizza box servers organized in a cluster. A cluster is a

group of computers that are joined together to create a more robust system. Instead of using

exotic servers with eight or more processors, Google generally uses servers that have two

processors similar to those found in a typical home computer.

Through proprietary changes to Linux and other engineering innovations, Google is able to

achieve supercomputer performance from components that are cheap and widely available.

The table below provides some data from 2002 about the speed with which Google can read

data from hard drives:13









These data show the results of two clusters’ performance. Google’s read throughput has

gone up since 2002. Based on increases in commodity drive throughput, Google’s read

rate may be close to 2,000 megabytes per second, although that estimate may reflect a Google

watcher’s enthusiasm boosting already-robust figures.



To put these data in a context of 2002 technology, consider that an IBM EXP3 storage device

available in 2002 could read data in burst mode at the rate of about 58 MB / second. Google’s

read rate in 2002 averaged ten times the read rate of the IBM EXP3. The write rate is comparable.

The cost of a single IBM EXP3 in 2002 was about $18,000 for 360 gigabytes of storage,

excluding controller and cables. Google’s cost for comparable storage and the higher

performance was about $1,000. For greater speed, Google spends less. In the world of ever-

increasing demands for speed and storage, Google has a strong one-two punch.14 Advances in

commodity storage devices translate to even faster performance for Google. Google has not

updated its read rate data, but engineers familiar with Google believe that read rates may in

some clusters approach 2,000 megabytes a second. When commodity hardware gets better,

Google runs faster without paying a premium for that performance gain.
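The comparison in this passage reduces to a few divisions. The sketch below works only from the 2002 figures quoted in the text. It assumes that Google's $1,000 buys the same 360 gigabytes as the IBM device, which is how “comparable storage” is read here, and the variable names are chosen for illustration.

```python
# Figures quoted in the text for 2002.
exp3_cost_usd = 18_000       # one IBM EXP3, excluding controller and cables
exp3_capacity_gb = 360
exp3_burst_mb_s = 58         # burst-mode read rate
google_cost_usd = 1_000      # "comparable storage", per the text
google_speedup = 10          # "ten times the read rate"

exp3_usd_per_gb = exp3_cost_usd / exp3_capacity_gb
# Assumption: Google's $1,000 covers the same 360 GB.
google_usd_per_gb = google_cost_usd / exp3_capacity_gb
google_read_mb_s = exp3_burst_mb_s * google_speedup
cost_ratio = exp3_cost_usd / google_cost_usd

print(f"IBM EXP3: ${exp3_usd_per_gb:.2f} per GB")       # $50.00 per GB
print(f"Google:   ${google_usd_per_gb:.2f} per GB")     # $2.78 per GB
print(f"Implied Google read rate: {google_read_mb_s} MB/s")   # 580 MB/s
print(f"Cost ratio: {cost_ratio:.0f} to 1")             # 18 to 1
```

On these numbers the commodity approach costs roughly one-eighteenth as much per gigabyte while reading about ten times faster.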

Google engineers for computational speed. Google’s approach has been to focus on making its

software engineering produce the turbocharged performance. Speed is crucial to Google’s

PageRank and other analytic processes. If Google’s computational throughput were slow,

Google could not perform the work needed to know that for a particular query, a particular set







13.From “The Google File System” by Sanjay Ghemawat, Howard Gobioff, and Shun-Tak

Leung (Google) ACM SOSP 2003 Conference Proceedings 1-58113-757-5/03/0010, page 12.

14.With Google’s advanced programming tools, Google is able to increase the productivity of

its engineers. Combined with hardware speed and performance, Google squeezes out more

productivity by applying its engineering talents to application development. This is a one-two-

three punch to which Google’s competitors have to respond.
















of indexed Web pages is the best match. Without fast response to a query, users would not be

willing to run multiple queries and interact fluidly with the Google applications.

Google does not mindlessly match key words in a user’s query to the terms in the Google

index. Google’s approach is more subtle and computationally involved, although term

matching is an important part of the Google process. Google reviews data, various scores or

values from certain algorithms. Google then uses these different values in other algorithms to

find search results, identify the best match (Google’s “Feeling Lucky” link), extract matching

ads from its advertising server, and continuously update values as Google users click on

links. Once these various query and ad matching processes are complete, Google displays the

results page to the user, typically in less than one second across a public network.

Google is a hot rod computer that can perform the basic mathematics needed to deliver most

search results in less than a half second, display maps with the speed of a dedicated desktop

application like Encarta, and look at a Web page matching a user’s query and, in some

applications, insert additional hyperlinks to related content before displaying the results page

to the user. The Googleplex does experience slowdowns. When these occur, the Googleplex

allocates additional resources to eliminate the brownout.

Speed has many meanings at Google. Speed means that users can interact with the Google

products and services as if the Google application were running on a dedicated PC in front of

the user. Speed also means that Google must be able to expand its computational and storage

capacity quickly. Speed also means rapid development and deployment of new products.

Speed, like Google’s ability to scale, is a core functionality of the Googleplex.

Google applies its high-speed technology to search and to other types of servers. Among the

servers using Google’s go-fast technology are those shown below:



Type                 Function

Advertising server   Delivers text and other paid advertisements for AdWords and AdSense.

Chunkserver          Schedules and delivers blocks of data for further processing.

Image servers        Serves images for Google Image, Print and Video services.

Index server         The workhorse of search. Server handles search-and-retrieval.

Mail server          Delivers the Gmail service.

News server          Gathers, analyses and displays news.

Web server           Orders results and makes them available to users.



What does the combination of go-fast technology plus multiple types of Google data allow the

company to do? Google can engage in fast new product development. One example is Google

Maps. Google developed a basic mapping product over the course of 2004. In late 2004,

Google purchased Keyhole. By June 30, 2005, Google had:

1 Released a basic mapping product.














2 Integrated information from Google Local in early 2005.

3 Hooked Keyhole satellite imagery into Google Maps in early May 2005.

4 Announced Google Earth in May 2005.

5 Upgraded the system to integrate two dimensional point-to-point routes on top of

satellite imagery.

6 Demonstrated a function that accepts a query in another language, translates the results

to the user’s language, and displays the data in a three-dimensional mode.

The image below shows that Google’s Map and Earth service pushes the functions of online

map and data integration to another level. In the span of several days, Google integrated

Keyhole technology, launched, upgraded and redefined online mapping services.15









This is the result of a Japanese-language Google Maps-Earth query for the location of Wendy’s

restaurants in New York City. The addition of the Japanese language support, the three-dimensional

view of the section of Manhattan where the user wants directions, and the integration of hot links, the

two dimensional map, and information about the restaurants was part of Google’s fast-cycle launch

and enhancement program designed to beat Microsoft to the market.



Another key notion of speed at Google concerns writing computer programs to deploy to

Google users. Google has developed shortcuts to programming. An example is Google’s

creating a library of canned functions to make it easy for a programmer to optimize a program

to run on the Googleplex computer. At Microsoft or Yahoo, a programmer must write some



15.The source for this image was http://blog.eee-craft.com/archives/23345086.html.
















code or fiddle with code to get different pieces of a program to execute simultaneously using

multiple processors. Not at Google. A programmer writes a program, uses a function from a

Google bundle of canned routines, and lets the Googleplex handle the details. Google’s

programmers are freed from much of the tedium associated with writing software for a

distributed, parallel computer. What does increased programmer productivity mean? In terms

of money, Google makes each engineering dollar go farther. If a single programmer can reduce

by 10 percent the time required to code a program, the savings could be several thousand

dollars. If a programmer can slash coding time in half, Google gets twice the potential

productivity out of each of its 3,000 plus programmers.16
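The “canned routine” idea can be illustrated with a small map-and-reduce helper. In this sketch the programmer writes two ordinary functions and a library routine takes care of fanning the work out and gathering the results. It is a toy under stated assumptions, not Google's MapReduce library. The helper `run_map_reduce` and the word-count functions are hypothetical, and a thread pool from the Python standard library stands in for a cluster of machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple


def run_map_reduce(
    inputs: Iterable[str],
    map_fn: Callable[[str], List[Tuple[str, int]]],
    reduce_fn: Callable[[str, List[int]], int],
    workers: int = 4,
) -> Dict[str, int]:
    """Hypothetical canned routine: the caller supplies map_fn and reduce_fn,
    and this helper handles fan-out, grouping and fan-in."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, inputs))    # map phase, run concurrently

    grouped: Dict[str, List[int]] = defaultdict(list)
    for pairs in mapped:                           # shuffle: group values by key
        for key, value in pairs:
            grouped[key].append(value)

    # Reduce phase: collapse each key's list of values to a single result.
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Everything below is all the "application programmer" has to write.
def count_words(document: str) -> List[Tuple[str, int]]:
    return [(word.lower(), 1) for word in document.split()]


def total(_word: str, counts: List[int]) -> int:
    return sum(counts)


docs = ["Google File System", "google cluster", "file system design"]
word_counts = run_map_reduce(docs, count_words, total)
print(word_counts["google"], word_counts["system"])   # 2 2
```

The two application functions contain no threading, locking or scheduling code, which is the productivity gain the passage describes. In a real cluster the helper would also have to deal with machine failures and data placement, which this sketch ignores.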



Eliminate or Reduce Certain System Expenses

Some lucky investors jumped on the Google bandwagon early. Nevertheless, Google was

frugal, partly by necessity and partly by design. The focus on frugality influenced many

hardware and software engineering decisions at the company. Spending money wisely does

not mean cheaply. Examples of how Google eliminates or reduces certain system expenses

include:

• Google eliminates the costs associated with backing up and restoring data when a

hardware failure occurs. The fractal principle requires that Google replicate data three to

six times elsewhere in the Googleplex. When a device fails, the “master server” for a

task looks at a file that tells where the other copies of the data or the programs are. The

“master server” then uses those data or those processes to complete a task. No tape, no

human intervention, and no downtime; Google does not have these costs due to its

engineering acumen.

• Google does not have to certify new hardware. When additional storage or

computational capacity is required, Google technicians assemble one or more racks of

Google “pizza boxes.” Once in the rack, the Googleplex recognizes the new resources in

a way that is similar to how a laptop knows when a user plugs in a USB mouse. The

expensive certification processes otherwise required for some high-end hardware are

eliminated. Google engineers plug in resources and let the Googleplex handle the other

tasks.

• Google innovation uses open source code as a starting point. Many of Google’s most

striking technical advances are based on modifying open source software to benefit from

insights gained from experimental results in supercomputing. Google does not have to

work around known bottlenecks in some commercial operating systems. Unlike

Microsoft, Google did not write a complete operating system for its Googleplex. Google

made key changes to Linux, adding necessary services and functions to meet the specific

requirements of Google applications. Google’s approach is pragmatic and less time-





16.Some Google programmers have complained about the peer pressure to perform. Google

management faces a challenge in managing its programming talent. Staff burn out or defections

could impair Google’s technical resources.














consuming than Microsoft’s “death march” to get Longhorn shipped by late 2006.

Compared with Yahoo, Google’s approach is more cohesive. Yahoo faces integration

drudgery as a result of its multiple systems and heterogeneous hardware and data.

Google has used Linux, standards, and open source software for virtually all of its core

services and thus spends less time pounding disparate systems and data into a standard

type.17

• Google does not spend money for high-performance devices to make its system perform

faster.

To illustrate the financial payoff from the use of commodity hardware, Google engineers

revealed a back-of-the-envelope calculation. Although dated, it underscores the economies of

the Google approach:18

The cost advantages of using inexpensive, PC-based clusters over high-end

multiprocessor servers can be quite substantial, at least for a highly parallelisable

application like ours. For example, a $278,000 rack contains 176 2-GHz Xeon CPUs,

176 Gbytes of RAM, and 7 Tbytes of disk space. In comparison, a typical x86-based

server contains eight 2-GHz Xeon CPUs, 64 Gbytes of RAM, and 8 Tbytes of disk

space; it costs about $758,000. In other words, the multi-processor server is

about three times more expensive but has 22 times fewer CPUs, three times less

RAM, and slightly more disk space. Much of the cost difference derives from the

much higher interconnect bandwidth and reliability of a high-end server, but again,

Google’s highly redundant architecture does not rely on either of these attributes.

[Emphasis added]

This means that when Microsoft or Yahoo! spends US$3.00 for better performance, Google

spends less than US$1.00.19 Over time, competitors such as IBM, Microsoft or Yahoo may

implement similar features into their network-centric services. Until then, Google has a cost

advantage, at least with regard to scaling online operations. If these 2002 data can be

accepted, Google spends about one-third as much for more computing horsepower and nearly as much disk space as

companies spend using a traditional server architecture.
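The ratios in the quoted back-of-the-envelope calculation can be checked directly. The sketch below uses only the figures given in the quotation. The cost-per-CPU line is an extra derived number that the quotation does not state, and the variable names are chosen for illustration.

```python
# Figures from the quoted passage.
rack = {"cost_usd": 278_000, "cpus": 176, "ram_gb": 176, "disk_tb": 7}
server = {"cost_usd": 758_000, "cpus": 8, "ram_gb": 64, "disk_tb": 8}

cost_ratio = server["cost_usd"] / rack["cost_usd"]   # "about three times more expensive"
cpu_ratio = rack["cpus"] / server["cpus"]            # "22 times fewer CPUs"
ram_ratio = rack["ram_gb"] / server["ram_gb"]        # "three times less RAM"

# Derived here, not stated in the quotation: dollars per CPU.
usd_per_cpu_rack = rack["cost_usd"] / rack["cpus"]
usd_per_cpu_server = server["cost_usd"] / server["cpus"]

print(f"cost ratio: {cost_ratio:.2f}")                 # 2.73
print(f"CPU ratio:  {cpu_ratio:.0f}")                  # 22
print(f"RAM ratio:  {ram_ratio:.2f}")                  # 2.75
print(f"rack:   ${usd_per_cpu_rack:,.0f} per CPU")     # $1,580 per CPU
print(f"server: ${usd_per_cpu_server:,.0f} per CPU")   # $94,750 per CPU
```

The rounded ratios match the quotation. Measured per CPU, the gap is far wider than the headline “three times” suggests, at roughly 60 to 1 on these figures.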



Snapshots of Google Technology

Google engineers generate a large volume of technical information. Some of the data are in

the form of patents, often written in a style that communicates little of the patent’s substance

to a lay reader. The link for Google’s publications can shift unexpectedly.20 Exploring





17.Google does not explicitly state that it has embraced a services oriented architecture or SOA.

However, many of Google’s practices illustrate an informed use of certain features of SOA.

18.Luiz André Barroso, Jeffrey Dean, and Urs Hölzle, “Web Search for a Planet: The Google

Cluster Architecture”, IEEE Computer Society 0272-1732/03, March-April 2003.

19.A review of Google’s cost estimates for this monograph revealed that Google is understating

its cost advantage by one or two orders of magnitude. As the performance of commodity

hardware goes up, the cost of that hardware goes down. Bulk purchasing chops as much as 50

percent off the cost of some hardware. Google can replicate its data and give away free

gigabytes of email storage. The cost to Google can be as low as a few cents a gigabyte.

20.See http://labs.google.com/papers.html#compilers on June 1, 2005.
















biographies of Google executives and Google Web logs can yield some useful technical

information. For example, one Google biography linked to more than 36 personal projects,

including one by Google’s CEO.21 Surprisingly, Google’s search engine does a hit-and-miss

job of indexing Google’s own technical information.

Useful engineering information appears on the Google Web site. The topics covered in various

monographs, white papers and technical notes concern a wide range of subjects. For example,

in mid-2005, papers were available on such topics as algorithms, compiler optimization,

information retrieval, artificial intelligence, file system design, data mining, genetic

algorithms, software engineering and design, and operating systems and distributed systems,

among others. Google explains its use of very large files as well as how the Google-modified

version of Linux automatically allocates work and avoids the file system bottlenecks that can

plague Solaris and Windows Advanced Server 2003, among others.

Google’s technical papers and Google patents provide some insight into areas of interest at

Google. For example, Google is posting more information about operating systems and

applications. The thrust of Google’s innovation is to build out the search platform and expand

the functionality of its backoffice programs such as those used for advertising services.

The annex to this monograph provides information about more than 60 patents for which

Google is believed to be the assignee. To provide a more fine-grained look at Google

technology, the table below identifies selected examples of innovations documented by

Google engineers or researchers close to the company. Most of these papers appeared prior to

Google’s receiving a patent for the technology referenced in these reports:



Technology: Google Suggest
Purpose: Helps users find needed information by analysing queries and suggesting other queries.
To Learn More: Services Computing, 2004 IEEE International Conference on (SCC'04), by Stephen Davies, Serdar Badem, Michael D. Williams, Roger King, September 2004.

Technology: Video Object Search
Purpose: User types an object name and Google finds that object in a video.
To Learn More: Ninth IEEE International Conference on Computer Vision, Volume 2, Josef Sivic, Andrew Zisserman, October 2003.

Technology: MapReduce
Purpose: New functions in Google Linux to speed programming and other processes involving large data sets.
To Learn More: OSDI Proceedings, December 2004.

Technology: Google File System
Purpose: Extension to Google Linux to allow high-speed data reads and writes from commodity drives.
To Learn More: ACM Publication 1-58113-757-5/03/0010.









21.This is the lex project that “helps write programs whose control flow is directed by instances

of regular expressions in the input stream. It is well suited for editor-script type transformations

and for segmenting input in preparation for a parsing routine.”














Technology: Identify Authoritative or High-Value Sources in Web Content
Purpose: Uses pattern mining in order to generate a numeric value to indicate an authoritative source as an indication of content quality.
To Learn More: Seventh International Database Engineering and Applications Symposium (IDEAS'03), Haofeng Zhou, Yubo Lou, Qingqing Yuan, Wilfred Ng, Wei Wang, Baile Shi, July 2003.

Technology: MetaCrystal
Purpose: Metasearch technology to allow a single query to retrieve and organize results in a visual display.
To Learn More: Second International Conference on Coordinated & Multiple Views in Exploratory Visualization (CMV'04), Anselm Spoerri, July 2004.





Drawbacks of the Googleplex

The coaching mantra, “No gain without pain,” is true for Google. Google does make mistakes,

and some big ones. The example fresh in news headlines is Web Accelerator. The product was

introduced in May 2005 and withdrawn less than six weeks later. Speed and nimbleness aside,

Web Accelerator was technology that ran head-on into “issues.” Of greater consequence are the

periodic slowdowns for Gmail. The Googleplex is scalable, but until more servers are online,

users may face annoying delays.



Going Too Fast: The Google Web Accelerator

The Web Accelerator software was supposed to use Google servers to store Web pages a user

viewed. Web Accelerator parsed a page in the user’s browser. The Web Accelerator function

then followed each link on that specific page. The page was then stored in a Google cache.

When the user clicked on a link, the user would see the page from the Google cache, thus

reducing the time required to display the page to the user.

Web Accelerator worked fine on such sites as www.whitehouse.gov, which makes minimal

use of advanced Web services. Unfortunately, the Web Accelerator function followed links

that transmitted instructions to Web applications. For example, Web Accelerator would click

on “delete” links, causing some Web applications such as Backpack to remove the user’s

preferences or content.22 Web Accelerator blithely ignored confirmations generated by

JavaScript so that unintentional instructions were transmitted. Some Google watchers raised

questions about caching data as well as privacy and copyright issues. Before these concerns

reached a crescendo, Google reported that Web Accelerator had reached its capacity. Google

blocked downloads for the product.
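The failure mode described above is easy to reproduce in miniature. The sketch below simulates a Web application whose “delete” action is an ordinary link, then runs a naive prefetcher over the page. It is a toy model of the behaviour reported in the text and does not reproduce the actual Web Accelerator code. The class `ToyWebApp`, both prefetch functions and the URL layout are hypothetical, and the link filter is only one possible mitigation assumed for illustration.

```python
from typing import Dict, List


class ToyWebApp:
    """Hypothetical Web application that deletes data when a plain link is visited."""

    def __init__(self) -> None:
        self.notes: Dict[int, str] = {1: "groceries", 2: "travel plans"}

    def links_on_page(self) -> List[str]:
        links = []
        for note_id in self.notes:
            links.append(f"/notes/{note_id}")
            links.append(f"/notes/{note_id}/delete")
        return links

    def get(self, path: str) -> str:
        parts = path.strip("/").split("/")
        note_id = int(parts[1])
        if len(parts) == 3 and parts[2] == "delete":
            # Side effect triggered merely by following a link. In a browser,
            # a JavaScript confirmation would guard this step; a prefetcher
            # never sees that prompt.
            self.notes.pop(note_id, None)
            return "deleted"
        return self.notes.get(note_id, "not found")


def prefetch_all(app: ToyWebApp) -> Dict[str, str]:
    """Naive prefetcher: follow every link on the page and cache the response."""
    return {path: app.get(path) for path in app.links_on_page()}


def prefetch_safe(app: ToyWebApp) -> Dict[str, str]:
    """One possible mitigation (an assumption): skip links that look like actions."""
    return {p: app.get(p) for p in app.links_on_page() if not p.endswith("/delete")}


app = ToyWebApp()
prefetch_safe(app)
print(len(app.notes))   # 2
prefetch_all(app)
print(len(app.notes))   # 0
```

The underlying issue is that the HTTP specification treats GET requests as safe, meaning they should not change state on the server. A prefetcher relies on that convention, and an application that puts destructive actions behind ordinary links breaks it.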



The Laws of Physics: Heat and Power 101

Google does not reveal the number of servers it uses, but the number is believed to be in the

150,000 to 170,000 range as of June 30, 2005. Conflicting information surfaces in Web logs

and in talks at conferences. In reality, no one knows. Google has a rapidly expanding number

of data centers. The data center near Atlanta, Georgia, is one of the newest deployed. This



22.Backpack is a Web application that sends a user the contents of any page as email. See

www.backpackit.com.
















state-of-the-art facility reflects what Google engineers have learned about heat and power

issues in its other data centers. Within the last 12 months, Google has shifted from

concentrating its servers at about a dozen data centers, each with 10,000 or more servers, to

about 60 data centers, each with fewer machines.23 The change is a response to the heat and

power issues associated with larger concentrations of Google servers.

The most failure prone components are:

• Fans.

• IDE drives, which fail at the rate of one per 1,000 drives per day.

• Power supplies, which fail at a lower rate.

Repairs are batch operations. Scheduling the fixes is a major job and work is underway to

improve the Google-developed scheduling capability. Google has to locate hosting facilities

that can meet the company’s heat and power requirements.
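The failure rate quoted above explains why repairs are handled in batches. The short calculation below applies the quoted rate to the server estimate given earlier in this section. It assumes one IDE drive per server, which the text does not state, so the results should be read as rough orders of magnitude only.

```python
DRIVES_PER_DAILY_FAILURE = 1_000   # "one per 1,000 drives per day", per the text


def expected_daily_drive_failures(drives: int) -> float:
    """Expected number of drive failures per day for a fleet of the given size."""
    return drives / DRIVES_PER_DAILY_FAILURE


# Assumption: one drive per server, using the 150,000 to 170,000 server estimate.
low = expected_daily_drive_failures(150_000)
high = expected_daily_drive_failures(170_000)
print(low, high)   # 150.0 170.0
```

On these assumptions a technician would be swapping well over a hundred drives every day, which makes scheduling the work a significant job in its own right.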



Other Data Center Issues

Google data centers have access to multiple high-speed lines

and normal data center functions such as redundant power,

traffic routing and strict rules governing access to the physical

boxes.

PRWeaver’s Web log contained a posting of a photograph

allegedly taken inside a Google data center. If true, the physical

layout of the racks holding an estimated 2,000 or more servers

squeezes a large amount of hardware in a tightly-packed space.

This type of dense configuration helps explain the comments about Google’s heat and power

concerns. Most data centers were not designed to handle dense concentrations of thousands of

servers. Heat contributes to hard drive failures. On the plus side, the dense configuration

makes set up and maintenance somewhat easier. Google packs servers on two sides of a rack.

A unique property of the data centers is that replicated content can be written from one data

center to another. Google data within the data center are replicated on other servers and other

clusters running in the racks.

The Google “plug and play” engineering philosophy appears to be used in and across data

centers. If a data center, such as the one shown above, needs additional index server capacity,

the technicians in that center can build a Google rack of 40 pizza box servers. These servers

are connected to the network. When the rack is powered up, it becomes available to the master

servers for that data center. These master servers then mark the rack’s resources as available.

Master servers then begin sending work to the new devices. The information about data







23.These data appear at www.mcdar.net/SEOTools.htm














centers indicates that this “plug and play” concept and automatic discovery of new resources

applies to new data centers, not just the racks within them.

It may be an exaggeration to say that a Google rack and the data center in which the rack resides

work like a USB mouse. Still, the general concept seems to be what Google engineers have tried to

achieve. By eliminating such tasks as certifying and configuring Small Computer System

Interface RAID storage devices, Google is content to let the auto-discovery functionality alert a

“master server” to a new resource, master servers to alert other master servers, masters to

notify clients of tasks, and data centers to pass information that racks, clusters or a new data

center are available for use.
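The chain of notifications described above, in which a new rack announces itself, a master records it, and masters tell one another, can be sketched as follows. This is a simplified illustration and not Google's protocol. The class `MasterServer`, its methods and the rack names are hypothetical, and real systems would also need heartbeats, failure detection and a shared view of capacity that this toy omits.

```python
from typing import Dict, List


class MasterServer:
    """Toy master that tracks racks and hands out work; all names are hypothetical."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available: Dict[str, int] = {}   # rack name -> free server count
        self.peers: List["MasterServer"] = []

    def announce(self, rack: str, servers: int) -> None:
        # Called when a newly powered-up rack appears on the network.
        self.learn(rack, servers)
        for peer in self.peers:               # masters alert other masters
            peer.learn(rack, servers)

    def learn(self, rack: str, servers: int) -> None:
        self.available[rack] = servers

    def assign(self, tasks: int) -> Dict[str, int]:
        # Send work to racks with spare capacity, most free servers first.
        plan: Dict[str, int] = {}
        for rack in sorted(self.available, key=self.available.get, reverse=True):
            if tasks == 0:
                break
            take = min(tasks, self.available[rack])
            if take:
                plan[rack] = take
                self.available[rack] -= take
                tasks -= take
        return plan


m1 = MasterServer("master-1")
m2 = MasterServer("master-2")
m1.peers.append(m2)

m1.announce("rack-7", 40)          # technician powers up a 40-server rack
print(m2.available)                # {'rack-7': 40}
print(m1.assign(25))               # {'rack-7': 25}
```

No configuration file is edited and no certification step runs. The rack becomes usable as soon as a master has heard of it.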

A Google engineer said, “Wherever we put a cluster, we have heat, cooling and power

issues. When we put in a data center, that data center operator faces new challenges. We use

each day four megawatts of electric power.”

The problems include:

1 Heat. Special racks with fans that cool the core of the rack are used.

2 Power. The power demand at load is greater than data centers typically sustain. “Our

cages are custom built and there’s a lot of work done by us and the data center people

before we can flip the switch,” said Jeff Dean, a senior Google engineer.

3 Network management tools. Google has had to create network management tools to

manage its self-healing, automatic failover operating system.



What’s Up, Sergey?

The Google data centers are concentrated in North America with other data centers located in

Switzerland, the Pacific Rim, and Beijing.24

Because the GOS is self-healing, the operating system and the various “master computers” in a

cluster know what device is online and what device is dead. Off-the-shelf network

management tools are not tailored to Google’s requirements. Therefore, Google is developing

network management and monitoring tools so that the information in the Google operating

system log files can be displayed in a meaningful way to Google network engineers.

The overall Googleplex works and continues working even if a device, rack or data center

goes dark or dies. Network management tools have to provide a broad range of monitoring

and support functions for the global network, devices, data flows, work loads and potential

problem areas. Google is developing needed network management tools specifically for the

Googleplex.









24.The Beijing data center was purpose built to conform to the ruling body’s requirements for

online access, monitoring and related issues. Google complied in order to do business in China.

Yahoo! bought 3721.com in order to accelerate its effort in China.
















Unanticipated Faults Could Derail Google’s Juggernaut

Google’s network uses a number of concepts from the fringes of computer innovation as well

as its hands-on knowledge gained from the Googleplex itself. The result is a highly

resilient network that may breed problems not previously encountered. Although Google has

operated for more than five years without downtime from system failure, the possibility –

however remote – does exist that something unanticipated could occur. A sufficiently large

problem could deal Google a severe blow. The advanced technology of Google’s MapReduce

tool and its 400 module library could pose as yet unforeseen technical problems.









The diagram shows how Google’s approach eliminates the bottleneck in parallelized systems

produced by excessive message traffic flowing through a server coordinating work among different

computers. This is a diagram produced by Google engineers.







Summary of Google’s Drawbacks

Critics of Google can point to three “problems” with Google’s approach to performance.

First, Google is a one-trick pony. The changes to Linux and the other technical modifications

are little more than hackers’ attempts to squeeze a small performance gain.

Second, Google’s use of commodity hardware and cheap storage is a risky solution. Unknown

problems may lurk when cheap components are used in a mission-critical system. Increasing

the potential risk are the changes Google makes to speed up program execution.


















Finally, other operating systems – including those from computer research laboratories and

even Microsoft – do the same things and have for years.



Leveraging the Googleplex

Google has demonstrated that search is just one application that can run in the Google

environment. There are many other applications that can benefit from Google’s approach to

online services.

1 Applications that require a high performance payoff for a low cost such as electronic

mail.

2 An application that can run in Google’s redundant environment where there is no

private-state replication such as found in IBM’s AS/400 operating environment and

others.

3 Computationally-intensive, stateless applications.

4 Applications that require request-level parallelism, a characteristic exploitable by

running individual requests on separate servers such as Google Earth.

There is little to be gained by trotting out war-horses to trample Google. The user experience

speaks for itself. Google’s approach to massively-parallel distributed computing works, even

on dial-up networks.

Google fused the type of thinking associated with small, cash-strapped companies with

techniques from advanced computer systems. Commodity products keep costs down. A

modified Linux delivers fast performance at a bargain basement cost. Google is taking a

strategic risk with commodity hardware and a souped up version of Linux. Each day Google

bets that its technologists can keep the system humming.

Another reason why Google’s approach to technology is paying off is that Google employs

the same pragmatism and cleverness in application development. Google uses standard

engineering practices, proprietary knowledge, and off-the-shelf techniques such as its use of

Web services. Google uses the same Web programming techniques that millions of Web

developers use. The payoff is that it is easy for Google to hire people who can code for the

Googleplex. Google so far has not had to spend money for developer marketing programs or

train new hires to work in the Googleplex.

The biggest boost to Google’s technical approach is that its competitors are following

different, more expensive approaches. Yahoo is a fruit cake of hardware, operating systems,

and applications coded at different times in different languages by different people. Microsoft

uses its own operating systems but relies on other operating systems as well, including Solaris.

Microsoft must invest in hardware to squeeze performance out of its platforms. Yahoo

wrestles with its many different platforms. Microsoft seems powerless to enhance the speed of

its operating system. Both are digital ostriches burying their heads in their own marketing

material.
















Google’s technology is one major challenge to Microsoft and Yahoo. So to conclude this

cursory and vastly simplified look at Google technology, consider these items:

1 Google is fast anywhere in the world.

2 Google learns. When the heat and power problems at dense data centers surfaced,

Google introduced cooling and power conservation innovations to its two dozen data

centers.

3 Programmers want to work at Google. “Google has cachet,” said one recent University

of Washington graduate.

4 Google’s operating and scaling costs are lower than those of most other firms offering

similar services.

5 Google squeezes more work out of programmers and engineers by design.

6 Google does not break down, or at least it has not gone offline since 2000.

7 Google’s Googleplex can deliver desktop-server applications now.

8 Google’s applications install and update without burdening the user with gory details

and messy crashes.

9 Google’s patents provide basic technology insight pertinent to Google’s core

functionality.

A young programmer in Osaka or Beijing is very likely to have been influenced by Google.

The skilled programmers want to work at Google, develop for the Googleplex, and, if possible,

create their own Google killer. The mantra is, “Be like Sergey and Larry”.

Google has a next-generation computing platform. That platform is optimised to deliver

virtual applications to its users worldwide. Google uses standard Web technologies in clever

ways. Although the technical challenges facing Google are formidable, the company has

advanced the art of online computing.








