The Search Engine Professionals at Rank for $ales.com --- In business since 1997.

Google's hardware requirements in its data centers

March 3, 2005

Urs Hoelzle, Google's V.P. of engineering and operations, offers a rare glimpse into how the company's data centers operate.

Many people consider the company's operations expertise more valuable than the actual search algorithms that launched the enterprise. Hoelzle spoke at EclipseCon, a conference for application programmers that runs through Thursday.

The way Google has been able to build out its computing infrastructure for millions, rather than tens of millions, of dollars is by buying relatively cheap machines. Looking at hardware costs, company engineers saw that purchasing a few high-end servers, with eight or more powerful processors, costs significantly more than dozens of simpler "commodity" servers.

The trick is to make these racks of hardware operate in tandem and to ensure that the failure of one machine does not derail an operation, such as returning a search query or serving up an ad.

Consider a home PC, Hoelzle said. Optimistically, a consumer PC might crash once in three years from a software glitch or hardware problem.

"At Google scale...if you have thousands of PCs, you can expect one (failure) a day," he said. "So you better deal with that in an automated way, or you will have service outages."
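The arithmetic behind that estimate can be checked with a short sketch. It assumes, as a simplification, that every machine fails independently and on average once every three years; the function name is illustrative and not from the talk.

```python
# Assumption: each machine fails independently, on average once every three years.
DAYS_PER_YEAR = 365
MEAN_DAYS_BETWEEN_FAILURES = 3 * DAYS_PER_YEAR  # 1,095 days per machine


def expected_failures_per_day(machine_count: int) -> float:
    """Expected number of machine failures per day across the whole fleet."""
    return machine_count / MEAN_DAYS_BETWEEN_FAILURES


for machines in (1, 1_000, 1_095, 10_000):
    rate = expected_failures_per_day(machines)
    print(f"{machines:>6} machines -> {rate:.2f} failures per day")
```

Under these assumptions, a fleet of roughly 1,100 machines already averages one failure a day, which matches the order of magnitude Hoelzle describes.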

Google, known for its rigorous hiring practices aimed at attracting the brightest minds in computer science, has created a number of software tools to handle its computing installation.

The company wrote its own file system, called Google File System, which is optimized for handling large, 64-megabyte blocks of data. Significantly, the file system was designed on the assumption that a failure, such as a failed disk or an unplugged network cable, can happen at any time.

Data is replicated in three places, and there is a "master" machine that can locate copies of a piece of data, such as a keyword index, if the original is out of commission.
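A toy model can make the role of the "master" concrete. The sketch below is not Google's implementation: the class, the method names, the server names and the round-robin placement rule are all assumptions chosen for illustration. Only the 64-megabyte block size and the three copies come from the description above.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB blocks, as described above
REPLICAS = 3                   # each block is stored in three places


class Master:
    """Hypothetical master that remembers which servers hold each block."""

    def __init__(self, servers):
        if len(servers) < REPLICAS:
            raise ValueError("need at least as many servers as replicas")
        self.servers = list(servers)
        self.alive = set(servers)
        self.locations = {}  # (filename, block index) -> list of servers

    def place(self, filename, file_size):
        """Split a file into blocks and assign each block to three servers."""
        block_count = max(1, (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE)
        for index in range(block_count):
            # Assumption: simple round-robin placement, for illustration only.
            replicas = [
                self.servers[(index + k) % len(self.servers)]
                for k in range(REPLICAS)
            ]
            self.locations[(filename, index)] = replicas

    def mark_failed(self, server):
        """Record that a server is out of commission."""
        self.alive.discard(server)

    def locate(self, filename, offset):
        """Return the live servers holding the block that contains offset."""
        index = offset // CHUNK_SIZE
        return [s for s in self.locations[(filename, index)] if s in self.alive]


master = Master(["s0", "s1", "s2", "s3"])
master.place("keyword.index", 200 * 1024 * 1024)  # 200 MB -> 4 blocks
master.mark_failed("s1")                           # one machine goes down
print(master.locate("keyword.index", 70 * 1024 * 1024))
```

Because every block has three copies, losing one server still leaves two live locations for the block that was requested.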

"You make the software tolerate failures. If you can expect failures, then this is what makes cheap commodity PCs viable for Internet services," Hoelzle said.

Google's PC servers, which number in the thousands, run a stripped-down version of Linux. It is based on the Red Hat distribution but is really just the operating system kernel, modified for Google, he added.

The company has also devised a system for handling massive amounts of data and returning rapid responses to queries. Google splits the Web into millions of pieces, or "shards" in Google tech speak, which are replicated in case of failure.
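How pages are assigned to shards was not described in the talk. As a rough sketch, the example below assumes a hash of the page URL picks the shard and that a query is sent to one live replica of every shard. The function names and server names are hypothetical.

```python
import zlib


def shard_for(url: str, shard_count: int) -> int:
    """Assumption: a stable hash of the URL decides which shard holds a page."""
    return zlib.crc32(url.encode("utf-8")) % shard_count


def pick_replica(replicas, down):
    """Return the first replica of a shard that is not known to be down."""
    for server in replicas:
        if server not in down:
            return server
    raise RuntimeError("all replicas of this shard are down")


# Two shards, each replicated on two hypothetical servers.
shard_replicas = {
    0: ["rack1-a", "rack2-a"],
    1: ["rack1-b", "rack2-b"],
}
down = {"rack1-a"}

# A query has to reach every shard, so pick one live replica per shard.
servers_to_query = [pick_replica(shard_replicas[s], down) for s in sorted(shard_replicas)]
print(servers_to_query)
```

With one server marked as down, the query for shard 0 falls through to its second replica, so the search still covers the whole index.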

Not surprisingly, the company creates an index of words that appear on the Web, which it stores as an array of large files. But it also has document servers, which hold copies of Web pages that Google crawls and downloads.
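An index of words is commonly built as an inverted index: a map from each word to the pages that contain it. The sketch below shows the general idea on two made-up pages. It is a generic illustration, not Google's index format, and the page names and helper functions are invented for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Map every word to the set of pages in which it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Stand-in for the document servers: page name -> stored copy of the page.
pages = {
    "a.example": "cheap commodity servers",
    "b.example": "high-end servers",
}
index = build_index(pages)
print(search(index, "commodity servers"))
```

The index answers which pages match, while the stored copies in `pages` play the part of the document servers that hold the page text itself.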

Another important engineering feat, according to Hoelzle, is that Google has made it straightforward to write programs that run across thousands of servers. Normally, building applications to run in a "parallel" configuration of servers requires specialized tools and skills.

Google's programming tool, called MapReduce, which automates the task of recovering a program in case of a failure, is critical to keeping the company's costs down.

"Cost is really the sum of what the equipment you need to do the work costs and how much programming time you need to put into getting something useful," Hoelzle said, adding that Google has started using MapReduce more widely over the past year.
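The programming model behind MapReduce can be sketched in a few lines. The example below runs on a single machine and simply retries a task that raises an error, which stands in for the automatic recovery described above. The function names and the simulated failure are assumptions made for illustration, not Google's API.

```python
from collections import defaultdict


def run_with_retry(task, *args, max_attempts=3):
    """Re-run a task that fails, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(*args)
        except RuntimeError:
            if attempt == max_attempts:
                raise


def map_reduce(chunks, map_fn, reduce_fn):
    """Apply map_fn to every chunk, group the pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for chunk in chunks:
        for key, value in run_with_retry(map_fn, chunk):
            grouped[key].append(value)
    return {key: run_with_retry(reduce_fn, key, values) for key, values in grouped.items()}


failed_once = set()


def count_words(chunk):
    """Map step. The first attempt on each chunk fails to simulate a machine crash."""
    if chunk not in failed_once:
        failed_once.add(chunk)
        raise RuntimeError("simulated machine failure")
    return [(word, 1) for word in chunk.split()]


def total(word, counts):
    """Reduce step: add up the counts collected for one word."""
    return sum(counts)


result = map_reduce(["cheap servers", "cheap disks"], count_words, total)
print(result)  # {'cheap': 2, 'servers': 1, 'disks': 1}
```

The programmer writes only `count_words` and `total`. The retry logic lives in the framework, which is the kind of saving in programming time that Hoelzle's cost remark points to.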

Finally, Google has created "batch" job scheduling software that acts as a sort of taskmaster for millions of operations. Called the Global Work Queue, it breaks up computing jobs into many smaller tasks and distributes them across machines.
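The talk gave no detail on how the Global Work Queue schedules work, so the following is only a minimal sketch of the idea of splitting a job and handing out the pieces. The round-robin policy, the function names and the machine names are assumptions.

```python
from collections import deque


def split_job(items, task_size):
    """Break one large job into tasks of at most task_size items each."""
    return [items[i:i + task_size] for i in range(0, len(items), task_size)]


def distribute(tasks, machines):
    """Hand out queued tasks to machines in round-robin order."""
    queue = deque(tasks)
    assignments = {machine: [] for machine in machines}
    while queue:
        for machine in machines:
            if not queue:
                break
            assignments[machine].append(queue.popleft())
    return assignments


tasks = split_job(list(range(10)), task_size=4)
print(distribute(tasks, ["m1", "m2"]))
```

A production scheduler would also track machine load and reassign the tasks of a failed machine. This sketch leaves both out to keep the splitting and queueing steps visible.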

For all its built-in redundancy in case of failure, the system doesn't address all problems, Hoelzle revealed. During the presentation, he showed a photo of six fire trucks responding to an emergency at a Google data center in an undisclosed location.

He would not reveal any specific details on the mishap except to say that "it wasn't about one machine going down."

In a follow-up interview with CNET News.com, Hoelzle said the cost of power is another important factor in Google's data center designs.

"The physical cost of operations, excluding people, is directly proportional to power costs," he said. "(Power) becomes a factor in running cheaper operations in a data center. It's not just buying cheaper components but you also have to have an operating expense that makes sense."

Source: CNET News.com

