Contextual Advertising
Connecting you to your audience - Learn More

04.24.06


Matt Cutts Teaches Us To Crawl

By David A. Utter

The Google engineer followed up his WebmasterWorld PubCon Boston discussion of Google's Bigdaddy infrastructure update and "crawl cache" with a lengthier look at the topic.

Cutts' latest blog post reviewed Bigdaddy's crawl-caching proxy in greater depth. He even provided helpful charts to illustrate the process.

A webmaster may see numerous fetches from multiple Googlebots, each using some bandwidth while accomplishing its appointed rounds. That makes for a more accurate Google index, but the bandwidth usage has given some webmasters fits.

The proxy used in the Bigdaddy infrastructure works like other proxies. It handles the effort of retrieving pages from websites, and fulfills requests from the various Google crawlers. Rather than each spider hitting a website separately, the spiders hit the cache.
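
As a rough illustration of the general idea (not Google's implementation, whose internals the post does not describe), a shared cache placed in front of a website could be sketched in Python as below. The class name CrawlCache, the fake_origin function, and the service names are all hypothetical and exist only for this example.

```python
from typing import Callable, Dict


class CrawlCache:
    """Hypothetical shared cache sitting between crawlers and websites."""

    def __init__(self, fetch_from_origin: Callable[[str], str]) -> None:
        self._fetch = fetch_from_origin
        self._pages: Dict[str, str] = {}
        self.origin_hits = 0  # how many times the real site was contacted

    def get(self, url: str) -> str:
        # Only the first request for a URL reaches the website;
        # later requests are answered from the stored copy.
        if url not in self._pages:
            self.origin_hits += 1
            self._pages[url] = self._fetch(url)
        return self._pages[url]


def fake_origin(url: str) -> str:
    # Stand-in for a real HTTP request to the website.
    return f"<html>content of {url}</html>"


cache = CrawlCache(fake_origin)
for service in ("websearch", "adsense", "blogsearch"):
    cache.get("http://example.com/")

print(cache.origin_hits)
```

Three crawler requests produce a single fetch from the site. A real proxy would also need to expire stored pages, which this sketch leaves out.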

Cutts summarizes how the crawl caching works in his post (spacing added; we like Matt, but we'd really like him to enjoy the Return key a bit more often):

So the crawl caching proxy works like this: if service X fetches a page, and then later service Y would have fetched the exact same page, Google will sometimes use the page from the caching proxy.

Joining service X (AdSense, blogsearch, News crawl, any Google service that uses a bot) doesn't queue up pages to be included in our main web index. Also, note that robots.txt rules still apply to each crawl service appropriately. If service X was allowed to fetch a page, but a robots.txt file prevents service Y from fetching the page, service Y wouldn't get the page from the caching proxy.

Finally, note that the crawl caching proxy is not the same thing as the cached page that you see when clicking on the "Cached" link by web results. Those cached pages are only updated when a new page is added to our index.

It's more accurate to think of the crawl caching proxy as a system that sits outside of webcrawl, and which can sometimes return pages without putting extra load on external sites.
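
The robots.txt point in Cutts' summary can be made concrete with a small Python sketch. This is only an assumption about how such a check could be arranged, not a description of Google's code: the user-agent names service-x and service-y, the sample robots.txt, and the fetch_for helper are invented for illustration. The robots.txt handling uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: service-x may fetch everything,
# service-y is kept out of /private/.
ROBOTS_TXT = """\
User-agent: service-x
Disallow:

User-agent: service-y
Disallow: /private/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

cached_pages = {}  # url -> page body, shared by every service


def fetch_for(service, url):
    """Return the page for this service, or None if robots.txt forbids it."""
    # The robots.txt check runs per service and comes before the cache lookup,
    # so a copy stored for one service is never handed to a blocked one.
    if not rules.can_fetch(service, url):
        return None
    if url not in cached_pages:
        cached_pages[url] = f"page body for {url}"  # stand-in for a real fetch
    return cached_pages[url]


url = "http://example.com/private/report.html"
print(fetch_for("service-x", url))  # allowed, and the page is now cached
print(fetch_for("service-y", url))  # blocked, even though a cached copy exists
```

Putting the permission check ahead of the cache lookup is what keeps the shared cache from becoming a way around a site's robots.txt rules.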


The proxy's essential goal, reducing bandwidth use, seems to have been met to Google's satisfaction. Cutts wrote that "it was working so smoothly that I didn't know it was live."

About the Author:
David Utter is a staff writer for WebProNews covering technology and business.

About PromoteNews
PromoteNews provides the latest news, tips and concepts in successfully driving traffic to your website. PromoteNews knows that Web Success Begins With Promotion.

PromoteNews is brought to you by:

ActivePro.com EnterpriseWebPro.com
AdvertisingDay.com EntrepreneurNewz.com
CareerNewz.com ERPupdate.com
CRMNewz.com InsideOffice.com
EcommNewz.com InvestNewz.com
NetDummy.com SmallSiteNews.com



 
-- PromoteNews is an iEntry, Inc. publication --
iEntry, Inc. 2549 Richmond Rd. Lexington KY, 40509
2006 iEntry, Inc.  All Rights Reserved  Privacy Policy  Legal
