Googlebot

From Wikipedia, the free encyclopedia
Web crawler used by Google
Original author: Google
Type: Web crawler
Website: Googlebot FAQ

Googlebot is the web crawler software used by Google that collects documents from the web to build a searchable index for the Google Search engine. The name actually refers to two different types of web crawler: a desktop crawler (to simulate a desktop user) and a mobile crawler (to simulate a mobile user).[1]

Behavior

A website will probably be crawled by both Googlebot Desktop and Googlebot Mobile. However, starting in September 2020, all sites were switched to mobile-first indexing, meaning that Google crawls the web using a smartphone Googlebot.[2] The subtype of Googlebot can be identified by looking at the user agent string in the request. However, both crawler types obey the same product token (user agent token) in robots.txt, so a developer cannot selectively target either Googlebot Mobile or Googlebot Desktop using robots.txt.
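As a rough illustration of the two points above, the following Python sketch classifies a request by its user agent string and shows that a single robots.txt group for the "Googlebot" token covers both subtypes. The example user agent strings and the "Mobile" substring test are assumptions for illustration, not an official detection method, and other Google crawlers whose names contain "Googlebot" are not distinguished. The helper names googlebot_subtype and allowed_for_googlebot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Example user agent strings (assumed formats; check Google's current
# documentation before relying on them).
DESKTOP_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile "
    "Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def googlebot_subtype(user_agent):
    """Return 'mobile', 'desktop', or None if the string lacks 'Googlebot'."""
    if "Googlebot" not in user_agent:
        return None
    # Heuristic (assumption): the smartphone crawler advertises a mobile browser.
    return "mobile" if "Mobile" in user_agent else "desktop"


# robots.txt groups are matched on the product token, not the full user
# agent string, so one "Googlebot" group applies to both subtypes.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed_for_googlebot(url):
    """Check a URL against the rules that apply to the 'Googlebot' token."""
    return parser.can_fetch("Googlebot", url)
```

Because the rule lookup takes only the token, there is no separate value to pass for the mobile or desktop crawler; the subtype is visible in server logs but not addressable from robots.txt.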

Google provides various methods that enable website owners to manage the content displayed in Google's search results. If a webmaster chooses to restrict the information on their site available to Googlebot, or another spider, they can do so with the appropriate directives in a robots.txt file,[3] or by adding the meta tag <meta name="Googlebot" content="nofollow" /> to the web page.[4] Googlebot requests to web servers are identifiable by a user-agent string containing "Googlebot" and a host address containing "googlebot.com".[5]
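Since a user agent string is easy to forge, the host address check is the more reliable of the two signals. The sketch below is a minimal example of that check in Python: it resolves the requesting IP address to a host name, tests for the googlebot.com domain, and then resolves the host name back to confirm it maps to the same address. The forward confirmation step is common practice rather than something stated in the text above, and the function name is_verified_googlebot and the injectable lookup parameters are hypothetical, added so the logic can be exercised without network access.

```python
import socket

# Domain suffix taken from the text above; extend the tuple if other
# domains need to be accepted.
TRUSTED_SUFFIXES = (".googlebot.com",)


def _reverse_lookup(ip):
    """Return the primary host name for an IP address (PTR lookup)."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host):
    """Return the list of IPv4 addresses for a host name."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(ip, reverse_lookup=_reverse_lookup,
                          forward_lookup=_forward_lookup):
    """Reverse-resolve ip, check the domain, then forward-confirm the host."""
    try:
        host = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    host = host.rstrip(".").lower()
    if not host.endswith(TRUSTED_SUFFIXES):
        return False
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

A server would typically call is_verified_googlebot(remote_ip) only for requests whose user agent already contains "Googlebot", and cache the result, since each call may trigger two DNS queries.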

Currently, Googlebot follows HREF links and SRC links.[3] There is increasing evidence that Googlebot can execute JavaScript and parse content generated by Ajax calls as well.[6] Opinions differ on how advanced Googlebot's ability to process JavaScript is; one view is that it has only minimal ability derived from custom interpreters.[7] Currently, Googlebot uses a web rendering service (WRS) that is based on the Chromium rendering engine (version 74 as of 7 May 2019).[8] Googlebot discovers pages by harvesting every link on every page that it can find. Unless prohibited by a nofollow tag, it then follows these links to other web pages. New web pages must be linked to from other known pages on the web in order to be crawled and indexed, or be manually submitted by the webmaster.
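The link-harvesting step described above can be sketched with Python's standard HTML parser. This is an illustration of the general idea only, not Googlebot's actual implementation: the class LinkHarvester and function harvest are hypothetical names, and treating rel="nofollow" and a page-level nofollow meta tag as "do not follow" is a simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkHarvester(HTMLParser):
    """Collect absolute URLs from href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.page_nofollow = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # A page-level meta tag such as
        # <meta name="Googlebot" content="nofollow" /> blocks all links.
        if tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = (attrs.get("content") or "").lower()
            if name in ("robots", "googlebot") and "nofollow" in content:
                self.page_nofollow = True
            return
        # Skip individual links marked rel="nofollow".
        if "nofollow" in (attrs.get("rel") or "").lower().split():
            return
        for key in ("href", "src"):
            value = attrs.get(key)
            if value:
                self.links.append(urljoin(self.base_url, value))


def harvest(html, base_url):
    """Return the links a crawler may follow from one page."""
    harvester = LinkHarvester(base_url)
    harvester.feed(html)
    harvester.close()
    return [] if harvester.page_nofollow else harvester.links
```

A crawler would add the returned URLs to its queue of pages to visit, which is why a new page with no inbound links from known pages is not discovered this way.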

A problem that webmasters with low-bandwidth web hosting plans[citation needed] have often noted with Googlebot is that it takes up an enormous amount of bandwidth.[citation needed] This can cause websites to exceed their bandwidth limit and be taken down temporarily. This is especially troublesome for mirror sites, which host many gigabytes of data. Google provides "Search Console", which allows website owners to throttle the crawl rate.[9]

How often Googlebot crawls a site depends on the crawl budget. Crawl budget is an estimate of how frequently a website is updated.[citation needed] Technically, Googlebot's development team (the Crawling and Indexing team) uses several defined terms internally to cover what "crawl budget" stands for.[10] Since May 2019, Googlebot has used the latest Chromium rendering engine, which supports ECMAScript 6 features. This makes the bot more "evergreen" and ensures that it does not rely on a rendering engine that is outdated compared with browser capabilities.[8]

Mediabot

Mediabot is the web crawler that Google uses for analyzing content so that Google AdSense can serve contextually relevant advertising to a web page. Mediabot identifies itself with the user agent string "Mediapartners-Google/2.1".

Unlike other crawlers, Mediabot does not follow links to discover new crawlable URLs; it only visits URLs that include the AdSense code.[11] Where that content resides behind a login, the crawler can be given a login so that it is able to crawl protected content.[12]

Inspection Tool Crawlers

InspectionTool is the crawler used by Search testing tools such as the Rich Result Test and URL inspection in Google Search Console. Apart from the user agent and user agent token, it mimics Googlebot.[13]

A guide to the crawlers was independently published.[14] It details four distinct crawler agents, based on web server directory index data: one non-Chrome and three Chrome crawlers.

References

  1. "Googlebot". Google. 2019-03-11. Retrieved 2019-03-11.
  2. "Announcing mobile first indexing for the whole web". Google Developers. Retrieved 2021-03-17.
  3. "Google Search Console". Google.com.
  4. "Google Search Console". search.google.com. Retrieved 2019-03-11.
  5. "What is Googlebot | Google Search Central | Documentation". May 2022.
  6. "Understand the JavaScript SEO basics | Search for Developers". Google Developers. Retrieved 2020-07-26.
  7. Splitt, Martin (28 February 2019). "How Google Search indexes JavaScript sites - JavaScript SEO". YouTube. Archived from the original on 2021-12-12.
  8. "The new evergreen Googlebot". Official Google Webmaster Central Blog. Retrieved 2019-06-07.
  9. "Google - Webmasters". Retrieved 2012-12-15.
  10. "What Crawl Budget Means for Googlebot". Official Google Webmaster Central Blog. Retrieved 2018-07-04.
  11. "About the AdSense Crawler".
  12. "Display ads on login-protected pages".
  13. "Google Crawler (User Agent) Overview".
  14. "The Ultimate Guide to the New InspectionTool Crawlers".

Retrieved from "https://en.wikipedia.org/w/index.php?title=Googlebot&oldid=1318837283"