Google has unveiled two new web crawlers designed specifically for fetching image and video content. The announcement sheds light on Google’s ongoing research and development efforts, particularly in enhancing image and video search capabilities.
There are two new GoogleOther crawlers:
- GoogleOther-Image
- GoogleOther-Video
Both crawlers are tailored to fetch binary data, that is, content beyond conventional text-based web pages. Their primary purpose is research and development, and they are distinct from Google-Extended, the robots.txt token that publishers use to control whether their content is used for AI training.
The original GoogleOther crawler, launched in April 2023, was primarily utilized by Google product teams for one-off crawls, aimed at gathering publicly accessible content for internal research purposes. The introduction of its image and video variants signifies Google’s heightened focus on multimedia content within its search ecosystem.
As the names suggest, GoogleOther-Image fetches image content and GoogleOther-Video fetches video content. The two operate independently but belong to the broader GoogleOther family of crawlers, each used for a specific purpose.
For website administrators concerned about the impact of these crawlers on their platforms, Google has provided user agent tokens that can be incorporated into the robots.txt file to manage crawling behavior. Publishers retain the ability to block these crawlers from accessing their image and video content if they so choose.
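As a rough illustration of the robots.txt approach described above, the following Python sketch checks a sample rule set with the standard library's `urllib.robotparser`. The rules, the `example.com` URLs, and the `is_allowed` helper are assumptions made for this example. Python's parser is not Google's, so treat the result as an approximation of how the crawlers would interpret the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks both new crawlers site-wide
# and leaves every other user agent unrestricted.
ROBOTS_TXT = """\
User-agent: GoogleOther-Image
Disallow: /

User-agent: GoogleOther-Video
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(token: str, url: str) -> bool:
    """Return True if the sample rules let `token` fetch `url`."""
    return parser.can_fetch(token, url)


# Placeholder URLs used only to exercise the rules.
print(is_allowed("GoogleOther-Image", "https://example.com/photos/cat.jpg"))
print(is_allowed("GoogleOther-Video", "https://example.com/clips/demo.mp4"))
print(is_allowed("Googlebot", "https://example.com/photos/cat.jpg"))
```

With these sample rules, the first two checks are denied and the third is allowed, because only the two GoogleOther variants have a `Disallow: /` group.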
In addition to the introduction of the image and video crawlers, Google has updated the user agent strings for the existing GoogleOther crawler. These updates, which include references to the Chrome browser version, aim to provide clearer identification of Google’s crawling activities and facilitate more effective management for website owners.
Website administrators are advised to familiarize themselves with these new crawlers and user agent strings to accurately identify genuine Google bot activity in their server logs. This knowledge empowers publishers who wish to exercise control over the scraping of their multimedia content for research and development purposes.
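One way to spot these crawlers in server logs is to look for their tokens in the user agent field. The sketch below is a minimal example in Python. The `classify_googleother` helper and the sample strings are hypothetical and are not copied from Google's documentation. User agent strings can be spoofed, so a match is only a hint and does not prove that the request came from Google.

```python
from typing import Optional

# Tokens named in the article, most specific first so that
# "GoogleOther-Image" is not reported as plain "GoogleOther".
GOOGLEOTHER_TOKENS = ("GoogleOther-Image", "GoogleOther-Video", "GoogleOther")


def classify_googleother(user_agent: str) -> Optional[str]:
    """Return the GoogleOther token found in a user agent string, if any."""
    ua = user_agent.lower()
    for token in GOOGLEOTHER_TOKENS:
        if token.lower() in ua:
            return token
    return None


# Hypothetical user agent strings, made up for illustration only.
samples = [
    "GoogleOther-Image/1.0",
    "Mozilla/5.0 (compatible; GoogleOther) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/121.0",
]

for ua in samples:
    print(classify_googleother(ua), "<-", ua)
```

Checking the longer tokens first matters because the plain `GoogleOther` token is a prefix of the other two.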
In summary, Google’s unveiling of specialized image and video crawlers underscores its commitment to enhancing search capabilities for multimedia content. While these crawlers gather data for research and development, website owners retain the ability to control access to their image and video assets, preserving a balance between innovation and content protection.