View Source GoogleApi.ContentWarehouse.V1.Model.LogsProtoIndexingCrawlerIdCrawlerIdProto (google_api_content_warehouse v0.3.0)

Proto-representation of the Crawler-ID in Web-Search (Alexandria-Scope). The string-representation (covered in //indexing/crawler_id/scope/alexandria/crawler_id.h) and the proto-representation are identical in meaning. For more information in regard to the crawler_id, please look at //depot/google3/indexing/crawler_id Used within the following components: - WebMirror: To understand the parsed crawler-ID and apply attributes within their own tables. - Serving : to identify the crawler-ID within the GenericSearchResponse, which implies being stored in the MDU and returned by ascorer to Superroot. - QSessions: To store the crawler-ID in all logged events for analysis. The default values represent the 'empty string' crawler-ID for the Alexandria-scope.

Attributes

  • country (type: String.t, default: nil) - The country to crawl the country from, defaults to the default non-specified crawling node (which is interpreted by most web-servers as USA). When specified, the crawling will fetch the document from a node in that country instead.
  • deviceType (type: String.t, default: nil) - The device type, which maps into the useragent to be set when initiating the fetch-request, e.g. desktop-googlebot vs. smartphone-googlebot.
  • indexGrowthExptType (type: String.t, default: nil) - Specifies whether the document is a duplicated document from the index growth experiment, detailed at go/indexsize_exp, defaults to not in any experiment.
  • language (type: String.t, default: nil) - The language being set by the crawler. Defaults to UNKNOWN_LANGUAGE which indicates to not apply an accept-language header on the FetchRequest. When a language is specified, on crawling this language is converted into an accept-language header (e.g. GERMAN -> "Accept-language: de"). Script variations, e.g. ZH-HANS vs. ZH-HANT, are handled as different enum values (e.g. CHINESE vs. CHINESE_T).
  • languageCode (type: String.t, default: nil) - Language-code used for identifying the locale of the document. 'language' and 'country' above are used for web-based documents, representing the detected language of the document and the country it was crawled from. The language code here, however, rather represents an artifical language_code applied to manually translated webpages (e.g. feeds), for instance for the pidgin-usecase. They are limited to the set of III-codes being supported by the client, yet are beyond the enum in 'language', e.g. to support variants of English across different countries.

Summary

Functions

Unwrap a decoded JSON object into its complex fields.

Types

@type t() ::
  %GoogleApi.ContentWarehouse.V1.Model.LogsProtoIndexingCrawlerIdCrawlerIdProto{
    country: String.t() | nil,
    deviceType: String.t() | nil,
    indexGrowthExptType: String.t() | nil,
    language: String.t() | nil,
    languageCode: String.t() | nil
  }

Functions

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.