GoogleApi.ContentWarehouse.V1.Model.TrawlerTrawlerPrivateFetchReplyData (google_api_content_warehouse v0.4.0)

This is an optional container of arbitrary data that can be added to a FetchReplyData. This data is meant to be logged, but not sent back in a fetch reply (it should be added after the reply is prepared). Use FetchResponsePreparatorImpl::AddTrawlerPrivateDataToFetchReplyData to add. See also the comment in fetch_response_preparator_impl.cc. Next Tag: 49

Attributes

  • PostDataSize (type: String.t, default: nil) - What's the post data size (in bytes) if it's a post request.
  • numDroppedReplies (type: String.t, default: nil) - Number of times we drop the content of a stream reply or the final reply, which can only be caused by REJECTED_NO_RPC_BUFFERS now.
  • HintIPAddress (type: String.t, default: nil) - If we do not have Endpoints in FetchReplyData (e.g., url rejected due to hostload limit), do we have a guess of the server IPAddress (e.g., from robots fetch)? This helps us classify URLs based on country code, etc. The field is filled with IPAddress::ToPackedString().
  • RpcStartDeadlineLeftMs (type: integer(), default: nil) - RPC deadline left at the start of the URL control flow. Useful for debugging RPC deadline-exceeded errors received by clients. This field is only recorded if RpcEndDeadlineLeftMs is small enough.
  • largeStoreHitLocation (type: String.t, default: nil) - Set to the hit location (CNS filename) if cache comes from large store.
  • isDedicatedHostload (type: boolean(), default: nil) -
  • dependentFetchType (type: String.t, default: nil) - Dependent fetch type
  • isVpcTraffic (type: boolean(), default: nil) - Set if the fetch goes through the virtual private cloud path so we can track the VPC traffic.
  • httpVersion (type: String.t, default: nil) - Stores the HTTP version we used in the last hop.
  • BotGroupName (type: String.t, default: nil) - If we fetched using BotFetchAgent, what is the BotGroupName?
  • isBidiStreamingFetch (type: boolean(), default: nil) - Whether this is a bidirectional streaming fetch.
  • authenticationInfo (type: String.t, default: nil) - Stores the OAuth authentication method.
  • RequestUserName (type: String.t, default: nil) - Log the loas username in trawler private to help with debugging. Store the username in trawler private so clients won't see it from FetchReply. To reduce disk usage, we only log the loas username if the requestorid being used does not have ClientUsernameRestrictions.
  • cacheHitType (type: String.t, default: nil) - Only set if the fetch uses cache content (is_cache_fetch is true).
  • originalClientParams (type: GoogleApi.ContentWarehouse.V1.Model.TrawlerOriginalClientParams.t, default: nil) - Store the original client information.
  • IsRobotsFetch (type: boolean(), default: nil) - Was this an internally-initiated robots.txt fetch?
  • resourceBucket (type: String.t, default: nil) - If the requestor shares resource bucket with other requestorids, we will store the resource bucket name in these fields.
  • cacheAcceptableAge (type: integer(), default: nil) - Corresponds to AcceptableAge field in FetchParams.
  • Producer (type: String.t, default: nil) - Note TrawlerPrivateFetchReplyData is never sent back to clients. The following field is just for Trawler and Multiverse internal tracking, and clients should not look at this field at all.
  • ProxyInstance (type: String.t, default: nil) - If set, this fetch was done through a proxy (e.g., fetchproxy).
  • cdnProvider (type: String.t, default: nil) -
  • concurrentStreamNum (type: String.t, default: nil) - How many concurrent streams are on the connection when the request finishes (including this request). Export this value to monitor the stream multiplexing for HTTP/2.
  • cacheAcceptableAfterDate (type: integer(), default: nil) - Corresponds to AcceptableAfterDate field in FetchParams.
  • credentialId (type: String.t, default: nil) - Log the credential id
  • ResponseBytes (type: String.t, default: nil) - The number of bytes we sent back to the client.
  • downloadFileName (type: String.t, default: nil) - If the response contains the Content-Disposition header 'attachment; filename="google.zip"', the download_file_name would be "google.zip".
  • isFloonetFetch (type: boolean(), default: nil) - Whether or not this is a Floonet fetch request. Floonet requests have inherently lower availability (due to HOPE rejections when HOPE is in degraded mode, and other Floonet-specific reasons). Therefore, it is important for debugging and for our availability SLO to know whether or not it is a Floonet fetch. IMPORTANT NOTE: This field is only currently set for traffic that explicitly requires Floonet and cannot fail over to use Googlebot (i.e. "transparent" or "implicit" Floonet fetches).
  • multiverseClientIdentifier (type: GoogleApi.ContentWarehouse.V1.Model.TrawlerMultiverseClientIdentifier.t, default: nil) - Multiverse client information
  • TrawlerInstance (type: String.t, default: nil) - Which Trawler cell was this response fetched in? (e.g. "HR" or "YQ")
  • HSTSHeaderValue (type: String.t, default: nil) - HTTP Strict-Transport-Security (RFC6797) header value. We log this so we can generate a list of hosts that prefer HTTPS over HTTP.
  • tier (type: String.t, default: nil) - Service tier info, used in the traffic grapher for plotting per-tier graphs.
  • Is5xxHostId (type: boolean(), default: nil) - Indicates whether the HostId belongs to the HostId set in 5xx URL patterns. It can work as a tag when emitting the requestor minute summary, which helps us aggregate traffic affected by 5xx patterns and test whether there are any fetching changes.
  • UserAgentSent (type: String.t, default: nil) - The useragent string sent to the remote webserver. It corresponds to UserAgentToSend field in FetchParams.
  • googleExtendedObeyWildcardRobotsStatus (type: integer(), default: nil) - Whether Google-Extended is allowed to crawl this URL, with wildcard rules obeyed. This is for internal analysis. See RobotsTxtClient::RobotsStatus for the meaning of the number.
  • RobotsBody (type: String.t, default: nil) - If this was a robots.txt fetch (IsRobotsFetch above), this may contain the robots.txt body. (It may not; for instance, 404s are omitted. Current policy is URL_CRAWLED + partially crawled.) This includes HTTP headers + body.
  • UserAgentSentFp (type: String.t, default: nil) - The fp2011 of the useragent sent to the remote webserver. Note that it corresponds to the UserAgentToSend field in FetchParams.
  • prodRegion (type: String.t, default: nil) - Log the prod region (only for regional harpoon requestor ids)
  • RpcEndDeadlineLeftMs (type: integer(), default: nil) - RPC deadline left at the end of the URL control flow. Useful for debugging RPC deadline-exceeded errors received by clients. This field is only recorded if it is small enough.
  • isFromGrpcProxy (type: boolean(), default: nil) - Whether or not this response is sent from gRPC proxy service.
  • ServerSignature (type: String.t, default: nil) - An arbitrary string signature identifying the remote server type/version. In the case of HTTP, this would be the contents of the "Server:" header.
  • googleExtendedRobotsStatus (type: integer(), default: nil) - Whether Google-Extended is allowed to crawl this URL, with wildcard rules not obeyed. This is for internal analysis. See RobotsTxtClient::RobotsStatus for the meaning of the number.
  • BotHostname (type: String.t, default: nil) - This is the HOPE server that we sent the url to. We log the HOPE backend cell and hope server shard number (e.g., 'qf:6'). This allows us to understand how we are balancing our load to the HOPE servers.
  • subResourceBucket (type: String.t, default: nil) -
  • vpcDestination (type: GoogleApi.ContentWarehouse.V1.Model.TrawlerLoggedVPCDestination.t, default: nil) - VPC information that is only set if is_vpc_traffic is true.
  • bypassedHostOverfull (type: boolean(), default: nil) - Cache hit for this url, bypassed host_overfull error.
  • CacheRequestorID (type: String.t, default: nil) - Present if the reply is from the trawler cache. This is the requestorid of the trawler client that populated the cache with the data we are reusing.
  • HadInMemCacheHit (type: boolean(), default: nil) -
  • FetcherTaskNumber (type: integer(), default: nil) - Which Trawler fetcher task fetched this URL.
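
Several count-like attributes above (PostDataSize, ResponseBytes, numDroppedReplies, concurrentStreamNum) are typed String.t rather than integer(). A plausible reason, stated here as an assumption rather than something this page confirms, is that 64-bit integers are carried as JSON strings. The sketch below shows one way a caller might convert them to integers. The module name PrivateDataCounts and the function to_integers/1 are hypothetical and are not part of google_api_content_warehouse; a plain map stands in for the model struct so the example runs without the package.

```elixir
defmodule PrivateDataCounts do
  # Hypothetical helper, not part of google_api_content_warehouse.
  # Field names are taken from the attribute list above.
  @count_fields [:PostDataSize, :ResponseBytes, :numDroppedReplies, :concurrentStreamNum]

  # Accepts any map (or struct) and returns a map of the count fields as
  # integers. Missing, nil, or non-numeric values become nil.
  def to_integers(data) when is_map(data) do
    Map.new(@count_fields, fn field ->
      {field, parse(Map.get(data, field))}
    end)
  end

  defp parse(nil), do: nil

  defp parse(value) when is_binary(value) do
    case Integer.parse(value) do
      # Only accept strings that are entirely an integer.
      {int, ""} -> int
      _ -> nil
    end
  end
end

# A plain map standing in for a decoded TrawlerTrawlerPrivateFetchReplyData.
sample = %{PostDataSize: "1024", ResponseBytes: "52310", numDroppedReplies: nil}
IO.inspect(PrivateDataCounts.to_integers(sample))
```

Map.get/2 is used instead of dot access because field names that start with an uppercase letter, such as PostDataSize, cannot be read with the `data.field` syntax, and Map.get/2 also works on structs.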

Types

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.TrawlerTrawlerPrivateFetchReplyData{
  BotGroupName: String.t() | nil,
  BotHostname: String.t() | nil,
  CacheRequestorID: String.t() | nil,
  FetcherTaskNumber: integer() | nil,
  HSTSHeaderValue: String.t() | nil,
  HadInMemCacheHit: boolean() | nil,
  HintIPAddress: String.t() | nil,
  Is5xxHostId: boolean() | nil,
  IsRobotsFetch: boolean() | nil,
  PostDataSize: String.t() | nil,
  Producer: String.t() | nil,
  ProxyInstance: String.t() | nil,
  RequestUserName: String.t() | nil,
  ResponseBytes: String.t() | nil,
  RobotsBody: String.t() | nil,
  RpcEndDeadlineLeftMs: integer() | nil,
  RpcStartDeadlineLeftMs: integer() | nil,
  ServerSignature: String.t() | nil,
  TrawlerInstance: String.t() | nil,
  UserAgentSent: String.t() | nil,
  UserAgentSentFp: String.t() | nil,
  authenticationInfo: String.t() | nil,
  bypassedHostOverfull: boolean() | nil,
  cacheAcceptableAfterDate: integer() | nil,
  cacheAcceptableAge: integer() | nil,
  cacheHitType: String.t() | nil,
  cdnProvider: String.t() | nil,
  concurrentStreamNum: String.t() | nil,
  credentialId: String.t() | nil,
  dependentFetchType: String.t() | nil,
  downloadFileName: String.t() | nil,
  googleExtendedObeyWildcardRobotsStatus: integer() | nil,
  googleExtendedRobotsStatus: integer() | nil,
  httpVersion: String.t() | nil,
  isBidiStreamingFetch: boolean() | nil,
  isDedicatedHostload: boolean() | nil,
  isFloonetFetch: boolean() | nil,
  isFromGrpcProxy: boolean() | nil,
  isVpcTraffic: boolean() | nil,
  largeStoreHitLocation: String.t() | nil,
  multiverseClientIdentifier:
    GoogleApi.ContentWarehouse.V1.Model.TrawlerMultiverseClientIdentifier.t()
    | nil,
  numDroppedReplies: String.t() | nil,
  originalClientParams:
    GoogleApi.ContentWarehouse.V1.Model.TrawlerOriginalClientParams.t() | nil,
  prodRegion: String.t() | nil,
  resourceBucket: String.t() | nil,
  subResourceBucket: String.t() | nil,
  tier: String.t() | nil,
  vpcDestination:
    GoogleApi.ContentWarehouse.V1.Model.TrawlerLoggedVPCDestination.t() | nil
}

Functions

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
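
To illustrate what "unwrapping into complex fields" presumably means, here is a minimal self-contained sketch: a struct arrives from a JSON decoder with its nested fields still as plain string-keyed maps, and decode turns them into the corresponding nested structs. Everything under the Sketch namespace is a hypothetical stand-in written for this example, including the invented field exampleField. It is not the library's implementation, which may differ in detail.

```elixir
defmodule Sketch.OriginalClientParams do
  # Hypothetical stand-in for TrawlerOriginalClientParams.
  # The field name below is invented for illustration.
  defstruct [:exampleField]
end

defmodule Sketch.PrivateFetchReplyData do
  # Hypothetical stand-in with one scalar and one complex field.
  defstruct [:TrawlerInstance, :originalClientParams]

  # Same shape as the documented spec: decode(struct(), keyword()) :: struct().
  # Scalars are left alone; the nested map becomes a nested struct.
  def decode(%__MODULE__{} = value, _options) do
    Map.update!(value, :originalClientParams, fn nested ->
      to_struct(nested, Sketch.OriginalClientParams)
    end)
  end

  defp to_struct(nil, _module), do: nil

  defp to_struct(%{} = map, module) do
    # JSON decoders typically produce string keys; copy only known fields.
    known = module |> struct() |> Map.from_struct() |> Map.keys()

    fields =
      for key <- known, Map.has_key?(map, Atom.to_string(key)) do
        {key, Map.get(map, Atom.to_string(key))}
      end

    struct(module, fields)
  end
end

raw =
  struct!(Sketch.PrivateFetchReplyData,
    TrawlerInstance: "HR",
    originalClientParams: %{"exampleField" => "demo"}
  )

decoded = Sketch.PrivateFetchReplyData.decode(raw, [])
IO.inspect(decoded.originalClientParams)
```

In the real model, the complex fields would be originalClientParams, multiverseClientIdentifier, and vpcDestination, since those are the only attributes whose types are other model structs.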