View Source GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchReplyDataRedirects (google_api_content_warehouse v0.4.0)

The sequence of redirects fetched, if applicable. This includes url plus stats for each hop after the first hop. NOTE: This can be one redirect longer than the chain of redirects followed, in the case where there was a redirect at the end of the chain that the fetcher detected but did not follow.

Attributes

  • BadSSLCertificate (type: String.t, default: nil) - The server SSL certificate chain in SSLCertificateInfo protobuf format. See this field in FetchReplyData (i.e., the initial hop) for more description on when it will be populated.
  • CrawlTimes (type: GoogleApi.ContentWarehouse.V1.Model.TrawlerCrawlTimes.t, default: nil) - Per redirect hop timestamps. This
  • DownloadTime (type: integer(), default: nil) - Download time of this fetch (ms)
  • Endpoints (type: GoogleApi.ContentWarehouse.V1.Model.TrawlerTCPIPInfo.t, default: nil) - ## stats If fetched, ip info.
  • HSTSInfo (type: String.t, default: nil) - This specifies if the url in a redirect was rewritten to HTTPS because of an HSTS policy for the domain. See comments on FetchReplyData.HSTSInfo for how this field's values. A redirect that was rewritten with HSTS will have HSTS_STATUS_REWRITTEN ## here.
  • HTTPResponseCode (type: integer(), default: nil) - The HTTP response code for this hop. We need this since multiple response codes may have the same redirect type (e.g., 302 and 307 are both REDIRECT_TEMPORARILY), but clients may want to know which one was received. Note this is set only for the hops that are followed (i.e., TargetUrl is present). If the last redirect hop was not followed the fetch status will be URL_NOT_FOLLOWED, and the response code will be in the top level ProtocolResponse field.
  • HopPageNoIndexInfo (type: integer(), default: nil) - Extra trawler::PageNoIndexInfo for this hop. Integer: ORed together bits from trawler::PageNoIndexInfo. The information specified by this field comes from the http header or content of the source url, not the "TargetUrl" in this Redirects group.
  • HopReuseInfo (type: String.t, default: nil) - trawler::ReuseInfo with status of IMS/IMF/cache query, for this hop.
  • HopRobotsInfo (type: integer(), default: nil) - Extra trawler::RobotsInfo for this hop. Integer: ORed together bits from trawler::RobotsInfo
  • HostId (type: String.t, default: nil) - If known, the hostid for this hop
  • HttpRequestHeaders (type: String.t, default: nil) - The http headers we sent for fetching this redirect hop. Not normally filled in, unless FetchParams.WantSentHeaders is set.
  • HttpResponseHeaders (type: String.t, default: nil) - The http headers we received from this redirect hop. Trawler does not fill this in; this is intended as a placeholder for crawls like webmirror that fill in and want to track this across redirect hops.
  • RawTargetUrl (type: String.t, default: nil) - bytes: can contain bad encoding.
  • RefreshTime (type: integer(), default: nil) - Refresh time in meta redirect tag
  • RobotsTxt (type: String.t, default: nil) - The robots.txt we used for this fetch. Not normally filled in unless WantRobotsBody is set.
  • SourceBody (type: GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchBodyData.t, default: nil) - For meta-redirects, this field may contain the body of the source document. Currently only filled client side and not implemented (yet) for server-side redirects.
  • TargetUrl (type: String.t, default: nil) - Difference between the following two fields: TargetUrl is set when we have followed the redirect target, and the url is canonicalized. RawTargetUrl is set in either of the following two cases: (1) The url has not be been followed. For example, the redirect is intended to be handled by the client. In the fetch reply response, you will see the url's status as URL_NOT_FOLLOWED-NOT_FOLLOWED. (2) The extracted redirect url is different from its canonicalized* form. For example, if the target url contains fragments, then this RawTargetUrl will have the fragments. Redirect target
  • Type (type: String.t, default: nil) - URL and redirect type

Summary

Functions

Unwrap a decoded JSON object into its complex fields.

Types

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchReplyDataRedirects{
  BadSSLCertificate: String.t() | nil,
  CrawlTimes: GoogleApi.ContentWarehouse.V1.Model.TrawlerCrawlTimes.t() | nil,
  DownloadTime: integer() | nil,
  Endpoints: GoogleApi.ContentWarehouse.V1.Model.TrawlerTCPIPInfo.t() | nil,
  HSTSInfo: String.t() | nil,
  HTTPResponseCode: integer() | nil,
  HopPageNoIndexInfo: integer() | nil,
  HopReuseInfo: String.t() | nil,
  HopRobotsInfo: integer() | nil,
  HostId: String.t() | nil,
  HttpRequestHeaders: String.t() | nil,
  HttpResponseHeaders: String.t() | nil,
  RawTargetUrl: String.t() | nil,
  RefreshTime: integer() | nil,
  RobotsTxt: String.t() | nil,
  SourceBody:
    GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchBodyData.t() | nil,
  TargetUrl: String.t() | nil,
  Type: String.t() | nil
}

Functions

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.