Search results for Trawler
GoogleApi.ContentWarehouse.V1.Model.IndexingEmbeddedContentFetchHostCount (module)
Logs how many URLs on a host finally go to trawler during rendering.
GoogleApi.ContentWarehouse.V1.Model.TrawlerPolicyData (module)
Trawler can add a policy label to a FetchReply. The two main cases are: - "spam" label ...
GoogleApi.ContentWarehouse.V1.Model.TrawlerClientServiceInfo (module)
ClientServiceInfo is meant for trawler/harpoon clients which are in turn services to store some data specific to their clients...
GoogleApi.ContentWarehouse.V1.Model.TrawlerTCPIPInfo (module)
To keep track of fetch connection endpoints. Note: You can use trawler::SourceIP(info) or trawler::DestinationIP(info) (as well as HasSourceIP/HasDestinationIP) in ba...
Attributes - GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchStatus (module)
...OO, the Reason value will be one of the various FetchFooReason enum values from crawler/trawler/trawler_enums.proto * `State` (*type:* `String.t`, *default:* `nil`) - The State field describes ...
Attributes - GoogleApi.ContentWarehouse.V1.Model.TrawlerTCPIPInfo (module)
...* `String.t`, *default:* `nil`) - Address of the destination host. Extract with trawler::DestinationIP() or decode with PackedStringToIPAddress(). * `DestinationPort` (*type:* `integ...
Attributes - GoogleApi.ContentWarehouse.V1.Model.IndexingEmbeddedContentRenderingFetchStats (module)
...host->count mapping to log how many embedded_links in each host finally go to trawler during rendering.
GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchReplyData (module)
...plyData (and FetchReply) is the output interface from Multiverse. Teams outside Multiverse/Trawler should not create fake FetchReplies. Trawler: When adding new fields here, it i...
GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchReplyDataRedirects (module)
...field. * `HopPageNoIndexInfo` (*type:* `integer()`, *default:* `nil`) - Extra trawler::PageNoIndexInfo for this hop. Integer: ORed together bits from trawler::PageNoIndexInfo. The in...
GoogleApi.ContentWarehouse.V1.Model.GDocumentBaseOriginalContent (module)
...pressed at the sstable level. In doclogs content will only be compressed if the Trawler fetchreply is also compressed--which is currently never and unlikely to change ...
Attributes - GoogleApi.ContentWarehouse.V1.Model.IndexingEmbeddedContentLinkInfo (module)
...Warehouse.V1.Model.TrawlerFetchStatus.t`, *default:* `nil`) - Fetch status from trawler. * `fetchUrlResponseMetadata` (*type:* `GoogleApi.ContentWarehouse.V1.Model.In...
Attributes - GoogleApi.ContentWarehouse.V1.Model.TrawlerFetchReplyData (module)
...) - Data about the host bucket this request is in (if desired). Please talk with the Trawler team before considering using this, since what we fill in here is subject to ch...
Attributes - GoogleApi.ContentWarehouse.V1.Model.IndexingConverterRawRedirectInfo (module)
...events in rendering. At the beginning of it, there could be some redirects from trawler (i.e. could be partial or entire trawler redirect chain), other redirects have ...
Attributes - GoogleApi.ContentWarehouse.V1.Model.ImageMoosedogCrawlState (module)
...h of the above not_crawled_reason will have a set of detailed reasons defined in crawler/trawler/trawler_enums.proto. * `internalStatus` (*type:* `GoogleApi.ContentWarehouse.V1.Model.UtilStatusPr...
Recommended way of reading: const string& doc_key = cdoc.doc().id().key(); CHECK(!doc_key.empty()); More background information can be found in google3/indexing/crawler_id/servingdocumentidentifier.proto The ServingDocumentIdentifier uniquely identifies a document in serving and also distinguishes between experimental vs. production documents. The SDI is also used as an input for the union/muppet key generation in serving. - GoogleApi.ContentWarehouse.V1.Model.GDocumentBase (module)
...t). NOTE: This field is copied from the first WEBMIRROR FetchReplyClientInfo in trawler_fetch_info column. We leave this field unpopulated if no WEBMIRROR FetchReplyClientInfo is...
Attributes - GoogleApi.ContentWarehouse.V1.Model.IndexingConverterRedirectParams (module)
...ault:* `nil`) - If set, it means that the redirect of type META was detected by Trawler (as opposed to the content processor.) Only makes sense when type is META. * ...
Attributes - GoogleApi.ContentWarehouse.V1.Model.TrawlerTrawlerPrivateFetchReplyData (module)
...uestUserName` (*type:* `String.t`, *default:* `nil`) - Log the loas username in trawler private to help with debugging. Store the username in trawler private so client...
Attributes - GoogleApi.ContentWarehouse.V1.Model.IndexingEmbeddedContentFetchUrlResponseMetadata (module)
... - * `numTrawlerFetches` (*type:* `integer()`, *default:* `nil`) - Number of trawler fetches while fetching this URL. In most cases, this number will be 0 or 1. * ...
Attributes - GoogleApi.ContentWarehouse.V1.Model.IndexingEmbeddedContentEmbeddedContentInfo (module)
...teger()`, *default:* `nil`) - The original encoding of the content crawled from trawler. It's the value of enum i18n::encodings::encoding. We put an int32 here instead o...
Attributes - GoogleApi.ContentWarehouse.V1.Model.TrawlerHostBucketData (module)
...et as of 2013/08/21. Even after they are populated, they may change. So talk to trawler-dev@ before you use the fields. Total qps for this hostid * `TotalUsedQps` (*type:...
Attributes - GoogleApi.ContentWarehouse.V1.Model.TrawlerSSLCertificateInfo (module)
...fault:* `nil`) - Details about the SSL/TLS protocol and cipher. See RFC5246 and google3/crawler/trawler/hope/proto/ssl.proto for more details. * `SSLProtocolVersionName` (*type:* `String.t`, *default:* ...