GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics (google_api_content_warehouse v0.4.0)
Statistics of the anchors in a docjoin. Next available tag ID: 63.
Attributes
- penguinLastUpdate (type: integer(), default: nil) - BEGIN: Penguin related fields. Timestamp when penguin scores were last updated. Measured in days since Jan. 1st 1995.
- anchorCount (type: integer(), default: nil)
- badbacklinksPenalized (type: boolean(), default: nil) - Whether this doc is penalized by BadBackLinks, in which case we should not use the improvanchor score in the mustang ascorer.
- penguinPenalty (type: number(), default: nil) - Page-level penguin penalty (0 = good, 1 = bad).
- minHostHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are host home pages as well as on the same host as the current target URL.
- droppedRedundantAnchorCount (type: integer(), default: nil) - Sum of anchors_dropped in the repeated group RedundantAnchorInfo, but can go higher if the latter reaches the cap of kMaxRecordsToKeep (indexing/docjoiner/anchors/anchor-loader.cc), currently 10,000.
- nonLocalAnchorCount (type: integer(), default: nil)
- mediumCorpusAnchorCount (type: integer(), default: nil)
- penguinEarlyAnchorProtected (type: boolean(), default: nil) - Doc is protected by goodness of early anchors.
- droppedHomepageAnchorCount (type: integer(), default: nil)
- redundantanchorinfoforphrasecap (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t), default: nil)
- forwardedOffdomainAnchorCount (type: integer(), default: nil)
- droppedNonLocalAnchorCount (type: integer(), default: nil)
- perdupstats (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t), default: nil)
- onsiteAnchorCount (type: integer(), default: nil)
- droppedLocalAnchorCount (type: integer(), default: nil)
- penguinTooManySources (type: boolean(), default: nil) - Doc not scored because it has too many anchor sources. END: Penguin related fields.
- forwardedAnchorCount (type: integer(), default: nil)
- anchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t, default: nil) - This structure contains signals and penalties of AnchorSpamPenalizer. It replaces phrase_anchor_spam_info, which is deprecated.
- lowCorpusAnchorCount (type: integer(), default: nil)
- lowCorpusOffdomainAnchorCount (type: integer(), default: nil)
- baseAnchorCount (type: integer(), default: nil)
- minDomainHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are domain home pages as well as on the same domain as the current target URL.
- skippedAccumulate (type: integer(), default: nil) - A count of the number of times anchor accumulation has been skipped for this document. Note: Only used when canonical.
- topPrOnsiteAnchorCount (type: integer(), default: nil) - According to the anchor quality buckets, an anchor with pagerank > 51000 is the best anchor; anchors with pagerank < 47000 are all the same.
- pageMismatchTaggedAnchors (type: integer(), default: nil)
- spamLog10Odds (type: number(), default: nil) - The log base 10 odds that this set of anchors exhibits spammy behavior. Computed in the AnchorLocalizer.
- redundantanchorinfo (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t), default: nil)
- pageFromExpiredTaggedAnchors (type: integer(), default: nil) - Set in SignalPenalizer::FillInAnchorStatistics.
- baseOffdomainAnchorCount (type: integer(), default: nil)
- phraseAnchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t, default: nil) - The following signals identify a spike of spammy anchor phrases. Anchors created during the spike are tagged with LINK_SPAM_PHRASE_SPIKE.
- anchorPhraseCount (type: integer(), default: nil) - The number of unique anchor phrases. Capped by the constant kMaxAnchorPhraseCountInStats (=5000) defined in indexing/docjoiner/anchors/anchor-manager.cc.
- ondomainAnchorCount (type: integer(), default: nil)
- totalDomainsAbovePhraseCap (type: integer(), default: nil) - Number of domains above the per-domain phrase cap, i.e. domains in which we see too many phrases.
- totalDomainsSeen (type: integer(), default: nil) - Number of domains seen in total.
- topPrOffdomainAnchorCount (type: integer(), default: nil)
- scannedAnchorCount (type: integer(), default: nil) - The total number of anchors being scanned from storage.
- localAnchorCount (type: integer(), default: nil)
- linkBeforeSitechangeTaggedAnchors (type: integer(), default: nil)
- globalAnchorDelta (type: integer(), default: nil) - Metric for the number of changed global anchors, computed as size(union(previous, new) - intersection(previous, new)).
- topPrOndomainAnchorCount (type: integer(), default: nil)
- mediumCorpusOffdomainAnchorCount (type: integer(), default: nil)
- offdomainAnchorCount (type: integer(), default: nil)
- totalDomainPhrasePairsSeenApprox (type: integer(), default: nil) - Number of domain/phrase pairs in total, i.e. how many anchors we would have if the domain/phrase cutoff were set to 1 instead of 200. This is "approx" for large anchor clusters because there can be double counting when the LRU cache forgets about rare domain/phrase pairs.
- skippedOrReusedReason (type: String.t, default: nil) - Reason for skipping accumulation, when skipped, or reason for reprocessing when not skipped.
- anchorsWithDedupedImprovanchors (type: integer(), default: nil) - The number of anchors for which some ImprovAnchors phrases have been removed due to duplication within source org.
- fakeAnchorCount (type: integer(), default: nil)
- redundantAnchorForPhraseCapCount (type: integer(), default: nil) - Total anchors dropped for exceeding the per-domain phrase cap. Equal to the sum of anchors_dropped in the repeated group RedundantAnchorInfoForPhraseCap, but can go higher if the latter reaches the cap of kMaxDomainsToKeepForPhraseCap (indexing/docjoiner/anchors/anchor-loader.h), currently 1000.
- totalDomainPhrasePairsAboveLimit (type: integer(), default: nil) - This should be equal to the size of the following repeated group, except that it can go higher than 10,000.
- timestamp (type: integer(), default: nil) - Walltime when anchors were last accumulated.
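Three of the attributes above describe a computation or an encoding that may be easier to follow in code. The sketch below is illustrative only: the module AnchorStatsHelpers and its function names are hypothetical and not part of google_api_content_warehouse. It assumes that penguinLastUpdate counts whole days from 1995-01-01 (day 0), that spamLog10Odds follows the usual definition of odds as p / (1 - p), and that anchors can be compared as plain values when computing a delta in the style of globalAnchorDelta.

```elixir
defmodule AnchorStatsHelpers do
  # Hypothetical helpers for reading a few IndexingDocjoinerAnchorStatistics
  # fields. Not part of the google_api_content_warehouse package.

  # Epoch named in the penguinLastUpdate description (assumed to be day 0).
  @penguin_epoch ~D[1995-01-01]

  # Converts a "days since Jan. 1st 1995" count into a Date; nil stays nil.
  def penguin_last_update_date(nil), do: nil

  def penguin_last_update_date(days) when is_integer(days) do
    Date.add(@penguin_epoch, days)
  end

  # Converts log base 10 odds into a probability, assuming the standard
  # definition odds = p / (1 - p), so p = odds / (1 + odds).
  def spam_probability(nil), do: nil

  def spam_probability(log10_odds) when is_number(log10_odds) do
    odds = :math.pow(10, log10_odds)
    odds / (1 + odds)
  end

  # size(union(previous, new) - intersection(previous, new)), i.e. the size
  # of the symmetric difference of the two anchor collections.
  def global_anchor_delta(previous, new) do
    prev_set = MapSet.new(previous)
    new_set = MapSet.new(new)

    MapSet.union(prev_set, new_set)
    |> MapSet.difference(MapSet.intersection(prev_set, new_set))
    |> MapSet.size()
  end
end
```

Under these assumptions, AnchorStatsHelpers.penguin_last_update_date(365) yields ~D[1996-01-01], AnchorStatsHelpers.spam_probability(0) yields 0.5 (even odds), and AnchorStatsHelpers.global_anchor_delta(["a", "b", "c"], ["b", "c", "d"]) yields 2, since only "a" and "d" changed.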
Types
@type t() :: %GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics{
  anchorCount: integer() | nil,
  anchorPhraseCount: integer() | nil,
  anchorSpamInfo: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t() | nil,
  anchorsWithDedupedImprovanchors: integer() | nil,
  badbacklinksPenalized: boolean() | nil,
  baseAnchorCount: integer() | nil,
  baseOffdomainAnchorCount: integer() | nil,
  droppedHomepageAnchorCount: integer() | nil,
  droppedLocalAnchorCount: integer() | nil,
  droppedNonLocalAnchorCount: integer() | nil,
  droppedRedundantAnchorCount: integer() | nil,
  fakeAnchorCount: integer() | nil,
  forwardedAnchorCount: integer() | nil,
  forwardedOffdomainAnchorCount: integer() | nil,
  globalAnchorDelta: integer() | nil,
  linkBeforeSitechangeTaggedAnchors: integer() | nil,
  localAnchorCount: integer() | nil,
  lowCorpusAnchorCount: integer() | nil,
  lowCorpusOffdomainAnchorCount: integer() | nil,
  mediumCorpusAnchorCount: integer() | nil,
  mediumCorpusOffdomainAnchorCount: integer() | nil,
  minDomainHomePageLocalOutdegree: integer() | nil,
  minHostHomePageLocalOutdegree: integer() | nil,
  nonLocalAnchorCount: integer() | nil,
  offdomainAnchorCount: integer() | nil,
  ondomainAnchorCount: integer() | nil,
  onsiteAnchorCount: integer() | nil,
  pageFromExpiredTaggedAnchors: integer() | nil,
  pageMismatchTaggedAnchors: integer() | nil,
  penguinEarlyAnchorProtected: boolean() | nil,
  penguinLastUpdate: integer() | nil,
  penguinPenalty: number() | nil,
  penguinTooManySources: boolean() | nil,
  perdupstats: [GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t()] | nil,
  phraseAnchorSpamInfo: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t() | nil,
  redundantAnchorForPhraseCapCount: integer() | nil,
  redundantanchorinfo: [GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t()] | nil,
  redundantanchorinfoforphrasecap: [GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t()] | nil,
  scannedAnchorCount: integer() | nil,
  skippedAccumulate: integer() | nil,
  skippedOrReusedReason: String.t() | nil,
  spamLog10Odds: number() | nil,
  timestamp: integer() | nil,
  topPrOffdomainAnchorCount: integer() | nil,
  topPrOndomainAnchorCount: integer() | nil,
  topPrOnsiteAnchorCount: integer() | nil,
  totalDomainPhrasePairsAboveLimit: integer() | nil,
  totalDomainPhrasePairsSeenApprox: integer() | nil,
  totalDomainsAbovePhraseCap: integer() | nil,
  totalDomainsSeen: integer() | nil
}
Functions
decode(value, options)
@spec decode(struct(), keyword()) :: struct()
Unwrap a decoded JSON object into its complex fields.