GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics (google_api_content_warehouse v0.4.0)

Statistics of the anchors in a docjoin. Next available tag ID: 63.

Attributes

  • penguinLastUpdate (type: integer(), default: nil) - BEGIN: Penguin related fields. Timestamp when penguin scores were last updated. Measured in days since Jan. 1st 1995.
  • anchorCount (type: integer(), default: nil) -
  • badbacklinksPenalized (type: boolean(), default: nil) - Whether this doc is penalized by BadBackLinks, in which case we should not use improvanchor score in mustang ascorer.
  • penguinPenalty (type: number(), default: nil) - Page-level penguin penalty (0 = good, 1 = bad).
  • minHostHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are host home pages as well as on the same host as the current target URL.
  • droppedRedundantAnchorCount (type: integer(), default: nil) - Sum of anchors_dropped in the repeated group RedundantAnchorInfo, but can go higher if the latter reaches the cap of kMaxRecordsToKeep (indexing/docjoiner/anchors/anchor-loader.cc), currently 10,000.
  • nonLocalAnchorCount (type: integer(), default: nil) -
  • mediumCorpusAnchorCount (type: integer(), default: nil) -
  • penguinEarlyAnchorProtected (type: boolean(), default: nil) - Doc is protected by goodness of early anchors.
  • droppedHomepageAnchorCount (type: integer(), default: nil) -
  • redundantanchorinfoforphrasecap (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t), default: nil) -
  • forwardedOffdomainAnchorCount (type: integer(), default: nil) -
  • droppedNonLocalAnchorCount (type: integer(), default: nil) -
  • perdupstats (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t), default: nil) -
  • onsiteAnchorCount (type: integer(), default: nil) -
  • droppedLocalAnchorCount (type: integer(), default: nil) -
  • penguinTooManySources (type: boolean(), default: nil) - Doc not scored because it has too many anchor sources. END: Penguin related fields.
  • forwardedAnchorCount (type: integer(), default: nil) -
  • anchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t, default: nil) - This structure contains signals and penalties of AnchorSpamPenalizer. It replaces phrase_anchor_spam_info (phraseAnchorSpamInfo in this listing), which is deprecated.
  • lowCorpusAnchorCount (type: integer(), default: nil) -
  • lowCorpusOffdomainAnchorCount (type: integer(), default: nil) -
  • baseAnchorCount (type: integer(), default: nil) -
  • minDomainHomePageLocalOutdegree (type: integer(), default: nil) - Minimum local outdegree of all anchor sources that are domain home pages as well as on the same domain as the current target URL.
  • skippedAccumulate (type: integer(), default: nil) - A count of the number of times anchor accumulation has been skipped for this document. Note: Only used when canonical.
  • topPrOnsiteAnchorCount (type: integer(), default: nil) - According to the anchor quality bucket, an anchor with pagerank > 51000 is the best anchor; anchors with pagerank < 47000 are all the same.
  • pageMismatchTaggedAnchors (type: integer(), default: nil) -
  • spamLog10Odds (type: number(), default: nil) - The log base 10 odds that this set of anchors exhibits spammy behavior. Computed in the AnchorLocalizer.
  • redundantanchorinfo (type: list(GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t), default: nil) -
  • pageFromExpiredTaggedAnchors (type: integer(), default: nil) - Set in SignalPenalizer::FillInAnchorStatistics.
  • baseOffdomainAnchorCount (type: integer(), default: nil) -
  • phraseAnchorSpamInfo (type: GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t, default: nil) - Following signals identify spike of spammy anchor phrases. Anchors created during the spike are tagged with LINK_SPAM_PHRASE_SPIKE.
  • anchorPhraseCount (type: integer(), default: nil) - The number of unique anchor phrases. Capped by the constant kMaxAnchorPhraseCountInStats (=5000) defined in indexing/docjoiner/anchors/anchor-manager.cc.
  • ondomainAnchorCount (type: integer(), default: nil) -
  • totalDomainsAbovePhraseCap (type: integer(), default: nil) - Number of domains above the per-domain phrase cap, i.e. domains in which too many phrases were seen.
  • totalDomainsSeen (type: integer(), default: nil) - Number of domains seen in total.
  • topPrOffdomainAnchorCount (type: integer(), default: nil) -
  • scannedAnchorCount (type: integer(), default: nil) - The total number of anchors being scanned from storage.
  • localAnchorCount (type: integer(), default: nil) -
  • linkBeforeSitechangeTaggedAnchors (type: integer(), default: nil) -
  • globalAnchorDelta (type: integer(), default: nil) - Metric of the number of changed global anchors, computed as size(union(previous, new) - intersection(previous, new)).
  • topPrOndomainAnchorCount (type: integer(), default: nil) -
  • mediumCorpusOffdomainAnchorCount (type: integer(), default: nil) -
  • offdomainAnchorCount (type: integer(), default: nil) -
  • totalDomainPhrasePairsSeenApprox (type: integer(), default: nil) - Number of domain/phrase pairs in total -- i.e. how many anchors we would have if the domain/phrase cutoff was set to 1 instead of 200. This is "approx" for large anchor clusters because there can be double counting when the LRU cache forgets about rare domain/phrase pairs.
  • skippedOrReusedReason (type: String.t, default: nil) - Reason for skipping accumulation when skipped, or reason for reprocessing when not skipped.
  • anchorsWithDedupedImprovanchors (type: integer(), default: nil) - The number of anchors for which some ImprovAnchors phrases have been removed due to duplication within source org.
  • fakeAnchorCount (type: integer(), default: nil) -
  • redundantAnchorForPhraseCapCount (type: integer(), default: nil) - Total anchors dropped for exceeding the per-domain phrase cap. Equal to the sum of anchors_dropped in the repeated group RedundantAnchorInfoForPhraseCap, but can go higher if the latter reaches the cap of kMaxDomainsToKeepForPhraseCap (indexing/docjoiner/anchors/anchor-loader.h), currently 1000.
  • totalDomainPhrasePairsAboveLimit (type: integer(), default: nil) - The following should be equal to the size of the following repeated group, except that it can go higher than 10,000.
  • timestamp (type: integer(), default: nil) - Walltime of when anchors were accumulated last.
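
Two of the descriptions above define their values arithmetically: penguinLastUpdate is a day count since Jan. 1st 1995, and globalAnchorDelta is the size of a set difference. The following is a minimal sketch of both calculations in plain Elixir. The module name AnchorStatsSketch and its function names are hypothetical and not part of this library, and it assumes that day 0 corresponds to 1995-01-01 and that anchors can be compared as simple values.

```elixir
defmodule AnchorStatsSketch do
  @moduledoc false

  # Assumption: a penguinLastUpdate value of 0 means 1995-01-01.
  @penguin_epoch ~D[1995-01-01]

  # Converts a penguinLastUpdate day count into a calendar date.
  def penguin_last_update_to_date(days) when is_integer(days) do
    Date.add(@penguin_epoch, days)
  end

  # Implements size(union(previous, new) - intersection(previous, new)),
  # i.e. the number of anchors present in exactly one of the two sets.
  def global_anchor_delta(previous, new) do
    prev = MapSet.new(previous)
    curr = MapSet.new(new)

    prev
    |> MapSet.union(curr)
    |> MapSet.difference(MapSet.intersection(prev, curr))
    |> MapSet.size()
  end
end

# 1995 has 365 days, so day 365 is the first day of 1996.
IO.inspect(AnchorStatsSketch.penguin_last_update_to_date(365))
# prints ~D[1996-01-01]

# "a" was removed and "d", "e" were added, so three anchors changed.
IO.inspect(AnchorStatsSketch.global_anchor_delta(["a", "b", "c"], ["b", "c", "d", "e"]))
# prints 3
```

How the indexing pipeline actually identifies an anchor when building these sets is not stated here; the strings above are placeholders.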

Types

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics{
  anchorCount: integer() | nil,
  anchorPhraseCount: integer() | nil,
  anchorSpamInfo:
    GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorSpamInfo.t()
    | nil,
  anchorsWithDedupedImprovanchors: integer() | nil,
  badbacklinksPenalized: boolean() | nil,
  baseAnchorCount: integer() | nil,
  baseOffdomainAnchorCount: integer() | nil,
  droppedHomepageAnchorCount: integer() | nil,
  droppedLocalAnchorCount: integer() | nil,
  droppedNonLocalAnchorCount: integer() | nil,
  droppedRedundantAnchorCount: integer() | nil,
  fakeAnchorCount: integer() | nil,
  forwardedAnchorCount: integer() | nil,
  forwardedOffdomainAnchorCount: integer() | nil,
  globalAnchorDelta: integer() | nil,
  linkBeforeSitechangeTaggedAnchors: integer() | nil,
  localAnchorCount: integer() | nil,
  lowCorpusAnchorCount: integer() | nil,
  lowCorpusOffdomainAnchorCount: integer() | nil,
  mediumCorpusAnchorCount: integer() | nil,
  mediumCorpusOffdomainAnchorCount: integer() | nil,
  minDomainHomePageLocalOutdegree: integer() | nil,
  minHostHomePageLocalOutdegree: integer() | nil,
  nonLocalAnchorCount: integer() | nil,
  offdomainAnchorCount: integer() | nil,
  ondomainAnchorCount: integer() | nil,
  onsiteAnchorCount: integer() | nil,
  pageFromExpiredTaggedAnchors: integer() | nil,
  pageMismatchTaggedAnchors: integer() | nil,
  penguinEarlyAnchorProtected: boolean() | nil,
  penguinLastUpdate: integer() | nil,
  penguinPenalty: number() | nil,
  penguinTooManySources: boolean() | nil,
  perdupstats:
    [
      GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsPerDupStats.t()
    ]
    | nil,
  phraseAnchorSpamInfo:
    GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorPhraseSpamInfo.t()
    | nil,
  redundantAnchorForPhraseCapCount: integer() | nil,
  redundantanchorinfo:
    [
      GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfo.t()
    ]
    | nil,
  redundantanchorinfoforphrasecap:
    [
      GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatisticsRedundantAnchorInfoForPhraseCap.t()
    ]
    | nil,
  scannedAnchorCount: integer() | nil,
  skippedAccumulate: integer() | nil,
  skippedOrReusedReason: String.t() | nil,
  spamLog10Odds: number() | nil,
  timestamp: integer() | nil,
  topPrOffdomainAnchorCount: integer() | nil,
  topPrOndomainAnchorCount: integer() | nil,
  topPrOnsiteAnchorCount: integer() | nil,
  totalDomainPhrasePairsAboveLimit: integer() | nil,
  totalDomainPhrasePairsSeenApprox: integer() | nil,
  totalDomainsAbovePhraseCap: integer() | nil,
  totalDomainsSeen: integer() | nil
}

Functions

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
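
As a hedged illustration of how this model is typically populated, the sketch below decodes a JSON string into the struct with Poison, the JSON library that the Google API Elixir clients depend on. It assumes the script runs with network access so that Mix.install can fetch the package, and the JSON payload is invented for the example. In normal use the generated client performs this decoding for you.

```elixir
# Assumption: run as a standalone script (Elixir 1.12+) with network access.
Mix.install([{:google_api_content_warehouse, "~> 0.4"}])

alias GoogleApi.ContentWarehouse.V1.Model.IndexingDocjoinerAnchorStatistics

# Made-up payload for illustration only; keys use the attribute names listed above.
json = ~s({"anchorCount": 12, "penguinPenalty": 0.25, "badbacklinksPenalized": false})

# Decoding into the struct lets the model's decoder unwrap any nested
# fields (such as anchorSpamInfo) into their own model structs.
stats = Poison.decode!(json, as: %IndexingDocjoinerAnchorStatistics{})

IO.inspect(stats.anchorCount)
IO.inspect(stats.penguinPenalty)

# Attributes absent from the payload keep their default of nil.
IO.inspect(stats.onsiteAnchorCount)
```

Because every attribute defaults to nil, callers should handle missing values explicitly rather than assume a count of zero.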