GoogleApi.ContentWarehouse.V1.Model.GoodocSummaryStats (google_api_content_warehouse v0.3.0)

Goodoc stats for a range of elements, such as one page or a whole book. These stats can be computed using the SummaryStatsCollector class. Some range stats are pre-computed and stored in goodocs/volumes (e.g., Page.stats below, and Ocean's CA_VolumeResult.goodoc_stats).

Attributes

  • numParagraphs (type: integer(), default: nil) - Paragraph stats. Median symbols and words omit junk, header, and footer blocks; they are intended to be a measure of the typical "content" paragraph. There can still be substantial differences between means and medians, particularly if a table is present (every cell is a paragraph).
  • medianSymbolsPerParagraph (type: integer(), default: nil) -
  • estimatedFontSizes (type: boolean(), default: nil) - This flag is set if the font size histogram (fontSizeHistogram) has been derived by estimating font sizes from CharLabel.CharacterHeight; that happens if the FontSize field is constant, as has happened with Abbyy 9.
  • numLineSpaces (type: integer(), default: nil) - Number of lines (out of numLines) that have a successor line within their paragraph.
  • medianSymbolsPerBlock (type: integer(), default: nil) -
  • numWords (type: integer(), default: nil) - Word stats.
  • medianSymbolsPerWord (type: integer(), default: nil) -
  • meanSymbolsPerWord (type: integer(), default: nil) -
  • numNonGraphicBlocks (type: integer(), default: nil) -
  • medianFullOddPrintedBox (type: GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t, default: nil) -
  • medianWordsPerLine (type: integer(), default: nil) -
  • medianLineSpan (type: integer(), default: nil) - Distance from the top of a line to the top of the next line in the same paragraph.
  • medianWidth (type: integer(), default: nil) -
  • medianWordsPerParagraph (type: integer(), default: nil) -
  • meanWordsPerBlock (type: integer(), default: nil) -
  • medianParagraphIndent (type: integer(), default: nil) - Leading space on the first line of a paragraph.
  • medianOddPrintedBox (type: GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t, default: nil) - Median printed box over odd pages (1, 3, 5, ...).
  • medianSymbolsPerLine (type: integer(), default: nil) -
  • meanSymbolsPerLine (type: integer(), default: nil) -
  • numLines (type: integer(), default: nil) - Line stats. "Top" corresponds to the highest ascender and "bottom" to the lowest descender.
  • medianParagraphSpace (type: integer(), default: nil) - Distance from the bottom of a paragraph to the top of the next paragraph in the same block.
  • numParagraphSpaces (type: integer(), default: nil) - Number of paragraphs that have a successor paragraph within their block.
  • medianPrintedBox (type: GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t, default: nil) - Each median*_printed_box excludes page header/footer and all graphic blocks
  • numPages (type: integer(), default: nil) - Page stats.
  • medianHorizontalDpi (type: integer(), default: nil) -
  • meanSymbolsPerParagraph (type: integer(), default: nil) -
  • medianVerticalDpi (type: integer(), default: nil) -
  • medianFullPrintedBox (type: GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t, default: nil) - Each median_full*_printed_box includes page header/footer but still excludes all graphic blocks
  • fontSizeHistogram (type: list(GoogleApi.ContentWarehouse.V1.Model.GoodocFontSizeStats.t), default: nil) - Symbol counts (and other attributes) for each distinct CharLabel.FontId and FontSize; histogram is in decreasing order of symbol count
  • medianBlockSpace (type: integer(), default: nil) - Distance from the bottom of a block to the top of the next block in the same flow on the page.
  • medianLineHeight (type: integer(), default: nil) - Distance from the top of a line to its bottom.
  • medianHeight (type: integer(), default: nil) -
  • medianFullEvenPrintedBox (type: GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t, default: nil) -
  • meanWordsPerParagraph (type: integer(), default: nil) -
  • meanWordsPerLine (type: integer(), default: nil) -
  • medianEvenPrintedBox (type: GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t, default: nil) - Median printed box over even pages (0, 2, 4, ...).
  • medianLineSpace (type: integer(), default: nil) - Distance from the bottom of a line to the top of the next line in the same paragraph.
  • numSymbols (type: integer(), default: nil) - Symbol stats.
  • numBlocks (type: integer(), default: nil) - Block stats. Median symbols and words omit junk, header, and footer blocks; they are intended to be a measure of the typical "content" block. There can still be substantial differences between means and medians; however, block values will generally exceed paragraph values (not the case when headers and footers are included).
  • medianWordsPerBlock (type: integer(), default: nil) -
  • numBlockSpaces (type: integer(), default: nil) - Number of blocks that have a successor block within their flow on their page.
  • meanSymbolsPerBlock (type: integer(), default: nil) -
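
Several of the counts and averages above are easier to interpret in combination. The following is a minimal sketch, not part of the library: the module name StatsSketch, its two functions, and the skew factor of 2 are hypothetical and chosen only for illustration. It works on any map with the field names listed above, which includes the GoodocSummaryStats struct, and treats nil fields as "unknown".

```elixir
defmodule StatsSketch do
  # Hypothetical helpers; nothing here is provided by google_api_content_warehouse.

  # Fraction of lines that are followed by another line in the same paragraph
  # (numLineSpaces out of numLines). Returns nil if a count is missing or zero.
  def line_continuation_ratio(%{numLineSpaces: spaces, numLines: lines})
      when is_integer(spaces) and is_integer(lines) and lines > 0 do
    spaces / lines
  end

  def line_continuation_ratio(_stats), do: nil

  # Rough check for a large gap between mean and median words per paragraph,
  # which the field notes say can occur when a table is present.
  # The default factor of 2 is an arbitrary illustrative threshold.
  def paragraph_skew?(stats, factor \\ 2)

  def paragraph_skew?(%{meanWordsPerParagraph: mean, medianWordsPerParagraph: median}, factor)
      when is_integer(mean) and is_integer(median) and median > 0 do
    mean >= factor * median or median >= factor * mean
  end

  def paragraph_skew?(_stats, _factor), do: false
end

# Made-up values, used only to exercise the helpers.
stats = %{numLines: 40, numLineSpaces: 30, meanWordsPerParagraph: 60, medianWordsPerParagraph: 12}

IO.inspect(StatsSketch.line_continuation_ratio(stats))
IO.inspect(StatsSketch.paragraph_skew?(stats))
```

Because every attribute defaults to nil, the fallback clauses matter: a stats record that carries only page-level fields would otherwise raise on arithmetic with nil.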

Types

@type t() :: %GoogleApi.ContentWarehouse.V1.Model.GoodocSummaryStats{
  estimatedFontSizes: boolean() | nil,
  fontSizeHistogram:
    [GoogleApi.ContentWarehouse.V1.Model.GoodocFontSizeStats.t()] | nil,
  meanSymbolsPerBlock: integer() | nil,
  meanSymbolsPerLine: integer() | nil,
  meanSymbolsPerParagraph: integer() | nil,
  meanSymbolsPerWord: integer() | nil,
  meanWordsPerBlock: integer() | nil,
  meanWordsPerLine: integer() | nil,
  meanWordsPerParagraph: integer() | nil,
  medianBlockSpace: integer() | nil,
  medianEvenPrintedBox:
    GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t() | nil,
  medianFullEvenPrintedBox:
    GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t() | nil,
  medianFullOddPrintedBox:
    GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t() | nil,
  medianFullPrintedBox:
    GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t() | nil,
  medianHeight: integer() | nil,
  medianHorizontalDpi: integer() | nil,
  medianLineHeight: integer() | nil,
  medianLineSpace: integer() | nil,
  medianLineSpan: integer() | nil,
  medianOddPrintedBox:
    GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t() | nil,
  medianParagraphIndent: integer() | nil,
  medianParagraphSpace: integer() | nil,
  medianPrintedBox:
    GoogleApi.ContentWarehouse.V1.Model.GoodocBoundingBox.t() | nil,
  medianSymbolsPerBlock: integer() | nil,
  medianSymbolsPerLine: integer() | nil,
  medianSymbolsPerParagraph: integer() | nil,
  medianSymbolsPerWord: integer() | nil,
  medianVerticalDpi: integer() | nil,
  medianWidth: integer() | nil,
  medianWordsPerBlock: integer() | nil,
  medianWordsPerLine: integer() | nil,
  medianWordsPerParagraph: integer() | nil,
  numBlockSpaces: integer() | nil,
  numBlocks: integer() | nil,
  numLineSpaces: integer() | nil,
  numLines: integer() | nil,
  numNonGraphicBlocks: integer() | nil,
  numPages: integer() | nil,
  numParagraphSpaces: integer() | nil,
  numParagraphs: integer() | nil,
  numSymbols: integer() | nil,
  numWords: integer() | nil
}

Functions

@spec decode(struct(), keyword()) :: struct()

Unwrap a decoded JSON object into its complex fields.
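
A minimal sketch of how a JSON payload might be turned into this struct. It assumes the usual pattern for the generated Google API clients, where decoding goes through Poison with the `as:` option and the model's decoder is invoked from there; the payload is invented, and the version requirement passed to Mix.install is illustrative and needs network access to resolve.

```elixir
# Assumption: run as a standalone script; Mix.install fetches the package and
# its dependencies (Poison is expected to arrive transitively).
Mix.install([{:google_api_content_warehouse, "~> 0.3"}])

alias GoogleApi.ContentWarehouse.V1.Model.GoodocSummaryStats

# Hypothetical payload; only field names documented on this page are used.
json = ~s({"numPages": 2, "numLines": 40, "numLineSpaces": 30})

# Decode the JSON text into the model struct.
stats = Poison.decode!(json, as: %GoodocSummaryStats{})

IO.inspect(stats.numPages)
# Fields absent from the payload keep their default of nil.
IO.inspect(stats.medianWidth)
```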