Class UnivariateStats

java.lang.Object
uk.ac.starlink.ttools.filter.UnivariateStats

public abstract class UnivariateStats extends Object
Calculates univariate statistics for a variable. Feed data to an instance of this class by repeatedly calling acceptDatum(java.lang.Object, long), then call the various accessor methods to get the accumulated values.
Since:
27 Apr 2006
Author:
Mark Taylor
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static interface 
    UnivariateStats.ArrayStats
    Aggregates statistics acquired from a column whose values are fixed-length numeric arrays.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    MAX_CARDINALITY
    Maximum value for cardinality counters.
  • Constructor Summary

    Constructors
    Constructor
    Description
    UnivariateStats()
  • Method Summary

    Modifier and Type
    Method
    Description
    abstract void
    acceptDatum(Object value, long irow)
    Submits a single value to the statistics accumulator.
    abstract void
    addStats(UnivariateStats other)
    Adds the accumulated content of a second UnivariateStats object to this one.
    static UnivariateStats
    createStats(Class<?> clazz, Supplier<Quantiler> qSupplier, boolean doCard)
    Factory method to construct an instance of this class for accumulating particular types of values.
    abstract UnivariateStats.ArrayStats
    getArrayStats()
    Returns an object containing statistics applicable to numeric-array-valued columns.
    abstract int
    getCardinality()
    Returns the number of distinct non-null values submitted, if known.
    abstract long
    getCount()
    Returns the number of good (non-null) values accumulated.
    abstract Comparable<?>
    getMaximum()
    Returns the maximum value submitted, if applicable.
    abstract long
    getMaxPos()
    Returns the sequence number of the maximum value submitted.
    abstract Comparable<?>
    getMinimum()
    Returns the numeric minimum value submitted, if applicable.
    abstract long
    getMinPos()
    Returns the sequence number of the minimum value submitted.
    abstract Quantiler
    getQuantiler()
    Returns a quantiler ready to provide quantile values, or null if quantiles were not gathered.
    abstract double
    getSum()
    Returns the numeric sum of values accumulated.
    abstract double
    getSum2()
    Returns the sum of squares of values accumulated.
    abstract double
    getSum3()
    Returns the sum of cubes of values accumulated.
    abstract double
    getSum4()
    Returns the sum of fourth powers of values accumulated.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • MAX_CARDINALITY

      public static final int MAX_CARDINALITY
      Maximum value for cardinality counters.
  • Constructor Details

    • UnivariateStats

      public UnivariateStats()
  • Method Details

    • acceptDatum

      public abstract void acceptDatum(Object value, long irow)
      Submits a single value to the statistics accumulator. The submitted value should be of a type compatible with the class type of this Stats object.
      Parameters:
      value - value object
      irow - row index of input value
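      The feed-then-query pattern described above can be illustrated with a minimal, self-contained sketch. CountSumSketch is a hypothetical stand-in written for this example, not the library's implementation; it assumes a numeric column and assumes that null and non-numeric values are simply skipped.

```java
/** Hypothetical stand-in for a numeric accumulator; not the STILTS implementation. */
final class CountSumSketch {
    private long count;
    private double sum;

    /**
     * Accepts one cell value. Assumption for this sketch: nulls and
     * objects that are not Numbers are ignored rather than rejected.
     */
    void acceptDatum(Object value, long irow) {
        if (value instanceof Number) {
            count++;
            sum += ((Number) value).doubleValue();
        }
    }

    long getCount() { return count; }
    double getSum() { return sum; }

    public static void main(String[] args) {
        // A column with mixed boxed numeric types, a null and a stray string.
        Object[] column = { 1.5, null, 2, "n/a", 4.5f };
        CountSumSketch stats = new CountSumSketch();
        for (int irow = 0; irow < column.length; irow++) {
            stats.acceptDatum(column[irow], irow);
        }
        System.out.println(stats.getCount() + " good values, sum " + stats.getSum());
    }
}
```

      The row index is passed through unused here; an accumulator that also tracks extrema would record it alongside the value.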
    • addStats

      public abstract void addStats(UnivariateStats other)
      Adds the accumulated content of a second UnivariateStats object to this one.
      Parameters:
      other - compatible UnivariateStats object
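      Merging is what allows a column to be processed in independent chunks and the partial results combined afterwards. The sketch below shows the idea for count and power sums only; MergeSketch and its fields are hypothetical names for this example, and how the library merges extrema, cardinality or quantiles is not covered.

```java
/** Hypothetical accumulator showing how partial results can be merged. */
final class MergeSketch {
    private long count;
    private double sum;
    private double sum2;

    void accept(double x) {
        count++;
        sum += x;
        sum2 += x * x;
    }

    /** Adds the totals of another accumulator to this one; 'other' is not modified. */
    void addStats(MergeSketch other) {
        count += other.count;
        sum += other.sum;
        sum2 += other.sum2;
    }

    long getCount() { return count; }
    double getSum() { return sum; }
    double getSum2() { return sum2; }

    public static void main(String[] args) {
        double[] data = { 1, 2, 3, 4 };
        MergeSketch left = new MergeSketch();
        MergeSketch right = new MergeSketch();
        // Accumulate each half separately, as two worker threads might.
        for (int i = 0; i < data.length / 2; i++) {
            left.accept(data[i]);
        }
        for (int i = data.length / 2; i < data.length; i++) {
            right.accept(data[i]);
        }
        left.addStats(right);
        System.out.println(left.getCount() + " values, sum " + left.getSum()
                           + ", sum of squares " + left.getSum2());
    }
}
```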
    • getCount

      public abstract long getCount()
      Returns the number of good (non-null) values accumulated.
      Returns:
      good value count
    • getSum

      public abstract double getSum()
      Returns the numeric sum of values accumulated.
      Returns:
      sum of values
    • getSum2

      public abstract double getSum2()
      Returns the sum of squares of values accumulated.
      Returns:
      sum of squared values
    • getSum3

      public abstract double getSum3()
      Returns the sum of cubes of values accumulated.
      Returns:
      sum of cubed values
    • getSum4

      public abstract double getSum4()
      Returns the sum of fourth powers of values accumulated.
      Returns:
      sum of fourth powers
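      The count and the four power sums are enough to derive the usual moment-based statistics. The sketch below uses the standard population formulas (dividing by n, not n - 1); MomentsSketch is a hypothetical helper, and this page does not state which formulas callers of this class actually apply. Expanding central moments in terms of raw sums can lose precision when the mean is large compared with the spread.

```java
/** Hypothetical helper deriving population moments from a count and raw power sums. */
final class MomentsSketch {

    static double mean(long n, double s1) {
        return s1 / n;
    }

    /** Population variance: E[x^2] - mean^2. */
    static double variance(long n, double s1, double s2) {
        double m = s1 / n;
        return s2 / n - m * m;
    }

    /** Third central moment divided by variance^(3/2). */
    static double skewness(long n, double s1, double s2, double s3) {
        double m = s1 / n;
        double m3 = s3 / n - 3 * m * s2 / n + 2 * m * m * m;
        return m3 / Math.pow(variance(n, s1, s2), 1.5);
    }

    /** Fourth central moment divided by variance^2 (not excess kurtosis). */
    static double kurtosis(long n, double s1, double s2, double s3, double s4) {
        double m = s1 / n;
        double m4 = s4 / n - 4 * m * s3 / n + 6 * m * m * s2 / n - 3 * m * m * m * m;
        double var = variance(n, s1, s2);
        return m4 / (var * var);
    }

    public static void main(String[] args) {
        double[] data = { 1, 2, 3, 4 };
        long n = 0;
        double s1 = 0, s2 = 0, s3 = 0, s4 = 0;
        // Accumulate the same quantities that getCount() and getSum..getSum4() report.
        for (double x : data) {
            n++;
            s1 += x;
            s2 += x * x;
            s3 += x * x * x;
            s4 += x * x * x * x;
        }
        System.out.println("mean=" + mean(n, s1)
                           + " variance=" + variance(n, s1, s2)
                           + " skewness=" + skewness(n, s1, s2, s3)
                           + " kurtosis=" + kurtosis(n, s1, s2, s3, s4));
    }
}
```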
    • getMinimum

      public abstract Comparable<?> getMinimum()
      Returns the numeric minimum value submitted, if applicable.
      Returns:
      minimum
    • getMaximum

      public abstract Comparable<?> getMaximum()
      Returns the maximum value submitted, if applicable.
      Returns:
      maximum
    • getMinPos

      public abstract long getMinPos()
      Returns the sequence number of the minimum value submitted. Returns -1 if there is no minimum or if the sequence number is not known.
      Returns:
      row index of minimum, or -1
    • getMaxPos

      public abstract long getMaxPos()
      Returns the sequence number of the maximum value submitted. Returns -1 if there is no maximum or if the sequence number is not known.
      Returns:
      row index of maximum, or -1
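      The relationship between the extremum accessors and their position accessors can be sketched as follows. ExtremaSketch is a hypothetical class for illustration; in particular, keeping the first row on ties is an assumption of this sketch, not documented behaviour.

```java
/** Hypothetical tracker of minimum and maximum values with their row indices. */
final class ExtremaSketch<T extends Comparable<T>> {
    private T min;
    private T max;
    private long minPos = -1;
    private long maxPos = -1;

    /** Records a value and its row index; nulls are ignored. */
    void accept(T value, long irow) {
        if (value == null) {
            return;
        }
        // Strict comparisons mean the first row wins on ties (an assumption).
        if (min == null || value.compareTo(min) < 0) {
            min = value;
            minPos = irow;
        }
        if (max == null || value.compareTo(max) > 0) {
            max = value;
            maxPos = irow;
        }
    }

    T getMinimum() { return min; }
    T getMaximum() { return max; }
    long getMinPos() { return minPos; }
    long getMaxPos() { return maxPos; }

    public static void main(String[] args) {
        ExtremaSketch<Integer> ex = new ExtremaSketch<>();
        Integer[] column = { 5, 2, null, 9, 2 };
        for (int irow = 0; irow < column.length; irow++) {
            ex.accept(column[irow], irow);
        }
        System.out.println("min " + ex.getMinimum() + " at row " + ex.getMinPos()
                           + ", max " + ex.getMaximum() + " at row " + ex.getMaxPos());
    }
}
```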
    • getCardinality

      public abstract int getCardinality()
      Returns the number of distinct non-null values submitted, if known. If the count was not collected, or if there were too many different values to count, -1 is returned.
      Returns:
      number of distinct non-null values, or -1
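      A capped distinct-value counter of the kind described above might look like the following sketch. CardinalitySketch and its limit parameter are hypothetical; the actual value of MAX_CARDINALITY and the exact point at which the library gives up are not stated on this page.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical distinct-value counter that gives up beyond a fixed limit. */
final class CardinalitySketch {
    private final int limit;
    private Set<Object> seen = new HashSet<>();

    CardinalitySketch(int limit) {
        this.limit = limit;
    }

    /** Records a value; nulls are ignored, and nothing is stored once the limit is exceeded. */
    void accept(Object value) {
        if (value == null || seen == null) {
            return;
        }
        seen.add(value);
        if (seen.size() > limit) {
            seen = null;  // too many distinct values: stop counting and free the set
        }
    }

    /** Returns the distinct count, or -1 if the limit was exceeded. */
    int getCardinality() {
        return seen == null ? -1 : seen.size();
    }

    public static void main(String[] args) {
        CardinalitySketch card = new CardinalitySketch(3);
        for (String s : new String[] { "a", "b", "a", null, "c" }) {
            card.accept(s);
        }
        System.out.println("within limit: " + card.getCardinality());
        card.accept("d");
        System.out.println("beyond limit: " + card.getCardinality());
    }
}
```

      Discarding the set once the limit is passed keeps memory bounded for columns with very many distinct values, at the cost of reporting -1 instead of a count.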
    • getQuantiler

      public abstract Quantiler getQuantiler()
      Returns a quantiler ready to provide quantile values, or null if quantiles were not gathered. If a non-null quantiler is returned, the Quantiler.ready() method will already have been called on it.
      Returns:
      ready quantiler, or null
    • getArrayStats

      public abstract UnivariateStats.ArrayStats getArrayStats()
      Returns an object containing statistics applicable to numeric-array-valued columns.
      Returns:
      array stats object, or null
    • createStats

      public static UnivariateStats createStats(Class<?> clazz, Supplier<Quantiler> qSupplier, boolean doCard)
      Factory method to construct an instance of this class for accumulating particular types of values.
      Parameters:
      clazz - class of which all non-null submitted values will be instances
      qSupplier - supplier for an object that can calculate quantiles, or null if quantiles are not required
      doCard - true if an attempt is to be made to count distinct values
      Returns:
      stats accumulator