public abstract class KllSketch extends Object implements QuantilesAPI
KLL is an implementation of a very compact quantiles sketch with a lazy compaction scheme and nearly optimal accuracy per retained quantile.
Reference: Optimal Quantile Approximation in Streams.
The default k of 200 yields a "single-sided" epsilon of about 1.33% and a "double-sided" (PMF) epsilon of about 1.65%, with a confidence of 99%.
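To make the relationship between k and the quoted epsilons concrete, here is a minimal, self-contained sketch. It assumes the error follows a power-law fit of the form c / k^p. The constants and the helper name `approxRankError` are assumptions made for illustration (they are not stated on this page); they were chosen because they reproduce the 1.33% and 1.65% figures quoted above for k = 200. For real work, call `getNormalizedRankError(int k, boolean pmf)` instead.

```java
// Illustrative only: a stand-alone approximation, not the library's code.
final class KllRankErrorSketch {
    // Assumed fit constants for error ~ C / k^P (not taken from this page).
    static final double SINGLE_C = 2.296, SINGLE_P = 0.9723;
    static final double PMF_C = 2.446, PMF_P = 0.9433;

    /** Hypothetical stand-in for getNormalizedRankError(k, pmf). */
    static double approxRankError(int k, boolean pmf) {
        return pmf
            ? PMF_C / Math.pow(k, PMF_P)        // "double-sided" (PMF) epsilon
            : SINGLE_C / Math.pow(k, SINGLE_P); // "single-sided" epsilon
    }

    public static void main(String[] args) {
        long n = 1_000_000L; // example stream length
        for (int k : new int[] {100, 200, 400, 800}) {
            double single = approxRankError(k, false);
            double pmf = approxRankError(k, true);
            // A normalized rank error eps corresponds to roughly eps * n rank positions.
            System.out.printf("k=%d single=%.4f pmf=%.4f approx. rank slack=%d%n",
                k, single, pmf, Math.round(single * n));
        }
    }
}
```

Under these assumptions, doubling k roughly halves the normalized rank error, which is why k is the main knob for trading sketch size against accuracy.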
See Also: QuantilesAPI

| Modifier and Type | Class and Description |
|---|---|
| static class | KllSketch.SketchStructure - Used primarily to define the structure of the serialized sketch. |
| static class | KllSketch.SketchType - Used to define the variable type of the current instance of this class. |

| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_K - The default K |
| static int | MAX_K - The maximum K |

Fields inherited from interface QuantilesAPI: EMPTY_MSG, MEM_REQ_SVR_NULL_MSG, NOT_SINGLE_ITEM_MSG, SELF_MERGE_MSG, TGT_IS_READ_ONLY_MSG, UNSUPPORTED_MSG

| Modifier and Type | Method and Description |
|---|---|
| static int | getKFromEpsilon(double epsilon, boolean pmf) - Gets the approximate k to use given epsilon, the normalized rank error. |
| static int | getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat) - Returns upper bound on the serialized size of a KllSketch given the following parameters. |
| double | getNormalizedRankError(boolean pmf) - Gets the approximate rank error of this sketch normalized as a fraction between zero and one. |
| static double | getNormalizedRankError(int k, boolean pmf) - Gets the normalized rank error given k and pmf. |
| int | getNumRetained() - Gets the number of quantiles retained by the sketch. |
| int | getSerializedSizeBytes() - Returns the current number of bytes this Sketch would require if serialized in compact form. |
| boolean | hasMemory() - Returns true if this sketch's data structure is backed by Memory or WritableMemory. |
| boolean | isCompactMemoryFormat() - Returns true if this sketch is in a Compact Memory Format. |
| boolean | isDirect() - Returns true if this sketch's data structure is off-heap (a.k.a., Direct or Native memory). |
| boolean | isEmpty() - Returns true if this sketch is empty. |
| boolean | isEstimationMode() - Returns true if this sketch is in estimation mode. |
| boolean | isMemoryUpdatableFormat() - Returns true if the backing WritableMemory is in updatable format. |
| boolean | isReadOnly() - Returns true if this sketch is read only. |
| boolean | isSameResource(org.apache.datasketches.memory.Memory that) - Returns true if the backing resource of this is identical with the backing resource of that. |
| abstract void | merge(KllSketch other) - Merges another sketch into this one. |
| String | toString() - Returns a summary of the key parameters of the sketch. |
| abstract String | toString(boolean withLevels, boolean withLevelsAndItems) - Returns human readable summary information about this sketch. |

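The two static helpers `getKFromEpsilon` and `getNormalizedRankError(int k, boolean pmf)` are described as going in opposite directions: one maps a target epsilon to k, the other maps k to epsilon. A minimal sketch of that inverse relationship follows. It assumes a power-law error model c / k^p with fit constants that are assumptions, not values taken from this page. The helper names are hypothetical, and any clamping to the library's valid range of k (such as MAX_K) is deliberately omitted.

```java
// Illustrative only: shows how a k could be derived from a target epsilon
// under an assumed error model. Not the library's implementation.
final class KllKFromEpsilonSketch {

    /** Assumed model: epsilon ~ c / k^p, with different constants for PMF queries. */
    static double approxRankError(int k, boolean pmf) {
        double c = pmf ? 2.446 : 2.296;   // assumed constants
        double p = pmf ? 0.9433 : 0.9723; // assumed exponents
        return c / Math.pow(k, p);
    }

    /** Hypothetical stand-in for getKFromEpsilon(epsilon, pmf). */
    static int approxKFromEpsilon(double epsilon, boolean pmf) {
        double c = pmf ? 2.446 : 2.296;
        double p = pmf ? 0.9433 : 0.9723;
        // Solve epsilon = c / k^p for k, then round up so the target error is met.
        double k = Math.pow(c / epsilon, 1.0 / p);
        return (int) Math.ceil(k);
    }

    public static void main(String[] args) {
        for (double eps : new double[] {0.05, 0.02, 0.01, 0.005}) {
            int kSingle = approxKFromEpsilon(eps, false);
            int kPmf = approxKFromEpsilon(eps, true);
            System.out.printf("eps=%.3f -> k(single-sided)=%d, k(pmf)=%d%n",
                eps, kSingle, kPmf);
        }
    }
}
```

Because the "double-sided" epsilon is larger than the "single-sided" one at the same k, asking for a given epsilon with pmf set to true yields a larger k in this model.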
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface QuantilesAPI: getK, getN, getRankLowerBound, getRankUpperBound, reset

public static final int DEFAULT_K

public static final int MAX_K

public static int getKFromEpsilon(double epsilon, boolean pmf)
Parameters:
epsilon - the normalized rank error between zero and one.
pmf - if true, this function returns the k assuming the input epsilon is the desired "double-sided" epsilon for the getPMF() function. Otherwise, this function returns k assuming the input epsilon is the desired "single-sided" epsilon for all the other queries.

public static int getMaxSerializedSizeBytes(int k, long n, KllSketch.SketchType sketchType, boolean updatableMemFormat)
Parameters:
k - parameter that controls size of the sketch and accuracy of estimates
n - stream length
sketchType - Only DOUBLES_SKETCH and FLOATS_SKETCH are supported for this operation.
updatableMemFormat - true if updatable Memory format, otherwise the standard compact format.

public static double getNormalizedRankError(int k, boolean pmf)
Parameters:
k - the configuration parameter
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.

public final double getNormalizedRankError(boolean pmf)
Specified by: getNormalizedRankError in interface QuantilesAPI
Parameters:
pmf - if true, returns the "double-sided" normalized rank error for the getPMF() function. Otherwise, it is the "single-sided" normalized rank error for all the other queries.

public final int getNumRetained()
Specified by: getNumRetained in interface QuantilesAPI

public int getSerializedSizeBytes()

public boolean hasMemory()
Specified by: hasMemory in interface QuantilesAPI

public boolean isCompactMemoryFormat()

public boolean isDirect()
Specified by: isDirect in interface QuantilesAPI

public final boolean isEmpty()
Specified by: isEmpty in interface QuantilesAPI

public final boolean isEstimationMode()
Specified by: isEstimationMode in interface QuantilesAPI

public final boolean isMemoryUpdatableFormat()

public final boolean isReadOnly()
Specified by: isReadOnly in interface QuantilesAPI

public final boolean isSameResource(org.apache.datasketches.memory.Memory that)
Parameters:
that - A different non-null object

public abstract void merge(KllSketch other)
Parameters:
other - sketch to merge into this one

public final String toString()
Specified by: toString in interface QuantilesAPI
Overrides: toString in class Object

public abstract String toString(boolean withLevels, boolean withLevelsAndItems)
Parameters:
withLevels - if true, includes summary information about the sketch levels array
withLevelsAndItems - if true, includes detail of the levels array and items array together

Copyright © 2015–2024 The Apache Software Foundation. All rights reserved.