Class DataToSketchUDAF

  • All Implemented Interfaces:
    org.apache.hadoop.hive.ql.udf.generic.GenericUDAFResolver, org.apache.hadoop.hive.ql.udf.generic.GenericUDAFResolver2

    public class DataToSketchUDAF
    extends org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver
    Hive UDAF to create a CPCSketch from raw data.

    Note: Strings passed as raw data values are encoded as UTF-16 VARCHAR before being submitted to the sketch. If a different encoding is required for cross-platform compatibility, encode the values before submitting them and type them as BINARY (byte[]).
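
    The encoding note above can be illustrated with a minimal sketch in plain Java. It only shows how the same string yields different raw data depending on the encoding; the class name PreEncodeExample and the helper toUtf8 are hypothetical and not part of this library, and how the resulting bytes reach a BINARY column is left out.

```java
import java.nio.charset.StandardCharsets;

class PreEncodeExample {

    // Hypothetical helper (not part of the library): encodes a string as UTF-8
    // bytes so the value can be supplied as BINARY (byte[]) instead of a string.
    static byte[] toUtf8(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = "h\u00e9llo"; // "hello" with an accented e

        // A Java String is a sequence of UTF-16 code units.
        char[] utf16Units = value.toCharArray();

        // The same text encoded explicitly as UTF-8.
        byte[] utf8Bytes = toUtf8(value);

        System.out.println("UTF-16 code units: " + utf16Units.length); // prints 5
        System.out.println("UTF-8 bytes: " + utf8Bytes.length);        // prints 6
    }
}
```

    Because the two representations differ, systems that feed the same logical string to a sketch in different encodings would not count it as the same item, which is why an agreed encoding matters for cross-platform use.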

    • Constructor Detail

      • DataToSketchUDAF

        public DataToSketchUDAF()
    • Method Detail

      • getEvaluator

        public org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator getEvaluator(org.apache.hadoop.hive.ql.udf.generic.GenericUDAFParameterInfo info)
                                                                                throws org.apache.hadoop.hive.ql.parse.SemanticException
        Validates the number and types of the arguments. DataToSketch expects one to three arguments:
        • The first (required) is the value to add to the sketch and must be a primitive.
        • The second (optional) is the lgK, from 4 to 21 (default 11). It must be a constant integral value.
        • The third (optional) is the update seed.
        Specified by:
        getEvaluator in interface org.apache.hadoop.hive.ql.udf.generic.GenericUDAFResolver2
        Overrides:
        getEvaluator in class org.apache.hadoop.hive.ql.udf.generic.AbstractGenericUDAFResolver
        Parameters:
        info - Parameter info to validate
        Returns:
        The GenericUDAFEvaluator that should be used to calculate the function.
        Throws:
        org.apache.hadoop.hive.ql.parse.SemanticException
        See Also:
        #getEvaluator(org.apache.hadoop.hive.ql.udf.generic.GenericUDAFParameterInfo)
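
        The argument rules described for getEvaluator can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the resolver's actual code: the real method inspects Hive type information, while this sketch takes an argument count and an optional lgK directly. The class ArgumentCheckSketch and the method resolveLgK are hypothetical, IllegalArgumentException stands in for SemanticException so the example has no Hive dependency, and performing the range check at this stage is an assumption.

```java
class ArgumentCheckSketch {

    // Bounds and default taken from the method description above.
    static final int MIN_LG_K = 4;
    static final int MAX_LG_K = 21;
    static final int DEFAULT_LG_K = 11;

    /**
     * Hypothetical validator: checks the argument count and the optional lgK,
     * and returns the lgK that would be used.
     *
     * @param argCount number of arguments passed to the function
     * @param lgK      the second argument, or null when it was not supplied
     */
    static int resolveLgK(int argCount, Integer lgK) {
        if (argCount < 1 || argCount > 3) {
            throw new IllegalArgumentException(
                "Expected between one and three arguments, got " + argCount);
        }
        if (argCount < 2) {
            // Only the value to add was given, so fall back to the default.
            return DEFAULT_LG_K;
        }
        if (lgK == null) {
            throw new IllegalArgumentException("lgK must be a constant integral value");
        }
        if (lgK < MIN_LG_K || lgK > MAX_LG_K) {
            throw new IllegalArgumentException(
                "lgK must be from " + MIN_LG_K + " to " + MAX_LG_K + ", got " + lgK);
        }
        return lgK;
    }

    public static void main(String[] args) {
        System.out.println(resolveLgK(1, null)); // prints 11
        System.out.println(resolveLgK(2, 12));   // prints 12
    }
}
```

        The third argument (the update seed) is omitted here because the description places no constraint on it beyond being optional.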