Class UnionSketchUDF


  • public class UnionSketchUDF
    extends org.apache.hadoop.hive.ql.exec.UDF
    Hive union sketch UDF.
    • Constructor Summary

      Constructors 
      Constructor Description
      UnionSketchUDF()  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      org.apache.hadoop.io.BytesWritable evaluate(org.apache.hadoop.io.BytesWritable firstSketch, org.apache.hadoop.io.BytesWritable secondSketch)
      Main logic called by Hive if sketchSize is not passed in.
      org.apache.hadoop.io.BytesWritable evaluate(org.apache.hadoop.io.BytesWritable firstSketch, org.apache.hadoop.io.BytesWritable secondSketch, int sketchSize)
      Main logic called by Hive if sketchSize is also passed in.
      org.apache.hadoop.io.BytesWritable evaluate(org.apache.hadoop.io.BytesWritable firstSketch, org.apache.hadoop.io.BytesWritable secondSketch, int sketchSize, long seed)
      Main logic called by Hive if sketchSize and seed are also passed in.
      • Methods inherited from class org.apache.hadoop.hive.ql.exec.UDF

        getRequiredFiles, getRequiredJars, getResolver, setResolver
    • Constructor Detail

      • UnionSketchUDF

        public UnionSketchUDF()
    • Method Detail

      • evaluate

        public org.apache.hadoop.io.BytesWritable evaluate(org.apache.hadoop.io.BytesWritable firstSketch,
                                                           org.apache.hadoop.io.BytesWritable secondSketch,
                                                           int sketchSize,
                                                           long seed)
        Main logic called by Hive if sketchSize and seed are also passed in. Unions two sketches built from the same column or from different columns.
        Parameters:
        firstSketch - first sketch to be unioned.
        secondSketch - second sketch to be unioned.
        sketchSize - size of the final unioned output sketch. This must be a power of 2 and larger than 16.
        seed - using the seed is not recommended unless you really know why you need it.
        Returns:
        resulting sketch of union.
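
The sketchSize constraint above (a power of 2 and larger than 16) can be checked on the caller's side before the UDF is invoked. The following is only a minimal sketch: the class SketchSizeCheck and the method isValidSketchSize are hypothetical names, not part of UnionSketchUDF, and the check takes the documented wording literally, so 16 itself is rejected.

```java
// Hypothetical caller-side helper; not part of UnionSketchUDF or its library.
final class SketchSizeCheck {

    // Returns true when sketchSize is a power of 2 and larger than 16,
    // following the wording of the parameter description literally.
    public static boolean isValidSketchSize(int sketchSize) {
        // For a positive int, exactly one set bit means a power of 2.
        return sketchSize > 16 && Integer.bitCount(sketchSize) == 1;
    }

    public static void main(String[] args) {
        System.out.println(isValidSketchSize(4096)); // prints true
        System.out.println(isValidSketchSize(1000)); // prints false (not a power of 2)
        System.out.println(isValidSketchSize(16));   // prints false (not larger than 16)
    }
}
```

How the UDF itself reacts to an invalid size is not stated on this page, so validating early is simply a defensive choice.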
      • evaluate

        public org.apache.hadoop.io.BytesWritable evaluate(org.apache.hadoop.io.BytesWritable firstSketch,
                                                           org.apache.hadoop.io.BytesWritable secondSketch,
                                                           int sketchSize)
        Main logic called by Hive if sketchSize is also passed in. Unions two sketches built from the same column or from different columns.
        Parameters:
        firstSketch - first sketch to be unioned.
        secondSketch - second sketch to be unioned.
        sketchSize - size of the final unioned output sketch. This must be a power of 2 and larger than 16.
        Returns:
        resulting sketch of union.
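
To make the role of sketchSize as a bound on the output concrete, here is a toy model of a size-limited union. It is an illustration only and not the algorithm or data format used by UnionSketchUDF: the class ToyUnionSketch is hypothetical, and a "sketch" is modelled as nothing more than the k smallest hash values seen so far.

```java
import java.util.List;
import java.util.TreeSet;

// Toy stand-in for a mergeable sketch; NOT the UnionSketchUDF implementation.
final class ToyUnionSketch {

    // Merges two toy sketches and keeps at most k of the smallest values,
    // so the result never grows beyond the requested size.
    public static TreeSet<Long> union(TreeSet<Long> first, TreeSet<Long> second, int k) {
        TreeSet<Long> merged = new TreeSet<>(first); // copy, inputs stay untouched
        merged.addAll(second);                       // duplicates collapse automatically
        while (merged.size() > k) {
            merged.pollLast();                       // drop the largest value
        }
        return merged;
    }

    public static void main(String[] args) {
        TreeSet<Long> a = new TreeSet<>(List.of(3L, 10L, 42L));
        TreeSet<Long> b = new TreeSet<>(List.of(7L, 10L, 99L));
        System.out.println(union(a, b, 4)); // prints [3, 7, 10, 42]
    }
}
```

The point of the model is that the two inputs may come from the same column or from different columns; the union only depends on the values they retain and on the requested output size.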
      • evaluate

        public org.apache.hadoop.io.BytesWritable evaluate(org.apache.hadoop.io.BytesWritable firstSketch,
                                                           org.apache.hadoop.io.BytesWritable secondSketch)
        Main logic called by Hive if sketchSize is not passed in. Unions two sketches built from the same column or from different columns.
        Parameters:
        firstSketch - first sketch to be unioned.
        secondSketch - second sketch to be unioned.
        Returns:
        resulting sketch of union.