Class EstimateSketchSimilarityUDF


  • public class EstimateSketchSimilarityUDF
    extends org.apache.hadoop.hive.ql.exec.UDF
    Hive UDF that estimates the similarity of two sketches.
    • Constructor Detail

      • EstimateSketchSimilarityUDF

        public EstimateSketchSimilarityUDF()
    • Method Detail

      • evaluate

        public double evaluate(org.apache.hadoop.io.BytesWritable firstSketchBytes,
                               org.apache.hadoop.io.BytesWritable secondSketchBytes)
        Main logic called by Hive. Computes the Jaccard similarity of two sketches, which may be built from the same column or from different columns.
        Parameters:
        firstSketchBytes - first sketch to be compared.
        secondSketchBytes - second sketch to be compared.
        Returns:
        the estimated Jaccard similarity of the two sketches
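
        For orientation, the quantity being estimated is the Jaccard similarity |A ∩ B| / |A ∪ B| of the two underlying sets of values. The following is a minimal sketch that computes it exactly on small in-memory sets. It is not this UDF's implementation, which works on serialized sketches and returns an estimate. The class name JaccardExample, the method jaccard, and the choice to return 1.0 for two empty sets are assumptions made for illustration only.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of the library documented here.
public class JaccardExample {

    // Exact Jaccard similarity of two sets: |A ∩ B| / |A ∪ B|.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            // Convention assumed for this example: two empty sets are identical.
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);   // keep only elements present in both sets
        Set<String> union = new HashSet<>(a);
        union.addAll(b);             // all distinct elements from either set
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        Set<String> first = Set.of("a", "b", "c");
        Set<String> second = Set.of("b", "c", "d");
        // Intersection {b, c} has 2 elements, union {a, b, c, d} has 4.
        System.out.println(jaccard(first, second)); // prints 0.5
    }
}
```

        A value of 1.0 means the two sets are identical and 0.0 means they share no elements. A sketch-based estimate approximates this ratio without materializing the full sets.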