Examples of BloomFilter

com.aelitis.azureus.core.util.bloom.BloomFilter
com.bitsofproof.supernode.common.BloomFilter
com.codecademy.eventhub.base.BloomFilter
com.foundationdb.util.BloomFilter
com.jme3.post.filters.BloomFilter
See advanced:bloom_and_glow for more details. @author Rémy Bouquet aka Nehon
it.unimi.dsi.util.BloomFilter
joshua.decoder.ff.lm.bloomfilter_lm.BloomFilter
A Bloom filter is a lossy data structure for set representation. It consists of a bit set and a set of hash functions, and it has two operations: add and query. We can add an object to a Bloom filter to indicate that it should be considered part of the set that the Bloom filter represents. We can query the Bloom filter to see if a given object is considered part of its set.
An object is added by sending it through a number of hash functions, each of which returns an index into the bit set. The bit at each of those indices is switched on. We can query for an object by sending it through the same hash functions and then looking at the bit at each index that a hash function returned. If any of the bits is unset, we know that the object is not in the Bloom filter (otherwise all the bits would already have been set). If all the bits are set, we assume that the object is present in the Bloom filter.
We cannot know for sure that an object is in the Bloom filter just because all its bits are set. There may be many collisions in the hash space, and all the bits for some object might be set by chance, rather than by adding that particular object.
The advantage of a Bloom filter is that its set representation can be stored in significantly less space than the information-theoretic lower bound for a lossless representation. The price we pay for this is a certain amount of error in the query function. One nice feature of the Bloom filter is that its error is one-sided. This means that while the query function may return false positives (saying an object is present when it really isn't), it can never return false negatives (saying that an object is not present when it was already added).
org.apache.cassandra.utils.BloomFilter
org.apache.hadoop.hbase.migration.nineteen.onelab.filter.BloomFilter
European Commission One-Lab Project 034819. @version 1.0 - 2 Feb. 07
org.apache.hadoop.hbase.util.BloomFilter
European Commission One-Lab Project 034819.
It must be extended in order to define the real behavior.
org.apache.hadoop.util.bloom.BloomFilter
European Commission One-Lab Project 034819. @see Filter The general behavior of a filter @see Space/Time Trade-Offs in Hash Coding with Allowable Errors
org.deuce.transaction.tl2.BloomFilter
Implements Bloom filter map @author Guy Korland @since 1.0
org.elasticsearch.common.bloom.BloomFilter
org.elasticsearch.common.util.BloomFilter
A Bloom filter. Inspired by the Guava Bloom filter implementation, with some optimizations.
org.onelab.filter.BloomFilter
European Commission One-Lab Project 034819. @version 1.0 - 2 Feb. 07 @see org.onelab.filter.Filter The general behavior of a filter @see Space/Time Trade-Offs in Hash Coding with Allowable Errors
org.pathways.openciss.rest.impl.BloomFilter
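
The add and query operations described in the joshua.decoder.ff.lm.bloomfilter_lm.BloomFilter entry above can be sketched in a few lines of plain Java. This is only a minimal illustration, not the code of any library in the list: the class name SimpleBloomFilter, its method names, and the FNV-1a-style double hashing used to derive the bit indices are all assumptions chosen for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Illustrative Bloom filter. All names here are hypothetical. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Switches on the bit at every index produced by the hash functions. */
    void add(String item) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(item, i));
        }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    // Assumption: the i-th index is derived by double hashing, h1 + i * h2.
    private int index(String item, int i) {
        byte[] data = item.getBytes(StandardCharsets.UTF_8);
        int h1 = mix(data, 0x811C9DC5);
        int h2 = mix(data, 0x01000193) | 1; // forced odd so the stride is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // FNV-1a-style mixing with a caller-supplied starting value.
    private static int mix(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    public static void main(String[] args) {
        SimpleBloomFilter f = new SimpleBloomFilter(1 << 16, 4);
        f.add("New Jersey");
        // An added item is always reported as possibly present (no false negatives).
        System.out.println(f.mightContain("New Jersey"));
        // An item that was never added is usually rejected, but a false positive is possible.
        System.out.println(f.mightContain("Michigan"));
    }
}
```

Because bits are only ever switched on, a query for an added item can never fail. That is the one-sided error the description refers to.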

Examples of org.apache.hadoop.hbase.util.BloomFilter

    public boolean passesGeneralBloomFilter(byte[] row, int rowOffset,
        int rowLen, byte[] col, int colOffset, int colLen) {
      // Cache Bloom filter as a local variable in case it is set to null by
      // another thread on an IO error.
      BloomFilter bloomFilter = this.generalBloomFilter;
      if (bloomFilter == null) {
        return true;
      }


      byte[] key;
      switch (bloomFilterType) {
        case ROW:
          if (col != null) {
            throw new RuntimeException("Row-only Bloom filter called with " +
                "column specified");
          }
          if (rowOffset != 0 || rowLen != row.length) {
              throw new AssertionError("For row-only Bloom filters the row "
                  + "must occupy the whole array");
          }
          key = row;
          break;


        case ROWCOL:
          key = bloomFilter.createBloomKey(row, rowOffset, rowLen, col,
              colOffset, colLen);
          break;


        default:
          return true;
      }


      // Empty file
      if (reader.getTrailer().getEntryCount() == 0)
        return false;


      try {
        boolean shouldCheckBloom;
        ByteBuffer bloom;
        if (bloomFilter.supportsAutoLoading()) {
          bloom = null;
          shouldCheckBloom = true;
        } else {
          bloom = reader.getMetaBlock(HFile.BLOOM_FILTER_DATA_KEY,
              true);
          shouldCheckBloom = bloom != null;
        }


        if (shouldCheckBloom) {
          boolean exists;


          // Whether the primary Bloom key is greater than the last Bloom key
          // from the file info. For row-column Bloom filters this is not yet
          // a sufficient condition to return false.
          boolean keyIsAfterLast = lastBloomKey != null
              && bloomFilter.getComparator().compareFlatKey(key, lastBloomKey) > 0;


          if (bloomFilterType == BloomType.ROWCOL) {
            // Since a Row Delete is essentially a DeleteFamily applied to all
            // columns, a file might be skipped if using row+col Bloom filter.
            // In order to ensure this file is included an additional check is
            // required looking only for a row bloom.
            byte[] rowBloomKey = bloomFilter.createBloomKey(row, 0, row.length,
                null, 0, 0);


            if (keyIsAfterLast
                && bloomFilter.getComparator().compareFlatKey(rowBloomKey,
                    lastBloomKey) > 0) {
              exists = false;
            } else {
              exists =
                  bloomFilter.contains(key, 0, key.length, bloom) ||
                  bloomFilter.contains(rowBloomKey, 0, rowBloomKey.length,
                      bloom);
            }
          } else {
            exists = !keyIsAfterLast
                && bloomFilter.contains(key, 0, key.length, bloom);
          }


          return exists;
        }
      } catch (IOException e) {
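
The row-only path of the excerpt above combines two cheap checks: a key that sorts after the last key recorded for the file cannot be in it, and a missing filter means the file must be read anyway. The following is a rough sketch of that idea, not HBase code. It assumes plain unsigned lexicographic byte ordering in place of the comparator returned by getComparator(), and a java.util.function.Predicate standing in for the Bloom filter; RangeGuardedLookup and its methods are hypothetical names.

```java
import java.util.function.Predicate;

/** Hypothetical sketch of a range guard placed in front of a Bloom filter. */
final class RangeGuardedLookup {
    private RangeGuardedLookup() {}

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns false only when the key is certainly absent.
     * lastKey may be null when the file records no last key.
     * bloom may be null when no filter is available.
     */
    static boolean mayContain(byte[] key, byte[] lastKey, Predicate<byte[]> bloom) {
        if (bloom == null) {
            return true; // fail open: without a filter the file has to be checked
        }
        // A key beyond the largest stored key cannot be present.
        boolean keyIsAfterLast = lastKey != null && compareUnsigned(key, lastKey) > 0;
        return !keyIsAfterLast && bloom.test(key);
    }

    public static void main(String[] args) {
        Predicate<byte[]> alwaysMaybe = k -> true;
        byte[] last = {0x10};
        System.out.println(mayContain(new byte[] {0x20}, last, alwaysMaybe)); // beyond last key
        System.out.println(mayContain(new byte[] {0x05}, last, alwaysMaybe)); // within range
    }
}
```

The guard costs one comparison and avoids touching the filter at all for out-of-range keys.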

Examples of org.apache.hadoop.hbase.util.BloomFilter


    public boolean passesDeleteFamilyBloomFilter(byte[] row, int rowOffset,
        int rowLen) {
      // Cache Bloom filter as a local variable in case it is set to null by
      // another thread on an IO error.
      BloomFilter bloomFilter = this.deleteFamilyBloomFilter;


      // Empty file or there is no delete family at all
      if (reader.getTrailer().getEntryCount() == 0 || deleteFamilyCnt == 0) {
        return false;
      }


      if (bloomFilter == null) {
        return true;
      }


      try {
        if (!bloomFilter.supportsAutoLoading()) {
          return true;
        }
        return bloomFilter.contains(row, rowOffset, rowLen, null);
      } catch (IllegalArgumentException e) {
        LOG.error("Bad Delete Family bloom filter data -- proceeding without",
            e);
        setDeleteFamilyBloomFilterFaulty();
      }

Examples of org.apache.hadoop.hbase.util.BloomFilter

          return true;
      }


      // Cache Bloom filter as a local variable in case it is set to null by
      // another thread on an IO error.
      BloomFilter bloomFilter = this.generalBloomFilter;


      if (bloomFilter == null) {
        return true;
      }


      // Empty file
      if (reader.getTrailer().getEntryCount() == 0)
        return false;


      try {
        boolean shouldCheckBloom;
        ByteBuffer bloom;
        if (bloomFilter.supportsAutoLoading()) {
          bloom = null;
          shouldCheckBloom = true;
        } else {
          bloom = reader.getMetaBlock(HFileWriterV1.BLOOM_FILTER_DATA_KEY,
              true);
          shouldCheckBloom = bloom != null;
        }


        if (shouldCheckBloom) {
          boolean exists;


          // Whether the primary Bloom key is greater than the last Bloom key
          // from the file info. For row-column Bloom filters this is not yet
          // a sufficient condition to return false.
          boolean keyIsAfterLast = lastBloomKey != null
              && bloomFilter.getComparator().compare(key, lastBloomKey) > 0;


          if (bloomFilterType == BloomType.ROWCOL) {
            // Since a Row Delete is essentially a DeleteFamily applied to all
            // columns, a file might be skipped if using row+col Bloom filter.
            // In order to ensure this file is included an additional check is
            // required looking only for a row bloom.
            byte[] rowBloomKey = bloomFilter.createBloomKey(row, 0, row.length,
                null, 0, 0);


            if (keyIsAfterLast
                && bloomFilter.getComparator().compare(rowBloomKey,
                    lastBloomKey) > 0) {
              exists = false;
            } else {
              exists =
                  bloomFilter.contains(key, 0, key.length, bloom) ||
                  bloomFilter.contains(rowBloomKey, 0, rowBloomKey.length,
                      bloom);
            }
          } else {
            exists = !keyIsAfterLast
                && bloomFilter.contains(key, 0, key.length, bloom);
          }


          getSchemaMetrics().updateBloomMetrics(exists);
          return exists;
        }

Examples of org.apache.hadoop.hbase.util.BloomFilter


    System.out.println("Mid-key: " + Bytes.toStringBinary(reader.midkey()));


    // Printing general bloom information
    DataInput bloomMeta = reader.getGeneralBloomFilterMetadata();
    BloomFilter bloomFilter = null;
    if (bloomMeta != null)
      bloomFilter = BloomFilterFactory.createFromMeta(bloomMeta, reader);


    System.out.println("Bloom filter:");
    if (bloomFilter != null) {
      System.out.println(FOUR_SPACES + bloomFilter.toString().replaceAll(
          ByteBloomFilter.STATS_RECORD_SEP, "\n" + FOUR_SPACES));
    } else {
      System.out.println(FOUR_SPACES + "Not present");
    }


    // Printing delete bloom information
    bloomMeta = reader.getDeleteBloomFilterMetadata();
    bloomFilter = null;
    if (bloomMeta != null)
      bloomFilter = BloomFilterFactory.createFromMeta(bloomMeta, reader);


    System.out.println("Delete Family Bloom filter:");
    if (bloomFilter != null) {
      System.out.println(FOUR_SPACES
          + bloomFilter.toString().replaceAll(ByteBloomFilter.STATS_RECORD_SEP,
              "\n" + FOUR_SPACES));
    } else {
      System.out.println(FOUR_SPACES + "Not present");
    }
  }

Examples of org.apache.hadoop.util.bloom.BloomFilter

        ArgumentCaptor<BloomFilter> argument = ArgumentCaptor.forClass(BloomFilter.class);
        verify(context).write(
                any(),
                argument.capture());


        BloomFilter f=argument.getValue();
        assertFalse(f.membershipTest(BloomReducer.toKey("Michigan")));
        assertTrue(f.membershipTest(BloomReducer.toKey("New Jersey")));
        assertTrue(f.membershipTest(BloomReducer.toKey("New Mexico")));
        assertTrue(f.membershipTest(BloomReducer.toKey("Lady Gaga")));
        assertTrue(f.membershipTest(BloomReducer.toKey("Beyonce")));
        assertFalse(f.membershipTest(BloomReducer.toKey("Olivia Newton-John")));
    }

Examples of org.apache.hadoop.util.bloom.BloomFilter

        assertFalse(f.membershipTest(BloomReducer.toKey("Olivia Newton-John")));
    }


    @Test
    public void justBloom() {
        BloomFilter f=new BloomFilter(100000,10, Hash.parseHashType("murmur"));
        f.add(new Key(new Text("New Jersey").getBytes()));
        assertTrue(f.membershipTest(new Key(new Text("New Jersey").getBytes())));
    }

Examples of org.apache.hadoop.util.bloom.BloomFilter

        super.setup(context);
        Configuration c=context.getConfiguration();
        int vectorSize=c.getInt(VECTOR_SIZE,0);
        int nbHash=c.getInt(NB_HASH,0);
        String hashType=c.get(HASH_TYPE, "murmur");
        f=new BloomFilter(vectorSize,nbHash, Hash.parseHashType(hashType));
    }
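
The excerpt above reads vectorSize and nbHash from the job configuration but does not show how those values are chosen. One common approach is the standard sizing rule for Bloom filters: for n expected items and a target false-positive probability p, use about m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The helper below is a sketch of that arithmetic only. BloomSizing and its method names are hypothetical and are not part of Hadoop.

```java
/** Hypothetical sizing helper, not part of Hadoop. */
final class BloomSizing {
    private BloomSizing() {}

    /** Number of bits: ceil(-n * ln(p) / (ln 2)^2), capped at Integer.MAX_VALUE. */
    static int optimalVectorSize(long expectedItems, double fpp) {
        if (expectedItems <= 0 || fpp <= 0.0 || fpp >= 1.0) {
            throw new IllegalArgumentException("need expectedItems > 0 and 0 < fpp < 1");
        }
        double m = -expectedItems * Math.log(fpp) / (Math.log(2) * Math.log(2));
        return (int) Math.min(Integer.MAX_VALUE, (long) Math.ceil(m));
    }

    /** Number of hash functions: round((m / n) * ln 2), never less than 1. */
    static int optimalNbHash(long expectedItems, int vectorSize) {
        if (expectedItems <= 0 || vectorSize <= 0) {
            throw new IllegalArgumentException("need expectedItems > 0 and vectorSize > 0");
        }
        double k = (double) vectorSize / expectedItems * Math.log(2);
        return Math.max(1, (int) Math.round(k));
    }

    public static void main(String[] args) {
        int vectorSize = optimalVectorSize(10_000, 0.01);
        int nbHash = optimalNbHash(10_000, vectorSize);
        // For a 1% target this works out to a little under 10 bits per item.
        System.out.println("vectorSize=" + vectorSize + ", nbHash=" + nbHash);
    }
}
```

The two results could then be stored under the VECTOR_SIZE and NB_HASH configuration keys that the excerpt reads. Note that the excerpt's default of 0 for both is not a usable size.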

Examples of org.deuce.transaction.tl2.BloomFilter

/**
 * @since 0.4
 */
public class BloomFilterTest extends TestCase {


  public void testCheckInFilter(){
    BloomFilter filter = new BloomFilter();
    filter.add(34254354);
    Assert.assertTrue(filter.contains(34254354));
  }

Examples of org.elasticsearch.common.bloom.BloomFilter

            // no version, get the version from the index, we know that we refresh on flush
            Searcher searcher = searcher();
            try {
                UnicodeUtil.UTF8Result utf8 = Unicode.fromStringAsUtf8(get.uid().text());
                for (IndexReader reader : searcher.searcher().subReaders()) {
                    BloomFilter filter = bloomCache.filter(reader, UidFieldMapper.NAME, asyncLoadBloomFilter);
                    // we know that its not there...
                    if (!filter.isPresent(utf8.result, 0, utf8.length)) {
                        continue;
                    }
                    UidField.DocIdAndVersion docIdAndVersion = UidField.loadDocIdAndVersion(reader, get.uid());
                    if (docIdAndVersion != null && docIdAndVersion.docId != Lucene.NO_DOC) {
                        return new GetResult(searcher, docIdAndVersion);

Examples of org.elasticsearch.common.util.BloomFilter


    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();
        final int ELEMENTS = (int) SizeValue.parseSizeValue("1m").singles();
        final double fpp = 0.01;
        BloomFilter gFilter = BloomFilter.create(ELEMENTS, fpp);
        System.out.println("G SIZE: " + new ByteSizeValue(gFilter.getSizeInBytes()));


        FuzzySet lFilter = FuzzySet.createSetBasedOnMaxMemory((int) gFilter.getSizeInBytes());
        //FuzzySet lFilter = FuzzySet.createSetBasedOnQuality(ELEMENTS, 0.97f);


        for (int i = 0; i < ELEMENTS; i++) {
            BytesRef bytesRef = new BytesRef(Strings.randomBase64UUID(random));
            gFilter.put(bytesRef);
            lFilter.addValue(bytesRef);
        }


        int lFalse = 0;
        int gFalse = 0;
        for (int i = 0; i < ELEMENTS; i++) {
            BytesRef bytesRef = new BytesRef(Strings.randomBase64UUID(random));
            if (gFilter.mightContain(bytesRef)) {
                gFalse++;
            }
            if (lFilter.contains(bytesRef) == FuzzySet.ContainsResult.MAYBE) {
                lFalse++;
            }
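
The second loop above probes each filter with freshly generated keys, so nearly every hit it counts is a false positive. To judge whether a measured count such as gFalse is plausible, it can be compared with the usual approximation for a Bloom filter with m bits, k hash functions and n inserted items: (1 - e^(-kn/m))^k. The sketch below only does that arithmetic. BloomFalsePositiveEstimate is a hypothetical helper, not an Elasticsearch or Lucene class, and the approximation assumes well-distributed, independent hash functions.

```java
/** Hypothetical helper for sanity-checking a measured false-positive count. */
final class BloomFalsePositiveEstimate {
    private BloomFalsePositiveEstimate() {}

    /** Approximate false-positive probability: (1 - e^(-k*n/m))^k. */
    static double expectedRate(long numBits, int numHashes, long insertedItems) {
        if (numBits <= 0 || numHashes <= 0 || insertedItems < 0) {
            throw new IllegalArgumentException("need numBits > 0, numHashes > 0, insertedItems >= 0");
        }
        // Fraction of bits expected to be set after all insertions.
        double fractionSet = 1.0 - Math.exp(-(double) numHashes * insertedItems / numBits);
        // A false positive needs all k probed bits to be set.
        return Math.pow(fractionSet, numHashes);
    }

    /** Observed rate from a probe loop that counts hits on never-inserted keys. */
    static double measuredRate(int falsePositives, int probes) {
        if (probes <= 0) {
            throw new IllegalArgumentException("probes must be positive");
        }
        return (double) falsePositives / probes;
    }

    public static void main(String[] args) {
        // Roughly 9.6 bits per item with 7 hashes should land near a 1% rate.
        System.out.println(expectedRate(95_851, 7, 10_000));
        System.out.println(measuredRate(10, 1_000));
    }
}
```

A measured rate far above the estimate usually points to an undersized filter or poorly distributed hashes.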
