Examples of Tokenizer

  • org.apache.jena.riot.tokens.Tokenizer
  • org.apache.lucene.analysis.Tokenizer
    A Tokenizer is a TokenStream whose input is a Reader.

    This is an abstract class.

    NOTE: Subclasses must override {@link #incrementToken()} if the new TokenStream API is used, and {@link #next(Token)} or {@link #next()} if the old TokenStream API is used.

    NOTE: Subclasses overriding {@link #incrementToken()} must call {@link AttributeSource#clearAttributes()} before setting attributes. Subclasses overriding {@link #next(Token)} must call {@link Token#clear()} before setting Token attributes.

  • org.apache.myfaces.trinidadinternal.el.Tokenizer
    Converts an EL expression into tokens. @author The Oracle ADF Faces Team
  • org.apache.uima.lucas.indexer.Tokenizer
  • org.crsh.cli.impl.tokenizer.Tokenizer
  • org.eclipse.orion.server.cf.manifest.v2.Tokenizer
  • org.eclipse.osgi.framework.internal.core.Tokenizer
    Simple tokenizer class. Used to parse data.
  • org.exist.storage.analysis.Tokenizer
  • org.geoserver.ows.util.KvpUtils.Tokenizer
  • org.hsqldb.Tokenizer
    Provides the ability to tokenize SQL character sequences. Extensively rewritten and extended in successive versions of HSQLDB. @author Thomas Mueller (Hypersonic SQL Group) @version 1.8.0 @since Hypersonic SQL
  • org.jboss.dna.common.text.TokenStream.Tokenizer
  • org.jboss.forge.shell.command.parser.Tokenizer
    @author Lincoln Baxter, III
  • org.jstripe.tokenizer.Tokenizer
  • org.languagetool.tokenizers.Tokenizer
    Interface for classes that tokenize text into smaller units. @author Daniel Naber
  • org.modeshape.common.text.TokenStream.Tokenizer
  • org.openjena.riot.tokens.Tokenizer
  • org.radargun.utils.Tokenizer
    Tokenizer that allows string delims instead of char delims @author Radim Vansa <rvansa@redhat.com>
  • org.sonatype.maven.polyglot.atom.parsing.Tokenizer
    Taken from the Loop programming language compiler pipeline. @author dhanji@gmail.com (Dhanji R. Prasanna)
  • org.spoofax.jsglr.client.imploder.Tokenizer
  • org.supercsv_voltpatches.tokenizer.Tokenizer
    Reads the CSV file, line by line. If you want the line-reading functionality of this class, but want to define your own implementation of {@link #readColumns(List)}, then consider writing your own Tokenizer by extending AbstractTokenizer. @author Kasper B. Graversen @author James Bassett
  • org.zkoss.selector.lang.Tokenizer
    @author simonpai
  • weka.core.tokenizers.Tokenizer
    A superclass for all tokenizer algorithms. @author FracPete (fracpete at waikato dot ac dot nz) @version $Revision: 1.3 $
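
    The contract quoted above for org.apache.lucene.analysis.Tokenizer (a token stream whose input is a Reader, whose subclasses override incrementToken() and clear attributes before setting them) can be illustrated with a minimal sketch in plain Java. This is not Lucene code: SketchTokenStream, SketchTokenizer, WhitespaceSketchTokenizer and TokenizerSketch are hypothetical stand-ins, and a single StringBuilder plays the role of the attribute state so that the example runs without any library on the classpath.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token stream; NOT the Lucene class. */
abstract class SketchTokenStream {
    // Stand-in for per-token attribute state (here: only the term text).
    protected final StringBuilder term = new StringBuilder();

    /** Resets per-token state; plays the role of clearAttributes(). */
    protected final void clearAttributes() {
        term.setLength(0);
    }

    /** Advances to the next token; returns false once input is exhausted. */
    public abstract boolean incrementToken() throws IOException;

    public String term() {
        return term.toString();
    }
}

/** Hypothetical stand-in: a token stream whose input is a Reader. */
abstract class SketchTokenizer extends SketchTokenStream {
    protected final Reader input;

    protected SketchTokenizer(Reader input) {
        this.input = input;
    }
}

/** Splits the input on whitespace, one token per incrementToken() call. */
final class WhitespaceSketchTokenizer extends SketchTokenizer {
    WhitespaceSketchTokenizer(Reader input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        // Clear the previous token's state before setting anything new.
        clearAttributes();
        int c = input.read();
        // Skip leading whitespace.
        while (c != -1 && Character.isWhitespace(c)) {
            c = input.read();
        }
        if (c == -1) {
            return false; // end of input, no further token
        }
        // Accumulate characters until the next whitespace or end of input.
        while (c != -1 && !Character.isWhitespace(c)) {
            term.append((char) c);
            c = input.read();
        }
        return true;
    }
}

public class TokenizerSketch {
    /** Drains the sketch tokenizer into a list of term strings. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        SketchTokenizer tokenizer = new WhitespaceSketchTokenizer(new StringReader(text));
        try {
            while (tokenizer.incrementToken()) {
                terms.add(tokenizer.term());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("  new TokenStream  API ")); // prints [new, TokenStream, API]
    }
}
```

    Without the clearAttributes() call at the top of incrementToken(), each term would still carry the text of the previous one, which is the kind of leftover state the second NOTE warns about.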

  • Examples of org.apache.lucene.analysis.Tokenizer

        final int codeLen = _TestUtil.nextInt(random(), 1, 8);
        Analyzer a = new Analyzer() {

          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            return new TokenStreamComponents(tokenizer, new DoubleMetaphoneFilter(tokenizer, codeLen, false));
          }
         
        };
        checkRandomData(random(), a, 1000 * RANDOM_MULTIPLIER);
       
        Analyzer b = new Analyzer() {

          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            return new TokenStreamComponents(tokenizer, new DoubleMetaphoneFilter(tokenizer, codeLen, true));
          }
         
        };
        checkRandomData(random(), b, 1000 * RANDOM_MULTIPLIER);

    Examples of org.apache.lucene.analysis.Tokenizer

     
      public void testEmptyTerm() throws IOException {
        Analyzer a = new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new KeywordTokenizer(reader);
            return new TokenStreamComponents(tokenizer, new DoubleMetaphoneFilter(tokenizer, 8, random().nextBoolean()));
          }
        };
        checkOneTerm(a, "", "");
      }

    Examples of org.apache.lucene.analysis.Tokenizer

        }
      }
     
      static void assertAlgorithm(String algName, String inject, String input,
          String[] expected) throws Exception {
        Tokenizer tokenizer = new MockTokenizer(new StringReader(input), MockTokenizer.WHITESPACE, false);
        Map<String,String> args = new HashMap<String,String>();
        args.put("encoder", algName);
        args.put("inject", inject);
        PhoneticFilterFactory factory = new PhoneticFilterFactory(args);
        factory.inform(new ClasspathResourceLoader(factory.getClass()));

    Examples of org.apache.lucene.analysis.Tokenizer

      static private Analyzer newTestAnalyzer() {
        return new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            return new TokenStreamComponents(tokenizer, tokenizer);
          }

          @Override
          protected Reader initReader(String fieldName, Reader reader) {

    Examples of org.apache.lucene.analysis.Tokenizer

    */
    public class TestElision extends BaseTokenStreamTestCase {

      public void testElision() throws Exception {
        String test = "Plop, juste pour voir l'embrouille avec O'brian. M'enfin.";
        Tokenizer tokenizer = new StandardTokenizer(TEST_VERSION_CURRENT, new StringReader(test));
        CharArraySet articles = new CharArraySet(TEST_VERSION_CURRENT, asSet("l", "M"), false);
        TokenFilter filter = new ElisionFilter(tokenizer, articles);
        List<String> tas = filter(filter);
        assertEquals("embrouille", tas.get(4));
        assertEquals("O'brian", tas.get(6));

    Examples of org.apache.lucene.analysis.Tokenizer

     
      public void testEmptyTerm() throws IOException {
        Analyzer a = new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new KeywordTokenizer(reader);
            return new TokenStreamComponents(tokenizer, new ElisionFilter(tokenizer, FrenchAnalyzer.DEFAULT_ARTICLES));
          }
        };
        checkOneTerm(a, "", "");
      }

    Examples of org.apache.lucene.analysis.Tokenizer

      public void testRandom() throws Exception {
        Analyzer analyzer = new Analyzer() {

          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            return new TokenStreamComponents(tokenizer, tokenizer);
          }

          @Override
          protected Reader initReader(String fieldName, Reader reader) {

    Examples of org.apache.lucene.analysis.Tokenizer

        final NormalizeCharMap map = builder.build();

        Analyzer analyzer = new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            return new TokenStreamComponents(tokenizer, tokenizer);
          }

          @Override
          protected Reader initReader(String fieldName, Reader reader) {

    Examples of org.apache.lucene.analysis.Tokenizer

          @Override
          protected TokenStreamComponents createComponents(String field, Reader reader) {
            final CharArraySet keywords = new CharArraySet(version, 1, false);
            keywords.add("liście");

            final Tokenizer src = new StandardTokenizer(TEST_VERSION_CURRENT, reader);
            TokenStream result = new StandardFilter(TEST_VERSION_CURRENT, src);
            result = new SetKeywordMarkerFilter(result, keywords);
            result = new MorfologikFilter(result, TEST_VERSION_CURRENT);

            return new TokenStreamComponents(src, result);

    Examples of org.apache.lucene.analysis.Tokenizer

      // so in this case we behave like WDF, and preserve any modified offsets
      public void testInvalidOffsets() throws Exception {
        Analyzer analyzer = new Analyzer() {
          @Override
          protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer tokenizer = new MockTokenizer(reader, MockTokenizer.WHITESPACE, false);
            TokenFilter filters = new ASCIIFoldingFilter(tokenizer);
            filters = new NGramTokenFilter(TEST_VERSION_CURRENT, filters, 2, 2);
            return new TokenStreamComponents(tokenizer, filters);
          }
        };
    Copyright © 2018 www.massapi.com. All rights reserved.
    All source code is the property of its respective owners. Java is a trademark of Sun Microsystems, Inc. and owned by ORACLE Inc. Contact coftware#gmail.com.