Examples of Tokenizer

  • org.apache.jena.riot.tokens.Tokenizer
  • org.apache.lucene.analysis.Tokenizer
    A Tokenizer is a TokenStream whose input is a Reader.

    This is an abstract class.

    NOTE: subclasses must override {@link #incrementToken()} if the new TokenStream API is used, and {@link #next(Token)} or {@link #next()} if the old TokenStream API is used.

    NOTE: Subclasses overriding {@link #incrementToken()} must call {@link AttributeSource#clearAttributes()} before setting attributes. Subclasses overriding {@link #next(Token)} must call {@link Token#clear()} before setting Token attributes.

  • org.apache.myfaces.trinidadinternal.el.Tokenizer
    Converts an EL expression into tokens. @author The Oracle ADF Faces Team
  • org.apache.uima.lucas.indexer.Tokenizer
  • org.crsh.cli.impl.tokenizer.Tokenizer
  • org.eclipse.orion.server.cf.manifest.v2.Tokenizer
  • org.eclipse.osgi.framework.internal.core.Tokenizer
    Simple tokenizer class. Used to parse data.
  • org.exist.storage.analysis.Tokenizer
  • org.geoserver.ows.util.KvpUtils.Tokenizer
  • org.hsqldb.Tokenizer
    Provides the ability to tokenize SQL character sequences. Extensively rewritten and extended in successive versions of HSQLDB. @author Thomas Mueller (Hypersonic SQL Group) @version 1.8.0 @since Hypersonic SQL
  • org.jboss.dna.common.text.TokenStream.Tokenizer
  • org.jboss.forge.shell.command.parser.Tokenizer
    @author Lincoln Baxter, III
  • org.jstripe.tokenizer.Tokenizer
  • org.languagetool.tokenizers.Tokenizer
    Interface for classes that tokenize text into smaller units. @author Daniel Naber
  • org.modeshape.common.text.TokenStream.Tokenizer
  • org.openjena.riot.tokens.Tokenizer
  • org.radargun.utils.Tokenizer
    Tokenizer that allows string delimiters instead of char delimiters. @author Radim Vansa <rvansa@redhat.com>
  • org.sonatype.maven.polyglot.atom.parsing.Tokenizer
    Taken from the Loop programming language compiler pipeline. @author dhanji@gmail.com (Dhanji R. Prasanna)
  • org.spoofax.jsglr.client.imploder.Tokenizer
  • org.supercsv_voltpatches.tokenizer.Tokenizer
    Reads the CSV file, line by line. If you want the line-reading functionality of this class, but want to define your own implementation of {@link #readColumns(List)}, then consider writing your own Tokenizer by extending AbstractTokenizer. @author Kasper B. Graversen @author James Bassett
  • org.zkoss.selector.lang.Tokenizer
    @author simonpai
  • weka.core.tokenizers.Tokenizer
    A superclass for all tokenizer algorithms. @author FracPete (fracpete at waikato dot ac dot nz) @version $Revision: 1.3 $
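    The org.apache.lucene.analysis.Tokenizer entry above describes an incremental contract: each call to incrementToken() must first clear the per-token attribute state, then fill it for the next token. The following is a plain-Java sketch of that pattern, not Lucene's actual API; the class name WhitespaceTokenStream, the term() accessor, and the StringBuilder "attribute" are illustrative stand-ins for Lucene's attribute machinery.

    ```java
    import java.io.IOException;
    import java.io.Reader;
    import java.io.StringReader;

    // Hypothetical analogue of a Tokenizer: a stream over a Reader that
    // advances one token per call, clearing per-token state first
    // (mirroring the clearAttributes()-before-setting rule quoted above).
    public class WhitespaceTokenStream {
        private final Reader input;
        private final StringBuilder term = new StringBuilder(); // current token "attribute"

        public WhitespaceTokenStream(Reader input) {
            this.input = input;
        }

        /** Advances to the next token; returns false at end of input. */
        public boolean incrementToken() throws IOException {
            term.setLength(0); // clear per-token state before setting it
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (term.length() > 0) return true; // token complete
                } else {
                    term.append((char) c);
                }
            }
            return term.length() > 0; // emit the final token, if any
        }

        public String term() {
            return term.toString();
        }
    }
    ```

    The clear-then-fill step matters because the consumer reuses the same attribute objects across calls; forgetting to clear them leaks the previous token's state into the next one.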
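    The org.radargun.utils.Tokenizer entry above is motivated by a gap in the JDK: java.util.StringTokenizer treats each character of its delimiter argument as a separate single-char delimiter, so it cannot split on a multi-character delimiter such as "::". A minimal stdlib sketch of string-delimiter splitting (the class and method names are illustrative, not org.radargun.utils.Tokenizer's actual API):

    ```java
    import java.util.ArrayList;
    import java.util.List;

    // Splits on a whole delimiter string rather than on any of its characters.
    public class StringDelimTokenizer {
        public static List<String> tokenize(String input, String delim) {
            List<String> tokens = new ArrayList<>();
            int from = 0;
            int at;
            while ((at = input.indexOf(delim, from)) != -1) {
                tokens.add(input.substring(from, at)); // may be empty between adjacent delimiters
                from = at + delim.length();
            }
            tokens.add(input.substring(from)); // trailing token after the last delimiter
            return tokens;
        }
    }
    ```

    For example, tokenize("a::b::c", "::") yields [a, b, c], whereas StringTokenizer with "::" would split on every single ':' and drop the empty tokens.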

    Examples of nu.validator.htmlparser.impl.Tokenizer

        if (isFragment) {
          builder.setFragmentContext(null);
        }
        builder.setDoctypeExpectation(DoctypeExpectation.NO_DOCTYPE_ERRORS);
        try {
          builder.startTokenization(new Tokenizer(builder));
        } catch (SAXException ex) {
          throw new SomethingWidgyHappenedError(ex);
        }
        builder.setErrorHandler(
            new ErrorHandler() {

    Examples of opennlp.ccg.lexicon.Tokenizer

            // load grammar
            URL grammarURL = new File(grammarfile).toURI().toURL();
            System.out.println("Loading grammar from URL: " + grammarURL);
            Grammar grammar = new Grammar(grammarURL);
            Tokenizer tokenizer = grammar.lexicon.tokenizer;
            System.out.println();
           
            // set up parser
            Parser parser = new Parser(grammar);
            // instantiate scorer
            try {
                System.out.println("Instantiating parsing sign scorer from class: " + parseScorerClass);
                SignScorer parseScorer = (SignScorer) Class.forName(parseScorerClass).newInstance();
                parser.setSignScorer(parseScorer);
                System.out.println();
            } catch (Exception exc) {
                throw (RuntimeException) new RuntimeException().initCause(exc);
            }
            // instantiate supertagger
            try {
              Supertagger supertagger;
              if (supertaggerClass != null) {
                    System.out.println("Instantiating supertagger from class: " + supertaggerClass);
                    supertagger = (Supertagger) Class.forName(supertaggerClass).newInstance();
              }
              else {
                System.out.println("Instantiating supertagger from config file: " + stconfig);
                supertagger = WordAndPOSDictionaryLabellingStrategy.supertaggerFactory(stconfig);
              }
                parser.setSupertagger(supertagger);
                System.out.println();
            } catch (Exception exc) {
                throw (RuntimeException) new RuntimeException().initCause(exc);
            }
           
            // loop through input
            BufferedReader in = new BufferedReader(new FileReader(inputfile));
            String line;
            Map<String,String> predInfoMap = new HashMap<String,String>();
            System.out.println("Parsing " + inputfile);
            System.out.println();
            int count = 1;
        while ((line = in.readLine()) != null) {
            String id = "s" + count;
            try {
                // parse it
                System.out.println(line);
                parser.parse(line);
                int numParses = Math.min(nbestListSize, parser.getResult().size());
                for (int i = 0; i < numParses; i++) {
                    Sign thisParse = parser.getResult().get(i);
                    // convert lf
                    Category cat = thisParse.getCategory();
                    LF convertedLF = null;
                    String predInfo = null;
                    if (cat.getLF() != null) {
                        // convert LF
                        LF flatLF = cat.getLF();
                        cat = cat.copy();
                        Nominal index = cat.getIndexNominal();
                        convertedLF = HyloHelper.compactAndConvertNominals(flatLF, index, thisParse);
                        // get pred info
                        predInfoMap.clear();
                        Testbed.extractPredInfo(flatLF, predInfoMap);
                        predInfo = Testbed.getPredInfo(predInfoMap);
                    }
                    // add test item, sign
                    Element item = RegressionInfo.makeTestItem(grammar, line, 1, convertedLF);
                    String actualID = (nbestListSize == 1) ? id : id + "-" + (i + 1);
                    item.setAttribute("info", actualID);
                    outRoot.addContent(item);
                    signMap.put(actualID, thisParse);
                    // Add parsed words as a separate LF element
                    Element fullWordsElt = new Element("full-words");
                    fullWordsElt.addContent(tokenizer.format(thisParse.getWords()));
                    item.addContent(fullWordsElt);
                    if (predInfo != null) {
                        Element predInfoElt = new Element("pred-info");
                        predInfoElt.setAttribute("data", predInfo);
                        item.addContent(predInfoElt);

    Examples of opennlp.tools.tokenize.Tokenizer

              new FileInputStream(
                  new File(modelDir, "en-ner-" + names[mi] + ".bin")
              )));
        }

    Tokenizer tokenizer = SimpleTokenizer.INSTANCE;
    for (int si = 0; si < sentences.length; si++) {
      List<Annotation> allAnnotations = new ArrayList<Annotation>();
      String[] tokens = tokenizer.tokenize(sentences[si]);
      for (int fi = 0; fi < finders.length; fi++) {
        Span[] spans = finders[fi].find(tokens);
        double[] probs = finders[fi].probs(spans);
        for (int ni = 0; ni < spans.length; ni++) {
          allAnnotations.add(

    Examples of org.apache.cocoon.util.Tokenizer

        protected Query getQuery( int i ) {
            return (Query) queries.elementAt( i );
        }

        private String replaceCharWithString( String in, char c, String with ) {
            Tokenizer tok;
            StringBuffer replaced = null;
            if ( in.indexOf( c ) > -1 ) {
                tok = new Tokenizer( in, c );
                replaced = new StringBuffer();
                while ( tok.hasMoreTokens() ) {
                    replaced.append( tok.nextToken() );
                    if ( tok.hasMoreTokens() )
                        replaced.append( with );
                }
            }
            if ( replaced != null ) {
                return replaced.toString();

    Examples of org.apache.ctakes.core.nlp.tokenizer.Tokenizer

       * The file is delimited with "|" and has two fields:<br>
       * hyphen-term|frequency
       */
      public HyphenTextModifierImpl(String hyphenfilename, int windowSize) {
        iv_windowSize = windowSize;
        iv_tokenizer = new Tokenizer();
        BufferedReader br;
        try {
          br = new BufferedReader(new FileReader(new File(hyphenfilename)));

          String line = "";

    Examples of org.apache.felix.gogo.runtime.Tokenizer

        }

        // hello world
        private void testHello(CharSequence text) throws Exception
        {
            Tokenizer t = new Tokenizer(text);
            assertEquals(Type.WORD, t.next());
            assertEquals("hello", t.value().toString());
            assertEquals(Type.WORD, t.next());
            assertEquals("world", t.value().toString());
            assertEquals(Type.NEWLINE, t.next());
            assertEquals(Type.EOT, t.next());
        }

    Examples of org.apache.hadoop.hbase.codec.prefixtree.encode.tokenize.Tokenizer

        this.qualifierDeduplicator = USE_HASH_COLUMN_SORTER ? new ByteRangeHashSet()
            : new ByteRangeTreeSet();
        this.timestampEncoder = new LongEncoder();
        this.mvccVersionEncoder = new LongEncoder();
        this.cellTypeEncoder = new CellTypeEncoder();
        this.rowTokenizer = new Tokenizer();
        this.familyTokenizer = new Tokenizer();
        this.qualifierTokenizer = new Tokenizer();
        this.rowWriter = new RowSectionWriter();
        this.familyWriter = new ColumnSectionWriter();
        this.qualifierWriter = new ColumnSectionWriter();

        reset(outputStream, includeMvccVersion);

    Examples of org.apache.jena.riot.tokens.Tokenizer

            totalTuples += n ;
        }
       
        protected Tokenizer makeTokenizer(InputStream in)
        {
            Tokenizer tokenizer = TokenizerFactory.makeTokenizerUTF8(in) ;
            return tokenizer ;
        }

    Examples of org.apache.lucene.analysis.Tokenizer

        return new SimpleTokenizer(reader);
      }

      public TokenStream reusableTokenStream(String fieldName, Reader reader)
          throws IOException {
        Tokenizer tokenizer = (Tokenizer) getPreviousTokenStream();
        if (tokenizer == null) {
          tokenizer = new SimpleTokenizer(reader);
          setPreviousTokenStream(tokenizer);
        } else {
          tokenizer.reset(reader);
        }
        return tokenizer;
      }

    Examples of org.apache.lucene.analysis.Tokenizer

          }
        }

        @Override
        public TokenStream reusableTokenStream(String fieldName, Reader reader) throws IOException {
          Tokenizer tokenizer = (Tokenizer) getPreviousTokenStream();
          if (tokenizer == null) {
            tokenizer = new SingleCharTokenizer(reader);
            setPreviousTokenStream(tokenizer);
          } else
            tokenizer.reset(reader);
          return tokenizer;
        }