Examples of ASCIIFoldingFilter

The set of character conversions supported by this class is a superset of those supported by Lucene's ISOLatin1AccentFilter, which strips accents from Latin-1 characters. For example, 'à' will be replaced by 'a'. See: http://en.wikipedia.org/wiki/Latin_characters_in_Unicode

Class: org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter

Characters from the following Unicode blocks are converted, but only those that have a reasonable ASCII alternative:
  • C1 Controls and Latin-1 Supplement: http://www.unicode.org/charts/PDF/U0080.pdf
  • Latin Extended-A: http://www.unicode.org/charts/PDF/U0100.pdf
  • Latin Extended-B: http://www.unicode.org/charts/PDF/U0180.pdf
  • Latin Extended Additional: http://www.unicode.org/charts/PDF/U1E00.pdf
  • Latin Extended-C: http://www.unicode.org/charts/PDF/U2C60.pdf
  • Latin Extended-D: http://www.unicode.org/charts/PDF/UA720.pdf
  • IPA Extensions: http://www.unicode.org/charts/PDF/U0250.pdf
  • Phonetic Extensions: http://www.unicode.org/charts/PDF/U1D00.pdf
  • Phonetic Extensions Supplement: http://www.unicode.org/charts/PDF/U1D80.pdf
  • General Punctuation: http://www.unicode.org/charts/PDF/U2000.pdf
  • Superscripts and Subscripts: http://www.unicode.org/charts/PDF/U2070.pdf
  • Enclosed Alphanumerics: http://www.unicode.org/charts/PDF/U2460.pdf
  • Dingbats: http://www.unicode.org/charts/PDF/U2700.pdf
  • Supplemental Punctuation: http://www.unicode.org/charts/PDF/U2E00.pdf
  • Alphabetic Presentation Forms: http://www.unicode.org/charts/PDF/UFB00.pdf
  • Halfwidth and Fullwidth Forms: http://www.unicode.org/charts/PDF/UFF00.pdf
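
To make the 'à' to 'a' behaviour concrete without a Lucene dependency, here is a minimal sketch that approximates accent folding with the JDK's java.text.Normalizer. This is not how ASCIIFoldingFilter is implemented; the class name AsciiFoldingSketch and the method fold are hypothetical. It only handles characters that decompose into a base letter plus combining marks, so characters such as 'ß' or 'æ', which the Lucene filter maps to "ss" and "ae", pass through unchanged here.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Lucene: approximates accent folding
// by canonical decomposition followed by removal of combining marks.
final class AsciiFoldingSketch {

    static String fold(String input) {
        // NFD splits e.g. U+00E0 (a with grave) into 'a' + U+0300 (combining grave).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            // Keep everything except non-spacing (combining) marks.
            if (Character.getType(c) != Character.NON_SPACING_MARK) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("\u00e0"));                       // prints "a"
        System.out.println(fold("Cr\u00e8me br\u00fbl\u00e9e")); // prints "Creme brulee"
    }
}
```

In a real analyzer the folding is done per token by ASCIIFoldingFilter itself, as the examples below show; the sketch is only meant to illustrate the kind of mapping involved.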

  • Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

      @Override
      public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(Version.LUCENE_CURRENT, reader);
        result = new StandardFilter(result);
        result = new ASCIIFoldingFilter(result);
        result = new LowerCaseFilter(result);
        return result;
      }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

    * &lt;/fieldType&gt;</pre>
    * @version $Id: ASCIIFoldingFilterFactory.java 1073344 2011-02-22 14:35:02Z koji $
    */
    public class ASCIIFoldingFilterFactory extends BaseTokenFilterFactory {
      public ASCIIFoldingFilter create(TokenStream input) {
        return new ASCIIFoldingFilter(input);
      }
    }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

      @Override
      public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(LuceneVersion.getVersion(), reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new ASCIIFoldingFilter(result);
        List<String> list = Arrays.asList(ENGLISH_STOP_WORDS);
        Set<String> set = new HashSet<String>(list);
        result = new StopFilter(false, result, set, true);
        result = new EdgeNGramTokenFilter(result, Side.FRONT, 1, 20);
        return result;
      }
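
The chain in the example above (lowercase, fold, drop stop words, then emit front edge n-grams of length 1 to 20) is a common setup for prefix or autocomplete matching. The following sketch mimics the effect of that chain in plain Java so the output can be inspected without Lucene. Everything in it is an assumption made for illustration: the class PrefixChainSketch, its three-word stop list, whitespace splitting in place of StandardTokenizer, and Normalizer-based folding in place of ASCIIFoldingFilter.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the analyzer chain; not Lucene code.
final class PrefixChainSketch {

    // Tiny illustrative stop list (the original uses ENGLISH_STOP_WORDS).
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("a", "the", "of"));

    // Approximate folding: decompose, then strip combining marks.
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{Mn}+", "");
    }

    static List<String> analyze(String text, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {   // simplification of StandardTokenizer
            if (raw.isEmpty()) continue;
            String token = fold(raw.toLowerCase(Locale.ROOT)); // lowercase, then fold
            if (STOP_WORDS.contains(token)) continue;          // stop-word removal
            int limit = Math.min(maxGram, token.length());
            for (int n = minGram; n <= limit; n++) {           // front edge n-grams
                out.add(token.substring(0, n));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [c, ca, caf, cafe]
        System.out.println(analyze("The Caf\u00e9", 1, 20));
    }
}
```

Because folding runs before the n-gram step, a query typed as "cafe" and an indexed "Café" produce the same prefixes, which is the reason for placing ASCIIFoldingFilter ahead of EdgeNGramTokenFilter in such a chain.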

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

        @Inject public ASCIIFoldingTokenFilterFactory(Index index, @IndexSettings Settings indexSettings, @Assisted String name, @Assisted Settings settings) {
            super(index, indexSettings, name, settings);
        }

        @Override public TokenStream create(TokenStream tokenStream) {
            return new ASCIIFoldingFilter(tokenStream);
        }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

      @Override
      protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        Tokenizer tokenizer = new StandardTokenizer(Version.LUCENE_31, reader);
        TokenStream result = new StandardFilter(Version.LUCENE_31, tokenizer);
        result = new LowerCaseFilter(Version.LUCENE_31, result);
        result = new ASCIIFoldingFilter(result);
        result = new AlphaNumericMaxLengthFilter(result);
        result = new StopFilter(Version.LUCENE_31, result, stopwords);
        result = new PorterStemFilter(result);
        return new TokenStreamComponents(tokenizer, result);
      }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

    import org.apache.lucene.analysis.ASCIIFoldingFilter;
    import org.apache.lucene.analysis.TokenStream;

    public class ASCIIFoldingFilterFactory extends BaseTokenFilterFactory {
      public ASCIIFoldingFilter create(TokenStream input) {
        return new ASCIIFoldingFilter( input );
      }
    }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

        /** Takes the superclass's stream (a {@link StandardTokenizer} filtered by a {@link
          StandardFilter}, a {@link LowerCaseFilter} and a {@link StopFilter}) and wraps it
          in an {@link ASCIIFoldingFilter}. */
        @Override
        public TokenStream tokenStream(String fieldName, Reader reader) {
            TokenStream result = super.tokenStream(fieldName, reader);
            result = new ASCIIFoldingFilter(result);
           
            return result;
        }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

      public TokenStream tokenStream(String fieldName, java.io.Reader reader) {

        TokenStream result = new StandardTokenizer(Version.LUCENE_31, reader);
        result = new StandardFilter(Version.LUCENE_31, result);
        result = new LowerCaseFilter(Version.LUCENE_31, result);
        result = new ASCIIFoldingFilter(result);
        result = new AlphaNumericMaxLengthFilter(result);
        result = new StopFilter(Version.LUCENE_31, result, stopSet);
        return new PorterStemFilter(result);
      }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

      @Override
      public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new StandardTokenizer(LuceneTestCase.TEST_VERSION_CURRENT, reader);
        result = new StandardFilter(Version.LUCENE_31, result);
        result = new ASCIIFoldingFilter(result);
        result = new LowerCaseFilter(LuceneTestCase.TEST_VERSION_CURRENT, result);
        return result;
      }

    Examples of org.apache.lucene.analysis.ASCIIFoldingFilter

      @Override
      protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        Tokenizer tokenizer = new StandardTokenizer(LUCENE_VERSION, reader);
        TokenStream result = new StandardFilter(LUCENE_VERSION, tokenizer);
        result = new LowerCaseFilter(LUCENE_VERSION, result);
        result = new ASCIIFoldingFilter(result);
        result = new AlphaNumericMaxLengthFilter(result);
        result = new StopFilter(LUCENE_VERSION, result, stopwords);
        result = new PorterStemFilter(result);
        return new TokenStreamComponents(tokenizer, result);
      }