Package org.apache.droids.parse.html

Examples of org.apache.droids.parse.html.LinkExtractor


    String charset = entity.getCharset();
    if (charset == null) {
      charset = "UTF-8";
    }
    EchoHandler data = new EchoHandler(charset);
    LinkExtractor extractor = new LinkExtractor(link, elements);
   
    TeeContentHandler parallelHandler = new TeeContentHandler(data, extractor);

    InputStream instream = entity.obtainContent();
    try {
      parser.parse(instream, parallelHandler, metadata);
     
      return new ParseImpl(data.toString(), extractor.getLinks());
    } catch (SAXException | TikaException ex) {
      // Both failure modes are reported identically, so one multi-catch suffices.
      throw new DroidsException("Failure parsing document " + link.getId(), ex);
    } finally {
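The snippet above feeds one parse to two handlers at once: one collects the document text and the other collects links. A minimal, self-contained sketch of that fan-out idea follows. It uses only the JDK's SAX parser rather than Tika, so the input must be well-formed XML, and every class name in it (`TeeSketch`, `Tee`, `TextCollector`, `HrefCollector`, `Result`) is hypothetical. It is an illustration of the pattern under those assumptions, not the Droids or Tika implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical names throughout; this is not the Droids or Tika code.
class TeeSketch {

  /** Forwards each SAX event to every delegate, in order. */
  static class Tee extends DefaultHandler {
    private final DefaultHandler[] delegates;

    Tee(DefaultHandler... delegates) {
      this.delegates = delegates;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
        throws SAXException {
      for (DefaultHandler d : delegates) {
        d.startElement(uri, localName, qName, atts);
      }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
      for (DefaultHandler d : delegates) {
        d.characters(ch, start, length);
      }
    }
  }

  /** Accumulates character data; takes the place of EchoHandler in the snippet. */
  static class TextCollector extends DefaultHandler {
    final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
      text.append(ch, start, length);
    }
  }

  /** Records href values of <a> elements; takes the place of LinkExtractor. */
  static class HrefCollector extends DefaultHandler {
    final List<String> hrefs = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
      // The default SAXParserFactory is not namespace-aware, so match on qName.
      String href = "a".equals(qName) ? atts.getValue("href") : null;
      if (href != null) {
        hrefs.add(href);
      }
    }
  }

  /** Text and links gathered in a single pass. */
  static class Result {
    final String text;
    final List<String> links;

    Result(String text, List<String> links) {
      this.text = text;
      this.links = links;
    }
  }

  static Result parseOnce(String xml) {
    TextCollector data = new TextCollector();
    HrefCollector extractor = new HrefCollector();
    try {
      SAXParserFactory.newInstance().newSAXParser().parse(
          new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
          new Tee(data, extractor));
    } catch (ParserConfigurationException | SAXException | IOException ex) {
      // Wrap checked failures in one unchecked type, as the snippet does with DroidsException.
      throw new IllegalStateException("Failure parsing document", ex);
    }
    return new Result(data.text.toString(), extractor.hrefs);
  }

  public static void main(String[] args) {
    Result r = parseOnce("<p>See <a href=\"http://example.org/\">docs</a></p>");
    System.out.println(r.text);
    System.out.println(r.links);
  }
}
```

Because both collectors see the same event stream, the input is read only once; adding a third consumer would mean passing one more delegate to `Tee`.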


