Package bixo.examples.crawl

Examples of bixo.examples.crawl.SimpleBodyContentHandler


   
    @SuppressWarnings("rawtypes")
    @Override
    protected void process(ParsedDatum datum, Document doc, TupleEntryCollector collector,
                           FlowProcess process) throws Exception {
        SimpleBodyContentHandler bodyContentHandler = new SimpleBodyContentHandler();
        SAXWriter writer = new SAXWriter(bodyContentHandler);
        writer.write(doc);

        float pageScore = getScore(bodyContentHandler.toString());
       
        // Get the outlinks.
        Outlink[] outlinks = getOutlinks(doc);

        // Extract all of the images, and use them as page results.
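
The snippet above passes a SimpleBodyContentHandler to a SAXWriter and then reads the collected text back with toString(). The source of that class is not shown on this page, so the following is only a minimal sketch of a handler that fits the same usage pattern, assuming it accumulates character data found inside the body element. The class name BodyTextHandlerSketch, the extract helper, and the whitespace normalization are hypothetical and are not taken from bixo. The sketch feeds the handler from the JDK's own SAX parser rather than from a dom4j SAXWriter so that it runs without extra dependencies.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, NOT the bixo SimpleBodyContentHandler source.
// Assumption: the real handler collects the text found inside <body>.
class BodyTextHandlerSketch extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private int bodyDepth = 0; // greater than 0 while inside <body>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (bodyDepth > 0) {
            bodyDepth++;                       // nested element inside <body>
        } else if ("body".equalsIgnoreCase(qName)) {
            bodyDepth = 1;                     // entering <body>
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (bodyDepth > 0) {
            bodyDepth--;
            text.append(' ');                  // keep text of adjacent elements apart
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (bodyDepth > 0) {
            text.append(ch, start, length);    // ignore text outside <body>
        }
    }

    @Override
    public String toString() {
        // Collapse runs of whitespace so the result is easy to score or index.
        return text.toString().replaceAll("\\s+", " ").trim();
    }

    // Convenience helper (hypothetical): parse well-formed XHTML and return the body text.
    static String extract(String xhtml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        BodyTextHandlerSketch handler = new BodyTextHandlerSketch();
        parser.parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><head><title>Ignored</title></head>"
                    + "<body><h1>Hello</h1><p>crawl <b>me</b></p></body></html>";
        System.out.println(extract(page));
    }
}
```

Because the handler only implements SAX callbacks, any SAX event source can drive it, which is presumably why the snippet above can hand its handler to a SAXWriter and call toString() afterwards.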
