Package org.archive.modules.net

Examples of org.archive.modules.net.Robotstxt


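                // Look up the server record for the host of the current URI.
                CrawlServer s = getServerCache().getServerFor(curi.getUURI());
                // Prefer the per-URI user agent; fall back to the crawl-wide one.
                String ua = curi.getUserAgent();
                if (ua == null) {
                    ua = metadata.getUserAgent();
                }
                Robotstxt rep = s.getRobotstxt();
                if (rep != null) {
                    // Crawl-delay is expressed in seconds; convert it to milliseconds.
                    long crawlDelay = (long) (1000 * rep.getDirectivesFor(ua).getCrawlDelay());
                    // Honor the directive only up to respectThreshold.
                    crawlDelay = Math.min(crawlDelay, respectThreshold);
                    if (crawlDelay > durationToWait) {
                        // (the excerpt ends here in the source)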
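The excerpt above stops before the body of its last `if`, so the final step is not visible. As a minimal, self-contained sketch of the same arithmetic, the class below converts a crawl delay from seconds to milliseconds, caps it at a threshold, and then assumes the politeness wait is raised to that value when it is larger. The class name `CrawlDelaySketch`, the method `effectiveDelayMs`, and the final "take the larger value" step are illustrative assumptions, not part of the Heritrix API or of the excerpt.

```java
// Hypothetical helper, not part of org.archive.modules.net.
class CrawlDelaySketch {

    /**
     * @param crawlDelaySeconds  Crawl-delay value from robots.txt, in seconds
     * @param respectThresholdMs upper bound on the delay we will honor, in ms
     * @param durationToWaitMs   wait already computed by other rules, in ms
     * @return the wait to apply, in milliseconds
     */
    static long effectiveDelayMs(float crawlDelaySeconds,
                                 long respectThresholdMs,
                                 long durationToWaitMs) {
        // Seconds to milliseconds, truncating any fractional millisecond.
        long crawlDelayMs = (long) (1000 * crawlDelaySeconds);
        // Never honor more than the configured threshold.
        crawlDelayMs = Math.min(crawlDelayMs, respectThresholdMs);
        // ASSUMPTION: the truncated branch raises the wait to the crawl delay.
        return Math.max(durationToWaitMs, crawlDelayMs);
    }
}
```

With a 300000 ms threshold and a 1000 ms existing wait, a Crawl-delay of 2.5 seconds gives 2500 ms, a Crawl-delay of 600 seconds is capped to 300000 ms, and a Crawl-delay of 0.5 seconds leaves the existing 1000 ms wait unchanged.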
