org.apache.nutch.protocol
Interface RobotRules
public interface RobotRules
This interface holds the rules parsed from a robots.txt file and can test URLs against those rules.
Method Summary
Modifier and Type, Method, and Description
- long getCrawlDelay(): Get Crawl-Delay, in milliseconds.
- long getExpireTime(): Get the time at which these rules expire.
- boolean isAllowed(URL url): Returns false if the robots.txt file prohibits us from accessing the given url, or true otherwise.
Method Detail
- getExpireTime
long getExpireTime()
Get the time at which these rules expire.
- getCrawlDelay
long getCrawlDelay()
Get Crawl-Delay, in milliseconds. This returns -1 if not set.
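Because -1 is a sentinel rather than a real delay, callers probably want to substitute their own default before sleeping between requests. The following is a minimal sketch of that check, not Nutch code: the `RobotRules` interface below is a local stand-in that copies the three signatures from this page so the example compiles on its own, and `CrawlDelayExample`, `DEFAULT_DELAY_MS`, `effectiveDelayMs`, and `fixedDelay` are hypothetical names invented for illustration.

```java
import java.net.URL;

// Local stand-in mirroring the three method signatures documented on this page,
// so the sketch compiles without Nutch on the classpath.
interface RobotRules {
    long getExpireTime();
    long getCrawlDelay();
    boolean isAllowed(URL url);
}

class CrawlDelayExample {
    // Hypothetical fallback chosen for this sketch; it is not a Nutch default.
    static final long DEFAULT_DELAY_MS = 1000L;

    // Returns the delay to honor, substituting the fallback when none was set.
    // The documented sentinel is -1; this sketch treats any negative value as "not set".
    static long effectiveDelayMs(RobotRules rules) {
        long delay = rules.getCrawlDelay();
        return delay < 0 ? DEFAULT_DELAY_MS : delay;
    }

    // Hypothetical helper: rules that report a fixed Crawl-Delay and allow every URL.
    static RobotRules fixedDelay(final long delayMs) {
        return new RobotRules() {
            public long getExpireTime() { return 0L; }
            public long getCrawlDelay() { return delayMs; }
            public boolean isAllowed(URL url) { return true; }
        };
    }

    public static void main(String[] args) {
        System.out.println(effectiveDelayMs(fixedDelay(-1L)));   // prints 1000
        System.out.println(effectiveDelayMs(fixedDelay(5000L))); // prints 5000
    }
}
```

Keeping the sentinel check in one helper avoids accidentally passing -1 to a sleep or scheduling call.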
- isAllowed
boolean isAllowed(URL url)
Returns false if the robots.txt file prohibits us from accessing the given url, or true otherwise.
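A caller would typically consult `isAllowed` before fetching each URL. The sketch below illustrates that call pattern under stated assumptions: `RobotRules` is again a local stand-in copying the signatures on this page, and `PrefixRobotRules` is a hypothetical, deliberately simplified implementation that only matches Disallow path prefixes. It is not how Nutch parses or evaluates robots.txt.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

// Local stand-in mirroring the signatures documented on this page.
interface RobotRules {
    long getExpireTime();
    long getCrawlDelay();
    boolean isAllowed(URL url);
}

// Hypothetical toy implementation: a URL is disallowed when its path
// starts with any of the configured prefixes.
class PrefixRobotRules implements RobotRules {
    private final List<String> disallowedPrefixes;

    PrefixRobotRules(String... disallowedPrefixes) {
        this.disallowedPrefixes = Arrays.asList(disallowedPrefixes);
    }

    public long getExpireTime() { return 0L; }   // placeholder value for the sketch
    public long getCrawlDelay() { return -1L; }  // -1 means "not set"

    public boolean isAllowed(URL url) {
        // Treat an empty path (e.g. "http://example.com") as the root path.
        String path = url.getPath().isEmpty() ? "/" : url.getPath();
        for (String prefix : disallowedPrefixes) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}

class IsAllowedExample {
    // Hypothetical helper: parses the string and asks the rules whether it may be fetched.
    static boolean mayFetch(RobotRules rules, String spec) {
        try {
            return rules.isAllowed(URI.create(spec).toURL());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        RobotRules rules = new PrefixRobotRules("/private/");
        System.out.println(mayFetch(rules, "http://example.com/private/a.html")); // prints false
        System.out.println(mayFetch(rules, "http://example.com/public/a.html"));  // prints true
    }
}
```

Coding against the interface rather than a concrete class keeps the fetch logic independent of how the rules were parsed.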
