org.apache.nutch.protocol
Interface RobotRules
public interface RobotRules
This interface holds the rules parsed from a robots.txt file and can test URLs against those rules.
Method Summary
| Modifier and Type | Method and Description |
|---|---|
| long | getCrawlDelay(): Get Crawl-Delay, in milliseconds. |
| long | getExpireTime(): Get expire time. |
| boolean | isAllowed(URL url): Returns false if the robots.txt file prohibits us from accessing the given url, or true otherwise. |
Method Detail
-
getExpireTime
long getExpireTime()
Get expire time
-
getCrawlDelay
long getCrawlDelay()
Get Crawl-Delay, in milliseconds. This returns -1 if not set.
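Because -1 signals that no Crawl-Delay was set, callers will usually want to substitute their own default before sleeping between fetches. The following is a minimal sketch of that check, not Nutch's own logic. The local RobotRules interface is a stand-in that mirrors the three documented methods so the example compiles without Nutch on the classpath; DEFAULT_DELAY_MS, effectiveDelayMs and fixedRules are hypothetical names introduced only for illustration.

```java
import java.net.URL;

// Local stand-in mirroring the documented interface; real code would
// import org.apache.nutch.protocol.RobotRules instead.
interface RobotRules {
    long getCrawlDelay();
    long getExpireTime();
    boolean isAllowed(URL url);
}

class CrawlDelayExample {
    // Hypothetical fallback delay, used when robots.txt sets no Crawl-Delay.
    static final long DEFAULT_DELAY_MS = 5000L;

    // Returns the delay to honor: the robots.txt value when present,
    // otherwise the fallback. -1 is the documented "not set" value.
    static long effectiveDelayMs(RobotRules rules) {
        long delay = rules.getCrawlDelay();
        return delay == -1 ? DEFAULT_DELAY_MS : delay;
    }

    // Hypothetical fixed-value implementation, for illustration only.
    static RobotRules fixedRules(final long crawlDelayMs) {
        return new RobotRules() {
            public long getCrawlDelay() { return crawlDelayMs; }
            public long getExpireTime() { return 0L; } // placeholder value
            public boolean isAllowed(URL url) { return true; }
        };
    }

    public static void main(String[] args) {
        // No Crawl-Delay set: the fallback is used.
        System.out.println(effectiveDelayMs(fixedRules(-1L)));
        // Crawl-Delay of 2000 ms: the robots.txt value is used.
        System.out.println(effectiveDelayMs(fixedRules(2000L)));
    }
}
```

Keeping the sentinel check in one helper means the rest of a fetcher never has to compare against -1 directly.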
-
isAllowed
boolean isAllowed(URL url)
Returns false if the robots.txt file prohibits us from accessing the given url, or true otherwise.
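A caller would typically consult isAllowed before each fetch and skip any URL for which it returns false. The sketch below illustrates that pattern under stated assumptions: the RobotRules interface is a local stand-in mirroring the documented methods, and PrefixRobotRules is a hypothetical implementation that disallows paths by simple prefix match. It is not how Nutch parses or evaluates robots.txt.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

// Local stand-in mirroring the documented interface; real code would
// import org.apache.nutch.protocol.RobotRules instead.
interface RobotRules {
    long getCrawlDelay();
    long getExpireTime();
    boolean isAllowed(URL url);
}

// Hypothetical implementation: a URL is disallowed when its path
// starts with any of the configured prefixes.
class PrefixRobotRules implements RobotRules {
    private final List<String> disallowedPrefixes;

    PrefixRobotRules(String... prefixes) {
        this.disallowedPrefixes = Arrays.asList(prefixes);
    }

    public long getCrawlDelay() { return -1L; } // -1: no Crawl-Delay set
    public long getExpireTime() { return 0L; }  // placeholder value

    public boolean isAllowed(URL url) {
        String path = url.getPath();
        for (String prefix : disallowedPrefixes) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}

class IsAllowedExample {
    // Helper that converts the checked exception into an unchecked one.
    static URL url(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        RobotRules rules = new PrefixRobotRules("/private/");
        String[] candidates = {
            "http://example.com/public/index.html",
            "http://example.com/private/data.html"
        };
        for (String spec : candidates) {
            URL u = url(spec);
            // Only fetch when the rules permit it.
            System.out.println((rules.isAllowed(u) ? "fetch " : "skip ") + u);
        }
    }
}
```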