
org.apache.nutch.protocol

Interface RobotRules


public interface RobotRules

This interface holds the rules parsed from a robots.txt file and can test paths against those rules.

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| long | getCrawlDelay() Get Crawl-Delay, in milliseconds. |
| long | getExpireTime() Get the expire time. |
| boolean | isAllowed(URL url) Returns false if the robots.txt file prohibits us from accessing the given url, or true otherwise. |
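
Taken together, these signatures correspond to an interface of roughly the following shape. This is a sketch reconstructed from the documented methods, not the exact Nutch source:

```java
import java.net.URL;

public interface RobotRules {

  // Crawl-Delay in milliseconds, or -1 if the robots.txt file does not set one.
  long getCrawlDelay();

  // Expire time of the parsed rules.
  long getExpireTime();

  // False if the robots.txt file prohibits access to the given url, true otherwise.
  boolean isAllowed(URL url);
}
```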

Method Detail

getExpireTime

long getExpireTime()

Get the expire time.
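
Callers typically cache parsed rules per host and re-fetch robots.txt once the cached entry has expired. A minimal sketch, assuming the expire time is an epoch timestamp in milliseconds; the cache class and the fetchAndParse helper are hypothetical, not part of Nutch:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.nutch.protocol.RobotRules;

// Hypothetical per-host cache that reuses parsed rules until getExpireTime() has passed.
public class RobotRulesCache {

  private final Map<String, RobotRules> cache = new HashMap<>();

  public RobotRules rulesFor(String host) {
    RobotRules rules = cache.get(host);
    // Assumption: the expire time is an epoch timestamp in milliseconds.
    if (rules == null || rules.getExpireTime() < System.currentTimeMillis()) {
      rules = fetchAndParse(host); // hypothetical helper that downloads and parses robots.txt
      cache.put(host, rules);
    }
    return rules;
  }

  private RobotRules fetchAndParse(String host) {
    // Protocol-specific; left abstract in this sketch.
    throw new UnsupportedOperationException("not implemented in this sketch");
  }
}
```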

getCrawlDelay

long getCrawlDelay()

Get Crawl-Delay, in milliseconds. This returns -1 if not set.
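
Because -1 signals that no Crawl-Delay was declared, callers should check for it before sleeping. A minimal sketch; the fallback delay below is an assumption, not a Nutch setting:

```java
import org.apache.nutch.protocol.RobotRules;

public class CrawlDelayExample {

  // Hypothetical fallback used when the site declares no Crawl-Delay.
  private static final long DEFAULT_DELAY_MS = 1000L;

  // Sleep for the site's Crawl-Delay, or for a local default when getCrawlDelay() returns -1.
  static void politeWait(RobotRules rules) throws InterruptedException {
    long crawlDelay = rules.getCrawlDelay(); // milliseconds, or -1 if not set
    Thread.sleep(crawlDelay >= 0 ? crawlDelay : DEFAULT_DELAY_MS);
  }
}
```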

isAllowed

boolean isAllowed(URL url)

Returns false if the robots.txt file prohibits us from accessing the given url, or true otherwise.
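
A fetcher would consult this check before requesting a page. A minimal sketch; the mayFetch wrapper is illustrative, not a Nutch method:

```java
import java.net.MalformedURLException;
import java.net.URL;

import org.apache.nutch.protocol.RobotRules;

public class RobotsCheckExample {

  // Returns true only when robots.txt does not prohibit fetching the page.
  static boolean mayFetch(RobotRules rules, String page) throws MalformedURLException {
    URL url = new URL(page);
    return rules.isAllowed(url);
  }
}
```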


Copyright © 2014 The Apache Software Foundation