Uses of Interface
org.apache.nutch.protocol.Protocol
Packages that use Protocol
- org.apache.nutch.protocol: Classes related to the Protocol interface; see also org.apache.nutch.net.protocols.
- org.apache.nutch.protocol.file: Protocol plugin which supports retrieving local file resources.
- org.apache.nutch.protocol.ftp: Protocol plugin which supports retrieving documents via the ftp protocol.
- org.apache.nutch.protocol.http: Protocol plugin which supports retrieving documents via the http protocol.
- org.apache.nutch.protocol.http.api: Common API used by HTTP plugins (http, httpclient).
Uses of Protocol in org.apache.nutch.protocol
Methods in org.apache.nutch.protocol that return Protocol
- Protocol ProtocolFactory.getProtocol(String urlString): Returns the appropriate Protocol implementation for a URL.
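The idea of returning an implementation based on the URL can be illustrated with a small, self-contained sketch that dispatches on the URL scheme. This is not Nutch code: SchemeRegistry, its nested Handler interface, and the methods register, forUrl, and withDefaults are hypothetical names invented for the illustration, and the three registered schemes simply mirror the file, ftp, and http plugins listed on this page.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical scheme-to-handler registry; an illustration, not a Nutch class. */
final class SchemeRegistry {

    /** Hypothetical stand-in for a protocol implementation. */
    interface Handler {
        String name();
    }

    private final Map<String, Handler> handlers = new HashMap<>();

    /** Registers a handler under a lower-cased scheme name. */
    void register(String scheme, Handler handler) {
        handlers.put(scheme.toLowerCase(Locale.ROOT), handler);
    }

    /** Returns the handler for the URL's scheme, or throws if there is none. */
    Handler forUrl(String urlString) {
        final String scheme;
        try {
            scheme = new URI(urlString).getScheme();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Malformed URL: " + urlString, e);
        }
        if (scheme == null) {
            throw new IllegalArgumentException("URL has no scheme: " + urlString);
        }
        Handler handler = handlers.get(scheme.toLowerCase(Locale.ROOT));
        if (handler == null) {
            throw new IllegalArgumentException("No handler for scheme: " + scheme);
        }
        return handler;
    }

    /** Builds a registry with one illustrative handler per scheme named on this page. */
    static SchemeRegistry withDefaults() {
        SchemeRegistry registry = new SchemeRegistry();
        registry.register("file", () -> "File");
        registry.register("ftp", () -> "Ftp");
        registry.register("http", () -> "Http");
        return registry;
    }

    public static void main(String[] args) {
        SchemeRegistry registry = withDefaults();
        String[] urls = {"file:///tmp/a.txt", "ftp://example.org/pub/", "http://example.com/"};
        for (String url : urls) {
            System.out.println(url + " -> " + registry.forUrl(url).name());
        }
    }
}
```

In this sketch the scheme is lower-cased before lookup, so HTTP://... and http://... resolve to the same handler, and an unregistered scheme fails fast with an exception instead of returning null.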
Methods in org.apache.nutch.protocol with parameters of type Protocol
- crawlercommons.robots.BaseRobotRules RobotRulesParser.getRobotRulesSet(Protocol protocol, org.apache.hadoop.io.Text url)
- abstract crawlercommons.robots.BaseRobotRules RobotRulesParser.getRobotRulesSet(Protocol protocol, URL url)
Uses of Protocol in org.apache.nutch.protocol.file
Classes in org.apache.nutch.protocol.file that implement Protocol
- class File: This class is a protocol plugin used for the file: scheme.
Uses of Protocol in org.apache.nutch.protocol.ftp
Classes in org.apache.nutch.protocol.ftp that implement Protocol
- class Ftp: This class is a protocol plugin used for the ftp: scheme.
Methods in org.apache.nutch.protocol.ftp with parameters of type Protocol
- crawlercommons.robots.BaseRobotRules FtpRobotRulesParser.getRobotRulesSet(Protocol ftp, URL url): For a host whose robots rules are not yet cached, sends an FTP request to the host of the given URL, fetches the robots file, parses the rules, and caches the resulting rules object so the work is not repeated.
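The fetch, parse, and cache sequence described above can be sketched as a per-host cache. Everything here is an assumption made for illustration: RobotsCacheSketch, rulesFor, and parseDisallows are hypothetical names, the fetch step is an injected function rather than a real FTP request, and the parser is deliberately simplified (it collects every Disallow value and ignores user agents), so it is not a substitute for crawlercommons.robots.BaseRobotRules.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-host cache of parsed robots rules; not the Nutch implementation. */
final class RobotsCacheSketch {

    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    // Assumed fetch step: maps a host name to the text of its robots file.
    private final Function<String, String> fetchRobotsFile;

    RobotsCacheSketch(Function<String, String> fetchRobotsFile) {
        this.fetchRobotsFile = fetchRobotsFile;
    }

    /** Returns the rules for the URL's host, fetching and parsing only on first use. */
    List<String> rulesFor(String urlString) {
        String host = URI.create(urlString).getHost();
        if (host == null) {
            throw new IllegalArgumentException("URL has no host: " + urlString);
        }
        String key = host.toLowerCase(Locale.ROOT);
        // computeIfAbsent runs the fetch-and-parse step only when the host is not cached yet.
        return cache.computeIfAbsent(key, h -> parseDisallows(fetchRobotsFile.apply(h)));
    }

    /** Simplified parser: collects every non-empty Disallow value, ignoring user agents. */
    static List<String> parseDisallows(String robotsText) {
        List<String> prefixes = new ArrayList<>();
        for (String line : robotsText.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.toLowerCase(Locale.ROOT).startsWith("disallow:")) {
                String path = trimmed.substring("disallow:".length()).trim();
                if (!path.isEmpty()) {
                    prefixes.add(path);
                }
            }
        }
        return prefixes;
    }

    public static void main(String[] args) {
        RobotsCacheSketch cache =
                new RobotsCacheSketch(host -> "User-agent: *\nDisallow: /private/\n");
        System.out.println(cache.rulesFor("ftp://example.org/pub/readme.txt"));
    }
}
```

Keying the cache by lower-cased host means that two URLs on the same host share one parsed rules object, and the fetch function is called once per host, not once per URL.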
Uses of Protocol in org.apache.nutch.protocol.http
Classes in org.apache.nutch.protocol.http that implement Protocol
- class Http
Uses of Protocol in org.apache.nutch.protocol.http.api
Classes in org.apache.nutch.protocol.http.api that implement Protocol
- class HttpBase
Methods in org.apache.nutch.protocol.http.api with parameters of type Protocol
- crawlercommons.robots.BaseRobotRules HttpRobotRulesParser.getRobotRulesSet(Protocol http, URL url): Gets the robots.txt rules that apply to the given URL.
