Uses of Interface
org.apache.nutch.protocol.Protocol
Packages that use Protocol

Package: org.apache.nutch.protocol
Description: Classes related to the Protocol interface, see also org.apache.nutch.net.protocols.

Package: org.apache.nutch.protocol.file
Description: Protocol plugin which supports retrieving local file resources.

Package: org.apache.nutch.protocol.ftp
Description: Protocol plugin which supports retrieving documents via the ftp protocol.

Package: org.apache.nutch.protocol.http
Description: Protocol plugin which supports retrieving documents via the http protocol.

Package: org.apache.nutch.protocol.http.api
Description: Common API used by HTTP plugins (http, httpclient).

Uses of Protocol in org.apache.nutch.protocol
Methods in org.apache.nutch.protocol that return Protocol

Modifier and Type: Protocol
Method: ProtocolFactory.getProtocol(String urlString)
Description: Returns the appropriate Protocol implementation for a url.

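As a rough illustration of what "returning the appropriate implementation for a url" can look like, the following self-contained sketch maps a URL's scheme to a registered handler. The class SchemeLookupSketch, the Handler interface, and the register and lookup methods are hypothetical stand-ins invented for this example. They are not the Nutch ProtocolFactory API, and this page does not describe how Nutch actually resolves its plugins.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: NOT the Nutch ProtocolFactory, only a scheme-based lookup.
class SchemeLookupSketch {

    /** Hypothetical stand-in for a protocol plugin. */
    interface Handler {
        String name();
    }

    // Handlers keyed by lower-cased URL scheme, e.g. "http", "ftp", "file".
    private final Map<String, Handler> handlersByScheme = new HashMap<>();

    /** Registers a handler for the given scheme (case-insensitive). */
    void register(String scheme, Handler handler) {
        handlersByScheme.put(scheme.toLowerCase(Locale.ROOT), handler);
    }

    /** Returns the handler registered for the scheme of urlString. */
    Handler lookup(String urlString) {
        String scheme = URI.create(urlString).getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("URL has no scheme: " + urlString);
        }
        Handler handler = handlersByScheme.get(scheme.toLowerCase(Locale.ROOT));
        if (handler == null) {
            throw new IllegalArgumentException("No handler for scheme: " + scheme);
        }
        return handler;
    }

    public static void main(String[] args) {
        SchemeLookupSketch factory = new SchemeLookupSketch();
        factory.register("http", () -> "http-handler");
        factory.register("file", () -> "file-handler");
        // Looks up the handler by the URL's scheme and prints its name.
        System.out.println(factory.lookup("http://example.org/index.html").name());
    }
}
```

Keying the map by scheme keeps the lookup independent of host and path, which matches the one-plugin-per-scheme layout of the packages listed on this page (file, ftp, http).
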
Methods in org.apache.nutch.protocol with parameters of type Protocol

Modifier and Type: crawlercommons.robots.BaseRobotRules
Method: RobotRulesParser.getRobotRulesSet(Protocol protocol, org.apache.hadoop.io.Text url)

Modifier and Type: abstract crawlercommons.robots.BaseRobotRules
Method: RobotRulesParser.getRobotRulesSet(Protocol protocol, URL url)

Uses of Protocol in org.apache.nutch.protocol.file
Classes in org.apache.nutch.protocol.file that implement Protocol

Modifier and Type: class
Class: File
Description: This class is a protocol plugin used for the file: scheme.

Uses of Protocol in org.apache.nutch.protocol.ftp
Classes in org.apache.nutch.protocol.ftp that implement Protocol

Modifier and Type: class
Class: Ftp
Description: This class is a protocol plugin used for the ftp: scheme.

Methods in org.apache.nutch.protocol.ftp with parameters of type Protocol

Modifier and Type: crawlercommons.robots.BaseRobotRules
Method: FtpRobotRulesParser.getRobotRulesSet(Protocol ftp, URL url)
Description: For hosts whose robots rules have not yet been cached, sends an FTP request to the host corresponding to the given URL, gets the robots file, parses the rules, and caches the rules object to avoid repeating the work in future.

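The fetch-once, cache-per-host behaviour described above can be sketched roughly as follows. Everything here is an assumption made for illustration: HostRulesCacheSketch, its nested Rules class, the fetchAndParse function, and the choice of the lower-cased host name as the cache key are hypothetical and are not taken from FtpRobotRulesParser or crawler-commons. The sketch is also single-threaded.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of per-host rules caching; NOT the Nutch implementation.
class HostRulesCacheSketch {

    /** Hypothetical parsed rules: a list of disallowed path prefixes. */
    static final class Rules {
        final List<String> disallowedPrefixes;

        Rules(List<String> disallowedPrefixes) {
            this.disallowedPrefixes = disallowedPrefixes;
        }

        boolean isAllowed(String path) {
            for (String prefix : disallowedPrefixes) {
                if (path.startsWith(prefix)) {
                    return false;
                }
            }
            return true;
        }
    }

    // Cache keyed by lower-cased host name (an assumption of this sketch).
    private final Map<String, Rules> cache = new HashMap<>();
    // Stand-in for "request the robots file from the host and parse it".
    private final Function<String, Rules> fetchAndParse;
    // Counts how often the fetch function actually ran.
    int fetchCount = 0;

    HostRulesCacheSketch(Function<String, Rules> fetchAndParse) {
        this.fetchAndParse = fetchAndParse;
    }

    /** Returns cached rules for the URL's host, fetching them on first use. */
    Rules rulesFor(String urlString) {
        String host = URI.create(urlString).getHost();
        if (host == null) {
            throw new IllegalArgumentException("URL has no host: " + urlString);
        }
        return cache.computeIfAbsent(host.toLowerCase(Locale.ROOT), h -> {
            fetchCount++;
            return fetchAndParse.apply(h);
        });
    }

    public static void main(String[] args) {
        HostRulesCacheSketch cache =
            new HostRulesCacheSketch(host -> new Rules(List.of("/private/")));
        cache.rulesFor("ftp://ftp.example.org/pub/a.txt");
        cache.rulesFor("ftp://ftp.example.org/pub/b.txt");
        // The second call reuses the cached rules, so only one fetch happens.
        System.out.println("fetches: " + cache.fetchCount);
    }
}
```

Passing the fetch step in as a function keeps the caching logic testable without any network access. A real parser would replace it with an actual request for the robots file.
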
Uses of Protocol in org.apache.nutch.protocol.http
Classes in org.apache.nutch.protocol.http that implement Protocol

Modifier and Type: class
Class: Http

Uses of Protocol in org.apache.nutch.protocol.http.api
Classes in org.apache.nutch.protocol.http.api that implement Protocol

Modifier and Type: class
Class: HttpBase

Methods in org.apache.nutch.protocol.http.api with parameters of type Protocol

Modifier and Type: crawlercommons.robots.BaseRobotRules
Method: HttpRobotRulesParser.getRobotRulesSet(Protocol http, URL url)
Description: Gets the rules from robots.txt that apply to the given url.