Uses of Package
org.apache.nutch.plugin
Packages that use org.apache.nutch.plugin
  org.apache.nutch.analysis.lang: Text document language identifier.
  org.apache.nutch.collection: Subcollection is a subset of an index.
  org.apache.nutch.indexer: Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index.
  org.apache.nutch.indexer.anchor: An indexing plugin for inbound anchor text.
  org.apache.nutch.indexer.basic: A basic indexing plugin, adds basic fields: url, host, title, content, etc.
  org.apache.nutch.indexer.feed: Indexing filter to index metadata from RSS feeds.
  org.apache.nutch.indexer.metadata: Indexing filter to add document metadata to the index.
  org.apache.nutch.indexer.more: A "more" indexing plugin, adds "more" index fields: last modified date, MIME type, content length.
  org.apache.nutch.indexer.staticfield: A simple plugin called at indexing that adds fields with static data.
  org.apache.nutch.indexer.subcollection: Indexing filter to assign documents to subcollections.
  org.apache.nutch.indexer.tld: Top Level Domain indexing plugin.
  org.apache.nutch.indexer.urlmeta: URL Meta Tag indexing plugin.
  org.apache.nutch.indexwriter.dummy: Index writer plugin for debugging, writes pairs of <action, url> to a text file, where action is one of "add", "update", or "delete".
  org.apache.nutch.indexwriter.elastic: Index writer plugin for Elasticsearch.
  org.apache.nutch.indexwriter.solr: Index writer plugin for Apache Solr.
  org.apache.nutch.microformats.reltag: A microformats Rel-Tag Parser/Indexer/Querier plugin.
  org.apache.nutch.net: Web-related interfaces: URL filters and normalizers.
  org.apache.nutch.parse: The Parse interface and related classes.
  org.apache.nutch.parse.ext: Parse wrapper to run an external command to do the parsing.
  org.apache.nutch.parse.feed: Parse RSS feeds.
  org.apache.nutch.parse.headings: Parse filter to extract headings (h1, h2, etc.) from the DOM parse tree.
  org.apache.nutch.parse.html: An HTML document parsing plugin.
  org.apache.nutch.parse.js: Parser and parse filter plugin to extract all (possible) links from JavaScript files and embedded JavaScript code snippets.
  org.apache.nutch.parse.metatags: Parse filter to extract meta tags: keywords, description, etc.
  org.apache.nutch.parse.swf: Parse Flash SWF files.
  org.apache.nutch.parse.tika: Parse various document formats with the help of Apache Tika.
  org.apache.nutch.parse.zip: Parse ZIP files: embedded files are recursively passed to appropriate parsers.
  org.apache.nutch.plugin: The Nutch Plugin System.
  org.apache.nutch.protocol: Classes related to the Protocol interface, see also org.apache.nutch.net.protocols.
  org.apache.nutch.protocol.file: Protocol plugin which supports retrieving local file resources.
  org.apache.nutch.protocol.ftp: Protocol plugin which supports retrieving documents via the FTP protocol.
  org.apache.nutch.protocol.http: Protocol plugin which supports retrieving documents via the HTTP protocol.
  org.apache.nutch.protocol.http.api: Common API used by HTTP plugins (http, httpclient).
  org.apache.nutch.scoring: The ScoringFilter interface.
  org.apache.nutch.scoring.depth: Scoring filter to stop crawling at a configurable depth (number of "hops" from seed URLs).
  org.apache.nutch.scoring.link: Scoring filter used in conjunction with WebGraph.
  org.apache.nutch.scoring.opic: Scoring filter implementing a variant of the Online Page Importance Computation (OPIC) algorithm.
  org.apache.nutch.scoring.tld: Top Level Domain scoring plugin.
  org.apache.nutch.scoring.urlmeta: URL Meta Tag scoring plugin.
  org.apache.nutch.urlfilter.api: Generic URL filter library, abstracting away from regular expression implementations.
  org.apache.nutch.urlfilter.automaton: URL filter plugin based on dk.brics.automaton Finite-State Automata for Java(TM).
  org.apache.nutch.urlfilter.domain: URL filter plugin to include only URLs which match an element in a given list of domain suffixes, domain names, and/or host names.
  org.apache.nutch.urlfilter.domainblacklist: URL filter plugin to exclude URLs by domain suffixes, domain names, and/or host names.
  org.apache.nutch.urlfilter.prefix: URL filter plugin to include only URLs which match one of a given list of URL prefixes.
  org.apache.nutch.urlfilter.regex: URL filter plugin to include and/or exclude URLs matching Java regular expressions.
  org.apache.nutch.urlfilter.suffix: URL filter plugin to either exclude or include only URLs which match one of the given (path) suffixes.
  org.apache.nutch.urlfilter.validator: URL filter plugin that validates given URLs.
  org.creativecommons.nutch: Sample plugins that parse and index Creative Commons metadata.
Classes in org.apache.nutch.plugin used by org.apache.nutch.analysis.lang
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.collection
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.anchor
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.basic
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.feed
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.metadata
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.more
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.staticfield
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.subcollection
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.tld
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexer.urlmeta
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexwriter.dummy
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexwriter.elastic
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.indexwriter.solr
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.microformats.reltag
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.net
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse
  Extension: An Extension is a kind of listener descriptor that will be installed on a concrete ExtensionPoint that acts as a kind of Publisher.
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.ext
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.feed
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.headings
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.html
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.js
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.metatags
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.swf
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.tika
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.parse.zip
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.plugin
  Extension: An Extension is a kind of listener descriptor that will be installed on a concrete ExtensionPoint that acts as a kind of Publisher.
  ExtensionPoint: The ExtensionPoint provides meta information about an extension point.
  Plugin: A nutch-plugin is a container for a set of custom logic that provides extensions to the Nutch core functionality or to another plugin that provides an API for extending.
  PluginClassLoader: The PluginClassLoader contains only classes of the runtime libraries set up in the plugin manifest file and the exported libraries of required plugins.
  PluginDescriptor: The PluginDescriptor provides access to all meta information of a nutch-plugin, as well as to the internationalizable resources and the plugin's own classloader.
  PluginRepository: The plugin repository is a registry of all plugins.
  PluginRuntimeException: A PluginRuntimeException is thrown when an exception occurs in the plugin management.
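The relationship between a registry, an extension point, and the extensions installed on it can be illustrated with a small toy model. The sketch below is not the Nutch API: every type in it (ToyPluggable, ToyUrlFilter, ToyExtension, ToyExtensionPoint, ToyRepository) and the "url-filter" point id are hypothetical names invented for illustration, and it assumes extensions are plain in-memory factories rather than classes loaded from a plugin manifest.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Toy model only: none of these types are the real org.apache.nutch.plugin classes. */
public class PluginSketch {

  /** Hypothetical marker for anything that can be plugged in. */
  interface ToyPluggable {}

  /** Hypothetical pluggable capability: a URL filter. */
  interface ToyUrlFilter extends ToyPluggable {
    /** Returns the URL to keep it, or null to reject it. */
    String filter(String url);
  }

  /** Descriptor for one implementation installed on an extension point. */
  static final class ToyExtension {
    final String id;
    private final Supplier<ToyPluggable> factory;

    ToyExtension(String id, Supplier<ToyPluggable> factory) {
      this.id = id;
      this.factory = factory;
    }

    /** Creates the object that does the actual work. */
    ToyPluggable newInstance() {
      return factory.get();
    }
  }

  /** Meta information about an extension point plus the extensions installed on it. */
  static final class ToyExtensionPoint {
    final String id;
    final List<ToyExtension> extensions = new ArrayList<>();

    ToyExtensionPoint(String id) {
      this.id = id;
    }
  }

  /** Registry of extension points, keyed by id. */
  static final class ToyRepository {
    private final Map<String, ToyExtensionPoint> points = new LinkedHashMap<>();

    void install(String pointId, ToyExtension extension) {
      points.computeIfAbsent(pointId, ToyExtensionPoint::new).extensions.add(extension);
    }

    ToyExtensionPoint getExtensionPoint(String pointId) {
      return points.get(pointId);
    }
  }

  /** Runs a URL through every filter installed on the hypothetical "url-filter" point. */
  static String applyFilters(ToyRepository repo, String url) {
    ToyExtensionPoint point = repo.getExtensionPoint("url-filter");
    if (point == null) {
      return url; // nothing installed, nothing to filter
    }
    String current = url;
    for (ToyExtension extension : point.extensions) {
      current = ((ToyUrlFilter) extension.newInstance()).filter(current);
      if (current == null) {
        return null; // rejected by this filter
      }
    }
    return current;
  }

  /** Builds a repository with two illustrative filters. */
  static ToyRepository sampleRepository() {
    ToyRepository repo = new ToyRepository();
    repo.install("url-filter", new ToyExtension("https-only",
        () -> (ToyUrlFilter) u -> u.startsWith("https://") ? u : null));
    repo.install("url-filter", new ToyExtension("no-pdf",
        () -> (ToyUrlFilter) u -> u.endsWith(".pdf") ? null : u));
    return repo;
  }

  public static void main(String[] args) {
    ToyRepository repo = sampleRepository();
    System.out.println(applyFilters(repo, "https://example.org/index.html"));
    System.out.println(applyFilters(repo, "http://example.org/index.html"));
  }
}
```

In this model the extension point plays the publisher role and each installed extension is a listener descriptor that is instantiated only when a caller asks for it. The real classes additionally deal with manifests, dependencies, and class loading, which the sketch leaves out.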
Classes in org.apache.nutch.plugin used by org.apache.nutch.protocol
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.protocol.file
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.protocol.ftp
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.protocol.http
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.protocol.http.api
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.scoring
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.scoring.depth
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.scoring.link
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.scoring.opic
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.scoring.tld
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.scoring.urlmeta
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.api
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.automaton
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.domain
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.domainblacklist
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.prefix
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.regex
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.suffix
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.apache.nutch.urlfilter.validator
  Pluggable: Defines the capability of a class to be plugged into Nutch.
Classes in org.apache.nutch.plugin used by org.creativecommons.nutch
  Pluggable: Defines the capability of a class to be plugged into Nutch.