Uses of Package
org.apache.nutch.crawl
Packages that use org.apache.nutch.crawl
Package: Description
org.apache.nutch.analysis.lang: Text document language identifier.
org.apache.nutch.crawl: Crawl control code and tools to run the crawler.
org.apache.nutch.fetcher: The Nutch robot.
org.apache.nutch.indexer: Index content, configure and run indexing and cleaning jobs to add, update, and delete documents from an index.
org.apache.nutch.indexer.anchor: An indexing plugin for inbound anchor text.
org.apache.nutch.indexer.basic: A basic indexing plugin, adds basic fields: url, host, title, content, etc.
org.apache.nutch.indexer.feed: Indexing filter to index meta data from RSS feeds.
org.apache.nutch.indexer.metadata: Indexing filter to add document metadata to the index.
org.apache.nutch.indexer.more: A more indexing plugin, adds "more" index fields: last modified date, MIME type, content length.
org.apache.nutch.indexer.staticfield: A simple plugin called at indexing that adds fields with static data.
org.apache.nutch.indexer.subcollection: Indexing filter to assign documents to subcollections.
org.apache.nutch.indexer.tld: Top Level Domain Indexing plugin.
org.apache.nutch.indexer.urlmeta: URL Meta Tag Indexing Plugin
org.apache.nutch.metadata: A Multi-valued Metadata container, and set of constant fields for Nutch Metadata.
org.apache.nutch.microformats.reltag: A microformats Rel-Tag Parser/Indexer/Querier plugin.
org.apache.nutch.protocol: Classes related to the Protocol interface, see also org.apache.nutch.net.protocols.
org.apache.nutch.protocol.file: Protocol plugin which supports retrieving local file resources.
org.apache.nutch.protocol.ftp: Protocol plugin which supports retrieving documents via the ftp protocol.
org.apache.nutch.protocol.http: Protocol plugin which supports retrieving documents via the http protocol.
org.apache.nutch.protocol.http.api: Common API used by HTTP plugins (http, httpclient)
org.apache.nutch.scoring: The ScoringFilter interface.
org.apache.nutch.scoring.depth: Scoring filter to stop crawling at a configurable depth (number of "hops" from seed URLs).
org.apache.nutch.scoring.link: Scoring filter used in conjunction with WebGraph.
org.apache.nutch.scoring.opic: Scoring filter implementing a variant of the Online Page Importance Computation (OPIC) algorithm.
org.apache.nutch.scoring.tld: Top Level Domain Scoring plugin.
org.apache.nutch.scoring.urlmeta: URL Meta Tag Scoring Plugin
org.apache.nutch.scoring.webgraph: Scoring implementation based on link analysis (LinkRank), see WebGraph.
org.apache.nutch.segment: A segment stores all data from one generate/fetch/update cycle: fetch list, protocol status, raw content, parsed content, and extracted outgoing links.
org.apache.nutch.tools: Miscellaneous tools.
org.apache.nutch.tools.arc: Tools to read the Arc file format.
org.creativecommons.nutch: Sample plugins that parse and index Creative Commons metadata.
Classes in org.apache.nutch.crawl used by org.apache.nutch.analysis.lang
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.crawl
Class and Description
AbstractFetchSchedule: This class provides common methods for implementations of FetchSchedule.
AdaptiveFetchSchedule: This class implements an adaptive re-fetch algorithm.
CrawlDatum
FetchSchedule: This interface defines the contract for implementations that manipulate fetch times and re-fetch intervals.
Generator.SelectorEntry
Inlink
Inlinks: A list of Inlinks.
MapWritable: Deprecated. Use org.apache.hadoop.io.MapWritable instead.

Classes in org.apache.nutch.crawl used by org.apache.nutch.fetcher
Class and Description
CrawlDatum
NutchWritable

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.
NutchWritable

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.anchor
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.basic
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.feed
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.metadata
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.more
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.staticfield
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.subcollection
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.tld
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.indexer.urlmeta
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.metadata
Class and Description
NutchWritable

Classes in org.apache.nutch.crawl used by org.apache.nutch.microformats.reltag
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.protocol
Class and Description
CrawlDatum

Classes in org.apache.nutch.crawl used by org.apache.nutch.protocol.file
Class and Description
CrawlDatum

Classes in org.apache.nutch.crawl used by org.apache.nutch.protocol.ftp
Class and Description
CrawlDatum

Classes in org.apache.nutch.crawl used by org.apache.nutch.protocol.http
Class and Description
CrawlDatum

Classes in org.apache.nutch.crawl used by org.apache.nutch.protocol.http.api
Class and Description
CrawlDatum

Classes in org.apache.nutch.crawl used by org.apache.nutch.scoring
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.scoring.depth
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.scoring.link
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.scoring.opic
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.scoring.tld
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.scoring.urlmeta
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.

Classes in org.apache.nutch.crawl used by org.apache.nutch.scoring.webgraph
Class and Description
CrawlDatum
NutchWritable

Classes in org.apache.nutch.crawl used by org.apache.nutch.segment
Class and Description
CrawlDatum
NutchWritable

Classes in org.apache.nutch.crawl used by org.apache.nutch.tools
Class and Description
CrawlDatum
Generator.SelectorEntry

Classes in org.apache.nutch.crawl used by org.apache.nutch.tools.arc
Class and Description
NutchWritable

Classes in org.apache.nutch.crawl used by org.creativecommons.nutch
Class and Description
CrawlDatum
Inlinks: A list of Inlinks.