Uses of Class

org.apache.nutch.indexer.NutchDocument

Uses of NutchDocument in org.apache.nutch.analysis.lang

Methods in org.apache.nutch.analysis.lang that return NutchDocument:

  • NutchDocument LanguageIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Methods in org.apache.nutch.analysis.lang with parameters of type NutchDocument:

  • NutchDocument LanguageIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Uses of NutchDocument in org.apache.nutch.indexer

Fields in org.apache.nutch.indexer declared as NutchDocument:

  • NutchDocument NutchIndexAction.doc

Methods in org.apache.nutch.indexer that return NutchDocument:

  • NutchDocument IndexingFilters.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Run all defined filters.
  • NutchDocument IndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Adds fields or otherwise modifies the document that will be indexed for a parse.

Methods in org.apache.nutch.indexer with parameters of type NutchDocument:

  • NutchDocument IndexingFilters.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Run all defined filters.
  • NutchDocument IndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Adds fields or otherwise modifies the document that will be indexed for a parse.
  • void IndexWriters.update(NutchDocument doc)
  • void IndexWriter.update(NutchDocument doc)
  • void IndexWriters.write(NutchDocument doc)
  • void IndexWriter.write(NutchDocument doc)
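The relationship between IndexingFilters.filter ("Run all defined filters") and IndexingFilter.filter (which adds fields to or otherwise modifies the document) can be illustrated with a minimal sketch. The types below are hypothetical stand-ins, not the Nutch classes: Doc replaces NutchDocument, Filter replaces IndexingFilter with the signature reduced to the document and a URL string, and treating a null return as "discard the document" is an assumption of this sketch rather than something this page states.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: Doc and Filter are hypothetical placeholders, not Nutch types.
final class FilterChainSketch {

    /** Stand-in for NutchDocument: maps a field name to its values. */
    static final class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Stand-in for IndexingFilter, reduced to the document and its URL. */
    interface Filter {
        Doc filter(Doc doc, String url);
    }

    /**
     * Passes the document through every filter in order.
     * Assumption: a filter returning null means the document is dropped.
     */
    static Doc runAll(List<Filter> filters, Doc doc, String url) {
        for (Filter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null;
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Filter> filters = new ArrayList<>();
        // Each filter adds one field and hands the document on.
        filters.add((d, u) -> { d.add("url", u); return d; });
        filters.add((d, u) -> { d.add("host", java.net.URI.create(u).getHost()); return d; });

        Doc out = runAll(filters, new Doc(), "http://example.org/page");
        System.out.println(out.fields);
    }
}
```

Returning the document from each filter, rather than mutating it silently, lets a filter replace or veto the document; that is the reason the same method appears above both as returning NutchDocument and as taking a NutchDocument parameter.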

Constructors in org.apache.nutch.indexer with parameters of type NutchDocument:

  • NutchIndexAction(NutchDocument doc, byte action)

Uses of NutchDocument in org.apache.nutch.indexer.anchor

Methods in org.apache.nutch.indexer.anchor that return NutchDocument:

  • NutchDocument AnchorIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    The AnchorIndexingFilter filter object which supports boolean configuration settings for the deduplication of anchors.

Methods in org.apache.nutch.indexer.anchor with parameters of type NutchDocument:

  • NutchDocument AnchorIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    The AnchorIndexingFilter filter object which supports boolean configuration settings for the deduplication of anchors.
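As a rough illustration of what a boolean deduplication setting for anchors might do, the sketch below filters a list of anchor texts before they would be added to a document. Everything here is an assumption made for illustration: the class and method names are hypothetical, and comparing anchors case-insensitively is one plausible choice, not behavior confirmed by this page.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Sketch only: names and matching rule are hypothetical, not the Nutch implementation.
final class AnchorDedupSketch {

    /**
     * Returns the anchor texts that would be indexed.
     * When deduplicate is true, later anchors that match an earlier one
     * (ignoring case) are skipped; the first spelling seen is kept.
     */
    static List<String> anchorsToIndex(List<String> anchors, boolean deduplicate) {
        List<String> kept = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String anchor : anchors) {
            if (anchor == null || anchor.isEmpty()) {
                continue; // nothing to index
            }
            // Set.add returns false when the lower-cased anchor was already present.
            if (deduplicate && !seen.add(anchor.toLowerCase(Locale.ROOT))) {
                continue;
            }
            kept.add(anchor);
        }
        return kept;
    }
}
```

With the flag off, every non-empty anchor is kept, so repeated link texts remain in the list and would each be indexed.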

Uses of NutchDocument in org.apache.nutch.indexer.basic

Methods in org.apache.nutch.indexer.basic that return NutchDocument:

  • NutchDocument BasicIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    The BasicIndexingFilter filter object which supports a few configuration settings for adding basic searchable fields.

Methods in org.apache.nutch.indexer.basic with parameters of type NutchDocument:

  • NutchDocument BasicIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    The BasicIndexingFilter filter object which supports a few configuration settings for adding basic searchable fields.

Uses of NutchDocument in org.apache.nutch.indexer.feed

Methods in org.apache.nutch.indexer.feed that return NutchDocument:

  • NutchDocument FeedIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch index.

Methods in org.apache.nutch.indexer.feed with parameters of type NutchDocument:

  • NutchDocument FeedIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch index.

Uses of NutchDocument in org.apache.nutch.indexer.metadata

Methods in org.apache.nutch.indexer.metadata that return NutchDocument:

  • NutchDocument MetadataIndexer.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Methods in org.apache.nutch.indexer.metadata with parameters of type NutchDocument:

  • NutchDocument MetadataIndexer.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Uses of NutchDocument in org.apache.nutch.indexer.more

Methods in org.apache.nutch.indexer.more that return NutchDocument:

  • NutchDocument MoreIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Methods in org.apache.nutch.indexer.more with parameters of type NutchDocument:

  • NutchDocument MoreIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Uses of NutchDocument in org.apache.nutch.indexer.staticfield

Methods in org.apache.nutch.indexer.staticfield that return NutchDocument:

  • NutchDocument StaticFieldIndexer.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    The StaticFieldIndexer filter object which adds fields as per configuration setting.

Methods in org.apache.nutch.indexer.staticfield with parameters of type NutchDocument:

  • NutchDocument StaticFieldIndexer.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    The StaticFieldIndexer filter object which adds fields as per configuration setting.

Uses of NutchDocument in org.apache.nutch.indexer.subcollection

Methods in org.apache.nutch.indexer.subcollection that return NutchDocument:

  • NutchDocument SubcollectionIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Methods in org.apache.nutch.indexer.subcollection with parameters of type NutchDocument:

  • NutchDocument SubcollectionIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Uses of NutchDocument in org.apache.nutch.indexer.tld

Methods in org.apache.nutch.indexer.tld that return NutchDocument:

  • NutchDocument TLDIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text urlText, CrawlDatum datum, Inlinks inlinks)

Methods in org.apache.nutch.indexer.tld with parameters of type NutchDocument:

  • NutchDocument TLDIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text urlText, CrawlDatum datum, Inlinks inlinks)

Uses of NutchDocument in org.apache.nutch.indexer.urlmeta

Methods in org.apache.nutch.indexer.urlmeta that return NutchDocument:

  • NutchDocument URLMetaIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Takes the metatags listed in the "urlmeta.tags" property and looks for them inside the CrawlDatum object.

Methods in org.apache.nutch.indexer.urlmeta with parameters of type NutchDocument:

  • NutchDocument URLMetaIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)
    Takes the metatags listed in the "urlmeta.tags" property and looks for them inside the CrawlDatum object.

Uses of NutchDocument in org.apache.nutch.indexwriter.dummy

Methods in org.apache.nutch.indexwriter.dummy with parameters of type NutchDocument:

  • void DummyIndexWriter.update(NutchDocument doc)
  • void DummyIndexWriter.write(NutchDocument doc)

Uses of NutchDocument in org.apache.nutch.indexwriter.elastic

Methods in org.apache.nutch.indexwriter.elastic with parameters of type NutchDocument:

  • void ElasticIndexWriter.update(NutchDocument doc)
  • void ElasticIndexWriter.write(NutchDocument doc)

Uses of NutchDocument in org.apache.nutch.indexwriter.solr

Methods in org.apache.nutch.indexwriter.solr with parameters of type NutchDocument:

  • void SolrIndexWriter.update(NutchDocument doc)
  • void SolrIndexWriter.write(NutchDocument doc)

Uses of NutchDocument in org.apache.nutch.microformats.reltag

Methods in org.apache.nutch.microformats.reltag that return NutchDocument:

  • NutchDocument RelTagIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Methods in org.apache.nutch.microformats.reltag with parameters of type NutchDocument:

  • NutchDocument RelTagIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Uses of NutchDocument in org.apache.nutch.scoring

Methods in org.apache.nutch.scoring with parameters of type NutchDocument:

  • float ScoringFilters.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
  • float ScoringFilter.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
    This method calculates a Lucene document boost.
  • float AbstractScoringFilter.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)

Uses of NutchDocument in org.apache.nutch.scoring.depth

Methods in org.apache.nutch.scoring.depth with parameters of type NutchDocument:

  • float DepthScoringFilter.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)

Uses of NutchDocument in org.apache.nutch.scoring.link

Methods in org.apache.nutch.scoring.link with parameters of type NutchDocument:

  • float LinkAnalysisScoringFilter.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)

Uses of NutchDocument in org.apache.nutch.scoring.opic

Methods in org.apache.nutch.scoring.opic with parameters of type NutchDocument:

  • float OPICScoringFilter.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
    Dampen the boost value by scorePower.
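"Dampen the boost value by scorePower" can be read as raising the stored page score to a power below 1 before combining it with the incoming score. The sketch below shows that arithmetic in isolation. It is an assumption about the formula, not a copy of OPICScoringFilter: the class name is hypothetical, and dbScore stands in for the score that the real method would read from the dbDatum argument.

```java
// Sketch only: a standalone illustration of power-law dampening, not Nutch code.
final class BoostDampeningSketch {

    /**
     * Computes a dampened boost.
     *
     * @param dbScore    score stored for the page (assumed to come from dbDatum)
     * @param scorePower exponent; values below 1 compress large scores
     * @param initScore  score handed in from the previous scoring filter
     */
    static float indexerScore(float dbScore, float scorePower, float initScore) {
        return (float) Math.pow(dbScore, scorePower) * initScore;
    }

    public static void main(String[] args) {
        // With scorePower = 0.5, a 100x difference in raw score
        // becomes a 10x difference in boost.
        System.out.println(indexerScore(1.0f, 0.5f, 1.0f));
        System.out.println(indexerScore(100.0f, 0.5f, 1.0f));
    }
}
```

Multiplying by initScore keeps the result composable: when several scoring filters run in sequence, each one scales the value produced by the one before it.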

Uses of NutchDocument in org.apache.nutch.scoring.tld

Methods in org.apache.nutch.scoring.tld with parameters of type NutchDocument:

  • float TLDScoringFilter.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)

Uses of NutchDocument in org.apache.nutch.scoring.urlmeta

Methods in org.apache.nutch.scoring.urlmeta with parameters of type NutchDocument:

  • float URLMetaScoringFilter.indexerScore(org.apache.hadoop.io.Text url, NutchDocument doc, CrawlDatum dbDatum, CrawlDatum fetchDatum, Parse parse, Inlinks inlinks, float initScore)
    Boilerplate

Uses of NutchDocument in org.creativecommons.nutch

Methods in org.creativecommons.nutch that return NutchDocument:

  • NutchDocument CCIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Methods in org.creativecommons.nutch with parameters of type NutchDocument:

  • void CCIndexingFilter.addUrlFeatures(NutchDocument doc, String urlString)
    Add the features represented by a license URL.
  • NutchDocument CCIndexingFilter.filter(NutchDocument doc, Parse parse, org.apache.hadoop.io.Text url, CrawlDatum datum, Inlinks inlinks)

Copyright © 2014 The Apache Software Foundation