org.apache.nutch.segment
Class SegmentMergeFilters
- java.lang.Object
- org.apache.nutch.segment.SegmentMergeFilters
public class SegmentMergeFilters extends Object
This class wraps all SegmentMergeFilter extensions in a single object so that they are easier to operate on. If any of the extensions returns false, this class returns false as well.
Constructor Summary
Constructor and Description
SegmentMergeFilters(org.apache.hadoop.conf.Configuration conf)
Method Summary
Modifier and Type: boolean
Method and Description: filter(org.apache.hadoop.io.Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)
Iterates over all SegmentMergeFilter extensions and returns false if any of them returns false.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
SegmentMergeFilters
public SegmentMergeFilters(org.apache.hadoop.conf.Configuration conf)
Method Detail
filter
public boolean filter(org.apache.hadoop.io.Text key, CrawlDatum generateData, CrawlDatum fetchData, CrawlDatum sigData, Content content, ParseData parseData, ParseText parseText, Collection<CrawlDatum> linked)
Iterates over all SegmentMergeFilter extensions and returns false if any of them returns false.
Returns:
true if the values for this key (URL) should be merged into the new segment.
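The combining rule described for filter can be illustrated with a minimal, self-contained sketch. It does not use the Nutch API: KeyFilter and filterAll are hypothetical stand-ins that take only a String key, whereas the real method also receives the CrawlDatum, Content, ParseData, ParseText and linked arguments. Returning on the first rejection is an assumption; this page only states that the result is false if any extension returns false.

```java
import java.util.Arrays;
import java.util.List;

public class MergeFilterSketch {

    /** Hypothetical, simplified stand-in for a SegmentMergeFilter extension. */
    public interface KeyFilter {
        boolean filter(String key);
    }

    /** Returns false if any filter rejects the key; true otherwise. */
    public static boolean filterAll(List<KeyFilter> filters, String key) {
        for (KeyFilter f : filters) {
            if (!f.filter(key)) {
                return false; // a single rejection excludes the key from the merge
            }
        }
        return true; // every filter accepted the key (or no filters are registered)
    }

    public static void main(String[] args) {
        // Two illustrative filters: keep http(s) URLs, drop PDF documents.
        List<KeyFilter> filters = Arrays.asList(
            key -> key.startsWith("http"),
            key -> !key.endsWith(".pdf"));

        System.out.println(filterAll(filters, "http://example.com/index.html")); // true
        System.out.println(filterAll(filters, "http://example.com/file.pdf"));   // false
    }
}
```

With this rule, an empty set of filters accepts every key, so a configuration without any hypothetical KeyFilter entries leaves the merge unfiltered in the sketch.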