
org.apache.nutch.crawl

Class Generator.Selector

    • All Implemented Interfaces:
    • Closeable, AutoCloseable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper, org.apache.hadoop.mapred.Partitioner, org.apache.hadoop.mapred.Reducer
    • Enclosing class:
    • Generator

public static class Generator.Selector
extends Object
implements org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.Text,CrawlDatum,org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry>, org.apache.hadoop.mapred.Partitioner<org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Writable>, org.apache.hadoop.mapred.Reducer<org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry,org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry>

Selects entries due for fetch.

Constructor Summary

Constructors

    • Generator.Selector()

Method Summary

Methods

    • void close()
    • void configure(org.apache.hadoop.mapred.JobConf job)
    • int getPartition(org.apache.hadoop.io.FloatWritable key, org.apache.hadoop.io.Writable value, int numReduceTasks)
      Partition by host / domain or IP.
    • void map(org.apache.hadoop.io.Text key, CrawlDatum value, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry> output, org.apache.hadoop.mapred.Reporter reporter)
      Select & invert subset due for fetch.
    • void reduce(org.apache.hadoop.io.FloatWritable key, Iterator<Generator.SelectorEntry> values, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry> output, org.apache.hadoop.mapred.Reporter reporter)
      Collect until limit is reached.


Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


Generator.Selector

public Generator.Selector()

Method Detail


configure

public void configure(org.apache.hadoop.mapred.JobConf job)
  - Specified by: 
  - <code>configure</code> in interface <code>org.apache.hadoop.mapred.JobConfigurable</code>        

close

public void close()
  - Specified by: 
  - <code>close</code> in interface <code>Closeable</code> 
  - Specified by: 
  - <code>close</code> in interface <code>AutoCloseable</code>        

map

public void map(org.apache.hadoop.io.Text key,
       CrawlDatum value,
       org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry> output,
       org.apache.hadoop.mapred.Reporter reporter)
         throws IOException

Select & invert subset due for fetch.

  - Specified by: 
  - <code>map</code> in interface <code>org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.Text,CrawlDatum,org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry></code>
  - Throws: 
  - <code>IOException</code>       
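
As an illustration of "select & invert", here is a minimal sketch in plain Java with no Hadoop dependency. It assumes that "due for fetch" can be approximated by comparing a next-fetch timestamp with the current time, and that "invert" means the score becomes the output key while the URL moves into the value. The names `SelectInvertSketch`, `Datum`, `Entry`, `Scored`, and `selectAndInvert` are hypothetical stand-ins, not part of the Nutch API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SelectInvertSketch {
    /** Hypothetical stand-in for CrawlDatum: next fetch time and score only. */
    static final class Datum {
        final long fetchTimeMs;
        final float score;
        Datum(long fetchTimeMs, float score) {
            this.fetchTimeMs = fetchTimeMs;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for Generator.SelectorEntry: URL plus its datum. */
    static final class Entry {
        final String url;
        final Datum datum;
        Entry(String url, Datum datum) {
            this.url = url;
            this.datum = datum;
        }
    }

    /** Inverted output record: the score is the key, the entry is the value. */
    static final class Scored {
        final float key;
        final Entry value;
        Scored(float key, Entry value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Keeps only records whose fetch time has passed and emits them keyed by score. */
    static List<Scored> selectAndInvert(Map<String, Datum> crawlDb, long nowMs) {
        List<Scored> out = new ArrayList<>();
        for (Map.Entry<String, Datum> e : crawlDb.entrySet()) {
            Datum d = e.getValue();
            if (d.fetchTimeMs > nowMs) {
                continue; // assumption: not yet due, so not selected
            }
            out.add(new Scored(d.score, new Entry(e.getKey(), d)));
        }
        return out;
    }
}
```

Keying the output by score lets a later sort step order candidates by priority, which is one plausible reason for the inversion.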

getPartition

public int getPartition(org.apache.hadoop.io.FloatWritable key,
               org.apache.hadoop.io.Writable value,
               int numReduceTasks)

Partition by host / domain or IP.

  - Specified by: 
  - <code>getPartition</code> in interface <code>org.apache.hadoop.mapred.Partitioner<org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Writable></code>
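
Partitioning by host can be sketched as follows, using only the JDK. This is not the class's actual implementation: it assumes the URL is read from the value (the key is only a score), covers the host case only, and the names `HostPartitionSketch` and `partitionFor` are hypothetical. Domain or IP modes would swap in a different grouping string.

```java
import java.net.URI;
import java.util.Locale;

class HostPartitionSketch {
    /** Maps a URL to a partition so that all URLs of one host share a reducer. */
    static int partitionFor(String url, int numReduceTasks) {
        String host;
        try {
            host = URI.create(url).getHost(); // may be null for odd URLs
        } catch (IllegalArgumentException e) {
            host = null; // unparsable URL: fall back to the raw string
        }
        String group = (host != null) ? host.toLowerCase(Locale.ROOT) : url;
        // Mask the sign bit so the modulo result is never negative.
        return (group.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }
}
```

With this scheme, `partitionFor("http://example.org/a", 4)` and `partitionFor("http://example.org/b", 4)` return the same partition, so per-host limits can be enforced inside a single reduce task.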

reduce

public void reduce(org.apache.hadoop.io.FloatWritable key,
          Iterator<Generator.SelectorEntry> values,
          org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry> output,
          org.apache.hadoop.mapred.Reporter reporter)
            throws IOException

Collect until limit is reached.

  - Specified by: 
  - <code>reduce</code> in interface <code>org.apache.hadoop.mapred.Reducer<org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry,org.apache.hadoop.io.FloatWritable,Generator.SelectorEntry></code>
  - Throws: 
  - <code>IOException</code>      
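
The idea of "collect until limit is reached" can be sketched with a counter that persists across `reduce` calls. The sketch below is an assumption-laden illustration in plain Java: it models only a single global limit, and the names `CollectUntilLimitSketch`, `limit`, and `count` are hypothetical. Any additional per-host or per-segment rules in the real class are not modeled.

```java
import java.util.Iterator;
import java.util.List;

class CollectUntilLimitSketch {
    private final long limit; // assumed: maximum number of entries to emit
    private long count = 0;   // entries emitted so far, across all calls

    CollectUntilLimitSketch(long limit) {
        this.limit = limit;
    }

    /** Passes values through to the output until the running total hits the limit. */
    <T> void reduce(Iterator<T> values, List<T> output) {
        while (values.hasNext() && count < limit) {
            output.add(values.next());
            count++;
        }
        // Once the limit is reached, remaining values are silently dropped.
    }
}
```

Because the counter lives on the instance rather than inside the method, entries from later keys are dropped once earlier keys have filled the quota.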

Copyright © 2014 The Apache Software Foundation