
org.apache.nutch.scoring.webgraph

Class NodeDumper.Sorter

  • java.lang.Object
    • org.apache.hadoop.conf.Configured
      • org.apache.nutch.scoring.webgraph.NodeDumper.Sorter
    • All Implemented Interfaces:
    • Closeable, AutoCloseable, org.apache.hadoop.conf.Configurable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper, org.apache.hadoop.mapred.Reducer
    • Enclosing class:
    • NodeDumper

public static class NodeDumper.Sorter
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.Text,Node,org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Text>, org.apache.hadoop.mapred.Reducer<org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.FloatWritable>

Outputs the top URLs sorted in descending order. Depending on the flag set on the command line, the URLs are ranked by number of inlinks, by number of outlinks, or by link analysis score.
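As a rough illustration of the behavior described above, the following plain-Java sketch ranks URLs by a single numeric value in descending order and keeps the first N. It does not use Hadoop or the Nutch Node class. The names TopUrlsSketch and topUrls and the sample data are hypothetical, invented for this example only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the "top URLs, descending" behavior; not Nutch code.
class TopUrlsSketch {

    /** Returns up to topN entries of urlToValue, highest value first. */
    static List<Map.Entry<String, Float>> topUrls(Map<String, Float> urlToValue, int topN) {
        List<Map.Entry<String, Float>> entries = new ArrayList<>(urlToValue.entrySet());
        entries.sort((a, b) -> {
            // Compare b to a so that larger values come first (descending).
            int byValue = Float.compare(b.getValue(), a.getValue());
            // Break ties by URL so the result is deterministic.
            return byValue != 0 ? byValue : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(topN, entries.size())));
    }

    public static void main(String[] args) {
        // Sample inlink counts (made-up data).
        Map<String, Float> inlinkCounts = new HashMap<>();
        inlinkCounts.put("http://a.example/", 3f);
        inlinkCounts.put("http://b.example/", 7f);
        inlinkCounts.put("http://c.example/", 5f);

        for (Map.Entry<String, Float> e : topUrls(inlinkCounts, 2)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In the real job the ordering is produced by the MapReduce sort between map and reduce rather than by an in-memory list; the sketch only shows the resulting order.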

Constructor Summary

Constructor and Description

  - NodeDumper.Sorter()

Method Summary

Modifier and Type, Method and Description

  - void close()
  - void configure(org.apache.hadoop.mapred.JobConf conf)
    Configures the job: sets the flag for the type of content and the topN number, if any.
  - void map(org.apache.hadoop.io.Text key, Node node, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Text> output, org.apache.hadoop.mapred.Reporter reporter)
    Outputs the URL with its number of inlinks, number of outlinks, or score, as selected by the flag.
  - void reduce(org.apache.hadoop.io.FloatWritable key, Iterator<org.apache.hadoop.io.Text> values, org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.FloatWritable> output, org.apache.hadoop.mapred.Reporter reporter)
    Flips the key and value, collecting each URL with its numeric sort value.


Methods inherited from class org.apache.hadoop.conf.Configured

getConf, setConf


Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail


NodeDumper.Sorter

public NodeDumper.Sorter()

Method Detail


configure

public void configure(org.apache.hadoop.mapred.JobConf conf)

Configures the job: sets the flag for the type of content and the topN number, if any.

  - Specified by:
  - configure in interface org.apache.hadoop.mapred.JobConfigurable

close

public void close()
  - Specified by:
  - close in interface Closeable
  - Specified by:
  - close in interface AutoCloseable

map

public void map(org.apache.hadoop.io.Text key,
       Node node,
       org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Text> output,
       org.apache.hadoop.mapred.Reporter reporter)
         throws IOException

Outputs the URL with its number of inlinks, number of outlinks, or score, as selected by the flag.

  - Specified by:
  - map in interface org.apache.hadoop.mapred.Mapper<org.apache.hadoop.io.Text,Node,org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Text>
  - Throws:
  - IOException
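The map step can be pictured as choosing one number per URL and emitting it as the key, so that the framework sorts by that number. The sketch below is a minimal plain-Java stand-in under stated assumptions: MapStepSketch, Metric, NodeStats, and mapOne are hypothetical names, NodeStats merely imitates the fields a Node would supply, and no Hadoop types are used.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical illustration of the map step; not the Nutch implementation.
class MapStepSketch {

    /** Which value to rank by; stands in for the command-line flag. */
    enum Metric { INLINKS, OUTLINKS, SCORE }

    /** Minimal stand-in for the per-URL data a Node carries. */
    static final class NodeStats {
        final int numInlinks;
        final int numOutlinks;
        final float score;

        NodeStats(int numInlinks, int numOutlinks, float score) {
            this.numInlinks = numInlinks;
            this.numOutlinks = numOutlinks;
            this.score = score;
        }
    }

    /** Emits (sort value, url): the numeric value becomes the key. */
    static Map.Entry<Float, String> mapOne(String url, NodeStats node, Metric metric) {
        float value;
        switch (metric) {
            case INLINKS:
                value = node.numInlinks;
                break;
            case OUTLINKS:
                value = node.numOutlinks;
                break;
            default:
                value = node.score;
                break;
        }
        return new SimpleEntry<>(Float.valueOf(value), url);
    }

    public static void main(String[] args) {
        NodeStats stats = new NodeStats(4, 9, 0.25f); // made-up sample values
        System.out.println(mapOne("http://a.example/", stats, Metric.OUTLINKS));
    }
}
```

Putting the number in the key position matches the declared output types of map (FloatWritable key, Text value). How descending order is then obtained from the sort is not stated on this page.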

reduce

public void reduce(org.apache.hadoop.io.FloatWritable key,
          Iterator<org.apache.hadoop.io.Text> values,
          org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.Text,org.apache.hadoop.io.FloatWritable> output,
          org.apache.hadoop.mapred.Reporter reporter)
            throws IOException

Flips the key and value, collecting each URL with its numeric sort value.

  - Specified by:
  - reduce in interface org.apache.hadoop.mapred.Reducer<org.apache.hadoop.io.FloatWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.FloatWritable>
  - Throws:
  - IOException
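The flip performed by reduce can be sketched as follows: each call receives one numeric key with the URLs that share it, and writes (url, value) pairs until a limit is reached. This is a plain-Java sketch, not the Nutch code. ReduceStepSketch, its topN field, and the tab-separated output strings are hypothetical, and it assumes that groups arrive with the highest value first and that a single instance sees all groups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the reduce step; not the Nutch implementation.
class ReduceStepSketch {

    private final long topN;          // maximum number of URLs to collect
    private long numCollected = 0;    // running count across reduce calls
    private final List<String> output = new ArrayList<>();

    ReduceStepSketch(long topN) {
        this.topN = topN;
    }

    /** Flips (value, urls) into "url<TAB>value" lines, stopping at topN. */
    void reduce(float key, Iterator<String> urls) {
        while (urls.hasNext() && numCollected < topN) {
            output.add(urls.next() + "\t" + key);
            numCollected++;
        }
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        ReduceStepSketch sorter = new ReduceStepSketch(2);
        // Groups are fed in descending key order (an assumption of this sketch).
        sorter.reduce(7.0f, Arrays.asList("http://b.example/").iterator());
        sorter.reduce(5.0f, Arrays.asList("http://c.example/", "http://d.example/").iterator());
        sorter.output().forEach(System.out::println);
    }
}
```

Keeping the counter on the instance rather than inside reduce is what lets the limit apply across keys instead of per key.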

Copyright © 2014 The Apache Software Foundation