org.apache.nutch.tools.arc

Class ArcInputFormat

  • java.lang.Object
    • org.apache.hadoop.mapred.FileInputFormat
      • org.apache.nutch.tools.arc.ArcInputFormat

  • All Implemented Interfaces:
    • org.apache.hadoop.mapred.InputFormat

public class ArcInputFormat
extends org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable>

An input format that reads arc files.

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat

org.apache.hadoop.mapred.FileInputFormat.Counter

Field Summary

Fields inherited from class org.apache.hadoop.mapred.FileInputFormat

LOG

Constructor Summary

Constructors

| Constructor and Description |
| --- |
| ArcInputFormat() |

Method Summary

Methods

| Modifier and Type | Method and Description |
| --- | --- |
| org.apache.hadoop.mapred.RecordReader | getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter): Returns the RecordReader for reading the arc file. |

Methods inherited from class org.apache.hadoop.mapred.FileInputFormat

addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail

ArcInputFormat

public ArcInputFormat()

Method Detail

getRecordReader

public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable> getRecordReader(
        org.apache.hadoop.mapred.InputSplit split,
        org.apache.hadoop.mapred.JobConf job,
        org.apache.hadoop.mapred.Reporter reporter)
    throws IOException

Returns the RecordReader for reading the arc file.

  - Specified by:
    - <code>getRecordReader</code> in interface <code>org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable></code>
  - Specified by:
    - <code>getRecordReader</code> in class <code>org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable></code>
  - Parameters:
    - <code>split</code> - The InputSplit of the arc file to process.
    - <code>job</code> - The job configuration.
    - <code>reporter</code> - The progress reporter.
  - Throws:
    - <code>IOException</code>
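
The following is a minimal sketch of how this input format might be wired into a job and how the reader returned by getRecordReader could be consumed. It assumes the Hadoop `mapred` API and Nutch are on the classpath. The class name `ArcReadSketch`, the job name, and the `/data/arc` path are hypothetical. In a real job the framework calls getRecordReader itself; the direct call here is only to illustrate the key and value types (Text and BytesWritable). What each key and value holds is defined by the record reader and is not specified on this page.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;
import org.apache.nutch.tools.arc.ArcInputFormat;

// Hypothetical helper class, not part of Nutch.
public class ArcReadSketch {

  /** Builds a JobConf that uses ArcInputFormat over the given input directory. */
  public static JobConf configure(String arcDir) {
    JobConf job = new JobConf();
    job.setJobName("arc-read-sketch");            // hypothetical job name
    job.setInputFormat(ArcInputFormat.class);     // Text keys, BytesWritable values
    FileInputFormat.setInputPaths(job, new Path(arcDir));
    return job;
  }

  /** Counts the records that the reader yields for a single split. */
  public static long countRecords(ArcInputFormat format, InputSplit split, JobConf job)
      throws IOException {
    // Reporter.NULL is a no-op reporter, which is enough outside a running task.
    RecordReader<Text, BytesWritable> reader =
        format.getRecordReader(split, job, Reporter.NULL);
    long count = 0;
    try {
      Text key = reader.createKey();
      BytesWritable value = reader.createValue();
      // The same key and value objects are refilled on each call to next().
      while (reader.next(key, value)) {
        count++;
      }
    } finally {
      reader.close();
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    // "/data/arc" is a hypothetical default location for arc files.
    JobConf job = configure(args.length > 0 ? args[0] : "/data/arc");
    ArcInputFormat format = new ArcInputFormat();
    long total = 0;
    // getSplits is inherited from FileInputFormat; the second argument is only a hint.
    for (InputSplit split : format.getSplits(job, 1)) {
      total += countRecords(format, split, job);
    }
    System.out.println("records read: " + total);
  }
}
```

In a MapReduce job only the `configure` step is normally needed; a Mapper declared with Text keys and BytesWritable values then receives the records.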

Copyright © 2014 The Apache Software Foundation