org.apache.nutch.tools.arc
Class ArcInputFormat
- java.lang.Object
- org.apache.hadoop.mapred.FileInputFormat
- org.apache.nutch.tools.arc.ArcInputFormat
- All Implemented Interfaces:
- org.apache.hadoop.mapred.InputFormat
public class ArcInputFormat extends org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable>
An input format that reads arc files.
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat
org.apache.hadoop.mapred.FileInputFormat.Counter
Field Summary
-
Fields inherited from class org.apache.hadoop.mapred.FileInputFormat
LOG
Constructor Summary
Constructor and Description
ArcInputFormat()
Method Summary
Modifier and Type: org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable>
Method: getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)
Description: Returns the RecordReader for reading the arc file.
-
Methods inherited from class org.apache.hadoop.mapred.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
-
ArcInputFormat
public ArcInputFormat()
Method Detail
-
getRecordReader
public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable> getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter) throws IOException
Returns the RecordReader for reading the arc file.
- Specified by:
- <code>getRecordReader</code> in interface <code>org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable></code>
- Specified by:
- <code>getRecordReader</code> in class <code>org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable></code>
- Parameters:
- <code>split</code> - The InputSplit of the arc file to process.
- <code>job</code> - The job configuration.
- <code>reporter</code> - The progress reporter.
- Throws:
- <code>IOException</code>
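The reader returned by getRecordReader is normally driven by the framework, but the consumption pattern of the old mapred API can be sketched in isolation. The example below is a minimal, self-contained illustration and does not use Hadoop or Nutch classes: the RecordReader interface, the Holder type, and InMemoryReader are hypothetical stand-ins that only mimic the create-once, refill-per-record contract (createKey, createValue, next, close). What the key and value of a real arc record contain is not stated on this page, so the record names used here are placeholders.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ArcReaderLoopSketch {

    /** Stand-in mimicking the shape of the old mapred RecordReader (illustration only). */
    interface RecordReader<K, V> {
        K createKey();
        V createValue();
        boolean next(K key, V value) throws IOException;
        void close() throws IOException;
    }

    /** Mutable holder standing in for a reusable object such as Text or BytesWritable. */
    static final class Holder<T> {
        T value;
    }

    /** Hypothetical in-memory reader; a real reader would pull records from the InputSplit. */
    static final class InMemoryReader implements RecordReader<Holder<String>, Holder<byte[]>> {
        private final Iterator<Map.Entry<String, byte[]>> it;

        InMemoryReader(Map<String, byte[]> records) {
            this.it = records.entrySet().iterator();
        }

        public Holder<String> createKey() { return new Holder<>(); }

        public Holder<byte[]> createValue() { return new Holder<>(); }

        public boolean next(Holder<String> key, Holder<byte[]> value) {
            if (!it.hasNext()) {
                return false; // no more records
            }
            Map.Entry<String, byte[]> e = it.next();
            key.value = e.getKey();     // refill the caller's key object
            value.value = e.getValue(); // refill the caller's value object
            return true;
        }

        public void close() { }
    }

    /** Drains a reader, reusing one key and one value object, and returns "key:length" entries. */
    static List<String> drain(RecordReader<Holder<String>, Holder<byte[]>> reader) throws IOException {
        List<String> out = new ArrayList<>();
        Holder<String> key = reader.createKey();
        Holder<byte[]> value = reader.createValue();
        try {
            while (reader.next(key, value)) {
                out.add(key.value + ":" + value.value.length);
            }
        } finally {
            reader.close(); // always release the underlying resource
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> records = new LinkedHashMap<>();
        records.put("record-1", "abc".getBytes(StandardCharsets.UTF_8));
        records.put("record-2", "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(drain(new InMemoryReader(records))); // prints [record-1:3, record-2:5]
    }
}
```

In an actual job the key and value types would be org.apache.hadoop.io.Text and org.apache.hadoop.io.BytesWritable, as declared in the signature above, and the split, job configuration, and reporter would be supplied by the framework rather than constructed by hand.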