Class ArcInputFormat

org.apache.nutch.tools.arc

java.lang.Object
- org.apache.hadoop.mapred.FileInputFormat
- org.apache.nutch.tools.arc.ArcInputFormat

- All Implemented Interfaces:
- org.apache.hadoop.mapred.InputFormat

public class ArcInputFormat
extends org.apache.hadoop.mapred.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable>

An input format that reads arc files.

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapred.FileInputFormat

org.apache.hadoop.mapred.FileInputFormat.Counter

Field Summary

Fields inherited from class org.apache.hadoop.mapred.FileInputFormat

LOG

Constructor Summary

| Constructor and Description |
| --- |
| <code>ArcInputFormat()</code> |

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| <code>org.apache.hadoop.mapred.RecordReader</code> | <code>getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)</code> Returns the RecordReader for reading the arc file. |

Methods inherited from class org.apache.hadoop.mapred.FileInputFormat

addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail

ArcInputFormat

public ArcInputFormat()

Method Detail

getRecordReader

public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable> getRecordReader(org.apache.hadoop.mapred.InputSplit split,
        org.apache.hadoop.mapred.JobConf job,
        org.apache.hadoop.mapred.Reporter reporter)
    throws IOException

Returns the RecordReader for reading the arc file.

  - Specified by: <code>getRecordReader</code> in interface <code>org.apache.hadoop.mapred.InputFormat&lt;org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable&gt;</code>
  - Specified by: <code>getRecordReader</code> in class <code>org.apache.hadoop.mapred.FileInputFormat&lt;org.apache.hadoop.io.Text,org.apache.hadoop.io.BytesWritable&gt;</code>
  - Parameters:
    - <code>split</code> - The InputSplit of the arc file to process.
    - <code>job</code> - The job configuration.
    - <code>reporter</code> - The progress reporter.
  - Throws: <code>IOException</code>
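
The reader returned by <code>getRecordReader</code> is typically consumed with the create/next/close loop of the <code>org.apache.hadoop.mapred.RecordReader</code> interface. The following is a minimal sketch of that loop only, written against plain-Java stand-ins so that it runs without Hadoop on the classpath. <code>SimpleRecordReader</code>, <code>Bytes</code>, <code>inMemoryReader</code>, and <code>drain</code> are hypothetical names introduced for illustration and are not part of Nutch or Hadoop. The sample keys and values are placeholders and make no claim about what an arc record's key or value actually contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ArcReaderSketch {

    // Hypothetical stand-in for org.apache.hadoop.mapred.RecordReader<K, V>.
    // Only the methods used by the loop below are modeled; the real interface
    // also declares getPos() and getProgress(), and its methods throw IOException.
    public interface SimpleRecordReader<K, V> {
        K createKey();
        V createValue();
        boolean next(K key, V value); // fills key and value; false at end of input
        void close();
    }

    // Hypothetical mutable holder standing in for BytesWritable.
    public static final class Bytes {
        public byte[] data = new byte[0];
    }

    // In-memory reader standing in for whatever getRecordReader(...) returns.
    public static SimpleRecordReader<StringBuilder, Bytes> inMemoryReader(
            List<String> keys, List<byte[]> values) {
        Iterator<String> k = keys.iterator();
        Iterator<byte[]> v = values.iterator();
        return new SimpleRecordReader<StringBuilder, Bytes>() {
            public StringBuilder createKey() { return new StringBuilder(); }
            public Bytes createValue() { return new Bytes(); }
            public boolean next(StringBuilder key, Bytes value) {
                if (!k.hasNext() || !v.hasNext()) {
                    return false;
                }
                key.setLength(0);      // the key object is reused between records
                key.append(k.next());
                value.data = v.next(); // the value object is reused as well
                return true;
            }
            public void close() { }
        };
    }

    // The consumption loop: allocate key and value once, refill them per record,
    // and always close the reader.
    public static List<String> drain(SimpleRecordReader<StringBuilder, Bytes> reader) {
        List<String> seen = new ArrayList<>();
        StringBuilder key = reader.createKey();
        Bytes value = reader.createValue();
        try {
            while (reader.next(key, value)) {
                seen.add(key + " -> " + value.data.length + " bytes");
            }
        } finally {
            reader.close();
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("record-1", "record-2");
        List<byte[]> values = Arrays.asList(new byte[3], new byte[5]);
        for (String line : drain(inMemoryReader(keys, values))) {
            System.out.println(line);
        }
    }
}
```

With Hadoop and Nutch available, the same loop would presumably use <code>org.apache.hadoop.io.Text</code> and <code>org.apache.hadoop.io.BytesWritable</code> for the key and value, and would obtain the reader from <code>getRecordReader(split, job, reporter)</code>. Because the key and value objects are reused across calls to <code>next</code>, copy any data that must outlive the current iteration.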
