org.apache.nutch.indexer
Class NutchDocument
- java.lang.Object
- org.apache.nutch.indexer.NutchDocument
- All Implemented Interfaces:
- Iterable<Map.Entry<String,NutchField>>, org.apache.hadoop.io.Writable
public class NutchDocument extends Object implements org.apache.hadoop.io.Writable, Iterable<Map.Entry<String,NutchField>>
A NutchDocument is the unit of indexing.
Field Summary
Fields

| Modifier and Type | Field and Description |
| --- | --- |
| static byte | VERSION |
Constructor Summary
Constructors

| Constructor and Description |
| --- |
| NutchDocument() |
Method Summary
Methods

| Modifier and Type | Method and Description |
| --- | --- |
| void | add(String name, Object value) |
| Metadata | getDocumentMeta() |
| NutchField | getField(String name) |
| Collection<String> | getFieldNames() |
| Object | getFieldValue(String name) |
| float | getWeight() |
| Iterator<Map.Entry<String,NutchField>> | iterator() Iterate over all fields. |
| void | readFields(DataInput in) |
| NutchField | removeField(String name) |
| void | setWeight(float weight) |
| String | toString() |
| void | write(DataOutput out) |
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
-
VERSION
public static final byte VERSION
- See Also:
- [Constant Field Values](../../../../constant-values.html#org.apache.nutch.indexer.NutchDocument.VERSION)
Constructor Detail
-
NutchDocument
public NutchDocument()
Method Detail
-
add
public void add(String name, Object value)
-
getFieldValue
public Object getFieldValue(String name)
-
getField
public NutchField getField(String name)
-
removeField
public NutchField removeField(String name)
-
getFieldNames
public Collection<String> getFieldNames()
-
iterator
public Iterator<Map.Entry<String,NutchField>> iterator()
Iterate over all fields.
- Specified by:
- <code>iterator</code> in interface <code>Iterable<Map.Entry<String,NutchField>></code>
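Because the class implements Iterable<Map.Entry<String,NutchField>>, it can be used directly in an enhanced for loop. The sketch below illustrates that pattern only. FieldBag is a hypothetical stand-in, not the Nutch class: it uses List<Object> in place of NutchField, and its choice to collect repeated names into one list is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in with the same Iterable shape as NutchDocument.
// It is NOT the Nutch implementation; List<Object> replaces NutchField.
final class FieldBag implements Iterable<Map.Entry<String, List<Object>>> {
    // LinkedHashMap keeps insertion order so the output is deterministic.
    private final Map<String, List<Object>> fields = new LinkedHashMap<>();

    // Assumption for this sketch: repeated names accumulate values.
    void add(String name, Object value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    @Override
    public Iterator<Map.Entry<String, List<Object>>> iterator() {
        return fields.entrySet().iterator();
    }

    // Walks any document-like Iterable and renders "name=values;" pairs.
    static String describe(Iterable<Map.Entry<String, List<Object>>> doc) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<Object>> entry : doc) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        FieldBag doc = new FieldBag();
        doc.add("title", "Example page");
        doc.add("tag", "a");
        doc.add("tag", "b");
        System.out.println(describe(doc)); // prints title=[Example page];tag=[a, b];
    }
}
```

With a real NutchDocument the loop header would have the same form, with NutchField as the entry value type.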
-
getWeight
public float getWeight()
-
setWeight
public void setWeight(float weight)
-
getDocumentMeta
public Metadata getDocumentMeta()
-
readFields
public void readFields(DataInput in) throws IOException
- Specified by:
- <code>readFields</code> in interface <code>org.apache.hadoop.io.Writable</code>
- Throws:
- <code>IOException</code>
-
write
public void write(DataOutput out) throws IOException
- Specified by:
- <code>write</code> in interface <code>org.apache.hadoop.io.Writable</code>
- Throws:
- <code>IOException</code>
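The readFields and write methods form a serialization pair: whatever write emits to a DataOutput, readFields is expected to consume from a DataInput in the same order. The following is a minimal sketch of that round trip using only java.io. WeightedRecord is a hypothetical stand-in, and its layout (a version byte followed by a float weight), its VERSION value, and its default weight are assumptions for illustration; this page does not describe the actual wire format of NutchDocument.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in that mirrors the Writable method shapes.
// The byte layout here is an assumption, not the NutchDocument format.
final class WeightedRecord {
    static final byte VERSION = 1; // assumed value, for illustration only

    private float weight = 1.0f; // assumed default for this sketch

    float getWeight() { return weight; }

    void setWeight(float weight) { this.weight = weight; }

    // Emits the version marker first, then the payload.
    void write(DataOutput out) throws IOException {
        out.writeByte(VERSION);
        out.writeFloat(weight);
    }

    // Reads in exactly the order write() produced, rejecting other versions.
    void readFields(DataInput in) throws IOException {
        byte version = in.readByte();
        if (version != VERSION) {
            throw new IOException("Unsupported version: " + version);
        }
        weight = in.readFloat();
    }

    // Serializes to an in-memory buffer and reads it back into a new instance.
    static WeightedRecord roundTrip(WeightedRecord source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                source.write(out);
            }
            WeightedRecord copy = new WeightedRecord();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WeightedRecord original = new WeightedRecord();
        original.setWeight(2.5f);
        System.out.println(roundTrip(original).getWeight()); // prints 2.5
    }
}
```

Writing a version byte first lets a reader reject data written by an incompatible layout before it interprets any payload bytes.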
-
toString
public String toString()
- Overrides:
- <code>toString</code> in class <code>Object</code>