org.apache.nutch.indexer
Class NutchDocument
- java.lang.Object
- org.apache.nutch.indexer.NutchDocument
- All Implemented Interfaces:
- Iterable<Map.Entry<String,NutchField>>, org.apache.hadoop.io.Writable
public class NutchDocument extends Object implements org.apache.hadoop.io.Writable, Iterable<Map.Entry<String,NutchField>>
A NutchDocument is the unit of indexing.
Field Summary
Fields
- static byte VERSION
Constructor Summary
Constructors
- NutchDocument()
Method Summary
Methods
- void add(String name, Object value)
- Metadata getDocumentMeta()
- NutchField getField(String name)
- Collection<String> getFieldNames()
- Object getFieldValue(String name)
- float getWeight()
- Iterator<Map.Entry<String,NutchField>> iterator(): Iterate over all fields.
- void readFields(DataInput in)
- NutchField removeField(String name)
- void setWeight(float weight)
- String toString()
- void write(DataOutput out)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
-
VERSION
public static final byte VERSION
- See Also:
- [Constant Field Values](../../../../constant-values.html#org.apache.nutch.indexer.NutchDocument.VERSION)
Constructor Detail
-
NutchDocument
public NutchDocument()
Method Detail
-
add
public void add(String name, Object value)
-
getFieldValue
public Object getFieldValue(String name)
-
getField
public NutchField getField(String name)
-
removeField
public NutchField removeField(String name)
-
getFieldNames
public Collection<String> getFieldNames()
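
The field accessors above are documented by signature only. As a rough illustration of how such an API is commonly shaped, the sketch below models a document as a map from field name to a multi-valued field. It is not the Nutch source: `SketchDocument` and `SketchField` are hypothetical stand-ins, and the behaviours shown (a repeated `add` appends to the same field, `getFieldValue` returns the first value or null) are assumptions that should be checked against the Nutch version in use.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for NutchField: an ordered list of values.
class SketchField {
    private final List<Object> values = new ArrayList<>();

    void add(Object value) {
        values.add(value);
    }

    List<Object> getValues() {
        return values;
    }
}

// Hypothetical stand-in that mirrors the method names documented above.
class SketchDocument {
    private final Map<String, SketchField> fields = new HashMap<>();

    // Assumption: adding under an existing name appends instead of overwriting.
    public void add(String name, Object value) {
        fields.computeIfAbsent(name, k -> new SketchField()).add(value);
    }

    public SketchField getField(String name) {
        return fields.get(name);
    }

    // Assumption: returns the first value, or null when the field is absent.
    public Object getFieldValue(String name) {
        SketchField field = fields.get(name);
        if (field == null || field.getValues().isEmpty()) {
            return null;
        }
        return field.getValues().get(0);
    }

    // Returns the removed field, or null if no field had that name.
    public SketchField removeField(String name) {
        return fields.remove(name);
    }

    public Collection<String> getFieldNames() {
        return fields.keySet();
    }
}

class SketchDocumentDemo {
    public static void main(String[] args) {
        SketchDocument doc = new SketchDocument();
        doc.add("title", "Example page");
        doc.add("anchor", "first");
        doc.add("anchor", "second");

        System.out.println(doc.getFieldValue("anchor"));
        System.out.println(doc.getField("anchor").getValues().size());
        System.out.println(doc.getFieldNames());
    }
}
```

With the real class, the equivalent calls would be made on a `NutchDocument` instance, with `NutchField` in place of the stand-in field type.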
-
iterator
public Iterator<Map.Entry<String,NutchField>> iterator()
Iterate over all fields.
- Specified by:
- <code>iterator</code> in interface <code>Iterable<Map.Entry<String,NutchField>></code>
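
Because the class implements `Iterable<Map.Entry<String,NutchField>>`, an instance can be used directly in an enhanced `for` loop. The sketch below shows that pattern on a hypothetical stand-in, `IterableDocSketch`, which stores plain lists instead of `NutchField` objects. The use of `LinkedHashMap` here only makes the demo output predictable; the iteration order of the real class is not stated in this page.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: field values are plain lists, not NutchField.
class IterableDocSketch implements Iterable<Map.Entry<String, List<Object>>> {
    // LinkedHashMap keeps insertion order (an assumption made for the demo).
    private final Map<String, List<Object>> fields = new LinkedHashMap<>();

    void add(String name, Object value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // Iterate over all fields as (name, values) entries.
    @Override
    public Iterator<Map.Entry<String, List<Object>>> iterator() {
        return fields.entrySet().iterator();
    }
}

class IterableDocSketchDemo {
    // Builds a "name=[values];" summary by walking the document with for-each.
    static String describe(IterableDocSketch doc) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<Object>> entry : doc) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IterableDocSketch doc = new IterableDocSketch();
        doc.add("url", "http://example.org/");
        doc.add("host", "example.org");
        System.out.println(describe(doc));
    }
}
```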
-
getWeight
public float getWeight()
-
setWeight
public void setWeight(float weight)
-
getDocumentMeta
public Metadata getDocumentMeta()
-
readFields
public void readFields(DataInput in) throws IOException
- Specified by:
- <code>readFields</code> in interface <code>org.apache.hadoop.io.Writable</code>
- Throws:
- <code>IOException</code>
-
write
public void write(DataOutput out) throws IOException
- Specified by:
- <code>write</code> in interface <code>org.apache.hadoop.io.Writable</code>
- Throws:
- <code>IOException</code>
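
`readFields` and `write` form the usual `Writable` pair: `write` serializes the object to a `DataOutput`, and `readFields` repopulates an instance from a `DataInput`. The sketch below demonstrates that round trip using only `java.io`, with a hypothetical `WritableSketch` class. The byte layout (version byte, field count, name/value pairs, weight) and the `VERSION` value are assumptions made for illustration; this page does not describe the actual serialized form of `NutchDocument`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in showing the write/readFields contract.
class WritableSketch {
    static final byte VERSION = 1; // hypothetical value, not Nutch's constant

    final Map<String, String> fields = new LinkedHashMap<>();
    float weight = 1.0f;

    // Assumed layout: version, field count, (name, value) pairs, weight.
    public void write(DataOutput out) throws IOException {
        out.writeByte(VERSION);
        out.writeInt(fields.size());
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeUTF(entry.getValue());
        }
        out.writeFloat(weight);
    }

    // Reads the same layout back, replacing any existing state.
    public void readFields(DataInput in) throws IOException {
        fields.clear();
        byte version = in.readByte();
        if (version != VERSION) {
            throw new IOException("Unsupported version: " + version);
        }
        int size = in.readInt();
        for (int i = 0; i < size; i++) {
            String name = in.readUTF();
            String value = in.readUTF();
            fields.put(name, value);
        }
        weight = in.readFloat();
    }

    // Serializes to a byte array and reads it back into a fresh instance.
    static WritableSketch roundTrip(WritableSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                original.write(out);
            }
            WritableSketch copy = new WritableSketch();
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class WritableSketchDemo {
    public static void main(String[] args) {
        WritableSketch doc = new WritableSketch();
        doc.fields.put("title", "Example page");
        doc.weight = 2.5f;

        WritableSketch copy = WritableSketch.roundTrip(doc);
        System.out.println(copy.fields + " weight=" + copy.weight);
    }
}
```

In a Hadoop job the framework supplies the `DataOutput` and `DataInput`, so application code rarely calls these two methods directly.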
-
toString
public String toString()
- Overrides:
- <code>toString</code> in class <code>Object</code>
