org.apache.nutch.scoring.webgraph
Class NodeReader
- java.lang.Object
  - org.apache.hadoop.conf.Configured
    - org.apache.nutch.scoring.webgraph.NodeReader
- All Implemented Interfaces: org.apache.hadoop.conf.Configurable
public class NodeReader extends org.apache.hadoop.conf.Configured
Reads a single node from the NodeDb in the WebGraph and prints its information to standard output.
Constructor Summary
Constructors:
- <code>NodeReader()</code>
- <code>NodeReader(org.apache.hadoop.conf.Configuration conf)</code>
Method Summary
Methods:
- <code>void dumpUrl(org.apache.hadoop.fs.Path webGraphDb, String url)</code> - Prints the content of the Node identified by the url to standard output.
- <code>static void main(String[] args)</code> - Runs the NodeReader tool.
Methods inherited from class org.apache.hadoop.conf.Configured: getConf, setConf
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
NodeReader
public NodeReader()
NodeReader
public NodeReader(org.apache.hadoop.conf.Configuration conf)
Method Detail
dumpUrl
public void dumpUrl(org.apache.hadoop.fs.Path webGraphDb, String url) throws IOException
Prints the content of the Node identified by the url to standard output.
- Parameters:
- <code>webGraphDb</code> - The webgraph from which to get the node.
- <code>url</code> - The url of the node.
- Throws:
- <code>IOException</code> - If an error occurs while getting the node.
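Because dumpUrl writes to standard output rather than returning a value, a caller that wants the node text as a String has to capture the stream. The following is a minimal sketch of one way to do that using only the JDK. The names StdoutCapture, ThrowingTask, and capture are hypothetical and are not part of Nutch; the dumpUrl call appears only in a comment, with a placeholder Configuration and path, so that the sketch runs without Nutch on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Nutch): collects whatever a task prints to System.out. */
final class StdoutCapture {

    /** A task that may throw a checked exception, as dumpUrl does with IOException. */
    interface ThrowingTask {
        void run() throws Exception;
    }

    /** Runs the task with System.out redirected to a buffer and returns the buffered text. */
    static String capture(ThrowingTask task) throws Exception {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream replacement = new PrintStream(buffer, true, "UTF-8");
        System.setOut(replacement);
        try {
            task.run();
        } finally {
            replacement.flush();
            System.setOut(original); // always restore the real stream, even on failure
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // With Nutch and Hadoop on the classpath, the task could instead be (placeholders):
        //   () -> new NodeReader(conf).dumpUrl(new Path("path/to/webgraphdb"), "http://example.com/")
        String text = capture(() -> System.out.println("stand-in for node output"));
        System.out.print(text);
    }
}
```

System.setOut changes a process-wide setting, so this approach is only suitable when no other thread prints at the same time.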
main
public static void main(String[] args) throws Exception
Runs the NodeReader tool. The command-line arguments must include a webgraphdb path and a url. The url must match the normalized form of the url as stored in the NodeDb of the WebGraph.
- Throws:
- <code>Exception</code>
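The requirement that the url match its normalized form means that a url typed by hand may not be found even though the page is in the NodeDb. The sketch below illustrates the idea with a few common normalization steps built on java.net.URI. It is an assumption-laden approximation, not Nutch's normalization: Nutch applies its own configured URL normalizers, whose rules may differ, and the class UrlNormalizeSketch and its normalize method are hypothetical names.

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical illustration only; Nutch's configured URL normalizers may apply different rules. */
final class UrlNormalizeSketch {

    /** Lowercases scheme and host, drops default ports and fragments, resolves dot segments. */
    static String normalize(String raw) {
        URI uri = URI.create(raw.trim()).normalize(); // resolves "." and ".." path segments
        if (uri.getScheme() == null || uri.getHost() == null) {
            throw new IllegalArgumentException("expected an absolute url with a host: " + raw);
        }
        String scheme = uri.getScheme().toLowerCase(Locale.ROOT);
        String host = uri.getHost().toLowerCase(Locale.ROOT);

        // Omit the port when it is the default for the scheme.
        int port = uri.getPort();
        boolean defaultPort = ("http".equals(scheme) && port == 80)
                || ("https".equals(scheme) && port == 443);

        // An empty path becomes "/".
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }

        StringBuilder out = new StringBuilder();
        out.append(scheme).append("://").append(host);
        if (port != -1 && !defaultPort) {
            out.append(':').append(port);
        }
        out.append(path);
        if (uri.getRawQuery() != null) {
            out.append('?').append(uri.getRawQuery());
        }
        return out.toString(); // the fragment is intentionally left out
    }

    public static void main(String[] args) {
        System.out.println(normalize("HTTP://Example.COM:80/a/../b#top"));
    }
}
```

If a lookup returns nothing, comparing the url passed on the command line with the form actually stored for that page is a reasonable first check.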