Class LinkDumper
- java.lang.Object
- org.apache.hadoop.conf.Configured
- org.apache.nutch.scoring.webgraph.LinkDumper
- All Implemented Interfaces:
- org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool
public class LinkDumper extends org.apache.hadoop.conf.Configured implements org.apache.hadoop.util.Tool
The LinkDumper tool creates a database that maps each node to its inlink information; the database can be read with the nested Reader class. This makes it quick to review the inlink and scoring state of a single URL and determine why that URL ranks the way it does. The tool is intended for use with the LinkRank analysis.
Nested Class Summary
Nested Classes
- static class LinkDumper.Inverter: Inverts outlinks from the WebGraph to inlinks and attaches node information.
- static class LinkDumper.LinkNode: Bean class which holds URL to node information.
- static class LinkDumper.LinkNodes: Writable class which holds an array of LinkNode objects.
- static class LinkDumper.Merger: Merges LinkNode objects into a single array value per URL.
- static class LinkDumper.Reader: Reader class which prints the URL and all of its inlinks to standard output.
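To make the relationship between the bean, the array holder, and the reader more concrete, here is a minimal in-memory sketch in plain Java. It does not use the Nutch or Hadoop classes: the `LinkNode` stand-in, its `inlinkScore` field, the `describe` helper, and the output format are all assumptions made for illustration, not the actual on-disk format or the real Reader output.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: a plain-Java stand-in for a URL -> inlink-node lookup. */
final class LinkNodeLookupSketch {

    /** Hypothetical bean pairing an inlinking URL with one piece of node state. */
    static final class LinkNode {
        final String url;
        final float inlinkScore; // assumed field, chosen for illustration

        LinkNode(String url, float inlinkScore) {
            this.url = url;
            this.inlinkScore = inlinkScore;
        }
    }

    /** Formats a URL followed by each of its inlinks, one per line. */
    static String describe(String url, Map<String, LinkNode[]> db) {
        LinkNode[] inlinks = db.get(url);
        if (inlinks == null) {
            return url + ": no inlinks recorded";
        }
        StringBuilder sb = new StringBuilder(url).append(":");
        for (LinkNode node : inlinks) {
            sb.append(System.lineSeparator())
              .append("  ").append(node.url)
              .append(" score=").append(node.inlinkScore);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One array value per URL, mirroring the "single array value per URL" idea.
        Map<String, LinkNode[]> db = new HashMap<>();
        db.put("http://c.example/", new LinkNode[] {
            new LinkNode("http://a.example/", 0.5f),
            new LinkNode("http://b.example/", 0.25f)
        });
        System.out.println(describe("http://c.example/", db));
        System.out.println(describe("http://z.example/", db));
    }
}
```

The point of the shape is that a single lookup by URL returns every inlink together with its node state, which is what makes per-URL ranking questions quick to answer.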
Field Summary
Fields
- static String DUMP_DIR
- static org.slf4j.Logger LOG
Constructor Summary
Constructors
- LinkDumper()
Method Summary
Methods
- void dumpLinks(org.apache.hadoop.fs.Path webGraphDb): Runs the inverter and merger jobs of the LinkDumper tool to create the URL-to-inlink node database.
- static void main(String[] args)
- int run(String[] args): Runs the LinkDumper tool.
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
Field Detail
public static final org.slf4j.Logger LOG
public static final String DUMP_DIR
- See Also:
- [Constant Field Values](../../../../../constant-values.html#org.apache.nutch.scoring.webgraph.LinkDumper.DUMP_DIR)
Constructor Detail
public LinkDumper()
Method Detail
public void dumpLinks(org.apache.hadoop.fs.Path webGraphDb) throws IOException
Runs the inverter and merger jobs of the LinkDumper tool, in that order, to create the URL-to-inlink node database.
- Throws:
- <code>IOException</code>
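The two stages named above (invert, then merge) can be sketched in memory with plain Java collections. This is only an illustration of the idea under stated assumptions: the real tool runs these stages as Hadoop jobs over the WebGraph and attaches node information, whereas the `Edge` type and the `invert` and `merge` helpers below are hypothetical and work on bare URL strings.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustration only: invert outlinks to inlinks, then merge them per URL. */
final class InlinkInversionSketch {

    /** Hypothetical stand-in for one outlink edge (source -> target). */
    static final class Edge {
        final String source;
        final String target;

        Edge(String source, String target) {
            this.source = source;
            this.target = target;
        }
    }

    /** Stage 1 (invert): re-key each edge by its target, emitting (target, source) pairs. */
    static List<Map.Entry<String, String>> invert(List<Edge> outlinks) {
        List<Map.Entry<String, String>> inverted = new ArrayList<>();
        for (Edge edge : outlinks) {
            inverted.add(new SimpleEntry<>(edge.target, edge.source));
        }
        return inverted;
    }

    /** Stage 2 (merge): gather all sources that share a target into one list per URL. */
    static Map<String, List<String>> merge(List<Map.Entry<String, String>> inverted) {
        Map<String, List<String>> inlinksByUrl = new TreeMap<>();
        for (Map.Entry<String, String> pair : inverted) {
            inlinksByUrl.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                        .add(pair.getValue());
        }
        return inlinksByUrl;
    }

    public static void main(String[] args) {
        List<Edge> outlinks = Arrays.asList(
            new Edge("http://a.example/", "http://c.example/"),
            new Edge("http://b.example/", "http://c.example/"),
            new Edge("http://c.example/", "http://a.example/"));
        merge(invert(outlinks))
            .forEach((url, inlinks) -> System.out.println(url + " <- " + inlinks));
    }
}
```

In this sketch a URL that nothing links to (here `http://b.example/`) has no entry in the merged map; whether the real database behaves the same way is not stated in this page.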
public static void main(String[] args) throws Exception
- Throws:
- <code>Exception</code>
public int run(String[] args) throws Exception
Runs the LinkDumper tool. This only creates the database; to read the values, use the nested Reader tool.
- Specified by:
- <code>run</code> in interface <code>org.apache.hadoop.util.Tool</code>
- Throws:
- <code>Exception</code>