
org.apache.nutch.scoring.webgraph

Class LoopReader

  • java.lang.Object
    • org.apache.hadoop.conf.Configured
      • org.apache.nutch.scoring.webgraph.LoopReader

All Implemented Interfaces:

  • org.apache.hadoop.conf.Configurable

public class LoopReader
extends org.apache.hadoop.conf.Configured

The LoopReader tool prints the loopset information for a single url.

Constructor Summary

Constructor and Description

  - LoopReader()
  - LoopReader(org.apache.hadoop.conf.Configuration conf)

Method Summary

Modifier and Type, Method and Description

  - void dumpUrl(org.apache.hadoop.fs.Path webGraphDb, String url): Prints loopset for a single url.
  - static void main(String[] args): Runs the LoopReader tool.

Methods inherited from class org.apache.hadoop.conf.Configured

getConf, setConf

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail

LoopReader

public LoopReader()

LoopReader

public LoopReader(org.apache.hadoop.conf.Configuration conf)

Method Detail

dumpUrl

public void dumpUrl(org.apache.hadoop.fs.Path webGraphDb,
           String url)
             throws IOException

Prints the loopset for a single url. The loopset information shows any outlink url that eventually forms a link cycle.

  - Parameters:
  - <code>webGraphDb</code> - The WebGraph to check for loops
  - <code>url</code> - The url to check. 
  - Throws: 
  - <code>IOException</code> - If an error occurs while printing loopset information.       
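
To make the notion of a loopset concrete, the following is a minimal sketch in plain Java of the idea described above: for one url, collect the outlinks from which that url can be reached again. It is an illustration only and not the Nutch implementation. The class `LoopSetSketch`, the methods `loopSet` and `reaches`, the in-memory map, and the example urls are all hypothetical; the real tool reads its data from the WebGraph on disk.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; none of these names come from Nutch.
class LoopSetSketch {

    /** Returns the direct outlinks of url from which url can be reached again. */
    static Set<String> loopSet(Map<String, List<String>> outlinks, String url) {
        Set<String> result = new LinkedHashSet<>();
        for (String outlink : outlinks.getOrDefault(url, List.of())) {
            if (reaches(outlinks, outlink, url)) {
                result.add(outlink);
            }
        }
        return result;
    }

    /** Iterative depth-first search: is target reachable from start? */
    static boolean reaches(Map<String, List<String>> outlinks, String start, String target) {
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (current.equals(target)) {
                return true;
            }
            // Expand each url only once so cycles elsewhere in the graph terminate.
            if (seen.add(current)) {
                for (String next : outlinks.getOrDefault(current, List.of())) {
                    stack.push(next);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed toy graph: a -> b -> c -> a forms a cycle, a -> d does not.
        Map<String, List<String>> outlinks = Map.of(
            "http://a.example/", List.of("http://b.example/", "http://d.example/"),
            "http://b.example/", List.of("http://c.example/"),
            "http://c.example/", List.of("http://a.example/"));
        System.out.println(loopSet(outlinks, "http://a.example/")); // prints [http://b.example/]
    }
}
```

With Nutch and Hadoop on the classpath, the documented call would take the form `new LoopReader(conf).dumpUrl(webGraphDb, url)`, where `conf` is a Hadoop `Configuration`, `webGraphDb` is the `Path` of an existing WebGraph, and `url` is the url to inspect.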

main

public static void main(String[] args)
                 throws Exception

Runs the LoopReader tool. For this tool to work, the loops job must already have been run on the corresponding WebGraph.

  - Throws: 
  - <code>Exception</code>      

Copyright © 2014 The Apache Software Foundation