org.apache.nutch.net
Interface URLNormalizer
- All Superinterfaces:
- org.apache.hadoop.conf.Configurable
- All Known Implementing Classes:
- BasicURLNormalizer, HostURLNormalizer, PassURLNormalizer, QuerystringURLNormalizer, RegexURLNormalizer
public interface URLNormalizer extends org.apache.hadoop.conf.Configurable
Interface used to convert URLs to normal form and, optionally, perform substitutions.
Field Summary
- static String X_POINT_ID
Method Summary
- String normalize(String urlString, String scope)
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
Field Detail
X_POINT_ID
static final String X_POINT_ID
Method Detail
normalize
String normalize(String urlString, String scope) throws MalformedURLException
- Throws:
- MalformedURLException
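The contract of normalize can be illustrated with a minimal sketch. To keep the example self-contained, it declares a hypothetical stand-in interface, NormalizerSketch, that mirrors only the normalize signature documented above; the real URLNormalizer also extends org.apache.hadoop.conf.Configurable, so an actual implementation would additionally need getConf and setConf. The class name LowercaseHostNormalizer and its behaviour (lowercasing the scheme and host, ignoring scope) are assumptions chosen for illustration, not how any of the listed implementing classes work.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Locale;

// Hypothetical stand-in that mirrors the documented normalize signature.
// The real org.apache.nutch.net.URLNormalizer also extends
// org.apache.hadoop.conf.Configurable, which is omitted here.
interface NormalizerSketch {
    String normalize(String urlString, String scope) throws MalformedURLException;
}

// Illustrative implementation: lowercases the scheme and host.
// The scope argument is accepted but ignored in this sketch.
// Simplification: any fragment or user-info part of the URL is dropped.
class LowercaseHostNormalizer implements NormalizerSketch {
    @Override
    public String normalize(String urlString, String scope) throws MalformedURLException {
        // Parsing throws MalformedURLException for input without a known protocol.
        URL url = new URL(urlString);
        String protocol = url.getProtocol().toLowerCase(Locale.ROOT);
        String host = url.getHost().toLowerCase(Locale.ROOT);
        // getFile() returns the path plus the query string, if any.
        return new URL(protocol, host, url.getPort(), url.getFile()).toString();
    }
}
```

A caller would pass the raw URL together with a scope string and handle MalformedURLException for input that cannot be parsed. Because the exception is declared on the interface method, an implementation can let the parsing failure propagate instead of returning a sentinel value.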