
org.apache.nutch.protocol.ftp

Class Client

  • java.lang.Object
    • org.apache.commons.net.SocketClient
      • org.apache.commons.net.telnet.TelnetClient
        • org.apache.commons.net.ftp.FTP
          • org.apache.nutch.protocol.ftp.Client

public class Client
extends org.apache.commons.net.ftp.FTP

Client.java encapsulates the functionality Nutch needs to get directory listings and retrieve files from an FTP server. This class takes care of the low-level details of interacting with an FTP server and provides a convenient higher-level interface. Modified from FtpClient.java in Apache commons-net.

Notes by John Xing: FTP server implementations are hardly uniform, and none seems to follow the RFCs wholeheartedly. We have no choice but to assume the following common denominator:

(1) Use stream mode for data transfer. Block mode would be better for downloading multiple files and for partial downloads, but not every ftpd supports block mode.

(2) Use passive mode for the data connection, so that Nutch works when run behind a firewall.

(3) The data connection is opened and closed per FTP command, for the reasons listed in (1). With some FTP servers, when a partial download is enforced by closing the data channel socket on the client side, the server side immediately closes the control channel (socket). Our code deals with this bad behavior.

(4) LIST is used to obtain remote file attributes if possible. MDTM and SIZE would be nice, but they are not as ubiquitously implemented as LIST.

(5) Avoid using ABOR in a single thread? Do not use it at all.

About exceptions: some specific exceptions are re-thrown as one of the FtpException.java types. In fact, each function either throws an FtpException.java type or passes an IOException through.

  • Author:
  • John Xing

Field Summary

| Modifier and Type | Field and Description |
|---|---|
| `protected static int` | `TERMINAL_TYPE` |
| `protected static int` | `TERMINAL_TYPE_IS` |
| `protected static int` | `TERMINAL_TYPE_SEND` |


Fields inherited from class org.apache.commons.net.ftp.FTP

_commandSupport_, ASCII_FILE_TYPE, BINARY_FILE_TYPE, BLOCK_TRANSFER_MODE, CARRIAGE_CONTROL_TEXT_FORMAT, COMPRESSED_TRANSFER_MODE, DEFAULT_CONTROL_ENCODING, DEFAULT_DATA_PORT, DEFAULT_PORT, EBCDIC_FILE_TYPE, FILE_STRUCTURE, IMAGE_FILE_TYPE, LOCAL_FILE_TYPE, NON_PRINT_TEXT_FORMAT, PAGE_STRUCTURE, RECORD_STRUCTURE, STREAM_TRANSFER_MODE, TELNET_TEXT_FORMAT


Fields inherited from class org.apache.commons.net.telnet.TelnetClient

readerThread


Fields inherited from class org.apache.commons.net.SocketClient

_defaultPort_, _input_, _isConnected_, _output_, _socket_, _socketFactory_, _timeout_, NETASCII_EOL

Constructor Summary

| Constructor and Description |
|---|
| `Client()`<br>Public default constructor |

Method Summary

| Modifier and Type | Method and Description |
|---|---|
| `protected Socket` | `__openPassiveDataConnection(int command, String arg)`<br>Opens a passive data connection socket. |
| `void` | `disconnect()`<br>Closes the connection to the FTP server and restores connection parameters to the default values. |
| `String` | `getSystemName()`<br>Fetches the system type name from the server and returns the string. |
| `boolean` | `isRemoteVerificationEnabled()`<br>Return whether or not verification of the remote host participating in data connections is enabled. |
| `boolean` | `login(String username, String password)`<br>Login to the FTP server using the provided username and password. |
| `boolean` | `logout()`<br>Logout of the FTP server by sending the QUIT command. |
| `void` | `retrieveFile(String path, OutputStream os, int limit)`<br>Retrieves the file for path. |
| `void` | `retrieveList(String path, List<org.apache.commons.net.ftp.FTPFile> entries, int limit, org.apache.commons.net.ftp.FTPFileEntryParser parser)`<br>Retrieves the list reply for path. |
| `boolean` | `sendNoOp()`<br>Sends a NOOP command to the FTP server. |
| `void` | `setDataTimeout(int timeout)`<br>Sets the timeout in milliseconds to use for data connection. |
| `boolean` | `setFileType(int fileType)`<br>Sets the file type to be transferred. |
| `void` | `setRemoteVerificationEnabled(boolean enable)`<br>Enable or disable verification that the remote host taking part in a data connection is the same as the host to which the control connection is attached. |


Methods inherited from class org.apache.commons.net.ftp.FTP

_connectAction_, abor, acct, addProtocolCommandListener, allo, allo, appe, cdup, cwd, dele, getControlEncoding, getReply, getReplyCode, getReplyString, getReplyStrings, help, help, list, list, mkd, mode, nlst, nlst, noop, pass, pasv, port, pwd, quit, rein, removeProtocolCommandListener, rest, retr, rmd, rnfr, rnto, sendCommand, sendCommand, sendCommand, sendCommand, setControlEncoding, site, smnt, stat, stat, stor, stou, stou, stru, syst, type, type, user


Methods inherited from class org.apache.commons.net.telnet.TelnetClient

addOptionHandler, deleteOptionHandler, getInputStream, getLocalOptionState, getOutputStream, getReaderThread, getRemoteOptionState, registerNotifHandler, registerSpyStream, sendAYT, setReaderThread, stopSpyStream, unregisterNotifHandler


Methods inherited from class org.apache.commons.net.SocketClient

connect, connect, connect, connect, connect, connect, getDefaultPort, getDefaultTimeout, getLocalAddress, getLocalPort, getRemoteAddress, getRemotePort, getSoLinger, getSoTimeout, getTcpNoDelay, isConnected, setDefaultPort, setDefaultTimeout, setSocketFactory, setSoLinger, setSoTimeout, setTcpNoDelay, verifyRemote


Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail


TERMINAL_TYPE

protected static final int TERMINAL_TYPE
  - See Also:
  - [Constant Field Values](../../../../../constant-values.html#org.apache.nutch.protocol.ftp.Client.TERMINAL_TYPE)       

TERMINAL_TYPE_SEND

protected static final int TERMINAL_TYPE_SEND
  - See Also:
  - [Constant Field Values](../../../../../constant-values.html#org.apache.nutch.protocol.ftp.Client.TERMINAL_TYPE_SEND)       

TERMINAL_TYPE_IS

protected static final int TERMINAL_TYPE_IS
  - See Also:
  - [Constant Field Values](../../../../../constant-values.html#org.apache.nutch.protocol.ftp.Client.TERMINAL_TYPE_IS)       

Constructor Detail


Client

public Client()

Public default constructor

Method Detail


__openPassiveDataConnection

protected Socket __openPassiveDataConnection(int command,
                                 String arg)
                                      throws IOException,
                                             FtpExceptionCanNotHaveDataConnection

Opens a passive data connection socket.

  - Parameters:
  - <code>command</code> - 
  - <code>arg</code> -  
  - Returns:
  -  
  - Throws: 
  - <code>IOException</code> 
  - <code>FtpExceptionCanNotHaveDataConnection</code>       
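
Passive mode, as chosen in design note (2), means the client asks the server for an address and then opens the data socket itself. As a minimal sketch of that first step, the following parses a standard `227` reply to PASV into a host and port. `PasvReply` is a hypothetical helper written for illustration; it is not part of Nutch or commons-net, and it does not claim to mirror how `__openPassiveDataConnection` is implemented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Nutch or commons-net. */
final class PasvReply {
    // A PASV reply carries six comma-separated numbers: h1,h2,h3,h4,p1,p2.
    private static final Pattern ADDRESS = Pattern.compile(
        "(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})");

    final String host;
    final int port;

    private PasvReply(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Parses a reply such as "227 Entering Passive Mode (192,168,1,10,19,137)." */
    static PasvReply parse(String reply) {
        if (reply == null || !reply.startsWith("227")) {
            throw new IllegalArgumentException("not a 227 reply: " + reply);
        }
        Matcher m = ADDRESS.matcher(reply);
        if (!m.find()) {
            throw new IllegalArgumentException("no address in reply: " + reply);
        }
        int[] n = new int[6];
        for (int i = 0; i < 6; i++) {
            n[i] = Integer.parseInt(m.group(i + 1));
            if (n[i] > 255) {
                throw new IllegalArgumentException("octet out of range: " + n[i]);
            }
        }
        String host = n[0] + "." + n[1] + "." + n[2] + "." + n[3];
        int port = (n[4] << 8) | n[5]; // high byte first, then low byte
        return new PasvReply(host, port);
    }
}
```

A caller would then connect a plain `java.net.Socket` to `host` and `port` before sending the command (for example LIST or RETR) on the control channel.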

setDataTimeout

public void setDataTimeout(int timeout)

Sets the timeout, in milliseconds, to use for the data connection. The timeout is set immediately after the data connection is opened.


disconnect

public void disconnect()
                throws IOException

Closes the connection to the FTP server and restores connection parameters to the default values.

  - Overrides: 
  - <code>disconnect</code> in class <code>org.apache.commons.net.ftp.FTP</code> 
  - Throws: 
  - <code>IOException</code> - If an error occurs while disconnecting.       

setRemoteVerificationEnabled

public void setRemoteVerificationEnabled(boolean enable)

Enable or disable verification that the remote host taking part in a data connection is the same as the host to which the control connection is attached. The default is for verification to be enabled. You may set this value at any time, whether the client is currently connected or not.

  - Parameters:
  - <code>enable</code> - True to enable verification, false to disable verification.       
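
The check described above amounts to comparing the peer address of the data socket with the peer address of the control connection. The following is a minimal sketch of that comparison under stated assumptions: `RemoteVerificationSketch`, `ipv4`, and `acceptDataPeer` are hypothetical names used only for illustration, and the real class may perform the check differently.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical sketch of remote-host verification; not the Nutch implementation. */
final class RemoteVerificationSketch {

    /** Builds an IPv4 address from literal octets, without any DNS lookup. */
    static InetAddress ipv4(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(
                new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            // Only thrown for an illegal array length, which cannot happen here.
            throw new IllegalArgumentException(e);
        }
    }

    /**
     * Returns true if the data connection peer is acceptable: either
     * verification is disabled, or both peers have the same IP address.
     */
    static boolean acceptDataPeer(boolean verificationEnabled,
                                  InetAddress controlPeer,
                                  InetAddress dataPeer) {
        if (!verificationEnabled) {
            return true;
        }
        return controlPeer != null && controlPeer.equals(dataPeer);
    }
}
```

With real sockets, the two addresses would come from `Socket.getInetAddress()` on the control and data sockets respectively.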

isRemoteVerificationEnabled

public boolean isRemoteVerificationEnabled()

Return whether or not verification of the remote host participating in data connections is enabled. The default behavior is for verification to be enabled.

  - Returns:
  - True if verification is enabled, false if not.       

login

public boolean login(String username,
            String password)
              throws IOException

Login to the FTP server using the provided username and password.

  - Parameters:
  - <code>username</code> - The username to login under.
  - <code>password</code> - The password to use. 
  - Returns:
  - True if successfully completed, false if not. 
  - Throws: 
  - <code>org.apache.commons.net.ftp.FTPConnectionClosedException</code> - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself. 
  - <code>IOException</code> - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.       
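
Because the connection-closed exception can be caught either as an IOException or as itself, the more specific catch clause has to come first when a caller wants to treat reply code 421 differently from other I/O failures. The sketch below shows that ordering. To stay self-contained it uses a stand-in class, `ConnectionClosedException`, in place of `org.apache.commons.net.ftp.FTPConnectionClosedException`; `CatchOrderSketch`, `FtpCall`, and `classify` are likewise hypothetical names.

```java
import java.io.IOException;

/** Hypothetical sketch of catch ordering; uses stand-in types only. */
final class CatchOrderSketch {

    /** Stand-in for FTPConnectionClosedException, which is also an IOException. */
    static final class ConnectionClosedException extends IOException {
        ConnectionClosedException(String message) {
            super(message);
        }
    }

    /** Stand-in for a call such as login(username, password). */
    interface FtpCall {
        boolean run() throws IOException;
    }

    /** Runs the call and reports which of the documented outcomes occurred. */
    static String classify(FtpCall call) {
        try {
            return call.run() ? "ok" : "rejected";
        } catch (ConnectionClosedException e) {
            // Server closed the control connection (reply code 421).
            return "closed";
        } catch (IOException e) {
            // Any other failure while sending a command or reading a reply.
            return "io-error";
        }
    }
}
```

Swapping the two catch clauses would not compile, since the IOException clause would make the more specific one unreachable.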

logout

public boolean logout()
               throws IOException

Logout of the FTP server by sending the QUIT command.

  - Returns:
  - True if successfully completed, false if not. 
  - Throws: 
  - <code>org.apache.commons.net.ftp.FTPConnectionClosedException</code> - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself. 
  - <code>IOException</code> - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.       

retrieveList

public void retrieveList(String path,
                List<org.apache.commons.net.ftp.FTPFile> entries,
                int limit,
                org.apache.commons.net.ftp.FTPFileEntryParser parser)
                  throws IOException,
                         FtpExceptionCanNotHaveDataConnection,
                         FtpExceptionUnknownForcedDataClose,
                         FtpExceptionControlClosedByForcedDataClose

Retrieves the list reply for path.

  - Parameters:
  - <code>path</code> - 
  - <code>entries</code> - 
  - <code>limit</code> - 
  - <code>parser</code> -  
  - Throws: 
  - <code>IOException</code> 
  - <code>FtpExceptionCanNotHaveDataConnection</code> 
  - <code>FtpExceptionUnknownForcedDataClose</code> 
  - <code>FtpExceptionControlClosedByForcedDataClose</code>       

retrieveFile

public void retrieveFile(String path,
                OutputStream os,
                int limit)
                  throws IOException,
                         FtpExceptionCanNotHaveDataConnection,
                         FtpExceptionUnknownForcedDataClose,
                         FtpExceptionControlClosedByForcedDataClose

Retrieves the file for path.

  - Parameters:
  - <code>path</code> - 
  - <code>os</code> - 
  - <code>limit</code> -  
  - Throws: 
  - <code>IOException</code> 
  - <code>FtpExceptionCanNotHaveDataConnection</code> 
  - <code>FtpExceptionUnknownForcedDataClose</code> 
  - <code>FtpExceptionControlClosedByForcedDataClose</code>       
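
The `limit` parameter is not described here. The class notes mention that a partial download is enforced by closing the data channel socket on the client side, so one plausible reading is that `limit` caps the number of bytes read before that close. The following sketch shows only the byte-counting part of such a scheme. It assumes that `limit` is a byte count and that a negative value means "no limit"; both are assumptions, and `LimitedCopy` is a hypothetical helper rather than Nutch code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch of a size-limited stream copy; not the Nutch implementation. */
final class LimitedCopy {

    /**
     * Copies from in to out, stopping at end of stream or after limit bytes.
     * ASSUMPTION: a negative limit means "copy everything".
     * Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int limit) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        while (limit < 0 || total < limit) {
            // Never request more than the remaining allowance.
            int want = (limit < 0) ? buf.length : (int) Math.min(buf.length, limit - total);
            int n = in.read(buf, 0, want);
            if (n < 0) {
                break; // end of stream
            }
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    /** Convenience wrapper for in-memory data, mainly useful for experiments. */
    static byte[] truncate(byte[] data, int limit) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out, limit);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

In a real client the input stream would be the data socket's stream, and the caller would close that socket once the copy stops early.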

setFileType

public boolean setFileType(int fileType)
                    throws IOException

Sets the file type to be transferred. This should be one of FTP.ASCII_FILE_TYPE, FTP.IMAGE_FILE_TYPE, etc. The file type only needs to be set when you want to change the type. After changing it, the new type stays in effect until you change it again. The default file type is FTP.ASCII_FILE_TYPE if this method is never called.

  - Parameters:
  - <code>fileType</code> - The <code>_FILE_TYPE</code> constant indicating the type of file. 
  - Returns:
  - True if successfully completed, false if not. 
  - Throws: 
  - <code>org.apache.commons.net.ftp.FTPConnectionClosedException</code> - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself. 
  - <code>IOException</code> - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.       

getSystemName

public String getSystemName()
                     throws IOException,
                            FtpExceptionBadSystResponse

Fetches the system type name from the server and returns the string. This value is cached for the duration of the connection after the first call to this method. In other words, only the first invocation of this method issues a SYST command to the FTP server. The client remembers the value and returns the cached value until disconnect is called.

  - Returns:
  - The system type name obtained from the server. null if the information could not be obtained. 
  - Throws: 
  - <code>org.apache.commons.net.ftp.FTPConnectionClosedException</code> - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself. 
  - <code>IOException</code> - If an I/O error occurs while either sending a command to the server or receiving a reply from the server. 
  - <code>FtpExceptionBadSystResponse</code>       
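
The caching behavior described for getSystemName can be sketched as a value that is fetched on first use and cleared on disconnect. The class below is a hypothetical illustration, not the Nutch code: `CachedSystemName` and its `Supplier` argument are stand-ins for the real SYST round trip, and `roundTrips` exists only so the effect of caching can be observed.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of per-connection caching; not the Nutch implementation. */
final class CachedSystemName {
    private final Supplier<String> systCommand; // stand-in for sending SYST
    private String cached;                      // null means "not fetched yet"
    int roundTrips;                             // how often the stand-in was invoked

    CachedSystemName(Supplier<String> systCommand) {
        this.systCommand = systCommand;
    }

    /** Returns the cached name, asking the server only when nothing is cached. */
    String get() {
        if (cached == null) {
            roundTrips++;
            cached = systCommand.get(); // a null result is not cached, so it is retried
        }
        return cached;
    }

    /** Forgets the cached name, as happens when the connection is closed. */
    void disconnect() {
        cached = null;
    }
}
```

Calling `get()` twice on one connection invokes the supplier once; after `disconnect()`, the next `get()` invokes it again.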

sendNoOp

public boolean sendNoOp()
                 throws IOException

Sends a NOOP command to the FTP server. This is useful for preventing server timeouts.

  - Returns:
  - True if successfully completed, false if not. 
  - Throws: 
  - <code>org.apache.commons.net.ftp.FTPConnectionClosedException</code> - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself. 
  - <code>IOException</code> - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.      

Copyright © 2014 The Apache Software Foundation