通用ipc超时控制
除了DataNode外的超时,比如Yarn自己通信超时那么也会报如下错误。
failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel
增大core-site.xml中的ipc.ping.interval。
如果关闭ipc.client.ping。则要增大ipc.client.rpc-timeout.ms。
如下是Hadoop3.1.1的源码。由源码可知。如果开启ipc.client.ping心跳,则没有RPC超时。以心跳超时为准。
/**
* The time after which a RPC will timeout.
* If ping is not enabled (via ipc.client.ping), then the timeout value is the
* same as the pingInterval.
* If ping is enabled, then there is no timeout value.
*
* @param conf Configuration
* @return the timeout period in milliseconds. -1 if no timeout value is set
* @deprecated use {@link #getRpcTimeout(Configuration)} instead
*/
@Deprecated
final public static int getTimeout(Configuration conf) {
int timeout = getRpcTimeout(conf);
if (timeout > 0) {
return timeout;
}
if (!conf.getBoolean(CommonConfigurationKeys.IPC_CLIENT_PING_KEY,
CommonConfigurationKeys.IPC_CLIENT_PING_DEFAULT)) {
return getPingInterval(conf);
}
return -1;
}
/**
* The time after which a RPC will timeout.
*
* @param conf Configuration
* @return the timeout period in milliseconds.
*/
public static final int getRpcTimeout(Configuration conf) {
int timeout =
conf.getInt(CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_KEY,
CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_DEFAULT);
return (timeout < 0) ? 0 : timeout;
}
参考资料
https://www.cnblogs.com/yjt1993/p/11164492.html
https://yq.aliyun.com/articles/476766