General IPC timeout control

Timeouts are not limited to the DataNode. For example, when YARN's own communication times out, it reports an error like the following.


failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel

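Increase ipc.ping.interval in core-site.xml.
If ipc.client.ping is disabled, increase ipc.client.rpc-timeout.ms instead.
The Hadoop 3.1.1 source code follows. It shows that a positive ipc.client.rpc-timeout.ms is always used as the RPC timeout. When that value is 0 or negative, the result depends on ipc.client.ping: with ping (heartbeat) enabled there is no RPC timeout and the ping interval governs, and with ping disabled the timeout equals the ping interval.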

/**
 * The time after which a RPC will timeout.
 * If ping is not enabled (via ipc.client.ping), then the timeout value is the
 * same as the pingInterval.
 * If ping is enabled, then there is no timeout value.
 *
 * @param conf Configuration
 * @return the timeout period in milliseconds. -1 if no timeout value is set
 * @deprecated use {@link #getRpcTimeout(Configuration)} instead
 */
@Deprecated
final public static int getTimeout(Configuration conf) {
  int timeout = getRpcTimeout(conf);
  if (timeout > 0) {
    return timeout;
  }
  if (!conf.getBoolean(CommonConfigurationKeys.IPC_CLIENT_PING_KEY,
      CommonConfigurationKeys.IPC_CLIENT_PING_DEFAULT)) {
    return getPingInterval(conf);
  }
  return -1;
}

/**
 * The time after which a RPC will timeout.
 *
 * @param conf Configuration
 * @return the timeout period in milliseconds.
 */
public static final int getRpcTimeout(Configuration conf) {
  int timeout =
      conf.getInt(CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_KEY,
          CommonConfigurationKeys.IPC_CLIENT_RPC_TIMEOUT_DEFAULT);
  return (timeout < 0) ? 0 : timeout;
}
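
To see which value ends up in effect for a given combination of settings, the decision logic above can be reproduced in a small stand-alone sketch. This is not Hadoop code: the class name IpcTimeoutSketch is hypothetical, a plain Map stands in for Hadoop's Configuration, and the defaults used (ping enabled, 60000 ms ping interval, RPC timeout 0) are assumptions that should be checked against the core-default.xml of your release.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Stand-alone model of the getTimeout/getRpcTimeout logic quoted above.
 * A Map stands in for Hadoop's Configuration (illustration only).
 */
public class IpcTimeoutSketch {
    // Property names as used in core-site.xml.
    static final String PING_KEY = "ipc.client.ping";
    static final String PING_INTERVAL_KEY = "ipc.ping.interval";
    static final String RPC_TIMEOUT_KEY = "ipc.client.rpc-timeout.ms";

    // Assumed defaults; verify against your Hadoop version.
    static final boolean PING_DEFAULT = true;
    static final int PING_INTERVAL_DEFAULT = 60000;
    static final int RPC_TIMEOUT_DEFAULT = 0;

    // Read an int property, falling back to the default when absent.
    static int getInt(Map<String, String> conf, String key, int def) {
        String value = conf.get(key);
        return value == null ? def : Integer.parseInt(value.trim());
    }

    // Mirrors getRpcTimeout: negative values are treated as 0.
    public static int getRpcTimeout(Map<String, String> conf) {
        int timeout = getInt(conf, RPC_TIMEOUT_KEY, RPC_TIMEOUT_DEFAULT);
        return (timeout < 0) ? 0 : timeout;
    }

    // Mirrors getTimeout: explicit RPC timeout first, then the ping setting.
    public static int getTimeout(Map<String, String> conf) {
        int timeout = getRpcTimeout(conf);
        if (timeout > 0) {
            return timeout;
        }
        String ping = conf.get(PING_KEY);
        boolean pingEnabled =
            ping == null ? PING_DEFAULT : Boolean.parseBoolean(ping.trim());
        if (!pingEnabled) {
            return getInt(conf, PING_INTERVAL_KEY, PING_INTERVAL_DEFAULT);
        }
        return -1; // ping enabled and no explicit RPC timeout
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(getTimeout(conf)); // prints -1

        conf.put(PING_KEY, "false");
        System.out.println(getTimeout(conf)); // prints 60000

        conf.put(RPC_TIMEOUT_KEY, "120000");
        System.out.println(getTimeout(conf)); // prints 120000
    }
}
```

The three cases in main correspond to the three branches of getTimeout: no timeout when only ping is active, the ping interval when ping is turned off, and the explicit RPC timeout whenever it is positive.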

References

https://www.cnblogs.com/yjt1993/p/11164492.html
https://yq.aliyun.com/articles/476766