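When I got to work this morning, a developer told me that Java hot code updates were working on some machines but not on others.
So I logged in to the server and checked the application log.
Yesterday's class hot update had failed with the following error: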
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
Use jps to list the pids:
[root@txy-zh-ysj-game1 ~]# jps
19968 ChatApplication
19873 ChatApplication
5858 Application
5989 Application
17829 Jps
5800 Application
19817 ChatApplication
16814 Application
5743 Application
19920 ChatApplication
20020 ChatApplication
19764 ChatApplication
5919 Application
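With this many processes on one host, it helps to pull out just the pids of one application before probing them. A minimal sketch, where jps_pids is a hypothetical helper name of mine, not something from our tooling:

```shell
# Read `jps` output on stdin and print the pids whose main class equals NAME.
# jps_pids is a hypothetical helper name used only for this illustration.
jps_pids() {
    # jps prints "<pid> <main class>" per line; match the second field exactly
    awk -v name="$1" '$2 == name { print $1 }'
}
```

It can be fed directly from jps, for example: jps | jps_pids Application. Each resulting pid can then be handed to jstack to see which processes still accept an attach.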
Dump the stack of one of the affected pids:
[root@txy-zh-ysj-game1 2020-05-19]# jstack 5858
5858: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
[root@txy-zh-ysj-game1 2020-05-19]# jstack 5858 -F
Attaching to core -F from executable 5858, please wait...
Error attaching to core file: cannot open binary file
sun.jvm.hotspot.debugger.DebuggerException: cannot open binary file
at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.attach0(Native Method)
at sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.attach(LinuxDebuggerLocal.java:286)
at sun.jvm.hotspot.HotSpotAgent.attachDebugger(HotSpotAgent.java:673)
at sun.jvm.hotspot.HotSpotAgent.setupDebuggerLinux(HotSpotAgent.java:611)
at sun.jvm.hotspot.HotSpotAgent.setupDebugger(HotSpotAgent.java:337)
at sun.jvm.hotspot.HotSpotAgent.go(HotSpotAgent.java:304)
at sun.jvm.hotspot.HotSpotAgent.attach(HotSpotAgent.java:156)
at sun.jvm.hotspot.tools.Tool.start(Tool.java:191)
at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
at sun.tools.jstack.JStack.main(JStack.java:106)
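jstack reports the same error as the hot update: the socket file cannot be opened. (In the second command -F was placed after the pid, so jstack treated 5858 as an executable and -F as a core file; the intended form is jstack -F 5858.)
Searching on Baidu turned up the following:
At run time the JVM creates a directory named hsperfdata_$USER ($USER is the user that started the Java process), by default under /tmp on Linux. It contains one file per JVM, named after the pid, holding information about that process.
Tools such as jps read the pid files under /tmp/hsperfdata_$USER to find running JVMs.
But the files under /tmp/hsperfdata_root are all there, so why can't the process be reached?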
[root@txy-zh-ysj-game1 2020-05-19]# ls /tmp/hsperfdata_root/
16814 19764 19817 19873 19920 19968 20020 5743 5800 5858 5919 5989
Check which files process 5858 has open:
[root@txy-zh-ysj-game1 ~]# lsof -p 5858
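The output shows that the process is using /tmp/.java_pid5858.tmp.
I then listed all the files in /tmp.
There is no .java_pid5858.tmp file there, but there are three similar files belonging to other processes.
Checking the type of those other files shows that they are sockets: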
[root@txy-zh-ysj-game1 ~]# file /tmp/.java_pid19764
/tmp/.java_pid19764: socket
[root@txy-zh-ysj-game1 ~]# file /tmp/.java_pid19968
/tmp/.java_pid19968: socket
Dumping the stack of 19968 works normally:
[root@txy-zh-ysj-game1 ~]# jstack 19968
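That pretty much confirms the problem: the socket file for pid 5858 (reported by lsof as /tmp/.java_pid5858.tmp) is missing from /tmp, so the stack cannot be dumped. Java hot update relies on the same socket file.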
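The manual comparison above can be sketched as a small shell function. This is only an illustration under the assumptions observed on this host: perf-data files live in <tmpdir>/hsperfdata_<user>/<pid> and attach sockets are named <tmpdir>/.java_pid<pid>. The function name missing_attach_sockets is made up for this example.

```shell
# Print the pids that have a perf-data file but no attach socket.
# Assumed layout (as seen on this host):
#   <tmpdir>/hsperfdata_<user>/<pid>   perf-data file
#   <tmpdir>/.java_pid<pid>            attach socket
missing_attach_sockets() {
    local tmpdir="${1:-/tmp}"
    local perf pid
    for perf in "$tmpdir"/hsperfdata_*/*; do
        [ -e "$perf" ] || continue       # the glob matched nothing
        pid="${perf##*/}"                # strip the directory part
        case "$pid" in
            ''|*[!0-9]*) continue ;;     # skip names that are not a pid
        esac
        # -S is true only if the path exists and is a socket
        [ -S "$tmpdir/.java_pid$pid" ] || echo "$pid"
    done
}
```

Calling missing_attach_sockets with no argument checks /tmp. As far as I know, a JVM that has never been attached to may not have created its socket yet, so the output is a list of candidates rather than a list of broken processes.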
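Next question: why did the file disappear? I never deleted anything from /tmp.
Hot updates had worked fine a few days earlier.
According to our records, the last hot update before this one was on May 15, and it succeeded everywhere.
On the 19th, the hot update took effect on only some of the processes.
The only thing left was to check the system log.
These are the system log entries after May 15: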
May 16 18:56:01 txy-zh-ysj-game1 systemd: Starting Cleanup of Temporary Directories...
May 16 18:56:01 txy-zh-ysj-game1 systemd: Started Cleanup of Temporary Directories.
Two entries stand out. The culprit turns out to be systemd-tmpfiles-clean.service.
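To line cleanup runs up against the dates of the hot updates, the completion timestamps can be extracted from the syslog file. A rough sketch, assuming the log is in the classic syslog format shown above and lives in /var/log/messages; cleanup_runs is a hypothetical helper name.

```shell
# Print the timestamp of every completed tmpfiles cleanup found in a syslog file.
# Assumes lines start with "Mon DD HH:MM:SS", as in the excerpt above.
cleanup_runs() {
    local logfile="${1:-/var/log/messages}"
    # -F: match the message as a fixed string, then keep the three date fields
    grep -F 'Started Cleanup of Temporary Directories' "$logfile" |
        awk '{ print $1, $2, $3 }'
}
```

Run without arguments it reads /var/log/messages; a rotated log can be passed as the first argument.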
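Linux automatically cleans up files under /tmp according to configured rules.
CentOS 6 uses tmpwatch, which is not installed by default (no wonder hot updates never failed for our programs deployed on CentOS 6).
CentOS 7 cleans temporary files through systemd-tmpfiles-clean.service. The cleanup rules are defined in /usr/lib/tmpfiles.d/tmp.conf, the command invoked is /usr/bin/systemd-tmpfiles --clean, and the schedule is managed by systemd-tmpfiles-clean.timer.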
[root@txy-zh-ysj-game1 2020-05-19]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
[root@txy-zh-ysj-game1 2020-05-19]# cat /usr/lib/systemd/system/systemd-tmpfiles-clean.timer
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
[Unit]
Description=Daily Cleanup of Temporary Directories
Documentation=man:tmpfiles.d(5) man:systemd-tmpfiles(8)
[Timer]
OnBootSec=15min
# Run the service 15 minutes after boot
OnUnitActiveSec=1d
# Run the service again 1 day after it was last activated
[root@txy-zh-ysj-game1 2020-05-19]# cat /usr/lib/systemd/system/systemd-tmpfiles-clean.service
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
[Unit]
Description=Cleanup of Temporary Directories
Documentation=man:tmpfiles.d(5) man:systemd-tmpfiles(8)
DefaultDependencies=no
Conflicts=shutdown.target
After=systemd-readahead-collect.service systemd-readahead-replay.service local-fs.target time-sync.target
Before=shutdown.target
[Service]
Type=oneshot
ExecStart=/usr/bin/systemd-tmpfiles --clean
IOSchedulingClass=idle
[root@txy-zh-ysj-game1 2020-05-19]# cat /usr/lib/tmpfiles.d/tmp.conf
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
# See tmpfiles.d(5) for details
# Clear tmp directories separately, to make them easier to override
v /tmp 1777 root root 10d
v /var/tmp 1777 root root 30d
# Exclude namespace mountpoints created with PrivateTmp=yes
x /tmp/systemd-private-%b-*
X /tmp/systemd-private-%b-*/tmp
x /var/tmp/systemd-private-%b-*
X /var/tmp/systemd-private-%b-*/tmp
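v  create the directory if it is missing (as a btrfs subvolume where possible); entries in it older than the age in the last column are cleaned up
x  ignore the path during cleanup, together with everything beneath it; shell wildcards are allowed
X  ignore the path itself during cleanup, but not its contents; shell wildcards are allowed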
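The 10d on the v /tmp line is the age threshold. To get a feel for which files would be at risk, here is a simplified approximation of the age test. It assumes a file counts as old only when its atime, mtime and ctime are all older than the threshold, which is my reading of tmpfiles.d(5) and not something I verified against the systemd source. The function name older_than_days and the optional "now" argument are made up for this sketch, and GNU stat is required.

```shell
# Succeed if every timestamp of FILE is older than DAYS days.
# Usage: older_than_days FILE DAYS [NOW_EPOCH]
# Approximation only; the real systemd-tmpfiles logic may differ.
older_than_days() {
    local file="$1" days="$2" now="${3:-$(date +%s)}"
    local stamps newest t
    # GNU stat: %X = atime, %Y = mtime, %Z = ctime, in epoch seconds
    stamps=$(stat -c '%X %Y %Z' -- "$file") || return 2
    newest=0
    for t in $stamps; do
        if [ "$t" -gt "$newest" ]; then newest=$t; fi
    done
    # old enough only if even the newest timestamp is before the cutoff
    [ "$newest" -lt $(( now - days * 86400 )) ]
}
```

For example, older_than_days /tmp/.java_pid19968 10 && echo "would be eligible for cleanup". If connecting to the socket does not refresh its timestamps, this would explain why the sockets of long-running processes vanished while those of recently started ones survived, but that is an inference on my part and not something the logs confirm.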
Detailed documentation for tmpfiles.d:
http://www.jinbuguo.com/systemd/tmpfiles.d.html
Final fix:
echo "x /tmp/hsperfdata*" >>/usr/lib/tmpfiles.d/tmp.conf
echo "X /tmp/.java_*" >>/usr/lib/tmpfiles.d/tmp.conf
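One caveat with plain echo >> is that running it twice leaves duplicate lines in the file. A small sketch of an idempotent variant, using the same patterns as the playbook; append_line_once is a hypothetical helper name.

```shell
# Append LINE to FILE only if FILE does not already contain that exact line.
append_line_once() {
    local file="$1" line="$2"
    # -q quiet, -x whole-line match, -F fixed string (so * stays literal)
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage would look like append_line_once /usr/lib/tmpfiles.d/tmp.conf 'x /tmp/hsperfdata*'. Because the target file is a parameter, the same function works if the rules are kept in a separate file under /etc/tmpfiles.d instead, which should survive systemd package updates; I have not tried that variant on these hosts.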
Finally, I wrote this up as a playbook and added it to our system initialization so the problem does not recur.
no_clean_java_tmpfile.yml
---
- hosts: txy-zh-ysj-game1
  remote_user: root
  tasks:
    # Only touch tmp.conf if it exists on the target host
    - name: check that /usr/lib/tmpfiles.d/tmp.conf exists
      shell: ls /usr/lib/tmpfiles.d/tmp.conf
      ignore_errors: True
      register: result
    - name: add x /tmp/hsperfdata*
      lineinfile: dest=/usr/lib/tmpfiles.d/tmp.conf line='x /tmp/hsperfdata*'
      when: result is succeeded
      # notify here as well, so the handler runs even if only this line is new
      notify: addline_handlers
    - name: add X /tmp/.java_*
      lineinfile: dest=/usr/lib/tmpfiles.d/tmp.conf line='X /tmp/.java_*'
      when: result is succeeded
      notify: addline_handlers
  handlers:
    - name: addline_handlers
      service: name=systemd-tmpfiles-clean state=restarted enabled=yes
# - shell: echo "file does not exist"
#   when: result is failed