Log collection

  • Amazon Linux, CentOS, RHEL, Ubuntu: sosreport [1]
  • SLES, openSUSE: supportconfig [2]
  • Amazon Linux 2: no tools
  • Kdump on Nitro instances (all distributions) [3]

[1] https://access.redhat.com/solutions/3592
[2] https://kb.vmware.com/s/article/2032614
[3] https://www.cnblogs.com/terryares/articles/12034214.html

sosreport collects a fairly complete set of data, including some configuration files.
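
The mapping in the list above can be sketched as a small helper that picks a collector from the distribution ID. This is a rough sketch, not a verified procedure: the function names are made up for illustration, the ID strings (amzn, centos, rhel, ubuntu, sles, opensuse*) are assumed to match the ID field in /etc/os-release, and Amazon Linux 2 is assumed to report VERSION_ID="2".

```shell
#!/bin/bash
# Hypothetical helper: map an os-release ID and VERSION_ID to the
# collector named in the list above.
pick_log_tool() {
  case "$1:$2" in
    amzn:2)                           echo "none" ;;          # Amazon Linux 2: no tools
    amzn:*|centos:*|rhel:*|ubuntu:*)  echo "sosreport" ;;
    sles:*|opensuse*:*)               echo "supportconfig" ;;
    *)                                echo "unknown" ;;
  esac
}

# Hypothetical helper: print "ID VERSION_ID" from an os-release file.
# The subshell keeps the sourced variables out of the caller's environment.
detect_os() {
  local file="${1:-/etc/os-release}"
  ( . "$file" && printf '%s %s\n' "$ID" "$VERSION_ID" )
}

# Example (not run here):
#   read -r id ver < <(detect_os)
#   pick_log_tool "$id" "$ver"
```

On a matching system the chosen tool is then run as root, for example sosreport --batch for a non-interactive run.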

Boot process

[Figure: boot process diagram]

GRUB repair

Screenshot of a broken GRUB
[Figure: screenshot of the broken GRUB screen]
GRUB repair commands
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-working_with_the_grub_2_boot_loader#sec-Reinstalling_GRUB_2
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/installation_guide/sect-grub-installing

Alternatively, copying the first 512 bytes (the boot sector) with a command such as dd is sufficient.
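
A minimal sketch of the 512-byte copy, assuming a BIOS/MBR disk and GNU dd. The function names are hypothetical. Note that the 512-byte sector holds both the boot code (first 446 bytes) and the partition table, so the restore helper below writes only the boot code to avoid overwriting the target's partition table; that choice is an assumption, not something the notes above specify.

```shell
#!/bin/bash
# Hypothetical helper: save the first 512 bytes (boot code + partition
# table) of a disk or image file.
backup_boot_sector() {
  local src="$1" dst="$2"
  dd if="$src" of="$dst" bs=512 count=1 status=none
}

# Hypothetical helper: write back only the 446 bytes of boot code.
# conv=notrunc keeps the rest of the target intact when it is a file.
restore_boot_code() {
  local backup="$1" target="$2"
  dd if="$backup" of="$target" bs=446 count=1 conv=notrunc status=none
}

# Example (not run here; device names are placeholders):
#   backup_boot_sector /dev/nvme0n1 /root/mbr-good.bin
#   restore_boot_code /root/mbr-good.bin /dev/nvme1n1
```

Try it on a disk image or a snapshot-backed volume first; a wrong target device is not recoverable.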

Command to check GRUB
root@ip-172-31-14-197:~# file -s /dev/nvme0n1
/dev/nvme0n1: DOS/MBR boot sector

Drivers for fourth-generation instances include netfront and blkfront.

If a package installation broke the system, run rpm -Va --root=/ to see which packages were modified, then reinstall them, for example: yum reinstall glibc --installroot=/mnt
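
The output of rpm -Va is long, so a filter helps. The sketch below is an assumption-laden example: it relies on the usual rpm -V line format (an attribute string whose third character is 5 when the file digest differs, an optional c marker for config files, then the path), and the function name is made up. It keeps only non-config files that are missing or whose digest changed.

```shell
#!/bin/bash
# Hypothetical helper: read `rpm -Va` output on stdin and print
# "missing <path>" or "changed <path>" for non-config files only.
# Paths containing spaces are not handled.
list_damaged_files() {
  awk '
    $1 == "missing" && $2 != "c" { print "missing " $NF; next }
    length($1) >= 8 && substr($1, 3, 1) == "5" && $2 != "c" { print "changed " $NF }
  '
}

# Example (not run here), with the broken root volume mounted at /mnt:
#   rpm -Va --root=/mnt | list_damaged_files
#   rpm -qf --root=/mnt /usr/lib64/libc.so.6   # find the owning package
```

Changed config files are skipped because edits to them are usually intentional.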

User data

Too many logs to serial console
https://paragon-cn.amazon.com/hz/view-case?caseId=1450001374
Python corrupted
https://paragon-cn.amazon.com/hz/view-case?caseId=1446445574
user data logs
/var/log/cloud-init.log
/var/log/cloud-init-output.log
Debug mode

#!/bin/bash -ex

exec > >(tee /var/log/user-data.log|logger -t user-data -s 2>/dev/console) 2>&1
echo BEGIN
date '+%Y-%m-%d %H:%M:%S'
echo END

Network issues

Asymmetric routing issue
https://paragon-cn.amazon.com/hz/view-case?caseId=1451319384
Network configuration missing
https://paragon-cn.amazon.com/hz/view-case?caseId=1450776794
NIC name changed
https://paragon-cn.amazon.com/hz/view-case?caseId=1452185724

fstab

Mount by filesystem UUID or label rather than by device name
Add the _netdev option when mounting an NFS or SMB share
Case:
https://paragon-cn.amazon.com/hz/view-case?caseId=1450779874
blkid
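
The two fstab rules above can be sketched as helpers that print a ready-to-paste line. The function names, the sample device and share names, and the dump/pass values (0 2 for local, 0 0 for network) are illustrative assumptions, not taken from the case.

```shell
#!/bin/bash
# Hypothetical helper: fstab line for a local filesystem, keyed by UUID.
fstab_uuid_entry() {
  local uuid="$1" mountpoint="$2" fstype="$3" options="${4:-defaults}"
  printf 'UUID=%s\t%s\t%s\t%s\t0\t2\n' "$uuid" "$mountpoint" "$fstype" "$options"
}

# Hypothetical helper: fstab line for a network share. _netdev marks the
# mount as requiring the network, so it is not attempted before the
# network is up.
fstab_netdev_entry() {
  local source="$1" mountpoint="$2" fstype="$3"
  printf '%s\t%s\t%s\t%s\t0\t0\n' "$source" "$mountpoint" "$fstype" "defaults,_netdev"
}

# Example (not run here; /dev/nvme1n1p1 and fileserver are placeholders):
#   fstab_uuid_entry "$(blkid -s UUID -o value /dev/nvme1n1p1)" /data xfs
#   fstab_netdev_entry fileserver:/export /mnt/nfs nfs
```

After editing /etc/fstab, running mount -a before rebooting catches most mistakes while the instance is still reachable.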

SSH

SSH service failed to start
Use Session Manager to log in to the system

https://paragon-cn.amazon.com/hz/view-case?caseId=1452276314

Linux permission checks
/var/log/secure
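
One common permission problem is a home directory, .ssh directory, or authorized_keys file that is writable by group or others; with the default StrictModes setting, sshd refuses key logins in that case and records the refusal in /var/log/secure. A rough sketch of a check, assuming GNU find and a hypothetical function name:

```shell
#!/bin/bash
# Hypothetical helper: print each given path that is writable by group
# or others (any of the 022 permission bits set).
find_loose_perms() {
  local path
  for path in "$@"; do
    [ -e "$path" ] || continue
    find "$path" -maxdepth 0 -perm /022 -print
  done
}

# Example (not run here; ec2-user is a placeholder account):
#   find_loose_perms /home/ec2-user /home/ec2-user/.ssh \
#                    /home/ec2-user/.ssh/authorized_keys
#   sshd -t    # also validates sshd_config syntax and host keys
```

Empty output means none of the listed paths has the group or other write bit set; ownership still needs a separate look.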

Kernel modules
lsinitrd /boot/initramfs-4.18.0-305.19.1.el8_4.x86_64.img
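
To confirm that the drivers an instance type needs are actually in the initramfs, the lsinitrd listing can be filtered. This is a sketch under assumptions: the function name is hypothetical, and the module file names xen-netfront and xen-blkfront are assumed to correspond to the netfront and blkfront drivers mentioned earlier. Drivers built into the kernel do not appear as .ko files, so a "missing" result is a hint, not proof.

```shell
#!/bin/bash
# Hypothetical helper: read an lsinitrd listing on stdin and print each
# requested module that has no matching .ko file (optionally compressed,
# e.g. .ko.xz) in the listing.
missing_modules() {
  local listing module
  listing=$(cat)
  for module in "$@"; do
    if ! printf '%s\n' "$listing" | grep -Eq "/${module}\.ko(\.[a-z0-9]+)?\$"; then
      printf '%s\n' "$module"
    fi
  done
}

# Example (not run here):
#   lsinitrd /boot/initramfs-4.18.0-305.19.1.el8_4.x86_64.img \
#     | missing_modules xen-netfront xen-blkfront
```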