Upgrading the Kernel on CentOS and Ubuntu

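Introduction

A k8s worker node in our production environment developed a memory leak, caused by an incompatibility between kmem and the 3.10 kernel.

For details, see cloud.tencent.com/develo…

Upgrading the kernel directly is the simplest fix.

There are three common ways to change the kernel on a Linux system:

  1. Build the kernel from source
  2. Download and install an officially prebuilt package
  3. Install the kernel with the package manager

This article uses CentOS 7 and Ubuntu 20.04, two distributions common in enterprise environments, to show how to update the kernel with the package manager.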

CentOS 7

Start with a freshly created server.

# Check the kernel version
[root@ecs-images ~]# uname -a
Linux ecs-images 3.10.0-1160.92.1.el7.x86_64 #1 SMP Tue Jun 20 11:48:01 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
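When many nodes need checking, the release string can be reduced to its major.minor series so the affected 3.10 hosts are easy to spot. This is a minimal sketch; the function name kernel_series is made up for illustration and is not part of any distribution tooling.

```shell
#!/bin/sh
# kernel_series RELEASE: print the "major.minor" part of a kernel release
# string. Hypothetical helper for illustration only.
kernel_series() {
    # Keep the first two dot-separated fields, e.g. 3.10.0-1160... -> 3.10
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Example: report whether this host still runs the 3.10 series.
if [ "$(kernel_series "$(uname -r)")" = "3.10" ]; then
    echo "this host runs a 3.10 kernel"
else
    echo "this host does not run a 3.10 kernel"
fi
```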

yum

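Repository configuration

elrepo: a project that provides additional packages for RHEL-family distributions. It is community-maintained, and its packages are tested and verified before release, which helps ensure stability and compatibility.

Import the elrepo repository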
[root@ecs-images ~]# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
[root@ecs-images ~]# yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm

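Refresh the package cache after importing the repository

This step is optional: yum also builds a fresh cache automatically when it runs later operations.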

[root@ecs-images ~]# yum makecache

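Important

After importing a new repository, do not casually run yum update -y. Every operation in ops work must have a deterministic outcome.

Many people run yum update -y to refresh the package cache, but that is a mistake, because it updates every updatable package on the system. In production especially, unnecessary updates easily introduce unexpected risk.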
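One way to make this rule hard to break by accident is a thin wrapper that rejects a blanket update. The sketch below is only an illustration: guarded_yum and the YUM_BIN override are hypothetical names, and it assumes the subcommand is the first argument and that options are written in the --name=value form used in this article.

```shell
#!/bin/sh
# guarded_yum ARGS...: pass ARGS to yum, but refuse "update"/"upgrade" when no
# package is named. Hypothetical wrapper; YUM_BIN is an assumed override that
# lets the wrapper be dry-run with another command such as echo.
guarded_yum() {
    subcmd=$1
    pkgs=0
    first=1
    # Count the arguments after the subcommand that are not options.
    for arg in "$@"; do
        if [ "$first" -eq 1 ]; then
            first=0
            continue
        fi
        case $arg in
            -*) ;;
            *) pkgs=$((pkgs + 1)) ;;
        esac
    done
    case $subcmd in
        update|upgrade)
            if [ "$pkgs" -eq 0 ]; then
                echo "refusing blanket 'yum $subcmd': name the packages explicitly" >&2
                return 1
            fi
            ;;
    esac
    "${YUM_BIN:-yum}" "$@"
}

# Usage on a real host (not run here):
#   guarded_yum makecache          # allowed
#   guarded_yum update -y          # rejected, returns 1
```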

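List available kernel versions

Kernel flavors:

  • kernel-ml: ml stands for "mainline stable", i.e. the latest stable mainline release.
  • kernel-lt: lt stands for "long term support", i.e. the long-term support release listed in elrepo-kernel.
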
[root@ecs-images ~]# yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * elrepo-kernel: mirrors.thzhost.com
elrepo-kernel                                                                                | 3.0 kB  00:00:00   
elrepo-kernel/primary_db                                                                     | 2.1 MB  00:00:05   
Available Packages
kernel-lt.x86_64                                          5.4.251-1.el7.elrepo                         elrepo-kernel
kernel-lt-devel.x86_64                                    5.4.251-1.el7.elrepo                         elrepo-kernel
kernel-lt-doc.noarch                                      5.4.251-1.el7.elrepo                         elrepo-kernel
kernel-lt-headers.x86_64                                  5.4.251-1.el7.elrepo                         elrepo-kernel
kernel-lt-tools.x86_64                                    5.4.251-1.el7.elrepo                         elrepo-kernel
kernel-lt-tools-libs.x86_64                               5.4.251-1.el7.elrepo                         elrepo-kernel
kernel-lt-tools-libs-devel.x86_64                         5.4.251-1.el7.elrepo                         elrepo-kernel
kernel-ml.x86_64                                          6.4.7-1.el7.elrepo                           elrepo-kernel
kernel-ml-devel.x86_64                                    6.4.7-1.el7.elrepo                           elrepo-kernel
kernel-ml-doc.noarch                                      6.4.7-1.el7.elrepo                           elrepo-kernel
kernel-ml-headers.x86_64                                  6.4.7-1.el7.elrepo                           elrepo-kernel
kernel-ml-tools.x86_64                                    6.4.7-1.el7.elrepo                           elrepo-kernel
kernel-ml-tools-libs.x86_64                               6.4.7-1.el7.elrepo                           elrepo-kernel
kernel-ml-tools-libs-devel.x86_64                         6.4.7-1.el7.elrepo                           elrepo-kernel
perf.x86_64                                               5.4.251-1.el7.elrepo                         elrepo-kernel
python-perf.x86_64                                        5.4.251-1.el7.elrepo                         elrepo-kernel
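For scripting, the version column can be pulled out of this listing instead of being read by eye. A minimal sketch, assuming each package sits on a single line as in the output above; available_version is a hypothetical helper name.

```shell
#!/bin/sh
# available_version PKG: read "yum list available" output on stdin and print
# the version column for PKG. Hypothetical helper; assumes one package per
# line, as in the listing shown in this article.
available_version() {
    awk -v pkg="$1" '$1 == pkg { print $2; exit }'
}

# Usage on a real host (not run here):
#   yum --disablerepo="*" --enablerepo="elrepo-kernel" list available \
#       | available_version kernel-lt.x86_64
```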

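Update the kernel

Install the kernel

Install a specific kernel version, or simply install the latest one. kernel-lt is recommended; it usually offers better compatibility than kernel-ml.

# Install a specific lt version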
[root@ecs-images ~]# yum --enablerepo=elrepo-kernel install kernel-lt-devel-5.4.251-1.el7.elrepo.x86_64 kernel-lt-5.4.251-1.el7.elrepo.x86_64 -y

# Install the latest lt version
[root@ecs-images ~]# yum --enablerepo=elrepo-kernel install kernel-lt-devel kernel-lt -y

List the kernels present on the system
[root@ecs-images ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (5.4.251-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-1160.92.1.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-1160.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-57beda17722b499da37e22c55c2ef57f) 7 (Core)

Note that the newly installed kernel is listed first in /etc/grub2.cfg.

Check the default boot entry
[root@ecs-images ~]# grub2-editenv list
saved_entry=CentOS Linux (3.10.0-1160.92.1.el7.x86_64) 7 (Core)

Set the kernel boot order
[root@ecs-images ~]# grub2-set-default 0

Check the default boot entry again
[root@ecs-images ~]# grub2-editenv list
saved_entry=0

The value of saved_entry has changed from the specific kernel title CentOS Linux (3.10.0-1160.92.1.el7.x86_64) 7 (Core) to the index 0.
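Index 0 is correct here only because the new kernel happens to be listed first. The index can also be looked up rather than assumed. The sketch below reuses the awk field-splitting trick from the listing command; grub_entry_index is a hypothetical helper name, and it assumes the flat menu layout seen on this CentOS 7 host.

```shell
#!/bin/sh
# grub_entry_index CFG PATTERN: print the zero-based index of the first
# menuentry in CFG whose title contains PATTERN; fail if none matches.
# Hypothetical helper; assumes a flat grub.cfg without submenus.
grub_entry_index() {
    awk -F"'" -v want="$2" '
        BEGIN { n = 0 }
        $1 == "menuentry " {
            if (index($2, want) > 0) { print n; found = 1; exit }
            n++
        }
        END { if (!found) exit 1 }
    ' "$1"
}

# Usage on a real host (not run here):
#   idx=$(grub_entry_index /etc/grub2.cfg 5.4.251-1.el7.elrepo.x86_64) \
#       && grub2-set-default "$idx"
```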

Reboot and verify
[root@ecs-images ~]# reboot
Connection to 172.27.3.74 closed by remote host.
Connection to 172.27.3.74 closed.

 @ops-2701 /home/pengyinwei$ ssh root@172.27.3.74 "sudo sh -c 'uname -a '"
root@172.27.3.74's password: 
Linux ecs-images 5.4.251-1.el7.elrepo.x86_64 #1 SMP Thu Jul 27 18:49:53 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
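The same check can be scripted as a version comparison instead of reading the uname output by eye. A minimal sketch, assuming GNU coreutils (for sort -V); version_ge is a hypothetical helper name.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to B.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    # The smaller version sorts first, so A >= B when the first line is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: confirm the running kernel is at least 5.4 after the reboot.
if version_ge "$(uname -r)" "5.4"; then
    echo "kernel is 5.4 or newer"
else
    echo "kernel is older than 5.4"
fi
```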

ansible playbook

Orchestrating the steps in an ansible playbook captures the procedure and makes it easy to run again later.

Example playbook

# Show the directory layout
@ops-2701 /ops/scripts/os_init/test$ ls
hosts  update-kernel.yml

# hosts file: lists the hosts to run against. Because this is a brand-new machine, ssh login uses a username and password here
@ops-2701 /ops/scripts/os_init/test$ cat hosts
[test]
test-images ansible_host=172.27.3.74 ansible_user="root" ansible_ssh_pass="thisisafakepassword"

# The ansible playbook
@ops-2701 /ops/scripts/os_init/test$ cat update-kernel.yml 
- name: Upgrade Kernel
  hosts: test
  gather_facts: false
  become: yes
  tasks:
    # Import the ELRepo GPG key
    - name: Kernel | 导入key
      shell: rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
      args:
        warn: false

    # Install the ELRepo repository definition
    - name: Kernel | 导入仓库
      shell: yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm -y
      args:
        warn: false

    # Install or update kernel-lt and kernel-lt-devel from elrepo-kernel
    - name: Kernel | 更新内核
      yum:
        name: kernel-lt-devel,kernel-lt
        enablerepo: elrepo-kernel
        state: latest

    # Make the first GRUB menu entry the default
    - name: Kernel | 设置内核启动顺序
      shell: grub2-set-default 0
      args:
        warn: false

    # Record the title of the first menu entry (the newly installed kernel)
    - name: Kernel | 记录更新的内核版本
      shell: awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg|head -n 1
      register: kernel_version_1

    # Show the installed kernel and ask the operator to type yes before rebooting
    - name: Kernel | 确认是否重启
      pause:
        prompt: "安装的内核版本是{{ kernel_version_1['stdout_lines'] }}.确认是否重启,请输入(yes)以确定... "
      register: my_pause_1
      delegate_to: localhost

    # Report that the reboot is skipped when the answer was not yes
    - name: Kernel | 确认用户输入
      debug:
        msg: "未输入yes,不进行重启"
      when: my_pause_1.user_input != "yes"

    # Reboot and wait until uname -r succeeds again
    - name: Kernel | 重启
      reboot:
        msg: "等待重启完成..."
        test_command: uname -r
      when: my_pause_1.user_input == "yes"

    # Record the running kernel version after the reboot
    - name: Kernel | 记录当前内核版本
      shell: uname -r
      register: kernel_version_2
      when: my_pause_1.user_input == "yes"

    # Print the running kernel version
    - name: Kernel | 打印当前内核版本
      debug:
        msg: "{{ kernel_version_2['stdout_lines'] }}"
      when: my_pause_1.user_input == "yes"

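The playbook contains a high-risk operation, reboot. The pause task leaves the decision to the user: the reboot runs only when the user types yes, rather than running unconditionally.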
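The same confirmation gate can be written for a plain shell script. A minimal sketch mirroring the pause/when pair in the playbook; confirm_yes is a hypothetical helper name.

```shell
#!/bin/sh
# confirm_yes PROMPT: print PROMPT on stderr, read one line from stdin and
# succeed only if the line is exactly "yes". Hypothetical helper.
confirm_yes() {
    printf '%s ' "$1" >&2
    IFS= read -r answer || return 1
    [ "$answer" = "yes" ]
}

# Usage on a real host (not run here):
#   confirm_yes "Reboot now? Type yes to confirm:" && reboot
```

Anything other than the literal yes, including an empty line or closed input, leaves the machine running, which matches the playbook's behaviour.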

Run output

ansible  @ops-2701 /ops/scripts/os_init/test$ ansible-playbook -i hosts update-kernel.yml 

PLAY [Upgrade Kernel] **********************************************************************************************

TASK [Kernel | 导入key] **********************************************************************************************
Wednesday 02 August 2023  18:33:28 +0800 (0:00:00.026)       0:00:00.026 ****** 
changed: [test-images]

TASK [Kernel | 导入仓库] ***********************************************************************************************
Wednesday 02 August 2023  18:33:29 +0800 (0:00:01.223)       0:00:01.249 ****** 
changed: [test-images]

TASK [Kernel | 更新内核] ***********************************************************************************************
Wednesday 02 August 2023  18:33:31 +0800 (0:00:02.268)       0:00:03.517 ****** 
ok: [test-images]

TASK [Kernel | 设置内核启动顺序] *******************************************************************************************
Wednesday 02 August 2023  18:33:35 +0800 (0:00:03.628)       0:00:07.146 ****** 
changed: [test-images]

TASK [Kernel | 记录更新的内核版本] ******************************************************************************************
Wednesday 02 August 2023  18:33:35 +0800 (0:00:00.223)       0:00:07.370 ****** 
changed: [test-images]

TASK [Kernel | 确认是否重启] *********************************************************************************************
Wednesday 02 August 2023  18:33:35 +0800 (0:00:00.200)       0:00:07.571 ****** 
[Kernel | 确认是否重启]
安装的内核版本是['CentOS Linux (5.4.251-1.el7.elrepo.x86_64) 7 (Core)'].确认是否重启,请输入(yes)以确定... :
ok: [test-images]

TASK [Kernel | 确认用户输入] *********************************************************************************************
Wednesday 02 August 2023  18:33:39 +0800 (0:00:03.364)       0:00:10.936 ****** 

TASK [Kernel | 重启] *************************************************************************************************
Wednesday 02 August 2023  18:33:39 +0800 (0:00:00.060)       0:00:10.996 ****** 
changed: [test-images]

TASK [Kernel | 记录当前内核版本] *******************************************************************************************
Wednesday 02 August 2023  18:34:01 +0800 (0:00:22.328)       0:00:33.325 ****** 
changed: [test-images]

TASK [Kernel | 打印当前内核版本] *******************************************************************************************
Wednesday 02 August 2023  18:34:01 +0800 (0:00:00.360)       0:00:33.685 ****** 
ok: [test-images] => {
    "msg": [
        "5.4.251-1.el7.elrepo.x86_64"
    ]
}

PLAY RECAP *********************************************************************************************************
test-images                : ok=9    changed=6    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0   

Wednesday 02 August 2023  18:34:02 +0800 (0:00:00.071)       0:00:33.756 ****** 
=============================================================================== 
Kernel | 重启 ------------------------------------------------------------------------------------------------ 22.33s
Kernel | 更新内核 ----------------------------------------------------------------------------------------------- 3.63s
Kernel | 确认是否重启 --------------------------------------------------------------------------------------------- 3.36s
Kernel | 导入仓库 ----------------------------------------------------------------------------------------------- 2.27s
Kernel | 导入key ---------------------------------------------------------------------------------------------- 1.22s
Kernel | 记录当前内核版本 ------------------------------------------------------------------------------------------- 0.36s
Kernel | 设置内核启动顺序 ------------------------------------------------------------------------------------------- 0.22s
Kernel | 记录更新的内核版本 ------------------------------------------------------------------------------------------ 0.20s
Kernel | 打印当前内核版本 ------------------------------------------------------------------------------------------- 0.07s
Kernel | 确认用户输入 --------------------------------------------------------------------------------------------- 0.06s

Ubuntu 20.04

Here too, we created a brand-new server and checked the current kernel version.

root@ecs-images-ubuntu:~# uname -a
Linux ecs-images-ubuntu 5.4.0-153-generic #170-Ubuntu SMP Fri Jun 16 13:43:31 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

apt

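Repository configuration

Ubuntu's default repositories are enough for a kernel update; unlike on CentOS, no external repository needs to be imported.

Refresh the package index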
root@ecs-images-ubuntu:~# apt update

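Here is an interesting difference.

On CentOS, both yum update and yum upgrade directly update the available packages; the cache is refreshed with yum makecache.

On Ubuntu, apt update only refreshes the package index, while apt upgrade is what updates packages.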
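A script that targets both distributions can encode this difference once, so the metadata-only command is never confused with an upgrade. A minimal sketch; refresh_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# refresh_cmd MANAGER: print the command that only refreshes package metadata
# and upgrades nothing. Hypothetical helper for illustration.
refresh_cmd() {
    case $1 in
        yum) echo "yum makecache" ;;
        apt) echo "apt update" ;;
        *)   echo "unknown package manager: $1" >&2; return 1 ;;
    esac
}

# Example: show the right command for whichever manager exists on this host.
for pm in yum apt; do
    if command -v "$pm" >/dev/null 2>&1; then
        echo "refresh with: $(refresh_cmd "$pm")"
    fi
done
```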

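List available kernel versions

Kernel flavors:

  1. linux-image-generic: the kernel installed by default on Ubuntu 20.04, also called the GA (General Availability) kernel. It is the stable kernel series shipped when Ubuntu 20.04 was released.
  2. linux-image-generic-hwe-20.04: the HWE (Hardware Enablement) kernel for Ubuntu 20.04, which adds support for newer hardware. Newer HWE kernels started arriving through it with the Ubuntu 20.04.2 release.
  3. linux-image-lowlatency: a kernel for latency-sensitive workloads such as audio/video processing. It offers the same functionality as linux-image-generic but tunes kernel scheduling to reduce latency.
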
root@ecs-images-ubuntu:~# apt show linux-image-generic-hwe-20.04 -a
Package: linux-image-generic-hwe-20.04
Version: 5.15.0.78.85~20.04.38
Priority: optional
Section: kernel
Source: linux-meta-hwe-5.15
Origin: Ubuntu
Maintainer: Ubuntu Kernel Team <kernel-team@lists.ubuntu.com>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 20.5 kB
Provides: spl-modules (= 2.1.5-1ubuntu6~22.04.1), v4l2loopback-modules (= 0.12.7-2ubuntu2~22.04.1), virtualbox-guest-modules (= 5.15.0-78), wireguard-modules (= 1.0.0), zfs-modules (= 2.1.5-1ubuntu6~22.04.1)
Depends: linux-image-5.15.0-78-generic, linux-modules-extra-5.15.0-78-generic, linux-firmware, intel-microcode, amd64-microcode
Recommends: thermald
Download-Size: 2,720 B
APT-Manual-Installed: no
APT-Sources: http://repo.huaweicloud.com/ubuntu focal-updates/main amd64 Packages
Description: Generic Linux kernel image
 This package will always depend on the latest generic kernel image
 available.

Package: linux-image-generic-hwe-20.04
Version: 5.4.0.26.32
Priority: optional
Section: kernel
Source: linux-meta
Origin: Ubuntu
Maintainer: Ubuntu Kernel Team <kernel-team@lists.ubuntu.com>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 17.4 kB
Provides: virtualbox-guest-modules (= 6.1.6-dfsg-1), wireguard-modules (= 1.0.20200413-1), zfs-modules (= 0.8.3-1ubuntu12)
Depends: linux-image-5.4.0-26-generic, linux-modules-extra-5.4.0-26-generic, linux-firmware, intel-microcode, amd64-microcode
Recommends: thermald
Download-Size: 2,832 B
APT-Sources: http://repo.huaweicloud.com/ubuntu focal/main amd64 Packages
Description: Generic Linux kernel image
 This package will always depend on the latest generic kernel image
 available.

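Look at the Version fields to see which versions are available.

As the output shows, Ubuntu takes a different approach from CentOS: it does not offer many historical versions to choose from.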
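Picking out the Version fields can be automated. The sketch below pairs each version with the suite it comes from; candidate_versions is a hypothetical helper name, and it assumes every stanza carries an APT-Sources line in the form shown above.

```shell
#!/bin/sh
# candidate_versions: read "apt show -a <pkg>" output on stdin and print one
# "version suite/component" line per stanza. Hypothetical helper; assumes
# each stanza has an APT-Sources line like the ones in this article.
candidate_versions() {
    awk '
        $1 == "Version:"     { ver = $2 }
        $1 == "APT-Sources:" { print ver, $3 }
    '
}

# Usage on a real host (not run here):
#   apt show linux-image-generic-hwe-20.04 -a 2>/dev/null | candidate_versions
```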

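Update the kernel

Install the kernel

This example installs the latest HWE kernel through the linux-generic-hwe-20.04 package, which pulls in linux-image-generic-hwe-20.04.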

root@ecs-images-ubuntu:~# apt install linux-generic-hwe-20.04 -y
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following additional packages will be installed:
  libdbus-glib-1-2 libevdev2 libimobiledevice6 libplist3 libupower-glib3 libusbmuxd6
  linux-headers-5.15.0-78-generic linux-headers-generic-hwe-20.04 linux-hwe-5.15-headers-5.15.0-78
  linux-image-5.15.0-78-generic linux-image-generic-hwe-20.04 linux-modules-5.15.0-78-generic
  linux-modules-extra-5.15.0-78-generic thermald upower usbmuxd
Suggested packages:
  libusbmuxd-tools fdutils linux-doc | linux-hwe-5.15-source-5.15.0 linux-hwe-5.15-tools
The following NEW packages will be installed:
  libdbus-glib-1-2 libevdev2 libimobiledevice6 libplist3 libupower-glib3 libusbmuxd6 linux-generic-hwe-20.04
  linux-headers-5.15.0-78-generic linux-headers-generic-hwe-20.04 linux-hwe-5.15-headers-5.15.0-78
  linux-image-5.15.0-78-generic linux-image-generic-hwe-20.04 linux-modules-5.15.0-78-generic
  linux-modules-extra-5.15.0-78-generic thermald upower usbmuxd
0 upgraded, 17 newly installed, 0 to remove and 25 not upgraded.
...

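Another interesting point: the command was apt install linux-generic-hwe-20.04 -y, yet the output shows packages such as linux-headers-5.15.0-78-generic being installed. Let's ask GPT:

The linux-generic-hwe-20.04 package is actually a metapackage: it installs the latest HWE (Hardware Enablement) kernel together with its related dependencies.

In your command output, besides linux-headers-5.15.0-78-generic, other kernel-related packages were installed as well, such as linux-image-5.15.0-78-generic and linux-modules-5.15.0-78-generic.

Together these packages provide the complete kernel. linux-headers-5.15.0-78-generic contains the kernel header files used to build kernel modules. linux-image-5.15.0-78-generic is the actual kernel image used to boot the system.

That explains it clearly; thumbs up.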
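The concrete kernel behind the metapackage can also be read straight from the Depends line shown earlier. A minimal sketch; image_dependency is a hypothetical helper name, and it assumes the Depends field sits on one line as in the apt show output above.

```shell
#!/bin/sh
# image_dependency: read "apt show" output on stdin and print the first
# versioned linux-image-* package named in a Depends line.
# Hypothetical helper for illustration.
image_dependency() {
    sed -n 's/^Depends: //p' | tr ',' '\n' | sed 's/^ *//' \
        | grep '^linux-image-[0-9]' | head -n 1
}

# Usage on a real host (not run here):
#   apt show linux-image-generic-hwe-20.04 2>/dev/null | image_dependency
```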

List the kernels present on the system
root@ecs-images-ubuntu:~# grep menuentry /boot/grub/grub.cfg | grep with |cut -d "'" -f 2
Ubuntu, with Linux 5.15.0-78-generic
Ubuntu, with Linux 5.15.0-78-generic (recovery mode)
Ubuntu, with Linux 5.4.0-153-generic
Ubuntu, with Linux 5.4.0-153-generic (recovery mode)
Ubuntu, with Linux 5.4.0-26-generic
Ubuntu, with Linux 5.4.0-26-generic (recovery mode)

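Check the default boot entry

By default, Ubuntu boots the first entry in the list.

Set the kernel boot order

Edit the /etc/default/grub configuration file and change the value of GRUB_DEFAULT.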

root@ecs-images-ubuntu:~# cat /etc/default/grub

GRUB_DEFAULT=0
GRUB_TIMEOUT_STYLE=menu
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX="net.ifnames=0 consoleblank=600 console=tty0 console=ttyS0,115200n8 nospectre_v2 nopti noibrs noibpb"

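The current value is 0, which means index mode: the first entry in the list boots by default.

There are two ways to change it:

  1. By menu entry number, i.e. the index
  2. By menu entry title, e.g. Ubuntu, with Linux 5.15.0-78-generic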
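Either kind of value can be written into the file with a small helper instead of a manual edit. The sketch below is an illustration only: set_grub_default is a hypothetical name, it rewrites an existing GRUB_DEFAULT line and does not add one, and it assumes the value contains no backslash or double quote.

```shell
#!/bin/sh
# set_grub_default FILE VALUE: rewrite the GRUB_DEFAULT line of FILE in place.
# Hypothetical helper; VALUE may be an index ("0") or a menu entry title.
# Assumes VALUE contains no backslash or double quote.
set_grub_default() {
    tmp=$(mktemp) || return 1
    awk -v val="$2" '
        /^GRUB_DEFAULT=/ { print "GRUB_DEFAULT=\"" val "\""; next }
        { print }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"   # cat keeps the file's owner and mode
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage on a real host (not run here); back up the file first:
#   cp /etc/default/grub /etc/default/grub.bak
#   set_grub_default /etc/default/grub "$value"
```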

(Added later; the environment differs from the one used above.)

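Things to watch for with the index method:

On CentOS, the GRUB menu is a flat structure.

On Ubuntu, the GRUB menu has a main menu and submenus, with a navigation syntax for addressing entries inside them. For example: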

root@aws-mx-ai-kas-gpu-l40s-02:~# grep menu /boot/grub/grub.cfg 
if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
  menuentry_id_option=""
export menuentry_id_option
set menu_color_normal=white/black
set menu_color_highlight=black/light-gray
menuentry 'Ubuntu' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-4b727438-7c0b-4757-a56f-24bd780b3527' {
submenu 'Advanced options for Ubuntu' $menuentry_id_option 'gnulinux-advanced-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1072-aws' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.15.0-1072-aws-advanced-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1072-aws (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.15.0-1072-aws-recovery-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1033-aws' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.15.0-1033-aws-advanced-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1033-aws (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.15.0-1033-aws-recovery-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-97-generic' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.15.0-97-generic-advanced-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-97-generic (recovery mode)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-5.15.0-97-generic-recovery-4b727438-7c0b-4757-a56f-24bd780b3527' {
menuentry 'Ubuntu 20.04.6 LTS (20.04) (on /dev/nvme0n1p1)' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-simple-4b727438-7c0b-4757-a56f-24bd780b3527' {
submenu 'Advanced options for Ubuntu 20.04.6 LTS (20.04) (on /dev/nvme0n1p1)' $menuentry_id_option 'osprober-gnulinux-advanced-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu (on /dev/nvme0n1p1)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-5.15.0-1072-aws--4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1072-aws (on /dev/nvme0n1p1)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-5.15.0-1072-aws--4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1072-aws (recovery mode) (on /dev/nvme0n1p1)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-5.15.0-1072-aws-root=PARTUUID=58867aa0-680a-4387-ad0b-6402c0255536 ro recovery nomodeset dis_ucode_ldr panic=-1-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1033-aws (on /dev/nvme0n1p1)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-5.15.0-1033-aws--4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-1033-aws (recovery mode) (on /dev/nvme0n1p1)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-5.15.0-1033-aws-root=PARTUUID=58867aa0-680a-4387-ad0b-6402c0255536 ro recovery nomodeset dis_ucode_ldr panic=-1-4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-97-generic (on /dev/nvme0n1p1)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-5.15.0-97-generic--4b727438-7c0b-4757-a56f-24bd780b3527' {
    menuentry 'Ubuntu, with Linux 5.15.0-97-generic (recovery mode) (on /dev/nvme0n1p1)' --class gnu-linux --class gnu --class os $menuentry_id_option 'osprober-gnulinux-/boot/vmlinuz-5.15.0-97-generic-root=PARTUUID=58867aa0-680a-4387-ad0b-6402c0255536 ro recovery nomodeset dis_ucode_ldr panic=-1-4b727438-7c0b-4757-a56f-24bd780b3527' {
set timeout_style=menu
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change

The menu structure is:

Ubuntu
Advanced options for Ubuntu (1)  <-- index 1 in the main menu (counting from 0)
    └── Ubuntu, with Linux 5.15.0-1072-aws
    └── Ubuntu, with Linux 5.15.0-1072-aws (recovery mode)
    └── Ubuntu, with Linux 5.15.0-1033-aws (2)  <-- index 2 in the submenu (counting from 0)
    └── Ubuntu, with Linux 5.15.0-1033-aws (recovery mode)
...

To select Ubuntu, with Linux 5.15.0-1033-aws, set GRUB_DEFAULT="1 >2".

This selects the third entry in the submenu under the second main-menu item, because both indexes count from 0.

Alternatively, use the titles directly: GRUB_DEFAULT="Advanced options for Ubuntu>Ubuntu, with Linux 5.15.0-1033-aws"
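Counting entries by hand is error-prone, so the index path can be derived from grub.cfg instead. The sketch below is a rough illustration, not a full grub.cfg parser: grub_menu_path is a hypothetical name, and it assumes top-level menuentry/submenu lines start in column 0, nested menuentry lines are indented, and submenus are only one level deep. It prints the path in the compact N>M form.

```shell
#!/bin/sh
# grub_menu_path CFG TITLE: print the index path of the entry titled TITLE,
# e.g. "0" for a top-level entry or "1>2" for an entry inside a submenu.
# Hypothetical helper; assumes top-level entries start in column 0, nested
# entries are indented, and submenus are nested one level deep.
grub_menu_path() {
    awk -F"'" -v want="$2" '
        BEGIN { top = -1; sub_idx = -1 }
        /^(menuentry|submenu) / {
            # A new top-level item: advance its index, restart submenu count.
            top++; sub_idx = -1
            if ($1 == "menuentry " && $2 == want) { print top; found = 1; exit }
            next
        }
        /^[ \t]+menuentry / {
            # An entry nested inside the current top-level submenu.
            sub_idx++
            if ($2 == want) { print top ">" sub_idx; found = 1; exit }
        }
        END { if (!found) exit 1 }
    ' "$1"
}

# Usage on a real host (not run here):
#   grub_menu_path /boot/grub/grub.cfg "Ubuntu, with Linux 5.15.0-1033-aws"
```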

After changing the configuration, run:

update-grub

Reboot and verify
root@ecs-images-ubuntu:~# uname -a
Linux ecs-images-ubuntu 5.15.0-78-generic #85~20.04.1-Ubuntu SMP Mon Jul 17 09:42:39 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

ansible playbook

Example yml

- name: Upgrade Kernel
  hosts: test
  gather_facts: false
  become: yes
  tasks:
    # Refresh the apt package index
    - name: Kernel | 更新缓存
      apt:
        update_cache: yes

    # Install or update the HWE kernel metapackage
    - name: Kernel | 更新内核
      apt:
        name: linux-generic-hwe-20.04
        state: latest

    # Record the title of the first kernel menu entry (the newly installed kernel)
    - name: Kernel | 记录更新的内核版本
      shell: grep menuentry /boot/grub/grub.cfg | grep with | head -n 1 | cut -d "'" -f 2
      register: kernel_version_1

    # Show the installed kernel and ask the operator to type yes before rebooting
    - name: Kernel | 确认是否重启
      pause:
        prompt: "安装的内核版本是{{ kernel_version_1['stdout_lines'] }}.确认是否重启,请输入(yes)以确定... "
      register: my_pause_1
      delegate_to: localhost

    # Report that the reboot is skipped when the answer was not yes
    - name: Kernel | 确认用户输入
      debug:
        msg: "未输入yes,不进行重启"
      when: my_pause_1.user_input != "yes"

    # Reboot and wait until uname -r succeeds again
    - name: Kernel | 重启
      reboot:
        msg: "等待重启完成..."
        test_command: uname -r
      when: my_pause_1.user_input == "yes"

    # Record the running kernel version after the reboot
    - name: Kernel | 记录当前内核版本
      shell: uname -r
      register: kernel_version_2
      when: my_pause_1.user_input == "yes"

    # Print the running kernel version
    - name: Kernel | 打印当前内核版本
      debug:
        msg: "{{ kernel_version_2['stdout_lines'] }}"
      when: my_pause_1.user_input == "yes"

Run output

 ansible  @ops-2701 /ops/scripts/os_init/test$ ansible-playbook -i hosts update-kernel.yml 

PLAY [Upgrade Kernel] **********************************************************************************************

TASK [Kernel | 更新缓存] ***********************************************************************************************
Wednesday 02 August 2023  19:41:24 +0800 (0:00:00.025)       0:00:00.025 ****** 
[WARNING]: Updating cache and auto-installing missing dependency: python-apt
changed: [test-images]

TASK [Kernel | 更新内核] ***********************************************************************************************
Wednesday 02 August 2023  19:41:31 +0800 (0:00:07.729)       0:00:07.754 ****** 
ok: [test-images]

TASK [Kernel | 记录更新的内核版本] ******************************************************************************************
Wednesday 02 August 2023  19:41:32 +0800 (0:00:00.994)       0:00:08.749 ****** 
changed: [test-images]

TASK [Kernel | 确认是否重启] *********************************************************************************************
Wednesday 02 August 2023  19:41:33 +0800 (0:00:00.284)       0:00:09.033 ****** 
[Kernel | 确认是否重启]
安装的内核版本是['Ubuntu, with Linux 5.15.0-78-generic'].确认是否重启,请输入(yes)以确定... :
ok: [test-images]

TASK [Kernel | 确认用户输入] *********************************************************************************************
Wednesday 02 August 2023  19:41:41 +0800 (0:00:08.895)       0:00:17.929 ****** 

TASK [Kernel | 重启] *************************************************************************************************
Wednesday 02 August 2023  19:41:41 +0800 (0:00:00.072)       0:00:18.001 ****** 
changed: [test-images]

TASK [Kernel | 记录当前内核版本] *******************************************************************************************
Wednesday 02 August 2023  19:42:25 +0800 (0:00:43.139)       0:01:01.141 ****** 
changed: [test-images]

TASK [Kernel | 打印当前内核版本] *******************************************************************************************
Wednesday 02 August 2023  19:42:25 +0800 (0:00:00.233)       0:01:01.374 ****** 
ok: [test-images] => {
    "msg": [
        "5.15.0-78-generic"
    ]
}

PLAY RECAP *********************************************************************************************************
test-images                : ok=7    changed=4    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0   

Wednesday 02 August 2023  19:42:25 +0800 (0:00:00.066)       0:01:01.441 ****** 
=============================================================================== 
Kernel | 重启 ------------------------------------------------------------------------------------------------ 43.14s
Kernel | 确认是否重启 --------------------------------------------------------------------------------------------- 8.90s
Kernel | 更新缓存 ----------------------------------------------------------------------------------------------- 7.73s
Kernel | 更新内核 ----------------------------------------------------------------------------------------------- 0.99s
Kernel | 记录更新的内核版本 ------------------------------------------------------------------------------------------ 0.28s
Kernel | 记录当前内核版本 ------------------------------------------------------------------------------------------- 0.23s
Kernel | 确认用户输入 --------------------------------------------------------------------------------------------- 0.07s
Kernel | 打印当前内核版本 ------------------------------------------------------------------------------------------- 0.07s

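Summary

Distributions differ widely in style. Blindly copying steps from one to another can lead to disaster, and digging into the differences in detail is interesting in its own right.

Ops work calls for a healthy sense of caution: every operation run against production should have a deterministic result.

If there are no special requirements and the system can reach the public internet, installing through the package manager is simple and effective.

Procedures that will run many times should avoid manual work as far as possible: turn common commands into scripts and (ansible) playbooks, then iterate further and turn them into services. Skills should be captured not in individuals but in a form that every successor can pick up and use immediately.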

pengyinwei
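Copyright: original article on this site, published by pengyinwei on 2023-08-03.
Reposting: unless otherwise noted, articles on this site are released under the CC-4.0 license; please credit the source: https://www.opshub.cn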