Ceph from Beginner to Expert: node reboots abnormally, podman OSD container fails to start with ExitCode msg: "crun: mkdir /sys/fs/selinux: operation not permitted"

systemctl status ceph-jiqunid-fcbcd1d038ec@osd.8.service

ceph-jiqunid-fcbcd1d038ec@osd.8.service - Ceph osd.8 for jiqunid-fcbcd1d038ec
Loaded: loaded (/etc/systemd/system/ceph-jiqunid-fcbcd1d038ec@.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2023-06-01 17:56:46 CST; 15h ago
Process: 24802 ExecStopPost=/bin/rm -f /run/ceph-jiqunid-fcbcd1d038ec@osd.8.service-pid /run/ceph-jiqunid-fcbcd1d038ec@osd.8.service-cid (code=exited, status=0/SUCCESS)
Process: 23484 ExecStopPost=/bin/bash /var/lib/ceph/jiqunid-fcbcd1d038ec/osd.8/unit.poststop (code=exited, status=126)
Process: 22539 ExecStart=/bin/bash /var/lib/ceph/jiqunid-fcbcd1d038ec/osd.8/unit.run (code=exited, status=126)
Process: 22530 ExecStartPre=/bin/rm -f /run/ceph-jiqunid-fcbcd1d038ec@osd.8.service-pid /run/ceph-jiqunid-fcbcd1d038ec@osd.8.service-cid (code=exited, status=0/SUCCESS)

Jun 01 17:56:46 node4 systemd[1]: ceph-jiqunid-fcbcd1d038ec@osd.8.service: Service RestartSec=10s expired, scheduling restart.
Jun 01 17:56:46 node4 systemd[1]: ceph-jiqunid-fcbcd1d038ec@osd.8.service: Scheduled restart job, r

journalctl -ex
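
journalctl -ex dumps the whole journal; scoping it to the failing unit keeps the output manageable (a sketch, assuming the unit name reconstructed above):

# show only the last entries for the OSD unit
journalctl -u ceph-jiqunid-fcbcd1d038ec@osd.8.service -n 100 --no-pager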

/var/log/messages log

Jun 2 09:48:29 node4 systemd[1]: Started libpod-conmon-a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89.scope.
Jun 2 09:48:29 node4 conmon[425671]: conmon a81adefb8cec33f367de : addr{sun_family=AF_UNIX, sun_path=/proc/self/fd/12/attach}
Jun 2 09:48:29 node4 conmon[425671]: conmon a81adefb8cec33f367de : terminal_ctrl_fd: 12
Jun 2 09:48:29 node4 conmon[425671]: conmon a81adefb8cec33f367de : winsz read side: 16, winsz write side: 16
Jun 2 09:48:29 node4 systemd[1]: tmp-crun.JCkFVp.mount: Succeeded.
Jun 2 09:48:29 node4 systemd[1]: Started libcrun container.
Jun 2 09:48:30 node4 conmon[425671]: conmon a81adefb8cec33f367de : runtime stderr: mkdir /sys/fs/selinux: Operation not permitted
Jun 2 09:48:30 node4 conmon[425671]: conmon a81adefb8cec33f367de : Failed to create container: exit status 1
Jun 2 09:48:30 node4 systemd[1]: tmp-crun.fnLvuX.mount: Succeeded.
Jun 2 09:48:30 node4 systemd[1]: libpod-a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89.scope: Succeeded.
Jun 2 09:48:30 node4 /usr/bin/podman[425675]: time="2023-06-02T09:48:30+08:00" level=info msg="[graphdriver] using prior storage driver: overlay"
Jun 2 09:48:30 node4 /usr/bin/podman[425675]: time="2023-06-02T09:48:30+08:00" level=info msg="Setting parallel job count to 193"
Jun 2 09:48:30 node4 /usr/bin/podman[425675]: time="2023-06-02T09:48:30+08:00" level=error msg="Removing container: failed to clean up and remove container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89: container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 does not exist in database: no such container"
Jun 2 09:48:30 node4 systemd[1]: libpod-conmon-a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89.scope: Succeeded.
Jun 2 09:48:30 node4 systemd[1]: var-lib-containers-storage-overlay-753de25e4999553571fbe4785f6b76ba07f9bac9cacfabbdaba031716d9c8603-merged.mount: Succeeded.

Log from starting unit.run manually

[root@node4 osd.8]# /bin/bash unit.run
INFO[0000] /bin/podman filtering at log level debug
DEBU[0000] Called run.PersistentPreRunE(/bin/podman run --privileged --log-level=debug --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init --name ceph-jiqunid-fcbcd1d038ec-osd-8-activate -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7 -e NODE_NAME=node4 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/jiqunid-fcbcd1d038ec:/var/run/ceph:z -v /var/log/ceph/jiqunid-fcbcd1d038ec:/var/log/ceph:z -v /var/lib/ceph/jiqunid-fcbcd1d038ec/crash:/var/lib/ceph/crash:z -v /var/lib/ceph/jiqunid-fcbcd1d038ec/osd.8:/var/lib/ceph/osd/ceph-8:z -v /var/lib/ceph/jiqunid-fcbcd1d038ec/osd.8/config:/etc/ceph/ceph.conf:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/jiqunid-fcbcd1d038ec/selinux:/sys/fs/selinux:ro -v /:/rootfs quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7 activate --osd-id 8 --osd-uuid 134ffbce-9001-4776-bc1b-04151141683e --no-systemd --no-tmpfs)
DEBU[0000] Merged system config "/usr/share/containers/containers.conf"
DEBU[0000] Using conmon: "/usr/bin/conmon"
DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver
DEBU[0000] Using graph root /var/lib/containers/storage
DEBU[0000] Using run root /run/containers/storage
DEBU[0000] Using static dir /var/lib/containers/storage/libpod
DEBU[0000] Using tmp dir /run/libpod
DEBU[0000] Using volume path /var/lib/containers/storage/volumes
DEBU[0000] Set libpod namespace to ""
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that metacopy is not being used
DEBU[0000] Cached value indicated that native-diff is usable
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
INFO[0000] [graphdriver] using prior storage driver: overlay
DEBU[0000] Initializing event backend file
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument
DEBU[0000] Configured OCI runtime krun initialization failed: no valid executable found for OCI runtime krun: invalid argument
DEBU[0000] Using OCI runtime "/usr/bin/crun"
INFO[0000] Setting parallel job count to 193
DEBU[0000] Pulling image quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7 (policy: missing)
DEBU[0000] Looking up image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage
DEBU[0000] Normalized platform linux/arm64 to {arm64 linux [] }
DEBU[0000] Trying “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” …
DEBU[0000] parsed reference into “[overlay@/var/lib/containers/storage+/run/containers/storage]@7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] Found image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” as “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage
DEBU[0000] Found image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” as “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage ([overlay@/var/lib/containers/storage+/run/containers/storage]@7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f)
DEBU[0000] exporting opaque data as blob “sha256:7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] Looking up image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage
DEBU[0000] Normalized platform linux/arm64 to {arm64 linux [] }
DEBU[0000] Trying “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” …
DEBU[0000] parsed reference into “[overlay@/var/lib/containers/storage+/run/containers/storage]@7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] Found image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” as “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage
DEBU[0000] Found image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” as “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage ([overlay@/var/lib/containers/storage+/run/containers/storage]@7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f)
DEBU[0000] exporting opaque data as blob “sha256:7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] User mount /var/run/ceph/jiqunid-fcbcd1d038ec:/var/run/ceph options [z]
DEBU[0000] User mount /var/log/ceph/jiqunid-fcbcd1d038ec:/var/log/ceph options [z]
DEBU[0000] User mount /var/lib/ceph/jiqunid-fcbcd1d038ec/crash:/var/lib/ceph/crash options [z]
DEBU[0000] User mount /var/lib/ceph/jiqunid-fcbcd1d038ec/osd.8:/var/lib/ceph/osd/ceph-8 options [z]
DEBU[0000] User mount /var/lib/ceph/jiqunid-fcbcd1d038ec/osd.8/config:/etc/ceph/ceph.conf options [z]
DEBU[0000] User mount /dev:/dev options []
DEBU[0000] User mount /run/udev:/run/udev options []
DEBU[0000] User mount /sys:/sys options []
DEBU[0000] User mount /run/lvm:/run/lvm options []
DEBU[0000] User mount /run/lock/lvm:/run/lock/lvm options []
DEBU[0000] User mount /var/lib/ceph/jiqunid-fcbcd1d038ec/selinux:/sys/fs/selinux options [ro]
DEBU[0000] User mount /:/rootfs options []
DEBU[0000] Looking up image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage
DEBU[0000] Normalized platform linux/arm64 to {arm64 linux [] }
DEBU[0000] Trying “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” …
DEBU[0000] parsed reference into “[overlay@/var/lib/containers/storage+/run/containers/storage]@7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] Found image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” as “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage
DEBU[0000] Found image “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” as “quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7” in local containers storage ([overlay@/var/lib/containers/storage+/run/containers/storage]@7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f)
DEBU[0000] exporting opaque data as blob “sha256:7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] Inspecting image 7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f
DEBU[0000] exporting opaque data as blob “sha256:7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] exporting opaque data as blob “sha256:7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] Inspecting image 7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f
DEBU[0000] Inspecting image 7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f
DEBU[0000] Inspecting image 7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f
DEBU[0000] using systemd mode: false
DEBU[0000] setting container name ceph-jiqunid-fcbcd1d038ec-osd-8-activate
DEBU[0000] Loading seccomp profile from “/usr/share/containers/seccomp.json”
INFO[0000] Sysctl net.ipv4.ping_group_range=0 0 ignored in containers.conf, since Network Namespace set to host
DEBU[0000] Adding mount /proc
DEBU[0000] Adding mount /sys/fs/cgroup
DEBU[0000] Allocated lock 1 for container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89
DEBU[0000] parsed reference into “[overlay@/var/lib/containers/storage+/run/containers/storage]@7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] exporting opaque data as blob “sha256:7919d94898640b275ec0950958fdbb8c26bf675638178d05f796274c14dd559f”
DEBU[0000] Cached value indicated that idmapped mounts for overlay are not supported
DEBU[0000] Check for idmapped mounts support
DEBU[0000] Created container “a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89”
DEBU[0000] Container “a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89” has work directory “/var/lib/containers/storage/overlay-containers/a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89/userdata”
DEBU[0000] Container “a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89” has run directory “/run/containers/storage/overlay-containers/a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89/userdata”
DEBU[0000] Not attaching to stdin
DEBU[0000] [graphdriver] trying provided driver “overlay”
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that overlay is supported
DEBU[0000] Cached value indicated that metacopy is not being used
DEBU[0000] backingFs=xfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false
DEBU[0000] Cached value indicated that volatile is being used
DEBU[0000] overlay: mount_data=lowerdir=/var/lib/containers/storage/overlay/l/RVXPRCOJOTED4UKZMD2TL3TI5O:/var/lib/containers/storage/overlay/l/R6PXKAD3AAVUYNDQNP2DP45Q5Y,upperdir=/var/lib/containers/storage/overlay/753de25e4999553571fbe4785f6b76ba07f9bac9cacfabbdaba031716d9c8603/diff,workdir=/var/lib/containers/storage/overlay/753de25e4999553571fbe4785f6b76ba07f9bac9cacfabbdaba031716d9c8603/work,volatile
DEBU[0000] Mounted container “a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89” at “/var/lib/containers/storage/overlay/753de25e4999553571fbe4785f6b76ba07f9bac9cacfabbdaba031716d9c8603/merged”
DEBU[0000] Created root filesystem for container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 at /var/lib/containers/storage/overlay/753de25e4999553571fbe4785f6b76ba07f9bac9cacfabbdaba031716d9c8603/merged
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode subscription
DEBU[0000] Setting Cgroups for container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 to machine.slice:libpod:a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0000] Workdir “/” resolved to host path “/var/lib/containers/storage/overlay/753de25e4999553571fbe4785f6b76ba07f9bac9cacfabbdaba031716d9c8603/merged”
DEBU[0000] Created OCI spec for container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 at /var/lib/containers/storage/overlay-containers/a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89/userdata/config.json
DEBU[0000] /usr/bin/conmon messages will be logged to syslog
DEBU[0000] running conmon: /usr/bin/conmon args=“[–api-version 1 -c a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 -u a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 -r /usr/bin/crun -b /var/lib/containers/storage/overlay-containers/a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89/userdata -p /run/containers/storage/overlay-containers/a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89/userdata/pidfile -n ceph-jiqunid-fcbcd1d038ec-osd-8-activate --exit-dir /run/libpod/exits --full-attach -s -l k8s-file:/var/lib/containers/storage/overlay-containers/a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89/userdata/ctr.log --log-level debug --syslog --conmon-pidfile /run/containers/storage/overlay-containers/a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89/userdata/conmon.pid --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /run/containers/storage --exit-command-arg --log-level --exit-command-arg debug --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /run/libpod --exit-command-arg --network-config-dir --exit-command-arg --exit-command-arg --network-backend --exit-command-arg cni --exit-command-arg --volumepath --exit-command-arg /var/lib/containers/storage/volumes --exit-command-arg --runtime --exit-command-arg crun --exit-command-arg --events-backend --exit-command-arg file --exit-command-arg --syslog --exit-command-arg container --exit-command-arg cleanup --exit-command-arg --rm --exit-command-arg a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89]”
INFO[0000] Running conmon under slice machine.slice and unitName libpod-conmon-a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89.scope
DEBU[0000] Received: -1
DEBU[0000] Cleaning up container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89
DEBU[0000] Network is already cleaned up, skipping…
DEBU[0000] Unmounted container “a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89”
DEBU[0000] Removing container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89
DEBU[0000] Cleaning up container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89
DEBU[0000] Network is already cleaned up, skipping…
DEBU[0000] Container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 storage is already unmounted, skipping…
DEBU[0000] Removing all exec sessions for container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89
DEBU[0000] Container a81adefb8cec33f367ded229a982c6018a25fc39800190480a230bfca5772d89 storage is already unmounted, skipping…
DEBU[0000] ExitCode msg: "crun: mkdir /sys/fs/selinux: operation not permitted: oci permission denied"
Error: crun: mkdir /sys/fs/selinux: Operation not permitted: OCI permission denied
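
To confirm that it is the /sys/fs/selinux bind mount, not ceph-volume itself, that breaks container creation, a stripped-down reproduction can be tried. This is a sketch that reuses the image digest and host paths from the logs above; any image already present locally would do:

# Expected to fail with the same "crun: mkdir /sys/fs/selinux: Operation not permitted"
# error as long as /sys/fs/selinux does not exist on the host.
podman run --rm --privileged \
  -v /sys:/sys \
  -v /var/lib/ceph/jiqunid-fcbcd1d038ec/selinux:/sys/fs/selinux:ro \
  --entrypoint /usr/bin/true \
  quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7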

[root@node4 osd.8]# ll -h
total 60K
lrwxrwxrwx. 1 ceph ceph 111 May 17 15:54 block -> /dev/mapper/ceph--489ee8bc--e0bb--4e6b--9e57--a50591110799-osd--block--134ffbce--9001--4776--bc1b--04151141683e
-rw-------. 1 ceph ceph 37 May 17 15:54 ceph_fsid
-rw-------. 1 ceph ceph 319 May 19 10:07 config
-rw-------. 1 ceph ceph 37 May 17 15:54 fsid
-rw-------. 1 ceph ceph 142 May 19 10:07 keyring
-rw-------. 1 ceph ceph 6 May 17 15:54 ready
-rw-------. 1 ceph ceph 3 May 17 15:54 require_osd_release
-rw-------. 1 ceph ceph 10 May 17 15:54 type
-rw-------. 1 ceph ceph 38 May 19 10:07 unit.configured
-rw-------. 1 ceph ceph 48 May 17 15:54 unit.created
-rw-------. 1 ceph ceph 90 May 17 15:54 unit.image
-rw-------. 1 ceph ceph 484 May 17 15:54 unit.meta
-rw-------. 1 ceph ceph 1.9K May 17 15:54 unit.poststop
-rw------- 1 ceph ceph 3.5K Jun 1 18:04 unit.run
-rw-------. 1 ceph ceph 302 May 17 15:54 unit.stop
-rw-------. 1 ceph ceph 2 May 17 15:54 whoami

[root@node4 jiqunid-fcbcd1d038ec]# ll -h
total 392K
-rw-r--r--. 1 root root 339K May 17 14:52 cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51
drwx------. 5 ceph ceph 164 May 30 18:13 crash
drwx------. 2 ceph ceph 167 May 17 15:29 crash.node4
drwx------. 2 nobody nobody 138 May 17 15:34 node-exporter.node4
drwx------. 2 ceph ceph 4.0K May 17 15:52 osd.108
drwx------. 2 ceph ceph 4.0K May 17 15:53 osd.118
drwx------. 2 ceph ceph 4.0K May 17 15:53 osd.128
drwx------. 2 ceph ceph 4.0K May 17 15:53 osd.18
drwx------. 2 ceph ceph 4.0K May 17 15:53 osd.28
drwx------. 2 ceph ceph 4.0K May 17 15:53 osd.38
drwx------. 2 ceph ceph 4.0K May 17 15:54 osd.47
drwx------. 2 ceph ceph 4.0K May 17 15:54 osd.58
drwx------. 2 ceph ceph 4.0K May 17 15:54 osd.68
drwx------. 2 ceph ceph 4.0K May 17 15:54 osd.78
drwx------. 2 ceph ceph 4.0K Jun 2 09:36 osd.8
drwx------. 2 ceph ceph 4.0K May 17 15:55 osd.88
drwx------. 2 ceph ceph 4.0K May 17 15:55 osd.98
drwxr-xr-x 2 root root 6 Jun 1 16:52 selinux
drwxr-xr-x. 2 ceph ceph 6 May 17 15:27 selinux_bak

[root@node4 jiqunid-fcbcd1d038ec]# podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
982b5022e8a8 quay.io/prometheus/node-exporter:v1.3.1 --no-collector.ti… 16 hours ago Up 16 hours ago ceph-jiqunid-fcbcd1d038ec-node-exporter-node4
19ef75609a91 quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7 -n client.crash.n… 16 hours ago Up 16 hours ago ceph-jiqunid-fcbcd1d038ec-crash-node4

cgroup analysis

Healthy node node24

[root@node24 ~]# mount|grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,net_cls,net_prio)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,blkio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpu,cpuacct)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,freezer)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuset)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,memory)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,rdma)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,devices)

Problem node node4

[root@node4 system.slice]# mount|grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/
systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
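
The telling difference between the two nodes is the seclabel mount option: every cgroup mount on node24 carries it, while none on node4 does, which means SELinux never came up on node4 after the reboot. A quick way to check this on each node (a minimal sketch):

# prints a non-zero count on the healthy node and 0 on the broken one
mount | grep cgroup | grep -c seclabel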

/sys/fs directory on the healthy node

[root@node24 fs]# ll -h
total 0
drwx-----T. 2 root root 0 May 11 13:00 bpf
drwxr-xr-x. 14 root root 360 May 11 13:00 cgroup
drwxr-xr-x. 3 root root 0 May 11 13:00 fuse
drwxr-x---. 2 root root 0 May 11 13:00 pstore
drwxr-xr-x. 8 root root 0 Jan 1 1970 selinux
drwxr-xr-x. 6 root root 0 Jun 2 08:45 xfs

Problem node
[root@node4 fs]# pwd
/sys/fs
[root@node4 fs]# ll -h
total 0
drwx-----T 2 root root 0 Jun 1 17:54 bpf
drwxr-xr-x 14 root root 360 Jun 1 17:54 cgroup
drwxr-xr-x 3 root root 0 Jun 1 17:55 fuse
drwxr-x--- 2 root root 0 Jun 1 17:54 pstore
drwxr-xr-x 6 root root 0 Jun 1 17:55 xfs

The selinux entry is missing. It cannot be created by hand, because /sys is sysfs and its directories are created by the kernel; /sys/fs/selinux only appears when the kernel brings SELinux up and mounts selinuxfs:

[root@node4 fs]# mkdir /sys/fs/selinux
mkdir: cannot create directory ‘/sys/fs/selinux’: Operation not permitted
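
A few more standard checks confirm that SELinux is effectively absent on this node (a sketch; the expected outputs in the comments are what this situation implies, not captured output from node4, and sestatus requires the policycoreutils tools):

getenforce                # expected to print "Disabled"
sestatus                  # expected to report "SELinux status: disabled"
mount | grep selinuxfs    # no selinuxfs mount on the broken node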

[root@node4 fs]# tail -fn20 /var/log/audit/audit.log
type=SYSCALL msg=audit(1685670344.372:613): arch=c00000b7 syscall=64 success=yes exit=1 a0=3 a1=ffffe6136340 a2=1 a
3=ffffb29c4fa8 items=0 ppid=2990 pid=425256 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(non
e) ses=7 comm=“sshd” exe=“/usr/sbin/sshd” key=(null)ARCH=aarch64 SYSCALL=write AUID=“root” UID=“root” GID=“root” EU
ID=“root” SUID=“root” FSUID=“root” EGID=“root” SGID=“root” FSGID=“root”
type=PROCTITLE msg=audit(1685670344.372:613): proctitle=737368643A20726F6F74205B707269765D
type=USER_START msg=audit(1685670344.372:614): pid=425256 uid=0 auid=0 ses=7 msg='op=PAM:session_open grantors=pam_
selinux,pam_loginuid,pam_selinux,pam_namespace,pam_keyinit,pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_umask,pa
m_lastlog acct=“root” exe=“/usr/sbin/sshd” hostname=172.22.21.29 addr=172.22.21.29 terminal=ssh res=success’UID=“ro
ot” AUID=“root”
type=CRYPTO_KEY_USER msg=audit(1685670344.382:615): pid=425258 uid=0 auid=0 ses=7 msg='op=destroy kind=server fp=SH
A256:d5:35:8c:75:a7:f1:a3:5b:c9:c7:f6:ae:f8:ac:b3:1e:ac:d2:17:3a:6d:61:e2:17:15:f3:86:86:49:cb:cf:53 direction=? sp
id=425258 suid=0 exe=“/usr/sbin/sshd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“root” SUID=“root”
type=CRYPTO_KEY_USER msg=audit(1685670344.382:616): pid=425258 uid=0 auid=0 ses=7 msg='op=destroy kind=server fp=SH
A256:61:c4:cb:ca:ee:71:b2:d9:4d:a4:d7:3d:26:a7:d5:84:7e:4d:1d:9b:97:89:d5:1f:4e:cb:f0:98:8d:4d:b5:04 direction=? sp
id=425258 suid=0 exe=“/usr/sbin/sshd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“root” SUID=“root”
type=CRYPTO_KEY_USER msg=audit(1685670344.382:617): pid=425258 uid=0 auid=0 ses=7 msg='op=destroy kind=server fp=SH
A256:36:47:f4:83:a3:96:18:3e:58:ba:f9:65:49:ee:23:60:2f:4b:77:01:c8:f8:bc:44:7a:fc:40:fc:c2:7e:d0:a7 direction=? sp
id=425258 suid=0 exe=“/usr/sbin/sshd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“root” SUID=“root”
type=CRED_ACQ msg=audit(1685670344.382:618): pid=425258 uid=0 auid=0 ses=7 msg='op=PAM:setcred grantors=pam_env,pam
_unix acct=“root” exe=“/usr/sbin/sshd” hostname=172.22.21.29 addr=172.22.21.29 terminal=ssh res=success’UID=“root”
AUID=“root”
type=USER_LOGIN msg=audit(1685670344.422:619): pid=425256 uid=0 auid=0 ses=7 msg='op=login id=0 exe=“/usr/sbin/sshd
" hostname=? addr=172.22.21.29 terminal=/dev/pts/1 res=success’UID=“root” AUID=“root” ID=“root”
type=USER_START msg=audit(1685670344.422:620): pid=425256 uid=0 auid=0 ses=7 msg='op=login id=0 exe=”/usr/sbin/sshd
" hostname=? addr=172.22.21.29 terminal=/dev/pts/1 res=success’UID=“root” AUID=“root” ID=“root”
type=CRYPTO_KEY_USER msg=audit(1685670344.422:621): pid=425256 uid=0 auid=0 ses=7 msg='op=destroy kind=server fp=SH
A256:36:47:f4:83:a3:96:18:3e:58:ba:f9:65:49:ee:23:60:2f:4b:77:01:c8:f8:bc:44:7a:fc:40:fc:c2:7e:d0:a7 direction=? sp
id=425259 suid=0 exe=“/usr/sbin/sshd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“root” SUID=“root”
type=SERVICE_START msg=audit(1685670618.803:622): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-coll
ect comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_STOP msg=audit(1685670618.803:623): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-colle
ct comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_START msg=audit(1685671208.797:624): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-coll
ect comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_STOP msg=audit(1685671208.797:625): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-colle
ct comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_START msg=audit(1685671818.800:626): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-coll
ect comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_STOP msg=audit(1685671818.800:627): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-colle
ct comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_START msg=audit(1685672418.804:628): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-coll
ect comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_STOP msg=audit(1685672418.804:629): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-colle
ct comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_START msg=audit(1685673018.797:630): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-coll
ect comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”
type=SERVICE_STOP msg=audit(1685673018.797:631): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=sysstat-colle
ct comm=“systemd” exe=“/usr/lib/systemd/systemd” hostname=? addr=? terminal=? res=success’UID=“root” AUID=“unset”

Remediation

Root cause: SELinux did not come up on node4 after the abnormal reboot (no seclabel on the cgroup mounts, no /sys/fs/selinux). The OSD's unit.run bind-mounts /var/lib/ceph/<fsid>/selinux onto /sys/fs/selinux, so crun first has to create that mount point inside sysfs, and the mkdir fails with "Operation not permitted". Steps taken:

1. Edit unit.run under the osd.x directory and raise the podman log level to debug. The full command, with --log-level=debug added, is shown below; see the first sketch after this list for a scripted way to make the change.

/bin/podman run --privileged --log-level=debug --rm --ipc=host --stop-signal=SIGTERM --net=host \
  --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init \
  --name ceph-425d480a-f47a-11ed-81bd-fcbcd1d038ec-osd-8-activate \
  -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7 \
  -e NODE_NAME=node4 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 \
  -v /var/run/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec:/var/run/ceph:z \
  -v /var/log/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec:/var/log/ceph:z \
  -v /var/lib/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec/crash:/var/lib/ceph/crash:z \
  -v /var/lib/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec/osd.8:/var/lib/ceph/osd/ceph-8:z \
  -v /var/lib/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec/osd.8/config:/etc/ceph/ceph.conf:z \
  -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm \
  -v /var/lib/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec/selinux:/sys/fs/selinux:ro \
  -v /:/rootfs \
  quay.io/ceph/ceph@sha256:bb6fd5f08eb3bea251a5bf906069075771b93090571d42ec4e328dc17f9692f7 \
  activate --osd-id 8 --osd-uuid 134ffbce-9001-4776-bc1b-04151141683e --no-systemd --no-tmpfs
  2. Edit /etc/sysconfig/selinux: change SELINUX=disabled to SELINUX=enforcing, reboot the node, and check whether /sys/fs/selinux has been created. If it has, change enforcing back to disabled and reboot once more (see the second sketch after this list).
  3. Restart the OSD pod.
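
For step 1, a scripted way to add the debug log level to unit.run (a sketch; back the file up first, and skip the sed if --log-level is already present):

cp /var/lib/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec/osd.8/unit.run{,.bak}
# insert --log-level=debug right after "podman run"
sed -i 's|/bin/podman run |/bin/podman run --log-level=debug |' \
  /var/lib/ceph/425d480a-f47a-11ed-81bd-fcbcd1d038ec/osd.8/unit.run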
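
For steps 2 and 3, the SELinux toggle and the OSD restart could look like the following (a sketch; on RHEL-family systems /etc/sysconfig/selinux is a symlink to /etc/selinux/config, and the unit name below reuses the fsid seen in the command above):

# switch disabled -> enforcing and reboot
sed -i 's/^SELINUX=disabled/SELINUX=enforcing/' /etc/selinux/config
reboot

# after the reboot, verify that selinuxfs is back
ls -ld /sys/fs/selinux
getenforce

# switch back to disabled and reboot once more
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
reboot

# finally restart the OSD pod
systemctl restart ceph-425d480a-f47a-11ed-81bd-fcbcd1d038ec@osd.8.service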


Reposted from blog.csdn.net/wxb880114/article/details/131003631