Editing crashreport #72856

Reason: watchdog: BUG: soft lockup -
Crashing Function: copy_mc_enhanced_fast_string
Where to cut Backtrace:
__collapse_huge_page_copy
collapse_huge_page
hpage_collapse_scan_pmd
khugepaged_scan_mm_slot
khugepaged
kthread
ret_from_fork
panic
watchdog_timer_fn
__hrtimer_run_queues
hrtimer_interrupt
__sysvec_apic_timer_interrupt
sysvec_apic_timer_interrupt
asm_sysvec_apic_timer_interrupt
Reports Count: 7

Added fields:

Match messages in logs
(every line must be present in the log output;
copy from the "Messages before crash" column below):
Match messages in full crash
(every line must be present in the crash log output;
copy from the "Full Crash" column below):
Limit to a test:
(copy from the "Failing Test" column below):
Delete these reports as invalid (real bug in review or some such)
Bug or comment:
Extra info:

Failures list (last 100):

Failing Test | Full Crash | Messages before crash | Comment
replay-vbr test 5b: link checks version of target parent
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [khugepaged:43]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common virtio_balloon i2c_piix4 pcspkr joydev drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul libata crc32c_intel virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw
CPU: 0 PID: 43 Comm: khugepaged Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Code: 89 ca e9 cd fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 31 c0 c3 cc cc cc cc 48 89 c8 c3 cc cc cc cc cc cc cc cc cc
RSP: 0018:ffffa6a780167c58 EFLAGS: 00010286
RAX: ffff97e98adf2000 RBX: ffffde16802b7c80 RCX: 0000000000001000
RDX: 0000000000001000 RSI: ffff97e99edf3000 RDI: ffff97e98adf2000
RBP: ffff97e9828e1000 R08: ffff97e984cd6480 R09: ffffde16800a3828
R10: ffff97e9828e0000 R11: 000000000003a500 R12: ffff97e9828e0000
R13: ffff97e984cd6480 R14: ffffde16800a3828 R15: ffff97e9828e0f90
FS: 0000000000000000(0000) GS:ffff97ea3fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f89e61a1648 CR3: 000000008a610003 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? __collapse_huge_page_copy.isra.0+0x6f/0x1c0
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? copy_mc_enhanced_fast_string+0x6/0xf
__collapse_huge_page_copy.isra.0+0x6f/0x1c0
collapse_huge_page+0x4e7/0x740
hpage_collapse_scan_pmd+0x470/0x870
khugepaged_scan_mm_slot.constprop.0+0x2a3/0x520
khugepaged+0xdd/0x200
? __pfx_khugepaged+0x10/0x10
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 43 Comm: khugepaged Kdump: loaded Tainted: G OEL ------- --- 5.14.0-503.40.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x48
panic+0x107/0x2bb
watchdog_timer_fn.cold+0xc/0x16
? __pfx_watchdog_timer_fn+0x10/0x10
__hrtimer_run_queues+0x112/0x2b0
hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
__sysvec_apic_timer_interrupt+0x4e/0x100
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-48vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-48vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param mdd.lustre-MDT0000.sync_permission=0
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param mdt.lustre-MDT0000.commit_on_sharing=0
Lustre: DEBUG MARKER: sync; sync; sync
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup suspend --nolockfs --noflush /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup load /dev/mapper/mds1_flakey --table "0 3964928 flakey 252:0 0 0 1800 1 drop_writes"
Lustre: DEBUG MARKER: dmsetup resume /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
LustreError: 55958:0:(obd_class.h:478:obd_check_dev()) Device 33 not setup
LustreError: 55958:0:(obd_class.h:478:obd_check_dev()) Skipped 23 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem 98bf3e3f-9929-4c32-b724-c68276b25f7b.
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: 10093:0:(client.c:2447:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1750357198/real 1750357198] req@ffff97e9ae431380 x1835380953927040/t0(0) o400->MGC10.240.28.49@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1750357214 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 projid:4294967295
LustreError: MGC10.240.28.49@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
Link to test
sanity-quota test 7d: Quota reintegration (Transfer index in multiple bulks)
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [khugepaged:44]
Modules linked in: tls mgc(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common pcspkr virtio_balloon i2c_piix4 joydev fuse drm ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw
CPU: 1 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.38.1.el9_5.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Code: 89 ca e9 cd fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 31 c0 c3 cc cc cc cc 48 89 c8 c3 cc cc cc cc cc cc cc cc cc
RSP: 0018:ffffb62c4016fc58 EFLAGS: 00010286
RAX: ffff9771749cb000 RBX: ffffec87c0d272c0 RCX: 0000000000001000
RDX: 0000000000001000 RSI: ffff97725aa0d000 RDI: ffff9771749cb000
RBP: ffff977244e70000 R08: ffff97726a342f00 R09: ffffec87c4139be8
R10: ffff977244e6f000 R11: 000000000003a640 R12: ffff977244e6f000
R13: ffff97726a342f00 R14: ffffec87c4139be8 R15: ffff977244e6fe58
FS: 0000000000000000(0000) GS:ffff97727bd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6b66b15000 CR3: 0000000050410003 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? __collapse_huge_page_copy.isra.0+0x6f/0x1c0
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? copy_mc_enhanced_fast_string+0x6/0xf
__collapse_huge_page_copy.isra.0+0x6f/0x1c0
collapse_huge_page+0x4e7/0x740
hpage_collapse_scan_pmd+0x470/0x870
khugepaged_scan_mm_slot.constprop.0+0x2a3/0x520
? __pfx_wq_barrier_func+0x10/0x10
khugepaged+0xdd/0x200
? __pfx_khugepaged+0x10/0x10
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
CPU: 1 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OEL ------- --- 5.14.0-503.38.1.el9_5.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x48
panic+0x107/0x2bb
watchdog_timer_fn.cold+0xc/0x16
? __pfx_watchdog_timer_fn+0x10/0x10
__hrtimer_run_queues+0x112/0x2b0
hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
__sysvec_apic_timer_interrupt+0x4e/0x100
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0000.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0000.recovery_status 1475
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0001.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0001.recovery_status 1475
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0002.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0002.recovery_status 1475
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0003.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0003.recovery_status 1475
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0004.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0004.recovery_status 1475
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0005.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0005.recovery_status 1475
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0006.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0006.recovery_status 1475
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete \*.lustre-OST0007.recovery_status 1475
Lustre: DEBUG MARKER: onyx-149vm3.onyx.whamcloud.com: executing _wait_recovery_complete *.lustre-OST0007.recovery_status 1475
Link to test
sanity-pcc test 3a: Repeat attach/detach operations
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [khugepaged:44]
Modules linked in: tls osp(OE) ofd(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill i2c_piix4 joydev virtio_balloon intel_rapl_msr intel_rapl_common pcspkr sunrpc drm fuse ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ata_piix ghash_clmulni_intel libata virtio_blk net_failover failover serio_raw
CPU: 0 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Code: 89 ca e9 cd fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 31 c0 c3 cc cc cc cc 48 89 c8 c3 cc cc cc cc cc cc cc cc cc
RSP: 0000:ffffa43a8016fc58 EFLAGS: 00010286
RAX: ffff97aa3b118000 RBX: ffffe58740ec4600 RCX: 0000000000001000
RDX: 0000000000001000 RSI: ffff97aa3db5d000 RDI: ffff97aa3b118000
RBP: ffff97aa2cf8a000 R08: ffff97aa36b9ce40 R09: ffffe58740b3e268
R10: ffff97aa2cf89000 R11: 000000000003a500 R12: ffff97aa2cf89000
R13: ffff97aa36b9ce40 R14: ffffe58740b3e268 R15: ffff97aa2cf898c0
FS: 0000000000000000(0000) GS:ffff97aabfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fafc7efd024 CR3: 000000005be10004 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? __collapse_huge_page_copy.isra.0+0x6f/0x1c0
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? copy_mc_enhanced_fast_string+0x6/0xf
__collapse_huge_page_copy.isra.0+0x6f/0x1c0
collapse_huge_page+0x4e7/0x740
hpage_collapse_scan_pmd+0x470/0x870
khugepaged_scan_mm_slot.constprop.0+0x2a3/0x520
khugepaged+0xdd/0x200
? __pfx_khugepaged+0x10/0x10
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OEL ------- --- 5.14.0-503.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x48
panic+0x107/0x2bb
watchdog_timer_fn.cold+0xc/0x16
? __pfx_watchdog_timer_fn+0x10/0x10
__hrtimer_run_queues+0x112/0x2b0
hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
__sysvec_apic_timer_interrupt+0x4e/0x100
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Link to test
sanity test 276: Race between mount and obd_statfs
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [khugepaged:43]
Modules linked in: dm_flakey tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common virtio_balloon pcspkr sunrpc joydev i2c_piix4 drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul virtio_net crc32c_intel ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 0 PID: 43 Comm: khugepaged Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Code: 89 ca e9 cd fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 31 c0 c3 cc cc cc cc 48 89 c8 c3 cc cc cc cc cc cc cc cc cc
RSP: 0000:ffffa302c0167c58 EFLAGS: 00010286
RAX: ffff8b4d435fb000 RBX: fffff912410d7ec0 RCX: 0000000000001000
RDX: 0000000000001000 RSI: ffff8b4d95b2c000 RDI: ffff8b4d435fb000
RBP: ffff8b4d64410000 R08: ffff8b4d90541480 R09: fffff912419103e8
R10: ffff8b4d6440f000 R11: 000000000003a500 R12: ffff8b4d6440f000
R13: ffff8b4d90541480 R14: fffff912419103e8 R15: ffff8b4d6440ffd8
FS: 0000000000000000(0000) GS:ffff8b4dbfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd213aa7000 CR3: 0000000014810006 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? __collapse_huge_page_copy.isra.0+0x6f/0x1c0
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? copy_mc_enhanced_fast_string+0x6/0xf
__collapse_huge_page_copy.isra.0+0x6f/0x1c0
collapse_huge_page+0x4e7/0x740
hpage_collapse_scan_pmd+0x470/0x870
khugepaged_scan_mm_slot.constprop.0+0x2a3/0x520
? __pfx_wq_barrier_func+0x10/0x10
khugepaged+0xdd/0x200
? __pfx_khugepaged+0x10/0x10
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 43 Comm: khugepaged Kdump: loaded Tainted: G OEL ------- --- 5.14.0-503.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x48
panic+0x107/0x2bb
watchdog_timer_fn.cold+0xc/0x16
? __pfx_watchdog_timer_fn+0x10/0x10
__hrtimer_run_queues+0x112/0x2b0
hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
__sysvec_apic_timer_interrupt+0x4e/0x100
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 10.240.25.177@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 1 previous similar message
Lustre: lustre-OST0000-osc-MDT0003: Connection restored to 10.240.25.177@tcp (at 10.240.25.177@tcp)
Lustre: Skipped 2 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
LustreError: lustre-OST0000-osc-MDT0001: operation ost_statfs to node 10.240.25.177@tcp failed: rc = -107
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Autotest: Test running for 255 minutes (lustre-reviews_review-dne-part-1_112718.29)
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
LustreError: lustre-OST0000-osc-MDT0003: operation ost_statfs to node 10.240.25.177@tcp failed: rc = -107
LustreError: Skipped 15 previous similar messages
Lustre: lustre-OST0000-osc-MDT0003: Connection to lustre-OST0000 (at 10.240.25.177@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 23 previous similar messages
Lustre: lustre-OST0000-osc-MDT0003: Connection restored to 10.240.25.177@tcp (at 10.240.25.177@tcp)
Lustre: Skipped 23 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug all all
Link to test
obdfilter-survey test 1c: Object Storage Targets survey, big batch
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [khugepaged:44]
Modules linked in: tls lustre(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common virtio_balloon pcspkr i2c_piix4 joydev drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel virtio_net virtio_blk net_failover failover serio_raw
CPU: 0 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.31.1.el9_5.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Code: 89 ca e9 cd fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 31 c0 c3 cc cc cc cc 48 89 c8 c3 cc cc cc cc cc cc cc cc cc
RSP: 0018:ffff9ac80016fc58 EFLAGS: 00010286
RAX: ffff8a3672173000 RBX: ffffd20800c85cc0 RCX: 0000000000001000
RDX: 0000000000001000 RSI: ffff8a36598a9000 RDI: ffff8a3672173000
RBP: ffff8a3675fb1000 R08: ffff8a364a098300 R09: ffffd20800d7ec28
R10: ffff8a3675fb0000 R11: 000000000003a500 R12: ffff8a3675fb0000
R13: ffff8a364a098300 R14: ffffd20800d7ec28 R15: ffff8a3675fb0b98
FS: 0000000000000000(0000) GS:ffff8a36ffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1916fa8948 CR3: 0000000036134002 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? __collapse_huge_page_copy.isra.0+0x6f/0x1c0
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? copy_mc_enhanced_fast_string+0x6/0xf
__collapse_huge_page_copy.isra.0+0x6f/0x1c0
collapse_huge_page+0x4e7/0x740
hpage_collapse_scan_pmd+0x470/0x870
khugepaged_scan_mm_slot.constprop.0+0x2a3/0x520
khugepaged+0xdd/0x200
? __pfx_khugepaged+0x10/0x10
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OEL ------- --- 5.14.0-503.31.1.el9_5.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x48
panic+0x107/0x2bb
watchdog_timer_fn.cold+0xc/0x16
? __pfx_watchdog_timer_fn+0x10/0x10
__hrtimer_run_queues+0x112/0x2b0
hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
__sysvec_apic_timer_interrupt+0x4e/0x100
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Autotest: Test running for 10 minutes (lustre-reviews_review-dne-part-9_112456.37)
Autotest: Test running for 15 minutes (lustre-reviews_review-dne-part-9_112456.37)
Autotest: Test running for 20 minutes (lustre-reviews_review-dne-part-9_112456.37)
Link to test
sanityn test 109: Race with several mount instances on 1 node
watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [khugepaged:44]
Modules linked in: xfs libcrc32c ofd(OE) ost(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lzstd(OE) llz4hc(OE) llz4(OE) lustre(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey nfsv3 nfs_acl loop dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlxdevm(OE) ib_uverbs(OE) ib_core(OE) psample mlxfw(OE) mlx_compat(OE) macsec tls pci_hyperv_intf intel_rapl_msr intel_rapl_common virtio_balloon i2c_piix4 pcspkr joydev sunrpc drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel net_failover virtio_blk failover serio_raw [last unloaded: libcfs(OE)]
CPU: 1 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G W OE ------- --- 5.14.0-503.34.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Code: 89 ca e9 cd fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 31 c0 c3 cc cc cc cc 48 89 c8 c3 cc cc cc cc cc cc cc cc cc
RSP: 0018:ffffb7bbc016fc58 EFLAGS: 00010286
RAX: ffff9b6f8c5e3000 RBX: fffff933413178c0 RCX: 0000000000001000
RDX: 0000000000001000 RSI: ffff9b6f89214000 RDI: ffff9b6f8c5e3000
RBP: ffff9b6fd650c000 R08: ffff9b6f47ddd240 R09: fffff933425942e8
R10: ffff9b6fd650b000 R11: 000000000003a500 R12: ffff9b6fd650b000
R13: ffff9b6f47ddd240 R14: fffff933425942e8 R15: ffff9b6fd650bf18
FS: 0000000000000000(0000) GS:ffff9b6fffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b8e6d8e008 CR3: 0000000003370004 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? __collapse_huge_page_copy.isra.0+0x6f/0x1c0
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? copy_mc_enhanced_fast_string+0x6/0xf
__collapse_huge_page_copy.isra.0+0x6f/0x1c0
collapse_huge_page+0x4e7/0x740
hpage_collapse_scan_pmd+0x470/0x870
khugepaged_scan_mm_slot.constprop.0+0x2a3/0x520
? __pfx_wq_barrier_func+0x10/0x10
khugepaged+0xdd/0x200
? __pfx_autoremove_wake_function+0x10/0x10
? __pfx_khugepaged+0x10/0x10
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
CPU: 1 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G W OEL ------- --- 5.14.0-503.34.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x48
panic+0x107/0x2bb
watchdog_timer_fn.cold+0xc/0x16
? __pfx_watchdog_timer_fn+0x10/0x10
__hrtimer_run_queues+0x112/0x2b0
hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
__sysvec_apic_timer_interrupt+0x4e/0x100
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 1
Lustre: DEBUG MARKER: Iteration 1
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 2
Lustre: DEBUG MARKER: Iteration 2
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 3
Lustre: DEBUG MARKER: Iteration 3
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 4
Lustre: DEBUG MARKER: Iteration 4
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 5
Lustre: DEBUG MARKER: Iteration 5
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 6
Lustre: DEBUG MARKER: Iteration 6
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 7
Lustre: DEBUG MARKER: Iteration 7
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 8
Lustre: DEBUG MARKER: Iteration 8
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 9
Lustre: DEBUG MARKER: Iteration 9
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 10
Lustre: DEBUG MARKER: Iteration 10
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 11
Lustre: DEBUG MARKER: Iteration 11
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 12
Lustre: DEBUG MARKER: Iteration 12
Autotest: Test running for 855 minutes (lustre-b_es7_0_full-part-3_100.59)
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 13
Lustre: DEBUG MARKER: Iteration 13
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 14
Lustre: DEBUG MARKER: Iteration 14
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 15
Lustre: DEBUG MARKER: Iteration 15
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 16
Lustre: DEBUG MARKER: Iteration 16
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 17
Lustre: DEBUG MARKER: Iteration 17
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 18
Lustre: DEBUG MARKER: Iteration 18
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 19
Lustre: DEBUG MARKER: Iteration 19
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 20
Lustre: DEBUG MARKER: Iteration 20
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 21
Lustre: DEBUG MARKER: Iteration 21
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 22
Lustre: DEBUG MARKER: Iteration 22
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 23
Lustre: DEBUG MARKER: Iteration 23
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 24
Lustre: DEBUG MARKER: Iteration 24
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 25
Lustre: DEBUG MARKER: Iteration 25
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 26
Lustre: DEBUG MARKER: Iteration 26
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 27
Lustre: DEBUG MARKER: Iteration 27
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 28
Lustre: DEBUG MARKER: Iteration 28
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 29
Lustre: DEBUG MARKER: Iteration 29
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 30
Lustre: DEBUG MARKER: Iteration 30
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 31
Lustre: DEBUG MARKER: Iteration 31
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Iteration 32
Lustre: DEBUG MARKER: Iteration 32
Link to test
sanity-lfsck test 44: umount while lfsck is stopping
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [khugepaged:44]
Modules linked in: dm_flakey tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common joydev pcspkr i2c_piix4 virtio_balloon sunrpc drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata virtio_net crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.23.2_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Code: 89 ca e9 cd fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 48 89 f8 48 89 d1 <f3> a4 31 c0 c3 cc cc cc cc 48 89 c8 c3 cc cc cc cc cc cc cc cc cc
RSP: 0018:ffffafe34016fc58 EFLAGS: 00010286
RAX: ffff9db34b1d4000 RBX: ffffdb15c12c7500 RCX: 0000000000001000
RDX: 0000000000001000 RSI: ffff9db358da9000 RDI: ffff9db34b1d4000
RBP: ffff9db304484000 R08: ffff9db303e64300 R09: ffffdb15c01120e8
R10: ffff9db304483000 R11: 000000000003a500 R12: ffff9db304483000
R13: ffff9db303e64300 R14: ffffdb15c01120e8 R15: ffff9db304483ea0
FS: 0000000000000000(0000) GS:ffff9db3bfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa966fec6b4 CR3: 000000003602e005 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? __collapse_huge_page_copy.isra.0+0x6f/0x1c0
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? copy_mc_enhanced_fast_string+0x6/0xf
__collapse_huge_page_copy.isra.0+0x6f/0x1c0
collapse_huge_page+0x4e7/0x740
hpage_collapse_scan_pmd+0x470/0x870
khugepaged_scan_mm_slot.constprop.0+0x2a3/0x520
? __pfx_wq_barrier_func+0x10/0x10
khugepaged+0xdd/0x200
? __pfx_khugepaged+0x10/0x10
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Kernel panic - not syncing: softlockup: hung tasks
CPU: 1 PID: 44 Comm: khugepaged Kdump: loaded Tainted: G OEL ------- --- 5.14.0-503.23.2_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x48
panic+0x107/0x2bb
watchdog_timer_fn.cold+0xc/0x16
? __pfx_watchdog_timer_fn+0x10/0x10
__hrtimer_run_queues+0x112/0x2b0
hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
__sysvec_apic_timer_interrupt+0x4e/0x100
sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:copy_mc_enhanced_fast_string+0x6/0xf
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_val=3 fail_loc=0x1600
Lustre: DEBUG MARKER: /usr/sbin/lctl lfsck_start -M lustre-MDT0000 -t namespace -r
LustreError: 366549:0:(lfsck_engine.c:836:lfsck_master_oit_engine()) cfs_fail_timeout id 1600 sleeping for 3000ms
LustreError: 366549:0:(lfsck_engine.c:836:lfsck_master_oit_engine()) Skipped 2 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl lfsck_stop -M lustre-MDT0000
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
LustreError: 366549:0:(lfsck_engine.c:836:lfsck_master_oit_engine()) cfs_fail_timeout id 1600 awake
LustreError: 366549:0:(lfsck_engine.c:836:lfsck_master_oit_engine()) Skipped 3 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem 84690e1c-4763-4b59-ad0a-b04dc3e90d4b.
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_val=0 fail_loc=0
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1
LDISKFS-fs (dm-3): mounted filesystem 84690e1c-4763-4b59-ad0a-b04dc3e90d4b r/w with ordered data mode. Quota mode: journalled.
LustreError: MGC10.240.28.46@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
LustreError: Skipped 4 previous similar messages
Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180
Lustre: Skipped 5 previous similar messages
Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect
Lustre: Skipped 2 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 5 clients reconnect
Lustre: Skipped 2 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdd.lustre-MDT0000.lfsck_namespace
Autotest: Test running for 130 minutes (lustre-reviews_review-dne-part-2_111482.30)
Lustre: lustre-MDT0000: Recovery over after 0:04, of 5 clients 5 recovered and 0 were evicted.
Lustre: Skipped 2 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
LDISKFS-fs (dm-3): unmounting filesystem 84690e1c-4763-4b59-ad0a-b04dc3e90d4b.
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup remove /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: modprobe -r dm-flakey
Lustre: 10105:0:(client.c:2340:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1741080224/real 1741080224] req@ffff9db33394a700 x1825647223615488/t0(0) o400->MGC10.240.28.46@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1741080240 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
Lustre: 10105:0:(client.c:2340:ptlrpc_expire_one_request()) Skipped 13 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3
LDISKFS-fs (dm-4): unmounting filesystem 3db05193-cc21-40ae-b04d-b9caccab2112.
Lustre: server umount lustre-MDT0002 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds3_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds3_flakey
Lustre: DEBUG MARKER: dmsetup remove /dev/mapper/mds3_flakey
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: modprobe -r dm-flakey
Lustre: DEBUG MARKER: sysctl -wq kernel/kptr_restrict=1 || true
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-lfsck: start setup 09:24:56 \(1741080296\) ===
Lustre: DEBUG MARKER: === sanity-lfsck: start setup 09:24:56 (1741080296) ===
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds3_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-70vm7.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-70vm7.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: [ -e /dev/vg_Role_MDS/mdt1 ]
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=1981808 --mkfsoptions="-b 4096" --reformat /dev/vg_Role_MDS/mdt1
LDISKFS-fs (dm-0): mounted filesystem f1be7494-9219-4a68-9eeb-469ba3c1059e r/w with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-0): unmounting filesystem f1be7494-9219-4a68-9eeb-469ba3c1059e.
Autotest: Test running for 135 minutes (lustre-reviews_review-dne-part-2_111482.30)
Autotest: Test running for 140 minutes (lustre-reviews_review-dne-part-2_111482.30)