Editing crashreport #70328

Reason: watchdog: BUG: soft lockup
Crashing Function: cfs_hash_for_each_relax
Where to cut Backtrace: cfs_hash_for_each_nolock
Backtrace:
  ldlm_namespace_cleanup
  __ldlm_namespace_free
  ldlm_namespace_free_prior
  mdt_fini
  mdt_device_fini
  obd_precleanup
  class_cleanup
  class_process_config
  class_manual_cleanup
  server_put_super
  generic_shutdown_super
  kill_anon_super
  deactivate_locked_super
  cleanup_mnt
  task_work_run
  exit_to_user_mode_loop
  exit_to_user_mode_prepare
  syscall_exit_to_user_mode
  do_syscall_64
  entry_SYSCALL_64_after_hwframe
Reports Count: 37
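
Every failure grouped under this report shares the same signature: while an MDT is being unmounted, ldlm_namespace_free_prior walks the namespace's resource hash (ldlm_namespace_cleanup -> cfs_hash_for_each_nolock -> cfs_hash_for_each_relax, with ldlm_resource_clean as the iteration callback), and the walk runs longer than the soft-lockup threshold (25-27 s in the traces below) without yielding, so the per-CPU watchdog (watchdog_timer_fn, visible in the <IRQ> half of every backtrace) fires. "Where to cut Backtrace" names the frame at which traces are truncated before grouping; the 20 frames listed above are what remains below the cut.

A minimal sketch of that grouping, with illustrative names only (this is not the tracker's code): drop speculative "? " frames, reduce each frame to its bare symbol, and key the report on the frames below the cut point.

    import re

    def normalize(frame: str) -> str:
        """'obd_precleanup.isra.0+0x8e/0x280 [obdclass]' -> 'obd_precleanup'."""
        symbol = re.split(r"[+ ]", frame.strip(), maxsplit=1)[0]
        return symbol.split(".")[0]  # strip .isra.0 / .constprop.0 suffixes

    def group_key(frames: list[str],
                  cut_at: str = "cfs_hash_for_each_nolock") -> tuple[str, ...]:
        """Grouping key: bare symbols strictly below the cut frame."""
        symbols = [normalize(f) for f in frames
                   if f.strip()
                   and not f.strip().startswith("?")   # speculative frames
                   and not f.strip().startswith("<")]  # <IRQ>/<TASK> markers
        if cut_at in symbols:
            symbols = symbols[symbols.index(cut_at) + 1:]
        return tuple(symbols)

Applied to the call-trace frames of any "Full Crash" entry below, group_key would return the same 20-symbol tuple as the backtrace above, which is how 37 distinct logs collapse into one summary row.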

Added fields:

Match messages in logs
(every line must be present in the log output;
copy from the "Messages before crash" column below):
Match messages in full crash
(every line must be present in the full crash output;
copy from the "Full Crash" column below):
Limit to a test:
(copy from the "Failing Test" column below):
Delete these reports as invalid (e.g., a real bug already in review)
Bug or comment:
Extra info:
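
The two "Match messages" filters above are conjunctive: a report is matched only when every configured line occurs somewhere in the corresponding output. A minimal sketch of that rule, assuming plain substring matching per line (illustrative only, not the tool's code):

    def matches(required_lines: list[str], output: str) -> bool:
        """True only if every required line occurs somewhere in the output."""
        haystack = output.splitlines()
        return all(any(req in line for line in haystack)
                   for req in required_lines)

    # Required lines would be copied verbatim from the columns below, e.g.:
    # matches(["umount -d -f /mnt/lustre-mds1",
    #          "ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]"], full_crash_text)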

Failures list (last 100):

Failing Test | Full Crash | Messages before crash | Comment
runtests test complete, duration 661 sec
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:186536]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common pcspkr virtio_balloon i2c_piix4 joydev fuse drm ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul libata crc32c_intel virtio_net net_failover ghash_clmulni_intel failover virtio_blk serio_raw [last unloaded: obdecho(OE)]
CPU: 0 PID: 186536 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 0c e2 c3 eb 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb01c0afb3650 EFLAGS: 00010282
RAX: ffffb01c05a7a008 RBX: 0000000000000051 RCX: 000000000000000e
RDX: ffffb01c05a71000 RSI: ffffb01c0afb3680 RDI: ffff9bd682cebb00
RBP: ffff9bd682cebb00 R08: 0000000000000018 R09: ffff9bd78af06626
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9bd682cebb00 R14: ffffb01c0afb36f8 R15: 0000000000000000
FS: 00007fc39b2dd540(0000) GS:ffff9bd6ffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000563c234a5050 CR3: 00000000372aa005 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup.isra.0+0x8e/0x280 [obdclass]
? class_disconnect_exports+0x131/0x300 [obdclass]
class_cleanup+0x2db/0x600 [obdclass]
class_process_config+0x12ef/0x1e00 [obdclass]
? __kmem_cache_alloc_node+0x18f/0x2e0
? class_manual_cleanup+0x160/0x730 [obdclass]
? class_manual_cleanup+0x160/0x730 [obdclass]
class_manual_cleanup+0x1e5/0x730 [obdclass]
server_put_super+0xa86/0xc60 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? locks_dispose_list+0x54/0x70
? kmem_cache_free+0x156/0x360
? locks_dispose_list+0x54/0x70
? flock_lock_inode+0x21c/0x390
? locks_remove_flock+0xe6/0xf0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __pfx_file_free_rcu+0x10/0x10
? __call_rcu_common.constprop.0+0x117/0x2b0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? kvm_sched_clock_read+0xd/0x20
? sched_clock+0xc/0x30
? sched_clock_cpu+0xb/0x190
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __irq_exit_rcu+0x46/0xc0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fc39b10e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === runtests: start cleanup 22:03:33 \(1755036213\) ===
Lustre: DEBUG MARKER: === runtests: start cleanup 22:03:33 (1755036213) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === runtests: finish cleanup 22:03:34 \(1755036214\) ===
Lustre: DEBUG MARKER: === runtests: finish cleanup 22:03:34 (1755036214) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: sanity-lfsck ============----- Tue Aug 12 10:03:35 PM UTC 2025
Lustre: DEBUG MARKER: -----============= acceptance-small: sanity-lfsck ============----- Tue Aug 12 10:03:35 PM UTC 2025
Lustre: DEBUG MARKER: hostname -I
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/sanity-lfsck.*ex || true
Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/sanity-lfsck.*ex 2>/dev/null ||true
Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests:
Lustre: DEBUG MARKER: excepting tests:
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 43 previous similar messages
Link to test
sanity-quota test 7e: Quota reintegration (inode limits)
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:291900]
Modules linked in: dm_flakey tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common virtio_balloon pcspkr i2c_piix4 joydev drm fuse ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul ata_piix crc32c_intel libata virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 291900 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 0c 22 46 c6 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa96a4b60b768 EFLAGS: 00010282
RAX: ffffa96a43154008 RBX: 000000000000006b RCX: 000000000000000e
RDX: ffffa96a43149000 RSI: ffffa96a4b60b798 RDI: ffff9bb973e36000
RBP: ffff9bb973e36000 R08: 0000000000000017 R09: ffff9bba76a823f2
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9bb973e36000 R14: ffffa96a4b60b810 R15: 0000000000000000
FS: 00007f2fbca90540(0000) GS:ffff9bb9ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055c01c8cc050 CR3: 00000000331f6005 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? cfs_hash_for_each_relax+0x154/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup.isra.0+0x8e/0x280 [obdclass]
? class_disconnect_exports+0x193/0x300 [obdclass]
class_cleanup+0x2db/0x600 [obdclass]
class_process_config+0x12ef/0x1e00 [obdclass]
? __kmem_cache_alloc_node+0x18f/0x2e0
? class_manual_cleanup+0x160/0x730 [obdclass]
? class_manual_cleanup+0x160/0x730 [obdclass]
class_manual_cleanup+0x1e5/0x730 [obdclass]
server_put_super+0xa86/0xc60 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? __check_object_size.part.0+0x47/0xd0
? locks_dispose_list+0x54/0x70
? kmem_cache_free+0x156/0x360
? locks_dispose_list+0x54/0x70
? flock_lock_inode+0x21c/0x390
? locks_remove_flock+0xe6/0xf0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __pfx_file_free_rcu+0x10/0x10
? __call_rcu_common.constprop.0+0x117/0x2b0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7f2fbc90e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_*
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight
Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0
Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1
Lustre: DEBUG MARKER: /usr/sbin/lctl dl
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0001.quota_slave.enabled
Lustre: DEBUG MARKER: /usr/sbin/lctl dl
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0003.quota_slave.enabled
Lustre: DEBUG MARKER: /usr/sbin/lctl dl
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0001.quota_slave.enabled
Lustre: DEBUG MARKER: /usr/sbin/lctl dl
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osd-ldiskfs.lustre-MDT0003.quota_slave.enabled
Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds4
Lustre: Failing over lustre-MDT0003
Lustre: lustre-MDT0003: Not available for connect from 10.240.26.102@tcp (stopping)
Lustre: Skipped 8 previous similar messages
Lustre: lustre-MDT0003: Not available for connect from 10.240.28.46@tcp (stopping)
Lustre: Skipped 8 previous similar messages
Lustre: lustre-MDT0003: Not available for connect from 10.240.28.46@tcp (stopping)
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0003-osp-MDT0001: Connection to lustre-MDT0003 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 7 previous similar messages
Lustre: lustre-MDT0003: Not available for connect from 10.240.26.102@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0003: Not available for connect from 10.240.26.102@tcp (stopping)
Lustre: Skipped 12 previous similar messages
Lustre: lustre-MDT0003: Not available for connect from 10.240.26.102@tcp (stopping)
Lustre: Skipped 25 previous similar messages
Link to test
replay-ost-single test complete, duration 1902 sec
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:55930]
Modules linked in: tls dm_flakey osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss sd_mod t10_pi nfsv4 sg iscsi_tcp libiscsi_tcp dns_resolver libiscsi nfs scsi_transport_iscsi lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common pcspkr i2c_piix4 virtio_balloon joydev sunrpc fuse drm ext4 mbcache jbd2 ata_generic ata_piix libata virtio_net crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel net_failover failover virtio_blk serio_raw [last unloaded: dm_flakey]
CPU: 0 PID: 55930 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 0c b2 cc f6 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0000:ffffaf038547f9a8 EFLAGS: 00010282
RAX: ffffaf03815dc008 RBX: 0000000000000008 RCX: 000000000000000e
RDX: ffffaf03815cd000 RSI: ffffaf038547f9d8 RDI: ffff88a6351f0700
RBP: ffff88a6351f0700 R08: 0000000000000018 R09: ffff88a7069647f1
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff88a6351f0700 R14: ffffaf038547fa50 R15: 0000000000000000
FS: 00007f5933a0f540(0000) GS:ffff88a6bfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd5ae35c88 CR3: 0000000026f2a002 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? cfs_hash_for_each_relax+0x154/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup.isra.0+0x8e/0x280 [obdclass]
? __pfx_obd_precleanup.isra.0+0x10/0x10 [obdclass]
class_cleanup+0x2db/0x600 [obdclass]
class_process_config+0x12ef/0x1e00 [obdclass]
? __kmem_cache_alloc_node+0x18f/0x2e0
? class_manual_cleanup+0x160/0x730 [obdclass]
? class_manual_cleanup+0x160/0x730 [obdclass]
class_manual_cleanup+0x1e5/0x730 [obdclass]
server_put_super+0xa86/0xc60 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __irq_exit_rcu+0x46/0xc0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7f593390e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === replay-ost-single: start cleanup 16:11:14 \(1755015074\) ===
Lustre: DEBUG MARKER: === replay-ost-single: start cleanup 16:11:14 (1755015074) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === replay-ost-single: finish cleanup 16:11:17 \(1755015077\) ===
Lustre: DEBUG MARKER: === replay-ost-single: finish cleanup 16:11:17 (1755015077) ===
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: lustre-MDT0000: Not available for connect from 10.240.29.136@tcp (stopping)
Lustre: Skipped 5 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.29.136@tcp (stopping)
Lustre: Skipped 6 previous similar messages
Link to test
conf-sanity test 121: failover MGS
watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [umount:1721651]
Modules linked in: xfs libcrc32c ofd(OE) ost(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lzstd(OE) llz4hc(OE) llz4(OE) lustre(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey nfsv3 nfs_acl loop dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common pcspkr virtio_balloon i2c_piix4 joydev sunrpc drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata virtio_net crct10dif_pclmul crc32_pclmul crc32c_intel net_failover virtio_blk ghash_clmulni_intel failover serio_raw [last unloaded: libcfs(OE)]
CPU: 1 PID: 1721651 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 6c ee c6 f4 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffaac043b57830 EFLAGS: 00010282
RAX: ffffaac041ea3008 RBX: 0000000000000034 RCX: 000000000000000e
RDX: ffffaac041e75000 RSI: ffffaac043b57860 RDI: ffff89ff7057f600
RBP: ffff89ff7057f600 R08: 0000000000000017 R09: ffff8a0075b61e9b
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff89ff7057f600 R14: ffffaac043b578d8 R15: 0000000000000000
FS: 00007fb6de4b0540(0000) GS:ffff89ffffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055706c1fb088 CR3: 00000000379a4006 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xef/0x590 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __do_sys_flock+0x134/0x1a0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fb6de30e06b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
LDISKFS-fs (dm-4): mounted filesystem 409d5a4a-1f77-4278-836d-2896bff6f46e r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds3_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds3_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds3_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds3_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3; mount -t lustre -o localrecov /dev/mapper/mds3_flakey /mnt/lustre-mds3
LDISKFS-fs (dm-5): mounted filesystem fe55bcd2-bd52-4814-95a9-f7963799595f r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds3_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds3_flakey 2>/dev/null
Lustre: 1715010:0:(service.c:2203:ptlrpc_server_handle_req_in()) @@@ Slow req_in handling 6s req@ffff89ff5d192450 x1840262831695488/t0(0) o400->bd150297-5221-4a17-9362-2804c0bf21fc@0@lo:0/0 lens 224/0 e 0 to 0 dl 0 ref 1 fl New:/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 18 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 44 previous similar messages
Lustre: server umount lustre-MDT0000 complete
LDISKFS-fs (dm-4): unmounting filesystem 409d5a4a-1f77-4278-836d-2896bff6f46e.
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
LustreError: lustre-MDT0000: not available for connect from 10.240.28.49@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 80 previous similar messages
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre /dev/mapper/mds1_flakey /mnt/lustre-mds1
LDISKFS-fs (dm-4): mounted filesystem 409d5a4a-1f77-4278-836d-2896bff6f46e r/w with ordered data mode. Quota mode: journalled.
Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect
Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 3 clients reconnect
Lustre: lustre-MDT0000-lwp-MDT0002: Connection restored to 10.240.28.48@tcp (at 0@lo)
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: lustre-MDT0000: Recovery over after 0:02, of 3 clients 3 recovered and 0 were evicted.
Lustre: lustre-MDT0000-osp-MDT0002: Connection restored to 10.240.28.48@tcp (at 0@lo)
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mgc.\*.mgs_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mgc.\*.mgs_server_uuid
Lustre: DEBUG MARKER: onyx-49vm4.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mgc.*.mgs_server_uuid
Lustre: DEBUG MARKER: onyx-49vm5.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mgc.*.mgs_server_uuid
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: server umount lustre-MDT0000 complete
LDISKFS-fs (dm-4): unmounting filesystem 409d5a4a-1f77-4278-836d-2896bff6f46e.
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
LustreError: 1715001:0:(ldlm_lockd.c:2574:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.49@tcp arrived at 1755015157 with bad export cookie 10213683906329607606
LustreError: 1715001:0:(ldlm_lockd.c:2574:ldlm_cancel_handler()) Skipped 15 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3
Link to test
runtests test 1: All Runtests
watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [umount:177908]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) tls ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i2c_piix4 virtio_balloon pcspkr joydev fuse drm ext4 mbcache jbd2 ata_generic ata_piix virtio_net libata crct10dif_pclmul crc32_pclmul crc32c_intel net_failover ghash_clmulni_intel failover virtio_blk serio_raw [last unloaded: obdecho(OE)]
CPU: 1 PID: 177908 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 1c dc a7 db 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0000:ffffa4fe486db7f8 EFLAGS: 00010282
RAX: ffffa4fe42eac008 RBX: 0000000000000035 RCX: 000000000000000e
RDX: ffffa4fe42ea3000 RSI: ffffa4fe486db828 RDI: ffff8fbb7b194f00
RBP: ffff8fbb7b194f00 R08: 0000000000000018 R09: ffff8fbc7d6cf3c7
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8fbb7b194f00 R14: ffffa4fe486db8a0 R15: 0000000000000000
FS: 00007fc3c5077540(0000) GS:ffff8fbbffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa19c90d000 CR3: 000000005b3ba004 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? cfs_hash_for_each_relax+0x154/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xdc/0x280 [obdclass]
? class_disconnect_exports+0x131/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
? class_manual_cleanup+0x160/0x740 [obdclass]
class_manual_cleanup+0x1e5/0x740 [obdclass]
server_put_super+0x98f/0xb40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? locks_remove_flock+0xe6/0xf0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? rcu_nocb_try_bypass+0x5e/0x460
? __pfx_file_free_rcu+0x10/0x10
? __call_rcu_common.constprop.0+0x117/0x2b0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fc3c4f0e06b
Lustre: lustre-MDT0002: haven't heard from client 19f4eb20-65a9-4862-be08-7d91d197f87e (at 10.240.26.181@tcp) in 104 seconds. I think it's dead, and I am evicting it. exp ffff8fbb7df8e000, cur 1752681198 deadline 1752681194 last 1752681094
Lustre: DEBUG MARKER: /usr/sbin/lctl mark touching \/mnt\/lustre at Wed Jul 16 03:53:21 PM UTC 2025 \(@1752681201\)
Lustre: DEBUG MARKER: touching /mnt/lustre at Wed Jul 16 03:53:21 PM UTC 2025 (@1752681201)
Lustre: DEBUG MARKER: /usr/sbin/lctl mark create an empty file \/mnt\/lustre\/hosts.138949
Lustre: DEBUG MARKER: create an empty file /mnt/lustre/hosts.138949
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.138949
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.138949
Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing \/etc\/hosts and \/mnt\/lustre\/hosts.138949
Lustre: DEBUG MARKER: comparing /etc/hosts and /mnt/lustre/hosts.138949
Lustre: DEBUG MARKER: /usr/sbin/lctl mark renaming \/mnt\/lustre\/hosts.138949 to \/mnt\/lustre\/hosts.138949.ren
Lustre: DEBUG MARKER: renaming /mnt/lustre/hosts.138949 to /mnt/lustre/hosts.138949.ren
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.138949 again
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.138949 again
Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.138949
Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.138949
Lustre: DEBUG MARKER: /usr/sbin/lctl mark removing \/mnt\/lustre\/hosts.138949
Lustre: DEBUG MARKER: removing /mnt/lustre/hosts.138949
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.138949.2
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.138949.2
Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.138949.2 to 123 bytes
Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.138949.2 to 123 bytes
Lustre: DEBUG MARKER: /usr/sbin/lctl mark creating \/mnt\/lustre\/d1.runtests
Lustre: DEBUG MARKER: creating /mnt/lustre/d1.runtests
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying 1000 files from \/etc, \/usr\/bin to \/mnt\/lustre\/d1.runtests\/etc, \/mnt\/lustre\/d1.runtests\/usr\/bin at Wed Jul 16 03:53:29 PM UTC 2025
Lustre: DEBUG MARKER: copying 1000 files from /etc, /usr/bin to /mnt/lustre/d1.runtests/etc, /mnt/lustre/d1.runtests/usr/bin at Wed Jul 16 03:53:29 PM UTC 2025
Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing 1000 newly copied files at Wed Jul 16 03:53:39 PM UTC 2025
Lustre: DEBUG MARKER: comparing 1000 newly copied files at Wed Jul 16 03:53:39 PM UTC 2025
Lustre: DEBUG MARKER: /usr/sbin/lctl mark running createmany -d \/mnt\/lustre\/d1.runtests\/d 1000
Lustre: DEBUG MARKER: running createmany -d /mnt/lustre/d1.runtests/d 1000
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n debug
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=ha
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=super+ioctl+neterror+warning+dlmtrace+error+emerg+ha+rpctrace+vfstrace+config+console+lfsck
Lustre: DEBUG MARKER: /usr/sbin/lctl mark finished at Wed Jul 16 03:53:48 PM UTC 2025 \(27\)
Lustre: DEBUG MARKER: finished at Wed Jul 16 03:53:48 PM UTC 2025 (27)
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.47@tcp (stopping)
Lustre: Skipped 16 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem 0bd95929-b4a6-4022-b1a8-86a2472b40bb.
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
LustreError: 10462:0:(ldlm_lockd.c:2546:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.47@tcp arrived at 1752681260 with bad export cookie 17431209940234910117
LustreError: 10462:0:(ldlm_lockd.c:2546:ldlm_cancel_handler()) Skipped 4 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3
LustreError: 10462:0:(ldlm_lockd.c:2546:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1752681268 with bad export cookie 17431209940234909487
LDISKFS-fs (dm-4): unmounting filesystem 42ae755c-0811-465b-8f12-cd40b4c04bcc.
Lustre: server umount lustre-MDT0002 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts);
Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts);
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
LDISKFS-fs (dm-3): mounted filesystem 0bd95929-b4a6-4022-b1a8-86a2472b40bb r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Autotest: Test running for 65 minutes (lustre-reviews_review-dne-part-2_115076.43)
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds3_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds3_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds3_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds3_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3; mount -t lustre -o localrecov /dev/mapper/mds3_flakey /mnt/lustre-mds3
LDISKFS-fs (dm-4): mounted filesystem 42ae755c-0811-465b-8f12-cd40b4c04bcc r/w with ordered data mode. Quota mode: journalled.
Lustre: lustre-MDT0002: Not available for connect from 10.240.28.47@tcp (not set up)
Lustre: Skipped 21 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds3_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds3_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P debug_raw_pointers=Y
Lustre: 175708:0:(mgs_llog.c:1343:mgs_modify_param()) MGS: modify general/debug_raw_pointers=Y (mode = 0) failed: rc = -17
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm8.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-80vm7.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-80vm7.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: lctl get_param -n timeout
Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20
Lustre: DEBUG MARKER: Using TIMEOUT=20
Lustre: DEBUG MARKER: [ -f /sys/module/mgc/parameters/mgc_requeue_timeout_min ] && echo 1 > /sys/module/mgc/parameters/mgc_requeue_timeout_min; exit 0
Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P lod.*.mdt_hash=crush
Lustre: 177455:0:(mgs_llog.c:1343:mgs_modify_param()) MGS: modify general/lod.*.mdt_hash=crush (mode = 0) failed: rc = -17
Lustre: DEBUG MARKER: sysctl --values kernel/kptr_restrict
Lustre: DEBUG MARKER: sysctl -wq kernel/kptr_restrict=1
Lustre: DEBUG MARKER: /usr/sbin/lctl mark After first remount. comparing 1000 previously copied files
Lustre: DEBUG MARKER: After first remount. comparing 1000 previously copied files
Lustre: DEBUG MARKER: /usr/sbin/lctl mark running statmany -s \/mnt\/lustre\/d1.runtests\/d 1000 2000
Lustre: DEBUG MARKER: running statmany -s /mnt/lustre/d1.runtests/d 1000 2000
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107
LustreError: Skipped 18 previous similar messages
Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 16 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 1 previous similar message
Link to test
sanity-sec test 27a: test fileset in various nodemaps
watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [umount:179333]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common joydev i2c_piix4 pcspkr virtio_balloon drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net virtio_blk ghash_clmulni_intel net_failover failover serio_raw
CPU: 1 PID: 179333 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 2c a1 20 e9 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa30780ff7790 EFLAGS: 00010282
RAX: ffffa307865f6008 RBX: 000000000000007a RCX: 000000000000000e
RDX: ffffa307865b9000 RSI: ffffa30780ff77c0 RDI: ffff95fa354a0200
RBP: ffff95fa354a0200 R08: 0000000000000018 R09: ffff95fb28e361cd
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff95fa354a0200 R14: ffffa30780ff7838 R15: 0000000000000000
FS: 00007fe68b48f540(0000) GS:ffff95fabfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556d80d7c088 CR3: 000000002ecd4001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x160/0x740 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? kmem_cache_free+0x156/0x360
? locks_dispose_list+0x54/0x70
? flock_lock_inode+0x21c/0x390
? locks_remove_flock+0xe6/0xf0
? rcu_nocb_try_bypass+0x5e/0x460
? __pfx_file_free_rcu+0x10/0x10
? __call_rcu_common.constprop.0+0x117/0x2b0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fe68b30e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param mdt.*.identity_upcall=NONE
Lustre: 177106:0:(mdt_lproc.c:329:identity_upcall_store()) lustre-MDT0001: disable "identity_upcall" with ACL enabled maybe cause unexpected "EACCESS"
Lustre: 177106:0:(mdt_lproc.c:329:identity_upcall_store()) Skipped 5 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.active
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.admin_nodemap
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.c0.trusted_nodemap
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.c0.trusted_nodemap
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.c0.fileset
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.c0.admin_nodemap
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.c0.trusted_nodemap
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.c0.map_mode
Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.c0.deny_unknown
Lustre: 47620:0:(gss_svc_upcall.c:701:gss_svc_searchbyctx()) ctx hdl 0xfdb1373f9745e785 does not have mech ctx: rc = -2
Lustre: 47620:0:(gss_svc_upcall.c:701:gss_svc_searchbyctx()) Skipped 3 previous similar messages
Lustre: lustre-MDT0000-osp-MDT0003: Connection to lustre-MDT0000 (at 10.240.28.49@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 30 previous similar messages
LustreError: lustre-MDT0000-osp-MDT0003: operation mds_statfs to node 10.240.28.49@tcp failed: rc = -107
LustreError: Skipped 12 previous similar messages
Lustre: 9124:0:(client.c:2360:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1749838182/real 1749838182] req@ffff95fa37ebc340 x1834828258916864/t0(0) o400->MGC10.240.28.49@tcp@10.240.28.49@tcp:26/25 lens 224/224 e 0 to 1 dl 1749838198 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
Lustre: 9124:0:(client.c:2360:ptlrpc_expire_one_request()) Skipped 16 previous similar messages
LustreError: MGC10.240.28.49@tcp: Connection to MGS (at 10.240.28.49@tcp) was lost; in progress operations using this service will fail
LustreError: Skipped 6 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 10.240.27.21@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.49@tcp (stopping)
Lustre: Skipped 3 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.49@tcp (stopping)
LDISKFS-fs (dm-3): unmounting filesystem f758d908-ea66-41c5-a636-322b91ac1906.
Lustre: server umount lustre-MDT0001 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
LustreError: lustre-MDT0001: not available for connect from 10.240.28.49@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 50 previous similar messages
LustreError: lustre-MDT0001: not available for connect from 10.240.27.21@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 14 previous similar messages
LustreError: lustre-MDT0001: not available for connect from 10.240.27.21@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 16 previous similar messages
LustreError: lustre-MDT0001: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 30 previous similar messages
Autotest: Test running for 170 minutes (lustre-b_es-reviews_review-dne-selinux-ssk-part-2_23876.89)
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
Lustre: lustre-MDT0003: Not available for connect from 10.240.27.21@tcp (stopping)
Lustre: Skipped 9 previous similar messages
Lustre: lustre-MDT0003: Not available for connect from 10.240.27.21@tcp (stopping)
Lustre: Skipped 7 previous similar messages
Lustre: lustre-MDT0003: Not available for connect from 10.240.27.21@tcp (stopping)
Lustre: Skipped 15 previous similar messages
LustreError: lustre-MDT0001: not available for connect from 10.240.27.21@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 49 previous similar messages
Link to test
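
Every report in this list has the same signature: umount of an MDT reaches ldlm_namespace_free_prior(), which walks the namespace resource hash via cfs_hash_for_each_nolock()/cfs_hash_for_each_relax() with ldlm_resource_clean as the callback, and that walk keeps the CPU busy in kernel mode long enough (22-26s in the reports here) for the soft-lockup watchdog to fire. Below is a minimal userspace sketch of that loop shape; the retry policy, names and structures are assumptions for illustration only, not the Lustre implementation:

/*
 * Sketch only: a "walk the hash until it drains" loop with no
 * scheduling point.  All names here are stand-ins, not Lustre code.
 */
#include <stdbool.h>
#include <stdio.h>

#define NBUCKETS 8

struct res {
	int refs;      /* >0: still referenced, cannot be freed yet */
	bool present;
};

static struct res table[NBUCKETS];

/* Stand-in for the cleanup callback: frees an entry only when idle. */
static bool clean_one(struct res *r)
{
	if (r->refs > 0)
		return false;          /* skip: still busy */
	r->present = false;
	return true;
}

/* One pass over the buckets; returns how many entries survived. */
static int walk_once(void)
{
	int remaining = 0;

	for (int i = 0; i < NBUCKETS; i++)
		if (table[i].present && !clean_one(&table[i]))
			remaining++;
	return remaining;
}

int main(void)
{
	/* One entry that never becomes freeable, e.g. a leaked ref. */
	table[3] = (struct res){ .refs = 1, .present = true };

	unsigned long spins = 0;
	while (walk_once() > 0) {
		/* In-kernel there is no counter and no exit: the loop
		 * simply occupies the CPU until the watchdog reports
		 * "soft lockup - CPU#N stuck for 2Xs". */
		if (++spins == 100000000UL) {
			printf("lockup analogue after %lu passes\n", spins);
			break;
		}
	}
	return 0;
}

The point of the sketch is only the failure mode: if some resource never becomes freeable, or the callback keeps finding new work, a drain loop with no scheduling point pins one CPU until the watchdog complains.
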
sanity-pcc test 1c: Test automated attach using Project ID with manual HSM restore
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:457269]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common virtio_balloon i2c_piix4 pcspkr joydev sunrpc drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw
CPU: 1 PID: 457269 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.40.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 2c 01 0b dc 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffaaf4c9e27680 EFLAGS: 00010282
RAX: ffffaaf4c7545008 RBX: 000000000000006e RCX: 000000000000000e
RDX: ffffaaf4c7533000 RSI: ffffaaf4c9e276b0 RDI: ffff8fcb01cd8800
RBP: ffff8fcb01cd8800 R08: 0000000000000018 R09: ffff8fcbf19ed869
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8fcb01cd8800 R14: ffffaaf4c9e27728 R15: 0000000000000000
FS: 00007fc84bc31540(0000) GS:ffff8fcb7fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f310e91d0e8 CR3: 0000000045c42003 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x160/0x740 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? rcu_nocb_try_bypass+0x5e/0x460
? __pfx_file_free_rcu+0x10/0x10
? __call_rcu_common.constprop.0+0x117/0x2b0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? terminate_walk+0xe5/0xf0
? path_openat+0xc1/0x280
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __check_object_size.part.0+0x47/0xd0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fc84bb0e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: lfs --list-commands
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.49@tcp (stopping)
Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 11 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.49@tcp (stopping)
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 11 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 13 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.49@tcp (stopping)
Lustre: Skipped 15 previous similar messages
Link to test
recovery-small test 149: skip orphan removal at umount
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:83397]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill virtio_balloon i2c_piix4 sunrpc intel_rapl_msr intel_rapl_common joydev pcspkr drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata virtio_net net_failover virtio_blk ghash_clmulni_intel failover serio_raw
CPU: 1 PID: 83397 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.38.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 5c df 2f f7 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa99dcb7ff6b0 EFLAGS: 00010282
RAX: ffffa99dc9c70008 RBX: 0000000000000074 RCX: 000000000000000e
RDX: ffffa99dc9c41000 RSI: ffffa99dcb7ff6e0 RDI: ffff8dfd3c12a500
RBP: ffff8dfd3c12a500 R08: 0000000000000017 R09: ffff8dfe3a985714
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8dfd3c12a500 R14: ffffa99dcb7ff758 R15: 0000000000000000
FS: 00007ff9050b0540(0000) GS:ffff8dfdbfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fda1eeb9068 CR3: 000000003b03e004 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xdc/0x280 [obdclass]
? class_disconnect_exports+0x193/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
class_manual_cleanup+0x1e5/0x740 [obdclass]
server_put_super+0x98f/0xb40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? locks_insert_lock_ctx+0x52/0x90
? _raw_spin_unlock+0xa/0x30
? flock_lock_inode+0x21c/0x390
? locks_lock_inode_wait+0x6d/0x180
? syscall_exit_work+0x103/0x130
? rseq_get_rseq_cs+0x1d/0x240
? rseq_ip_fixup+0x6e/0x1a0
? __rseq_handle_notify_resume+0x26/0xb0
? exit_to_user_mode_loop+0xd9/0x130
? exit_to_user_mode_prepare+0xef/0x100
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7ff904f0e06b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.8@tcp (stopping)
Lustre: Skipped 1 previous similar message
LustreError: lustre-MDT0001-osp-MDT0003: operation mds_statfs to node 0@lo failed: rc = -107
LustreError: Skipped 3 previous similar messages
Lustre: lustre-MDT0001-osp-MDT0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 13 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 7 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.46@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.8@tcp (stopping)
Lustre: Skipped 3 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.8@tcp (stopping)
Lustre: Skipped 12 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.46@tcp (stopping)
Lustre: Skipped 21 previous similar messages
Link to test
sanity-sec test 62: e2fsck with encrypted files
watchdog: BUG: soft lockup - CPU#0 stuck for 24s! [umount:242541]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) osd_ldiskfs(OE) lquota(OE) ldiskfs(OE) lzstd(OE) llz4hc(OE) llz4(OE) lustre(OE) obdecho(OE) mgc(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common i2c_piix4 pcspkr virtio_balloon joydev sunrpc drm fuse ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul crc32c_intel ata_generic ata_piix virtio_blk ghash_clmulni_intel libata virtio_net net_failover failover serio_raw [last unloaded: lzstd(OE)]
CPU: 0 PID: 242541 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.38.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 9c d1 00 e7 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa3dd0787f8f0 EFLAGS: 00010282
RAX: ffffa3dd02ae6008 RBX: 0000000000000071 RCX: 000000000000000e
RDX: ffffa3dd02aa7000 RSI: ffffa3dd0787f920 RDI: ffff8aa7044b3400
RBP: ffff8aa7044b3400 R08: 0000000000000018 R09: ffff8aa83197dd19
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8aa7044b3400 R14: ffffa3dd0787f998 R15: 0000000000000000
FS: 00007f1339136540(0000) GS:ffff8aa7bfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055e41a39c2a0 CR3: 000000001c958001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x160/0x740 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? __do_sys_newfstatat+0x35/0x60
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? do_user_addr_fault+0x1d6/0x6a0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7f133930e06b
LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 10.240.28.44@tcp failed: rc = -107
Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 10.240.28.44@tcp) was lost; in progress operations using this service will wait for recovery to complete
LustreError: lustre-MDT0000-osp-MDT0003: operation mds_statfs to node 10.240.28.44@tcp failed: rc = -107
Lustre: lustre-MDT0000-osp-MDT0003: Connection to lustre-MDT0000 (at 10.240.28.44@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-MDT0000-lwp-MDT0001: Connection to lustre-MDT0000 (at 10.240.28.44@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.28.44@tcp: Connection to MGS (at 10.240.28.44@tcp) was lost; in progress operations using this service will fail
Lustre: lustre-MDT0001-osp-MDT0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.192@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 7 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 10 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 19 previous similar messages
Link to test
runtests test 1: All Runtests
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:85004]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill virtio_balloon i2c_piix4 intel_rapl_msr intel_rapl_common joydev pcspkr sunrpc drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul libata crc32c_intel virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: obdecho(OE)]
CPU: 0 PID: 85004 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.38.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 9c 11 59 c7 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffbccdc0eb38f8 EFLAGS: 00010282
RAX: ffffbccdc65b3008 RBX: 000000000000007a RCX: 000000000000000e
RDX: ffffbccdc658d000 RSI: ffffbccdc0eb3928 RDI: ffff9216f73efc00
RBP: ffff9216f73efc00 R08: 0000000000000018 R09: ffff9217c9913b58
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9216f73efc00 R14: ffffbccdc0eb39a0 R15: 0000000000000000
FS: 00007fec044a9540(0000) GS:ffff92177fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055b0621ba5d0 CR3: 000000000794e001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x160/0x740 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? kmem_cache_free+0x15/0x360
? do_sys_openat2+0x81/0xd0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? do_user_addr_fault+0x1d6/0x6a0
? syscall_exit_work+0x103/0x130
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fec0430e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark touching \/mnt\/lustre at Thu May 22 12:09:25 PM UTC 2025 \(@1747915765\)
Lustre: DEBUG MARKER: touching /mnt/lustre at Thu May 22 12:09:25 PM UTC 2025 (@1747915765)
Lustre: DEBUG MARKER: /usr/sbin/lctl mark create an empty file \/mnt\/lustre\/hosts.136152
Lustre: DEBUG MARKER: create an empty file /mnt/lustre/hosts.136152
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.136152
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.136152
Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing \/etc\/hosts and \/mnt\/lustre\/hosts.136152
Lustre: DEBUG MARKER: comparing /etc/hosts and /mnt/lustre/hosts.136152
Lustre: DEBUG MARKER: /usr/sbin/lctl mark renaming \/mnt\/lustre\/hosts.136152 to \/mnt\/lustre\/hosts.136152.ren
Lustre: DEBUG MARKER: renaming /mnt/lustre/hosts.136152 to /mnt/lustre/hosts.136152.ren
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.136152 again
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.136152 again
Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.136152
Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.136152
Lustre: DEBUG MARKER: /usr/sbin/lctl mark removing \/mnt\/lustre\/hosts.136152
Lustre: DEBUG MARKER: removing /mnt/lustre/hosts.136152
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.136152.2
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.136152.2
Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.136152.2 to 123 bytes
Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.136152.2 to 123 bytes
Lustre: DEBUG MARKER: /usr/sbin/lctl mark creating \/mnt\/lustre\/d1.runtests
Lustre: DEBUG MARKER: creating /mnt/lustre/d1.runtests
Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying 721 files from \/etc \/bin to \/mnt\/lustre\/d1.runtests\/etc \/bin at Thu May 22 12:09:34 PM UTC 2025
Lustre: DEBUG MARKER: copying 721 files from /etc /bin to /mnt/lustre/d1.runtests/etc /bin at Thu May 22 12:09:34 PM UTC 2025
Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing 721 newly copied files at Thu May 22 12:10:00 PM UTC 2025
Lustre: DEBUG MARKER: comparing 721 newly copied files at Thu May 22 12:10:00 PM UTC 2025
Lustre: DEBUG MARKER: /usr/sbin/lctl mark running createmany -d \/mnt\/lustre\/d1.runtests\/d 721
Lustre: DEBUG MARKER: running createmany -d /mnt/lustre/d1.runtests/d 721
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=0
Autotest: Test running for 65 minutes (lustre-b_es-reviews_review-dne-part-2_23487.30)
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=super+ioctl+neterror+warning+dlmtrace+error+emerg+ha+rpctrace+vfstrace+config+console+lfsck
Lustre: DEBUG MARKER: /usr/sbin/lctl mark finished at Thu May 22 12:11:20 PM UTC 2025 \(115\)
Lustre: DEBUG MARKER: finished at Thu May 22 12:11:20 PM UTC 2025 (115)
LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 10.240.28.44@tcp failed: rc = -107
LustreError: Skipped 12 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 18 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem 55872a69-c2e4-4b79-b765-a57fbe80deb3.
Lustre: server umount lustre-MDT0001 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
LustreError: lustre-MDT0001: not available for connect from 10.240.28.192@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 232 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
Link to test
sanity-flr test complete, duration 2056 sec
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [umount:1163139]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc virtio_balloon i2c_piix4 intel_rapl_msr intel_rapl_common pcspkr joydev drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net virtio_blk ghash_clmulni_intel net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 0 PID: 1163139 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 8c b8 d6 c3 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffbc3380ba7950 EFLAGS: 00010282
RAX: ffffbc3382de3008 RBX: 000000000000006d RCX: 000000000000000e
RDX: ffffbc3382dab000 RSI: ffffbc3380ba7980 RDI: ffffa0a870490a00
RBP: ffffa0a870490a00 R08: 0000000000000018 R09: ffffa0a9810d683f
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffffa0a870490a00 R14: ffffbc3380ba79f8 R15: 0000000000000000
FS: 00007f003f0ba540(0000) GS:ffffa0a8ffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005590ede58050 CR3: 000000003a18a005 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xdc/0x280 [obdclass]
? class_disconnect_exports+0x131/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x43b/0x7a0 [obdclass]
server_put_super+0x98f/0xb40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f003ef0df0b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-flr: start cleanup 22:50:20 \(1746831020\) ===
Lustre: DEBUG MARKER: === sanity-flr: start cleanup 22:50:20 (1746831020) ===
Lustre: 1108514:0:(osd_handler.c:2066:osd_trans_start()) lustre-MDT0000: credits 2472 > trans_max 2464
Lustre: 1108514:0:(osd_handler.c:1960:osd_trans_dump_creds()) create: 10/40/0, destroy: 1/4/0
Lustre: 1108514:0:(osd_handler.c:1960:osd_trans_dump_creds()) Skipped 17 previous similar messages
Lustre: 1108514:0:(osd_handler.c:1967:osd_trans_dump_creds()) attr_set: 125/125/0, xattr_set: 187/1856/0
Lustre: 1108514:0:(osd_handler.c:1967:osd_trans_dump_creds()) Skipped 17 previous similar messages
Lustre: 1108514:0:(osd_handler.c:1974:osd_trans_dump_creds()) write: 44/253/0, punch: 0/0/0, quota 0/0/0
Lustre: 1108514:0:(osd_handler.c:1974:osd_trans_dump_creds()) Skipped 17 previous similar messages
Lustre: 1108514:0:(osd_handler.c:1984:osd_trans_dump_creds()) insert: 11/186/0, delete: 2/5/0
Lustre: 1108514:0:(osd_handler.c:1984:osd_trans_dump_creds()) Skipped 17 previous similar messages
Lustre: 1108514:0:(osd_handler.c:1991:osd_trans_dump_creds()) ref_add: 1/1/0, ref_del: 2/2/0
Lustre: 1108514:0:(osd_handler.c:1991:osd_trans_dump_creds()) Skipped 17 previous similar messages
CPU: 0 PID: 1108514 Comm: mdt00_006 Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x48
osd_trans_start+0x67c/0x6d0 [osd_ldiskfs]
top_trans_start+0x275/0x4d0 [ptlrpc]
mdd_unlink+0x42e/0xee0 [mdd]
? lustre_msg_get_versions+0x23/0x100 [ptlrpc]
mdt_reint_unlink+0xb1e/0x13d0 [mdt]
mdt_reint_rec+0x11c/0x270 [mdt]
mdt_reint_internal+0x4ea/0x9b0 [mdt]
mdt_reint+0x59/0x110 [mdt]
tgt_handle_request0+0x14a/0x770 [ptlrpc]
tgt_request_handle+0x1eb/0xb80 [ptlrpc]
ptlrpc_server_handle_request.isra.0+0x2a0/0xce0 [ptlrpc]
ptlrpc_main+0xa7e/0xfa0 [ptlrpc]
? __pfx_ptlrpc_main+0x10/0x10 [ptlrpc]
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.mdt=none
Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-flr: finish cleanup 22:50:25 \(1746831025\) ===
Lustre: DEBUG MARKER: === sanity-flr: finish cleanup 22:50:25 (1746831025) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: sanity-lsnapshot ============----- Fri May 9 10:50:26 PM UTC 2025
Lustre: DEBUG MARKER: -----============= acceptance-small: sanity-lsnapshot ============----- Fri May 9 10:50:26 PM UTC 2025
Lustre: DEBUG MARKER: hostname -I
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/sanity-lsnapshot.*ex || true
Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/sanity-lsnapshot.*ex 2>/dev/null ||true
Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests:
Lustre: DEBUG MARKER: excepting tests:
Lustre: DEBUG MARKER: /usr/sbin/lctl mark SKIP: sanity-lsnapshot ZFS only test
Lustre: DEBUG MARKER: SKIP: sanity-lsnapshot ZFS only test
Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: mmp ============----- Fri May 9 10:50:33 PM UTC 2025
Lustre: DEBUG MARKER: -----============= acceptance-small: mmp ============----- Fri May 9 10:50:33 PM UTC 2025
Lustre: DEBUG MARKER: hostname -I
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/mmp.*ex || true
Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/mmp.*ex 2>/dev/null ||true
Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests:
Lustre: DEBUG MARKER: excepting tests:
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.50@tcp (stopping)
Lustre: Skipped 16 previous similar messages
LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107
LustreError: Skipped 8 previous similar messages
Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 5 previous similar messages
LustreError: 1162459:0:(obd_class.h:478:obd_check_dev()) Device 18 not setup
LustreError: 1162459:0:(obd_class.h:478:obd_check_dev()) Skipped 23 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem 3f7995a9-d779-4e80-8866-23fd689a0455.
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
LustreError: 947988:0:(ldlm_lib.c:1110:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.148@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: 947988:0:(ldlm_lib.c:1110:target_handle_connect()) Skipped 40 previous similar messages
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup remove /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: modprobe -r dm-flakey
LustreError: 949576:0:(ldlm_lockd.c:2546:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.50@tcp arrived at 1746831060 with bad export cookie 4797925813244203296
LustreError: 949576:0:(ldlm_lockd.c:2546:ldlm_cancel_handler()) Skipped 4 previous similar messages
Lustre: 10127:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746831055/real 1746831055] req@ffffa0a8781c7a80 x1831652589231232/t0(0) o400->MGC10.240.28.47@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1746831071 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 projid:4294967295
Lustre: 10127:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 1 previous similar message
LustreError: MGC10.240.28.47@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3
Link to test
conf-sanity test 61b: large xattr
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:622375]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) tls dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common sunrpc virtio_balloon pcspkr joydev i2c_piix4 drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net net_failover ghash_clmulni_intel virtio_blk failover serio_raw [last unloaded: libcfs(OE)]
CPU: 0 PID: 622375 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 3c 32 67 f8 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb5e0c6c337c8 EFLAGS: 00010282
RAX: ffffb5e0c5378008 RBX: 0000000000000071 RCX: 000000000000000e
RDX: ffffb5e0c534f000 RSI: ffffb5e0c6c337f8 RDI: ffff9b3d772dd800
RBP: ffff9b3d772dd800 R08: 0000000000000017 R09: ffff9b3e45f11ce6
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9b3d772dd800 R14: ffffb5e0c6c33870 R15: 0000000000000000
FS: 00007f0bc5637540(0000) GS:ffff9b3dffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc79b29988 CR3: 00000000314a4001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? cfs_hash_for_each_relax+0x154/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x193/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
class_manual_cleanup+0x1e5/0x740 [obdclass]
server_put_super+0x98f/0xb40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? do_syscall_64+0x6b/0xf0
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __do_sys_flock+0x134/0x1a0
? rseq_get_rseq_cs+0x1d/0x240
? rseq_ip_fixup+0x6e/0x1a0
? fpregs_restore_userregs+0x47/0xd0
? exit_to_user_mode_prepare+0xef/0x100
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? do_sys_openat2+0x81/0xd0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7f0bc550e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-82vm10.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-82vm10.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
LDISKFS-fs (dm-3): mounted filesystem 915264d7-46e2-4556-bb4f-17b60d85a4af r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-82vm10.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-82vm10.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4; mount -t lustre -o localrecov /dev/mapper/mds4_flakey /mnt/lustre-mds4
LDISKFS-fs (dm-4): mounted filesystem ffb46ee1-20cd-494b-a78c-5e9e3538b604 r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm8.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-67vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-67vm3.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-67vm5.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.26.225@tcp: Connection to MGS (at 10.240.26.225@tcp) was lost; in progress operations using this service will fail
LustreError: Skipped 5 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem 915264d7-46e2-4556-bb4f-17b60d85a4af.
Lustre: server umount lustre-MDT0001 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
Link to test
sanity-pfl test complete, duration 2574 sec
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [umount:512348]
Modules linked in: dm_flakey tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc virtio_balloon i2c_piix4 intel_rapl_msr intel_rapl_common joydev pcspkr fuse drm ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul libata crc32_pclmul crc32c_intel virtio_net virtio_blk ghash_clmulni_intel net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 0 PID: 512348 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 8c ea 34 e5 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb2fa88a2f990 EFLAGS: 00010282
RAX: ffffb2fa81f69008 RBX: 000000000000002f RCX: 000000000000000e
RDX: ffffb2fa81f39000 RSI: ffffb2fa88a2f9c0 RDI: ffff9a48bae48700
RBP: ffff9a48bae48700 R08: 0000000000000018 R09: ffff9a49aedf313d
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9a48bae48700 R14: ffffb2fa88a2fa38 R15: 0000000000000000
FS: 00007ff393878540(0000) GS:ffff9a493fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f413a208a8 CR3: 0000000026fd8003 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x131/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x43b/0x7a0 [obdclass]
server_put_super+0x98f/0xb40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? __rseq_handle_notify_resume+0x26/0xb0
? exit_to_user_mode_loop+0xd0/0x130
? exit_to_user_mode_prepare+0xb6/0x100
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7ff39370df0b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pfl: start cleanup 13:41:03 \(1745502063\) ===
Lustre: DEBUG MARKER: === sanity-pfl: start cleanup 13:41:03 (1745502063) ===
Lustre: 466863:0:(osd_handler.c:2067:osd_trans_start()) lustre-MDT0001: credits 30942 > trans_max 2464
Lustre: 466863:0:(osd_handler.c:1961:osd_trans_dump_creds()) create: 10/40/0, destroy: 1/4/0
Lustre: 466863:0:(osd_handler.c:1961:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 466863:0:(osd_handler.c:1968:osd_trans_dump_creds()) attr_set: 2023/2023/0, xattr_set: 3034/28428/0
Lustre: 466863:0:(osd_handler.c:1968:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 466863:0:(osd_handler.c:1975:osd_trans_dump_creds()) write: 44/253/0, punch: 0/0/0, quota 0/0/0
Lustre: 466863:0:(osd_handler.c:1975:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 466863:0:(osd_handler.c:1985:osd_trans_dump_creds()) insert: 11/186/0, delete: 2/5/0
Lustre: 466863:0:(osd_handler.c:1985:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 466863:0:(osd_handler.c:1992:osd_trans_dump_creds()) ref_add: 1/1/0, ref_del: 2/2/0
Lustre: 466863:0:(osd_handler.c:1992:osd_trans_dump_creds()) Skipped 9 previous similar messages
CPU: 0 PID: 466863 Comm: mdt00_009 Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x48
osd_trans_start+0x67c/0x6d0 [osd_ldiskfs]
top_trans_start+0x275/0x4d0 [ptlrpc]
mdd_unlink+0x42e/0xee0 [mdd]
? lustre_msg_get_versions+0x23/0x100 [ptlrpc]
mdt_reint_unlink+0xc14/0x1540 [mdt]
mdt_reint_rec+0x11c/0x270 [mdt]
mdt_reint_internal+0x4ea/0x9b0 [mdt]
mdt_reint+0x59/0x110 [mdt]
tgt_handle_request0+0x14a/0x770 [ptlrpc]
tgt_request_handle+0x1eb/0xb80 [ptlrpc]
ptlrpc_server_handle_request.isra.0+0x2a0/0xce0 [ptlrpc]
ptlrpc_main+0xa7e/0xfa0 [ptlrpc]
? __pfx_ptlrpc_main+0x10/0x10 [ptlrpc]
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pfl: finish cleanup 13:41:06 \(1745502066\) ===
Lustre: DEBUG MARKER: === sanity-pfl: finish cleanup 13:41:06 (1745502066) ===
Autotest: Test running for 370 minutes (lustre-reviews_review-dne-part-1_112737.20)
Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 10.240.28.44@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 9 previous similar messages
Lustre: 9005:0:(client.c:2340:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1745502072/real 1745502072] req@ffff9a48f20cba80 x1830268878699136/t0(0) o400->MGC10.240.28.44@tcp@10.240.28.44@tcp:26/25 lens 224/224 e 0 to 1 dl 1745502088 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
LustreError: MGC10.240.28.44@tcp: Connection to MGS (at 10.240.28.44@tcp) was lost; in progress operations using this service will fail
LustreError: Skipped 1 previous similar message
Lustre: Evicted from MGS (at 10.240.28.44@tcp) after server handle changed from 0x4c5d1be1639da8ba to 0x4c5d1be1639de5de
Lustre: MGC10.240.28.44@tcp: Connection restored to (at 10.240.28.44@tcp)
Lustre: Skipped 18 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm1.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm1.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-138vm3.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-138vm4.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-138vm3.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-138vm4.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-\*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-\*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.44@tcp (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.22.184@tcp (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.44@tcp (stopping)
Lustre: Skipped 8 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.44@tcp (stopping)
Lustre: Skipped 10 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 20 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 30 previous similar messages
Link to test
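
A second pattern recurs in the sanity-flr and sanity-pfl entries above: osd_trans_start() warns that the journal credits requested for one transaction exceed the ldiskfs journal limit ("credits 30942 > trans_max 2464" here, "credits 2472 > trans_max 2464" in the sanity-flr entry), then osd_trans_dump_creds() prints the per-operation breakdown. The field semantics of each "op: a/b/c" triple are not shown in the logs, but the middle figures sum exactly to the reported total, with xattr_set the dominant term in both dumps. The snippet below just reproduces that arithmetic from the logged numbers; the structure and names are assumptions for illustration, the real logic lives in osd_handler.c:

#include <stdio.h>

/* Middle figure of each "op: a/b/c" line, copied from the sanity-pfl
 * dump above; what each field means is assumed, not documented here. */
struct op_line {
	const char *name;
	unsigned credits;
};

int main(void)
{
	const struct op_line ops[] = {
		{ "create",       40 }, { "destroy",      4 },
		{ "attr_set",   2023 }, { "xattr_set", 28428 },
		{ "write",       253 }, { "punch",        0 },
		{ "quota",         0 }, { "insert",     186 },
		{ "delete",        5 }, { "ref_add",      1 },
		{ "ref_del",       2 },
	};
	const unsigned trans_max = 2464;   /* journal limit from the log */
	unsigned want = 0;

	for (unsigned i = 0; i < sizeof(ops) / sizeof(ops[0]); i++)
		want += ops[i].credits;

	/* Prints "credits 30942 > trans_max 2464", matching the log. */
	if (want > trans_max)
		printf("credits %u > trans_max %u\n", want, trans_max);
	return 0;
}

The same arithmetic on the sanity-flr dump (attr_set 125, xattr_set 1856, other terms unchanged) gives 2472, again matching its warning, so in both cases the xattr_set charge is what pushes the transaction past trans_max.
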
sanity-pfl test complete, duration 2422 sec
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:519349]
Modules linked in: dm_flakey tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common sunrpc virtio_balloon pcspkr i2c_piix4 joydev drm fuse ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel ata_piix ghash_clmulni_intel virtio_net libata virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 519349 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 8c 1a 8d de 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffba1dc7c87930 EFLAGS: 00010282
RAX: ffffba1dc33d5008 RBX: 000000000000007d RCX: 000000000000000e
RDX: ffffba1dc339d000 RSI: ffffba1dc7c87960 RDI: ffff9248c6d8ff00
RBP: ffff9248c6d8ff00 R08: 0000000000000018 R09: ffff924a2f7d214a
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9248c6d8ff00 R14: ffffba1dc7c879d8 R15: 0000000000000000
FS: 00007f58af6a9540(0000) GS:ffff92497fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000563d707b3300 CR3: 0000000034c7a005 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x131/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x43b/0x7a0 [obdclass]
server_put_super+0x98f/0xb40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f58af50df0b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pfl: start cleanup 13:09:31 \(1744981771\) ===
Lustre: DEBUG MARKER: === sanity-pfl: start cleanup 13:09:31 (1744981771) ===
Lustre: 473941:0:(osd_handler.c:2067:osd_trans_start()) lustre-MDT0001: credits 30942 > trans_max 2464
Lustre: 473941:0:(osd_handler.c:1961:osd_trans_dump_creds()) create: 10/40/0, destroy: 1/4/0
Lustre: 473941:0:(osd_handler.c:1961:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 473941:0:(osd_handler.c:1968:osd_trans_dump_creds()) attr_set: 2023/2023/0, xattr_set: 3034/28428/0
Lustre: 473941:0:(osd_handler.c:1968:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 473941:0:(osd_handler.c:1975:osd_trans_dump_creds()) write: 44/253/0, punch: 0/0/0, quota 0/0/0
Lustre: 473941:0:(osd_handler.c:1975:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 473941:0:(osd_handler.c:1985:osd_trans_dump_creds()) insert: 11/186/0, delete: 2/5/0
Lustre: 473941:0:(osd_handler.c:1985:osd_trans_dump_creds()) Skipped 9 previous similar messages
Lustre: 473941:0:(osd_handler.c:1992:osd_trans_dump_creds()) ref_add: 1/1/0, ref_del: 2/2/0
Lustre: 473941:0:(osd_handler.c:1992:osd_trans_dump_creds()) Skipped 9 previous similar messages
CPU: 0 PID: 473941 Comm: mdt00_016 Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x48
osd_trans_start+0x67c/0x6d0 [osd_ldiskfs]
top_trans_start+0x275/0x4d0 [ptlrpc]
mdd_unlink+0x42e/0xee0 [mdd]
? lustre_msg_get_versions+0x23/0x100 [ptlrpc]
mdt_reint_unlink+0xc14/0x1540 [mdt]
mdt_reint_rec+0x11c/0x270 [mdt]
mdt_reint_internal+0x4ea/0x9b0 [mdt]
mdt_reint+0x59/0x110 [mdt]
tgt_handle_request0+0x14a/0x770 [ptlrpc]
tgt_request_handle+0x1eb/0xb80 [ptlrpc]
ptlrpc_server_handle_request.isra.0+0x2a0/0xce0 [ptlrpc]
ptlrpc_main+0xa7e/0xfa0 [ptlrpc]
? __pfx_ptlrpc_main+0x10/0x10 [ptlrpc]
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pfl: finish cleanup 13:09:43 \(1744981783\) ===
Lustre: DEBUG MARKER: === sanity-pfl: finish cleanup 13:09:43 (1744981783) ===
Lustre: 8879:0:(client.c:2340:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1744981787/real 1744981787] req@ffff92497f7cacc0 x1829722534647552/t0(0) o400->MGC10.240.28.44@tcp@10.240.28.44@tcp:26/25 lens 224/224 e 0 to 1 dl 1744981803 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
LustreError: MGC10.240.28.44@tcp: Connection to MGS (at 10.240.28.44@tcp) was lost; in progress operations using this service will fail
LustreError: Skipped 1 previous similar message
Lustre: Evicted from MGS (at 10.240.28.44@tcp) after server handle changed from 0x6565b10c71bcf3a4 to 0x6565b10c71bd353d
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm1.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm1.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-45vm2.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-45vm1.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-45vm2.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-45vm1.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-\*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-\*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.44@tcp (stopping)
Lustre: Skipped 8 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.246@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 10 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 20 previous similar messages
Link to test
sanity-flr test complete, duration 1985 sec
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [umount:1161182]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common virtio_balloon i2c_piix4 joydev pcspkr sunrpc drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel virtio_net virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 0 PID: 1161182 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 3c c2 b3 f0 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb4a300a337e8 EFLAGS: 00010282
RAX: ffffb4a3041de008 RBX: 000000000000001a RCX: 000000000000000e
RDX: ffffb4a3041a7000 RSI: ffffb4a300a33818 RDI: ffff9933b7f2bf00
RBP: ffff9933b7f2bf00 R08: 0000000000000018 R09: ffff9934c666e784
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9933b7f2bf00 R14: ffffb4a300a33890 R15: 0000000000000000
FS: 00007f43ad80f540(0000) GS:ffff9933ffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f36d9e00000 CR3: 0000000074f92006 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? cfs_hash_for_each_relax+0x154/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x131/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
? class_manual_cleanup+0x160/0x740 [obdclass]
class_manual_cleanup+0x1e5/0x740 [obdclass]
server_put_super+0x98f/0xb40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? _copy_to_user+0x1a/0x30
? cp_new_stat+0x150/0x180
? __do_sys_newfstatat+0x35/0x60
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7f43ad70e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-flr: start cleanup 07:07:18 \(1744960038\) ===
Lustre: DEBUG MARKER: === sanity-flr: start cleanup 07:07:18 (1744960038) ===
Lustre: 948818:0:(osd_handler.c:2067:osd_trans_start()) lustre-MDT0000: credits 2742 > trans_max 2464
Lustre: 948818:0:(osd_handler.c:1961:osd_trans_dump_creds()) create: 10/40/0, destroy: 1/4/0
Lustre: 948818:0:(osd_handler.c:1968:osd_trans_dump_creds()) attr_set: 143/143/0, xattr_set: 214/2108/0
Lustre: 948818:0:(osd_handler.c:1975:osd_trans_dump_creds()) write: 44/253/0, punch: 0/0/0, quota 0/0/0
Lustre: 948818:0:(osd_handler.c:1985:osd_trans_dump_creds()) insert: 11/186/0, delete: 2/5/0
Lustre: 948818:0:(osd_handler.c:1992:osd_trans_dump_creds()) ref_add: 1/1/0, ref_del: 2/2/0
CPU: 0 PID: 948818 Comm: mdt00_000 Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x48
osd_trans_start+0x67c/0x6d0 [osd_ldiskfs]
top_trans_start+0x275/0x4d0 [ptlrpc]
mdd_unlink+0x42e/0xee0 [mdd]
? lustre_msg_get_versions+0x23/0x100 [ptlrpc]
mdt_reint_unlink+0xb44/0x1430 [mdt]
mdt_reint_rec+0x11c/0x270 [mdt]
mdt_reint_internal+0x4ea/0x9b0 [mdt]
mdt_reint+0x59/0x110 [mdt]
tgt_handle_request0+0x14a/0x770 [ptlrpc]
tgt_request_handle+0x1eb/0xb80 [ptlrpc]
ptlrpc_server_handle_request.isra.0+0x2a0/0xce0 [ptlrpc]
ptlrpc_main+0xa7b/0xfa0 [ptlrpc]
? __pfx_ptlrpc_main+0x10/0x10 [ptlrpc]
kthread+0xe0/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2c/0x50
</TASK>
Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.mdt=none
Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-flr: finish cleanup 07:07:46 \(1744960066\) ===
Lustre: DEBUG MARKER: === sanity-flr: finish cleanup 07:07:46 (1744960066) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: sanity-lsnapshot ============----- Fri Apr 18 07:07:48 AM UTC 2025
Lustre: DEBUG MARKER: -----============= acceptance-small: sanity-lsnapshot ============----- Fri Apr 18 07:07:48 AM UTC 2025
Lustre: DEBUG MARKER: hostname -I
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/sanity-lsnapshot.*ex || true
Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/sanity-lsnapshot.*ex 2>/dev/null ||true
Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests:
Lustre: DEBUG MARKER: excepting tests:
Lustre: DEBUG MARKER: /usr/sbin/lctl mark SKIP: sanity-lsnapshot ZFS only test
Lustre: DEBUG MARKER: SKIP: sanity-lsnapshot ZFS only test
Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: mmp ============----- Fri Apr 18 07:08:02 AM UTC 2025
Lustre: DEBUG MARKER: -----============= acceptance-small: mmp ============----- Fri Apr 18 07:08:02 AM UTC 2025
Lustre: DEBUG MARKER: hostname -I
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: cat /etc/system-release
Lustre: DEBUG MARKER: test -r /etc/os-release
Lustre: DEBUG MARKER: cat /etc/os-release
Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/mmp.*ex || true
Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/mmp.*ex 2>/dev/null ||true
Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests:
Lustre: DEBUG MARKER: excepting tests:
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: lustre-MDT0000: Not available for connect from 10.240.25.240@tcp (stopping)
Lustre: Skipped 18 previous similar messages
Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 11 previous similar messages
LustreError: 1160502:0:(obd_class.h:478:obd_check_dev()) Device 18 not setup
LustreError: 1160502:0:(obd_class.h:478:obd_check_dev()) Skipped 23 previous similar messages
LustreError: 1081935:0:(ldlm_lib.c:1110:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: 1081935:0:(ldlm_lib.c:1110:target_handle_connect()) Skipped 47 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem d6c14ed0-7346-418b-8287-d1878f35e4a0.
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup remove /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: modprobe -r dm-flakey
LustreError: 1074222:0:(ldlm_lockd.c:2545:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.47@tcp arrived at 1744960125 with bad export cookie 10453416559053787040
LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.28.47@tcp failed: rc = -107
LustreError: Skipped 12 previous similar messages
Lustre: 10108:0:(client.c:2340:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1744960113/real 1744960113] req@ffff9933d29a6080 x1829685425488128/t0(0) o400->MGC10.240.28.46@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1744960129 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 projid:4294967295
Lustre: 10108:0:(client.c:2340:ptlrpc_expire_one_request()) Skipped 1 previous similar message
LustreError: MGC10.240.28.46@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3
Link to test
conf-sanity test 153c: don't stuck on unreached NID
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:1242770]
Modules linked in: obdecho(OE) ptlrpc_gss(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) tls dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common pcspkr virtio_balloon i2c_piix4 joydev drm fuse ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul ata_piix crc32c_intel libata virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: libcfs(OE)]
CPU: 0 PID: 1242770 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.23.2_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 bc 95 e9 d0 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb2ad0add3860 EFLAGS: 00010282
RAX: ffffb2ad03f50008 RBX: 0000000000000056 RCX: 000000000000000e
RDX: ffffb2ad03f23000 RSI: ffffb2ad0add3890 RDI: ffff9e16aefb4d00
RBP: ffff9e16aefb4d00 R08: 0000000000000017 R09: ffff9e17af416714
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9e16aefb4d00 R14: ffffb2ad0add3908 R15: 0000000000000000
FS: 00007f80e1652540(0000) GS:ffff9e173fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056116df15050 CR3: 00000000324aa003 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Lustre: 506115:0:(client.c:2340:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1744743845/real 1744743845] req@ffff9e173f77e000 x1829476772457984/t0(0) o41->lustre-MDT0002-osp-MDT0003@10.240.28.46@tcp:24/4 lens 224/368 e 0 to 1 dl 1744743861 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-2-3.0' uid:0 gid:0
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
Lustre: 506115:0:(client.c:2340:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1744743848/real 1744743848] req@ffff9e16b09889c0 x1829476772458368/t0(0) o400->lustre-MDT0002-osp-MDT0003@10.240.28.46@tcp:24/4 lens 224/224 e 0 to 1 dl 1744743864 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x193/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
class_manual_cleanup+0x1e5/0x740 [obdclass]
server_put_super+0x998/0xb30 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
Lustre: 506114:0:(client.c:2340:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1744743850/real 1744743850] req@ffff9e16b0989040 x1829476772458752/t0(0) o400->lustre-MDT0002-osp-MDT0003@10.240.28.46@tcp:24/4 lens 224/224 e 0 to 1 dl 1744743866 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? locks_lock_inode_wait+0x6d/0x180
? __do_sys_flock+0x134/0x1a0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7f80e150e06b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LDISKFS-fs (dm-3): unmounting filesystem 2d056a49-44b5-47f1-b5d8-16e296b6038a.
Lustre: server umount lustre-MDT0001 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Autotest: Test running for 555 minutes (lustre-reviews_review-dne-part-3_112515.31)
Lustre: lustre-MDT0003: haven't heard from client 396f8aa0-359a-4f4a-8962-14b0788699d2 (at 10.240.25.156@tcp) in 33 seconds. I think it's dead, and I am evicting it. exp ffff9e16b1980400, cur 1744743609 expire 1744743579 last 1744743576
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
LDISKFS-fs (dm-4): unmounting filesystem da83b8f6-f56b-4b48-8f6f-2073ba5ea2e4.
Lustre: server umount lustre-MDT0003 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm6.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-66vm6.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: [ -e /dev/mapper/mds2_flakey ]
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: mkfs.lustre --mgsnode=10.240.28.46@tcp --fsname=lustre --mdt --index=1 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions="-b 4096 -E lazy_itable_init" --reformat /dev/mapper/
LDISKFS-fs (dm-3): mounted filesystem 1acad579-b142-4e72-aee2-afb91c9574e3 r/w with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-3): unmounting filesystem 1acad579-b142-4e72-aee2-afb91c9574e3.
Lustre: DEBUG MARKER: [ -e /dev/mapper/mds4_flakey ]
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: mkfs.lustre --mgsnode=10.240.28.46@tcp --fsname=lustre --mdt --index=3 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions="-b 4096 -E lazy_itable_init" --reformat /dev/mapper/
LDISKFS-fs (dm-4): mounted filesystem af45a0dd-4ac5-4333-83df-b569b439bdc7 r/w with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-4): unmounting filesystem af45a0dd-4ac5-4333-83df-b569b439bdc7.
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
LDISKFS-fs (dm-3): mounted filesystem 1acad579-b142-4e72-aee2-afb91c9574e3 r/w with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-3): unmounting filesystem 1acad579-b142-4e72-aee2-afb91c9574e3.
LDISKFS-fs (dm-3): mounted filesystem 1acad579-b142-4e72-aee2-afb91c9574e3 r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: sync; sleep 1; sync
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4; mount -t lustre -o localrecov /dev/mapper/mds4_flakey /mnt/lustre-mds4
LDISKFS-fs (dm-4): mounted filesystem af45a0dd-4ac5-4333-83df-b569b439bdc7 r/w with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-4): unmounting filesystem af45a0dd-4ac5-4333-83df-b569b439bdc7.
LDISKFS-fs (dm-4): mounted filesystem af45a0dd-4ac5-4333-83df-b569b439bdc7 r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: sync; sleep 1; sync
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-66vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-66vm4.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-66vm5.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0002.ost_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0002.ost_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark os[cp].lustre-OST0000-osc-MDT0002.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0002.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0003.ost_server_uuid in FULL state after 0 sec
LustreError: lustre-OST0000-osc-MDT0003: operation ost_statfs to node 10.240.25.157@tcp failed: rc = -107
LustreError: Skipped 14 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Link to test
sanity-pcc test 1c: Test automated attach using Project ID with manual HSM restore
watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [umount:235929]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common sunrpc virtio_balloon i2c_piix4 pcspkr joydev fuse drm ext4 mbcache jbd2 crct10dif_pclmul ata_generic crc32_pclmul crc32c_intel virtio_net ata_piix ghash_clmulni_intel libata virtio_blk net_failover failover serio_raw
CPU: 1 PID: 235929 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.23.2_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 9c c1 21 c9 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb5bf8a8337d8 EFLAGS: 00010282
RAX: ffffb5bf81c17008 RBX: 0000000000000039 RCX: 000000000000000e
RDX: ffffb5bf81c13000 RSI: ffffb5bf8a833808 RDI: ffff8ade85380d00
RBP: ffff8ade85380d00 R08: 0000000000000018 R09: ffff8adfb719da7c
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8ade85380d00 R14: ffffb5bf8a833880 R15: 0000000000000000
FS: 00007fa2c6669540(0000) GS:ffff8adf3fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb3154f5008 CR3: 000000003864e005 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
Lustre: lustre-MDT0001: Not available for connect from 10.240.27.220@tcp (stopping)
Lustre: Skipped 30 previous similar messages
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x160/0x740 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? do_sys_openat2+0x81/0xd0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? locks_lock_inode_wait+0x6d/0x180
? __do_sys_newfstat+0x57/0x60
? __do_sys_flock+0x134/0x1a0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fa2c650e06b
LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 10.240.27.220@tcp failed: rc = -107
Lustre: lustre-MDT0000-osp-MDT0003: Connection to lustre-MDT0000 (at 10.240.27.220@tcp) was lost; in progress operations using this service will wait for recovery to complete
LustreError: Skipped 1 previous similar message
Lustre: Skipped 3 previous similar messages
Lustre: lustre-MDT0000-lwp-MDT0003: Connection to lustre-MDT0000 (at 10.240.27.220@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.27.220@tcp: Connection to MGS (at 10.240.27.220@tcp) was lost; in progress operations using this service will fail
Lustre: lustre-MDT0001: Not available for connect from 10.240.27.220@tcp (stopping)
LustreError: lustre-MDT0001-osp-MDT0003: operation mds_statfs to node 0@lo failed: rc = -107
Lustre: lustre-MDT0001-osp-MDT0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.27.220@tcp (stopping)
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 9 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 9 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 19 previous similar messages
Link to test
replay-single test 112d: DNE: cross MDT rename, fail MDT4
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [umount:188748]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i2c_piix4 virtio_balloon pcspkr joydev fuse drm ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw
CPU: 0 PID: 188748 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 2c 6d e0 d7 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffff9ae141c379c0 EFLAGS: 00010282
RAX: ffff9ae1458f2008 RBX: 0000000000000062 RCX: 000000000000000e
RDX: ffff9ae1458d7000 RSI: ffff9ae141c379f0 RDI: ffff8f21c11a7100
RBP: ffff8f21c11a7100 R08: 0000000000000018 R09: ffff8f22c1ed8f34
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8f21c11a7100 R14: ffff9ae141c37a68 R15: 0000000000000000
FS: 00007f95c342d540(0000) GS:ffff8f223fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000557bc6fa2050 CR3: 0000000003c4c002 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x131/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x43b/0x7a0 [obdclass]
server_put_super+0x998/0xb30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f95c330df0b
Lustre: DEBUG MARKER: sync; sync; sync
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0003 notransno
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: dmsetup suspend --nolockfs --noflush /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: dmsetup load /dev/mapper/mds4_flakey --table "0 3964928 flakey 252:1 0 0 1800 1 drop_writes"
Lustre: DEBUG MARKER: dmsetup resume /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds4 REPLAY BARRIER on lustre-MDT0003
Lustre: DEBUG MARKER: mds4 REPLAY BARRIER on lustre-MDT0003
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds4
Lustre: Failing over lustre-MDT0003
Link to test
replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:33985]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i2c_piix4 joydev pcspkr virtio_balloon drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel net_failover virtio_blk failover serio_raw [last unloaded: obdecho(OE)]
CPU: 0 PID: 33985 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.26.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 bc b5 70 dd 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffad100926b740 EFLAGS: 00010282
RAX: ffffad10016a2008 RBX: 0000000000000028 RCX: 000000000000000e
RDX: ffffad100167b000 RSI: ffffad100926b770 RDI: ffff917f44d67b00
RBP: ffff917f44d67b00 R08: 0000000000000017 R09: ffff918072e9df43
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff917f44d67b00 R14: ffffad100926b7e8 R15: 0000000000000000
FS: 00007fef58037540(0000) GS:ffff917fffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055716ffe4850 CR3: 000000002ec18003 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x169/0x2a8
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [obdclass]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [obdclass]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5b/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x193/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
class_manual_cleanup+0x1e5/0x740 [obdclass]
server_put_super+0x998/0xb30 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? locks_remove_flock+0xe6/0xf0
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __pfx_file_free_rcu+0x10/0x10
? __call_rcu_common.constprop.0+0x117/0x2b0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? do_sys_openat2+0x81/0xd0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? do_syscall_64+0x6b/0xf0
? do_user_addr_fault+0x1d6/0x6a0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fef57f0e06b
Lustre: DEBUG MARKER: lctl set_param fail_loc=0x119
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: *** cfs_fail_loc=119, val=2147483648***
LustreError: 9055:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff917f73eeed80 x1827215334550784/t4295271465(0) o36->684c06c1-ab93-4d5e-94fd-88b75153a897@10.240.24.69@tcp:578/0 lens 560/448 e 0 to 0 dl 1742570778 ref 1 fl Interpret:/200/0 rc 0/0 job:'lfs.0' uid:0 gid:0
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds2
Lustre: Failing over lustre-MDT0001
LustreError: lustre-MDT0001-osp-MDT0003: operation mds_statfs to node 0@lo failed: rc = -107
LustreError: Skipped 4 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.70@tcp (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.71@tcp (stopping)
Lustre: Skipped 4 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.70@tcp (stopping)
Lustre: Skipped 10 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 13 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.71@tcp (stopping)
Lustre: Skipped 20 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 54 previous similar messages
Link to test
sanity-pcc test complete, duration 5523 sec
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:516693]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc virtio_balloon i2c_piix4 intel_rapl_msr intel_rapl_common joydev pcspkr drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul virtio_net crc32_pclmul crc32c_intel net_failover virtio_blk ghash_clmulni_intel failover serio_raw
CPU: 0 PID: 516693 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.23.2_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 9c b1 0f e4 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffc33a49a3f840 EFLAGS: 00010282
RAX: ffffc33a4595d008 RBX: 000000000000003e RCX: 000000000000000e
RDX: ffffc33a45955000 RSI: ffffc33a49a3f870 RDI: ffffa014416b3500
RBP: ffffa014416b3500 R08: 0000000000000018 R09: ffffa0154195dcd0
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffffa014416b3500 R14: ffffc33a49a3f8e8 R15: 0000000000000000
FS: 00007fe750c13540(0000) GS:ffffa014bfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005644614f31e8 CR3: 0000000032274001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __do_sys_flock+0x134/0x1a0
? syscall_exit_work+0x103/0x130
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __do_softirq+0x169/0x2a8
? __irq_exit_rcu+0x46/0xc0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fe750b0e06b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pcc: start cleanup 20:51:23 \(1742417483\) ===
Lustre: DEBUG MARKER: === sanity-pcc: start cleanup 20:51:23 (1742417483) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pcc: finish cleanup 20:51:39 \(1742417499\) ===
Lustre: DEBUG MARKER: === sanity-pcc: finish cleanup 20:51:39 (1742417499) ===
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107
LustreError: Skipped 1 previous similar message
Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 4 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 36 previous similar messages
Link to test
sanity-flr test complete, duration 1831 sec
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:135095]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i2c_piix4 virtio_balloon pcspkr joydev fuse drm ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_net virtio_blk net_failover failover serio_raw
CPU: 1 PID: 135095 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.ddn1.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 dc b9 f0 f3 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb2a3839179b8 EFLAGS: 00010282
RAX: ffffb2a382c73008 RBX: 0000000000000078 RCX: 000000000000000e
RDX: ffffb2a382c5d000 RSI: ffffb2a3839179e8 RDI: ffff89e683d6fd00
RBP: ffff89e683d6fd00 R08: 0000000000000018 R09: ffff89e7a77b9736
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff89e683d6fd00 R14: ffffb2a383917a60 R15: 0000000000000000
FS: 00007f79c364e540(0000) GS:ffff89e73fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556508475e10 CR3: 000000000fc54004 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f79c350df0b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-flr: start cleanup 06:45:26 \(1742280326\) ===
Lustre: DEBUG MARKER: === sanity-flr: start cleanup 06:45:26 (1742280326) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-flr: finish cleanup 06:45:27 \(1742280327\) ===
Lustre: DEBUG MARKER: === sanity-flr: finish cleanup 06:45:27 (1742280327) ===
Autotest: Test running for 35 minutes (lustre-b_es-reviews_review-dne-subtest-change_22558.95)
Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 10.240.28.45@tcp) was lost; in progress operations using this service will wait for recovery to complete
LustreError: lustre-MDT0000-osp-MDT0003: operation mds_statfs to node 10.240.28.45@tcp failed: rc = -107
Lustre: lustre-MDT0000-lwp-MDT0001: Connection to lustre-MDT0000 (at 10.240.28.45@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 2 previous similar messages
Lustre: 8899:0:(client.c:2360:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1742280373/real 1742280373] req@ffff89e683c8ea40 x1826911267724288/t0(0) o400->MGC10.240.28.45@tcp@10.240.28.45@tcp:26/25 lens 224/224 e 0 to 1 dl 1742280389 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
LustreError: MGC10.240.28.45@tcp: Connection to MGS (at 10.240.28.45@tcp) was lost; in progress operations using this service will fail
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.45@tcp (stopping)
Lustre: lustre-MDT0001-osp-MDT0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.45@tcp (stopping)
Lustre: Skipped 9 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.45@tcp (stopping)
Lustre: Skipped 9 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.45@tcp (stopping)
Lustre: Skipped 9 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.45@tcp (stopping)
Lustre: Skipped 9 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.29.134@tcp (stopping)
Lustre: Skipped 19 previous similar messages
replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:34080]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i2c_piix4 pcspkr joydev virtio_balloon drm fuse ext4 mbcache jbd2 ata_generic crct10dif_pclmul ata_piix crc32_pclmul crc32c_intel libata virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: obdecho]
CPU: 1 PID: 34080 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 fc 39 f5 e8 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa7f209327940 EFLAGS: 00010282
RAX: ffffa7f2015bb008 RBX: 0000000000000022 RCX: 000000000000000e
RDX: ffffa7f201591000 RSI: ffffa7f209327970 RDI: ffff947582eab000
RBP: ffff947582eab000 R08: 0000000000000017 R09: ffff9476b3c71932
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff947582eab000 R14: ffffa7f2093279e8 R15: 0000000000000000
FS: 00007f70902d5540(0000) GS:ffff94763fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000562f96d7eba0 CR3: 0000000033ef6002 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0xf3/0x220 [obdclass]
? class_disconnect_exports+0x193/0x300 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x1102/0x1ab0 [obdclass]
class_manual_cleanup+0x43b/0x7a0 [obdclass]
server_put_super+0x998/0xb30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f709010df0b
Lustre: DEBUG MARKER: lctl set_param fail_loc=0x119
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds2
Lustre: Failing over lustre-MDT0001
Lustre: lustre-MDT0001: Not available for connect from 10.240.25.161@tcp (stopping)
LustreError: lustre-MDT0001-osp-MDT0003: operation mds_statfs to node 0@lo failed: rc = -19
LustreError: Skipped 5 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.244@tcp (stopping)
Lustre: Skipped 10 previous similar messages
LustreError: 34080:0:(ldlm_resource.c:1146:ldlm_resource_complain()) lustre-MDT0000-osp-MDT0001: namespace resource [0x2000013a1:0x72:0x0].0xf7117594 (ffff947590263180) refcount nonzero (1) after lock cleanup; forcing cleanup.
Lustre: lustre-MDT0001: Not available for connect from 10.240.25.161@tcp (stopping)
Lustre: Skipped 1 previous similar message
LustreError: 11829:0:(llog_cat.c:589:llog_cat_add_rec()) llog_write_rec -5: lh=ffff9475d34c7600
LustreError: 11829:0:(update_trans.c:1038:top_trans_stop()) lustre-MDT0000-osp-MDT0001: write updates failed: rc = -5
LustreError: 11829:0:(update_trans.c:1061:top_trans_stop()) lustre-MDT0000-osp-MDT0001: stop trans failed: rc = -5
Lustre: *** cfs_fail_loc=119, val=2147483648***
LustreError: 11829:0:(ldlm_lib.c:3250:target_send_reply_msg()) @@@ dropping reply req@ffff9475b3d00680 x1824534698248960/t4295271463(0) o36->d91c4fee-5267-4e26-a0af-1b3151df20b2@10.240.23.244@tcp:587/0 lens 560/448 e 1 to 0 dl 1740014357 ref 1 fl Interpret:/200/0 rc -5/0 job:'lfs.0' uid:0 gid:0
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 12 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 14 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 30 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.25.161@tcp (stopping)
Lustre: Skipped 47 previous similar messages
sanity-sec test 27ab: test nodemap idmap offset
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [umount:255607]
Modules linked in: dm_flakey osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc virtio_balloon intel_rapl_msr intel_rapl_common i2c_piix4 joydev pcspkr drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 0 PID: 255607 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 dc 69 9e e0 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffff978340a6b970 EFLAGS: 00010282
RAX: ffff9783447e7008 RBX: 0000000000000056 RCX: 000000000000000e
RDX: ffff9783447b9000 RSI: ffff978340a6b9a0 RDI: ffff8a066f4dd700
RBP: ffff8a066f4dd700 R08: 0000000000000018 R09: ffff8a078343d2f3
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8a066f4dd700 R14: ffff978340a6ba18 R15: 0000000000000000
FS: 00007f426fc5c540(0000) GS:ffff8a06ffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005585a3a12050 CR3: 00000000368fe001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? kvm_sched_clock_read+0xd/0x20
? sched_clock+0xc/0x30
? sched_clock_cpu+0x9/0xc0
? irqtime_account_irq+0x3c/0xb0
? __do_softirq+0x16a/0x2ac
? __irq_exit_rcu+0x46/0xc0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f426fb0df0b
Lustre: lustre-MDT0000-lwp-MDT0001: Connection to lustre-MDT0000 (at 10.240.28.45@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 5 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.28.45@tcp: Connection to MGS (at 10.240.28.45@tcp) was lost; in progress operations using this service will fail
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 16 previous similar messages
conf-sanity test 24a: Multiple MDTs on a single node
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:233621]
Modules linked in: dm_flakey osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) mlx5_core(OE) mlxdevm(OE) ib_uverbs(OE) ib_core(OE) psample mlxfw(OE) mlx_compat(OE) macsec tls pci_hyperv_intf intel_rapl_msr intel_rapl_common sunrpc i2c_piix4 virtio_balloon pcspkr joydev drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 233621 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.21.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 9c 41 4f f6 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffab7e08d53678 EFLAGS: 00010282
RAX: ffffab7e06890008 RBX: 000000000000002b RCX: 000000000000000e
RDX: ffffab7e0687f000 RSI: ffffab7e08d536a8 RDI: ffff8b30036b6500
RBP: ffff8b30036b6500 R08: 0000000000000017 R09: ffff8b314140ff31
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8b30036b6500 R14: ffffab7e08d53720 R15: 0000000000000000
FS: 00007fe124074540(0000) GS:ffff8b30bfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005646cc9f6050 CR3: 0000000031dd4001 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1ad/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x112/0x2b0
? hrtimer_interrupt+0xfc/0x210
? kvm_sched_clock_read+0xd/0x20
? __sysvec_apic_timer_interrupt+0x4e/0x100
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
class_manual_cleanup+0x1e3/0x740 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0xf0
kill_anon_super+0x12/0x40
deactivate_locked_super+0x31/0xb0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x12b/0x130
exit_to_user_mode_prepare+0xb9/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x6b/0xf0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? get_page_from_freelist+0x441/0x650
? release_pages+0x17a/0x4d0
? xas_load+0x9/0xa0
? xa_load+0x70/0xb0
? _raw_spin_unlock+0xa/0x30
? list_lru_add+0xcb/0x120
? rcu_nocb_try_bypass+0x5e/0x460
? __pfx_file_free_rcu+0x10/0x10
? __call_rcu_common.constprop.0+0x117/0x2b0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x6b/0xf0
? __count_memcg_events+0x4f/0xb0
? mm_account_fault+0x6c/0x100
? handle_mm_fault+0x116/0x270
? do_user_addr_fault+0x1d6/0x6a0
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x78/0x80
RIP: 0033:0x7fe123f0e06b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-fs2mds' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/fs2mds_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: mkfs.lustre --mgsnode=10.240.28.48@tcp --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions="-b 4096 -E lazy_itable_init" --nomgs --mgsnode=10.24
LDISKFS-fs (dm-1): mounted filesystem 9183023c-e7e4-40ac-9c92-ce678bac39e1 r/w with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-1): unmounting filesystem 9183023c-e7e4-40ac-9c92-ce678bac39e1.
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
LDISKFS-fs (dm-3): mounted filesystem b78033aa-dbe3-40b1-a0c9-5d0fbe4528b4 r/w with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-105vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-105vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-105vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-105vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-105vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-105vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-105vm7.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-105vm7.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-105vm8.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-105vm8.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-fs2mds
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/fs2mds_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: test -b /dev/vg_Role_MDS/scratch1
Lustre: DEBUG MARKER: blockdev --getsz /dev/vg_Role_MDS/scratch1 2>/dev/null
Lustre: DEBUG MARKER: dmsetup create fs2mds_flakey --table "0 212992 linear /dev/vg_Role_MDS/scratch1 0"
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/fs2mds_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/fs2mds_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-fs2mds; mount -t lustre -o localrecov /dev/mapper/fs2mds_flakey /mnt/lustre-fs2mds
LDISKFS-fs (dm-4): mounted filesystem 9183023c-e7e4-40ac-9c92-ce678bac39e1 r/w with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-4): unmounting filesystem 9183023c-e7e4-40ac-9c92-ce678bac39e1.
LDISKFS-fs (dm-4): mounted filesystem 9183023c-e7e4-40ac-9c92-ce678bac39e1 r/w with ordered data mode. Quota mode: journalled.
Lustre: Setting parameter 969362ae-MDT0000.mdt.identity_upcall in log 969362ae-MDT0000
Lustre: 232085:0:(mgc_request_server.c:552:mgc_llog_local_copy()) MGC10.240.28.48@tcp: no remote llog for 969362ae-sptlrpc, check MGS config
Lustre: ctl-969362ae-MDT0000: No data found on store. Initialize space.
Lustre: Skipped 1 previous similar message
Autotest: Test running for 170 minutes (lustre-b_es7_0_full-part-3_52.31)
Lustre: 969362ae-MDT0000: new disk, initializing
Lustre: ctl-969362ae-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt
Lustre: Skipped 1 previous similar message
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm5.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/fs2mds_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: sync; sleep 1; sync
Lustre: DEBUG MARKER: e2label /dev/mapper/fs2mds_flakey 2>/dev/null
Lustre: 969362ae-OST0000-osc-MDT0000: update sequence from 0x100000000 to 0x240000400
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-105vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-105vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: lctl --device 969362ae-MDT0000 changelog_register -n
Lustre: 969362ae-MDD0000: changelog on
Lustre: DEBUG MARKER: lctl --device 969362ae-MDT0000 changelog_deregister cl1
Lustre: 969362ae-MDD0000: changelog off
Lustre: DEBUG MARKER: grep -c /mnt/lustre-fs2mds' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-fs2mds
Lustre: Failing over 969362ae-MDT0000
Lustre: 969362ae-MDT0000: Not available for connect from 10.240.28.172@tcp (stopping)
Lustre: 969362ae-MDT0000: Not available for connect from 10.240.28.170@tcp (stopping)
Lustre: 969362ae-MDT0000: Not available for connect from 10.240.28.172@tcp (stopping)
Lustre: 969362ae-MDT0000: Not available for connect from 10.240.28.172@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: 969362ae-MDT0000: Not available for connect from 10.240.28.172@tcp (stopping)
Lustre: Skipped 3 previous similar messages
replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply
watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [umount:34128]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common sunrpc virtio_balloon joydev i2c_piix4 pcspkr fuse drm ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: obdecho]
CPU: 1 PID: 34128 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 8c 29 a2 ea 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb3878704b9b8 EFLAGS: 00010282
RAX: ffffb3878140b008 RBX: 000000000000001f RCX: 000000000000000e
RDX: ffffb38781403000 RSI: ffffb3878704b9e8 RDI: ffff934cf5993b00
RBP: ffff934cf5993b00 R08: 0000000000000017 R09: ffff934df71e0ab0
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff934cf5993b00 R14: ffffb3878704ba60 R15: 0000000000000000
FS: 00007f43be259540(0000) GS:ffff934d7fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f512deeb2f8 CR3: 0000000036b58001 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __sysvec_apic_timer_interrupt+0x5f/0x110
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.111@tcp (stopping)
Lustre: Skipped 46 previous similar messages
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7dd/0xa30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? __irq_exit_rcu+0x46/0xc0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f43be10df0b
Lustre: DEBUG MARKER: lctl set_param fail_loc=0x119
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: *** cfs_fail_loc=119, val=2147483648***
LustreError: 28073:0:(ldlm_lib.c:3250:target_send_reply_msg()) @@@ dropping reply req@ffff934cef3cb0c0 x1818988883580544/t4295271435(0) o36->34a23642-94e5-49ed-a733-c880a663b0c3@10.240.24.131@tcp:552/0 lens 560/448 e 0 to 0 dl 1734725547 ref 1 fl Interpret:/200/0 rc 0/0 job:'lfs.0' uid:0 gid:0
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds2
Lustre: Failing over lustre-MDT0001
Lustre: lustre-MDT0001: Not available for connect from 10.240.27.52@tcp (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.131@tcp (stopping)
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.27.52@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.132@tcp (stopping)
Lustre: Skipped 10 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.24.111@tcp (stopping)
Lustre: Skipped 6 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.27.52@tcp (stopping)
Lustre: Skipped 27 previous similar messages
sanity-pfl test complete, duration 2182 sec
watchdog: BUG: soft lockup - CPU#1 stuck for 21s! [umount:817218]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill i2c_piix4 virtio_balloon sunrpc intel_rapl_msr intel_rapl_common joydev pcspkr drm fuse ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel ata_piix libata virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 817218 Comm: umount Kdump: loaded Tainted: G W OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 8c 59 ab d1 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa88300b0b9c0 EFLAGS: 00010282
RAX: ffffa883063be008 RBX: 000000000000004c RCX: 000000000000000e
RDX: ffffa88306391000 RSI: ffffa88300b0b9f0 RDI: ffff93c9c0163600
RBP: ffff93c9c0163600 R08: 0000000000000018 R09: ffff93cabbf15cf3
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff93c9c0163600 R14: ffffa88300b0ba68 R15: 0000000000000000
FS: 00007fc88d66d540(0000) GS:ffff93ca3fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000557c950ae1d0 CR3: 0000000033a46005 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7dd/0xa30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? _raw_spin_unlock_irq+0xa/0x30
? sigprocmask+0xb4/0xe0
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7fc88d50df0b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pfl: start cleanup 07:05:52 \(1734419152\) ===
Lustre: DEBUG MARKER: === sanity-pfl: start cleanup 07:05:52 (1734419152) ===
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanity-pfl: finish cleanup 07:05:54 \(1734419154\) ===
Lustre: DEBUG MARKER: === sanity-pfl: finish cleanup 07:05:54 (1734419154) ===
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
Lustre: lustre-MDT0000: Not available for connect from 10.240.22.164@tcp (stopping)
LustreError: 700219:0:(ldlm_lockd.c:2571:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.48@tcp arrived at 1734419159 with bad export cookie 4887197198536710475
LustreError: 700219:0:(ldlm_lockd.c:2571:ldlm_cancel_handler()) Skipped 6 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.48@tcp (stopping)
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 6 previous similar messages
Lustre: server umount lustre-MDT0000 complete
LDISKFS-fs (dm-3): unmounting filesystem 07287f07-9ecd-4ae0-8c33-880407296b4c.
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup remove /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: modprobe -r dm-flakey
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: test -b /dev/vg_Role_MDS/mdt1
Lustre: DEBUG MARKER: blockdev --getsz /dev/vg_Role_MDS/mdt1 2>/dev/null
Lustre: DEBUG MARKER: dmsetup create mds1_flakey --table "0 3964928 linear /dev/vg_Role_MDS/mdt1 0"
Lustre: 10274:0:(client.c:2358:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1734419165/real 1734419165] req@ffff93c9b36b8000 x1818649728277120/t0(0) o400->MGC10.240.28.47@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1734419181 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
Lustre: 10274:0:(client.c:2358:ptlrpc_expire_one_request()) Skipped 1 previous similar message
LustreError: MGC10.240.28.47@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
LustreError: Skipped 1 previous similar message
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
LDISKFS-fs (dm-3): mounted filesystem 07287f07-9ecd-4ae0-8c33-880407296b4c r/w with ordered data mode. Quota mode: journalled.
LustreError: 10272:0:(client.c:1310:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff93c9bddcb400 x1818649728293376/t0(0) o250->MGC10.240.28.47@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect
Lustre: Skipped 1 previous similar message
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 5 clients reconnect
Lustre: Skipped 1 previous similar message
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: lustre-MDT0000: Recovery over after 0:04, of 5 clients 5 recovered and 0 were evicted.
Lustre: Skipped 1 previous similar message
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-137vm4.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-137vm4.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-137vm5.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-137vm5.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-\*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0000-mdc-\*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 14 previous similar messages
Lustre: server umount lustre-MDT0000 complete
LDISKFS-fs (dm-3): unmounting filesystem 07287f07-9ecd-4ae0-8c33-880407296b4c.
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup remove /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup mknodes >/dev/null 2>&1
Lustre: DEBUG MARKER: modprobe -r dm-flakey
LustreError: 699331:0:(ldlm_lockd.c:2571:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.48@tcp arrived at 1734419217 with bad export cookie 4887197198536727247
Lustre: 10274:0:(client.c:2358:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1734419214/real 1734419214] req@ffff93c9e53016c0 x1818649728758400/t0(0) o400->MGC10.240.28.47@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1734419230 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3
Lustre: lustre-MDT0002: Not available for connect from 10.240.22.165@tcp (stopping)
Lustre: Skipped 12 previous similar messages
sanityn test complete, duration 8496 sec
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:550263]
Modules linked in: dm_flakey tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common virtio_balloon i2c_piix4 pcspkr joydev drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net net_failover ghash_clmulni_intel virtio_blk failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 550263 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 8c 39 4c c7 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffab1f40d7f950 EFLAGS: 00010282
RAX: ffffab1f42edb008 RBX: 0000000000000033 RCX: 000000000000000e
RDX: ffffab1f42eb7000 RSI: ffffab1f40d7f980 RDI: ffff9ef38047ba00
RBP: ffff9ef38047ba00 R08: 0000000000000018 R09: ffff9ef4774cceab
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9ef38047ba00 R14: ffffab1f40d7f9f8 R15: 0000000000000000
FS: 00007f8c7480d540(0000) GS:ffff9ef3ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007feb942a70c8 CR3: 0000000042b38003 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x580 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d7/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7dd/0xa30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x19/0x40
? do_syscall_64+0x69/0x90
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x77/0xe1
RIP: 0033:0x7f8c7470df0b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanityn: start cleanup 00:51:29 \(1734396689\) ===
Lustre: DEBUG MARKER: === sanityn: start cleanup 00:51:29 (1734396689) ===
Autotest: Test running for 350 minutes (lustre-reviews_review-dne-part-5_109713.32)
Autotest: Test running for 355 minutes (lustre-reviews_review-dne-part-5_109713.32)
Autotest: Test running for 360 minutes (lustre-reviews_review-dne-part-5_109713.32)
Autotest: Test running for 365 minutes (lustre-reviews_review-dne-part-5_109713.32)
Autotest: Test running for 370 minutes (lustre-reviews_review-dne-part-5_109713.32)
Autotest: Test running for 375 minutes (lustre-reviews_review-dne-part-5_109713.32)
Lustre: DEBUG MARKER: /usr/sbin/lctl mark === sanityn: finish cleanup 01:18:29 \(1734398309\) ===
Lustre: DEBUG MARKER: === sanityn: finish cleanup 01:18:29 (1734398309) ===
Lustre: lustre-MDT0000-lwp-MDT0003: Connection to lustre-MDT0000 (at 10.240.26.6@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-MDT0000-osp-MDT0003: Connection to lustre-MDT0000 (at 10.240.26.6@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 3 previous similar messages
Lustre: Skipped 3 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.26.6@tcp: Connection to MGS (at 10.240.26.6@tcp) was lost; in progress operations using this service will fail
Lustre: lustre-MDT0001: Not available for connect from 10.240.26.1@tcp (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.26.1@tcp (stopping)
Lustre: Skipped 15 previous similar messages
Lustre: lustre-MDT0001-osp-MDT0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.26.1@tcp (stopping)
Lustre: Skipped 15 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 41 previous similar messages
conf-sanity test 90b: check max_mod_rpcs_in_flight is enforced after update
watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [umount:851242]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) tls dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill virtio_balloon i2c_piix4 intel_rapl_msr joydev pcspkr intel_rapl_common sunrpc fuse drm ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ata_piix ghash_clmulni_intel libata virtio_blk net_failover failover serio_raw [last unloaded: libcfs]
CPU: 0 PID: 851242 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-362.24.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 0c 44 58 d6 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa7a486e2b9f0 EFLAGS: 00010282
RAX: ffffa7a481b37008 RBX: 000000000000003a RCX: 000000000000000e
RDX: ffffa7a481b11000 RSI: ffffa7a486e2ba20 RDI: ffff929a438a1b00
RBP: ffff929a438a1b00 R08: 0000000000000017 R09: ffff929b449d8941
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff929a438a1b00 R14: ffffa7a486e2ba98 R15: 0000000000000000
FS: 00007f0de0678540(0000) GS:ffff929affc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3d092af010 CR3: 00000000034f8005 CR4: 00000000001706f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f0de054e60b
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm2.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm2.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Quota mode: journalled.
Lustre: 847448:0:(mgc_request_server.c:552:mgc_llog_local_copy()) MGC10.240.28.45@tcp: no remote llog for lustre-sptlrpc, check MGS config
Lustre: 847448:0:(mgc_request_server.c:552:mgc_llog_local_copy()) Skipped 2 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm2.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm2.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4; mount -t lustre -o localrecov /dev/mapper/mds4_flakey /mnt/lustre-mds4
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm6.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-60vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-60vm2.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-60vm1.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x159
Lustre: *** cfs_fail_loc=159, val=0***
Lustre: Skipped 3 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0
Lustre: lustre-MDT0001: Client 608d50e9-9eb5-4d4a-9331-cb459de4ce8d (at 10.240.25.32@tcp) reconnecting
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0x159
Lustre: *** cfs_fail_loc=159, val=0***
Lustre: DEBUG MARKER: /usr/sbin/lctl set_param fail_loc=0
Lustre: lustre-MDT0001: Client 608d50e9-9eb5-4d4a-9331-cb459de4ce8d (at 10.240.25.32@tcp) reconnecting
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LDISKFS-fs (dm-3): unmounting filesystem.
Lustre: server umount lustre-MDT0001 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Autotest: Test running for 415 minutes (lustre-reviews_review-dne-part-3_109311.21)
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
sanity-quota test 39: Project ID interface works correctly
watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [umount:365093]
Modules linked in: dm_flakey tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc i2c_piix4 virtio_balloon intel_rapl_msr intel_rapl_common joydev pcspkr fuse drm ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel ata_piix virtio_net libata ghash_clmulni_intel net_failover virtio_blk failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 365093 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.31.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 6c 89 b2 c6 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffff9bf480987998 EFLAGS: 00010282
RAX: ffff9bf482358008 RBX: 0000000000000029 RCX: 000000000000000e
RDX: ffff9bf48233f000 RSI: ffff9bf4809879c8 RDI: ffff8f982f79a600
RBP: ffff8f982f79a600 R08: 0000000000000017 R09: ffff8f993bc95d34
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8f982f79a600 R14: ffff9bf480987a40 R15: 0000000000000000
FS: 00007fbf62e9a540(0000) GS:ffff8f98bfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd76a03f030 CR3: 000000003bdfa006 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? class_manual_cleanup+0x161/0x7a0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fbf62d0df0b
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_*
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight
Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0
LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 10.240.28.47@tcp failed: rc = -107
LustreError: Skipped 2 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.28.47@tcp: Connection to MGS (at 10.240.28.47@tcp) was lost; in progress operations using this service will fail
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 10 previous similar messages
Autotest: Test running for 220 minutes (lustre-reviews_review-dne-part-4_109200.31)
Link to test
insanity test 0: Fail all nodes, independently
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:22319]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common sunrpc virtio_balloon i2c_piix4 joydev pcspkr fuse drm ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw
CPU: 1 PID: 22319 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-362.24.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 0c c4 d1 e4 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa86586b2ba18 EFLAGS: 00010282
RAX: ffffa86582b4e008 RBX: 0000000000000071 RCX: 000000000000000e
RDX: ffffa86582b45000 RSI: ffffa86586b2ba48 RDI: ffff9170b4a69400
RBP: ffff9170b4a69400 R08: 0000000000000018 R09: ffff9171b20c90c0
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff9170b4a69400 R14: ffffa86586b2bac0 R15: 0000000000000000
FS: 00007f9a66c8e540(0000) GS:ffff91713fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff14798c78 CR3: 0000000018b6a002 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? __kmalloc+0x19b/0x370
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f9a66b4e60b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
Lustre: lustre-MDT0000: Not available for connect from 10.240.25.117@tcp (stopping)
Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 11 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.25.116@tcp (stopping)
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
Lustre: Skipped 13 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.25.117@tcp (stopping)
Lustre: Skipped 3 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.25.116@tcp (stopping)
Lustre: Skipped 29 previous similar messages
Link to test
sanity-pcc test 1f: Test auto RW-PCC cache with non-root user
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:31751]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common virtio_balloon pcspkr i2c_piix4 joydev drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata virtio_net crct10dif_pclmul crc32_pclmul crc32c_intel net_failover failover virtio_blk ghash_clmulni_intel serio_raw
CPU: 1 PID: 31751 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-362.24.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 0c 94 1b f9 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffab88c99d3a30 EFLAGS: 00010282
RAX: ffffab88c3ade008 RBX: 0000000000000078 RCX: 000000000000000e
RDX: ffffab88c3add000 RSI: ffffab88c99d3a60 RDI: ffff8a7cc5aa2800
RBP: ffff8a7cc5aa2800 R08: 0000000000000018 R09: ffff8a7e004e7ce8
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8a7cc5aa2800 R14: ffffab88c99d3ad8 R15: 0000000000000000
FS: 00007ff0e6c8b540(0000) GS:ffff8a7d7fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffd8594c78 CR3: 000000003ff32005 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xde/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
? __kmalloc+0x19b/0x370
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7ff0e6b4e60b
LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 10.240.28.50@tcp failed: rc = -107
Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 10.240.28.50@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 10 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.28.50@tcp: Connection to MGS (at 10.240.28.50@tcp) was lost; in progress operations using this service will fail
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.50@tcp (stopping)
Lustre: Skipped 11 previous similar messages
LustreError: lustre-MDT0001-osp-MDT0003: operation mds_statfs to node 0@lo failed: rc = -107
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.50@tcp (stopping)
Lustre: Skipped 38 previous similar messages
Link to test
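Note: the <IRQ> frames repeated in every stack (hrtimer_interrupt -> __hrtimer_run_queues -> watchdog_timer_fn) are the detector, not the bug. A per-CPU hrtimer checks whether the watchdog kthread has run recently and emits the "stuck for Ns" line when it has not. A simplified, hypothetical sketch of that check follows; the demo_* names are invented, and the real logic lives in kernel/watchdog.c.

#include <linux/ktime.h>
#include <linux/percpu.h>
#include <linux/printk.h>

/* Updated each time the per-CPU watchdog kthread manages to run. */
static DEFINE_PER_CPU(ktime_t, demo_touch_ts);

/* Runs from hrtimer (IRQ) context, as in the <IRQ> frames above. */
static void demo_watchdog_check(int cpu, ktime_t now, ktime_t threshold)
{
	ktime_t ts = per_cpu(demo_touch_ts, cpu);

	/* The kthread has not run within the threshold: something (here
	 * the relaxed hash walk) monopolized the CPU without sleeping. */
	if (ktime_after(now, ktime_add(ts, threshold)))
		pr_emerg("watchdog: BUG: soft lockup - CPU#%d stuck for %llds!\n",
			 cpu, ktime_to_ms(ktime_sub(now, ts)) / 1000);
}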
conf-sanity test 51: Verify that mdt_reint handles RMF_MDT_MD correctly when an OST is added
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [umount:458972]
Modules linked in: obdecho(OE) ptlrpc_gss(OE) tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr i2c_piix4 virtio_balloon intel_rapl_common joydev pcspkr drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel virtio_net net_failover virtio_blk failover serio_raw
CPU: 0 PID: 458972 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-362.24.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 ac c3 9a df 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0000:ffff9f2d0a79b978 EFLAGS: 00010282
RAX: ffff9f2d04056008 RBX: 0000000000000037 RCX: 000000000000000e
RDX: ffff9f2d04049000 RSI: ffff9f2d0a79b9a8 RDI: ffff8a5085a0a400
RBP: ffff8a5085a0a400 R08: 0000000000000017 R09: ffff8a51ac361442
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8a5085a0a400 R14: ffff9f2d0a79ba20 R15: 0000000000000000
FS: 00007f0649649540(0000) GS:ffff8a513fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ccab0b7050 CR3: 000000002bfc8005 CR4: 00000000000606f0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xd6/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f064954e60b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-79vm7.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm6.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm3.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-79vm7.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: trevis-79vm7.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: trevis-79vm7.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: trevis-25vm3.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: trevis-25vm6.trevis.whamcloud.com: executing set_hostid
Lustre: DEBUG MARKER: [ -e /dev/mapper/mds2_flakey ]
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: mkfs.lustre --mgsnode=trevis-25vm6@tcp --fsname=lustre --mdt --index=1 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions="-b 4096 -E lazy_itable_init" --reformat /dev/mapper/
Autotest: Test running for 210 minutes (lustre-reviews_review-dne-part-3_106337.14)
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-3): unmounting filesystem.
Lustre: DEBUG MARKER: [ -e /dev/mapper/mds4_flakey ]
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: mkfs.lustre --mgsnode=trevis-25vm6@tcp --fsname=lustre --mdt --index=3 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=200000 --mkfsoptions="-b 4096 -E lazy_itable_init" --reformat /dev/mapper/
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-4): unmounting filesystem.
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm6.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-25vm6.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-3): unmounting filesystem.
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Quota mode: journalled.
Lustre: srv-lustre-MDT0001: No data found on store. Initialize space.
Lustre: Skipped 1 previous similar message
Lustre: lustre-MDT0001: new disk, initializing
Lustre: Skipped 1 previous similar message
Lustre: cli-ctl-lustre-MDT0001: Allocated super-sequence [0x0000000240000400-0x0000000280000400]:1:mdt]
Lustre: Skipped 1 previous similar message
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: sync; sleep 1; sync
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm6.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-25vm6.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4; mount -t lustre -o localrecov /dev/mapper/mds4_flakey /mnt/lustre-mds4
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-4): unmounting filesystem.
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Quota mode: journalled.
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-79vm7.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: sync; sleep 1; sync
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm3.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-25vm3.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm3.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-25vm3.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0001-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0001-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: trevis-25vm2.trevis.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: trevis-25vm1.trevis.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osc.lustre-OST0001-osc-[-0-9a-f]\*.ost_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Lustre: lustre-OST0001-osc-MDT0003: Connection to lustre-OST0001 (at 10.240.38.102@tcp) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 37 previous similar messages
LustreError: lustre-OST0000-osc-MDT0003: operation ost_statfs to node 10.240.38.102@tcp failed: rc = -107
LustreError: Skipped 10 previous similar messages
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 10.240.38.105@tcp (stopping)
Lustre: Skipped 10 previous similar messages
Link to test
conf-sanity test 50i: activate deactivated MDT
watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [umount:451909]
Modules linked in: obdecho(OE) ptlrpc_gss(OE) tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common joydev pcspkr i2c_piix4 virtio_balloon drm fuse ext4 mbcache jbd2 ata_generic ata_piix crct10dif_pclmul libata crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel net_failover virtio_blk failover serio_raw
CPU: 1 PID: 451909 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-362.24.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 ac 73 57 c3 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffb88e837dfa18 EFLAGS: 00010282
RAX: ffffb88e86f50008 RBX: 000000000000005e RCX: 000000000000000e
RDX: ffffb88e86f15000 RSI: ffffb88e837dfa48 RDI: ffff8dabad3a6700
RBP: ffff8dabad3a6700 R08: 0000000000000017 R09: ffff8dacaa53c9e6
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8dabad3a6700 R14: ffffb88e837dfac0 R15: 0000000000000000
FS: 00007fa7eb071540(0000) GS:ffff8dac3fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffbdb001a8 CR3: 0000000027ab8006 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? cfs_hash_for_each_relax+0x154/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4e0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xd6/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10bc/0x1bc0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fa7eaf4e60b
Lustre: DEBUG MARKER: tunefs.lustre --param mdc.active=0 /dev/mapper/mds2_flakey
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-3): unmounting filesystem.
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds2_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Quota mode: journalled.
LDISKFS-fs (dm-3): unmounting filesystem.
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Quota mode: journalled.
Lustre: 441736:0:(mgc_request_server.c:537:mgc_llog_local_copy()) MGC10.240.28.46@tcp: no remote llog for lustre-sptlrpc, check MGS config
Lustre: 441736:0:(mgc_request_server.c:537:mgc_llog_local_copy()) Skipped 4 previous similar messages
Lustre: 441736:0:(mgc_request_server.c:581:mgc_process_server_cfg_log()) MGC10.240.28.46@tcp: local log lustre-sptlrpc are not valid and/or remote logs are not accessbile rc = -2
Lustre: 441736:0:(mgc_request_server.c:581:mgc_process_server_cfg_log()) Skipped 4 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds2_flakey 2>/dev/null
LustreError: lustre-MDT0003: not available for connect from 10.240.28.46@tcp (no target). If you are running an HA pair check that the target is mounted on the other server.
LustreError: Skipped 31 previous similar messages
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds4_flakey 2>&1
Lustre: DEBUG MARKER: test -b /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds4; mount -t lustre -o localrecov /dev/mapper/mds4_flakey /mnt/lustre-mds4
LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Quota mode: journalled.
Lustre: setting import lustre-MDT0001_UUID INACTIVE by administrator request
Lustre: Skipped 1 previous similar message
LustreError: 443148:0:(osp_object.c:635:osp_attr_get()) lustre-MDT0001-osp-MDT0003: osp_attr_get update error [0x200000009:0x1:0x0]: rc = -108
LustreError: 443148:0:(lod_sub_object.c:931:lod_sub_prep_llog()) lustre-MDT0003-mdtlov: can't get id from catalogs: rc = -108
LustreError: 443148:0:(lod_dev.c:523:lod_sub_recovery_thread()) lustre-MDT0001-osp-MDT0003: get update log duration 0, retries 0, failed: rc = -108
LustreError: 443148:0:(lod_dev.c:364:lod_sub_recreate_llog()) lustre-MDT0003-mdtlov: can't access update_log: rc = -108
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds4_flakey 2>/dev/null
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0002-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0003-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-76vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0000-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-76vm9.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0001-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) osc.lustre-OST0001-osc-[-0-9a-f]\*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
Lustre: DEBUG MARKER: onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
Lustre: lustre-MDT0001: Received new MDS connection from 0@lo, keep former export from same NID
LustreError: lustre-MDT0001-osp-MDT0003: This client was evicted by lustre-MDT0001; in progress operations using this service will fail.
LustreError: Skipped 1 previous similar message
LustreError: 444934:0:(osp_object.c:635:osp_attr_get()) lustre-MDT0001-osp-MDT0003: osp_attr_get update error [0x200000009:0x1:0x0]: rc = -108
LustreError: 444934:0:(osp_object.c:635:osp_attr_get()) Skipped 1 previous similar message
LustreError: 444934:0:(lod_sub_object.c:931:lod_sub_prep_llog()) lustre-MDT0003-mdtlov: can't get id from catalogs: rc = -108
LustreError: 444934:0:(obd_config.c:2034:class_config_llog_handler()) MGC10.240.28.46@tcp: cfg command failed: rc = -108
Lustre: cmd=cf00f 0:lustre-MDT0003-mdtlov 1:lustre-MDT0001-osp-MDT0003.active=1
LustreError: 441751:0:(mgc_request.c:609:do_requeue()) failed processing log: -108
Lustre: lustre-MDT0001-osp-MDT0003: Connection restored to (at 0@lo)
Lustre: lustre-MDT0001: Client d3b9f308-f9f3-4fa1-8e49-6b5735f969d6 (at 10.240.26.102@tcp) reconnecting
Lustre: lustre-MDT0001: Received new MDS connection from 10.240.28.46@tcp, keep former export from same NID
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-\*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm8.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: onyx-76vm7.onyx.whamcloud.com: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-\*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0000-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0000-osp-MDT0002.mdt_server_uuid 50
Autotest: Test running for 205 minutes (lustre-reviews_review-dne-part-3_106172.14)
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0000-osp-MDT0002.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0000-osp-MDT0002.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0000-osp-MDT0002.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0000-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0001-osp-MDT0000.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0001-osp-MDT0000.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0001-osp-MDT0000.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0001-osp-MDT0000.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0001-osp-MDT0002.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0001-osp-MDT0002.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0001-osp-MDT0002.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0001-osp-MDT0002.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0001-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0002-osp-MDT0000.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0002-osp-MDT0000.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0002-osp-MDT0000.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0002-osp-MDT0000.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0002-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0002-osp-MDT0003.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0003-osp-MDT0000.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0003-osp-MDT0000.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0003-osp-MDT0000.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0003-osp-MDT0000.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm4.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm4.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0003-osp-MDT0001.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-99vm3.onyx.whamcloud.com: executing wait_import_state \(FULL\|IDLE\) osp.lustre-MDT0003-osp-MDT0002.mdt_server_uuid 50
Lustre: DEBUG MARKER: onyx-99vm3.onyx.whamcloud.com: executing wait_import_state (FULL|IDLE) osp.lustre-MDT0003-osp-MDT0002.mdt_server_uuid 50
Lustre: DEBUG MARKER: /usr/sbin/lctl mark osp.lustre-MDT0003-osp-MDT0002.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: osp.lustre-MDT0003-osp-MDT0002.mdt_server_uuid in FULL state after 0 sec
Lustre: DEBUG MARKER: lctl get_param -n at_min
Lustre: setting import lustre-MDT0001_UUID INACTIVE by administrator request
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.lustre-MDT0001-osp-MDT0003.active
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: MGC10.240.28.46@tcp: Connection to MGS (at 10.240.28.46@tcp) was lost; in progress operations using this service will fail
LustreError: Skipped 4 previous similar messages
LDISKFS-fs (dm-3): unmounting filesystem.
Lustre: server umount lustre-MDT0001 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds4' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds4
Link to test
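Note on the rc values in the "Messages before crash" columns: they are negative Linux errnos, and three recur around these umounts. A tiny standalone decoder (plain userspace C, not Lustre code):

#include <errno.h>
#include <stdio.h>

int main(void)
{
	/* rc = -107: transport endpoint not connected; the import to the
	 * stopping target is already gone (mds_statfs, ost_statfs). */
	printf("-107 = -ENOTCONN  (ENOTCONN  = %d)\n", ENOTCONN);
	/* rc = -108: transport endpoint shut down; seen while an MDT is
	 * deactivated and osp llog setup fails. */
	printf("-108 = -ESHUTDOWN (ESHUTDOWN = %d)\n", ESHUTDOWN);
	/* rc = -2: no such file or directory; the missing local copy of
	 * the lustre-sptlrpc config llog. */
	printf("-2   = -ENOENT    (ENOENT    = %d)\n", ENOENT);
	return 0;
}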
sanity test 133g: Check reads/writes of server lustre proc files with bad area io
watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [umount:484313]
Modules linked in: dm_flakey tls obdecho(OE) ptlrpc_gss(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common virtio_balloon i2c_piix4 pcspkr joydev fuse drm ext4 mbcache jbd2 ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ata_piix libata ghash_clmulni_intel virtio_blk net_failover failover serio_raw [last unloaded: dm_flakey]
CPU: 1 PID: 484313 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-362.18.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 cc e0 5d e0 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa746c8bff940 EFLAGS: 00010282
RAX: ffffa746c2021008 RBX: 000000000000003a RCX: 000000000000000e
RDX: ffffa746c1fe9000 RSI: ffffa746c8bff970 RDI: ffff8d74f27a2900
RBP: ffff8d74f27a2900 R08: 0000000000000017 R09: ffff8d760721579d
R10: ffffffffffffffff R11: 000000000000000f R12: 0000000000000000
R13: ffff8d74f27a2900 R14: ffffa746c8bff9e8 R15: 0000000000000000
FS: 00007fc343021540(0000) GS:ffff8d757fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000559dc0c8f050 CR3: 0000000032e00003 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4f0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xd6/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10a9/0x1bb0 [obdclass]
class_manual_cleanup+0x439/0x7a0 [obdclass]
server_put_super+0x7ee/0xa40 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? __do_sys_newfstatat+0x35/0x60
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? exc_page_fault+0x62/0x150
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fc342f4e60b
LustreError: 11-0: lustre-MDT0000-osp-MDT0003: operation mds_statfs to node 10.240.28.44@tcp failed: rc = -107
LustreError: Skipped 1 previous similar message
Lustre: lustre-MDT0000-osp-MDT0003: Connection to lustre-MDT0000 (at 10.240.28.44@tcp) was lost; in progress operations using this service will wait for recovery to complete
Autotest: Test running for 200 minutes (lustre-reviews_review-ldiskfs-dne_104387.36)
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
LustreError: 166-1: MGC10.240.28.44@tcp: Connection to MGS (at 10.240.28.44@tcp) was lost; in progress operations using this service will fail
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.171@tcp (stopping)
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0001-osp-MDT0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 3 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.171@tcp (stopping)
Lustre: Skipped 40 previous similar messages
Link to test
recovery-small test 149: skip orphan removal at umount
watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [umount:113606]
Modules linked in: tls osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill virtio_balloon i2c_piix4 joydev pcspkr sunrpc intel_rapl_msr intel_rapl_common dm_mod drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net ghash_clmulni_intel virtio_blk net_failover failover serio_raw
CPU: 1 PID: 113606 Comm: umount Kdump: loaded Tainted: G OE ------- --- 5.14.0-362.18.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 dc e0 34 cc 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0000:ffffa752c1d3fa30 EFLAGS: 00010282
RAX: ffffa752c8848008 RBX: 0000000000000023 RCX: 000000000000000e
RDX: ffffa752c880b000 RSI: ffffa752c1d3fa60 RDI: ffff9691768d2f00
RBP: ffff9691768d2f00 R08: 0000065a09525222 R09: 0000000000000025
R10: 0000000000000000 R11: 000000000000000f R12: 0000000000000000
R13: ffff9691768d2f00 R14: ffffa752c1d3fad8 R15: 0000000000000000
FS: 00007fe069258540(0000) GS:ffff9691ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f93ff289048 CR3: 000000003890c002 CR4: 00000000001706e0
Call Trace:
<IRQ>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
? watchdog_timer_fn+0x1b2/0x210
? __pfx_watchdog_timer_fn+0x10/0x10
? __hrtimer_run_queues+0x12a/0x2c0
? hrtimer_interrupt+0xfc/0x210
? __do_softirq+0x16a/0x2ac
? __sysvec_apic_timer_interrupt+0x5f/0x110
? sysvec_apic_timer_interrupt+0x6d/0x90
</IRQ>
<TASK>
? asm_sysvec_apic_timer_interrupt+0x16/0x20
? cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
? __pfx_ldlm_resource_clean+0x10/0x10 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4f0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xd6/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x2d5/0x600 [obdclass]
class_process_config+0x10a9/0x1bb0 [obdclass]
class_manual_cleanup+0x436/0x790 [obdclass]
server_put_super+0x7dd/0xa30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x100/0x160
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? __irq_exit_rcu+0x46/0xc0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fe06914e60b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds2
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.167@tcp (stopping)
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.167@tcp (stopping)
LustreError: 11-0: lustre-MDT0001-osp-MDT0003: operation mds_statfs to node 0@lo failed: rc = -107
LustreError: Skipped 9 previous similar messages
Lustre: lustre-MDT0001-osp-MDT0003: Connection to lustre-MDT0001 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
Lustre: Skipped 15 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.168@tcp (stopping)
Lustre: Skipped 10 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
Lustre: Skipped 11 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.28.48@tcp (stopping)
Lustre: Skipped 2 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.168@tcp (stopping)
Lustre: Skipped 14 previous similar messages
Lustre: lustre-MDT0001: Not available for connect from 10.240.23.167@tcp (stopping)
Lustre: Skipped 39 previous similar messages
sanity test 133f: Check reads/writes of client lustre proc files with bad area io
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [umount:503496]
Modules linked in: tls obdecho(OE) ptlrpc_gss(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common virtio_balloon pcspkr i2c_piix4 joydev drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net net_failover ghash_clmulni_intel failover virtio_blk serio_raw [last unloaded: llog_test]
CPU: 0 PID: 503496 Comm: umount Kdump: loaded Tainted: G OE -------- --- 5.14.0-284.30.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 3c 59 ec ea 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffa44441287988 EFLAGS: 00010282
RAX: ffffa44442c1d008 RBX: 0000000000000015 RCX: 000000000000000e
RDX: ffffa44442c0b000 RSI: ffffa444412879b8 RDI: ffff98cdf6963400
RBP: ffff98cdf6963400 R08: 00000b64c02162c7 R09: 0000000000000025
R10: ffffa44441287988 R11: ffff98cdf8afb371 R12: 0000000000000000
R13: ffff98cdf6963400 R14: ffffa44441287a30 R15: 0000000000000000
FS: 00007fac04239540(0000) GS:ffff98ce7fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055aa4549b340 CR3: 0000000033f2c002 CR4: 00000000000606f0
Call Trace:
<TASK>
? cleanup_resource+0x300/0x300 [ptlrpc]
? cleanup_resource+0x300/0x300 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4f0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xd6/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x1af/0x580 [obdclass]
class_process_config+0x10a9/0x1bb0 [obdclass]
class_manual_cleanup+0x436/0x790 [obdclass]
server_put_super+0x7dd/0xa30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x131/0x190
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x30
do_syscall_64+0x69/0x90
? __do_sys_newfstatat+0x35/0x60
? syscall_exit_work+0x11a/0x150
? syscall_exit_to_user_mode+0x12/0x30
? do_syscall_64+0x69/0x90
? __irq_exit_rcu+0x46/0xe0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fac0414e87b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
Lustre: lustre-MDT0000: Not available for connect from 10.240.26.223@tcp (stopping)
Lustre: Skipped 8 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.26.222@tcp (stopping)
Lustre: lustre-MDT0000: Not available for connect from 10.240.22.243@tcp (stopping)
Lustre: lustre-MDT0000: Not available for connect from 10.240.26.223@tcp (stopping)
Lustre: Skipped 6 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.26.223@tcp (stopping)
Lustre: Skipped 8 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.26.223@tcp (stopping)
Lustre: Skipped 17 previous similar messages
sanity test 133f: Check reads/writes of client lustre proc files with bad area io
watchdog: BUG: soft lockup - CPU#1 stuck for 25s! [umount:494838]
Modules linked in: tls obdecho(OE) ptlrpc_gss(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) libcfs(OE) dm_flakey dm_mod rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill intel_rapl_msr intel_rapl_common sunrpc i2c_piix4 pcspkr joydev virtio_balloon drm fuse ext4 mbcache jbd2 ata_generic ata_piix libata crct10dif_pclmul crc32_pclmul crc32c_intel virtio_net net_failover ghash_clmulni_intel failover virtio_blk serio_raw [last unloaded: llog_test]
CPU: 1 PID: 494838 Comm: umount Kdump: loaded Tainted: G OE -------- --- 5.14.0-284.30.1_lustre.el9.x86_64 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:cfs_hash_for_each_relax+0x15d/0x480 [libcfs]
Code: 24 38 00 00 00 00 8b 40 2c 89 44 24 18 49 8b 45 38 48 8d 74 24 30 4c 89 ef 48 8b 00 e8 3c d9 09 c2 48 85 c0 0f 84 1e 02 00 00 <4c> 8b 38 4d 85 ff 0f 84 f8 01 00 00 49 8b 45 28 4c 89 ef 4c 89 fe
RSP: 0018:ffffc01980cdfa10 EFLAGS: 00010282
RAX: ffffc01986735008 RBX: 0000000000000067 RCX: 000000000000000e
RDX: ffffc019866fb000 RSI: ffffc01980cdfa40 RDI: ffff98f0eee12200
RBP: ffff98f0eee12200 R08: ffff98f0c5e94e90 R09: ffff98f0c5e94e90
R10: ffffc01980cdfa10 R11: ffff98f0c4390e6b R12: 0000000000000000
R13: ffff98f0eee12200 R14: ffffc01980cdfab8 R15: 0000000000000000
FS: 00007f6bc8a78540(0000) GS:ffff98f17fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000559b9ec6a2b0 CR3: 0000000036470005 CR4: 00000000001706e0
Call Trace:
<TASK>
? cleanup_resource+0x300/0x300 [ptlrpc]
? cleanup_resource+0x300/0x300 [ptlrpc]
cfs_hash_for_each_nolock+0x12e/0x210 [libcfs]
ldlm_namespace_cleanup+0x2b/0xc0 [ptlrpc]
__ldlm_namespace_free+0x58/0x4f0 [ptlrpc]
ldlm_namespace_free_prior+0x5a/0x1f0 [ptlrpc]
mdt_fini+0xd6/0x570 [mdt]
mdt_device_fini+0x2b/0xc0 [mdt]
obd_precleanup+0x1e4/0x220 [obdclass]
class_cleanup+0x1af/0x580 [obdclass]
class_process_config+0x10a9/0x1bb0 [obdclass]
class_manual_cleanup+0x436/0x790 [obdclass]
server_put_super+0x7dd/0xa30 [ptlrpc]
generic_shutdown_super+0x74/0x120
kill_anon_super+0x14/0x30
deactivate_locked_super+0x31/0xa0
cleanup_mnt+0x131/0x190
task_work_run+0x5c/0x90
exit_to_user_mode_loop+0x122/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x30
do_syscall_64+0x69/0x90
? hrtimer_interrupt+0x126/0x210
? kvm_sched_clock_read+0x14/0x40
? sched_clock_cpu+0x9/0xb0
? irqtime_account_irq+0x3c/0xb0
? __irq_exit_rcu+0x46/0xe0
? sysvec_apic_timer_interrupt+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f6bc894e87b
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.173@tcp (stopping)
Lustre: lustre-MDT0000: Not available for connect from 10.240.26.204@tcp (stopping)
Lustre: Skipped 6 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.26.205@tcp (stopping)
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.173@tcp (stopping)
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.173@tcp (stopping)
Lustre: Skipped 8 previous similar messages
Lustre: lustre-MDT0000: Not available for connect from 10.240.28.173@tcp (stopping)
Lustre: Skipped 17 previous similar messages
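Every report above has the same shape: umount tears down an MDT, ldlm_namespace_free_prior() drives ldlm_namespace_cleanup(), and the soft-lockup watchdog fires because cfs_hash_for_each_relax() walks the namespace's resource hash in kernel context for 25-26 seconds without rescheduling. One plausible reading is that the cleanup callback (ldlm_resource_clean() / cleanup_resource() in the traces) cannot drain every resource, e.g. because locks still hold references while connections are dropping, so the walk keeps finding work. The userspace C sketch below is a simplified model of that rescan-until-empty pattern under those assumptions; resource_clean(), the pass limit, and the yield policy are hypothetical stand-ins, not Lustre code:

    /*
     * Hypothetical userspace model of the soft-lockup pattern above:
     * a cleanup pass keeps rescanning a table whose entries it cannot
     * all free.  Without a yield between passes this loop is exactly
     * the shape the watchdog flags ("stuck for 26s").  All names here
     * are illustrative, not the Lustre implementation.
     */
    #include <sched.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define NENTRIES 8

    struct resource {
        bool busy;   /* still referenced, cannot be freed this pass */
        bool freed;
    };

    /* Callback: free what we can; report whether anything was left. */
    static bool resource_clean(struct resource *res)
    {
        if (res->freed)
            return false;
        if (res->busy)
            return true;     /* skipped: someone still holds it */
        res->freed = true;
        return false;
    }

    int main(void)
    {
        struct resource table[NENTRIES] = {
            [2] = { .busy = true },   /* one entry that refuses to die */
        };
        unsigned long passes = 0;

        for (;;) {
            bool leftover = false;

            for (int i = 0; i < NENTRIES; i++)
                leftover |= resource_clean(&table[i]);

            passes++;
            if (!leftover)
                break;

            /*
             * The kernel analogue of this yield is cond_resched():
             * it keeps the watchdog quiet but does not make the busy
             * entry freeable.
             */
            sched_yield();

            if (passes == 5) {   /* illustrative escape hatch only */
                fprintf(stderr, "gave up after %lu passes, 1 entry stuck\n",
                        passes);
                break;
            }
        }
        printf("done after %lu passes\n", passes);
        return 0;
    }

If that reading is right, yielding between passes only silences the watchdog; the triage question for these reports is why the namespace still holds unfreeable resources at umount time.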