Editing crashreport #74690

Reason: BUG: unable to handle kernel NULL pointer dereference
Crashing Function: lu_context_key_get
Where to cut Backtrace:
    mdd_close
    mdt_mfd_close
    mdt_obd_disconnect
    class_disconnect_export_list
    class_disconnect_stale_exports
    target_recovery_thread
    kthread
    ret_from_fork
Reports Count: 2
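
The summary above groups duplicate crashes by cutting each backtrace at the listed frames, so two reports with the same cut backtrace count as one entry. As an illustration only (this is an assumption about how the grouping works, not this tool's actual code), frames can be normalized by stripping the `+offset/size [module]` suffix before comparison:

```python
import re

def normalize_frame(frame: str) -> str:
    """Strip '+0x<off>/0x<size> [module]' from a kernel backtrace frame,
    e.g. 'mdt_mfd_close+0x6e2/0xc10 [mdt]' -> 'mdt_mfd_close'."""
    return re.sub(r"\+0x[0-9a-f]+/0x[0-9a-f]+(\s+\[\w+\])?$", "", frame.strip())

# Frames from the "Where to cut Backtrace" column above:
CUT = ["mdd_close", "mdt_mfd_close", "mdt_obd_disconnect",
       "class_disconnect_export_list", "class_disconnect_stale_exports",
       "target_recovery_thread", "kthread", "ret_from_fork"]

def cut_backtrace(raw_frames):
    """Normalize raw frames (dropping the leading '? ' of unreliable
    entries) and keep only those in the cut list, giving a grouping key."""
    frames = [normalize_frame(f.lstrip("? ")) for f in raw_frames]
    return [f for f in frames if f in CUT]
```

With this sketch, both backtraces below reduce to the same eight-frame key, matching the reported count of 2.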

Added fields:

Match messages in logs
(every line must be present in the log output;
copy from the "Messages before crash" column below):
Match messages in full crash
(every line must be present in the crash log output;
copy from the "Full Crash" column below):
Limit to a test:
(copy from the "Failing Test" column below):
Delete these reports as invalid (e.g. a real bug already under review)
Bug or comment:
Extra info:
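
The matching rule for the two "Match messages" fields above, where every listed line must occur in the corresponding log, can be sketched as follows (a hypothetical helper for illustration, not part of the crash-report tool itself):

```python
def log_matches(required_lines, log_text):
    """True iff every non-blank required line occurs verbatim in the log."""
    return all(line in log_text for line in required_lines if line.strip())

# Example against messages seen in the failure logs below:
log = ("Lustre: Failing over lustre-MDT0000\n"
       "Lustre: server umount lustre-MDT0000 complete\n")
log_matches(["Lustre: Failing over lustre-MDT0000"], log)  # -> True
log_matches(["no such message"], log)                      # -> False
```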

Failures list (last 100):

Failing Test | Full Crash | Messages before crash | Comment
replay-dual test 26: dbench and tar with mds failover
BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
PGD 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 PID: 144809 Comm: tgt_recover_0 Kdump: loaded Tainted: P W OE -------- - - 4.18.0-553.89.1.el8_lustre.x86_64 #1
Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014
RIP: 0010:lu_context_key_get+0x2b/0x80 [obdclass]
Code: 1f 44 00 00 55 53 48 63 46 20 48 39 34 c5 80 74 2f c1 75 1f 48 89 f3 8b 37 48 89 fd f7 c6 00 02 00 00 74 3f 48 8b 55 10 5b 5d <48> 8b 04 c2 e9 c7 d7 bc e2 48 c7 c7 a0 77 1d c1 48 c7 c2 38 83 19
RSP: 0018:ff3966f38a483ca8 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ff1660d8ce746f78 RCX: 0000000000000000
RDX: 0000000000000004 RSI: ff1660d8fba1e698 RDI: ff1660d8fba1e698
RBP: ff1660d8e1dd2800 R08: 0000000000000000 R09: c0000000ffff7fff
R10: 0000000000000001 R11: ff3966f38a483ab0 R12: ff1660d8d21f12b0
R13: ff1660d8b9666400 R14: ff1660d8bc9db600 R15: ff1660d8af6ff150
FS: 0000000000000000(0000) GS:ff1660d8fba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000024 CR3: 0000000137610005 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
? __die_body+0x1a/0x60
? no_context+0x1ba/0x3f0
? __bad_area_nosemaphore+0x157/0x180
? do_error_trap+0x9e/0xd0
? do_page_fault+0x37/0x12d
? page_fault+0x1e/0x30
? lu_context_key_get+0x2b/0x80 [obdclass]
mdd_close+0x73/0xf00 [mdd]
mdt_mfd_close+0x6e2/0xc10 [mdt]
mdt_obd_disconnect+0x23f/0x820 [mdt]
class_disconnect_export_list+0x21c/0x590 [obdclass]
class_disconnect_stale_exports+0x26f/0x3b0 [obdclass]
? exp_lock_replay_healthy+0x30/0x30 [ptlrpc]
target_recovery_thread+0x62d/0x1250 [ptlrpc]
? srso_alias_return_thunk+0x5/0xfcdfd
? replay_request_or_update.isra.31+0xa90/0xa90 [ptlrpc]
kthread+0x134/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x40
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_zfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) dm_mod zfs(POE) spl(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc intel_rapl_msr intel_rapl_common kvm_amd ccp kvm irqbypass iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr joydev i2c_i801 lpc_ich virtio_balloon ext4 mbcache jbd2 ahci libahci libata crc32c_intel virtio_net net_failover failover serio_raw virtio_blk [last unloaded: obdecho]
CR2: 0000000000000024
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-152vm44.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-152vm44.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: sync; sync; sync
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: MGS: Client 32319d8c-256e-44cc-9164-5b758577f787 (at 10.240.47.90@tcp) reconnecting
Lustre: DEBUG MARKER: /usr/sbin/lctl mark test_26 fail mds1 1 times
Lustre: DEBUG MARKER: test_26 fail mds1 1 times
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: 7988:0:(client.c:2478:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1774605265/real 1774605265] req@ff1660d8bb9f1380 x1860806070566400/t0(0) o103->MGC10.240.47.92@tcp@0@lo:17/18 lens 328/224 e 0 to 1 dl 1774605281 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'ldlm_bl.0' uid:0 gid:0 projid:4294967295
Lustre: MGS: Client d7f22288-df3e-4cee-95a6-a381e17f9cb8 (at 0@lo) reconnecting
LustreError: 144173:0:(ldlm_resource.c:1170:ldlm_resource_complain()) MGC10.240.47.92@tcp: namespace resource [0x65727473756c:0x2:0x0].0x0 (ff1660d8b9fa8300) refcount nonzero (1) after lock cleanup; forcing cleanup.
Lustre: 8024:0:(mgc_request.c:1917:mgc_process_log()) MGC10.240.47.92@tcp: IR log lustre-mdtir failed, not fatal: rc = -5
Lustre: Failing over lustre-MDT0000
LustreError: 8016:0:(client.c:1380:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ff1660d8cd3ddd40 x1860806071278080/t0(0) o105->MGS@10.240.47.90@tcp:15/16 lens 336/224 e 0 to 0 dl 0 ref 1 fl Rpc:QU/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 projid:4294967295
LustreError: 8016:0:(client.c:1380:ptlrpc_import_delay_req()) Skipped 2 previous similar messages
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 ||
Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs;
Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
LustreError: 144757:0:(ldlm_resource.c:1170:ldlm_resource_complain()) MGC10.240.47.92@tcp: namespace resource [0x65727473756c:0x2:0x0].0x0 (ff1660d8b9fa8f00) refcount nonzero (1) after lock cleanup; forcing cleanup.
Lustre: 8024:0:(mgc_request.c:1917:mgc_process_log()) MGC10.240.47.92@tcp: IR log lustre-mdtir failed, not fatal: rc = -5
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-152vm46.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-152vm46.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-152vm46.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: trevis-152vm46.trevis.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 2>/dev/null
Lustre: 144809:0:(ldlm_lib.c:2067:extend_recovery_timer()) lustre-MDT0000: extended recovery timer reached hard limit: 180, extend: 1
Lustre: 144809:0:(ldlm_lib.c:2067:extend_recovery_timer()) Skipped 4 previous similar messages
LustreError: 144809:0:(mdt_open.c:1729:mdt_reint_open()) lustre-MDT0000: name 'RESULTS1.PRN' present, but FID [0x20000afe1:0x17:0x0] is invalid
LustreError: 144809:0:(mdt_handler.c:5295:mdt_intent_open()) @@@ Replay open failed with -5 req@ff1660d8c8086a40 x1860805781154176/t0(154618823102) o101->9196f5d8-a1a4-49be-ad23-4d93e7c96dd9@10.240.47.89@tcp:544/0 lens 584/608 e 0 to 0 dl 1774605394 ref 1 fl Complete:/604/0 rc 0/0 job:'dbench.0' uid:0 gid:0 projid:0
Lustre: 144809:0:(genops.c:1620:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client 9196f5d8-a1a4-49be-ad23-4d93e7c96dd9@10.240.47.89@tcp
Lustre: lustre-MDT0000: disconnecting 1 stale clients
------------[ cut here ]------------
Probable access of uninitialized array lc_tags:d1f4f000
WARNING: CPU: 0 PID: 144809 at /tmp/rpmbuild-lustre-jenkins-RFOiZdjK/BUILD/lustre-2.17.51_23_g649b37b/lustre/obdclass/lu_object.c:1613 lu_context_key_get+0x70/0x80 [obdclass]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_zfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) dm_mod zfs(POE) spl(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc intel_rapl_msr intel_rapl_common kvm_amd ccp kvm irqbypass iTCO_wdt iTCO_vendor_support crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr joydev i2c_i801 lpc_ich virtio_balloon ext4 mbcache jbd2 ahci libahci libata crc32c_intel virtio_net net_failover failover serio_raw virtio_blk [last unloaded: obdecho]
CPU: 0 PID: 144809 Comm: tgt_recover_0 Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.89.1.el8_lustre.x86_64 #1
Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014
RIP: 0010:lu_context_key_get+0x70/0x80 [obdclass]
Code: 44 1a c1 c7 05 55 0e 0a 00 00 00 04 00 e8 e8 2c 63 ff 48 c7 c7 a0 77 1d c1 e8 2c 0b 63 ff 48 c7 c7 58 83 19 c1 e8 8d 2c fc e1 <0f> 0b 48 63 43 20 eb ad 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41
RSP: 0018:ff3966f38a483c98 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffc11e6a80 RCX: 0000000000000000
RDX: ff1660d8fba2eec0 RSI: ff1660d8fba1e698 RDI: ff1660d8fba1e698
RBP: ff3966f38a483dc8 R08: 0000000000000000 R09: c0000000ffff7fff
R10: 0000000000000001 R11: ff3966f38a483ab0 R12: ff1660d8d21f12b0
R13: ff1660d8b9666400 R14: ff1660d8bc9db600 R15: ff1660d8af6ff150
FS: 0000000000000000(0000) GS:ff1660d8fba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ac9a521d68 CR3: 0000000137610005 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
? __warn+0x94/0xe0
? lu_context_key_get+0x70/0x80 [obdclass]
? lu_context_key_get+0x70/0x80 [obdclass]
? report_bug+0xb1/0xe0
? do_error_trap+0x9e/0xd0
? do_invalid_op+0x36/0x40
? lu_context_key_get+0x70/0x80 [obdclass]
? invalid_op+0x14/0x20
? lu_context_key_get+0x70/0x80 [obdclass]
? lu_context_key_get+0x70/0x80 [obdclass]
mdd_close+0x73/0xf00 [mdd]
mdt_mfd_close+0x6e2/0xc10 [mdt]
mdt_obd_disconnect+0x23f/0x820 [mdt]
class_disconnect_export_list+0x21c/0x590 [obdclass]
class_disconnect_stale_exports+0x26f/0x3b0 [obdclass]
? exp_lock_replay_healthy+0x30/0x30 [ptlrpc]
target_recovery_thread+0x62d/0x1250 [ptlrpc]
? srso_alias_return_thunk+0x5/0xfcdfd
? replay_request_or_update.isra.31+0xa90/0xa90 [ptlrpc]
kthread+0x134/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x40
---[ end trace 1f6d74f64f49d4c8 ]---
Link to test
replay-dual test 26: dbench and tar with mds failover
BUG: unable to handle kernel NULL pointer dereference at 0000000000000024
PGD 0
Oops: 0000 [#1] SMP NOPTI
CPU: 0 PID: 259990 Comm: tgt_recover_0 Kdump: loaded Tainted: G W OE -------- - - 4.18.0-553.89.1.el8_lustre.x86_64 #1
Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014
RIP: 0010:lu_context_key_get+0x2b/0x80 [obdclass]
Code: 1f 44 00 00 55 53 48 63 46 20 48 39 34 c5 80 24 a1 c0 75 1f 48 89 f3 8b 37 48 89 fd f7 c6 00 02 00 00 74 3f 48 8b 55 10 5b 5d <48> 8b 04 c2 e9 c7 27 4b d3 48 c7 c7 a0 27 8f c0 48 c7 c2 38 33 8b
RSP: 0018:ff6c89fe4a8a7ca8 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ff2d3859c3d6b618 RCX: 0000000000000000
RDX: 0000000000000004 RSI: ff2d3859fba1e698 RDI: ff2d3859fba1e698
RBP: ff2d3859c097b700 R08: 0000000000000000 R09: c0000000ffff7fff
R10: 0000000000000001 R11: ff6c89fe4a8a7ab0 R12: ff2d3859c3db22b0
R13: ff2d3859c050f800 R14: ff2d3859c0bc4780 R15: ff2d3859c2d6d000
FS: 0000000000000000(0000) GS:ff2d3859fba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000024 CR3: 000000012f410004 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
? __die_body+0x1a/0x60
? no_context+0x1ba/0x3f0
? __bad_area_nosemaphore+0x157/0x180
? do_error_trap+0x9e/0xd0
? do_page_fault+0x37/0x12d
? page_fault+0x1e/0x30
? lu_context_key_get+0x2b/0x80 [obdclass]
mdd_close+0x73/0xf00 [mdd]
mdt_mfd_close+0x6e2/0xc10 [mdt]
mdt_obd_disconnect+0x23f/0x820 [mdt]
class_disconnect_export_list+0x21c/0x590 [obdclass]
class_disconnect_stale_exports+0x26f/0x3b0 [obdclass]
? exp_lock_replay_healthy+0x30/0x30 [ptlrpc]
target_recovery_thread+0x62d/0x1250 [ptlrpc]
? srso_alias_return_thunk+0x5/0xfcdfd
? replay_request_or_update.isra.31+0xa90/0xa90 [ptlrpc]
kthread+0x134/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x40
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod intel_rapl_msr intel_rapl_common kvm_amd ccp kvm rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache iTCO_wdt iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev pcspkr i2c_i801 virtio_balloon lpc_ich sunrpc ext4 mbcache jbd2 ahci libahci libata crc32c_intel virtio_net serio_raw net_failover failover virtio_blk
CR2: 0000000000000024
Lustre: MGS: Client 1ed6ab3b-59cf-48ef-ae99-bff3729f331a (at 10.240.29.127@tcp) reconnecting
Lustre: 8365:0:(client.c:2478:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1774607453/real 1774607453] req@ff2d3859b35aba80 x1860805848136192/t0(0) o103->MGC10.240.29.129@tcp@0@lo:17/18 lens 328/224 e 0 to 1 dl 1774607469 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'ldlm_bl.0' uid:0 gid:0 projid:4294967295
LustreError: 257449:0:(ldlm_resource.c:1170:ldlm_resource_complain()) MGC10.240.29.129@tcp: namespace resource [0x65727473756c:0x2:0x0].0x0 (ff2d3859b9f806c0) refcount nonzero (1) after lock cleanup; forcing cleanup.
Lustre: 8407:0:(mgc_request.c:1917:mgc_process_log()) MGC10.240.29.129@tcp: IR log lustre-mdtir failed, not fatal: rc = -5
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm255.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-155vm255.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: sync; sync; sync
Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup suspend --nolockfs --noflush /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup load /dev/mapper/mds1_flakey --table "0 4071424 flakey 252:0 0 0 1800 1 drop_writes"
Lustre: DEBUG MARKER: dmsetup resume /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000
Lustre: DEBUG MARKER: /usr/sbin/lctl mark test_26 fail mds1 1 times
Lustre: DEBUG MARKER: test_26 fail mds1 1 times
Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true
Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1
Lustre: Failing over lustre-MDT0000
LustreError: 258713:0:(ldlm_resource.c:1170:ldlm_resource_complain()) lustre-MDT0001-osp-MDT0000: namespace resource [0x2400032e0:0xf:0x0].0x0 (ff2d3859c0053300) refcount nonzero (1) after lock cleanup; forcing cleanup.
Lustre: lustre-MDT0000: Not available for connect from 10.240.29.126@tcp (stopping)
LustreError: 10755:0:(client.c:1380:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ff2d38598598b740 x1860805848453888/t0(0) o105->MGS@0@lo:15/16 lens 336/224 e 0 to 0 dl 0 ref 1 fl Rpc:QU/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 projid:4294967295
LustreError: 10755:0:(client.c:1380:ptlrpc_import_delay_req()) Skipped 3 previous similar messages
Lustre: server umount lustre-MDT0000 complete
Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
Lustre: DEBUG MARKER: modprobe dm-flakey;
Autotest: Test running for 80 minutes (lustre-reviews_review-dne-part-8_122951.33)
Lustre: DEBUG MARKER: modprobe dm-flakey;
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1
Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1
Lustre: DEBUG MARKER: dmsetup table /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup suspend --nolockfs --noflush /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: dmsetup load /dev/mapper/mds1_flakey --table "0 4071424 linear 252:0 0"
Lustre: DEBUG MARKER: dmsetup resume /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
LDISKFS-fs (dm-3): 9 truncates cleaned up
LDISKFS-fs (dm-3): recovery complete
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
LustreError: 259937:0:(ldlm_resource.c:1170:ldlm_resource_complain()) MGC10.240.29.129@tcp: namespace resource [0x65727473756c:0x2:0x0].0x0 (ff2d3859b9f80a80) refcount nonzero (1) after lock cleanup; forcing cleanup.
LustreError: 259937:0:(ldlm_resource.c:1170:ldlm_resource_complain()) Skipped 1 previous similar message
Lustre: 8407:0:(mgc_request.c:1917:mgc_process_log()) MGC10.240.29.129@tcp: IR log lustre-mdtir failed, not fatal: rc = -5
Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm257.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm257.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-155vm257.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: onyx-155vm257.onyx.whamcloud.com: executing set_default_debug -1 all
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey 2>/dev/null
Lustre: 259990:0:(ldlm_lib.c:2067:extend_recovery_timer()) lustre-MDT0000: extended recovery timer reached hard limit: 180, extend: 1
Lustre: 259990:0:(ldlm_lib.c:2067:extend_recovery_timer()) Skipped 39 previous similar messages
Lustre: 259990:0:(genops.c:1620:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client 221af37f-3824-4662-b987-01202e96fc97@10.240.29.126@tcp
Lustre: 259990:0:(genops.c:1620:class_disconnect_stale_exports()) Skipped 2 previous similar messages
Lustre: lustre-MDT0000: disconnecting 1 stale clients
Lustre: Skipped 2 previous similar messages
------------[ cut here ]------------
Probable access of uninitialized array lc_tags:c020c000
WARNING: CPU: 0 PID: 259990 at /tmp/rpmbuild-lustre-jenkins-RFOiZdjK/BUILD/lustre-2.17.51_23_g649b37b/lustre/obdclass/lu_object.c:1613 lu_context_key_get+0x70/0x80 [obdclass]
Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lustre(OE) mdc(OE) lov(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE) dm_flakey dm_mod intel_rapl_msr intel_rapl_common kvm_amd ccp kvm rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache iTCO_wdt iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev pcspkr i2c_i801 virtio_balloon lpc_ich sunrpc ext4 mbcache jbd2 ahci libahci libata crc32c_intel virtio_net serio_raw net_failover failover virtio_blk
CPU: 0 PID: 259990 Comm: tgt_recover_0 Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.89.1.el8_lustre.x86_64 #1
Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014
RIP: 0010:lu_context_key_get+0x70/0x80 [obdclass]
Code: f4 8b c0 c7 05 55 0e 0a 00 00 00 04 00 e8 e8 6c ab ff 48 c7 c7 a0 27 8f c0 e8 2c 4b ab ff 48 c7 c7 58 33 8b c0 e8 8d 7c 8a d2 <0f> 0b 48 63 43 20 eb ad 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41
RSP: 0018:ff6c89fe4a8a7c98 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffc0901a80 RCX: 0000000000000000
RDX: ff2d3859fba2eec0 RSI: ff2d3859fba1e698 RDI: ff2d3859fba1e698
RBP: ff6c89fe4a8a7dc8 R08: 0000000000000000 R09: c0000000ffff7fff
R10: 0000000000000001 R11: ff6c89fe4a8a7ab0 R12: ff2d3859c3db22b0
R13: ff2d3859c050f800 R14: ff2d3859c0bc4780 R15: ff2d3859c2d6d000
FS: 0000000000000000(0000) GS:ff2d3859fba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc6ecafa040 CR3: 000000012f410004 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
? __warn+0x94/0xe0
? lu_context_key_get+0x70/0x80 [obdclass]
? lu_context_key_get+0x70/0x80 [obdclass]
? report_bug+0xb1/0xe0
? do_error_trap+0x9e/0xd0
? do_invalid_op+0x36/0x40
? lu_context_key_get+0x70/0x80 [obdclass]
? invalid_op+0x14/0x20
? lu_context_key_get+0x70/0x80 [obdclass]
? lu_context_key_get+0x70/0x80 [obdclass]
mdd_close+0x73/0xf00 [mdd]
mdt_mfd_close+0x6e2/0xc10 [mdt]
mdt_obd_disconnect+0x23f/0x820 [mdt]
class_disconnect_export_list+0x21c/0x590 [obdclass]
class_disconnect_stale_exports+0x26f/0x3b0 [obdclass]
? exp_lock_replay_healthy+0x30/0x30 [ptlrpc]
target_recovery_thread+0x62d/0x1250 [ptlrpc]
? srso_alias_return_thunk+0x5/0xfcdfd
? replay_request_or_update.isra.31+0xa90/0xa90 [ptlrpc]
kthread+0x134/0x150
? set_kthread_struct+0x50/0x50
ret_from_fork+0x1f/0x40
---[ end trace 0f85b6467e16cbf4 ]---
Link to test