Match messages in logs (every line must be present in the log output; copy from the "Messages before crash" column below): | |
Match messages in full crash (every line must be present in the crash log output; copy from the "Full Crash" column below): | |
Limit to a test (copy from the "Failing Test" column below): | |
Delete these reports as invalid (e.g. a real bug in the patch under review): | |
Bug or comment: | |
Extra info: | |
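The match fields above treat each pasted line as a required substring: a report matches only if every line appears somewhere in the corresponding log ("Messages before crash" for the log match, "Full Crash" for the crash match). As a rough illustration of that rule only (this is not the triage tool's actual code; the function and variable names below are hypothetical), a minimal sketch:

```python
# Minimal sketch of the "every line must be present" matching rule described
# by the form above. Hypothetical names; not the real triage tool implementation.
def report_matches(log_text: str, required_lines: list[str]) -> bool:
    """True only if every non-empty required line occurs somewhere in log_text."""
    return all(line in log_text for line in required_lines if line.strip())


# Example: match reports on the assertion and LBUG lines from the "Full Crash" column.
required = [
    "ASSERTION( nm->nm_md_stats ) failed",
    "(mdt_lproc.c:1610:mdt_counter_incr()) LBUG",
]
# report_matches(full_crash_text, required) would be True for both reports below.
```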
Failing Test | Full Crash | Messages before crash | Comment |
---|---|---|---|
sanity-sec test 15: test id mapping | LustreError: 189733:0:(mdt_lproc.c:1610:mdt_counter_incr()) ASSERTION( nm->nm_md_stats ) failed: LustreError: 189733:0:(mdt_lproc.c:1610:mdt_counter_incr()) LBUG CPU: 0 PID: 189733 Comm: mdt_out00_002 Kdump: loaded Tainted: G OE ------- --- 5.14.0-427.42.1_lustre.el9.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x34/0x48 lbug_with_loc.cold+0x5/0x43 [libcfs] mdt_counter_incr+0x188/0x190 [mdt] mdt_statfs+0x55b/0x8b0 [mdt] ? tgt_request_preprocess+0x20f/0x4b0 [ptlrpc] tgt_handle_request0+0x14a/0x770 [ptlrpc] tgt_request_handle+0x1eb/0xb80 [ptlrpc] ptlrpc_server_handle_request.isra.0+0x2a0/0xce0 [ptlrpc] ptlrpc_main+0xa7e/0xfa0 [ptlrpc] ? __pfx_ptlrpc_main+0x10/0x10 [ptlrpc] kthread+0xe0/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> | Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.squash_uid Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.squash_gid Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.59652_0.id Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.59652_1.id Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.59652_2.id Lustre: 8843:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746449314/real 1746449314] req@ffff947b4c630d00 x1831277218547968/t0(0) o13->lustre-OST0000-osc-MDT0001@10.240.28.6@tcp:7/4 lens 224/368 e 0 to 1 dl 1746449330 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-0-1.0' uid:0 gid:0 projid:4294967295 Lustre: 8843:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 1 previous similar message Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 10.240.28.6@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 19 previous similar messages Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746449315/real 1746449315] req@ffff947b4420d6c0 x1831277218548608/t0(0) o13->lustre-OST0007-osc-MDT0003@10.240.28.6@tcp:7/4 lens 224/368 e 0 to 1 dl 1746449331 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-7-3.0' uid:0 gid:0 projid:4294967295 Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 11 previous similar messages Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746449316/real 1746449316] req@ffff947b4fdd0340 x1831277218552320/t0(0) o13->lustre-OST0006-osc-MDT0001@10.240.28.6@tcp:7/4 lens 224/368 e 0 to 1 dl 1746449332 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-6-1.0' uid:0 gid:0 projid:4294967295 Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 10 previous similar messages Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746449318/real 1746449318] req@ffff947b4420e080 x1831277218552960/t0(0) o13->lustre-OST0005-osc-MDT0001@10.240.28.6@tcp:7/4 lens 224/368 e 0 to 1 dl 1746449334 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-5-1.0' uid:0 gid:0 projid:4294967295 Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 4 previous similar messages Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746449325/real 1746449325] req@ffff947b3dff0d00 x1831277218560000/t0(0) o400->lustre-OST0006-osc-MDT0003@10.240.28.6@tcp:28/4 lens 224/224 e 0 to 1 dl 1746449341 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 projid:4294967295 Lustre: 8842:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 19 previous similar messages | Link to test |
sanity-sec test 15: test id mapping | LustreError: 186025:0:(mdt_lproc.c:1610:mdt_counter_incr()) ASSERTION( nm->nm_md_stats ) failed: LustreError: 186025:0:(mdt_lproc.c:1610:mdt_counter_incr()) LBUG CPU: 0 PID: 186025 Comm: mdt_out00_000 Kdump: loaded Tainted: G OE ------- --- 5.14.0-503.38.1_lustre.el9.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: <TASK> dump_stack_lvl+0x34/0x48 lbug_with_loc.cold+0x5/0x43 [libcfs] mdt_counter_incr+0x188/0x190 [mdt] mdt_statfs+0x55b/0x8b0 [mdt] ? tgt_request_preprocess+0x20f/0x4b0 [ptlrpc] tgt_handle_request0+0x14a/0x770 [ptlrpc] tgt_request_handle+0x1eb/0xb80 [ptlrpc] ptlrpc_server_handle_request.isra.0+0x2a0/0xce0 [ptlrpc] ptlrpc_main+0xa7b/0xfa0 [ptlrpc] ? __pfx_ptlrpc_main+0x10/0x10 [ptlrpc] kthread+0xe0/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> | Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.squash_uid Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.squash_gid Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.42148_0.id Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.42148_1.id Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.42148_2.id LNet: Host 10.240.40.250 reset our connection while we were sending data; it may have rebooted: rc = -104 Lustre: 8869:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1746449120/real 1746449120] req@ffff992d04022a40 x1831276865482624/t0(0) o400->lustre-OST0000-osc-MDT0001@10.240.40.250@tcp:28/4 lens 224/224 e 0 to 1 dl 1746449136 ref 1 fl Rpc:eXNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 projid:4294967295 Lustre: 8869:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 1 previous similar message Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 10.240.40.250@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 19 previous similar messages Autotest: Killing test framework, node(s) in the cluster crashed (lustre-reviews_review-dne-part-2_113101.30) Autotest: Sleeping to ensure other nodes in the cluster have not crashed (lustre-reviews_review-dne-part-2_113101.30) Lustre: 8869:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746449116/real 1746449116] req@ffff992d081b9d40 x1831276865479296/t0(0) o13->lustre-OST0002-osc-MDT0003@10.240.40.250@tcp:7/4 lens 224/368 e 0 to 1 dl 1746449132 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-2-3.0' uid:0 gid:0 projid:4294967295 Lustre: 8869:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 16 previous similar messages Lustre: 8870:0:(client.c:2445:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1746449117/real 1746449117] req@ffff992d081b8680 x1831276865480704/t0(0) o13->lustre-OST0001-osc-MDT0003@10.240.40.250@tcp:7/4 lens 224/368 e 0 to 1 dl 1746449133 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-1-3.0' uid:0 gid:0 projid:4294967295 Lustre: 8870:0:(client.c:2445:ptlrpc_expire_one_request()) Skipped 27 previous similar messages Autotest: Test running for 130 minutes (lustre-reviews_review-dne-part-2_113101.30) Autotest: trevis-58vm3 crashed during sanity-sec (lustre-reviews_review-dne-part-2_113101.30) Lustre: lustre-MDT0003: haven't heard from client lustre-MDT0003-lwp-OST0000_UUID (at 10.240.40.250@tcp) in 104 seconds. I think it's dead, and I am evicting it. exp ffff992cf7508000, cur 1746449213 deadline 1746449209 last 1746449109 | Link to test |