| Match messages in logs (every line must be present in the log output; copy from the "Messages before crash" column below): | |
| Match messages in full crash (every line must be present in the crash log output; copy from the "Full Crash" column below): | |
| Limit to a test (copy from the "Failing Test" column below): | |
| Delete these reports as invalid (e.g., a real bug in a patch under review): | |
| Bug or comment: | |
| Extra info: | |
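The matching rule described by the form fields above (a report matches only when *every* listed line appears somewhere in the log output) can be sketched as a small shell helper. This is an illustrative sketch only; the `match_all` function and file names are hypothetical and not part of the triage tool itself.

```shell
#!/bin/sh
# match_all PATTERNS LOG
# Succeeds only if every non-empty line of PATTERNS occurs somewhere in LOG.
# Uses fixed-string matching (grep -F), mirroring a copy-paste of log lines.
match_all() {
    while IFS= read -r line; do
        [ -z "$line" ] && continue
        grep -qF -- "$line" "$2" || return 1   # one missing line -> no match
    done < "$1"
    return 0
}

# Demo with temporary files (log lines taken from the table below)
log=$(mktemp); pat=$(mktemp)
printf 'Lustre: Failing over lustre-MDT0000\nLustre: server umount lustre-MDT0000 complete\n' > "$log"

printf 'Failing over lustre-MDT0000\n' > "$pat"
match_all "$pat" "$log" && echo "match"

printf 'a line not present anywhere\n' > "$pat"
match_all "$pat" "$log" || echo "no match"

rm -f "$log" "$pat"
```

Note that each pattern line may match anywhere in the log, independently of the others; ordering is not enforced in this sketch.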
| Failing Test | Full Crash | Messages before crash | Comment |
|---|---|---|---|
| sanity-lfsck test 4: FID-in-dirent can be rebuilt after MDT file-level backup/restore | LustreError: 283349:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 283349:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 283349 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7ff45098adbe | Autotest: Test running for 70 minutes (lustre-reviews_review-dne-part-2_122950.27) Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: tar zcf /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt/ . Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: [ -e /dev/mapper/mds1_flakey ] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=100000 --mkfsoptions="-b 4096" --reformat /dev/mapper/mds1_flakey LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. 
Opts: (null) Lustre: DEBUG MARKER: tar zxfp /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/OBJECTS/* /mnt/lustre-brpt/CATALOGS Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey lustre-MDT0000 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: reset Object Index mappings | Link to test |
| sanity-pfl test 9: Replay layout extend object instantiation | LustreError: 53522:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 53522:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 53522 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? vsnprintf+0x340/0x520 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? xa_find_after+0xe9/0x110 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fbe8fec0dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 52650:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 52939:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 19537:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.120@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 19537:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 25 previous similar messages LustreError: 24160:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.118@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 24160:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.119@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 24160:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 6 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.22.121@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| runtests test 1: All Runtests | LustreError: 28992:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 28992:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 28992 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0x22e/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f616e675dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl mark touching \/mnt\/lustre at Fri Mar 27 09:31:20 UTC 2026 \(@1774603880\) Lustre: DEBUG MARKER: touching /mnt/lustre at Fri Mar 27 09:31:20 UTC 2026 (@1774603880) Lustre: DEBUG MARKER: /usr/sbin/lctl mark create an empty file \/mnt\/lustre\/hosts.15694 Lustre: DEBUG MARKER: create an empty file /mnt/lustre/hosts.15694 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15694 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15694 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing \/etc\/hosts and \/mnt\/lustre\/hosts.15694 Lustre: DEBUG MARKER: comparing /etc/hosts and /mnt/lustre/hosts.15694 Lustre: DEBUG MARKER: /usr/sbin/lctl mark renaming \/mnt\/lustre\/hosts.15694 to \/mnt\/lustre\/hosts.15694.ren Lustre: DEBUG MARKER: renaming /mnt/lustre/hosts.15694 to /mnt/lustre/hosts.15694.ren Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15694 again Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15694 again Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15694 Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15694 Lustre: DEBUG MARKER: /usr/sbin/lctl mark removing \/mnt\/lustre\/hosts.15694 Lustre: DEBUG MARKER: removing /mnt/lustre/hosts.15694 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15694.2 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15694.2 Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15694.2 to 123 bytes Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15694.2 to 123 bytes Lustre: DEBUG MARKER: /usr/sbin/lctl mark creating \/mnt\/lustre\/d1.runtests Lustre: DEBUG MARKER: creating /mnt/lustre/d1.runtests Lustre: 
DEBUG MARKER: /usr/sbin/lctl mark copying 1000 files from \/etc, \/usr\/bin to \/mnt\/lustre\/d1.runtests\/etc, \/mnt\/lustre\/d1.runtests\/usr\/bin at Fri Mar 27 09:31:26 UTC 2026 Lustre: DEBUG MARKER: copying 1000 files from /etc, /usr/bin to /mnt/lustre/d1.runtests/etc, /mnt/lustre/d1.runtests/usr/bin at Fri Mar 27 09:31:26 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing 1000 newly copied files at Fri Mar 27 09:31:33 UTC 2026 Lustre: DEBUG MARKER: comparing 1000 newly copied files at Fri Mar 27 09:31:33 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark running createmany -d \/mnt\/lustre\/d1.runtests\/d 1000 Lustre: DEBUG MARKER: running createmany -d /mnt/lustre/d1.runtests/d 1000 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n debug Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=ha Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=super+ioctl+neterror+warning+dlmtrace+error+emerg+ha+rpctrace+vfstrace+config+console+lfsck Lustre: DEBUG MARKER: /usr/sbin/lctl mark finished at Fri Mar 27 09:31:40 UTC 2026 \(20\) Lustre: DEBUG MARKER: finished at Fri Mar 27 09:31:40 UTC 2026 (20) Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000: Not available for connect from 10.240.22.120@tcp (stopping) Lustre: Skipped 6 previous similar messages LustreError: 27566:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 9499:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.22.122@tcp arrived at 1774603930 with bad export cookie 11349884235830644711 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.22.122@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message LustreError: 8357:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8357:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 15 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 8072:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1774603938 with bad export cookie 11349884235830644494 LustreError: 8072:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: MGC10.240.22.121@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 23384:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.120@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23384:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: lustre-MDT0002: Not available for connect from 10.240.22.120@tcp (stopping) Lustre: Skipped 6 previous similar messages LustreError: 27958:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 27958:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Autotest: Test running for 5 minutes (lustre-reviews_review-dne-zfs-part-2_122950.36) Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 29016:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 29016:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-dual test 0a: expired recovery with lost client | LustreError: 32600:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 32600:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 32600 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f836b4bfdbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 31726:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 32016:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 23733:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.122@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23733:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 23 previous similar messages LustreError: 8369:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.120@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8369:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages LustreError: 8369:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.118@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8369:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: 8375:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.122@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8375:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.22.121@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lnet test 226: test missing route for 1 of 2 routers | LustreError: 274624:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 274624:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 274624 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0x22e/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f86dc4c3dbe | Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod Key type lgssc unregistered LNet: 230954:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 230954:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.121@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark 
onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 233422:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 233422:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm65.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 
large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm65.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.121@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.120@tcp1 LNet: 236746:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 234980:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.120@tcp is being used as a gateway but routing feature is not turned on LNetError: 234980:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.120@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.119@tcp1 LNetError: 234980:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.120@tcp is being used as a gateway but routing feature is not turned on LNetError: 234980:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 234980:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.120@tcp is being used as a gateway but routing feature is not turned on LNetError: 234980:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNetError: 
234980:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.120@tcp has gone from down to up LNetError: 234980:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.22.118@tcp Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.119@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.119@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.119@tcp1; e LNet: 236456:0:(lib-move.c:2247:lnet_handle_find_routed_path()) No peer NI for gateway 10.240.22.119@tcp1. Attempting to find an alternative route. Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.120@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.120@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.120@tcp1; e Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.119@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.119@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.119@tcp1; e Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 237626:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 237626:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark 
onyx-157vm67.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.121@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 240406:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 240406:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm65.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 
small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: onyx-157vm65.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: 
onyx-157vm68.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.121@tcp1 [8/256/0/180] 
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.119@tcp1 LNet: 243538:0:(router.c:718:lnet_add_route()) Consider turning discovery on to enable full Multi-Rail routing functionality LNet: 243538:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 241964:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241964:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.119@tcp1 has gone from up to down LNetError: 241964:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241964:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241964:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.119@tcp1 has gone from down to up Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.22.118@tcp Autotest: Test running for 40 minutes (lustre-reviews_review-dne-zfs-part-2_122950.36) Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.119@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.119@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.119@tcp1; e Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod 
LNet: 244177:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 244177:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.121@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 
246958:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 246958:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm65.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm65.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: 
DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark 
onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.121@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.120@tcp1 LNet: 250282:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 248516:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.120@tcp is being used as a gateway but routing feature is not turned on LNetError: 248516:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.120@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.119@tcp1 LNetError: 248516:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp is being used as a gateway but routing feature is not turned on LNetError: 248516:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 248516:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp is being used as a gateway but routing feature is not turned on LNetError: 248516:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNetError: 248516:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.120@tcp has gone from down to up LNetError: 248516:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.22.118@tcp Lustre: DEBUG 
MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm65.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm65.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 251648:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251648:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251648:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251648:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251648:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251648:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251648:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.120@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.120@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.120@tcp1; e Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 
10.240.22.119@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.119@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.119@tcp1; e Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 252169:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 252169:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 252169:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.121@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG 
MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 261051:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 261051:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm65.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm65.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 
router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.121@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.120@tcp1 LNet: 264375:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.120@tcp is being used as a gateway but routing feature is not turned on LNetError: 262609:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.120@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.119@tcp1 LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.120@tcp is being used as a gateway but routing feature is not turned on LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 262609:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.119@tcp has gone from down to up LNetError: 262609:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 
262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.120@tcp is being used as a gateway but routing feature is not turned on LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.22.118@tcp Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm65.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm65.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 265758:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265758:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265758:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265758:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265758:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265758:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 
265758:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: echo 5 > /sys/module/lnet/parameters/alive_router_check_interval Lustre: DEBUG MARKER: echo 5 > /sys/module/lnet/parameters/router_ping_timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Disable routing on onyx-157vm66 Lustre: DEBUG MARKER: Disable routing on onyx-157vm66 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait for lst to finish Lustre: DEBUG MARKER: Wait for lst to finish LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp is being used as a gateway but routing feature is not turned on LNetError: 262609:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.119@tcp has gone from up to down LNetError: 262609:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp is being used as a gateway but routing feature is not turned on Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: /usr/sbin/lctl mark Enable routing on onyx-157vm66 Lustre: DEBUG MARKER: Enable routing on onyx-157vm66 LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.119@tcp is being used as a gateway but routing feature is not turned on LNetError: 262609:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait for lst to finish Lustre: DEBUG MARKER: Wait for lst to finish LNet: 1 peer NIs in recovery (showing 1): 10.240.22.119@tcp1 Lustre: DEBUG MARKER: lst stop brw_rw Lustre: 
DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.120@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.120@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.120@tcp1; e LNetError: 262609:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.119@tcp has gone from down to up Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.119@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.119@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.119@tcp1; e Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 267220:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 267220:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 267220:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Autotest: Test running for 45 minutes (lustre-reviews_review-dne-zfs-part-2_122950.36) Key type ._llcrypt registered Key type .llcrypt registered libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: Lustre: Build Version: 2.16.59_76_g8d55e38 LNet: Added LNI 10.240.22.121@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lfsck test 1a: LFSCK can find out and repair crashed FID-in-dirent | LustreError: 29742:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 29742:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 29742 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fd338a25dbe | Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 29766:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 29766:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-sec test 27a: test fileset in various nodemaps | LustreError: 244695:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 244695:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 244695 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7efd528b0dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_activate 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.admin_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: 
/usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_set_fileset --name default --fileset /thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset | awk '/primary/ { if ($3 == "/thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default") print $3 }' Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.22.122@tcp (stopping) LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 240403:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 12293:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.22.122@tcp arrived at 1774608493 with bad export cookie 6862561024446920934 LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.22.122@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.22.122@tcp) was lost; in progress operations using this service will wait for recovery to complete LustreError: 15860:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 15860:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 14 previous similar messages LustreError: 15860:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.122@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 15860:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 12294:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1774608501 with bad export cookie 6862561024446920717 LustreError: 12294:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: MGC10.240.22.121@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0002: Not available for connect from 10.240.22.122@tcp (stopping) Lustre: Skipped 2 previous similar messages LustreError: 12590:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.122@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
Lustre: lustre-MDT0002: Not available for connect from 10.240.22.122@tcp (stopping) LustreError: 12590:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 9 previous similar messages LustreError: 240795:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 240795:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm69.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm69.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing unload_modules_local LNet: 242509:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 242509:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.121@tcp Key type .llcrypt 
unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm66.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm67.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm69.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm66.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm67.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm69.onyx.whamcloud.com: executing load_modules_local libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: Lustre: Build Version: 2.16.59_76_g8d55e38 LNet: Added LNI 10.240.22.121@tcp [8/256/0/180] Key type lgssc registered Lustre: Echo OBD driver; http://www.lustre.org/ Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-hsm test 26e: RAoLU with a non-started coordinator | LustreError: 63699:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 63699:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 63699 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fac12e4adbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x47:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x47:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x48:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x48:0x0'.*action='ARCHIVE'/ {print $6}' | cut -f3 -d/ Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0000.hsm_control='shutdown' Lustre: Modifying parameter lustre.mdt.lustre-MDT0000.hsm_control=shutdown in log params Lustre: Skipped 3 previous similar messages Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0001.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0002.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0003.hsm_control='shutdown' Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 63114:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 38251:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.167@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 38251:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 22 previous similar messages LustreError: 40029:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.168@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 40029:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 9509:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9509:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 13 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.170@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 23407:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23407:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23407 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f06c4670dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 22824:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 11332:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.168@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11332:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 18 previous similar messages LustreError: 8375:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.169@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8375:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 13 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.170@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 101817:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 101817:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 101817 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f7cf51f3dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.31.171@tcp (stopping) LustreError: 101234:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8340:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8340:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 18 previous similar messages LustreError: 8341:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.169@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8341:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages LustreError: 55030:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.171@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 55030:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 8340:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.169@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8340:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages LustreError: MGC10.240.31.170@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| Module load | LustreError: 19038:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 19038:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 19038 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0x22e/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f592daf7dbe | Lustre: Lustre: Build Version: 2.16.59_76_g8d55e38 LNet: Added LNI 10.240.31.170@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall=/usr/sbin/l_getidentity in log lustre-MDT0000 Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space. Lustre: lustre-MDT0000: new disk, initializing Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o value 
lustre:svname lustre-mdt1/mdt1 2>/dev/null Lustre: 8356:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0001/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000240000400-0x0000000280000400]:1:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm167.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm167.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3; mount -t lustre -o localrecov lustre-mdt3/mdt3 /mnt/lustre-mds3 Lustre: 8356:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0002/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: srv-lustre-MDT0002: No data found on store. Initialize space. 
Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0002: new disk, initializing Lustre: lustre-MDT0002: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400]:2:mdt Lustre: cli-ctl-lustre-MDT0002: Allocated super-sequence [0x0000000280000400-0x00000002c0000400]:2:mdt] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm166.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null Lustre: 8356:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0003/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:3:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm167.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace 
dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm167.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P debug_raw_pointers=Y Lustre: Modifying parameter general.debug_raw_pointers=Y in log params Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000300000400-0x0000000340000400]:0:ost Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x100000000 to 0x300000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x100010000 to 0x340000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000380000400-0x00000003c0000400]:2:ost Lustre: Skipped 1 previous similar message Lustre: lustre-OST0002-osc-MDT0000: update sequence from 0x100020000 to 0x380000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 
Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0003-osc-MDT0000: update sequence from 0x100030000 to 0x3c0000403 Lustre: lustre-OST0004-osc-MDT0000: update sequence from 0x100040000 to 0x400000401 Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0005-osc-MDT0000: update sequence from 0x100050000 to 0x440000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000480000400-0x00000004c0000400]:6:ost Lustre: Skipped 3 previous similar messages Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0006-osc-MDT0000: update sequence from 0x100060000 to 0x480000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm165.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0007-osc-MDT0000: update sequence from 
0x100070000 to 0x4c0000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm164.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm164.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: lctl get_param -n timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20 Lustre: DEBUG MARKER: Using TIMEOUT=20 Lustre: DEBUG MARKER: [ -f /sys/module/mgc/parameters/mgc_requeue_timeout_min ] && echo 1 > /sys/module/mgc/parameters/mgc_requeue_timeout_min; exit 0 Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.sys.jobid_var='procname_uid' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P lod.*.mdt_hash=crush Lustre: Setting parameter general.lod.*.mdt_hash=crush in log params Lustre: DEBUG MARKER: sysctl --values kernel/kptr_restrict Lustre: DEBUG MARKER: sysctl -wq kernel/kptr_restrict=1 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Client: 2.16.59.76 Lustre: DEBUG MARKER: Client: 2.16.59.76 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl mark MDS: 2.16.59.76 Lustre: DEBUG MARKER: MDS: 2.16.59.76 Lustre: DEBUG MARKER: /usr/sbin/lctl mark OSS: 2.16.59.76 Lustre: DEBUG MARKER: OSS: 2.16.59.76 Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: mmp ============----- Fri Mar 27 09:25:15 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: mmp ============----- Fri Mar 27 09:25:15 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat 
/etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/mmp.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/mmp.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.31.169@tcp (stopping) Lustre: Skipped 7 previous similar messages LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 2 previous similar messages Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 16123:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8066:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.31.171@tcp arrived at 1774603529 with bad export cookie 17005118640773728226 LustreError: 8066:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 2 previous similar messages LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.31.171@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.31.171@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 13994:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1774603537 with bad export cookie 17005118640773728009 LustreError: 13994:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 2 previous similar messages LustreError: MGC10.240.31.170@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 16515:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 16515:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl mark SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: replay-ost-single ============----- Fri Mar 27 09:26:09 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: replay-ost-single ============----- Fri Mar 27 09:26:09 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/replay-ost-single.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/replay-ost-single.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: /usr/sbin/lctl mark skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: /usr/sbin/lctl mark === replay-ost-single: start setup 09:26:15 \(1774603575\) === Lustre: DEBUG MARKER: === replay-ost-single: start setup 09:26:15 (1774603575) === Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 
LustreError: 19062:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 19062:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 15 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 142473:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 142473:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 142473 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f23f3c4adbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp LustreError: 141599:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. 
LustreError: 141599:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 LustreError: Skipped 1 previous similar message Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) LustreError: 141794:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.170@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 39865:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 39865:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 39865 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9294aaddbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 39284:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 31306:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 31306:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 31306 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f115bfe8dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 30724:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 119269:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 119269:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 119269 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fb4e4018dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 LustreError: 118568:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. LustreError: 118568:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 5 previous similar messages Lustre: Failing over lustre-MDT0000 LustreError: 118592:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-single test 0a: empty replay | LustreError: 15128:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 15128:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 15128 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f0e7314bdbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 14258:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 14547:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lsnapshot test 1b: mount snapshot without original filesystem mounted | LustreError: 22930:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 22930:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 22930 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f1fecad7dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot create -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.39.138@tcp (stopping) Lustre: Skipped 1 previous similar message LustreError: 20263:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup LustreError: 20263:0:(obd_class.h:479:obd_check_dev()) Skipped 5 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: 2388f2ac-MDT0000: set dev_rdonly on this device Lustre: 2388f2ac-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot umount -F lustre -n lss_1b_0 Lustre: Failing over 2388f2ac-MDT0000 LustreError: 22022:0:(obd_class.h:479:obd_check_dev()) Device 9 not setup LustreError: 22022:0:(obd_class.h:479:obd_check_dev()) Skipped 7 previous similar messages Lustre: server umount 2388f2ac-MDT0000 complete Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod 
| grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-scrub test 1c: Auto detect kinds of OI file(s) removed/recreated cases | LustreError: 217557:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 217557:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 217557 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9218b62dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 215319:0:(obd_class.h:479:obd_check_dev()) Device 14 not setup LustreError: 215319:0:(obd_class.h:479:obd_check_dev()) Skipped 117 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds3 LustreError: MGC10.240.23.222@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: Skipped 3 previous similar messages Lustre: Failing over lustre-MDT0002 Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: test -b /dev/mapper/mds3_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds3_flakey /mnt/lustre-brpt LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. 
Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub,notcu /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: invalid oi count 63, remove them, then set it to 64 | Link to test |
| sanityn test 33c: Cancel cross-MDT lock should trigger Sync-on-Lock-Cancel | LustreError: 43258:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 43258:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 43258 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? ptlrpc_set_import_discon+0x50a/0x870 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f903ae4adbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 42577:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.47.86@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| recovery-small test 23: client hang when close a file after mds crash | LustreError: 68698:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 68698:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 68698 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f2415845dbe | Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0x123 Lustre: *** cfs_fail_loc=123, val=2147483648*** Lustre: Skipped 4 previous similar messages Lustre: DEBUG MARKER: lctl set_param fail_loc=0 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 68115:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8332:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.47.85@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8332:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 25 previous similar messages LustreError: 37260:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.47.113@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 37260:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages LustreError: 8332:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.47.79@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8332:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: 26351:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.47.85@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 26351:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 9 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.47.86@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-single test 0a: empty replay | LustreError: 23784:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23784:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23784 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f69ee36edbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 22912:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 23201:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 11629:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.142@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11629:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 19 previous similar messages LustreError: 11629:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.143@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 11629:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages LustreError: 11629:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.141@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11629:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 2 previous similar messages LustreError: 8363:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.142@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8363:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.28.144@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| ost-pools test 25: Create new pool and restart MDS | LustreError: 176853:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 176853:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 176853 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fbfa3023dbe | Lustre: DEBUG MARKER: lctl pool_new lustre.testpool1 Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1 2>/dev/null || echo foo Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0002-mdtlov.pools.testpool1 2>/dev/null || echo foo Lustre: DEBUG MARKER: lctl pool_add lustre.testpool1 OST0000; sync Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.28.145@tcp (stopping) LustreError: 176266:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 12238:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.142@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 12238:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 18 previous similar messages LustreError: 8353:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.143@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8353:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages LustreError: 9495:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.145@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9495:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 6 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.28.144@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity test 60a: llog_test run from kernel module and test llog_reader | LustreError: 202520:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 202520:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 202520 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9d49720dbe | Lustre: DEBUG MARKER: ! 
which run-llog.sh &> /dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl mark test_60 run 27592 - from kernel mode Lustre: DEBUG MARKER: test_60 run 27592 - from kernel mode Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /dev/null Lustre: DEBUG MARKER: bash run-llog.sh Lustre: 200264:0:(llog_test.c:2454:llog_test_device_init()) Setup llog-test device over MGS device Lustre: 200264:0:(llog_test.c:98:llog_test_1()) 1a: create a log with name: 1b2c4be2 Lustre: 200264:0:(llog_test.c:115:llog_test_1()) 1b: close newly-created log Lustre: 200264:0:(llog_test.c:146:llog_test_2()) 2a: re-open a log with name: 1b2c4be2 Lustre: 200264:0:(llog_test.c:166:llog_test_2()) 2b: create a log without specified NAME & LOGID Lustre: 200264:0:(llog_test.c:184:llog_test_2()) 2b: write 1 llog records, check llh_count Lustre: 200264:0:(llog_test.c:197:llog_test_2()) 2c: re-open the log by LOGID and verify llh_count Lustre: 200264:0:(llog_test.c:244:llog_test_2()) 2d: destroy this log Lustre: 200264:0:(llog_test.c:404:llog_test_3()) 3a: write 1023 fixed-size llog records Lustre: 200264:0:(llog_test.c:368:llog_test3_process()) test3: processing records from index 501 to the end Lustre: 200264:0:(llog_test.c:378:llog_test3_process()) test3: total 525 records processed with 0 paddings Lustre: 200264:0:(llog_test.c:460:llog_test_3()) 3b: write 566 variable size llog records Lustre: 200264:0:(llog_test.c:532:llog_test_3()) 3c: write records with variable size until BITMAP_SIZE, return -ENOSPC Lustre: 200264:0:(llog_test.c:555:llog_test_3()) 3c: wrote 63962 more records before end of llog is reached Lustre: 200264:0:(llog_test.c:584:llog_test_4()) 4a: create a catalog log with name: 1b2c4be3 Lustre: 200264:0:(llog_test.c:599:llog_test_4()) 4b: write 1 record into the catalog Lustre: 200264:0:(llog_test.c:626:llog_test_4()) 4c: cancel 1 log record Lustre: 200264:0:(llog_test.c:638:llog_test_4()) 4d: write 64767 more log records Lustre: 200264:0:(llog_test.c:654:llog_test_4()) 4e: add 5 large records, one 
record per block Lustre: 200264:0:(llog_test.c:674:llog_test_4()) 4f: put newly-created catalog Lustre: 200264:0:(llog_test.c:773:llog_test_5()) 5a: re-open catalog by id Lustre: 200264:0:(llog_test.c:786:llog_test_5()) 5b: print the catalog entries.. we expect 2 Lustre: 200270:0:(llog_test.c:703:cat_print_cb()) seeing record at index 1 - [0x1:0x1b:0x0] in log [0xa:0x14:0x0] Lustre: 200264:0:(llog_test.c:798:llog_test_5()) 5c: Cancel 64767 records, see one log zapped Lustre: 200264:0:(llog_test.c:806:llog_test_5()) 5c: print the catalog entries.. we expect 1 Lustre: 200271:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x1c:0x0] in log [0xa:0x14:0x0] Lustre: 200271:0:(llog_test.c:703:cat_print_cb()) Skipped 1 previous similar message Lustre: 200264:0:(llog_test.c:818:llog_test_5()) 5d: add 1 record to the log with many canceled empty pages Lustre: 200264:0:(llog_test.c:826:llog_test_5()) 5e: print plain log entries.. expect 6 Lustre: 200264:0:(llog_test.c:838:llog_test_5()) 5f: print plain log entries reversely.. 
expect 6 Lustre: 200264:0:(llog_test.c:852:llog_test_5()) 5g: close re-opened catalog Lustre: 200264:0:(llog_test.c:882:llog_test_6()) 6a: re-open log 1b2c4be2 using client API Lustre: MGS: non-config logname received: 1b2c4be2 Lustre: 200264:0:(llog_test.c:914:llog_test_6()) 6b: process log 1b2c4be2 using client API Lustre: 200264:0:(llog_test.c:918:llog_test_6()) 6b: processed 63962 records Lustre: 200264:0:(llog_test.c:925:llog_test_6()) 6c: process log 1b2c4be2 reversely using client API Lustre: 200264:0:(llog_test.c:929:llog_test_6()) 6c: processed 63962 records Lustre: 200264:0:(llog_test.c:1077:llog_test_7()) 7a: test llog_logid_rec Lustre: 200264:0:(llog_test.c:1088:llog_test_7()) 7b: test llog_unlink64_rec Lustre: 200264:0:(llog_test.c:1099:llog_test_7()) 7c: test llog_setattr64_rec Lustre: 200264:0:(llog_test.c:1110:llog_test_7()) 7d: test llog_size_change_rec Lustre: 200264:0:(llog_test.c:1121:llog_test_7()) 7e: test llog_changelog_rec Lustre: 200264:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64071 from 64767 Lustre: 200264:0:(llog_test.c:1133:llog_test_7()) 7f: test llog_changelog_user_rec2 Lustre: 200264:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64139 from 64767 Lustre: 200264:0:(llog_test.c:1144:llog_test_7()) 7g: test llog_gen_rec Lustre: 200264:0:(llog_test.c:1155:llog_test_7()) 7h: test llog_setattr64_rec_v2 Lustre: 200264:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64071 from 64767 Lustre: 200264:0:(llog_test.c:1264:llog_test_8()) 8a: fill the first plain llog Lustre: 200264:0:(llog_test.c:1293:llog_test_8()) 8b: first llog [0x1:0x28:0x0] Lustre: 200264:0:(llog_test.c:1309:llog_test_8()) 8b: fill the second plain llog Lustre: 200264:0:(llog_test.c:1333:llog_test_8()) 8b: pin llog [0x1:0x2a:0x0] Lustre: 200264:0:(llog_test.c:1336:llog_test_8()) 8b: clean first llog record in catalog Lustre: 200264:0:(llog_test.c:1347:llog_test_8()) 8c: 
corrupt first chunk in the middle Lustre: 200264:0:(llog_test.c:1350:llog_test_8()) 8c: corrupt second chunk at start Lustre: 200264:0:(llog_test.c:1353:llog_test_8()) 8d: count survived records Lustre: 200264:0:(llog_test.c:1383:llog_test_8()) 8d: close re-opened catalog Lustre: 200264:0:(llog_test.c:1444:llog_test_9()) 9a: test llog_logid_rec Lustre: 200264:0:(llog_test.c:1428:llog_test_9_sub()) 9_sub: record type 1064553b in log 0x1:0x2c:0x0 Lustre: 200264:0:(llog_test.c:1455:llog_test_9()) 9b: test llog_obd_cfg_rec Lustre: 200264:0:(llog_test.c:1466:llog_test_9()) 9c: test llog_changelog_rec Lustre: 200264:0:(llog_test.c:1478:llog_test_9()) 9d: test llog_changelog_user_rec2 Lustre: 200264:0:(llog_test.c:1579:llog_test_10()) 10a: create a catalog log with name: 1b2c4be4 Lustre: 200264:0:(llog_test.c:1609:llog_test_10()) 10b: write 64767 log records Lustre: 200264:0:(llog_test.c:1635:llog_test_10()) 10c: write 129534 more log records Lustre: 200264:0:(llog_test.c:1667:llog_test_10()) 10c: write 64767 more log records Lustre: 200264:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 1b2c4be4 LustreError: 200264:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 Lustre: 200264:0:(llog_test.c:1694:llog_test_10()) 10c: wrote 64011 records then 756 failed with ENOSPC Lustre: 200264:0:(llog_test.c:1713:llog_test_10()) 10d: Cancel 64767 records, see one log zapped Lustre: 200264:0:(llog_test.c:1727:llog_test_10()) 10d: print the catalog entries.. 
we expect 3 Lustre: 200283:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x31:0x0] in log [0xa:0x15:0x0] Lustre: 200264:0:(llog_test.c:1757:llog_test_10()) 10e: write 64767 more log records Lustre: 200264:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 1b2c4be4 Lustre: 200264:0:(llog_cat.c:80:llog_cat_new_log()) Skipped 755 previous similar messages LustreError: 200264:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 LustreError: 200264:0:(llog_cat.c:583:llog_cat_add_rec()) Skipped 755 previous similar messages Lustre: 200264:0:(llog_test.c:1784:llog_test_10()) 10e: wrote 64578 records then 189 failed with ENOSPC Lustre: 200264:0:(llog_test.c:1786:llog_test_10()) 10e: print the catalog entries.. we expect 4 Lustre: 200264:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200264:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x31:0x0] in log [0xa:0x15:0x0] Lustre: 200264:0:(llog_test.c:703:cat_print_cb()) Skipped 2 previous similar messages Lustre: 200264:0:(llog_test.c:1823:llog_test_10()) 10e: catalog successfully wrap around, last_idx 1, first 1 Lustre: 200264:0:(llog_test.c:1840:llog_test_10()) 10f: Cancel 64767 records, see one log zapped Lustre: 200264:0:(llog_test.c:1854:llog_test_10()) 10f: print the catalog entries.. 
we expect 3 Lustre: 200264:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200264:0:(llog_cat.c:971:llog_cat_process_or_fork()) Skipped 1 previous similar message Lustre: 200264:0:(llog_test.c:1885:llog_test_10()) 10f: write 64767 more log records Lustre: 200264:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 1b2c4be4 Lustre: 200264:0:(llog_cat.c:80:llog_cat_new_log()) Skipped 188 previous similar messages LustreError: 200264:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 LustreError: 200264:0:(llog_cat.c:583:llog_cat_add_rec()) Skipped 188 previous similar messages Lustre: 200264:0:(llog_test.c:1912:llog_test_10()) 10f: wrote 64578 records then 189 failed with ENOSPC Lustre: 200264:0:(llog_test.c:1959:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200264:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200264:0:(llog_test.c:1971:llog_test_10()) 10g: print the catalog entries.. we expect 3 Lustre: 200264:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200264:0:(llog_test.c:703:cat_print_cb()) seeing record at index 4 - [0x1:0x33:0x0] in log [0xa:0x15:0x0] Lustre: 200264:0:(llog_test.c:703:cat_print_cb()) Skipped 6 previous similar messages Lustre: 200264:0:(llog_test.c:2001:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200264:0:(llog_test.c:2015:llog_test_10()) 10g: print the catalog entries.. we expect 2 Lustre: 200264:0:(llog_test.c:2053:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200264:0:(llog_test.c:2067:llog_test_10()) 10g: print the catalog entries.. we expect 1 Lustre: 200264:0:(llog_test.c:2093:llog_test_10()) 10g: llh_cat_idx has also successfully wrapped! 
Lustre: 200286:0:(llog_test.c:1538:cat_check_old_cb()) seeing record at index 2 - [0x1:0x35:0x0] in log [0xa:0x15:0x0] Lustre: 200264:0:(llog_test.c:2117:llog_test_10()) 10h: write 64767 more log records LustreError: 200264:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_race id 1317 sleeping LustreError: 200286:0:(llog.c:682:llog_process_thread()) cfs_fail_race id 1317 waking LustreError: 200264:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_fail_race id 1317 awake: rc=4442 LustreError: 200286:0:(llog.c:682:llog_process_thread()) cfs_fail_race id 1317 waking Lustre: 200286:0:(llog_test.c:1538:cat_check_old_cb()) seeing record at index 3 - [0x1:0x36:0x0] in log [0xa:0x15:0x0] LustreError: 200264:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_fail_race id 1317 waking Lustre: 200264:0:(llog_test.c:2144:llog_test_10()) 10h: wrote 64767 records then 0 failed with ENOSPC Lustre: 200264:0:(llog_test.c:2157:llog_test_10()) 10: put newly-created catalog Lustre: 200264:0:(llog_test.c:2187:llog_test_11()) 11: create a plain nameless log Lustre: 200264:0:(llog_test.c:2212:llog_test_11()) 11: size 8216 in 1 blocks after 1 rec Lustre: 200264:0:(llog_test.c:2214:llog_test_11()) 11: add few records Lustre: 200264:0:(llog_test.c:2231:llog_test_11()) 11: size 10616 in 1 blocks with few recs lctl (200264): drop_caches: 3 Lustre: 200264:0:(llog_test.c:2256:llog_test_11()) 11: re-open the log by LOGID and verify llh_count Lustre: 200264:0:(llog_test.c:2271:llog_test_11()) 11: size 10616 in 1 blocks after re-open Lustre: DEBUG MARKER: /usr/sbin/lctl dk Lustre: DEBUG MARKER: which llog_reader 2> /dev/null Lustre: DEBUG MARKER: ls -d /usr/sbin/llog_reader Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 200774:0:(client.c:1370:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ff1dcca9fe812700 x1860805099863296/t0(0) o1000->lustre-MDT0001-osp-MDT0000@10.240.23.233@tcp:24/4 lens 
304/4320 e 0 to 0 dl 0 ref 2 fl Rpc:QU/200/ffffffff rc 0/-1 job:'umount.0' uid:0 gid:0 projid:4294967295 LustreError: 200774:0:(osp_object.c:617:osp_attr_get()) lustre-MDT0001-osp-MDT0000: osp_attr_get update error [0x20000000a:0x1:0x0]: rc = -5 LustreError: 200774:0:(llog_cat.c:443:llog_cat_close()) lustre-MDT0001-osp-MDT0000: failure destroying log during cleanup: rc = -5 Lustre: lustre-MDT0000: Not available for connect from 10.240.23.233@tcp (stopping) LustreError: 200774:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup LustreError: 200774:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value canmount lustre-mdt1/mdt1 LustreError: 8376:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.23.233@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8376:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 10 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value mountpoint lustre-mdt1/mdt1 LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 LustreError: Skipped 7 previous similar messages Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 13 previous similar messages Lustre: DEBUG MARKER: zfs set canmount=noauto lustre-mdt1/mdt1 Lustre: DEBUG MARKER: zfs set mountpoint=legacy lustre-mdt1/mdt1 LustreError: 91629:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.23.229@tcp (no target). 
If you are running an HA pair check that the target is mounted on the other server. LustreError: 91629:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 14 previous similar messages Lustre: DEBUG MARKER: mount -t zfs lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.23.232@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-pcc test 1c: Test automated attach using Project ID with manual HSM restore | LustreError: 23873:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23873:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23873 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fae0ed89dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: zpool get all Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 2 previous similar messages Lustre: lustre-MDT0000: Not available for connect from 10.240.39.247@tcp (stopping) Lustre: Skipped 1 previous similar message LustreError: 22594:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && LustreError: 8049:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.39.249@tcp arrived at 1774603487 with bad export cookie 412177843531655982 LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.39.249@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.39.249@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 9488:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 
1774603494 with bad export cookie 412177843531655765 LustreError: 9488:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: MGC10.240.39.248@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 22887:0:(obd_class.h:479:obd_check_dev()) Device 33 not setup LustreError: 22887:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: zpool set feature@project_quota=enabled lustre-mdt1 Lustre: DEBUG MARKER: zpool set feature@project_quota=enabled lustre-mdt3 Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 23897:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23897:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 15 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-scrub test 0: Do not auto trigger OI scrub for non-backup/restore case | LustreError: 27649:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 27649:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 27649 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0x22e/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f83882eddbe | Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.39.249@tcp (stopping) Lustre: Skipped 9 previous similar messages LustreError: 26524:0:(obd_class.h:479:obd_check_dev()) Device 27 not setup LustreError: 26524:0:(obd_class.h:479:obd_check_dev()) Skipped 25 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 20812:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.39.249@tcp arrived at 1774604794 with bad export cookie 194227385246588374 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds3 LustreError: MGC10.240.39.248@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: Failing over lustre-MDT0002 LustreError: 26917:0:(obd_class.h:479:obd_check_dev()) Device 26 not setup LustreError: 26917:0:(obd_class.h:479:obd_check_dev()) Skipped 15 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt3 >/dev/null 2>&1 || Autotest: Test running for 5 minutes (lustre-reviews_review-dne-zfs-part-7_122950.41) Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 27673:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| large-scale test 3a: recovery time, 2 clients | LustreError: 23467:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23467:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23467 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7ff29012adbe | Lustre: DEBUG MARKER: /usr/sbin/lctl mark 1 : Starting failover on mds1 Lustre: DEBUG MARKER: 1 : Starting failover on mds1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.39.249@tcp (stopping) LustreError: 22882:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 12439:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 12439:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages LustreError: 8330:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.246@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8330:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 12439:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.247@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 12439:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages LustreError: 9474:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9474:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 6 previous similar messages LustreError: 9474:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.249@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 9474:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 13 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.39.248@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| conf-sanity test 0: single mount setup | LustreError: 32905:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 32905:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 32905 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f750013bdbe | Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 32929:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 32929:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lfsck test 4: FID-in-dirent can be rebuilt after MDT file-level backup/restore | LustreError: 287437:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 287437:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 287437 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f223f9ccdbe | Lustre: 256268:0:(osd_internal.h:1370:osd_trans_exec_op()) lustre-MDT0000: opcode 2: before 257 < left 320, rollback = 2 Lustre: 256268:0:(osd_internal.h:1370:osd_trans_exec_op()) Skipped 1127 previous similar messages Lustre: 256268:0:(osd_handler.c:2088:osd_trans_dump_creds()) create: 1/4/0, destroy: 1/4/0 Lustre: 256268:0:(osd_handler.c:2088:osd_trans_dump_creds()) Skipped 1127 previous similar messages Lustre: 256268:0:(osd_handler.c:2095:osd_trans_dump_creds()) attr_set: 5/5/1, xattr_set: 7/320/0 Lustre: 256268:0:(osd_handler.c:2095:osd_trans_dump_creds()) Skipped 1127 previous similar messages Lustre: 256268:0:(osd_handler.c:2105:osd_trans_dump_creds()) write: 6/38/0, punch: 0/0/0, quota 1/3/0 Lustre: 256268:0:(osd_handler.c:2105:osd_trans_dump_creds()) Skipped 1127 previous similar messages Lustre: 256268:0:(osd_handler.c:2112:osd_trans_dump_creds()) insert: 2/33/0, delete: 2/5/1 Lustre: 256268:0:(osd_handler.c:2112:osd_trans_dump_creds()) Skipped 1127 previous similar messages Lustre: 256268:0:(osd_handler.c:2119:osd_trans_dump_creds()) ref_add: 1/1/0, ref_del: 2/2/1 Lustre: 256268:0:(osd_handler.c:2119:osd_trans_dump_creds()) Skipped 1127 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt 
LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: tar zcf /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt/ . Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: [ -e /dev/mapper/mds1_flakey ] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=100000 --mkfsoptions="-b 4096" --reformat /dev/mapper/mds1_flakey LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: tar zxfp /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/OBJECTS/* /mnt/lustre-brpt/CATALOGS Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey lustre-MDT0000 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: reset Object Index mappings | Link to test |
| sanity-pfl test 9: Replay layout extend object instantiation | LustreError: 53611:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 53611:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 53611 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? vsnprintf+0x340/0x520 ? xas_load+0x8/0x80 ? xas_find+0x183/0x1c0 ? xa_find_after+0xe9/0x110 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f0ca098fdbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 52739:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.27.53@tcp (stopping) LustreError: 53028:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 LustreError: 24391:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.29.21@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 24391:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 17 previous similar messages LustreError: 24391:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.25.65@tcp (no target). 
If you are running an HA pair check that the target is mounted on the other server. LustreError: 24391:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: 23560:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.29.21@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23560:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.27.52@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-dual test 0a: expired recovery with lost client | LustreError: 32531:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 32531:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 32531 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? cfs_trace_unlock_tcd+0x20/0x70 [libcfs] ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fb09d682dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 31659:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 31948:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > 
/dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8303:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8303:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 22 previous similar messages LustreError: 8297:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.29.22@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8297:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 5 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: 8303:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8303:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.27.52@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lfsck test 1a: LFSCK can find out and repair crashed FID-in-dirent | LustreError: 29712:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 29712:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 29712 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0x22e/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f588fca8dbe | Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 29736:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 29736:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 2 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| runtests test 1: All Runtests | LustreError: 28948:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 28948:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 28948 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0x22e/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f00238f2dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl mark touching \/mnt\/lustre at Fri Mar 20 11:15:20 UTC 2026 \(@1774005320\) Lustre: DEBUG MARKER: touching /mnt/lustre at Fri Mar 20 11:15:20 UTC 2026 (@1774005320) Lustre: DEBUG MARKER: /usr/sbin/lctl mark create an empty file \/mnt\/lustre\/hosts.15598 Lustre: DEBUG MARKER: create an empty file /mnt/lustre/hosts.15598 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15598 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15598 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing \/etc\/hosts and \/mnt\/lustre\/hosts.15598 Lustre: DEBUG MARKER: comparing /etc/hosts and /mnt/lustre/hosts.15598 Lustre: DEBUG MARKER: /usr/sbin/lctl mark renaming \/mnt\/lustre\/hosts.15598 to \/mnt\/lustre\/hosts.15598.ren Lustre: DEBUG MARKER: renaming /mnt/lustre/hosts.15598 to /mnt/lustre/hosts.15598.ren Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15598 again Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15598 again Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15598 Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15598 Lustre: DEBUG MARKER: /usr/sbin/lctl mark removing \/mnt\/lustre\/hosts.15598 Lustre: DEBUG MARKER: removing /mnt/lustre/hosts.15598 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15598.2 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15598.2 Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15598.2 to 123 bytes Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15598.2 to 123 bytes Lustre: DEBUG MARKER: /usr/sbin/lctl mark creating 
\/mnt\/lustre\/d1.runtests Lustre: DEBUG MARKER: creating /mnt/lustre/d1.runtests Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying 1000 files from \/etc, \/usr\/bin to \/mnt\/lustre\/d1.runtests\/etc, \/mnt\/lustre\/d1.runtests\/usr\/bin at Fri Mar 20 11:15:26 UTC 2026 Lustre: DEBUG MARKER: copying 1000 files from /etc, /usr/bin to /mnt/lustre/d1.runtests/etc, /mnt/lustre/d1.runtests/usr/bin at Fri Mar 20 11:15:26 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing 1000 newly copied files at Fri Mar 20 11:15:32 UTC 2026 Lustre: DEBUG MARKER: comparing 1000 newly copied files at Fri Mar 20 11:15:32 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark running createmany -d \/mnt\/lustre\/d1.runtests\/d 1000 Lustre: DEBUG MARKER: running createmany -d /mnt/lustre/d1.runtests/d 1000 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n debug Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=ha Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=super+ioctl+neterror+warning+dlmtrace+error+emerg+ha+rpctrace+vfstrace+config+console+lfsck Lustre: DEBUG MARKER: /usr/sbin/lctl mark finished at Fri Mar 20 11:15:37 UTC 2026 \(17\) Lustre: DEBUG MARKER: finished at Fri Mar 20 11:15:37 UTC 2026 (17) Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.27.53@tcp (stopping) Lustre: Skipped 3 previous similar messages Autotest: Test running for 5 minutes (lustre-reviews_review-dne-zfs-part-2_122754.37) LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000-lwp-MDT0002: Connection 
to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 27521:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8011:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.27.53@tcp arrived at 1774005372 with bad export cookie 5461852924504465860 LustreError: 8011:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.27.53@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 8011:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1774005373 with bad export cookie 5461852924504465643 LustreError: MGC10.240.27.52@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0002: Not available for connect from 10.240.25.65@tcp (stopping) Lustre: Skipped 16 previous similar messages LustreError: 8301:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8301:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 16 previous similar messages LustreError: 27965:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 27965:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 28972:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lnet test 226: test missing route for 1 of 2 routers | LustreError: 274539:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 274539:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 274539 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0x22e/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f3c72a5bdbe | Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod Key type lgssc unregistered LNet: 230920:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 230920:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.27.52@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: 
DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 233393:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 233393:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-142vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 
router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing 
\/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.27.52@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.29.22@tcp1 LNet: 236718:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 234952:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 234952:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.29.22@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.25.65@tcp1 LNetError: 234952:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 234952:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 234952:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 234952:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNetError: 234952:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.29.22@tcp has gone from down to up LNetError: 234952:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 234952:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.25.65@tcp has gone from down to up Lustre: DEBUG MARKER: 
/usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.29.21@tcp Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.25.65@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.25.65@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.25.65@tcp1; else LNet: 236429:0:(lib-move.c:2247:lnet_handle_find_routed_path()) No peer NI for gateway 10.240.25.65@tcp1. Attempting to find an alternative route. Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.29.22@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.29.22@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.29.22@tcp1; else Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.25.65@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.25.65@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.25.65@tcp1; else Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 237597:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 237597:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: 
onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.27.52@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 240377:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 240377:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm14.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 
large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: onyx-142vm14.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers= Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers= libcfs: 
HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.27.52@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: 
/usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.25.65@tcp1 LNet: 243509:0:(router.c:718:lnet_add_route()) Consider turning discovery on to enable full Multi-Rail routing functionality LNet: 243509:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 241935:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.25.65@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241935:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.25.65@tcp1 has gone from up to down LNetError: 241935:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.25.65@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241935:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.25.65@tcp1 has gone from down to up Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.29.21@tcp Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.25.65@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.25.65@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.25.65@tcp1; else Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 244096:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 244096:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet 
Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.27.52@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 246876:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 246876:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG 
MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-142vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: 
/usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.27.52@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.29.22@tcp1 LNet: 250200:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route 
when avoid_asym_router_failure feature is enabled LNetError: 248434:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 248434:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.29.22@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.25.65@tcp1 Autotest: Test running for 45 minutes (lustre-reviews_review-dne-zfs-part-2_122754.37) LNetError: 248434:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 248434:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 248434:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 248434:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNetError: 248434:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.29.22@tcp has gone from down to up LNetError: 248434:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.29.21@tcp Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm14.onyx.whamcloud.com: executing load_module 
..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-142vm14.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 251635:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251635:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251635:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251635:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251635:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251635:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251635:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.29.22@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.29.22@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.29.22@tcp1; else Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.25.65@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.25.65@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.25.65@tcp1; else Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 252138:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 252138:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 252138:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed 
LNI 10.240.27.52@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.27.52@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 261017:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 261017:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp Key type 
.llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-142vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: 
onyx-142vm15.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: 
onyx-86vm13.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.27.52@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.29.22@tcp1 LNet: 264341:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 262575:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.29.22@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.25.65@tcp1 LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 262575:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.25.65@tcp has gone from down to up LNetError: 262575:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.29.22@tcp is being used as a gateway but routing feature is not turned on LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.29.21@tcp Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm14.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-142vm14.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 265724:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265724:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265724:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265724:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265724:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265724:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265724:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: echo 5 > /sys/module/lnet/parameters/alive_router_check_interval Lustre: DEBUG MARKER: echo 5 > /sys/module/lnet/parameters/router_ping_timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Disable routing on onyx-135vm14 Lustre: DEBUG MARKER: Disable routing on onyx-135vm14 Lustre: DEBUG MARKER: 
/usr/sbin/lctl mark Wait for lst to finish Lustre: DEBUG MARKER: Wait for lst to finish LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.25.65@tcp is being used as a gateway but routing feature is not turned on LNetError: 262575:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.25.65@tcp has gone from up to down LNetError: 262575:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.25.65@tcp is being used as a gateway but routing feature is not turned on Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Enable routing on onyx-135vm14 Lustre: DEBUG MARKER: Enable routing on onyx-135vm14 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait for lst to finish Lustre: DEBUG MARKER: Wait for lst to finish LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.25.65@tcp is being used as a gateway but routing feature is not turned on LNetError: 262575:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNet: 1 peer NIs in recovery (showing 1): 10.240.25.65@tcp1 Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw LNetError: 262575:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.25.65@tcp has gone from down to up Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.29.22@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.29.22@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.29.22@tcp1; else Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.25.65@tcp1 2>/dev/null)"; if [[ -n 
"${output}" ]]; then echo "Delete route to tcp via 10.240.25.65@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.25.65@tcp1; else Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 267186:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 267186:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 267186:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: Lustre: Build Version: 2.16.59_75_gddd27e5 LNet: Added LNI 10.240.27.52@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-sec test 27a: test fileset in various nodemaps | LustreError: 241597:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 241597:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 241597 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? cfs_trace_unlock_tcd+0x20/0x70 [libcfs] ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fb532199dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_activate 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.admin_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: 
/usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_set_fileset --name default --fileset /thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset | awk '/primary/ { if ($3 == "/thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default") print $3 }' Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.27.53@tcp (stopping) Lustre: lustre-MDT0000: Not available for connect from 10.240.25.65@tcp (stopping) Lustre: Skipped 7 previous similar messages Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 1 previous similar message LustreError: 237305:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8005:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.27.53@tcp arrived at 1774010646 with bad export cookie 5129234439883961518 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.27.53@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message LustreError: 8138:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.25.65@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8138:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages LustreError: 9436:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9436:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 8005:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1774010653 with bad export cookie 5129234439883961301 LustreError: 8005:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: MGC10.240.27.52@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0002: Not available for connect from 10.240.27.53@tcp (stopping) Lustre: Skipped 4 previous similar messages LustreError: 8138:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.25.65@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8138:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 9 previous similar messages LustreError: 237698:0:(obd_class.h:479:obd_check_dev()) Device 33 not setup LustreError: 237698:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm14.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-86vm14.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing unload_modules_local LNet: 239413:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 239413:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.27.52@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt 
registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm14.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-86vm13.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-135vm14.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-142vm15.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-86vm14.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-86vm13.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-135vm14.onyx.whamcloud.com: executing load_modules_local libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-142vm15.onyx.whamcloud.com: executing load_modules_local Lustre: Lustre: Build Version: 2.16.59_75_gddd27e5 LNet: Added LNI 10.240.27.52@tcp [8/256/0/180] Key type lgssc registered Lustre: Echo OBD driver; http://www.lustre.org/ Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-hsm test 26e: RAoLU with a non-started coordinator | LustreError: 64089:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 64089:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 64089 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f6d97b10dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x48:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x49:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x49:0x0'.*action='ARCHIVE'/ {print $6}' | cut -f3 -d/ Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0000.hsm_control='shutdown' Lustre: Modifying parameter lustre.mdt.lustre-MDT0000.hsm_control=shutdown in log params Lustre: Skipped 3 previous similar messages Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0001.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0002.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0003.hsm_control='shutdown' Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 63504:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 40121:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.211@tcp (no target). 
If you are running an HA pair check that the target is mounted on the other server. LustreError: 40121:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages LustreError: 40120:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.212@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 40120:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages LustreError: 8266:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.210@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8266:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.213@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| Module load | LustreError: 18945:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 18945:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 18945 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f3c2abccdbe | Lustre: Lustre: Build Version: 2.16.59_75_gddd27e5 LNet: Added LNI 10.240.45.213@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall=/usr/sbin/l_getidentity in log lustre-MDT0000 Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space. Lustre: lustre-MDT0000: new disk, initializing Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o 
value lustre:svname lustre-mdt1/mdt1 2>/dev/null Lustre: 8268:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0001/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000240000400-0x0000000280000400]:1:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm229.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm229.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3; mount -t lustre -o localrecov lustre-mdt3/mdt3 /mnt/lustre-mds3 Lustre: 8269:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0002/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: srv-lustre-MDT0002: No data found on store. Initialize space. 
Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0002: new disk, initializing Lustre: lustre-MDT0002: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400]:2:mdt Lustre: cli-ctl-lustre-MDT0002: Allocated super-sequence [0x0000000280000400-0x00000002c0000400]:2:mdt] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm55.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null Lustre: 8269:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0003/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:3:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm229.trevis.whamcloud.com: executing set_default_debug 
vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm229.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P debug_raw_pointers=Y Lustre: Modifying parameter general.debug_raw_pointers=Y in log params Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000300000400-0x0000000340000400]:0:ost Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x100000000 to 0x300000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x100010000 to 0x340000403 Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000380000400-0x00000003c0000400]:2:ost Lustre: Skipped 1 previous similar message Lustre: lustre-OST0002-osc-MDT0000: update sequence from 0x100020000 to 0x380000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0003-osc-MDT0000: update sequence from 0x100030000 to 0x3c0000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0004-osc-MDT0000: update sequence from 0x100040000 to 0x400000401 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000440000400-0x0000000480000400]:5:ost Lustre: Skipped 2 previous similar messages Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0005-osc-MDT0000: update sequence from 0x100050000 to 0x440000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0006-osc-MDT0000: update sequence from 0x100060000 to 0x480000403 Lustre: lustre-OST0007-osc-MDT0000: update sequence from 0x100070000 to 0x4c0000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm54.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm54.trevis.whamcloud.com: 
executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm53.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm53.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: lctl get_param -n timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20 Lustre: DEBUG MARKER: Using TIMEOUT=20 Lustre: DEBUG MARKER: [ -f /sys/module/mgc/parameters/mgc_requeue_timeout_min ] && echo 1 > /sys/module/mgc/parameters/mgc_requeue_timeout_min; exit 0 Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.sys.jobid_var='procname_uid' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P lod.*.mdt_hash=crush Lustre: Setting parameter general.lod.*.mdt_hash=crush in log params Lustre: DEBUG MARKER: sysctl --values kernel/kptr_restrict Lustre: DEBUG MARKER: sysctl -wq kernel/kptr_restrict=1 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Client: 2.16.59.75 Lustre: DEBUG MARKER: Client: 2.16.59.75 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl mark MDS: 2.16.59.75 Lustre: DEBUG MARKER: MDS: 2.16.59.75 Lustre: DEBUG MARKER: /usr/sbin/lctl mark OSS: 2.16.59.75 Lustre: DEBUG MARKER: OSS: 2.16.59.75 Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: mmp ============----- Fri Mar 20 11:06:08 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: mmp ============----- Fri Mar 20 11:06:08 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release 
Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/mmp.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/mmp.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.38.229@tcp (stopping) Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000: Not available for connect from 10.240.45.212@tcp (stopping) Lustre: Skipped 9 previous similar messages LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000: Not available for connect from 10.240.38.229@tcp (stopping) Lustre: Skipped 4 previous similar messages LustreError: 16030:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 7983:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.38.229@tcp arrived at 1774004782 with bad export cookie 18063536055454380572 LustreError: 7983:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.38.229@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.38.229@tcp failed: rc = -107 LustreError: 11238:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.38.229@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11238:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 13 previous similar messages LustreError: 11238:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11238:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 9 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 7984:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1774004789 with bad export cookie 18063536055454380355 LustreError: MGC10.240.45.213@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 8276:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.38.229@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8276:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: lustre-MDT0002: Not available for connect from 10.240.38.229@tcp (stopping) LustreError: 16422:0:(obd_class.h:479:obd_check_dev()) Device 33 not setup LustreError: 16422:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl mark SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: replay-ost-single ============----- Fri Mar 20 11:07:08 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: replay-ost-single ============----- Fri Mar 20 11:07:08 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/replay-ost-single.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/replay-ost-single.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: /usr/sbin/lctl mark skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: /usr/sbin/lctl mark === replay-ost-single: start setup 11:07:14 \(1774004834\) === Lustre: DEBUG MARKER: === replay-ost-single: start setup 11:07:14 (1774004834) === Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' 
/proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 18969:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 18969:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 8 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 23295:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23295:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23295 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fbc0b45bdbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 22712:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8270:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.211@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8270:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 19 previous similar messages LustreError: 8269:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8269:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 9 previous similar messages LustreError: 8270:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.38.229@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8270:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.213@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 144088:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 144088:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 144088 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f0a8c5e5dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp LustreError: 143213:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. 
LustreError: 143213:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.45.211@tcp (stopping) LustreError: 143408:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.213@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 101824:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 101824:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 101824 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f88aa833dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 101239:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 LustreError: Skipped 1 previous similar message LustreError: 8275:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.211@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8275:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages LustreError: 55052:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.210@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8278:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8278:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 13 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 LustreError: 8275:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.211@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.213@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-scrub test 0: Do not auto trigger OI scrub for non-backup/restore case | LustreError: 27632:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 27632:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 27632 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0x22e/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fea581fadbe | Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 26561:0:(obd_class.h:479:obd_check_dev()) Device 27 not setup LustreError: 26561:0:(obd_class.h:479:obd_check_dev()) Skipped 25 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 20868:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.25.66@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 24117:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.25.66@tcp arrived at 1774012570 with bad export cookie 15493183754662500981 LustreError: 24117:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds3 LustreError: 20864:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.25.206@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: MGC10.240.22.193@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: Failing over lustre-MDT0002 LustreError: 26953:0:(obd_class.h:479:obd_check_dev()) Device 26 not setup LustreError: 26953:0:(obd_class.h:479:obd_check_dev()) Skipped 15 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 27656:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 27656:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| large-scale test 3a: recovery time, 2 clients | LustreError: 23464:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23464:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23464 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? cfs_trace_unlock_tcd+0x20/0x70 [libcfs] ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f0eed540dbe | Autotest: Test running for 5 minutes (lustre-reviews_review-dne-zfs-part-7_122754.42) Lustre: DEBUG MARKER: /usr/sbin/lctl mark 1 : Starting failover on mds1 Lustre: DEBUG MARKER: 1 : Starting failover on mds1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 22880:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8278:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.29.122@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8278:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages LustreError: 8278:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.25.206@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8278:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 8278:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.10@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.22.193@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Not available for connect from 10.240.29.122@tcp (not set up) Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-pcc test 1c: Test automated attach using Project ID with manual HSM restore | LustreError: 23991:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23991:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 23991 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f37fcc36dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: zpool get all Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 LustreError: 22712:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 7991:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.25.66@tcp arrived at 1774008251 with bad export cookie 8715230870120879768 LustreError: 7991:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.25.66@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.25.66@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 7991:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1774008254 with bad export cookie 8715230870120879537 LustreError: MGC10.240.22.193@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 23005:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 23005:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: zpool set feature@project_quota=enabled lustre-mdt1 Lustre: DEBUG MARKER: zpool set feature@project_quota=enabled lustre-mdt3 Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 24015:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 24015:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 14 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-scrub test 1c: Auto detect kinds of OI file(s) removed/recreated cases | LustreError: 217695:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 217695:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 217695 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0x22e/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f425abd3dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 8 previous similar messages LustreError: 215457:0:(obd_class.h:479:obd_check_dev()) Device 14 not setup LustreError: 215457:0:(obd_class.h:479:obd_check_dev()) Skipped 117 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds3 Lustre: Failing over lustre-MDT0002 LustreError: MGC10.240.22.191@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: Skipped 3 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: test -b /dev/mapper/mds3_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds3_flakey /mnt/lustre-brpt LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub,notcu /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: invalid oi count 63, remove them, then set it to 64 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: Skipped 6 previous similar messages | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 120263:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 120263:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 120263 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fcda9b45dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp LustreError: 119393:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. LustreError: 119393:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 5 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 119586:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 39821:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 39821:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 39821 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f5bfb609dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 39240:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lsnapshot test 1b: mount snapshot without original filesystem mounted | LustreError: 23038:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23038:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 23038 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f43c1b96dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot create -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.41.66@tcp (stopping) LustreError: 20186:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup LustreError: 20186:0:(obd_class.h:479:obd_check_dev()) Skipped 5 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: 3e4838c9-MDT0000: set dev_rdonly on this device Lustre: 3e4838c9-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot umount -F lustre -n lss_1b_0 Lustre: Failing over 3e4838c9-MDT0000 LustreError: 22134:0:(obd_class.h:479:obd_check_dev()) Device 9 not setup LustreError: 22134:0:(obd_class.h:479:obd_check_dev()) Skipped 7 previous similar messages Lustre: server umount 3e4838c9-MDT0000 complete Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 31254:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 31254:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 31254 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fcb40f50dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 30671:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-single test 0a: empty replay | LustreError: 15055:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 15055:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 15055 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fef959a1dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 14185:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 14474:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| recovery-small test 23: client hang when close a file after mds crash | LustreError: 68649:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 68649:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 68649 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f351e572dbe | Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0x123 Lustre: *** cfs_fail_loc=123, val=2147483648*** Lustre: Skipped 4 previous similar messages Lustre: DEBUG MARKER: lctl set_param fail_loc=0 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 68066:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 11441:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.214@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11441:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 24 previous similar messages LustreError: 11248:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.44.248@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11248:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 5 previous similar messages LustreError: 11441:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.213@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11441:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 LustreError: 8288:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.214@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8288:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.39.215@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanityn test 33c: Cancel cross-MDT lock should trigger Sync-on-Lock-Cancel | LustreError: 43186:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 43186:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 43186 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? ptlrpc_set_import_discon+0x50a/0x870 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f3f769f1dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 42506:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.39.215@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| conf-sanity test 0: single mount setup | LustreError: 32927:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 32927:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 32927 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? cfs_trace_unlock_tcd+0x20/0x70 [libcfs] ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7efe260a0dbe | Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-single test 0a: empty replay | LustreError: 23706:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23706:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23706 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f0447913dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 22834:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 23123:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 11258:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.41.124@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11258:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 22 previous similar messages LustreError: 8300:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8300:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 2 previous similar messages LustreError: 8300:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.41.68@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 LustreError: 8300:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.41.124@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8300:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 8 previous similar messages Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.41.123@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| ost-pools test 25: Create new pool and restart MDS | LustreError: 178865:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 178865:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 178865 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f5e630cedbe | Lustre: DEBUG MARKER: lctl pool_new lustre.testpool1 Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1 2>/dev/null || echo foo Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0002-mdtlov.pools.testpool1 2>/dev/null || echo foo Lustre: DEBUG MARKER: lctl pool_add lustre.testpool1 OST0000; sync Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 178280:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 143395:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.41.122@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 143395:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 25 previous similar messages LustreError: 11560:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.41.68@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 10139:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.41.124@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 10139:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 5 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: 8277:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.41.122@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8277:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 8 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.41.123@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity test 60a: llog_test run from kernel module and test llog_reader | LustreError: 202654:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 202654:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 202654 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f19425cbdbe | Lustre: DEBUG MARKER: ! which run-llog.sh &> /dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl mark test_60 run 21777 - from kernel mode Lustre: DEBUG MARKER: test_60 run 21777 - from kernel mode Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /dev/null Lustre: DEBUG MARKER: bash run-llog.sh Lustre: 200400:0:(llog_test.c:2454:llog_test_device_init()) Setup llog-test device over MGS device Lustre: 200400:0:(llog_test.c:98:llog_test_1()) 1a: create a log with name: 86f7f5bf Lustre: 200400:0:(llog_test.c:115:llog_test_1()) 1b: close newly-created log Lustre: 200400:0:(llog_test.c:146:llog_test_2()) 2a: re-open a log with name: 86f7f5bf Lustre: 200400:0:(llog_test.c:166:llog_test_2()) 2b: create a log without specified NAME & LOGID Lustre: 200400:0:(llog_test.c:184:llog_test_2()) 2b: write 1 llog records, check llh_count Lustre: 200400:0:(llog_test.c:197:llog_test_2()) 2c: re-open the log by LOGID and verify llh_count Lustre: 200400:0:(llog_test.c:244:llog_test_2()) 2d: destroy this log Lustre: 200400:0:(llog_test.c:404:llog_test_3()) 3a: write 1023 fixed-size llog records Lustre: 200400:0:(llog_test.c:368:llog_test3_process()) test3: processing records from index 501 to the end Lustre: 200400:0:(llog_test.c:378:llog_test3_process()) test3: total 525 records processed with 0 paddings Lustre: 200400:0:(llog_test.c:460:llog_test_3()) 3b: write 566 variable size llog records Lustre: 200400:0:(llog_test.c:532:llog_test_3()) 3c: write records with variable size until BITMAP_SIZE, return -ENOSPC Lustre: 200400:0:(llog_test.c:555:llog_test_3()) 3c: wrote 63962 more records before end of llog is reached Lustre: 200400:0:(llog_test.c:584:llog_test_4()) 4a: create a catalog log with name: 86f7f5c0 Lustre: 200400:0:(llog_test.c:599:llog_test_4()) 4b: write 1 record into the catalog Lustre: 200400:0:(llog_test.c:626:llog_test_4()) 4c: cancel 1 log record Lustre: 200400:0:(llog_test.c:638:llog_test_4()) 4d: write 64767 more log records Lustre: 200400:0:(llog_test.c:654:llog_test_4()) 4e: add 5 large records, one record per block Lustre: 200400:0:(llog_test.c:674:llog_test_4()) 4f: put newly-created catalog Lustre: 200400:0:(llog_test.c:773:llog_test_5()) 5a: re-open catalog by id Lustre: 200400:0:(llog_test.c:786:llog_test_5()) 5b: print the catalog entries.. we expect 2 Lustre: 200406:0:(llog_test.c:703:cat_print_cb()) seeing record at index 1 - [0x1:0x1b:0x0] in log [0xa:0x14:0x0] Lustre: 200400:0:(llog_test.c:798:llog_test_5()) 5c: Cancel 64767 records, see one log zapped Lustre: 200400:0:(llog_test.c:806:llog_test_5()) 5c: print the catalog entries.. we expect 1 Lustre: 200407:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x1c:0x0] in log [0xa:0x14:0x0] Lustre: 200407:0:(llog_test.c:703:cat_print_cb()) Skipped 1 previous similar message Lustre: 200400:0:(llog_test.c:818:llog_test_5()) 5d: add 1 record to the log with many canceled empty pages Lustre: 200400:0:(llog_test.c:826:llog_test_5()) 5e: print plain log entries.. expect 6 Lustre: 200400:0:(llog_test.c:838:llog_test_5()) 5f: print plain log entries reversely.. expect 6 Lustre: 200400:0:(llog_test.c:852:llog_test_5()) 5g: close re-opened catalog Lustre: 200400:0:(llog_test.c:882:llog_test_6()) 6a: re-open log 86f7f5bf using client API Lustre: MGS: non-config logname received: 86f7f5bf Lustre: 200400:0:(llog_test.c:914:llog_test_6()) 6b: process log 86f7f5bf using client API Lustre: 200400:0:(llog_test.c:918:llog_test_6()) 6b: processed 63962 records Lustre: 200400:0:(llog_test.c:925:llog_test_6()) 6c: process log 86f7f5bf reversely using client API Lustre: 200400:0:(llog_test.c:929:llog_test_6()) 6c: processed 63962 records Lustre: 200400:0:(llog_test.c:1077:llog_test_7()) 7a: test llog_logid_rec Lustre: 200400:0:(llog_test.c:1088:llog_test_7()) 7b: test llog_unlink64_rec Lustre: 200400:0:(llog_test.c:1099:llog_test_7()) 7c: test llog_setattr64_rec Lustre: 200400:0:(llog_test.c:1110:llog_test_7()) 7d: test llog_size_change_rec Lustre: 200400:0:(llog_test.c:1121:llog_test_7()) 7e: test llog_changelog_rec Lustre: 200400:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64071 from 64767 Lustre: 200400:0:(llog_test.c:1133:llog_test_7()) 7f: test llog_changelog_user_rec2 Lustre: 200400:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64139 from 64767 Lustre: 200400:0:(llog_test.c:1144:llog_test_7()) 7g: test llog_gen_rec Lustre: 200400:0:(llog_test.c:1155:llog_test_7()) 7h: test llog_setattr64_rec_v2 Lustre: 200400:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64071 from 64767 Lustre: 200400:0:(llog_test.c:1264:llog_test_8()) 8a: fill the first plain llog Lustre: 200400:0:(llog_test.c:1293:llog_test_8()) 8b: first llog [0x1:0x28:0x0] Lustre: 200400:0:(llog_test.c:1309:llog_test_8()) 8b: fill the second plain llog Lustre: 200400:0:(llog_test.c:1333:llog_test_8()) 8b: pin llog [0x1:0x2a:0x0] Lustre: 200400:0:(llog_test.c:1336:llog_test_8()) 8b: clean first llog record in catalog Lustre: 200400:0:(llog_test.c:1347:llog_test_8()) 8c: corrupt first chunk in the middle Lustre: 200400:0:(llog_test.c:1350:llog_test_8()) 8c: corrupt second chunk at start Lustre: 200400:0:(llog_test.c:1353:llog_test_8()) 8d: count survived records Lustre: 200400:0:(llog_test.c:1383:llog_test_8()) 8d: close re-opened catalog Lustre: 200400:0:(llog_test.c:1444:llog_test_9()) 9a: test llog_logid_rec Lustre: 200400:0:(llog_test.c:1428:llog_test_9_sub()) 9_sub: record type 1064553b in log 0x1:0x2c:0x0 Lustre: 200400:0:(llog_test.c:1455:llog_test_9()) 9b: test llog_obd_cfg_rec Lustre: 200400:0:(llog_test.c:1466:llog_test_9()) 9c: test llog_changelog_rec Lustre: 200400:0:(llog_test.c:1478:llog_test_9()) 9d: test llog_changelog_user_rec2 Lustre: 200400:0:(llog_test.c:1579:llog_test_10()) 10a: create a catalog log with name: 86f7f5c1 Lustre: 200400:0:(llog_test.c:1609:llog_test_10()) 10b: write 64767 log records Lustre: 200400:0:(llog_test.c:1635:llog_test_10()) 10c: write 129534 more log records Lustre: 200400:0:(llog_test.c:1667:llog_test_10()) 10c: write 64767 more log records Lustre: 200400:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 86f7f5c1 LustreError: 200400:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 Lustre: 200400:0:(llog_test.c:1694:llog_test_10()) 10c: wrote 64011 records then 756 failed with ENOSPC Lustre: 200400:0:(llog_test.c:1713:llog_test_10()) 10d: Cancel 64767 records, see one log zapped Lustre: 200400:0:(llog_test.c:1727:llog_test_10()) 10d: print the catalog entries.. we expect 3 Lustre: 200418:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x31:0x0] in log [0xa:0x15:0x0] Lustre: 200400:0:(llog_test.c:1757:llog_test_10()) 10e: write 64767 more log records Lustre: 200400:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 86f7f5c1 Lustre: 200400:0:(llog_cat.c:80:llog_cat_new_log()) Skipped 755 previous similar messages LustreError: 200400:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 LustreError: 200400:0:(llog_cat.c:583:llog_cat_add_rec()) Skipped 755 previous similar messages Lustre: 200400:0:(llog_test.c:1784:llog_test_10()) 10e: wrote 64578 records then 189 failed with ENOSPC Lustre: 200400:0:(llog_test.c:1786:llog_test_10()) 10e: print the catalog entries.. we expect 4 Lustre: 200400:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200400:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x31:0x0] in log [0xa:0x15:0x0] Lustre: 200400:0:(llog_test.c:703:cat_print_cb()) Skipped 2 previous similar messages Lustre: 200400:0:(llog_test.c:1823:llog_test_10()) 10e: catalog successfully wrap around, last_idx 1, first 1 Lustre: 200400:0:(llog_test.c:1840:llog_test_10()) 10f: Cancel 64767 records, see one log zapped Lustre: 200400:0:(llog_test.c:1854:llog_test_10()) 10f: print the catalog entries.. we expect 3 Lustre: 200400:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200400:0:(llog_cat.c:971:llog_cat_process_or_fork()) Skipped 1 previous similar message Lustre: 200400:0:(llog_test.c:1885:llog_test_10()) 10f: write 64767 more log records Lustre: 200400:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 86f7f5c1 Lustre: 200400:0:(llog_cat.c:80:llog_cat_new_log()) Skipped 188 previous similar messages LustreError: 200400:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 LustreError: 200400:0:(llog_cat.c:583:llog_cat_add_rec()) Skipped 188 previous similar messages Lustre: 200400:0:(llog_test.c:1912:llog_test_10()) 10f: wrote 64578 records then 189 failed with ENOSPC Lustre: 200400:0:(llog_test.c:1959:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200400:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200400:0:(llog_test.c:1971:llog_test_10()) 10g: print the catalog entries.. we expect 3 Lustre: 200400:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200400:0:(llog_test.c:703:cat_print_cb()) seeing record at index 4 - [0x1:0x33:0x0] in log [0xa:0x15:0x0] Lustre: 200400:0:(llog_test.c:703:cat_print_cb()) Skipped 6 previous similar messages Lustre: 200400:0:(llog_test.c:2001:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200400:0:(llog_test.c:2015:llog_test_10()) 10g: print the catalog entries.. we expect 2 Lustre: 200400:0:(llog_test.c:2053:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200400:0:(llog_test.c:2067:llog_test_10()) 10g: print the catalog entries.. we expect 1 Lustre: 200400:0:(llog_test.c:2093:llog_test_10()) 10g: llh_cat_idx has also successfully wrapped! Lustre: 200420:0:(llog_test.c:1538:cat_check_old_cb()) seeing record at index 2 - [0x1:0x35:0x0] in log [0xa:0x15:0x0] Lustre: 200400:0:(llog_test.c:2117:llog_test_10()) 10h: write 64767 more log records LustreError: 200400:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_race id 1317 sleeping LustreError: 200420:0:(llog.c:682:llog_process_thread()) cfs_fail_race id 1317 waking LustreError: 200400:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_fail_race id 1317 awake: rc=4450 LustreError: 200420:0:(llog.c:682:llog_process_thread()) cfs_fail_race id 1317 waking Lustre: 200420:0:(llog_test.c:1538:cat_check_old_cb()) seeing record at index 3 - [0x1:0x36:0x0] in log [0xa:0x15:0x0] LustreError: 200400:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_fail_race id 1317 waking Lustre: 200400:0:(llog_test.c:2144:llog_test_10()) 10h: wrote 64767 records then 0 failed with ENOSPC Lustre: 200400:0:(llog_test.c:2157:llog_test_10()) 10: put newly-created catalog Lustre: 200400:0:(llog_test.c:2187:llog_test_11()) 11: create a plain nameless log Lustre: 200400:0:(llog_test.c:2212:llog_test_11()) 11: size 8216 in 1 blocks after 1 rec Lustre: 200400:0:(llog_test.c:2214:llog_test_11()) 11: add few records Lustre: 200400:0:(llog_test.c:2231:llog_test_11()) 11: size 10616 in 1 blocks with few recs lctl (200400): drop_caches: 3 Lustre: 200400:0:(llog_test.c:2256:llog_test_11()) 11: re-open the log by LOGID and verify llh_count Lustre: 200400:0:(llog_test.c:2271:llog_test_11()) 11: size 10616 in 1 blocks after re-open Lustre: DEBUG MARKER: /usr/sbin/lctl dk Lustre: DEBUG MARKER: which llog_reader 2> /dev/null Lustre: DEBUG MARKER: ls -d /usr/sbin/llog_reader Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 200908:0:(client.c:1370:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ff20a0aa9467f0c0 x1860177076198784/t0(0) o1000->lustre-MDT0001-osp-MDT0000@10.240.29.171@tcp:24/4 lens 304/4320 e 0 to 0 dl 0 ref 2 fl Rpc:QU/200/ffffffff rc 0/-1 job:'umount.0' uid:0 gid:0 projid:4294967295 LustreError: 200908:0:(osp_object.c:617:osp_attr_get()) lustre-MDT0001-osp-MDT0000: osp_attr_get update error [0x20000000a:0x1:0x0]: rc = -5 LustreError: 200908:0:(llog_cat.c:443:llog_cat_close()) lustre-MDT0001-osp-MDT0000: failure destroying log during cleanup: rc = -5 Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 14 previous similar messages Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000: Not available for connect from 10.240.31.210@tcp (stopping) Lustre: Skipped 7 previous similar messages LustreError: 200908:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup LustreError: 200908:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8312:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.209@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8312:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value canmount lustre-mdt1/mdt1 LustreError: 8315:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.29.171@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8315:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value mountpoint lustre-mdt1/mdt1 Lustre: DEBUG MARKER: zfs set canmount=noauto lustre-mdt1/mdt1 Lustre: DEBUG MARKER: zfs set mountpoint=legacy lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mount -t zfs lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true LustreError: 8310:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8310:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 2 previous similar messages Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.211@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lfsck test 4: FID-in-dirent can be rebuilt after MDT file-level backup/restore | LustreError: 282848:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 282848:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 282848 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fb07fde7dbe | Lustre: 251685:0:(osd_internal.h:1370:osd_trans_exec_op()) lustre-MDT0000: opcode 2: before 259 < left 497, rollback = 2 Lustre: 251685:0:(osd_internal.h:1370:osd_trans_exec_op()) Skipped 1131 previous similar messages Lustre: 251685:0:(osd_handler.c:2088:osd_trans_dump_creds()) create: 1/4/0, destroy: 0/0/0 Lustre: 251685:0:(osd_handler.c:2088:osd_trans_dump_creds()) Skipped 1131 previous similar messages Lustre: 251685:0:(osd_handler.c:2095:osd_trans_dump_creds()) attr_set: 1/1/0, xattr_set: 4/497/0 Lustre: 251685:0:(osd_handler.c:2095:osd_trans_dump_creds()) Skipped 1131 previous similar messages Lustre: 251685:0:(osd_handler.c:2105:osd_trans_dump_creds()) write: 1/11/0, punch: 0/0/0, quota 1/3/0 Lustre: 251685:0:(osd_handler.c:2105:osd_trans_dump_creds()) Skipped 1131 previous similar messages Lustre: 251685:0:(osd_handler.c:2112:osd_trans_dump_creds()) insert: 4/65/1, delete: 0/0/0 Lustre: 251685:0:(osd_handler.c:2112:osd_trans_dump_creds()) Skipped 1131 previous similar messages Lustre: 251685:0:(osd_handler.c:2119:osd_trans_dump_creds()) ref_add: 2/2/0, ref_del: 0/0/0 Lustre: 251685:0:(osd_handler.c:2119:osd_trans_dump_creds()) Skipped 1131 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: tar zcf /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt/ . Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: [ -e /dev/mapper/mds1_flakey ] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=100000 --mkfsoptions="-b 4096" --reformat /dev/mapper/mds1_flakey LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: tar zxfp /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/OBJECTS/* /mnt/lustre-brpt/CATALOGS Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey lustre-MDT0000 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: reset Object Index mappings | Link to test |
| replay-dual test 0a: expired recovery with lost client | LustreError: 32546:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 32546:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 32546 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7ff263ba6dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 31672:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) LustreError: 31961:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8315:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8315:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 26 previous similar messages LustreError: 9459:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.222@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9459:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages LustreError: 11265:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.26.254@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11265:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 11265:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11265:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.22.129@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-pfl test 9: Replay layout extend object instantiation | LustreError: 53193:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 53193:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 53193 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? vsnprintf+0x340/0x520 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? xa_find_after+0xe9/0x110 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7faec3ed5dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 52320:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.22.121@tcp (stopping) LustreError: 52609:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 9436:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9436:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 18 previous similar messages LustreError: 9436:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.222@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 9436:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages LustreError: 20226:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23986:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23986:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 20226:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.22.129@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| runtests test 1: All Runtests | LustreError: 28890:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 28890:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 28890 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f5ee1706dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl mark touching \/mnt\/lustre at Thu Mar 19 08:33:49 UTC 2026 \(@1773909229\) Lustre: DEBUG MARKER: touching /mnt/lustre at Thu Mar 19 08:33:49 UTC 2026 (@1773909229) Lustre: DEBUG MARKER: /usr/sbin/lctl mark create an empty file \/mnt\/lustre\/hosts.15624 Lustre: DEBUG MARKER: create an empty file /mnt/lustre/hosts.15624 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15624 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15624 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing \/etc\/hosts and \/mnt\/lustre\/hosts.15624 Lustre: DEBUG MARKER: comparing /etc/hosts and /mnt/lustre/hosts.15624 Lustre: DEBUG MARKER: /usr/sbin/lctl mark renaming \/mnt\/lustre\/hosts.15624 to \/mnt\/lustre\/hosts.15624.ren Lustre: DEBUG MARKER: renaming /mnt/lustre/hosts.15624 to /mnt/lustre/hosts.15624.ren Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15624 again Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15624 again Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15624 Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15624 Lustre: DEBUG MARKER: /usr/sbin/lctl mark removing \/mnt\/lustre\/hosts.15624 Lustre: DEBUG MARKER: removing /mnt/lustre/hosts.15624 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15624.2 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15624.2 Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15624.2 to 123 bytes Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15624.2 to 123 bytes Lustre: DEBUG MARKER: /usr/sbin/lctl mark creating \/mnt\/lustre\/d1.runtests Lustre: DEBUG MARKER: creating /mnt/lustre/d1.runtests Lustre: 
DEBUG MARKER: /usr/sbin/lctl mark copying 1000 files from \/etc, \/usr\/bin to \/mnt\/lustre\/d1.runtests\/etc, \/mnt\/lustre\/d1.runtests\/usr\/bin at Thu Mar 19 08:33:55 UTC 2026 Lustre: DEBUG MARKER: copying 1000 files from /etc, /usr/bin to /mnt/lustre/d1.runtests/etc, /mnt/lustre/d1.runtests/usr/bin at Thu Mar 19 08:33:55 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing 1000 newly copied files at Thu Mar 19 08:34:02 UTC 2026 Lustre: DEBUG MARKER: comparing 1000 newly copied files at Thu Mar 19 08:34:02 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark running createmany -d \/mnt\/lustre\/d1.runtests\/d 1000 Lustre: DEBUG MARKER: running createmany -d /mnt/lustre/d1.runtests/d 1000 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n debug Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=ha Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=super+ioctl+neterror+warning+dlmtrace+error+emerg+ha+rpctrace+vfstrace+config+console+lfsck Lustre: DEBUG MARKER: /usr/sbin/lctl mark finished at Thu Mar 19 08:34:08 UTC 2026 \(19\) Lustre: DEBUG MARKER: finished at Thu Mar 19 08:34:08 UTC 2026 (19) Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.27.222@tcp (stopping) LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 10.240.22.121@tcp (stopping) Lustre: Skipped 4 previous similar messages Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 27515:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 
complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 7993:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.27.222@tcp arrived at 1773909269 with bad export cookie 12784743729585835679 LustreError: 7993:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.27.222@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.27.222@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 7993:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1773909276 with bad export cookie 12784743729585835462 LustreError: MGC10.240.22.129@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 23456:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23456:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 22 previous similar messages Lustre: lustre-MDT0002: Not available for connect from 10.240.22.121@tcp (stopping) Lustre: Skipped 9 previous similar messages LustreError: 27908:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 27908:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 28914:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lfsck test 1a: LFSCK can find out and repair crashed FID-in-dirent | LustreError: 29682:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 29682:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 29682 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f29b79f0dbe | Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 29706:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 29706:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 10 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-sec test 27a: test fileset in various nodemaps | LustreError: 240900:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 240900:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 240900 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f19ed2ecdbe | Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_activate 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.admin_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: 
/usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_set_fileset --name default --fileset /thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset | awk '/primary/ { if ($3 == "/thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default") print $3 }' Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.27.222@tcp (stopping) LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 3 previous similar messages Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 8 previous similar messages LustreError: 236609:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 9457:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.27.222@tcp arrived at 1773913910 with bad export cookie 12878257235488593795 LustreError: 9457:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.27.222@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 8017:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1773913913 with bad export cookie 12878257235488593578 LustreError: MGC10.240.22.129@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0002: Not available for connect from 10.240.27.222@tcp (stopping) Lustre: Skipped 5 previous similar messages LustreError: 8314:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8314:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages LustreError: 9458:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.222@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9458:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 237002:0:(obd_class.h:479:obd_check_dev()) Device 33 not setup LustreError: 237002:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages LustreError: 9458:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.27.222@tcp (no target). 
If you are running an HA pair check that the target is mounted on the other server. LustreError: 9458:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm94.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-157vm94.onyx.whamcloud.com: executing unload_modules_local LNet: 238717:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 238717:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: 
executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm94.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm94.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_modules_local libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: Lustre: Build Version: 2.16.59_74_gd000986 LNet: Added LNI 10.240.22.129@tcp [8/256/0/180] Key type lgssc registered Lustre: Echo OBD driver; http://www.lustre.org/ Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lnet test 226: test missing route for 1 of 2 routers | LustreError: 274762:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 274762:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 274762 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0x22e/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f7380577dbe | Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod Key type lgssc unregistered LNet: 231142:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 231142:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.129@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing 
lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 233610:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 233610:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm59.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm59.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 
large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 
--if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.129@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.121@tcp1 LNet: 236934:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 235168:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 235168:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.121@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.26.255@tcp1 LNetError: 235168:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 235168:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 235168:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.26.255@tcp is being used as a gateway but routing feature is not turned on LNetError: 235168:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNetError: 235168:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.121@tcp has gone from down to up LNetError: 235168:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: 
/usr/sbin/lnetctl ping 10.240.26.254@tcp Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.26.255@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.26.255@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.26.255@tcp1; e LNet: 236644:0:(lib-move.c:2247:lnet_handle_find_routed_path()) No peer NI for gateway 10.240.26.255@tcp1. Attempting to find an alternative route. Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.121@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.121@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.121@tcp1; e Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.26.255@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.26.255@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.26.255@tcp1; e Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 237813:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 237813:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, 
HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.129@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 240594:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 240594:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm59.onyx.whamcloud.com: executing load_lnet 
lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-157vm59.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: 
onyx-157vm60.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.129@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 
10.240.26.255@tcp1 LNet: 243726:0:(router.c:718:lnet_add_route()) Consider turning discovery on to enable full Multi-Rail routing functionality LNet: 243726:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 242152:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.26.255@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 242152:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.26.255@tcp1 has gone from up to down LNetError: 242152:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.26.255@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 242152:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.26.255@tcp1 has gone from down to up Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.26.254@tcp Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.26.255@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.26.255@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.26.255@tcp1; e Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 244312:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 244312:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing 
load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.129@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 247093:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 247093:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm59.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing 
load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm59.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing 
\/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.129@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.121@tcp1 LNet: 250418:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 
248652:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 248652:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.121@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.26.255@tcp1 LNetError: 248652:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 248652:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 248652:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 248652:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNetError: 248652:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.121@tcp has gone from down to up LNetError: 248652:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.26.254@tcp Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm59.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_module 
..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm59.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 251801:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251801:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251801:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251801:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251801:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251801:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251801:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.121@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.121@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.121@tcp1; e Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.26.255@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.26.255@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.26.255@tcp1; e Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 252304:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 252304:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 252304:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type 
.llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.22.129@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing lnet_if_list Autotest: Test running for 45 minutes (lustre-reviews_review-dne-zfs-part-2_122716.37) Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 261237:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 261237:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp Key type .llcrypt unregistered Key type 
._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm59.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm59.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: 
No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm68.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm68.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm60.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm60.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing 
/usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.22.129@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.22.121@tcp1 LNet: 264561:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 262795:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.121@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.26.255@tcp1 LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.26.255@tcp is being used as a gateway but routing feature is not turned on LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.22.121@tcp is being used as a gateway but routing feature is not turned on LNetError: 262795:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.22.121@tcp has gone from down to up LNetError: 262795:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.26.254@tcp Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm76.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-157vm59.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm76.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-157vm59.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 265937:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265937:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265937:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265937:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265937:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265937:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265937:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: echo 5 > /sys/module/lnet/parameters/alive_router_check_interval Lustre: DEBUG MARKER: echo 5 > /sys/module/lnet/parameters/router_ping_timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Disable routing on onyx-157vm60 Lustre: DEBUG MARKER: Disable routing on onyx-157vm60 Lustre: DEBUG MARKER: 
/usr/sbin/lctl mark Wait for lst to finish Lustre: DEBUG MARKER: Wait for lst to finish LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.26.255@tcp is being used as a gateway but routing feature is not turned on LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message LNetError: 262795:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.26.255@tcp has gone from up to down LNetError: 262795:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.26.255@tcp is being used as a gateway but routing feature is not turned on LNetError: 262795:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl mark Enable routing on onyx-157vm60 Lustre: DEBUG MARKER: Enable routing on onyx-157vm60 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait for lst to finish Lustre: DEBUG MARKER: Wait for lst to finish LNet: 1 peer NIs in recovery (showing 1): 10.240.26.255@tcp1 Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.22.121@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.22.121@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.22.121@tcp1; e Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.26.255@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.26.255@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.26.255@tcp1; e Lustre: DEBUG MARKER: 
/usr/sbin/lustre_rmmod LNet: 267406:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 267406:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 267406:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.22.129@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: Lustre: Build Version: 2.16.59_74_gd000986 LNet: Added LNI 10.240.22.129@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-hsm test 26e: RAoLU with a non-started coordinator | LustreError: 62932:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 62932:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 62932 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? apic_timer_interrupt+0xa/0x20 server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7efc49e6fdbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x45:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x45:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x46:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x46:0x0'.*action='ARCHIVE'/ {print $6}' | cut -f3 -d/ Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0000.hsm_control='shutdown' Lustre: Modifying parameter lustre.mdt.lustre-MDT0000.hsm_control=shutdown in log params Lustre: Skipped 3 previous similar messages Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0001.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0002.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0003.hsm_control='shutdown' Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) LustreError: 62347:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8288:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.118@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8288:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages LustreError: 8288:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.119@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8288:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages LustreError: 39458:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 39458:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.120@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 23335:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23335:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 23335 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f067def6dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 22752:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8151:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.118@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8151:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 2 previous similar messages Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 11264:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11264:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages LustreError: 11264:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11264:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 11768:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.118@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 11768:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 LustreError: 8153:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.118@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8153:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 15 previous similar messages Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.120@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| Module load | LustreError: 18989:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 18989:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 18989 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f362d940dbe | Lustre: Lustre: Build Version: 2.16.59_74_gd000986 LNet: Added LNI 10.240.31.120@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall=/usr/sbin/l_getidentity in log lustre-MDT0000 Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space. Lustre: lustre-MDT0000: new disk, initializing Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o value 
lustre:svname lustre-mdt1/mdt1 2>/dev/null Lustre: 8311:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0001/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000240000400-0x0000000280000400]:1:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm117.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm117.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3; mount -t lustre -o localrecov lustre-mdt3/mdt3 /mnt/lustre-mds3 Lustre: 8311:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0002/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: srv-lustre-MDT0002: No data found on store. Initialize space. 
Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0002: new disk, initializing Lustre: lustre-MDT0002: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400]:2:mdt Lustre: cli-ctl-lustre-MDT0002: Allocated super-sequence [0x0000000280000400-0x00000002c0000400]:2:mdt] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm116.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null Lustre: 8311:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0003/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:3:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm117.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace 
dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm117.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P debug_raw_pointers=Y Lustre: Modifying parameter general.debug_raw_pointers=Y in log params Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000300000400-0x0000000340000400]:0:ost Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x100000000 to 0x300000402 Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x100010000 to 0x340000400 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000380000400-0x00000003c0000400]:2:ost Lustre: Skipped 1 previous similar message Lustre: lustre-OST0002-osc-MDT0000: update sequence from 0x100020000 to 0x380000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 
Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0003-osc-MDT0000: update sequence from 0x100030000 to 0x3c0000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0005-osc-MDT0000: update sequence from 0x100050000 to 0x440000401 Lustre: lustre-OST0004-osc-MDT0000: update sequence from 0x100040000 to 0x400000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000480000400-0x00000004c0000400]:6:ost Lustre: Skipped 3 previous similar messages Lustre: lustre-OST0006-osc-MDT0000: update sequence from 0x100060000 to 0x480000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm115.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0007-osc-MDT0000: update sequence from 
0x100070000 to 0x4c0000403 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-155vm114.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: onyx-155vm114.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: lctl get_param -n timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20 Lustre: DEBUG MARKER: Using TIMEOUT=20 Lustre: DEBUG MARKER: [ -f /sys/module/mgc/parameters/mgc_requeue_timeout_min ] && echo 1 > /sys/module/mgc/parameters/mgc_requeue_timeout_min; exit 0 Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.sys.jobid_var='procname_uid' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P lod.*.mdt_hash=crush Lustre: Setting parameter general.lod.*.mdt_hash=crush in log params Lustre: DEBUG MARKER: sysctl --values kernel/kptr_restrict Lustre: DEBUG MARKER: sysctl -wq kernel/kptr_restrict=1 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Client: 2.16.59.74 Lustre: DEBUG MARKER: Client: 2.16.59.74 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl mark MDS: 2.16.59.74 Lustre: DEBUG MARKER: MDS: 2.16.59.74 Lustre: DEBUG MARKER: /usr/sbin/lctl mark OSS: 2.16.59.74 Lustre: DEBUG MARKER: OSS: 2.16.59.74 Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: mmp ============----- Thu Mar 19 08:26:42 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: mmp ============----- Thu Mar 19 08:26:42 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat 
/etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/mmp.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/mmp.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.31.119@tcp (stopping) Lustre: Skipped 7 previous similar messages LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 4 previous similar messages Lustre: lustre-MDT0000: Not available for connect from 10.240.31.119@tcp (stopping) Lustre: Skipped 8 previous similar messages LustreError: 16074:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8024:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.31.121@tcp arrived at 1773908816 with bad export cookie 303950609872119123 LustreError: 8024:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.31.121@tcp) was lost; in progress operations using this service will wait for recovery to complete LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.31.121@tcp failed: rc = -107 LustreError: 8326:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8326:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 15 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 11589:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1773908824 with bad export cookie 303950609872118906 LustreError: 11589:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: MGC10.240.31.120@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 8322:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.119@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8322:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 8 previous similar messages Lustre: lustre-MDT0002: Not available for connect from 10.240.31.119@tcp (stopping) Lustre: Skipped 11 previous similar messages LustreError: 8322:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.121@tcp (no target). 
If you are running an HA pair check that the target is mounted on the other server. LustreError: 8322:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 16466:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 16466:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl mark SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: replay-ost-single ============----- Thu Mar 19 08:27:43 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: replay-ost-single ============----- Thu Mar 19 08:27:43 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/replay-ost-single.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/replay-ost-single.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: /usr/sbin/lctl mark skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: /usr/sbin/lctl mark === replay-ost-single: start setup 08:27:49 \(1773908869\) === Lustre: DEBUG MARKER: === replay-ost-single: start setup 08:27:49 (1773908869) === Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' 
/proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 19013:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 101778:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 101778:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 101778 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f2e03c25dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 101195:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 54981:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.117@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 54981:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 18 previous similar messages LustreError: 54981:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.118@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11249:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11249:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 8296:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.31.121@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8296:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.120@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 141539:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 141539:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 141539 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f97b0ee9dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp LustreError: 140665:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. 
LustreError: 140665:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 140860:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.31.120@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 118130:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 118130:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 118130 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fad5b34bdbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp LustreError: 117260:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. 
LustreError: 117260:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 5 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 117453:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 39801:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 39801:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 39801 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fbcec937dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 39220:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 31217:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 31217:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 31217 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fcefe54cdbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 30635:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-single test 0a: empty replay | LustreError: 15094:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 15094:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 15094 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7ff0563ffdbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 14224:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 14513:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lsnapshot test 1b: mount snapshot without original filesystem mounted | LustreError: 22859:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 22859:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 22859 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9be4a04dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot create -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.47.49@tcp (stopping) Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000: Not available for connect from 10.240.47.49@tcp (stopping) Lustre: Skipped 1 previous similar message LustreError: 20191:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup LustreError: 20191:0:(obd_class.h:479:obd_check_dev()) Skipped 5 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: 210267aa-MDT0000: set dev_rdonly on this device Lustre: 210267aa-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot umount -F lustre -n lss_1b_0 Lustre: Failing over 210267aa-MDT0000 LustreError: 21950:0:(obd_class.h:479:obd_check_dev()) Device 9 not setup LustreError: 21950:0:(obd_class.h:479:obd_check_dev()) Skipped 7 previous similar messages Lustre: server umount 210267aa-MDT0000 complete Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-scrub test 1c: Auto detect kinds of OI file(s) removed/recreated cases | LustreError: 217488:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 217488:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 217488 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fdcdc342dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 215249:0:(obd_class.h:479:obd_check_dev()) Device 14 not setup LustreError: 215249:0:(obd_class.h:479:obd_check_dev()) Skipped 117 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 9 previous similar messages Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds3 Lustre: Failing over lustre-MDT0002 LustreError: MGC10.240.23.82@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: Skipped 3 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. 
Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: test -b /dev/mapper/mds3_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds3_flakey /mnt/lustre-brpt LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub,notcu /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: invalid oi count 63, remove them, then set it to 64 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: Skipped 6 previous similar messages | Link to test |
| recovery-small test 23: client hang when close a file after mds crash | LustreError: 68638:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 68638:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 68638 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fbbaa812dbe | Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0x123 Lustre: *** cfs_fail_loc=123, val=2147483648*** Lustre: Skipped 4 previous similar messages Lustre: DEBUG MARKER: lctl set_param fail_loc=0 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 68055:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 9572:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.46.239@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9572:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages LustreError: 9572:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.46.174@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9572:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.46.184@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanityn test 33c: Cancel cross-MDT lock should trigger Sync-on-Lock-Cancel | LustreError: 43179:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 43179:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 43179 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? ptlrpc_set_import_discon+0x50a/0x870 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f585a027dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.46.239@tcp (stopping) LustreError: 42498:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.46.184@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| ost-pools test 25: Create new pool and restart MDS | LustreError: 177382:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 177382:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 177382 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fd42a5f4dbe | Lustre: DEBUG MARKER: lctl pool_new lustre.testpool1 Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0000-mdtlov.pools.testpool1 2>/dev/null || echo foo Lustre: DEBUG MARKER: lctl get_param -n lod.lustre-MDT0002-mdtlov.pools.testpool1 2>/dev/null || echo foo Lustre: DEBUG MARKER: lctl pool_add lustre.testpool1 OST0000; sync Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 176797:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8301:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.134@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8301:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 19 previous similar messages LustreError: 9450:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.22.72@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 9450:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages LustreError: 143349:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.30.251@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.22.71@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-single test 0a: empty replay | LustreError: 23721:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23721:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23721 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f99a6648dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 22846:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 23135:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8306:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.134@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8306:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 18 previous similar messages LustreError: 8305:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.135@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8305:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages LustreError: 8306:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.30.251@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8306:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages LustreError: 8306:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.134@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8306:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 2 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.22.71@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-scrub test 0: Do not auto trigger OI scrub for non-backup/restore case | LustreError: 27576:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 27576:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 27576 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9894fefdbe | Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 26504:0:(obd_class.h:479:obd_check_dev()) Device 27 not setup LustreError: 26504:0:(obd_class.h:479:obd_check_dev()) Skipped 25 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 24013:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 20738:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.39.212@tcp arrived at 1773911322 with bad export cookie 9245209594714532222 LustreError: 20738:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds3 LustreError: MGC10.240.39.127@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: Failing over lustre-MDT0002 LustreError: 26896:0:(obd_class.h:479:obd_check_dev()) Device 26 not setup LustreError: 26896:0:(obd_class.h:479:obd_check_dev()) Skipped 15 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 27600:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 27600:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 5 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-pcc test 1c: Test automated attach using Project ID with manual HSM restore | LustreError: 23926:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23926:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 23926 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f08846bddbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: zpool get all Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000: Not available for connect from 10.240.39.126@tcp (stopping) Lustre: Skipped 2 previous similar messages LustreError: 22648:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 7990:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.39.212@tcp arrived at 1773909725 with bad export cookie 767133104270750162 LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.39.212@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.39.212@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 7990:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1773909728 with bad export cookie 767133104270749945 LustreError: 7990:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) 
Skipped 4 previous similar messages LustreError: MGC10.240.39.127@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0002: Not available for connect from 10.240.39.126@tcp (stopping) Lustre: Skipped 12 previous similar messages LustreError: 22941:0:(obd_class.h:479:obd_check_dev()) Device 33 not setup LustreError: 22941:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: zpool set feature@project_quota=enabled lustre-mdt1 Lustre: DEBUG MARKER: zpool set feature@project_quota=enabled lustre-mdt3 Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 23950:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23950:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 13 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| large-scale test 3a: recovery time, 2 clients | LustreError: 23415:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23415:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23415 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9c7ef6adbe | Lustre: DEBUG MARKER: /usr/sbin/lctl mark 1 : Starting failover on mds1 Lustre: DEBUG MARKER: 1 : Starting failover on mds1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 22831:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8293:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8293:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 19 previous similar messages LustreError: 8287:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.125@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11250:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.126@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 11250:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 5 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: 11250:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.125@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11250:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.39.127@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity test 60a: llog_test run from kernel module and test llog_reader | LustreError: 202534:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 202534:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 202534 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f4d682bfdbe | Lustre: DEBUG MARKER: ! 
which run-llog.sh &> /dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl mark test_60 run 10535 - from kernel mode Lustre: DEBUG MARKER: test_60 run 10535 - from kernel mode Lustre: DEBUG MARKER: /usr/sbin/lctl dk > /dev/null Lustre: DEBUG MARKER: bash run-llog.sh Lustre: 200228:0:(llog_test.c:2454:llog_test_device_init()) Setup llog-test device over MGS device Lustre: 200228:0:(llog_test.c:98:llog_test_1()) 1a: create a log with name: 8051ae62 Lustre: 200228:0:(llog_test.c:115:llog_test_1()) 1b: close newly-created log Lustre: 200228:0:(llog_test.c:146:llog_test_2()) 2a: re-open a log with name: 8051ae62 Lustre: 200228:0:(llog_test.c:166:llog_test_2()) 2b: create a log without specified NAME & LOGID Lustre: 200228:0:(llog_test.c:184:llog_test_2()) 2b: write 1 llog records, check llh_count Lustre: 200228:0:(llog_test.c:197:llog_test_2()) 2c: re-open the log by LOGID and verify llh_count Lustre: 200228:0:(llog_test.c:244:llog_test_2()) 2d: destroy this log Lustre: 200228:0:(llog_test.c:404:llog_test_3()) 3a: write 1023 fixed-size llog records Lustre: 200228:0:(llog_test.c:368:llog_test3_process()) test3: processing records from index 501 to the end Lustre: 200228:0:(llog_test.c:378:llog_test3_process()) test3: total 525 records processed with 0 paddings Lustre: 200228:0:(llog_test.c:460:llog_test_3()) 3b: write 566 variable size llog records Lustre: 200228:0:(llog_test.c:532:llog_test_3()) 3c: write records with variable size until BITMAP_SIZE, return -ENOSPC Lustre: 200228:0:(llog_test.c:555:llog_test_3()) 3c: wrote 63962 more records before end of llog is reached Lustre: 200228:0:(llog_test.c:584:llog_test_4()) 4a: create a catalog log with name: 8051ae63 Lustre: 200228:0:(llog_test.c:599:llog_test_4()) 4b: write 1 record into the catalog Lustre: 200228:0:(llog_test.c:626:llog_test_4()) 4c: cancel 1 log record Lustre: 200228:0:(llog_test.c:638:llog_test_4()) 4d: write 64767 more log records Lustre: 200228:0:(llog_test.c:654:llog_test_4()) 4e: add 5 large records, one 
record per block Lustre: 200228:0:(llog_test.c:674:llog_test_4()) 4f: put newly-created catalog Lustre: 200228:0:(llog_test.c:773:llog_test_5()) 5a: re-open catalog by id Lustre: 200228:0:(llog_test.c:786:llog_test_5()) 5b: print the catalog entries.. we expect 2 Lustre: 200235:0:(llog_test.c:703:cat_print_cb()) seeing record at index 1 - [0x1:0x1b:0x0] in log [0xa:0x14:0x0] Lustre: 200228:0:(llog_test.c:798:llog_test_5()) 5c: Cancel 64767 records, see one log zapped Lustre: 200228:0:(llog_test.c:806:llog_test_5()) 5c: print the catalog entries.. we expect 1 Lustre: 200236:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x1c:0x0] in log [0xa:0x14:0x0] Lustre: 200236:0:(llog_test.c:703:cat_print_cb()) Skipped 1 previous similar message Lustre: 200228:0:(llog_test.c:818:llog_test_5()) 5d: add 1 record to the log with many canceled empty pages Lustre: 200228:0:(llog_test.c:826:llog_test_5()) 5e: print plain log entries.. expect 6 Lustre: 200228:0:(llog_test.c:838:llog_test_5()) 5f: print plain log entries reversely.. 
expect 6 Lustre: 200228:0:(llog_test.c:852:llog_test_5()) 5g: close re-opened catalog Lustre: 200228:0:(llog_test.c:882:llog_test_6()) 6a: re-open log 8051ae62 using client API Lustre: MGS: non-config logname received: 8051ae62 Lustre: 200228:0:(llog_test.c:914:llog_test_6()) 6b: process log 8051ae62 using client API Lustre: 200228:0:(llog_test.c:918:llog_test_6()) 6b: processed 63962 records Lustre: 200228:0:(llog_test.c:925:llog_test_6()) 6c: process log 8051ae62 reversely using client API Lustre: 200228:0:(llog_test.c:929:llog_test_6()) 6c: processed 63962 records Lustre: 200228:0:(llog_test.c:1077:llog_test_7()) 7a: test llog_logid_rec Lustre: 200228:0:(llog_test.c:1088:llog_test_7()) 7b: test llog_unlink64_rec Lustre: 200228:0:(llog_test.c:1099:llog_test_7()) 7c: test llog_setattr64_rec Lustre: 200228:0:(llog_test.c:1110:llog_test_7()) 7d: test llog_size_change_rec Lustre: 200228:0:(llog_test.c:1121:llog_test_7()) 7e: test llog_changelog_rec Lustre: 200228:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64071 from 64767 Lustre: 200228:0:(llog_test.c:1133:llog_test_7()) 7f: test llog_changelog_user_rec2 Lustre: 200228:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64139 from 64767 Lustre: 200228:0:(llog_test.c:1144:llog_test_7()) 7g: test llog_gen_rec Lustre: 200228:0:(llog_test.c:1155:llog_test_7()) 7h: test llog_setattr64_rec_v2 Lustre: 200228:0:(llog_test.c:1028:llog_test_7_sub()) 7_sub: records are not aligned, written 64071 from 64767 Lustre: 200228:0:(llog_test.c:1264:llog_test_8()) 8a: fill the first plain llog Lustre: 200228:0:(llog_test.c:1293:llog_test_8()) 8b: first llog [0x1:0x28:0x0] Lustre: 200228:0:(llog_test.c:1309:llog_test_8()) 8b: fill the second plain llog Lustre: 200228:0:(llog_test.c:1333:llog_test_8()) 8b: pin llog [0x1:0x2a:0x0] Lustre: 200228:0:(llog_test.c:1336:llog_test_8()) 8b: clean first llog record in catalog Lustre: 200228:0:(llog_test.c:1347:llog_test_8()) 8c: 
corrupt first chunk in the middle Lustre: 200228:0:(llog_test.c:1350:llog_test_8()) 8c: corrupt second chunk at start Lustre: 200228:0:(llog_test.c:1353:llog_test_8()) 8d: count survived records Lustre: 200228:0:(llog_test.c:1383:llog_test_8()) 8d: close re-opened catalog Lustre: 200228:0:(llog_test.c:1444:llog_test_9()) 9a: test llog_logid_rec Lustre: 200228:0:(llog_test.c:1428:llog_test_9_sub()) 9_sub: record type 1064553b in log 0x1:0x2c:0x0 Lustre: 200228:0:(llog_test.c:1455:llog_test_9()) 9b: test llog_obd_cfg_rec Lustre: 200228:0:(llog_test.c:1466:llog_test_9()) 9c: test llog_changelog_rec Lustre: 200228:0:(llog_test.c:1478:llog_test_9()) 9d: test llog_changelog_user_rec2 Lustre: 200228:0:(llog_test.c:1579:llog_test_10()) 10a: create a catalog log with name: 8051ae64 Lustre: 200228:0:(llog_test.c:1609:llog_test_10()) 10b: write 64767 log records Autotest: Test running for 45 minutes (lustre-reviews_review-dne-zfs-part-1_122716.36) Lustre: 200228:0:(llog_test.c:1635:llog_test_10()) 10c: write 129534 more log records Lustre: 200228:0:(llog_test.c:1667:llog_test_10()) 10c: write 64767 more log records Lustre: 200228:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 8051ae64 LustreError: 200228:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 Lustre: 200228:0:(llog_test.c:1694:llog_test_10()) 10c: wrote 64011 records then 756 failed with ENOSPC Lustre: 200228:0:(llog_test.c:1713:llog_test_10()) 10d: Cancel 64767 records, see one log zapped Lustre: 200228:0:(llog_test.c:1727:llog_test_10()) 10d: print the catalog entries.. 
we expect 3 Lustre: 200298:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x31:0x0] in log [0xa:0x15:0x0] Lustre: 200228:0:(llog_test.c:1757:llog_test_10()) 10e: write 64767 more log records Lustre: 200228:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 8051ae64 Lustre: 200228:0:(llog_cat.c:80:llog_cat_new_log()) Skipped 755 previous similar messages LustreError: 200228:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 LustreError: 200228:0:(llog_cat.c:583:llog_cat_add_rec()) Skipped 755 previous similar messages Lustre: 200228:0:(llog_test.c:1784:llog_test_10()) 10e: wrote 64578 records then 189 failed with ENOSPC Lustre: 200228:0:(llog_test.c:1786:llog_test_10()) 10e: print the catalog entries.. we expect 4 Lustre: 200228:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200228:0:(llog_test.c:703:cat_print_cb()) seeing record at index 2 - [0x1:0x31:0x0] in log [0xa:0x15:0x0] Lustre: 200228:0:(llog_test.c:703:cat_print_cb()) Skipped 2 previous similar messages Lustre: 200228:0:(llog_test.c:1823:llog_test_10()) 10e: catalog successfully wrap around, last_idx 1, first 1 Lustre: 200228:0:(llog_test.c:1840:llog_test_10()) 10f: Cancel 64767 records, see one log zapped Lustre: 200228:0:(llog_test.c:1854:llog_test_10()) 10f: print the catalog entries.. we expect 3 Lustre: 200228:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200228:0:(llog_cat.c:971:llog_cat_process_or_fork()) Skipped 1 previous similar message Lustre: 200228:0:(llog_test.c:1885:llog_test_10()) 10f: write 64767 more log records Lustre: 200228:0:(llog_cat.c:80:llog_cat_new_log()) MGS: there are no more free slots in catalog 8051ae64 Lustre: 200228:0:(llog_cat.c:80:llog_cat_new_log()) Skipped 188 previous similar messages LustreError: 200228:0:(llog_cat.c:583:llog_cat_add_rec()) MGS: initialization error: rc = -28 LustreError: 200228:0:(llog_cat.c:583:llog_cat_add_rec()) Skipped 188 previous similar messages Lustre: 200228:0:(llog_test.c:1912:llog_test_10()) 10f: wrote 64578 records then 189 failed with ENOSPC Lustre: 200228:0:(llog_test.c:1959:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200228:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200228:0:(llog_test.c:1971:llog_test_10()) 10g: print the catalog entries.. we expect 3 Lustre: 200228:0:(llog_cat.c:971:llog_cat_process_or_fork()) MGS: catlog [0xa:0x15:0x0] crosses index zero Lustre: 200228:0:(llog_test.c:703:cat_print_cb()) seeing record at index 4 - [0x1:0x33:0x0] in log [0xa:0x15:0x0] Lustre: 200228:0:(llog_test.c:703:cat_print_cb()) Skipped 6 previous similar messages Lustre: 200228:0:(llog_test.c:2001:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200228:0:(llog_test.c:2015:llog_test_10()) 10g: print the catalog entries.. we expect 2 Lustre: 200228:0:(llog_test.c:2053:llog_test_10()) 10g: Cancel 64767 records, see one log zapped Lustre: 200228:0:(llog_test.c:2067:llog_test_10()) 10g: print the catalog entries.. we expect 1 Lustre: 200228:0:(llog_test.c:2093:llog_test_10()) 10g: llh_cat_idx has also successfully wrapped! Lustre: 200300:0:(llog_test.c:1538:cat_check_old_cb()) seeing record at index 2 - [0x1:0x35:0x0] in log [0xa:0x15:0x0] Lustre: 200228:0:(llog_test.c:2117:llog_test_10()) 10h: write 64767 more log records LustreError: 200228:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_race id 1317 sleeping LustreError: 200300:0:(llog.c:682:llog_process_thread()) cfs_fail_race id 1317 waking LustreError: 200228:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_fail_race id 1317 awake: rc=4458 LustreError: 200300:0:(llog.c:682:llog_process_thread()) cfs_fail_race id 1317 waking Lustre: 200300:0:(llog_test.c:1538:cat_check_old_cb()) seeing record at index 3 - [0x1:0x36:0x0] in log [0xa:0x15:0x0] LustreError: 200228:0:(llog_osd.c:684:llog_osd_write_rec()) cfs_fail_race id 1317 waking Lustre: 200228:0:(llog_test.c:2144:llog_test_10()) 10h: wrote 64767 records then 0 failed with ENOSPC Lustre: 200228:0:(llog_test.c:2157:llog_test_10()) 10: put newly-created catalog Lustre: 200228:0:(llog_test.c:2187:llog_test_11()) 11: create a plain nameless log Lustre: 200228:0:(llog_test.c:2212:llog_test_11()) 11: size 8216 in 1 blocks after 1 rec Lustre: 200228:0:(llog_test.c:2214:llog_test_11()) 11: add few records Lustre: 200228:0:(llog_test.c:2231:llog_test_11()) 11: size 10616 in 1 blocks with few recs lctl (200228): drop_caches: 3 Lustre: 200228:0:(llog_test.c:2256:llog_test_11()) 11: re-open the log by LOGID and verify llh_count Lustre: 200228:0:(llog_test.c:2271:llog_test_11()) 11: size 10616 in 1 blocks after re-open Lustre: DEBUG MARKER: /usr/sbin/lctl dk Lustre: DEBUG MARKER: which llog_reader 2> /dev/null Lustre: DEBUG MARKER: ls -d /usr/sbin/llog_reader Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 200788:0:(client.c:1370:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ff22c6cc9bf4c680 x1860076719771904/t0(0) o1000->lustre-MDT0001-osp-MDT0000@10.240.39.252@tcp:24/4 lens 304/4320 e 0 to 0 dl 0 ref 2 fl Rpc:QU/200/ffffffff rc 0/-1 job:'umount.0' uid:0 gid:0 projid:4294967295 LustreError: 200788:0:(osp_object.c:617:osp_attr_get()) lustre-MDT0001-osp-MDT0000: osp_attr_get update error [0x20000000a:0x1:0x0]: rc = -5 LustreError: 200788:0:(llog_cat.c:443:llog_cat_close()) lustre-MDT0001-osp-MDT0000: failure destroying log during cleanup: rc = -5 Lustre: lustre-MDT0000: Not available for connect from 10.240.39.235@tcp (stopping) LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 LustreError: Skipped 10 previous similar messages Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 15 previous similar messages LustreError: 200788:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup LustreError: 200788:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value canmount lustre-mdt1/mdt1 Lustre: DEBUG MARKER: zfs get -H -o value mountpoint lustre-mdt1/mdt1 Lustre: DEBUG MARKER: zfs set canmount=noauto lustre-mdt1/mdt1 Lustre: DEBUG MARKER: zfs set mountpoint=legacy lustre-mdt1/mdt1 LustreError: 7984:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.42.76@tcp arrived at 1773910085 with bad export cookie 5142297824593242975 LustreError: 91586:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.42.76@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: DEBUG MARKER: mount -t zfs lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 11240:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.252@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 LustreError: 49174:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.39.234@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 49174:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.39.236@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| conf-sanity test 0: single mount setup | LustreError: 32826:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 32826:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 32826 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fe192277dbe | Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 32850:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lfsck test 4: FID-in-dirent can be rebuilt after MDT file-level backup/restore | LustreError: 274617:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 274617:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 274617 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f95de687dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: tar zcf /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt/ . Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: [ -e /dev/mapper/mds1_flakey ] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: mkfs.lustre --mgs --fsname=lustre --mdt --index=0 --param=sys.timeout=20 --param=mdt.identity_upcall=/usr/sbin/l_getidentity --backfstype=ldiskfs --device-size=100000 --mkfsoptions="-b 4096" --reformat /dev/mapper/mds1_flakey LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: tar zxfp /tmp/backup_restore.tgz --xattrs --xattrs-include=trusted.* --sparse -C /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/OBJECTS/* /mnt/lustre-brpt/CATALOGS Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: rm -f /tmp/backup_restore.tgz Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey lustre-MDT0000 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: reset Object Index mappings | Link to test |
| runtests test 1: All Runtests | LustreError: 28903:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 28903:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 28903 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f5eea6bcdbe | Lustre: DEBUG MARKER: /usr/sbin/lctl mark touching \/mnt\/lustre at Wed Mar 18 03:20:49 UTC 2026 \(@1773804049\) Lustre: DEBUG MARKER: touching /mnt/lustre at Wed Mar 18 03:20:49 UTC 2026 (@1773804049) Lustre: DEBUG MARKER: /usr/sbin/lctl mark create an empty file \/mnt\/lustre\/hosts.15653 Lustre: DEBUG MARKER: create an empty file /mnt/lustre/hosts.15653 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15653 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15653 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing \/etc\/hosts and \/mnt\/lustre\/hosts.15653 Lustre: DEBUG MARKER: comparing /etc/hosts and /mnt/lustre/hosts.15653 Lustre: DEBUG MARKER: /usr/sbin/lctl mark renaming \/mnt\/lustre\/hosts.15653 to \/mnt\/lustre\/hosts.15653.ren Lustre: DEBUG MARKER: renaming /mnt/lustre/hosts.15653 to /mnt/lustre/hosts.15653.ren Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15653 again Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15653 again Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15653 Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15653 Lustre: DEBUG MARKER: /usr/sbin/lctl mark removing \/mnt\/lustre\/hosts.15653 Lustre: DEBUG MARKER: removing /mnt/lustre/hosts.15653 Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying \/etc\/hosts to \/mnt\/lustre\/hosts.15653.2 Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.15653.2 Lustre: DEBUG MARKER: /usr/sbin/lctl mark truncating \/mnt\/lustre\/hosts.15653.2 to 123 bytes Lustre: DEBUG MARKER: truncating /mnt/lustre/hosts.15653.2 to 123 bytes Lustre: DEBUG MARKER: /usr/sbin/lctl mark creating \/mnt\/lustre\/d1.runtests Lustre: DEBUG MARKER: creating /mnt/lustre/d1.runtests Lustre: DEBUG MARKER: /usr/sbin/lctl mark copying 1000 files from \/etc, \/usr\/bin to \/mnt\/lustre\/d1.runtests\/etc, \/mnt\/lustre\/d1.runtests\/usr\/bin at Wed Mar 18 03:20:54 UTC 2026 Lustre: DEBUG MARKER: copying 1000 files from /etc, /usr/bin to /mnt/lustre/d1.runtests/etc, /mnt/lustre/d1.runtests/usr/bin at Wed Mar 18 03:20:54 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark comparing 1000 newly copied files at Wed Mar 18 03:20:59 UTC 2026 Lustre: DEBUG MARKER: comparing 1000 newly copied files at Wed Mar 18 03:20:59 UTC 2026 Lustre: DEBUG MARKER: /usr/sbin/lctl mark running createmany -d \/mnt\/lustre\/d1.runtests\/d 1000 Lustre: DEBUG MARKER: running createmany -d /mnt/lustre/d1.runtests/d 1000 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n debug Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=ha Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n debug=super+ioctl+neterror+warning+dlmtrace+error+emerg+ha+rpctrace+vfstrace+config+console+lfsck Lustre: DEBUG MARKER: /usr/sbin/lctl mark finished at Wed Mar 18 03:21:04 UTC 2026 \(15\) Lustre: DEBUG MARKER: finished at Wed Mar 18 03:21:04 UTC 2026 (15) Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.28.86@tcp (stopping) Lustre: Skipped 7 previous similar messages Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) LustreError: 27528:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8011:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.88@tcp arrived at 1773804073 with bad export cookie 5317809789104370341 LustreError: 8011:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.28.88@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 8010:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1773804076 with bad export cookie 5317809789104370131 LustreError: MGC10.240.28.87@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 23935:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.86@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23935:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 20 previous similar messages Lustre: lustre-MDT0002: Not available for connect from 10.240.28.86@tcp (stopping) Lustre: Skipped 7 previous similar messages LustreError: 8312:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.86@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0002: Not available for connect from 10.240.28.86@tcp (stopping) LustreError: 8312:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 2 previous similar messages Lustre: Skipped 6 previous similar messages LustreError: 27920:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 27920:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 28927:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 28927:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 7 previous similar messages Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-dual test 0a: expired recovery with lost client | LustreError: 32545:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 32545:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 32545 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9210c39dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 31673:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 31962:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8324:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.88@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8324:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 23 previous similar messages LustreError: 8319:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8319:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message LustreError: 8320:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.85@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8320:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 9 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.28.87@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-pfl test 9: Replay layout extend object instantiation | LustreError: 53458:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 53458:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 53458 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? vsnprintf+0x340/0x520 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? xa_find_after+0xe9/0x110 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f0597a8adbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 52572:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.28.85@tcp (stopping) LustreError: 52861:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 23460:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.86@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 23460:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 26 previous similar messages LustreError: 23460:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.84@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9445:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.88@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9445:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 5 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.28.87@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lfsck test 1a: LFSCK can find out and repair crashed FID-in-dirent | LustreError: 29701:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 29701:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 29701 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f59e8a1ddbe | Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 29725:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lnet test 226: test missing route for 1 of 2 routers | LustreError: 274527:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 274527:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 274527 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f9e06f4edbe | Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod Key type lgssc unregistered LNet: 230902:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 230902:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.28.87@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 233371:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 233371:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm33.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm33.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 
large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.28.87@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.28.86@tcp1 LNet: 236696:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 234930:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.86@tcp is being used as a gateway but routing feature is not turned on LNetError: 234930:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.86@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.28.85@tcp1 LNetError: 234930:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.86@tcp is being used as a gateway but routing feature is not turned on LNetError: 234930:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 234930:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp is being used as a gateway but routing feature is not turned on LNetError: 234930:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.86@tcp is being used as a gateway but routing feature is 
not turned on LNetError: 234930:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.86@tcp has gone from down to up LNetError: 234930:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.28.84@tcp Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.28.85@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.85@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.85@tcp1; else LNet: 236406:0:(lib-move.c:2247:lnet_handle_find_routed_path()) No peer NI for gateway 10.240.28.85@tcp1. Attempting to find an alternative route. Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.28.86@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.86@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.86@tcp1; else Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.28.85@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.85@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.85@tcp1; else Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 237575:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 237575:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: 
/usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.28.87@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 240355:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 240355:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm33.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 
large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 Lustre: DEBUG MARKER: onyx-156vm33.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: 
DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet lnet_peer_discovery_disabled=1 lnet_health_sensitivity=0 lnet_transaction_timeout=10 alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: 
DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.28.87@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.28.85@tcp1 LNet: 243487:0:(router.c:718:lnet_add_route()) Consider turning discovery on to enable full Multi-Rail routing functionality LNet: 243487:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 241913:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241913:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.85@tcp1 has gone from up to down LNetError: 241913:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241913:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp1 is being used as a gateway but routing feature is not turned on LNetError: 241913:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.85@tcp1 has gone from down to up Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.28.84@tcp Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.28.85@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.85@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.85@tcp1; else Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 244074:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on 
exit LNetError: 244074:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.28.87@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 246854:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 
246854:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm33.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm33.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet 
alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 
Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.28.87@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.28.86@tcp1 LNet: 250178:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 248412:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.86@tcp is being used as a gateway but routing feature is not turned on LNetError: 248412:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.86@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.28.85@tcp1 LNetError: 248412:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp is being used as a gateway but routing feature is not turned on LNetError: 248412:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 248412:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.85@tcp has gone from down to up LNetError: 248412:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 248412:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.86@tcp is being used as a gateway but routing feature is not turned on LNetError: 248412:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.28.84@tcp Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm33.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-156vm33.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 251528:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251528:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251528:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251528:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251528:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251528:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 251528:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.28.86@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.86@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.86@tcp1; else Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 
10.240.28.85@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.85@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.85@tcp1; else Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 252065:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 252065:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 252065:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Autotest: Test running for 40 minutes (lustre-reviews_review-dne-zfs-part-2_122657.37) Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure --all LNet: Added LNI 10.240.28.87@tcp [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: 
/usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing lnet_if_list Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 261006:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 261006:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm33.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG 
MARKER: onyx-156vm33.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_lnet alive_router_check_interval=5 router_ping_timeout=5 large_router_buffers=4 small_router_buffers=8 tiny_router_buffers=9 Lustre: DEBUG MARKER: /usr/sbin/lnetctl lnet configure Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: 
PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing \/usr\/sbin\/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing /usr/sbin/lnetctl net add --net tcp1 --if eth0 LNet: Added LNI 10.240.28.87@tcp1 [8/256/0/180] Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.28.86@tcp1 LNet: 264330:0:(router.c:733:lnet_add_route()) Use hops = 1 for a single-hop route when avoid_asym_router_failure feature is enabled LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.86@tcp is being used as a gateway but routing feature is not turned on LNetError: 262564:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.86@tcp has gone from up to down Lustre: DEBUG MARKER: /usr/sbin/lnetctl route add --net tcp --gateway 10.240.28.85@tcp1 LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.86@tcp is being used as a gateway but routing feature is not turned on LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 3 previous similar messages LNetError: 262564:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.85@tcp has gone from down to up LNetError: 262564:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) 
Peer 10.240.28.86@tcp is being used as a gateway but routing feature is not turned on LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl show_route Lustre: DEBUG MARKER: /usr/sbin/lnetctl route show -v Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n routes Lustre: DEBUG MARKER: /usr/sbin/lnetctl ping 10.240.28.84@tcp Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm33.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_module ..\/lnet\/selftest\/lnet_selftest Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest Lustre: DEBUG MARKER: onyx-156vm33.onyx.whamcloud.com: executing load_module ../lnet/selftest/lnet_selftest LNet: 265713:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265713:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265713:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265713:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265713:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265713:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer LNet: 265713:0:(rpc.c:648:srpc_service_add_buffers()) waiting for adding buffer Lustre: DEBUG MARKER: echo 5 > 
/sys/module/lnet/parameters/alive_router_check_interval Lustre: DEBUG MARKER: echo 5 > /sys/module/lnet/parameters/router_ping_timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: /usr/sbin/lctl mark Disable routing on onyx-156vm34 Lustre: DEBUG MARKER: Disable routing on onyx-156vm34 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait for lst to finish Lustre: DEBUG MARKER: Wait for lst to finish LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp is being used as a gateway but routing feature is not turned on LNetError: 262564:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.85@tcp has gone from up to down LNetError: 262564:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) Skipped 1 previous similar message LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp is being used as a gateway but routing feature is not turned on Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait 5s for LST to start Lustre: DEBUG MARKER: Wait 5s for LST to start Lustre: DEBUG MARKER: Start LST rw Lustre: DEBUG MARKER: /usr/sbin/lctl mark Enable routing on onyx-156vm34 Lustre: DEBUG MARKER: Enable routing on onyx-156vm34 LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Peer 10.240.28.85@tcp is being used as a gateway but routing feature is not turned on LNetError: 262564:0:(router.c:398:lnet_router_discovery_ping_reply()) Skipped 1 previous similar message Lustre: DEBUG MARKER: /usr/sbin/lctl mark Wait for lst to finish LNet: 1 peer NIs in recovery (showing 1): 10.240.28.85@tcp1 Lustre: DEBUG MARKER: Wait for lst to finish Lustre: DEBUG MARKER: lst stop brw_rw Lustre: DEBUG MARKER: Stop LST rw Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 
10.240.28.86@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.86@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.86@tcp1; else Lustre: DEBUG MARKER: output="$(/usr/sbin/lnetctl route show --net tcp --gateway 10.240.28.85@tcp1 2>/dev/null)"; if [[ -n "${output}" ]]; then echo "Delete route to tcp via 10.240.28.85@tcp1"; /usr/sbin/lnetctl route del --net tcp --gateway 10.240.28.85@tcp1; else LNetError: 262564:0:(lib-lnet.h:1315:lnet_set_route_aliveness()) route to tcp through 10.240.28.85@tcp has gone from down to up Lustre: DEBUG MARKER: /usr/sbin/lustre_rmmod LNet: 267175:0:(timer.c:222:stt_shutdown()) waiting for 1 threads to terminate LNet: 267175:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 267175:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp1 Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: Lustre: Build Version: 2.16.59_74_g40c855a LNet: Added LNI 10.240.28.87@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-sec test 27a: test fileset in various nodemaps | LustreError: 240869:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 240869:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 240869 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f86a11bfdbe | Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_activate 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted --value 1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.admin_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property admin=1 Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_modify --name default --property trusted=1 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: 
/usr/sbin/lctl get_param nodemap.default.trusted_nodemap Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl nodemap_set_fileset --name default --fileset /thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset | awk '/primary/ { if ($3 == "/thisisaverylongsubdirtotestlongfilesetsandtotestmultiplefilesetfragmentsonthenodemapiam_default") print $3 }' Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n nodemap.active Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl list_nids | grep -w tcp | cut -f 1 -d @ Lustre: DEBUG MARKER: /usr/sbin/lctl get_param nodemap.default.fileset Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000: Not available for connect from 10.240.28.88@tcp (stopping) Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000: Not available for connect from 10.240.28.86@tcp (stopping) Lustre: Skipped 3 previous similar messages Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: Skipped 7 previous similar messages LustreError: 236576:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8048:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.28.88@tcp arrived at 1773808147 with bad export cookie 13378462371591131137 LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.28.88@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.28.88@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message LustreError: 11313:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.88@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 11313:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 16 previous similar messages LustreError: 9482:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.88@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9482:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true LustreError: 8336:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.86@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 8046:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1773808155 with bad export cookie 13378462371591130920 LustreError: 8046:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: MGC10.240.28.87@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 9482:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.28.88@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 9482:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 8 previous similar messages Lustre: lustre-MDT0002: Not available for connect from 10.240.28.88@tcp (stopping) LustreError: 236968:0:(obd_class.h:479:obd_check_dev()) Device 34 not setup LustreError: 236968:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/u Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm37.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-156vm37.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing unload_modules_local Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing unload_modules_local LNet: 238684:0:(lib-ptl.c:967:lnet_clear_lazy_portal()) Active lazy portal 0 on exit LNetError: 238684:0:(acceptor.c:264:lnet_acceptor_remove_socket()) Interface eth0 not found LNet: Removed LNI 10.240.28.87@tcp Key type .llcrypt unregistered Key type ._llcrypt unregistered Key type ._llcrypt registered Key type .llcrypt registered Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm34.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm37.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm35.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark 
onyx-156vm36.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-156vm36.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-156vm34.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-156vm35.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-156vm37.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_modules_local Lustre: DEBUG MARKER: onyx-156vm36.onyx.whamcloud.com: executing load_modules_local libcfs: HW NUMA nodes: 1, HW CPU cores: 2, npartitions: 1 alg: No test for adler32 (adler32-zlib) Lustre: Lustre: Build Version: 2.16.59_74_g40c855a LNet: Added LNI 10.240.28.87@tcp [8/256/0/180] Key type lgssc registered Lustre: Echo OBD driver; http://www.lustre.org/ Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| replay-single test 0a: empty replay | LustreError: 15095:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 15095:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 15095 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0x22e/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f97f9cc0dbe | Lustre: DEBUG MARKER: sync; sync; sync Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 notransno Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0000 readonly LustreError: 14225:0:(osd_handler.c:720:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 14514:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 118851:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 118851:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 118851 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? cfs_trace_unlock_tcd+0x20/0x70 [libcfs] ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f58bcbb8dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp LustreError: 117981:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. 
LustreError: 117981:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 5 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 118174:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 31279:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 31279:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 31279 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f23f7a52dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 30698:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-lsnapshot test 1b: mount snapshot without original filesystem mounted | LustreError: 22886:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 22886:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 22886 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? 
ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fcd61975dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot create -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000: Not available for connect from 10.240.25.54@tcp (stopping) LustreError: 20218:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup LustreError: 20218:0:(obd_class.h:479:obd_check_dev()) Skipped 5 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: 32513d2a-MDT0000: set dev_rdonly on this device Lustre: 32513d2a-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot mount -F lustre -n lss_1b_0 Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot umount -F lustre -n lss_1b_0 Lustre: Failing over 32513d2a-MDT0000 LustreError: 21977:0:(obd_class.h:479:obd_check_dev()) Device 9 not setup LustreError: 21977:0:(obd_class.h:479:obd_check_dev()) Skipped 7 previous similar messages Lustre: server umount 32513d2a-MDT0000 complete Lustre: DEBUG MARKER: /usr/sbin/lctl snapshot list -F lustre -n lss_1b_0 -d Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 
Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 39897:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 39897:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 39897 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0xd3/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f67b7d3fdbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 39315:0:(obd_class.h:479:obd_check_dev()) Device 10 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| Module load | LustreError: 18972:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 18972:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 18972 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? srso_alias_return_thunk+0x5/0xfcdfd ? __queue_work+0x145/0x3f0 ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f7853683dbe | Lustre: Lustre: Build Version: 2.16.59_74_g40c855a LNet: Added LNI 10.240.45.180@tcp [8/256/0/180] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall=/usr/sbin/l_getidentity in log lustre-MDT0000 Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space. Lustre: lustre-MDT0000: new disk, initializing Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o 
value lustre:svname lustre-mdt1/mdt1 2>/dev/null Lustre: 8300:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0001/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000240000400-0x0000000280000400]:1:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm23.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm23.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds3; mount -t lustre -o localrecov lustre-mdt3/mdt3 /mnt/lustre-mds3 Lustre: 8298:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0002/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: srv-lustre-MDT0002: No data found on store. Initialize space. 
Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0002: new disk, initializing Lustre: lustre-MDT0002: Imperative Recovery not enabled, recovery window 60-180 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400]:2:mdt Lustre: cli-ctl-lustre-MDT0002: Allocated super-sequence [0x0000000280000400-0x00000002c0000400]:2:mdt] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/usr/lib64/lustre/utils/gss:/usr/li Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm22.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}' Lustre: DEBUG MARKER: sync; sleep 1; sync Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt3/mdt3 2>/dev/null Lustre: 8300:0:(mgs_llog.c:1346:mgs_modify_param()) MGS: modify lustre-MDT0003/mdt.identity_upcall=/usr/sbin/l_getidentity (mode = 0) failed: rc = -17 Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:3:mdt Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm23.trevis.whamcloud.com: executing set_default_debug 
vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm23.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P debug_raw_pointers=Y Lustre: Modifying parameter general.debug_raw_pointers=Y in log params Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000300000400-0x0000000340000400]:0:ost Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x100000000 to 0x300000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000380000400-0x00000003c0000400]:2:ost Lustre: Skipped 1 previous similar message Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x100010000 to 0x340000403 Lustre: lustre-OST0002-osc-MDT0000: update sequence from 0x100020000 to 0x380000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace 
neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0003-osc-MDT0000: update sequence from 0x100030000 to 0x3c0000401 Lustre: lustre-OST0004-osc-MDT0000: update sequence from 0x100040000 to 0x400000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: lustre-OST0005-osc-MDT0000: update sequence from 0x100050000 to 0x440000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000480000400-0x00000004c0000400]:6:ost Lustre: Skipped 3 previous similar messages Lustre: lustre-OST0007-osc-MDT0000: update sequence from 0x100070000 to 0x4c0000401 Lustre: lustre-OST0006-osc-MDT0000: update sequence from 0x100060000 to 0x480000401 Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm21.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm21.trevis.whamcloud.com: 
executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: /usr/sbin/lctl mark trevis-155vm20.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: trevis-155vm20.trevis.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Lustre: DEBUG MARKER: lctl get_param -n timeout Lustre: DEBUG MARKER: /usr/sbin/lctl mark Using TIMEOUT=20 Lustre: DEBUG MARKER: Using TIMEOUT=20 Lustre: DEBUG MARKER: [ -f /sys/module/mgc/parameters/mgc_requeue_timeout_min ] && echo 1 > /sys/module/mgc/parameters/mgc_requeue_timeout_min; exit 0 Lustre: DEBUG MARKER: lctl dl | grep ' IN osc ' 2>/dev/null | wc -l Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.sys.jobid_var='procname_uid' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P lod.*.mdt_hash=crush Lustre: Setting parameter general.lod.*.mdt_hash=crush in log params Lustre: DEBUG MARKER: sysctl --values kernel/kptr_restrict Lustre: DEBUG MARKER: sysctl -wq kernel/kptr_restrict=1 Lustre: DEBUG MARKER: /usr/sbin/lctl mark Client: 2.16.59.74 Lustre: DEBUG MARKER: Client: 2.16.59.74 Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl mark MDS: 2.16.59.74 Lustre: DEBUG MARKER: MDS: 2.16.59.74 Lustre: DEBUG MARKER: /usr/sbin/lctl mark OSS: 2.16.59.74 Lustre: DEBUG MARKER: OSS: 2.16.59.74 Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: mmp ============----- Wed Mar 18 03:17:49 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: mmp ============----- Wed Mar 18 03:17:49 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release 
Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/mmp.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/mmp.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds1 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) Lustre: lustre-MDT0000: Not available for connect from 10.240.45.181@tcp (stopping) Lustre: Skipped 4 previous similar messages LustreError: 16057:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: 8015:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.45.181@tcp arrived at 1773803878 with bad export cookie 12034420908811271003 LustreError: lustre-MDT0001-osp-MDT0002: operation mds_statfs to node 10.240.45.181@tcp failed: rc = -107 Lustre: lustre-MDT0001-osp-MDT0002: Connection to lustre-MDT0001 (at 10.240.45.181@tcp) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true LustreError: 11258:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.179@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 11258:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 22 previous similar messages Lustre: DEBUG MARKER: umount -d -f /mnt/lustre-mds3 LustreError: 8016:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1773803885 with bad export cookie 12034420908811270786 LustreError: 8016:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 4 previous similar messages LustreError: MGC10.240.45.180@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail LustreError: 16449:0:(obd_class.h:479:obd_check_dev()) Device 33 not setup LustreError: 16449:0:(obd_class.h:479:obd_check_dev()) Skipped 23 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt3 >/dev/null 2>&1 || Lustre: DEBUG MARKER: /usr/sbin/lctl mark SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: SKIP: mmp ldiskfs only test Lustre: DEBUG MARKER: /usr/sbin/lctl mark -----============= acceptance-small: replay-ost-single ============----- Wed Mar 18 03:18:37 UTC 2026 Lustre: DEBUG MARKER: -----============= acceptance-small: replay-ost-single ============----- Wed Mar 18 03:18:37 UTC 2026 Lustre: DEBUG MARKER: hostname -I Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: cat /etc/system-release Lustre: DEBUG MARKER: test -r /etc/os-release Lustre: DEBUG MARKER: cat /etc/os-release Lustre: DEBUG MARKER: ls /usr/lib64/lustre/tests/except/replay-ost-single.*ex || true Lustre: DEBUG MARKER: cat /usr/lib64/lustre/tests/except/replay-ost-single.*ex 2>/dev/null ||true Lustre: DEBUG MARKER: /usr/sbin/lctl mark excepting tests: Lustre: DEBUG MARKER: excepting tests: Lustre: DEBUG MARKER: 
/usr/sbin/lctl mark skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: skipping tests SLOW=no: 5 Lustre: DEBUG MARKER: /usr/sbin/lctl mark === replay-ost-single: start setup 03:18:42 \(1773803922\) === Lustre: DEBUG MARKER: === replay-ost-single: start setup 03:18:42 (1773803922) === Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds3' ' /proc/mounts); Lustre: DEBUG MARKER: running=$(grep -c /mnt/lustre-mds1' ' /proc/mounts); Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: 18996:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0002: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-hsm test 26e: RAoLU with a non-started coordinator | LustreError: 64313:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 64313:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 64313 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f7e87d70dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x47:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x47:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x48:0x0'.*action='ARCHIVE'/ {print $13}' | cut -f2 -d= Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n mdt.lustre-MDT0000.hsm.actions | awk '/'0x200000405:0x48:0x0'.*action='ARCHIVE'/ {print $6}' | cut -f3 -d/ Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0000.hsm_control='shutdown' Lustre: Modifying parameter lustre.mdt.lustre-MDT0000.hsm_control=shutdown in log params Lustre: Skipped 3 previous similar messages Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0001.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0002.hsm_control='shutdown' Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -P mdt.lustre-MDT0003.hsm_control='shutdown' Autotest: Test running for 10 minutes (lustre-reviews_review-dne-zfs-part-4_122657.39) Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 63728:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: lustre-MDT0000: Not available for connect from 10.240.45.181@tcp (stopping) Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! 
zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8292:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.177@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8292:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 22 previous similar messages LustreError: 8292:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.178@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8292:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; LustreError: 40642:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.177@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 40642:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 13 previous similar messages Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.180@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| insanity test 0: Fail all nodes, independently | LustreError: 23290:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 23290:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 0 PID: 23290 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? 
srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f4f043c8dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.45.177@tcp (stopping) LustreError: 22707:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000-lwp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete LustreError: 8268:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.178@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8268:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 18 previous similar messages LustreError: 8271:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.181@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8271:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 3 previous similar messages LustreError: 8268:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
LustreError: 8268:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 1 previous similar message Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.180@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-quota test 7c: Quota reintegration (restart mds during reintegration) | LustreError: 143825:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 143825:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 143825 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? srso_alias_return_thunk+0x5/0xfcdfd ? libcfs_debug_msg+0x907/0xc00 [libcfs] ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xd23/0x1f80 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] ? ll_alloc_inode+0x110/0x110 [lustre] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fcd0b899dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osc.*MDT*.sync_* Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n osp.*.destroys_in_flight Lustre: DEBUG MARKER: lctl set_param fail_val=0 fail_loc=0 Lustre: DEBUG MARKER: lctl set_param -n os[cd]*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=none Lustre: DEBUG MARKER: /usr/sbin/lctl conf_param lustre.quota.ost=ugp LustreError: 142951:0:(qsd_reint.c:618:qqi_reint_delayed()) lustre-MDT0000: Delaying reintegration for qtype:0 until pending updates are flushed. 
LustreError: 142951:0:(qsd_reint.c:618:qqi_reint_delayed()) Skipped 11 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.45.181@tcp (stopping) Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 1 previous similar message LustreError: 143146:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.180@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-flr test 50A: mirror split update layout generation | LustreError: 101685:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 101685:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 101685 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? finish_wait+0x80/0x80 ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7f4b05417dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 101102:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || LustreError: lustre-MDT0000-osp-MDT0002: operation mds_statfs to node 0@lo failed: rc = -107 LustreError: Skipped 2 previous similar messages LustreError: 55010:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.181@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 55010:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 21 previous similar messages LustreError: 11434:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.178@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8128:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.179@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 8128:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 9 previous similar messages LustreError: 55010:0:(ldlm_lib.c:1245:target_handle_connect()) lustre-MDT0000: not available for connect from 10.240.45.181@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. LustreError: 55010:0:(ldlm_lib.c:1245:target_handle_connect()) Skipped 4 previous similar messages Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.45.180@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |
| sanity-scrub test 1c: Auto detect kinds of OI file(s) removed/recreated cases | LustreError: 217854:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 217854:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 217854 Comm: mount.lustre Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? try_to_wake_up+0x1b4/0x4b0 ? lock_timer_base+0x67/0x90 ? __queue_work+0x145/0x3f0 ? ptlrpc_pinger_add_import+0x17e/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? kfree+0x22e/0x250 ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fe3880e4dbe | Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n osd*.*MDT*.force_sync=1 Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 LustreError: 215616:0:(obd_class.h:479:obd_check_dev()) Device 14 not setup LustreError: 215616:0:(obd_class.h:479:obd_check_dev()) Skipped 117 previous similar messages Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; LustreError: 211430:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) ldlm_cancel from 10.240.45.12@tcp arrived at 1773808537 with bad export cookie 11982420751780773556 LustreError: 211430:0:(ldlm_lockd.c:2563:ldlm_cancel_handler()) Skipped 9 previous similar messages Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds3' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds3 Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: Skipped 7 previous similar messages LustreError: MGC10.240.45.11@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: Failing over lustre-MDT0002 LustreError: Skipped 3 previous similar messages Lustre: server umount lustre-MDT0002 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds1_flakey /mnt/lustre-brpt LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: test -b /dev/mapper/mds3_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-brpt Lustre: DEBUG MARKER: mount -t ldiskfs /dev/mapper/mds3_flakey /mnt/lustre-brpt LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null) Lustre: DEBUG MARKER: rm -fv /mnt/lustre-brpt/oi.16.0 Lustre: DEBUG MARKER: umount -d /mnt/lustre-brpt Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: modprobe dm-flakey; Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey >/dev/null 2>&1 Lustre: DEBUG MARKER: dmsetup status /dev/mapper/mds1_flakey 2>&1 Lustre: DEBUG MARKER: test -b /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: e2label /dev/mapper/mds1_flakey Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov -o user_xattr,noscrub,notcu /dev/mapper/mds1_flakey /mnt/lustre-mds1 LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,user_xattr,no_mbcache,nodelalloc Lustre: lustre-MDT0000: invalid oi count 63, remove them, then set it to 64 Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 Lustre: Skipped 6 previous similar messages | Link to test |
| sanityn test 33c: Cancel cross-MDT lock should trigger Sync-on-Lock-Cancel | LustreError: 43148:0:(mdd_device.c:996:mdd_trash_setup()) ASSERTION( ((&mdo->mo_lu)->lo_header->loh_attr & LOHA_EXISTS) ) failed: LustreError: 43148:0:(mdd_device.c:996:mdd_trash_setup()) LBUG CPU: 1 PID: 43148 Comm: mount.lustre Kdump: loaded Tainted: P OE -------- - - 4.18.0-553.76.1.el8_lustre.x86_64 #1 Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-4.el9 04/01/2014 Call Trace: dump_stack+0x41/0x60 lbug_with_loc.cold.6+0x5/0x43 [libcfs] mdd_dot_lustre_setup+0x919/0x930 [mdd] ? srso_alias_return_thunk+0x5/0xfcdfd mdd_prepare+0x618/0x1690 [mdd] ? tgt_ses_key_init+0x29/0x100 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd mdt_prepare+0x4f/0x3b0 [mdt] server_start_targets+0x2491/0x2c20 [ptlrpc] ? class_config_dump_handler+0x710/0x710 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_recover_import+0x37d/0x4d0 [ptlrpc] ? ptlrpc_invalidate_import+0x2d2/0xac0 [ptlrpc] ? finish_wait+0x80/0x80 ? ptlrpc_set_import_discon+0x50a/0x870 [ptlrpc] ? srso_alias_return_thunk+0x5/0xfcdfd ? ptlrpc_reconnect_import+0x7c/0x240 [ptlrpc] ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd ? kfree+0xd3/0x250 ? srso_alias_return_thunk+0x5/0xfcdfd ? lustre_start_mgc+0xe10/0x1f80 [obdclass] ? do_lcfg+0x3f7/0x510 [obdclass] ? srso_alias_return_thunk+0x5/0xfcdfd server_fill_super+0xd11/0x11a0 [ptlrpc] ? obd_zombie_barrier+0x38/0xb0 [obdclass] lustre_fill_super+0x37d/0x470 [lustre] ? ll_alloc_inode+0x110/0x110 [lustre] mount_nodev+0x49/0xa0 legacy_get_tree+0x27/0x50 vfs_get_tree+0x25/0xc0 ? srso_alias_return_thunk+0x5/0xfcdfd do_mount+0x2e9/0x980 ksys_mount+0xbe/0xe0 __x64_sys_mount+0x21/0x30 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x66/0xcb RIP: 0033:0x7fdc1c5f5dbe | Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds1' ' /proc/mounts || true Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds1 Lustre: Failing over lustre-MDT0000 Lustre: lustre-MDT0000: Not available for connect from 10.240.46.131@tcp (stopping) Lustre: Skipped 1 previous similar message Lustre: lustre-MDT0000-osp-MDT0002: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) LustreError: 42467:0:(obd_class.h:479:obd_check_dev()) Device 35 not setup Lustre: server umount lustre-MDT0000 complete Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null && Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt1 >/dev/null 2>&1 || Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1 Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs; Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt1/mdt1 Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 LustreError: MGC10.240.46.133@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 | Link to test |