Antsle Forum

Re-creating the hdd zpool...

Hello gang,

I've noticed that there is a bug in ZFS that leads to a "space map refcount mismatch" error when a mirror device is replaced in an "apparently corrupted" zpool. The zpool reports as degraded. Typically the refcount is only off by a few blocks, but scrubbing or resilvering the mirror vdev doesn't correct it and the same error returns. From what I've been able to determine, the course of action is to re-create the zpool, and that's what I want to do. The root cause, FYI, was that one of the vdevs in the zpool, '/dev/sdb', failed hard. I was able to plug in a USB drive to get that device back and ran a zpool replace, but as I said, the error persists.

What is a good way to replace that whole zpool and get a proper one stood back up? I'd imagine the steps include send/recv and an export, but I want a clear path forward so that I don't lose anything (which shouldn't be too hard). I can see where I might need to take that zpool offline, but I'm not sure of that step. Thanks in advance.
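
As a rough sketch of the send/recv path described above, the function below only prints a candidate command sequence and runs nothing. It assumes the damaged pool is called hdd, that a healthy scratch pool (called usbpool here) has enough free space for a full copy, and that /dev/sda and /dev/sdb are the disks to reuse. The function name, the snapshot name "migrate" and the "hdd-copy" dataset are made up for illustration, and anything using the pool would need to be stopped first.

```shell
#!/bin/sh
# Print (do NOT run) one possible sequence for rebuilding a mirrored pool.
# Assumptions: $1 = damaged pool, $2 = healthy scratch pool with enough space,
# $3 and $4 = the disks that will form the new mirror.
plan_pool_rebuild() {
    old=$1; tmp=$2; disk_a=$3; disk_b=$4
    snap="migrate"   # hypothetical snapshot name
    cat <<EOF
zfs snapshot -r ${old}@${snap}
zfs send -R ${old}@${snap} | zfs receive -u -F ${tmp}/${old}-copy
zfs list -r -t all ${tmp}/${old}-copy
zpool destroy ${old}
zpool create ${old} mirror ${disk_a} ${disk_b}
zfs send -R ${tmp}/${old}-copy@${snap} | zfs receive -u -F ${old}
zpool scrub ${old}
EOF
}

plan_pool_rebuild hdd usbpool /dev/sda /dev/sdb
```

The third line is the checkpoint: the copy on the scratch pool should be inspected before the destroy step, since zpool destroy is the point of no return. Printing the plan rather than executing it keeps the sketch safe to try on a live system.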

Hi @infobro

Thanks for sharing about the potential bug in ZFS.  I've reached out to my team to let them know and for advice on steps going forward.

Thank you,
antsle Support

Thanks, Daniel.

The actual problem I was observing is the refcount mismatch issue highlighted in this thread: https://github.com/openzfs/zfs/issues/7204

Also, magically, after I ran SETUP at boot to re-order the boot sequence and move the USB later in the stack, /dev/sdb resurrected and I was able to run "zpool replace hdd sdb". Now I have all the block devices showing, so I just created a new zpool using the USB drives.

# lsscsi
[0:0:0:0] disk ATA ST2000LM015-2E81 0001 /dev/sda
[1:0:0:0] disk ATA ST2000LM015-2E81 SDM1 /dev/sdb <- this drive failed and was UNAVAILABLE for resilvering
[4:0:0:0] disk ATA Samsung SSD 850 3B6Q /dev/sdc
[5:0:0:0] disk ATA Samsung SSD 850 3B6Q /dev/sdd
[6:0:0:0] disk Seagate Portable 0712 /dev/sde
[7:0:0:0] disk Seagate Portable 0712 /dev/sdf

# blkid
/dev/sdb1: LABEL="hdd" UUID="13478755572967550998" UUID_SUB="11005604975810268912" TYPE="zfs_member" PARTLABEL="zfs-a8e5e6cfb2b688dc" PARTUUID="b450368c-6ab7-184f-8bb9-858e58d942b5"
/dev/sda1: LABEL="hdd" UUID="13478755572967550998" UUID_SUB="16408262801347867790" TYPE="zfs_member" PARTLABEL="zfs-cf2d609248c3f808" PARTUUID="1be09e8b-1bd4-274a-b4e0-1b79f2030129"
/dev/sdc2: UUID="E00C-F95D" TYPE="vfat" PARTLABEL="boot" PARTUUID="0a888808-f9ec-4796-ad71-27f741013e30"
/dev/sdc3: UUID="47891db7-19cf-4292-9a0d-b6e53495f831" TYPE="ext4" PARTLABEL="antsleOS" PARTUUID="11b96ed9-0960-4df5-a256-722eccdb8a78"
/dev/sdc4: LABEL="antlets" UUID="7843901734371145190" UUID_SUB="1572101277366165317" TYPE="zfs_member" PARTLABEL="antlets" PARTUUID="f670d0f2-2009-4d64-8a20-7da4b413397d"
/dev/sdd1: UUID="42d39eea-d546-46a4-ae5b-5d85a9669044" TYPE="swap" PARTLABEL="swap" PARTUUID="affd085e-ef52-48c3-ae20-298ee267305a"
/dev/sdd2: LABEL="antlets" UUID="7843901734371145190" UUID_SUB="802464354203686441" TYPE="zfs_member" PARTLABEL="antlets" PARTUUID="0c8880a3-8d8e-4320-bd35-336e6fd381c6"
/dev/sde1: LABEL="usbpool" UUID="11203921784423396124" UUID_SUB="2646758268057084209" TYPE="zfs_member" PARTLABEL="zfs-4faebb3cd4206eb4" PARTUUID="f7c43748-21e4-b843-83ef-b71451169c7c"
/dev/sdf1: LABEL="usbpool" UUID="11203921784423396124" UUID_SUB="3489786884074567251" TYPE="zfs_member" PARTLABEL="zfs-14360bc6984732c8" PARTUUID="44bab15a-df1d-5640-a0bd-613ccd54c519"
/dev/sda9: PARTUUID="03e140e2-21a3-4a47-83bc-04c20812adff"
/dev/sdb9: PARTUUID="2d75d767-8924-534d-aeef-7710bc1be177"
/dev/sdc1: PARTLABEL="grub" PARTUUID="071611c2-2686-43c4-a8e6-e62008542151"
/dev/sde9: PARTUUID="face5fcb-806d-3f44-850a-2c63cb3b4822"
/dev/sdf9: PARTUUID="f862b8c2-3268-cc4d-ae77-6a8352a617c6"

There are still some hours to go before the resilvering finishes, but I'll keep you apprised. FYI, while one USB drive was standing in as /dev/sdb, the resilver reported CKSUM errors a short while after the replace had started.

Could that be because neither parted nor mklabel could be run? This is an antsle One still running Gentoo.

Uploaded files:
  • resilvering.png

Hi @infobro

Thanks for the update.

I have confirmed that both parted and mklabel (a sub-command of parted) are available in Gentoo-based edgeLinux, as referenced in our Support article here:

https://docs.antsle.com/system/usb-drives#create-zpool-on-external-usb-drive
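
The article's exact commands are not reproduced here. As a generic sketch of the kind of sequence it refers to, the function below prints (without running) a parted mklabel call for each drive followed by a mirrored zpool create. The function name usb_mirror_plan is made up, and the pool name and device paths are assumptions taken from the listings earlier in this thread. Note that mklabel wipes the partition table of the drive it is pointed at.

```shell
#!/bin/sh
# Print (do NOT run) commands to label USB disks and mirror them in a new pool.
# Assumption: every device passed in is an external drive with no data to keep.
usb_mirror_plan() {
    pool=$1; shift
    for dev in "$@"; do
        printf 'parted -s %s mklabel gpt\n' "$dev"   # fresh GPT label per drive
    done
    printf 'zpool create %s mirror %s\n' "$pool" "$*"
}

usb_mirror_plan usbpool /dev/sde /dev/sdf
```

Double-checking the device paths against lsscsi before running any of the printed lines is the important step, since the sdX letters shifted after the boot order was changed.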

Thank you,
antsle Support