zpool import fails with "internal error", Value too large for defined data type #16973

Open
Wabelbit opened this issue Jan 22, 2025 · 2 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  24.10
Kernel Version        6.11.0-13-generic
Architecture          64 bit
OpenZFS Version       2.2.6

Describe the problem you're observing

zpool import fails with internal error: cannot import 'rpool': Value too large for defined data type. This happens on both of these pools (both pools use encryption, all devices are solid-state storage):

   pool: rpool
     id: 10746466459123295066
  state: ONLINE
status: Some supported features are not enabled on the pool.
	(Note that they may be intentionally disabled if the
	'compatibility' property is set.)
 action: The pool can be imported using its name or numeric identifier, though
	some features will not be available without an explicit 'zpool upgrade'.
 config:

	rpool        ONLINE
	  nvme1n1p3  ONLINE

   pool: hpool
     id: 4997779101015336780
  state: ONLINE
status: Some supported features are not enabled on the pool.
	(Note that they may be intentionally disabled if the
	'compatibility' property is set.)
 action: The pool can be imported using its name or numeric identifier, though
	some features will not be available without an explicit 'zpool upgrade'.
 config:

	hpool       ONLINE
	  mirror-0  ONLINE
	    sda     ONLINE
	    sdb1    ONLINE

Evidently, a plain zpool import discovers them just fine and reports nothing out of the ordinary. There is also a third pool on spinning disks that imports without problems.

I'm currently on a fresh installation of Ubuntu to try to recover my Arch Linux ZFS-on-root system, so it can't be related to a corrupted cache file.

I've tried all non-destructive ways of recovering that zpool-import has to offer, but everything results in the same error message.

Describe how to reproduce the problem

Not sure yet - when I shut down my system yesterday, everything was still okay. Today I started it up and am faced with this error.
The only thing I can think of is that I did a zpool trim on both of these pools just before shutting down. Not sure if that was still going at the time of shutdown.

Include any warning/errors/backtraces from the system logs

/proc/spl/kstat/zfs/dbgmsg

1737550015   spa.c:6771:spa_tryimport(): spa_tryimport: importing hpool
1737550015   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): LOADING
1737550015   vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/sda1': best uberblock found for spa $import. txg 2015398
1737550015   spa_misc.c:418:spa_load_note(): spa_load($import, config untrusted): using uberblock with txg=2015398
1737550015   spa_misc.c:2311:spa_import_progress_set_notes_impl(): 'hpool' Loading checkpoint txg
1737550015   spa_misc.c:418:spa_load_note(): spa_load($import, config trusted): UNLOADING
1737550015   spa.c:6623:spa_import(): spa_import: importing hpool
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config trusted): LOADING
1737550015   vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/sda1': best uberblock found for spa hpool. txg 2015398
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config untrusted): using uberblock with txg=2015398
1737550015   spa_misc.c:2311:spa_import_progress_set_notes_impl(): 'hpool' Loading checkpoint txg
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config trusted): UNLOADING
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config trusted): spa_load_retry: rewind, max txg: 2015397
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config trusted): LOADING
1737550015   vdev.c:161:vdev_dbgmsg(): disk vdev '/dev/sda1': best uberblock found for spa hpool. txg 2015395
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config untrusted): using uberblock with txg=2015395
1737550015   spa_misc.c:2311:spa_import_progress_set_notes_impl(): 'hpool' Loading checkpoint txg
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config trusted): UNLOADING
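
To make the load/rewind sequence in the log above easier to follow, here is a minimal Python sketch that pulls out the uberblock selections and rewind attempts. The function name `summarize_dbgmsg` and the regular expressions are my own assumptions, based only on the line layout visible in this issue (timestamp, an optional thread pointer, then `file:line:function(): message`); this is not an official parser and other OpenZFS versions may format lines differently.

```python
import re

# One dbgmsg line: timestamp, optional 16-hex-digit thread pointer,
# then "file:line:function(): message" (layout assumed from this issue).
LINE_RE = re.compile(
    r"^(?P<ts>\d+)\s+(?:(?P<thread>[0-9a-f]{16})\s+)?"
    r"(?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\):\s+(?P<msg>.*)$"
)
UBER_RE = re.compile(r"using uberblock with txg=(\d+)")
REWIND_RE = re.compile(r"spa_load_retry: rewind, max txg: (\d+)")


def summarize_dbgmsg(text):
    """Return (event, txg) tuples for uberblock selections and rewinds."""
    events = []
    for raw in text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip blank or unrecognised lines
        msg = m.group("msg")
        uber = UBER_RE.search(msg)
        rewind = REWIND_RE.search(msg)
        if uber:
            events.append(("uberblock", int(uber.group(1))))
        elif rewind:
            events.append(("rewind", int(rewind.group(1))))
    return events


# Three lines copied from the log in this issue.
SAMPLE = """\
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config untrusted): using uberblock with txg=2015398
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config trusted): spa_load_retry: rewind, max txg: 2015397
1737550015   spa_misc.c:418:spa_load_note(): spa_load(hpool, config untrusted): using uberblock with txg=2015395
"""

print(summarize_dbgmsg(SAMPLE))
# [('uberblock', 2015398), ('rewind', 2015397), ('uberblock', 2015395)]
```

Read this way, the log shows the import trying txg 2015398, unloading, rewinding with a maximum of 2015397, and then unloading again at txg 2015395.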

Wabelbit commented Jan 22, 2025

Update: I did another barebones Arch install real quick to get access to the same ZFS 2.3.0 that these pools were previously used with.
With the latest version, I am now able to import them, even though zpool list shows ALLOC 0, which is mildly concerning. That could be because I imported with -o readonly=on; I'm not sure whether that's expected behaviour in that case.

It is still strange that the third pool works just fine with 2.2.6 though. The only other significant difference I can think of is that I've used block cloning on the broken(?) pools.

I'll go salvage my data, then try to import read-write and see if I can find out anything more. Otherwise this might just be a lack of forward-compatibility between 2.2 and 2.3, though I haven't used zpool upgrade on any of the pools in quite a while, so I wasn't expecting this kind of problem (and besides, all was working perfectly fine 24h ago!).


I am very happy to report that my pools actually seem to be fine. I just can't use ZFS <2.3 with them anymore, even though I never upgraded the pools. Oh well, might as well do that now.
For me personally, the solution will be "always use the same zfs version everywhere". I'm still at a loss as to how this suddenly became a problem after using those pools with 2.2.6 and 2.3.0 interchangeably for almost a week, and why it was never a problem for that third pool.
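
One diagnostic that might help when comparing the affected pools with the working one is to list which feature flags are actually in use on each. The sketch below parses the tab-separated output of `zpool get -H -o property,value all <pool>` and keeps the `feature@` properties whose value is `active`. The helper name `active_features` and the sample text are made up for illustration; the sample is not output from the pools in this issue, and nothing here establishes that a feature flag caused the import failure.

```python
def active_features(zpool_get_output):
    """Return the sorted names of feature flags whose value is 'active'.

    Expects the output of: zpool get -H -o property,value all <pool>
    (-H prints no header and separates fields with a single tab).
    """
    active = []
    for line in zpool_get_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 2:
            continue  # not a "property<TAB>value" line
        prop, value = fields
        if prop.startswith("feature@") and value.strip() == "active":
            active.append(prop[len("feature@"):])
    return sorted(active)


# Hypothetical sample, NOT taken from the issue.
SAMPLE_GET = (
    "size\t928G\n"
    "feature@async_destroy\tenabled\n"
    "feature@block_cloning\tactive\n"
    "feature@encryption\tactive\n"
)

print(active_features(SAMPLE_GET))  # ['block_cloning', 'encryption']
```

Running this for each pool and diffing the results would show whether the two failing pools have something active that the third pool does not.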

Interesting side-note, I did in fact shut the system down while there was still an ongoing trim – after importing it read-write again, zpool status showed the trim resuming:

  pool: hpool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:21:10 with 0 errors on Tue Jan 21 00:33:15 2025
checkpoint: created Wed Jan 22 00:18:41 2025, consumes 93.1M
config:

        NAME        STATE     READ WRITE CKSUM
        hpool       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sda     ONLINE       0     0     0  (trimming)
            sdb1    ONLINE       0     0     0

errors: No known data errors

And not sure how relevant, but zfs/dbgmsg mentioned something about data errors for this pool during import:

1737569877   ffffa0950de2a140 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'hpool' Verifying Log Devices
1737569877   ffffa0950de2a140 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'hpool' Verifying pool data
1737569877   ffffa0950de2a140 spa_misc.c:429:spa_load_note(): spa_load(hpool, config trusted): spa_load_verify found 0 metadata errors and 4 data errors
1737569877   ffffa0950de2a140 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'hpool' Calculating deflated space
1737569877   ffffa0950de2a140 spa_misc.c:2376:spa_import_progress_set_notes_impl(): 'hpool' Starting import

The other affected pool didn't have that, and a subsequent scrub found nothing:
scan: scrub repaired 0B in 00:15:22 with 0 errors on Wed Jan 22 21:27:39 2025
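
For anyone who wants to check their own dbgmsg for the same message, here is a small Python sketch that extracts the per-pool counts from `spa_load_verify` lines. The name `verify_counts` and the regular expression are assumptions derived from the single line quoted above, so the wording may not match other OpenZFS versions.

```python
import re

# Pattern assumed from the one spa_load_verify line quoted in this issue.
VERIFY_RE = re.compile(
    r"spa_load\((?P<pool>[^,]+), config \w+\): "
    r"spa_load_verify found (?P<meta>\d+) metadata errors? "
    r"and (?P<data>\d+) data errors?"
)


def verify_counts(dbgmsg_text):
    """Map pool name -> (metadata_errors, data_errors)."""
    counts = {}
    for line in dbgmsg_text.splitlines():
        m = VERIFY_RE.search(line)
        if m:
            counts[m.group("pool")] = (int(m.group("meta")), int(m.group("data")))
    return counts


# Line copied from the import log above.
VERIFY_SAMPLE = (
    "1737569877   ffffa0950de2a140 spa_misc.c:429:spa_load_note(): "
    "spa_load(hpool, config trusted): spa_load_verify found 0 metadata "
    "errors and 4 data errors"
)

print(verify_counts(VERIFY_SAMPLE))  # {'hpool': (0, 4)}
```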
