[PVE-User] ZFS, grub cannot load second stage...
Marco Gaiarin
2018-07-19 16:34:59 UTC
I've installed a small server with PVE 5.2 on ZFS, as I have done
before.
The only difference is that this server has a RAID-1 zpool with 4 disks (two
1TB and two 4TB disks), all in a single 5TB rpool.
The server has a plain AHCI controller, no hardware RAID.

After I configured the server and set up some VMs/LXC containers on it,
a user probably powered it off inadvertently.


The server never came back up, and prints on the console:

error: no such device: aeXXXXX.
error: unknown filesystem.
Entering rescue mode...
grub rescue>

In rescue mode I can 'ls' all the disks and they seem OK (the number of
disks and partitions matches), but I cannot 'ls' inside them (filesystem
not found, or a similar error).


I've googled and found:

https://forum.proxmox.com/threads/crashes-with-zfs-root-and-stuck-on-grub-rescue-prompt.34172/

and I was able to boot the install CD (its rescue mode does nothing; it
says there's no rpool) and with:

zpool import -f -d /dev/ -R /mnt rpool

I mounted the pool, chrooted into it and tried to reinstall grub and
rebuild the grub config and initrd. The data seems OK.
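
For reference, the attempt above could look roughly like the sketch below. It is not the exact command history: the disk names in DISKS are hypothetical placeholders, /mnt is assumed as the alternate root, and the script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless RUN=1 is set in the environment.
# DISKS is a hypothetical list of boot disks; MNT is the assumed alternate root.
DISKS="${DISKS:-/dev/sda /dev/sdb}"
MNT="${MNT:-/mnt}"

run() {
    # Echo the command; execute it only when RUN=1.
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

reinstall_grub() {
    # Import the pool under the alternate root, as in the post.
    run zpool import -f -d /dev/ -R "$MNT" rpool
    # Make the kernel filesystems visible inside the chroot.
    for fs in dev proc sys; do
        run mount --rbind "/$fs" "$MNT/$fs"
    done
    # Reinstall the boot loader on every disk the BIOS may boot from.
    for disk in $DISKS; do
        run chroot "$MNT" grub-install "$disk"
    done
    # Regenerate grub.cfg and the initrd for all installed kernels.
    run chroot "$MNT" update-grub
    run chroot "$MNT" update-initramfs -u -k all
}
```

Calling `reinstall_grub` prints the eight commands for review; `RUN=1 reinstall_grub` would actually execute them.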

But after a reboot, it throws the same error.


How can I fix that?


Does the hex code in 'error: no such device: aeXXXXX.' refer to the
'ROOT' zfs volume? How can I determine whether it matches correctly?
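
One possible way to check, as a sketch under an assumption: as far as I understand GRUB's ZFS support, the identifier it searches for is the GUID of the whole pool printed as 16 hex digits, not an ID of the 'ROOT' dataset. The helper name guid_to_hex is made up for this example; 'zpool get guid' prints the GUID in decimal, so it has to be converted before comparing.

```shell
#!/bin/sh
# Convert a decimal ZFS GUID into 16 zero-padded hex digits
# (assumed to be the form GRUB prints in "no such device").
guid_to_hex() {
    printf '%016x\n' "$1"
}

# With the pool imported (pool name "rpool" as in the post):
#   guid_to_hex "$(zpool get -H -o value guid rpool)"
# From inside the chroot, the ID that GRUB itself computes for / is:
#   grub-probe --target=fs_uuid /
```

If the converted GUID starts with the same digits as the value in the error message, GRUB is looking for the right pool and the problem is more likely that it cannot read it.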


Please, help me. Thanks.
--
dott. Marco Gaiarin GNUPG Key ID: 240A3D66
Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/
Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797

Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
(cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)
Marco Gaiarin
2018-07-19 16:47:57 UTC
Hello! Marco Gaiarin
On that day wrote...
Post by Marco Gaiarin
zpool import -f -d /dev/ -R /mnt rpool
Sorry, a small note.

I was forced to use -d /dev because if I omit it, zpool finds only two
disks out of 4 (it does not find the two disks added after the first
install). This seems strange.


As stated, the server had been rebooted many times before, so, in any
case, it used to work.


Again, thanks.
Marco Gaiarin
2018-07-20 15:56:39 UTC
Post by Marco Gaiarin
Sorry, a small note.
Again, two small notes. In the end I managed to extract the VMs
and containers from the 'dead' system and reinstalled, but this time
using TWO zfs pools.

Anyway...

a) I noticed that, even though I had used many tools, the 4TB pair of
disks still carried the labels of the previous setup (Linux software
RAID); I had to use dd to write zeroes over some megabytes of the disks,
not just the first 512 bytes.

I suppose this is the 'mother of all my troubles', but I still
don't understand why the system worked perfectly until I ''filled''
the disks with data.
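
A note on why the first 512 bytes are not enough: Linux md keeps its superblock either at the end of the device (metadata 0.90 and 1.0) or a few KiB from the start (1.1 at offset 0, 1.2 at 4 KiB), so zeroing only the first sector leaves it in place. Which version these disks had is not known. A less manual alternative to dd might be the sketch below; /dev/sdX is a placeholder, and the script only prints the commands unless RUN=1 is set, because the last two are destructive.

```shell
#!/bin/sh
# Sketch only: prints the commands unless RUN=1 is set. DESTRUCTIVE with RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

wipe_old_labels() {
    # $1: whole disk or partition that carried the old md RAID member.
    dev="$1"
    run wipefs "$dev"                    # list known signatures, changes nothing
    run mdadm --examine "$dev"           # show the md superblock, if any
    run mdadm --zero-superblock "$dev"   # erase the md superblock wherever it sits
    run wipefs --all "$dev"              # erase every remaining known signature
}
```

`wipe_old_labels /dev/sdX` prints the four commands; they should only be run for real on a disk that is about to be reused.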


b) I discovered that systemd does not work in a chroot environment, so
I was not able to start proxmox; only after fiddling a bit with the
services was I able to mount the /etc/pve filesystem and back up the
containers with vzdump.
VM backup still did not work, so I made a disk dump with dd and
recreated an identical machine on the reinstalled system.

A wiki page on how to work in such an 'emergency mode' would probably
be useful.


Sorry, I was short on time and so I did not write down the procedure I
used, but roughly:

a) boot the proxmox 5.2 CD; press Ctrl-C on terminal 1 to interrupt the
installer and get a shell (please, add some shells to the installer
CD!).

b) import the 'broken' pool and mount it somewhere; also mount /proc,
/sys and /dev and chroot into it.

c) start pmxcfs manually to mount /etc/pve

d) back up containers with 'vzdump'

e) back up VM disks with dd ('zfs list' for the list of zfs devices)

f) back up /etc for reference and for VM parameters (e.g., MAC addresses)

Clearly, to test all of this there's no need for a 'broken' system;
simply boot the installer CD on a working one.
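
Steps b) to f) above can be sketched roughly as follows. This is a reconstruction, not the exact commands used: the container ID, the zvol name and the backup directory are hypothetical placeholders, 'pmxcfs -l' (local mode) is an assumption about how to start it without a cluster, and the script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: prints the commands unless RUN=1 is set in the environment.
# CTID, VMDISK and BACKUP are hypothetical placeholders.
MNT="${MNT:-/mnt}"
CTID="${CTID:-100}"
VMDISK="${VMDISK:-rpool/data/vm-101-disk-1}"
BACKUP="${BACKUP:-/backup}"   # path as seen inside the chroot

run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

in_chroot() {
    run chroot "$MNT" "$@"
}

rescue_backup() {
    # b) import the pool under $MNT and prepare the chroot
    run zpool import -f -d /dev/ -R "$MNT" rpool
    for fs in dev proc sys; do
        run mount --rbind "/$fs" "$MNT/$fs"
    done
    # c) start pmxcfs by hand so that /etc/pve gets mounted
    in_chroot pmxcfs -l
    # d) back up a container
    in_chroot vzdump "$CTID" --dumpdir "$BACKUP"
    # e) list the zvols, then dump one VM disk
    in_chroot zfs list -t volume
    in_chroot dd "if=/dev/zvol/$VMDISK" "of=$BACKUP/vm-disk.raw" bs=1M
    # f) keep a copy of /etc (VM parameters live under /etc/pve)
    in_chroot tar -czf "$BACKUP/etc.tar.gz" /etc
}
```

Running `rescue_backup` without RUN=1 just lists the nine commands, which is also a safe way to rehearse the procedure on a working system.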


Thanks.