Discussion:
[PVE-User] Bug when removing a VM
l***@ulrar.net
2018-06-21 08:04:58 UTC
Hi,

Not sure this is a bug, but if it's not, there should be a huge red
warning in the VM removal box, I think.

We've been migrating some VMs between clusters. For that I mounted the
old storage on the new cluster, re-created the VMs, stopped them on the
old cluster, started them on the new one, then used "move disk".

That works fine, but this morning a colleague deleted a bunch of VMs on
the new cluster, and we discovered with horror that when you delete VM
112, for example, it doesn't just remove the images/112 directory on the
storage the VM was using, it removes it on every attached storage.
So when he deleted a few VMs on the new cluster, it deleted the hard
drives of a bunch of other VMs on the old cluster that hadn't been
migrated yet.

Surprise ..
--
PGP Fingerprint : 0x624E42C734DAC346
Eneko Lacunza
2018-06-21 08:12:11 UTC
Hi,

I'm sorry for your troubles, I hope you had good backups.

You should never share storage between clusters.
-> If you must, or it's convenient to do so, just don't repeat the VM IDs...

For example, on an NFS server another thing you can do is use a
different directory for each cluster; when you want to migrate a VM
between clusters, just 'mv' the VM directories between the NFS
directories from the NFS CLI, but again be aware of repeated VM IDs!!
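On the NFS server itself that move is just something like this (the
export paths are purely illustrative):

    # each cluster has its own exported directory on the NFS server;
    # move VM 112's image directory from the old cluster's tree to the new one
    mv /export/pve-old/images/112 /export/pve-new/images/112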

Cheers
Post by l***@ulrar.net
Hi,
Not sure this is a bug, but if it's not, there should be a huge red
warning in the VM removal box, I think.
We've been migrating some VMs between clusters. For that I mounted the
old storage on the new cluster, re-created the VMs, stopped them on the
old cluster, started them on the new one, then used "move disk".
That works fine, but this morning a colleague deleted a bunch of VMs on
the new cluster, and we discovered with horror that when you delete VM
112, for example, it doesn't just remove the images/112 directory on the
storage the VM was using, it removes it on every attached storage.
So when he deleted a few VMs on the new cluster, it deleted the hard
drives of a bunch of other VMs on the old cluster that hadn't been
migrated yet.
Surprise ..
--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es
l***@ulrar.net
2018-06-21 08:26:51 UTC
Well, we have too many clusters to avoid repeating IDs; keeping track
would be pretty hard.

Can't really use "mv" to migrate, the downtime would be huge; that's why
we use the "move disk" option to do it online. All of that works great, I
just find it really, really strange that Proxmox deletes the directory
everywhere, not just on the related storage. At the very least a warning
would be nice, maybe even a list of the files it's going to remove when
you press the button? That way you could check it's valid.
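For now we can at least check by hand what each attached storage holds
for a given VMID before destroying it (the storage names below are just
examples):

    # show the volumes each storage associates with VMID 112
    pvesm list old-nfs --vmid 112
    pvesm list local-lvm --vmid 112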
Post by Eneko Lacunza
Hi,
I'm sorry for your troubles, I hope you had good backups.
You should never share storage between clusters.
-> If you must, or it's convenient to do so, just don't repeat the VM IDs...
For example, on an NFS server another thing you can do is use a
different directory for each cluster; when you want to migrate a VM
between clusters, just 'mv' the VM directories between the NFS
directories from the NFS CLI, but again be aware of repeated VM IDs!!
Cheers
Post by l***@ulrar.net
Hi,
Not sure this is a bug, but if it's not, there should be a huge red
warning in the VM removal box, I think.
We've been migrating some VMs between clusters. For that I mounted the
old storage on the new cluster, re-created the VMs, stopped them on the
old cluster, started them on the new one, then used "move disk".
That works fine, but this morning a colleague deleted a bunch of VMs on
the new cluster, and we discovered with horror that when you delete VM
112, for example, it doesn't just remove the images/112 directory on the
storage the VM was using, it removes it on every attached storage.
So when he deleted a few VMs on the new cluster, it deleted the hard
drives of a bunch of other VMs on the old cluster that hadn't been
migrated yet.
Surprise ..
--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es
--
PGP Fingerprint : 0x624E42C734DAC346
Dietmar Maurer
2018-06-21 08:43:53 UTC
In general, you should never mount storage on different clusters at the same
time. This is always dangerous - mostly because there is no locking and
because of VMID conflicts. If you do, mount at least read-only.
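If read-only is an option for you, one way to get it (assuming an
NFS-backed storage; the names and address below are made up) is to pass
'ro' through the mount options in the storage definition in
/etc/pve/storage.cfg on the importing cluster:

    nfs: old-storage
        server 192.0.2.10
        export /export/pve-old
        path /mnt/pve/old-storage
        content images
        options ro,vers=3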
Post by l***@ulrar.net
Not sure this is a bug, but if it's not, there should be a huge red
warning in the VM removal box, I think.
We've been migrating some VMs between clusters. For that I mounted the
old storage on the new cluster,
l***@ulrar.net
2018-06-21 09:17:46 UTC
Mounting it read-only doesn't allow easy migration between clusters,
which is the goal here. Unless someone knows a better way to migrate a
VM from one cluster to another with no more than a few seconds of
downtime, that's the best I could come up with.
Post by Dietmar Maurer
In general, you should never mount storage on different clusters at the same
time. This is always dangerous - mostly because there is no locking and
because of VMID conflicts. If you do, mount at least read-only.
Post by l***@ulrar.net
Not sure this is a bug, but if it's not, there should be a huge red
warning in the VM removal box, I think.
We've been migrating some VMs between clusters. For that I mounted the
old storage on the new cluster,
--
PGP Fingerprint : 0x624E42C734DAC346
Simone Piccardi
2018-06-21 13:48:28 UTC
Post by Dietmar Maurer
In general, you should never mount storage on different clusters at the same
time. This is always dangerous - mostly because there is no locking and
because of VMID conflicts. If you do, mount at least read-only.
Yes, that's dangerous (I was hurt by this).

But I still do not understand why, if you remove a VM that has a disk
hosted on a specific storage, it is also removed on all the other
storages, or, as happened to me, why all the logical volumes with the
same VMID in their name get removed.

We were using different Proxmox servers (as independent standalone
servers, since they must stay in totally separate networks) sharing LVM
over an FC-connected SAN.

We usually avoid using duplicate VMIDs, but one time a VM was created on
a server with a VMID that was already taken (on another server). It
worked, because it correctly created a logical volume with a different
disk number, but when that VM was removed, the other logical volume was
deleted as well.
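For context, on LVM storage Proxmox names the volumes after the VMID as
vm-<vmid>-disk-<n>, so on the shared volume group the two servers' disks
for that VMID ended up looking something like this (the VG name is made
up):

    # list every logical volume whose name carries VMID 112
    lvs --noheadings -o lv_name shared-vg | grep 'vm-112-'
    # -> vm-112-disk-1   (created by the first server)
    #    vm-112-disk-2   (created by the second server)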

I do not understand the reason for this behaviour: there is a clearly
identified logical volume (or disk name on a storage) written inside the
configuration; if just that one were removed, no problem would arise.
Why do you also remove other logical volumes or disks on different
storages, not related to the VM you are deleting?

Regards
Simone
--
Simone Piccardi Truelite Srl
***@truelite.it (email/jabber) Via Monferrato, 6
Tel. +39-347-1032433 50142 Firenze
http://www.truelite.it Tel. +39-055-7879597
Dietmar Maurer
2018-06-21 14:33:25 UTC
Post by Simone Piccardi
Post by Dietmar Maurer
In general, you should never mount storage on different clusters at the same
time. This is always dangerous - mostly because there is no locking and
because of VMID conflicts. If you do, mount at least read-only.
Yes, that's dangerous (I was hurt by this).
But I still do not understand why, if you remove a VM that has a disk
hosted on a specific storage, it is also removed on all the other
storages, or, as happened to me, why all the logical volumes with the
same VMID in their name get removed.
All volumes belong to a VM, indicated by the encoded VMID. If you
remove a VM, we remove all volumes belonging to that VM.
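For example, on a directory or NFS storage everything under images/112
is considered to belong to VM 112, whether or not the current config of
VM 112 references it (the mount path below is illustrative):

    # every image found here counts as a volume of VM 112 and is freed
    # when VM 112 is destroyed
    ls /mnt/pve/old-storage/images/112/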
l***@ulrar.net
2018-06-21 14:54:26 UTC
Post by Dietmar Maurer
Post by Simone Piccardi
Post by Dietmar Maurer
In general, you should never mount storage on different clusters at the same
time. This is always dangerous - mostly because there is no locking and
because of VMID conflicts. If you do, mount at least read-only.
Yes, that's dangerous (I was hurt by this).
But still I do not understand why, if you remove a VM that has a disk
hosted in a specific storage, it will removed also on all other storage
(they ), or, like it happened to me, all the logical volumes with the
same VMID number in the name.
All volume belongs to a VM, indicated by the encoded VMID. If you
remove a VM, we remove all volumes belonging to that VM.
You remove anything containing the VMID, even volumes that the VM config
isn't referring to. That's really, really strange, and there should be a
big, bright-red warning about it in the interface, I think, because
that's really not what you'd expect.
--
PGP Fingerprint : 0x624E42C734DAC346
Dietmar Maurer
2018-06-21 15:07:22 UTC
Post by l***@ulrar.net
Post by Dietmar Maurer
All volumes belong to a VM, indicated by the encoded VMID. If you
remove a VM, we remove all volumes belonging to that VM.
You remove anything containing the VMID, even volumes that the VM config
isn't referring to. That's really, really strange, and there should be
This is not strange if you use the system in the recommended way. You are
doing really dangerous things, so a warning would not help anyway. With
your setup, you will lose data anyway, sooner or later.
Dietmar Maurer
2018-06-21 14:36:23 UTC
Post by Simone Piccardi
Post by Dietmar Maurer
In general, you should never mount storage on different clusters at the same
time. This is always dangerous - mostly because there is no locking and
because of VMID conflicts. If you do, mount at least read-only.
Yes, that's dangerous (I was hurt by this).
But I still do not understand why, if you remove a VM that has a disk
hosted on a specific storage, it is also removed on all the other
storages, or, as happened to me, why all the logical volumes with the
same VMID in their name get removed.
We were using different Proxmox servers (as independent standalone
servers, since they must stay in totally separate networks) sharing LVM
over an FC-connected SAN.
I do repeat myself, but you should never do that (never). Locking does
not work, and it is likely that you will lose data.
David Lawley
2018-06-21 15:21:29 UTC
Just sharing an experience... not trying to hijack the thread.

Had an occurrence a few years back, on an older version of Proxmox, with
fencing and HA (it was a way older version). I did not stop the VM before
enabling HA. Guess what: I had ghosts in the machine. The VMs I had
enabled HA on started another copy, so I had duplicate VMs. Only one
showed up in the control panel. It was not until I stopped the VM and
found I could still ping it that I discovered this. I did not think it
would let me do it if it would screw something up. It kind of backed me
off of HA. Surely this is all better now, right?
Post by l***@ulrar.net
Hi,
Not sure this is a bug, but if it's not, there should be a huge red
warning in the VM removal box, I think.
We've been migrating some VMs between clusters. For that I mounted the
old storage on the new cluster, re-created the VMs, stopped them on the
old cluster, started them on the new one, then used "move disk".
That works fine, but this morning a colleague deleted a bunch of VMs on
the new cluster, and we discovered with horror that when you delete VM
112, for example, it doesn't just remove the images/112 directory on the
storage the VM was using, it removes it on every attached storage.
So when he deleted a few VMs on the new cluster, it deleted the hard
drives of a bunch of other VMs on the old cluster that hadn't been
migrated yet.
Surprise ..
Dietmar Maurer
2018-06-21 15:52:22 UTC
Post by David Lawley
could still ping it that I discovered this. I did not think it would let
me do it if it would screw something up. It kind of backed me off of HA.
Surely this is all better now, right?
If you reported a bug, you can view the status in the bug tracker.