Discussion:
[PVE-User] Changing votes and quorum
Dewangga Alam
2018-10-28 13:54:45 UTC
Hello!

I am new to Proxmox and am trying to build a large-scale Proxmox 5.2
cluster (>128 nodes). My `/etc/pve/corosync.conf` configuration looks like this:

```
nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.24.2
  }
  ... skip till node 28 ...
  node {
    name: node28
    nodeid: 28
    quorum_votes: 1
    ring0_addr: 192.168.24.110
  }
}

quorum {
  provider: corosync_votequorum
  expected_votes: 16
  last_man_standing: 1
  last_man_standing_window: 20000
}

totem {
  cluster_name: px-cluster1
  config_version: 90
  window_size: 300
  interface {
    ringnumber: 0
  }
  ip_version: ipv4
  secauth: on
  transport: udpu
  version: 2
}
```

Each rack has 28 nodes, and the total will be 28 nodes * 5 racks, so 140
nodes in one cluster. I expected the adjustment above to have an effect,
but in fact it didn't.

So my basic question: when I invoke `pvecm status` on a node, the result
isn't what I expect. Is it possible to change the votequorum
configuration?


```
Quorum information
------------------
Date:             Sun Oct 28 20:51:42 2018
Quorum provider:  corosync_votequorum
Nodes:            28
Node ID:          0x00000002
Ring ID:          1/27668
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   28
Highest expected: 28
Total votes:      28
Quorum:           15
Flags:            Quorate LastManStanding
```
Thomas Lamprecht
2018-10-29 09:14:41 UTC
Hi!
Post by Dewangga Alam
Hello!
I am new to Proxmox and am trying to build a large-scale Proxmox 5.2
cluster (>128 nodes). My `/etc/pve/corosync.conf` configuration looks
like this:
```
[...]
quorum {
  provider: corosync_votequorum
  expected_votes: 16
```
expected_votes must be your real highest expected votes.
```
  last_man_standing: 1
  last_man_standing_window: 20000
}
[...]
```
Each rack has 28 nodes, and the total will be 28 nodes * 5 racks, so 140
nodes in one cluster. I expected the adjustment above to have an effect,
but in fact it didn't.
So my basic question: when I invoke `pvecm status` on a node, the result
isn't what I expect. Is it possible to change the votequorum
configuration?
What wasn't as expected? That the expected_votes you set is not
"accepted" by corosync? That is expected behaviour.

What is the real problem you want to solve?
```
Quorum information
------------------
[...]
Votequorum information
----------------------
Expected votes:   28
```
If more nodes are online than you set as expected, it will automatically
use the real node count, i.e. the formula is:

expected = max(user_set_expected, #nodes_quorate_online)
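
For example, with your expected_votes of 16 and 28 nodes online:

```
expected = max(16, 28) = 28
```

which matches the "Expected votes: 28" quoted above.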

Note that last_man_standing is not really recommended by us. If you
employ it nonetheless, please test it carefully before rolling it out in
production. Maybe also take a look at the wait_for_all flag to be a bit
on the safe side for cluster cold boots.
```
Highest expected: 28
Total votes:      28
Quorum:           15
Flags:            Quorate LastManStanding
```
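
As a rough, untested sketch (and assuming I recall the votequorum(5) man
page correctly that enabling wait_for_all is recommended together with
last_man_standing), the quorum section could then look like:

```
quorum {
  provider: corosync_votequorum
  last_man_standing: 1
  last_man_standing_window: 20000
  # only become quorate after all nodes have been seen at least
  # once following a full cluster cold start
  wait_for_all: 1
}
```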
Dewangga Alam
2018-10-29 16:36:39 UTC
Hello!

Thanks for your response, Thomas.
Post by Thomas Lamprecht
What wasn't as expected? That the expected_votes you set is not
"accepted" by corosync? That is expected behaviour.
What is the real problem you want to solve?
I want to build > 32 nodes in one cluster, and I expect that
expected_votes can be set lower than the real number of votes. I thought
it should still form a quorum even if the 50%+1 formula isn't met.
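(To put numbers on it: with all 140 nodes up, 50%+1 means floor(140/2) +
1 = 71 votes would be needed for quorum, but I would like the cluster to
be quorate with far fewer, e.g. my configured 16.)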

Is it a best practice?
Post by Thomas Lamprecht
If more nodes are online than you set as expected, it will automatically
use the real node count, i.e. the formula is:
expected = max(user_set_expected, #nodes_quorate_online)
So, if the real number of nodes online is 56 and I set expected_votes:
16, it will be overridden by the real number of online nodes, right? I
expect that once the first 16 nodes become visible, the cluster becomes
quorate as soon as possible.
Post by Thomas Lamprecht
Note that last_man_standing is not really recommended by us. If you
employ it nonetheless, please test it carefully before rolling it out in
production. Maybe also take a look at the wait_for_all flag to be a bit
on the safe side for cluster cold boots.
Thomas Lamprecht
2018-10-30 08:48:11 UTC
Hi!
Post by Dewangga Alam
I want to build > 32 nodes in one cluster,
cool!
Post by Dewangga Alam
and I expect that expected_votes can be set lower than the real number
of votes. I thought it should still form a quorum even if the 50%+1
formula isn't met.
No, that isn't possible. Highest expected cannot be smaller than the
actual number of online quorate nodes. That's like saying you hold an
election in your small country with 10 people (and thus 10 expected
votes) but receive, for example, 16 votes - something is fishy. Corosync
here just assumes that the census (in this case, you) was wrong and uses
the higher number.

Although, if you set expected_votes manually and the number of nodes
currently online/quorate is equal to or less than that value (even
though the total node count is higher), you will see that it stays at
your set number, and you can actually have quorum with "fewer" nodes.
But as soon as more nodes come online, corosync's expected vote count
will rise.
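
A worked example with your expected_votes of 16 (node counts illustrative):

```
10 nodes online:  expected = max(16, 10) = 16  ->  quorum = floor(16/2) + 1 = 9
28 nodes online:  expected = max(16, 28) = 28  ->  quorum = floor(28/2) + 1 = 15
```

The second line is exactly the "Expected votes: 28, Quorum: 15" your
pvecm status output showed.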

When enabling last_man_standing you can achieve that the highest
expected also scales down: if some nodes go down (or lose cluster
communication for another reason) but enough nodes are left for a
working, quorate cluster, then after a specified time window - if
everything stays working - corosync recalculates its expected votes, and
you can then lose additional nodes.
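
As an illustrative sequence (the node counts are assumptions, using your
last_man_standing_window of 20000 ms):

```
56 nodes online:            expected = 56, quorum = floor(56/2) + 1 = 29
20 nodes fail, 36 remain:   36 >= 29, cluster stays quorate
after 20 s stable window:   expected recalculated to 36, quorum = 19
```

Each step has to keep the cluster quorate; losing 30 of 56 nodes at once
(26 < 29) would lose quorum before any recalculation happens.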

Hope that helps a bit.

cheers,
Thomas
Dewangga Alam
2018-10-30 09:31:16 UTC
Hello!
Post by Thomas Lamprecht
Hi!
Post by Dewangga Alam
I want to build > 32 nodes in one cluster,
cool!
I am really scared right now. LoL.
Post by Thomas Lamprecht
When enabling last_man_standing you can achieve that the highest
expected also scales down: if some nodes go down (or lose cluster
communication for another reason) but enough nodes are left for a
working, quorate cluster, then after a specified time window - if
everything stays working - corosync recalculates its expected votes,
and you can then lose additional nodes.
OTOH, with last_man_standing it should be safer than wait_for_all,
shouldn't it? (E.g. if I have 56 nodes online in a cluster and then 30
nodes go down, it will trigger expected_votes to be recalculated
automatically, right?)

If it isn't, is there any best practice for adjusting corosync.conf in
large-scale deployments?