Discussion:
[PVE-User] please help setup correctly proxmox cluster
Юрий Авдеев
2018-10-22 05:02:52 UTC
Hello everyone,

Apologies for my bad English. I have tried about three times to set up a Proxmox cluster, but without success.
The cluster works fine for a day or two and then just dies.
The nodes still respond to each other over ICMP, but the cluster is dead and there is no quorum to power on any virtual machine.
I don't know what I am doing wrong. It is a clean install.
What I need: two hosts (node1 and node2) with one virtual machine replicated between them, without shared storage.
If one of the two hosts dies, the virtual machine should start on the other host. Node3 is online only for quorum, not for virtualization.
I am using ZFS for storage on the hosts. Please help me understand how to set this up correctly.
Many thanks, everybody!

-- 
Best regards,
Юрий Авдеев
+79046111135
Thomas Lamprecht
2018-10-22 07:08:45 UTC
Post by Юрий Авдеев
What I need: two hosts (node1 and node2) with one virtual machine replicated between them, without shared storage.
If one of the two hosts dies, the virtual machine should start on the other host. Node3 is online only for quorum, not for virtualization.
I am using ZFS for storage on the hosts. Please help me understand how to set this up correctly.
Many thanks, everybody!
Either just give one node more votes (preferred) or, instead of starting
up a VM with PVE for quorum, just execute `pvecm expected 1` if you are sure
the other node *is really* dead.

It's hard to say what's wrong without any logs. But running PVE in a VM
inside PVE itself should never be done, except for testing purposes: there
are too many intertwined dependencies, and it's just not a good, stable design.
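
To illustrate the "more votes" suggestion, here is a minimal Python sketch of the majority arithmetic, assuming the usual rule that quorum needs more than half of all expected votes. The function names and the vote tables are hypothetical and only model the idea; this is not how corosync is configured or queried.

```python
def quorum_threshold(total_votes: int) -> int:
    """Votes needed for quorum: a strict majority of all expected votes."""
    return total_votes // 2 + 1


def is_quorate(votes: dict[str, int], alive: set[str]) -> bool:
    """True if the nodes in `alive` together hold a majority of the votes."""
    total = sum(votes.values())
    present = sum(v for node, v in votes.items() if node in alive)
    return present >= quorum_threshold(total)


# Hypothetical two-node clusters: equal votes vs. one extra vote on node1.
equal = {"node1": 1, "node2": 1}
weighted = {"node1": 2, "node2": 1}

print(is_quorate(equal, {"node1"}))     # False: 1 of 2 votes, 2 needed
print(is_quorate(weighted, {"node1"}))  # True: 2 of 3 votes, 2 needed
print(is_quorate(weighted, {"node2"}))  # False: 1 of 3 votes, 2 needed
```

Under this assumption, the node with the extra vote keeps quorum on its own, while the other node still needs `pvecm expected 1` (or its peer) to become quorate.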
p***@mattern.org
2018-10-22 08:27:50 UTC
Hi,

Maybe you have a problem with IGMP snooping? What does "dying" mean? What
does the cluster do?
You can test using UDP unicast instead of multicast in corosync (see
https://pve.proxmox.com/wiki/Multicast_notes)

Marcus
_______________________________________________
pve-user mailing list
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
Юрий Авдеев
2018-10-22 10:27:22 UTC
Hi, and thanks for the reply!

Thanks for the idea, but please take a look at my setup step by step; maybe I missed something?

1. Hardware
Node1 - Supermicro server with a good CPU and 256 GB RAM, three 15k SAS disks, five SSDs.
Node2 - identical to Node1
Node3 - small PC with 4 GB RAM and one 500 GB SATA disk
Network: 1 Gbps over a VLAN (the nodes are not in the same place)

What I need: a virtual machine in HA mode (if one of the two nodes dies, the VM should start on the other node).

2. My configuration steps on Node1 and Node2:
- First of all, of course, a clean installation of Proxmox 5.2-1 on every node
- In the disk section of the installer I set up ZFS with the three 15k SAS disks in ZFS RAID1 mode
- The network for the nodes is in the private subnet 172.31.101.0/24
- After installation I configure additional storage with the five SSDs in ZFS RAIDZ3 mode

3. My configuration steps on Node3:
- Simple install of Proxmox 5.2-1 on the 500 GB SATA disk with ext4 and the default LVM setup
- The network for this node is in the private subnet 172.31.101.0/24 too

4. Configuring Proxmox:
- I log in to Node2 via the web interface and create the cluster (Datacenter => Cluster => Create cluster)
- Then I log in to Node1 and click "Join Cluster", paste the join information there, and the cluster is ready
- After that I configure the local storage parameters and restrict the SSD storage to these two nodes only
- Now it is time to add Node3. I paste the join information, and the cluster is online with 3 nodes and quorum.

5. Virtual machine
- The VM is created and placed on node2
- The same physical network is used (but another IP subnet)
- I set up replication of this VM to node1 (every minute, */1)
- I set up HA for this machine and request the "started" state
- Then I create an HA group in which I select only node1 and node2, and I also tick the "restricted" checkbox

The setup works fine. If a node dies, the VM is started on the other node once fencing completes.
But this setup dies after one or two days.
After that, the cluster looks as if it were not defined, but the nodes are still reachable from each other on the network.
The VM does not run, and all work is stalled.

I read something in the manuals about a MASTER node... Maybe I must register node3 as the master?
But if I do that, I won't get the SSD storage, which is present on node1 and node2 but not on node3.

Thanks, everyone, for the replies!

-- 
Best regards,
Юрий Авдеев
+79046111135
Yannis Milios
2018-10-22 12:05:41 UTC
The previous two posts have already given you enough tips (including a
link to the wiki) on how to troubleshoot your situation.

It’s now up to you to make the effort to read carefully what is being
said there, in order first to understand and then to troubleshoot the problem.

In my opinion (and that of the other posters), this is caused by some kind of
malfunction in the cluster communication. If the cluster communication is
not working properly, then you will see this kind of behaviour.
I would pay particular attention to the fact that the nodes are
not “in the same place”, as you stated, hence the need to implement the VLAN
approach.
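
As a rough illustration of why broken cluster communication looks like "no quorum" even though the nodes still answer ICMP, here is a small Python sketch. It assumes one vote per node and a simple majority rule; the function name and the partition lists are hypothetical, chosen only to model the three-node setup described in this thread.

```python
def quorate_partitions(votes, partitions):
    """Return the partitions (as sorted node lists) that hold a vote majority."""
    total = sum(votes.values())
    needed = total // 2 + 1  # strict majority of all expected votes
    return [
        sorted(part)
        for part in partitions
        if sum(votes[node] for node in part) >= needed
    ]


# Hypothetical three-node cluster, one vote each.
votes = {"node1": 1, "node2": 1, "node3": 1}

# One node is down: the remaining pair still holds 2 of 3 votes.
one_node_lost = [{"node1", "node3"}, {"node2"}]
print(quorate_partitions(votes, one_node_lost))  # [['node1', 'node3']]

# Cluster traffic is lost between all nodes: every node sees only itself.
all_isolated = [{"node1"}, {"node2"}, {"node3"}]
print(quorate_partitions(votes, all_isolated))   # []
```

In the second case no node reaches a majority, so under these assumptions nothing can be started anywhere, which would match the symptoms reported above.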
Юрий Авдеев
2018-10-22 12:09:47 UTC
Thanks for the reply.
I have now set up corosync with the UDP transport.
If the trouble comes back, I will send information about it.

-- 
Best regards,
Юрий Авдеев
+79046111135