[PVE-User] dual host HA solution in 5.2

Discussion:

Adam Weremczuk

2018-09-28 15:13:37 UTC

Hi all,

I have 2 identical servers, each having 4 NICs + Management and 15 x
500GB disks.

I'm trying Proxmox VE 5.2 and just found out the default recommended HA
solution requires 3 hosts, which I don't have.

I'm planning to stick to Debian stretch and LXC containers only.

I have the following architecture in mind on each:

- RAID1 virtual disk (2 disks) - 500GB space for Proxmox and containers
(Samba AD, LDAP, Cyrus mail, MySQL and a few more).

I will try to import pre-baked Turnkey Linux server apps:
https://www.turnkeylinux.org/all

If there is insufficient space, the containers will start using a
separate storage volume, which will be:

- RAID50 (6 + 6 disks) + 1 hot spare virtual disk - 5 TB of storage
space. This disk will be synced between servers over 2 directly
connected bonded gigabit links using DRBD/Pacemaker/Corosync.

Something similar to:
https://www.theurbanpenguin.com/drbd-pacemaker-ha-cluster-ubuntu-16-04/

- each host is connected to a core switch with 2 bonded gigabit links
using LAG/LACP (this is already in place and working nicely).

- LXC containers will be kept in sync on application level (2 domain
controllers, MySQL replication etc.)

- everything will be backed up by a dedicated Bacula server to LTO tapes
with off site rotation.
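
As a quick sanity check of the disk layout described above, here is a
minimal Python sketch. It assumes the textbook capacity rules (RAID1
yields one disk of usable space, each RAID5 group inside a RAID50 loses
one disk to parity) and decimal gigabytes; the function names are made
up for illustration, and the formatted capacity will come out somewhat
lower in practice.

```python
# Disk budget per server; disk size and layout are taken from the post.
DISK_GB = 500
TOTAL_DISKS = 15


def raid1_usable(disk_gb: int) -> int:
    """RAID1 mirrors its members, so usable space equals one disk."""
    return disk_gb


def raid50_usable(groups: int, disks_per_group: int, disk_gb: int) -> int:
    """RAID50 stripes across RAID5 groups; each group gives up one disk to parity."""
    return groups * (disks_per_group - 1) * disk_gb


system_gb = raid1_usable(DISK_GB)            # RAID1 pair for Proxmox and containers
storage_gb = raid50_usable(2, 6, DISK_GB)    # 6 + 6 disks
disks_used = 2 + 2 * 6 + 1                   # RAID1 pair + RAID50 members + hot spare

print(system_gb, storage_gb, disks_used)     # 500 5000 15
```

The numbers agree with the post: 500 GB for the system volume, 5 TB for
storage, and all 15 disks accounted for.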

I should be able to take one node offline at any time.

If I pause storage replication beforehand then it should cause no
disruptions.

If the master node goes boom without warning, then I guess some downtime
is possible, e.g. if the slave storage needs to run a consistency or
filesystem check.

With my solution I'm not too concerned about performance or a few hours
of downtime.

But data loss or prolonged outages are unacceptable.

Please advise if you have better ideas!

Regards,
Adam

Woods, Ken A (DNR)

2018-09-28 15:51:31 UTC

Post by Adam Weremczuk
Please advise if you have better ideas

Buy another server.

Mark Adams

2018-09-28 18:02:51 UTC

If you have to stick with 2 servers, personally I would go for ZFS as your
storage. Storage replication using ZFS in Proxmox has been made super
simple.

This is asynchronous though, unlike DRBD. You would have to manually start
your VMs should the "live" node go down, and the data will be out of date
by an amount that depends on how frequently you've told it to sync. IMO,
this is a decent setup if you are limited to 2 servers, and it is very simple.
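
To put a number on "out of date", here is a rough Python sketch of the
worst-case data loss window for scheduled asynchronous replication. It
assumes the replica always reflects the snapshot taken at the start of
the last completed run and that each run finishes before the next one
is due; the function name and the example values are hypothetical, not
taken from Proxmox.

```python
from datetime import timedelta


def worst_case_data_loss(interval: timedelta, run_duration: timedelta) -> timedelta:
    """Upper bound on how stale the replica can be when the source node dies.

    If the node fails just before a run completes, the replica still holds
    the snapshot from the previous run, which was taken one full interval
    plus one run duration earlier.
    """
    return interval + run_duration


# Hypothetical schedule: replicate every 15 minutes, each run takes 1 minute.
print(worst_case_data_loss(timedelta(minutes=15), timedelta(minutes=1)))
```

A shorter interval narrows the window but never closes it, which is the
practical difference from synchronous replication such as DRBD.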

Then you also get great features such as high-performance snapshots
(LVM sucks at this), clones and even really simple replication to another
server (i.e. a disaster recovery location) with pve-zsync. Not to mention
all the other features of ZFS: compression, checksumming etc. (google it
if you don't know).

Regards,
Mark

_______________________________________________
pve-user mailing list
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

Yannis Milios

2018-09-28 20:05:13 UTC

Permalink

Another option would be going cheap and adding something like this as a 3rd
node ...

https://pve.proxmox.com/wiki/Raspberry_Pi_as_third_node
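
Why a third, even tiny, node helps can be shown with a simplified
majority-vote model in Python. This is only a sketch: it assumes one
vote per node and a strict-majority rule, and it does not model any
cluster stack options; the function name is made up for illustration.

```python
def is_quorate(votes_online: int, total_votes: int) -> bool:
    """A partition is quorate when it holds a strict majority of all votes."""
    return votes_online > total_votes // 2


# Two nodes: losing one leaves 1 of 2 votes, which is not a majority.
print(is_quorate(1, 2))  # False

# Three voters: losing one still leaves 2 of 3 votes.
print(is_quorate(2, 3))  # True
```

With two nodes, any single failure costs quorum; with three voters, the
cluster survives the loss of any one of them.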

Adam Weremczuk

2018-10-01 08:13:10 UTC

Hi Yannis,

Thank you for the hint.

Would it make any difference if I use a tiny VM running on a different
server (VMware host) instead?

That way I wouldn't need to buy any new hardware, could run Debian
stretch across all nodes, take snapshots etc.

Thanks,
Adam

Adam Weremczuk

2018-10-01 10:06:48 UTC

Hi Ronny,

Do you know if it's possible to have the 2 "real nodes" holding all the
data connected directly with a 2 x 1 Gbps bond and the third "dummy"
quorum member (tiny VM) accessible via standard LAN interface, i.e.
traffic going through LAN switches and VMware bond?

My 2 servers have 4 Ethernet ports each, so it seems to make sense.

They could sync data between them faster without clogging up the LAN,
and without purchasing a new switch for them to share, which would
introduce a single point of failure.
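
For a feel of what the direct bonded link buys, here is a
back-of-the-envelope Python estimate of a full resync of the 5 TB
volume. It assumes ideal line rate and decimal units; the function name
and the efficiency parameter are hypothetical, and real throughput
depends on the bonding mode, protocol overhead and disk speed.

```python
def full_sync_hours(volume_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours to copy volume_tb (decimal TB) over a link of link_gbps.

    efficiency is the fraction of line rate actually achieved (1.0 = ideal).
    """
    bits = volume_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


print(round(full_sync_hours(5, 2), 1))  # bonded 2 x 1 Gbps: 5.6
print(round(full_sync_hours(5, 1), 1))  # single 1 Gbps link: 11.1
```

Even in the ideal case an initial sync takes several hours, so keeping
that traffic off the LAN looks like a reasonable goal.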

Regards,
Adam

Post by Ronny Aasen
We run Debian stretch on RPis...
But any kind of node would work, even a virtual one on a different
system. As long as it does not run a VM load, it can be very lightweight.
Kind regards,
Ronny Aasen

Mark Schouten

2018-10-01 10:33:23 UTC

Post by Adam Weremczuk
Or purchasing a new switch for them to share and introducing a single
point of failure.

In your situation, I would wonder if a switch is really that big of a
SPOF.

It looks like you're trying to beat this:

[Image attachment not preserved in the archive]

--
Mark Schouten | Tuxis Internet Engineering
KvK: 61527076 | http://www.tuxis.nl/
T: 0318 200208 | ***@tuxis.nl

Adam Weremczuk

2018-10-01 08:23:10 UTC

Hi Mark,

If restricted to 2 servers, without a 3rd dummy quorum node, I'm not
really planning to replicate the containers.
They would be running in parallel and keep in sync on application level
(e.g. log file based MySQL replication).
The storage virtual disk or partition will only be used for NFS and CIFS
file sharing (Samba AD).
The idea is to eliminate the need for any manual intervention and
downtime, and to keep data in sync as much as possible.
I still haven't decided and am considering the pros and cons of different
architectures.

Thanks,
Adam