[PVE-User] DRS proxmox

Discussion:

Mohamed Sadok Ben Jazia

2016-04-05 09:04:38 UTC

Hi list,
For my Proxmox infrastructure, I have set up a cluster with a number of nodes.
I'm looking for a load balancer to handle these tasks:
- Choose the best node for a newly created or resized CT/VM.
- Live-migrate guests to free up resources on nodes, or for optimisation.
My idea is to create a dynamic resource scheduler that is integrated
into my server-side script to perform this function.

Here is the link to the project

https://github.com/BenJaziaSadok/proxmox-DRS

Any help with the algorithm or in the development is welcome

Thank you

Thomas Lamprecht

2016-04-05 09:42:20 UTC

Hi,

this idea was proposed quite some time ago and we planned to implement
it in the pve-ha-manager stack,
as it provides a lot of functionality needed for that.

The general idea from our side is:
* wait for the cluster to become stable (e.g. a few minutes no cluster
action),
* evaluate the load
* see if there is a configuration which makes the load more equal; here,
migrate "lighter" VMs first, else we may get too big system time delays,
which are bad for such systems and can cause instability.
* if there is any such configuration try to achieve it (migrating one VM
at a time).
* start at the beginning.
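
A minimal sketch of the "evaluate the load, then migrate one lighter VM" step from the list above. It is not code from pve-ha-manager: the `VM` class, the single `load` number per VM and the function name `plan_one_migration` are assumptions made for illustration, and the waiting and the migration itself are left out.

```python
from dataclasses import dataclass


@dataclass
class VM:
    vmid: int
    load: float  # hypothetical load figure, e.g. a CPU share


def plan_one_migration(nodes):
    """Return (vmid, source, target) for one migration that narrows the gap
    between the busiest and the idlest node, or None if no single move helps.

    nodes maps a node name to the list of VMs running on it.
    """
    totals = {name: sum(vm.load for vm in vms) for name, vms in nodes.items()}
    source = max(totals, key=totals.get)
    target = min(totals, key=totals.get)
    gap = totals[source] - totals[target]
    # Try the lightest VMs first, as they are the cheapest to move.
    for vm in sorted(nodes[source], key=lambda v: v.load):
        # Moving a VM of load x turns the gap into |gap - 2x|,
        # so the move only helps while 0 < x < gap.
        if 0 < vm.load < gap:
            return vm.vmid, source, target
    return None
```

Calling this once per round and performing at most the one returned migration matches the "one VM at a time" idea; a real implementation would still need a trustworthy load metric.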

A few questions are still open, e.g. how to determine load _correctly_, as
there are various setups and indicators (memory, CPU, network and
IO) which may have different effects on different setups.
What happens in edge cases (fencing, ...)?

Also a static value which can be assigned to VMs would be nice, as just
because a VM is lightweight

Thus we want to start simple, i.e. use static load balancing, then a
simple dynamic one (e.g. CPU only), and at best with a simulation which
can evaluate how often migrations happened and so on (wishlist).
And AFAIK, we want to "limit" it to HA groups, meaning a group should
be balanced over the nodes assigned to that group.
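
One possible reading of "static load balance" within an HA group, as a sketch only: every service carries an admin-assigned weight and services are spread over the group's nodes so that the summed weights stay close. The function name `balance_group`, the weight values and the heaviest-first greedy order are assumptions, not something taken from the HA manager, and the sketch ignores where services currently run.

```python
import heapq


def balance_group(group_nodes, static_weights):
    """Spread services over the nodes of one group by static weight.

    group_nodes: node names that belong to the group.
    static_weights: service id -> weight assigned by the admin.
    Returns service id -> node name.
    """
    # Min-heap of (accumulated weight, node name): the lightest node is on top.
    heap = [(0, node) for node in sorted(group_nodes)]
    heapq.heapify(heap)
    placement = {}
    # Place heavy services first; this greedy order tends to balance better.
    ordered = sorted(static_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    for sid, weight in ordered:
        total, node = heapq.heappop(heap)
        placement[sid] = node
        heapq.heappush(heap, (total + weight, node))
    return placement
```

Because the weights are static, the result only changes when services or nodes are added or removed, which avoids the feedback problems of measured load.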

The point of this message is to summarize our (or better, my) thoughts on
that topic, to let you know that something is already planned, and
also that there is a project of ours which someone who wants to implement
this could make use of, namely the Proxmox VE HA Manager.

I appreciate the fact that you want to make something for PVE and wish
you the best.
It could be worth a thought for you to use some of the HA manager stack;
using Perl would help here, as this way it could also land upstream.

best regards,
Thomas

_______________________________________________
pve-user mailing list
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user

Mohamed Sadok Ben Jazia

2016-04-05 10:03:29 UTC

Thank you Thomas
I'm going to describe my thoughts about the DRS based on the project I'm
working on, where I was stuck at this step.
Starting from many clusters in different subnets and locations, I want to
create a large number of LXC containers for my clients.
So for one cluster with many nodes and shared storage, it's a greedy
algorithm with best matches, and considering that LXC live migration is
not yet available, this is what I'm doing:

For each new container, or when resizing an existing one, I loop over all
available nodes in the cluster and pick the one that uses the most resources
without reaching the maximum possible hardware resources, in order to fill
nodes up. An optimization of this method is to do a silent migration when a
container is rebooted or restarted, based on the same logic.
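
The placement rule described above resembles a best-fit choice and could be sketched as follows. The `Node` fields, the use of CPU cores and RAM as the only resources, and the decision to allow a node to be filled exactly to its maximum are assumptions for illustration; the actual proxmox-DRS code may differ.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    max_cores: int
    max_ram_mb: int
    used_cores: int = 0
    used_ram_mb: int = 0


def pick_node(nodes, cores, ram_mb):
    """Return the fullest node that can still hold the container, or None."""
    best = None
    best_fill = -1.0
    for node in nodes:  # a single pass over the nodes
        if node.used_cores + cores > node.max_cores:
            continue
        if node.used_ram_mb + ram_mb > node.max_ram_mb:
            continue
        # Fill level after placement; the more constrained resource decides.
        fill = max((node.used_cores + cores) / node.max_cores,
                   (node.used_ram_mb + ram_mb) / node.max_ram_mb)
        if fill > best_fill:
            best, best_fill = node, fill
    return best
```

For a resize, the container's current allocation would first have to be subtracted from its node's usage, which this sketch does not do.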

What do you think of my logic (if it's clear so far)?
Also, this point is not clear to me: "* wait for the cluster to become
stable (e.g. a few minutes no cluster action)". Can you explain the reason?

best regards

Thomas Lamprecht

2016-04-06 06:28:16 UTC

Hi,

Post by Mohamed Sadok Ben Jazia
Thank you Thomas
I'm going to describe my thoughts about the DRS based on the project I'm
working on, where I was stuck at this step.
Starting from many clusters in different subnets and locations, I want to
create a large number of LXC containers for my clients.
So for one cluster with many nodes and shared storage, it's a greedy
algorithm with best matches, and considering that LXC live migration is
not yet available, this is what I'm doing:
For each new container, or when resizing an existing one, I loop over all
available nodes in the cluster and pick the one that uses the most resources
without reaching the maximum possible hardware resources, in order to fill
nodes up. An optimization of this method is to do a silent migration when a
container is rebooted or restarted, based on the same logic.

Ah okay, now I understand. This would be a "CT deployment tool" and
should definitely be more stable, as it avoids the problems I mentioned in
my email. It does not really change the cluster dynamically but only at
checkpoints (creating or stopping a CT). Sounds quite cool.

Post by Mohamed Sadok Ben Jazia
What do you think of my logic (if it's clear until now).

The summary above seems good to me. If you really plan to
create/start/stop a lot of containers in the cluster, try to keep the
evaluation algorithm rather simple so that it runs in O(n) time, else
you could run into performance problems.

Post by Mohamed Sadok Ben Jazia
Also, this point is not clear for me (* wait for the cluster to become
stable (e.g. a few minutes no cluster action), can you explain the reason.

I was thinking of (live) migrations here: they put some load on the network
and the nodes, while besides your algorithm there may also be other,
user-triggered actions running which need resources as well.
Further, I want to wait a bit to let the cluster stabilize, else
it could trigger unnecessary migrations or an out-of-control feedback
loop. But this affects you less, as you use static resource values (CPU
cores, max RAM), as far as I've understood.
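
The "wait until the cluster is stable" idea can be pictured as a simple quiet-period gate. This is a sketch under assumptions: the class name `StabilityGate`, the 300 second default and the way cluster actions are reported via `record_action` are hypothetical and not part of any Proxmox API.

```python
import time


class StabilityGate:
    """Allow a rebalancing round only after a quiet period."""

    def __init__(self, quiet_seconds=300.0, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self.clock = clock
        self.last_action = clock()

    def record_action(self):
        """Call whenever a migration, start, stop or similar action is seen."""
        self.last_action = self.clock()

    def is_stable(self):
        """True once no action has been recorded for quiet_seconds."""
        return self.clock() - self.last_action >= self.quiet_seconds
```

A balancer would call `record_action()` for its own migrations too, so that the load caused by one migration has settled before the next evaluation; this is what dampens the feedback loop.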

best regards,
Thomas

Mohamed Sadok Ben Jazia

2016-04-06 06:49:26 UTC

Thank you Thomas,
As I'm deploying a large number of CTs, this is the best algorithm for me. As I
mentioned above, I can't wait for the next version of PVE for live migration, and
doing a backup/restore is not friendly since it takes time.
I'm going to commit this small method to GitHub and improve it later.
Best regards

Thomas Lamprecht

2016-04-06 07:18:43 UTC

Post by Mohamed Sadok Ben Jazia
Thank you thomas,
As i'm deploying a large number of CT, this is the best algorithm, and as i
mentioned above, i can't wait next version of pve for live migration, also

Understandable.

Post by Mohamed Sadok Ben Jazia
doing a backup/restore is not friendly once it takes time.

I would also be strongly _against_ backup/restore in this case.

cheers,
Thomas
