Discussion: [PVE-User] Proxmox NFS issue
Muhammad Yousuf Khan
2013-11-06 15:05:13 UTC
I am facing slow read and write performance on our new NAS.
Here are the hardware details.

Proxmox :
12GB RAM
Xeon 3.2 (2 processors)
500GB HD for Proxmox and Debian OS.


Remote SAN/NAS:
OS : OmniOS
RAM : 12 GB
FS : ZFS
Sharing protocol : NFS
Xeon 3.2 (2 processors)
1 x 60GB SSD for L2ARC
1 x 120GB SSD for ZIL
3 mixed-capacity HDs (800GB, 500GB and 1TB), all with 64MB cache
ZFS RAID type : RAIDZ




When I am inside the VM and try to copy "to" the network or "from" the
network, I see very slow traffic. I am specifically talking about inside the VM.

Kindly find the attached file to see my read and write performance.
In the graphs, the red line shows copying data to the VM and the white line
shows copying data to the network.


When I copy data from the terminal, it shows good speed.

Here is some more detail on the mount point (output of my "df -h" command):

10.x.x.25:/tank/VMbase 903G 2.1G 901G 1% /nfscom

---------------------------------------------------------------------------
Here is a copy test in the terminal from the NFS mount point:

***@bull:/nfscom/images/1009# rsync --progress vm-1009-disk-1.raw /
vm-1009-disk-1.raw
824311808 7% 70.00MB/s 0:02:18

(You can see I am copying a 10GB file from the NFS mount to "/". This is the
same VM image that is showing the problem in the attached graphics.)

Now copying the same VM image from "/" back to the same NFS mount point:

***@bull:/nfscom/images# rsync --progress /vm-1009-disk-1.raw
/nfscom/images/
vm-1009-disk-1.raw
607682560 5% 63.71MB/s 0:02:35


You can see my reads and writes work great from the console, but when it
comes to the VM, I face issues inside the VM.

Actually, I asked this question a few weeks ago. Someone on the forum
suggested I buy an SSD for the ZIL, so it took me a while, but I bought two
SSDs: one for L2ARC and one for ZIL. Still, I am standing at the same point.

Can anyone please tell me what mistake I am making here?

I even tried FreeNAS with the same ZFS config and faced the same issue.

However, when using the same NFS share with VirtualBox on Ubuntu 12.x, it
does fine.

I tried every available disk type and all the cache modes available in
Proxmox, such as "no cache", "sync" and "write through", on both raw and
qcow2 image types, but nothing helped.


I don't know what I am doing wrong.

Please help me out. I am very near to banging my head :).
Michael Rasmussen
2013-11-06 16:00:32 UTC
Hi,

Maybe you could try the new ZFS plugin, which exposes native ZFS through
iSCSI?

This also gives you cloning and online native ZFS snapshots with raw images.
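
A minimal sketch of what such a storage entry could look like in
/etc/pve/storage.cfg, assuming the COMSTAR iSCSI target framework on OmniOS;
the storage name, portal, target IQN and pool below are placeholders, not
your actual values:

zfs: omnios-iscsi
        pool tank
        portal 10.x.x.25
        target iqn.2010-08.org.illumos:02:example:tank-target
        iscsiprovider comstar
        blocksize 4k
        content images

As far as I understand, the plugin creates the zvols over SSH, so the Proxmox
host also needs key-based SSH access to the OmniOS box.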
--
Hilsen/regards
Michael Rasmussen


Gilles Mocellin
2013-11-06 17:31:10 UTC
Post by Muhammad Yousuf Khan
I am facing slow read and write performance on our new NAS.
[...]
Post by Muhammad Yousuf Khan
When I am inside the VM and try to copy "to" the network or "from" the
network, I see very slow traffic. I am specifically talking about inside the VM.
Why do you think you have a problem with your storage? You're doing a
network transfer.

[...]
Post by Muhammad Yousuf Khan
---------------------------------------------------------------------------
Here is a copy test in the terminal from the NFS mount point:
vm-1009-disk-1.raw
824311808 7% 70.00MB/s 0:02:18
(You can see I am copying a 10GB file from the NFS mount to "/". This is the
same VM image that is showing the problem in the attached graphics.)
Now copying the same VM image from "/" back to the same NFS mount point:
/nfscom/images/
vm-1009-disk-1.raw
607682560 5% 63.71MB/s 0:02:35
You can see my reads and writes work great from the console, but when it
comes to the VM, I face issues inside the VM.
Here you're showing that you don't have a problem with your storage.
To be really sure, can you do a local disk I/O test inside your VM?
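
For example (assuming a Linux guest; for a Windows guest a disk benchmark
tool would be the equivalent), something like this run inside the VM would
exercise the virtual disk, and therefore the KVM/NFS path, without any SMB
copy involved:

dd if=/dev/zero of=/root/ddtest bs=1M count=2048 oflag=direct   # sequential write, bypassing the guest page cache
dd if=/root/ddtest of=/dev/null bs=1M iflag=direct              # sequential read back
rm /root/ddtest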

When you test network traffic in your VM, from or to the network as you say,
what exactly are you doing?
Do you transfer to/from your NAS or to another machine?

I hope you use virtio drivers, for storage and for network, inside your VM?
Muhammad Yousuf Khan
2013-11-06 17:50:28 UTC
On Wed, Nov 6, 2013 at 10:31 PM, Gilles Mocellin <
Post by Muhammad Yousuf Khan
I am facing slow read and write performance on our new NAS.
[...]
Post by Muhammad Yousuf Khan
When I am inside the VM and try to copy "to" the network or "from" the
network, I see very slow traffic. I am specifically talking about inside the VM.
Post by Gilles Mocellin
Why do you think you have a problem with your storage? You're doing a
network transfer.
Sorry for the misunderstanding. What I meant is that the VM hosted on NFS
is having the problem. On the other hand, the stats I showed with the
"rsync" command were proof that file transfer between Proxmox and the
storage is fine. The problem part is inside the VM, which is a Windows
Server 2003 guest.

Post by Gilles Mocellin
Here you're showing that you don't have a problem with your storage.
To be really sure, can you do a local disk I/O test inside your VM?
I will share a diagram of the results.
Post by Gilles Mocellin
When you test network traffic in your VM, from or to the network as you say,
what exactly are you doing?
Do you transfer to/from your NAS or to another machine?
The connectivity is like this: I have 3 servers, all in one broadcast domain
(the Samba storage server, the Proxmox server and the OmniOS server). All can
ping each other, meaning the OmniOS server is not behind the Proxmox host;
they are all on the same network.

I have a 700MB ISO image, which I demonstrated in the graphics attached to
my last email. I copied that 700MB file to the Samba box and then copied the
same file from the Samba box back to the VM, using a Windows share/SMB to
copy and paste the file.
Post by Gilles Mocellin
I hope you use virtio drivers, for storage and for network, inside your VM?
Nope! I am not using virtio for either disk or Ethernet. Is it important
to use virtio when using a storage box? I have never used external boxes
before; I only used local storage with RAID1 (mdadm) and the same old raw
and qcow images, and they were perfectly fine for me.

Please advise.
Gilles Mocellin
2013-11-06 18:26:12 UTC
On 06/11/2013 18:50, Muhammad Yousuf Khan wrote:
[...]
Post by Gilles Mocellin
I hope you use virtio drivers, for storage and for network, inside your VM?
Post by Muhammad Yousuf Khan
Nope! I am not using virtio for either disk or Ethernet. Is it important
to use virtio when using a storage box? I have never used external boxes
before; I only used local storage with RAID1 (mdadm) and the same old raw
and qcow images, and they were perfectly fine for me.
It's not because you use external storage, but the differences can be more
visible, because external storage is often slower than local storage, at
least in latency.
Post by Muhammad Yousuf Khan
Please advise.
I always try to use virtio drivers, as in theory they should be better:
lower latency and higher throughput.
But you must test; it can depend on the driver version and your OS.
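
For what it's worth, a VM set up that way has lines roughly like these in
/etc/pve/qemu-server/<vmid>.conf (the VMID, storage name and MAC address
below are just placeholders):

virtio0: nfscom:1009/vm-1009-disk-1.raw,cache=writeback
net0: virtio=DE:AD:BE:EF:00:01,bridge=vmbr0

For a Windows guest you have to install the virtio storage and network
drivers (the virtio-win ISO) before switching the disk bus, otherwise the
guest will not see its disk.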

After testing local disk I/O in your VM (to test the Windows driver/KVM/NFS
storage path), you could test the network without disk I/O, using iperf
between your VM and your Samba box.
I think you can find iperf for Windows. Here for example:
http://linhost.info/2010/02/iperf-on-windows/
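
Roughly like this, started on the server side first (the IP is a placeholder
for your Samba box):

iperf -s                       # on the Samba box, or any other physical machine
iperf.exe -c 10.x.x.30 -t 30   # inside the Windows VM, a 30-second test

If iperf already shows unstable throughput, the problem is most likely in
the virtual NIC/driver path rather than in the NFS storage.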
Muhammad Yousuf Khan
2013-11-06 18:39:51 UTC
On Wed, Nov 6, 2013 at 11:26 PM, Gilles Mocellin <
Post by Gilles Mocellin
[...]
I hope you use virtio drivers, for storage and for network, inside your VM?
Post by Muhammad Yousuf Khan
Nope! I am not using virtio for either disk or Ethernet. Is it important
to use virtio when using a storage box? I have never used external boxes
before; I only used local storage with RAID1 (mdadm) and the same old raw
and qcow images, and they were perfectly fine for me.
Post by Gilles Mocellin
It's not because you use external storage, but the differences can be more
visible, because external storage is often slower than local storage, at
least in latency.
Post by Muhammad Yousuf Khan
Please advise.
Post by Gilles Mocellin
I always try to use virtio drivers, as in theory they should be better:
lower latency and higher throughput.
But you must test; it can depend on the driver version and your OS.
After testing local disk I/O in your VM (to test the Windows driver/KVM/NFS
storage path), you could test the network without disk I/O, using iperf
between your VM and your Samba box.
http://linhost.info/2010/02/iperf-on-windows/
Thanks for sharing, I will try that one too and share the stats.
Thanks.
Pongrácz István
2013-11-06 17:43:06 UTC
Hi,

I just read your issue.

Some comments on your ZFS server setup:


For ZIL, in your config, 1-8GB is more than enough in any case.

L2ARC - it needs RAM to keep header information in ARC, so you should
probably use a smaller L2ARC than you actually have.


Example: for ZIL and L2ARC, you would be better off with 2 x 60GB SSDs:


2 x 40GB for L2ARC, striped, let's say sdb2 and sdc2 - total 80GB

2 x 5GB for ZIL, mirrored, let's say sdb1 and sdc1 - total 5GB (mirror)





You should check your ZFS setup in detail (compression, atime, ashift, dedup, etc.); see the example commands a few lines below.


compression: lz4, atime: off, ashift: 12, dedup: off, blocksize 128k



You should check your raw ZFS performance on the NAS itself; be careful, it is not as simple as it sounds.

Check your cache hit rates (ARC, L2ARC).

Check your I/O statistics under load (zpool iostat -v 1).

Read the manual of your chosen ZFS implementation carefully; seriously, it is a great tool, but it needs some knowledge.

Sign up to a ZFS-specific mailing list to get ZFS-specific help.
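
As a rough sketch, the checks above could look like this on the OmniOS box
(the dataset name is taken from your df output; arcstat may be installed as
arcstat.pl depending on the illumos build):

zfs get compression,atime,dedup,recordsize,sync tank/VMbase   # current properties
zfs set compression=lz4 tank/VMbase                           # enable lz4 compression
zfs set atime=off tank/VMbase                                 # stop updating access times
zpool iostat -v tank 1                                        # per-vdev I/O while a VM copy is running
arcstat 1                                                     # ARC/L2ARC hit rates, once per second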


Network:


Check your NFS setup on the ZFS server (sync vs. async).

Check your Proxmox NFS client settings, i.e. how you mount (see the example just below).
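
For example, on the ZFS server and on the Proxmox host respectively (the
dataset name comes from the earlier df output; nfsstat is part of the
standard Linux NFS client tools, nfs-common):

zfs get sync tank/VMbase   # on OmniOS: how sync writes are handled for the export
nfsstat -m                 # on the Proxmox host: the options the share was actually mounted with
mount | grep nfscom        # quick alternative to see the active mount options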


Proxmox:


Try to use the writeback cache mode.

Compare raw and qcow2 format performance and choose the better one.

Install Proxmox into a KVM guest and check its pveperf - a good indicator.

You can mount NFS manually and set up Proxmox to use that mount point as a simple directory -> then you can tune the NFS parameters (a sketch follows below).
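
A sketch of that approach; the mount point, mount options and storage name
are only examples, not a recommendation of specific values:

mkdir -p /mnt/omnios
mount -t nfs -o vers=3,tcp,hard,rsize=65536,wsize=65536 10.x.x.25:/tank/VMbase /mnt/omnios

Then point Proxmox at it as plain directory storage in /etc/pve/storage.cfg:

dir: omnios-nfs
        path /mnt/omnios
        content images

Running pveperf /mnt/omnios afterwards gives you a quick fsync/latency
indicator for that path.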


In KVM:


Try to use secure delete regularly or always (overwrite deleted files with zeros).


In general, if you tune one parameter, you may need to change other parameters as well; for example, if you use qcow2 as the image format on the Proxmox server, the ZFS compression should be zle or off.

My opinion: your problem at this moment is somewhere in your network/NFS setup; later you will have issues with ZFS under real-world load :)

Bye,

István

Muhammad Yousuf Khan
2013-11-06 18:27:47 UTC
- For ZIL, in your config, 1-8GB is more than enough in any case.
- L2ARC - it needs RAM to keep header information in ARC, so you should
probably use a smaller L2ARC than you actually have.
- Example: for ZIL and L2ARC, you would be better off with 2 x 60GB SSDs:
- 2 x 40GB for L2ARC, striped, let's say sdb2 and sdc2 - total 80GB
- 2 x 5GB for ZIL, mirrored, let's say sdb1 and sdc1 - total 5GB (mirror)
- You should check your ZFS setup in detail (compression, atime, ashift,
dedup, etc.)
First of all, I am really thankful for such an informative and helpful
email. As you have discussed so many points, I will have to test them one by
one. However, I have just one question which is still confusing me.

The stats that I showed in my "rsync" examples bump around 65MB/s to 70MB/s
read/write performance, which I think is not bad at all.

So my question is: why does it show good stats when I am sending or
receiving data between my storage and Proxmox, and show the delay only
inside the VM?

Let's say "compression, atime, ashift, dedup, etc." are all creating the
problem in the VM; then why are they not causing the same problem when
copying from Proxmox to OmniOS or from OmniOS to Proxmox?
- compression: lz4, atime: off, ashift: 12, dedup: off, blocksize 128k
- You should check your raw ZFS performance on the NAS itself; be careful,
it is not as simple as it sounds.
- Check your cache hit rates (ARC, L2ARC).
- Check your I/O statistics under load (zpool iostat -v 1).
- Read the manual of your chosen ZFS implementation carefully; seriously, it
is a great tool, but it needs some knowledge.
- Sign up to a ZFS-specific mailing list to get ZFS-specific help.
- Check your NFS setup on the ZFS server (sync vs. async).
- Check your Proxmox NFS client settings, i.e. how you mount.
Would you please tell me where I can tweak the NFS client settings in
Proxmox?
- Try to use the writeback cache mode.
I tried that, it didn't help.
- Compare raw and qcow2 format performance and choose the better one.
OK, I will try this, but I know I did the test previously and they both
ended up with the same issue.

A VERY NOTICEABLE THING IS: it does not happen all the time. E.g. for 5
seconds my network graph reaches 25Mbps and continues for 6 to 7 seconds,
then drops again to 0.90 to 0.70%, which is far too slow. This gives a
zig-zag graph, up and down, and I don't know why.

On the other hand, when copying from OmniOS to PVE and from PVE to OmniOS,
the bandwidth stats show 70MB/s, and this stays very constant until the file
transfer ends.
- Install Proxmox into a KVM guest and check its pveperf - a good indicator.
- You can mount NFS manually and set up Proxmox to use that mount point as a
simple directory -> then you can tune the NFS parameters.
- Try to use secure delete regularly or always (overwrite deleted files with
zeros).
In general, if you tune one parameter, you may need to change other
parameters as well; for example, if you use qcow2 as the image format on the
Proxmox server, the ZFS compression should be zle or off.
Can you please share a link where I can get more information on the above
point?

Thanks

One more question: when we are using NFS for hosting a VM, isn't it like we
are sending and receiving data via "scp" or "rsync"? Does a VM hosted on an
external box operate very differently?

One more confusing point which I want to share: with the same settings I am
hosting a VM in VirtualBox, and my VirtualBox VM sometimes hangs while
copying a big file, but the graph does not drop no matter how big the file
is. I mean the I/O reads/writes from VirtualBox are very constant.
Pongrácz István
2013-11-07 22:01:23 UTC
Hi,

Sorry, I do not have enough free time.

My comments are below, in red (an HTML viewer can help).

----------------original message-----------------
From: "Muhammad Yousuf Khan" ***@gmail.com
To: "Pongrácz István"

CC: "pve-user pve.proxmox.com"

Date: Wed, 6 Nov 2013 23:27:47 +0500
----------------------------------------------------------
First of all, I am really thankful for such an informative and helpful
email. As you have discussed so many points, I will have to test them one by
one. However, I have just one question which is still confusing me.
The stats that I showed in my "rsync" examples bump around 65MB/s to 70MB/s
read/write performance, which I think is not bad at all.
So my question is: why does it show good stats when I am sending or
receiving data between my storage and Proxmox, and show the delay only
inside the VM?
Probably you need to check KVM in various ways: use Linux on it, virtio,
etc. Check as many combinations as you can.
Let's say "compression, atime, ashift, dedup, etc." are all creating the
problem in the VM; then why are they not causing the same problem when
copying from Proxmox to OmniOS or from OmniOS to Proxmox?
You just narrowed your problem down to the VM; you are on your way :)
Would you please tell me where I can tweak the NFS client settings in
Proxmox?
Just issue the command "mount" and check the output for details. You can
manually mount the NFS export to a directory and set up that directory in
PVE as local directory storage.
Try to use the writeback cache mode.
I tried that, it didn't help.
Compare raw and qcow2 format performance and choose the better one.
OK, I will try this, but I know I did the test previously and they both
ended up with the same issue.
A VERY NOTICEABLE THING IS: it does not happen all the time. E.g. for 5
seconds my network graph reaches 25Mbps and continues for 6 to 7 seconds,
then drops again to 0.90 to 0.70%, which is far too slow. This gives a
zig-zag graph, up and down, and I don't know why.
On the other hand, when copying from OmniOS to PVE and from PVE to OmniOS,
the bandwidth stats show 70MB/s, and this stays very constant until the file
transfer ends.
Install Proxmox into a KVM guest and check its pveperf - a good indicator.
You can mount NFS manually and set up Proxmox to use that mount point as a
simple directory -> then you can tune the NFS parameters.
Try to use secure delete regularly or always (overwrite deleted files with
zeros).
In general, if you tune one parameter, you may need to change other
parameters as well; for example, if you use qcow2 as the image format on
the Proxmox server, the ZFS compression should be zle or off.
Can you please share a link where I can get more information on the above
point?
Just Google it; one good start is zfsonlinux.org.
Thanks
One more question: when we are using NFS for hosting a VM, isn't it like we
are sending and receiving data via "scp" or "rsync"? Does a VM hosted on an
external box operate very differently?
One more confusing point which I want to share: with the same settings I am
hosting a VM in VirtualBox, and my VirtualBox VM sometimes hangs while
copying a big file, but the graph does not drop no matter how big the file
is. I mean the I/O reads/writes from VirtualBox are very constant.
VirtualBox != KVM. They are different, which means they will act
differently. In KVM, you have a lot of parameters to test: change the
virtual network/disk type, CPU, etc. Anyway, this also narrows your problem
down to the KVM/guest operating system level, so you should focus on making
good tests.
Bye,
István
Muhammad Yousuf Khan
2013-11-08 07:20:53 UTC
Probably you need to check KVM in various ways: use Linux on it, virtio,
etc. Check as many combinations as you can.
Let's say "compression, atime, ashift, dedup, etc." are all creating the
problem in the VM; then why are they not causing the same problem when
copying from Proxmox to OmniOS or from OmniOS to Proxmox?
You just narrowed your problem down to the VM; you are on your way :)
Thanks for the input. I think qcow2 with writeback on virtio on Windows
Server 2008 is showing me some positive results.
I still need to test Server 2003, Win7 and Win8... a long way to test :(
but I am happy with the results. Though there are some minor glitches, I
successfully copied 14GB of data at 100 to 180Mbps. It is not that good, but
better than my last experience.

Now I am testing by connecting the storage directly to the server to avoid
the latency issue.