Discussion:
[PVE-User] Cephfs starting 2nd MDS
Vadim Bulst
2018-08-07 10:13:11 UTC
Dear list,

I'm trying to bring up a second mds with no luck.

This is what my ceph.conf looks like:

[global]

         auth client required = cephx
         auth cluster required = cephx
         auth service required = cephx
         cluster network = 10.10.144.0/24
         filestore xattr use omap = true
         fsid = 5349724e-fa96-4fd6-8e44-8da2a39253f7
         keyring = /etc/pve/priv/$cluster.$name.keyring
         osd journal size = 5120
         osd pool default min size = 1
         public network = 172.18.144.0/24
         mon allow pool delete = true

[osd]
         keyring = /var/lib/ceph/osd/ceph-$id/keyring

[mon.2]
         host = scvirt03
         mon addr = 172.18.144.243:6789

[mon.0]
         host = scvirt01
         mon addr = 172.18.144.241:6789
[mon.1]
         host = scvirt02
         mon addr = 172.18.144.242:6789

[mds.0]
        host = scvirt02
[mds.1]
        host = scvirt03


I did the following to set up the service:

apt install ceph-mds

mkdir /var/lib/ceph/mds

mkdir /var/lib/ceph/mds/ceph-$(hostname -s)

chown -R ceph:ceph /var/lib/ceph/mds

chmod -R 0750 /var/lib/ceph/mds

ceph auth get-or-create mds.$(hostname -s) mon 'allow profile mds' \
    mgr 'allow profile mds' osd 'allow rwx' mds 'allow' \
    > /var/lib/ceph/mds/ceph-$(hostname -s)/keyring

chmod -R 0600 /var/lib/ceph/mds/ceph-$(hostname -s)/keyring

systemctl enable ceph-mds@$(hostname -s).service

systemctl start ceph-mds@$(hostname -s).service


The service will not start. I also did the same procedure with the first
mds which is running with no problems.

1st mds:

***@scvirt02:/home/urzadmin# systemctl status -l ceph-mds@$(hostname
-s).service
● ceph-***@scvirt02.service - Ceph metadata server daemon
   Loaded: loaded (/lib/systemd/system/ceph-***@.service; enabled;
vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-***@.service.d
           └─ceph-after-pve-cluster.conf
   Active: active (running) since Thu 2018-06-07 13:08:58 CEST; 2
months 0 days ago
 Main PID: 612704 (ceph-mds)
   CGroup:
/system.slice/system-ceph\x2dmds.slice/ceph-***@scvirt02.service
           └─612704 /usr/bin/ceph-mds -f --cluster ceph --id scvirt02
--setuser ceph --setgroup ceph

Jul 29 06:25:01 scvirt02 ceph-mds[612704]: 2018-07-29 06:25:01.792601
7f6e4bae0700 -1 received  signal: Hangup from  PID: 3831071 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Jul 30 06:25:02 scvirt02 ceph-mds[612704]: 2018-07-30 06:25:02.081591
7f6e4bae0700 -1 received  signal: Hangup from  PID: 184355 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Jul 31 06:25:01 scvirt02 ceph-mds[612704]: 2018-07-31 06:25:01.448571
7f6e4bae0700 -1 received  signal: Hangup from  PID: 731440 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Aug 01 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-01 06:25:01.274541
7f6e4bae0700 -1 received  signal: Hangup from  PID: 1278492 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Aug 02 06:25:02 scvirt02 ceph-mds[612704]: 2018-08-02 06:25:02.009054
7f6e4bae0700 -1 received  signal: Hangup from  PID: 1825500 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Aug 03 06:25:02 scvirt02 ceph-mds[612704]: 2018-08-03 06:25:02.042845
7f6e4bae0700 -1 received  signal: Hangup from  PID: 2372815 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Aug 04 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-04 06:25:01.404619
7f6e4bae0700 -1 received  signal: Hangup from  PID: 2919837 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Aug 05 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-05 06:25:01.214749
7f6e4bae0700 -1 received  signal: Hangup from  PID: 3467000 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Aug 06 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-06 06:25:01.149512
7f6e4bae0700 -1 received  signal: Hangup from  PID: 4014197 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
Aug 07 06:25:01 scvirt02 ceph-mds[612704]: 2018-08-07 06:25:01.863104
7f6e4bae0700 -1 received  signal: Hangup from  PID: 367698 task name:
killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0

2nd mds:

***@scvirt03:/home/urzadmin# systemctl status -l ceph-mds@$(hostname
-s).service
● ceph-***@scvirt03.service - Ceph metadata server daemon
   Loaded: loaded (/lib/systemd/system/ceph-***@.service; enabled;
vendor preset: enabled)
  Drop-In: /lib/systemd/system/ceph-***@.service.d
           └─ceph-after-pve-cluster.conf
   Active: inactive (dead) since Tue 2018-08-07 10:27:18 CEST; 1h
38min ago
  Process: 3620063 ExecStart=/usr/bin/ceph-mds -f --cluster ${CLUSTER}
--id scvirt03 --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
 Main PID: 3620063 (code=exited, status=0/SUCCESS)

Aug 07 10:27:17 scvirt03 systemd[1]: Started Ceph metadata server daemon.
Aug 07 10:27:18 scvirt03 ceph-mds[3620063]: starting mds.scvirt03 at -
Aug 07 10:27:18 scvirt03 ceph-mds[3620063]: 2018-08-07 10:27:18.008338
7f6be03816c0 -1 auth: unable to find a keyring on
/etc/pve/priv/ceph.mds.scvirt03.keyring: (13) Permission denied
Aug 07 10:27:18 scvirt03 ceph-mds[3620063]: 2018-08-07 10:27:18.008351
7f6be03816c0 -1 mds.scvirt03 ERROR: failed to get monmap: (13)
Permission denied


Content of /etc/pve/priv:

***@scvirt03:/home/urzadmin# ls -la /etc/pve/priv/
total 5
drwx------ 2 root www-data    0 Apr 15  2017 .
drwxr-xr-x 2 root www-data    0 Jan  1  1970 ..
-rw------- 1 root www-data 1675 Apr 15  2017 authkey.key
-rw------- 1 root www-data 1976 Jul  6 15:41 authorized_keys
drwx------ 2 root www-data    0 Apr 16  2017 ceph
-rw------- 1 root www-data   63 Apr 15  2017 ceph.client.admin.keyring
-rw------- 1 root www-data  214 Apr 15  2017 ceph.mon.keyring
-rw------- 1 root www-data 4224 Jul  6 15:41 known_hosts
drwx------ 2 root www-data    0 Apr 15  2017 lock
-rw------- 1 root www-data 3243 Apr 15  2017 pve-root-ca.key
-rw------- 1 root www-data    3 Jul  6 15:41 pve-root-ca.srl
-rw------- 1 root www-data   36 May 23 13:03 urzbackup.cred


What could be the reason for this failure?

Cheers,

Vadim
--
Vadim Bulst

Universität Leipzig / URZ
04109 Leipzig, Augustusplatz 10

phone: ++49-341-97-33380
mail: ***@uni-leipzig.de
Alwin Antreich
2018-08-07 13:30:07 UTC
Hello Vadim,
Post by Vadim Bulst
Dear list,
I'm trying to bring up a second mds with no luck.
[...]
The service will not start. I also did the same procedure with the first
mds which is running with no problems.
[...]
What could be the reason for this failure?
The ceph user has no permission to access the keyring under /etc/pve.
Add an [mds] section to ceph.conf pointing to the keyring, similar to
the OSD one. This way the MDS will find the key in its working
directory.
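
Alwin's suggestion can be sketched as a ceph.conf fragment; this is a
sketch rather than the exact section from the thread, with the keyring
path assumed to mirror the per-host directory created in the setup steps
above ($id expands to the daemon id, here the short hostname):

```ini
[mds]
    # let every MDS daemon find its key in its own working directory
    keyring = /var/lib/ceph/mds/ceph-$id/keyring
```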

Cheers,
Alwin
Vadim Bulst
2018-08-08 05:54:45 UTC
Hi Alwin,

thanks for your advice. But no success; still the same error.

mds-section:

[mds.1]
        host = scvirt03
        keyring = /var/lib/ceph/mds/ceph-scvirt03/keyring

Vadim
Post by Alwin Antreich
The ceph user has no permission to access the keyring under /etc/pve.
Add an [mds] section to ceph.conf pointing to the keyring, similar to
the OSD one. This way the MDS will find the key in its working
directory.
_______________________________________________
pve-user mailing list
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
Alwin Antreich
2018-08-08 07:22:38 UTC
Hi,
Post by Vadim Bulst
Hi Alwin,
thanks for your advice. But no success; still the same error.
[mds.1]
        host = scvirt03
        keyring = /var/lib/ceph/mds/ceph-scvirt03/keyring
Use a generic [mds] section instead:

[mds]
keyring = /var/lib/ceph/mds/ceph-$id/keyring

This way it will work for every MDS that you set up. Beyond that, no
extra options should be needed for the MDS to start.

No more than the lines below should be needed to get the MDS started:

mkdir -p /var/lib/ceph/mds/ceph-$SERVER
chown -R ceph:ceph /var/lib/ceph/mds/ceph-$SERVER
ceph --cluster ceph --name client.bootstrap-mds \
--keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth \
get-or-create mds.$SERVER osd 'allow rwx' mds 'allow' mon 'allow profile mds' \
-o /var/lib/ceph/mds/ceph-$SERVER/keyring
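
The naming convention behind those steps can be sketched as a small
shell snippet; $SERVER stands for the node's short hostname, and the
commented systemctl lines assume the stock ceph-mds@ template unit:

```shell
#!/bin/sh
# Sketch of the per-node naming used above: the MDS id is the short
# hostname, and its keyring lives in the daemon's working directory.
SERVER=$(hostname -s)
MDS_DIR=/var/lib/ceph/mds/ceph-$SERVER
echo "$MDS_DIR/keyring"
# On the node itself one would then enable and start the unit:
#   systemctl enable ceph-mds@$SERVER.service
#   systemctl start ceph-mds@$SERVER.service
```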

If it's not working, what's the output of 'systemctl status
ceph-mds@<id>'?

--
Cheers,
Alwin
Vadim Bulst
2018-08-08 09:29:45 UTC
Thanks guys - great help! All up and running :-)