Hi Eric,
Regarding: "From the output of “ceph osd pool ls detail” you can see min_size=4, the crush rule says min_size=3, however the pool does NOT survive 2 hosts failing. Am I missing something?"
With your k=3, m=2 EC profile the pool was created with min_size = k+1 = 4 (as your "ceph osd pool ls detail" output shows), so once two hosts are down only k=3 chunks remain, which is below min_size, and the PGs stop serving I/O. To still read/write to the pool with two host failures you need to set the pool min_size to 3, i.e. to k. Keep in mind that at min_size=k the pool serves I/O with no redundancy margin left until recovery completes.
RUN: sudo ceph osd pool set ec32pool min_size 3
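If you want to confirm it took effect (pool name taken from your create command, ec32pool; note your "ls detail" output names the pool 'es32', so use whichever name your pool actually has):
RUN: sudo ceph osd pool get ec32pool min_size
RUN: sudo ceph osd pool ls detail | grep -E 'ec32pool|es32'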
Kind regards
Geoffrey Rhodes
On Mon, 27 Jan 2020 at 22:11, <ceph-users-request(a)ceph.io> wrote:
Send ceph-users mailing list submissions to
ceph-users(a)ceph.io
To subscribe or unsubscribe via email, send a message with subject or
body 'help' to
ceph-users-request(a)ceph.io
You can reach the person managing the list at
ceph-users-owner(a)ceph.io
When replying, please edit your Subject line so it is more specific
than "Re: Contents of ceph-users digest..."
Today's Topics:
1. Re: EC pool creation results in incorrect M value? (Paul Emmerich)
2. Re: EC pool creation results in incorrect M value? (Smith, Eric)
3. Re: EC pool creation results in incorrect M value? (Smith, Eric)
4. data loss on full file system? (Håkan T Johansson)
----------------------------------------------------------------------
Date: Mon, 27 Jan 2020 17:14:55 +0100
From: Paul Emmerich <paul.emmerich(a)croit.io>
Subject: [ceph-users] Re: EC pool creation results in incorrect M
value?
To: "Smith, Eric" <Eric.Smith(a)ccur.com>
Cc: "ceph-users(a)ceph.io" <ceph-users(a)ceph.io>
Message-ID: <CAD9yTbFb28FX_XaNXRUUxQ1UqAYa_ouhO_fkE+Vbts1VXGkuDw(a)mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"
min_size in the crush rule and min_size in the pool are completely
different things that happen to share the same name.
Ignore min_size in the crush rule, it has virtually no meaning in
almost all cases (like this one).
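For example, the two values can be read independently (pool and rule names taken from the mail below):
ceph osd pool get es32 min_size                  # the pool setting that actually gates I/O
ceph osd crush rule dump es32 | grep min_size    # the CRUSH rule field, informational here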
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at
https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Mon, Jan 27, 2020 at 3:41 PM Smith, Eric <Eric.Smith(a)ccur.com> wrote:
I have a Ceph Luminous (12.2.12) cluster with 6 nodes. I’m attempting to create an EC 3+2 pool with the following commands:
Create the EC profile:
ceph osd erasure-code-profile set es32 k=3 m=2 plugin=jerasure w=8 technique=reed_sol_van crush-failure-domain=host crush-root=sgshared
Verify profile creation:
[root@mon-1 ~]# ceph osd erasure-code-profile get es32
crush-device-class=
crush-failure-domain=host
crush-root=sgshared
jerasure-per-chunk-alignment=false
k=3
m=2
plugin=jerasure
technique=reed_sol_van
w=8
Create a pool using this profile:
ceph osd pool create ec32pool 1024 1024 erasure es32
List pool detail:
pool 31 'es32' erasure size 5 min_size 4 crush_rule 11 object_hash
rjenkins pg_num 1024 pgp_num 1024 last_change 1568 flags hashpspool
stripe_width 12288 application ES
Here’s the crush rule that’s created:
{
    "rule_id": 11,
    "rule_name": "es32",
    "ruleset": 11,
    "type": 3,
    "min_size": 3,
    "max_size": 5,
    "steps": [
        {
            "op": "set_chooseleaf_tries",
            "num": 5
        },
        {
            "op": "set_choose_tries",
            "num": 100
        },
        {
            "op": "take",
            "item": -2,
            "item_name": "sgshared"
        },
        {
            "op": "chooseleaf_indep",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
},
From the output of “ceph osd pool ls detail” you can see min_size=4, the crush rule says min_size=3, however the pool does NOT survive 2 hosts failing.
Am I missing something?
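(For reference, while the two hosts are down the PGs that have fallen below min_size should show up as inactive/undersized, e.g. via:
ceph health detail
ceph pg dump_stuck inactive
)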
------------------------------
Date: Mon, 27 Jan 2020 16:22:06 +0000
From: "Smith, Eric" <Eric.Smith(a)ccur.com>
Subject: [ceph-users] Re: EC pool creation results in incorrect M
value?
To: Paul Emmerich <paul.emmerich(a)croit.io>
Cc: "ceph-users(a)ceph.io" <ceph-users(a)ceph.io>
Message-ID: <BN8PR14MB28206D6D29507FD9499DEE91EA0B0(a)BN8PR14MB2820.namprd14.prod.outlook.com>
Content-Type: text/plain; charset="utf-8"
Thanks for the info regarding min_size in the crush rule - does this seem
like a bug to you then? Is anyone else able to reproduce this?
------------------------------
Date: Mon, 27 Jan 2020 16:45:52 +0000
From: "Smith, Eric" <Eric.Smith(a)ccur.com>
Subject: [ceph-users] Re: EC pool creation results in incorrect M
value?
To: "Smith, Eric" <Eric.Smith(a)ccur.com>om>, Paul Emmerich
<paul.emmerich(a)croit.io>
Cc: "ceph-users(a)ceph.io" <ceph-users(a)ceph.io>
Message-ID: <BN8PR14MB2820B1DF71DE395B7B5AA1ECEA0B0(a)BN8PR14MB2820.namprd14.prod.outlook.com>
Content-Type: text/plain; charset="utf-8"
OK I see this:
https://github.com/ceph/ceph/pull/8008
Perhaps it's just to be safe...
------------------------------
Date: Mon, 27 Jan 2020 21:10:10 +0100
From: Håkan T Johansson <f96hajo(a)chalmers.se>
Subject: [ceph-users] data loss on full file system?
To: <ceph-users(a)ceph.io>
Message-ID:
<alpine.DEB.2.20.2001272106210.7767(a)planck-o.fy.chalmers.se>
Content-Type: text/plain; format=flowed; charset="UTF-8"
Hi,
for test purposes, I have set up two 100 GB OSDs, one
taking a data pool and the other metadata pool for cephfs.
Am running 14.2.6-1-gffd69200ad-1 with packages from
https://mirror.croit.io/debian-nautilus
Am then running a program that creates a lot of 1 MiB files by calling
fopen()
fwrite()
fclose()
for each of them. Error codes are checked.
This works successfully for ~100 GB of data, and then, strangely, keeps succeeding for many more hundreds of GB of data... ??
All written files have size 1 MiB with 'ls', and thus should contain the data written. However, on inspection, the files written after the first ~100 GiB are full of just 0s (hexdump -C).
To further test this, I use the standard tool 'cp' to copy a few random-content files into the full cephfs filesystem. cp reports no complaints, and after the copy operations the content is seen with hexdump -C. However, after forcing the data out of cache on the client by reading other, earlier created files, hexdump -C shows all-0 content for the files copied with 'cp'. Data that was there is suddenly gone...?
I am new to ceph. Is there an option I have missed to avoid this behaviour?
(I could not find one in https://docs.ceph.com/docs/master/man/8/mount.ceph/ )
Is this behaviour related to https://docs.ceph.com/docs/mimic/cephfs/full/ ?
(That page states 'sometime after a write call has already returned 0'. But if write returns 0, then no data has been written, so the user program would not assume any kind of success.)
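(I guess I could also force a flush from the shell to see when the error is actually raised; the path below is just an example for my cephfs mount point:
dd if=/dev/urandom of=/mnt/cephfs/flushtest.bin bs=1M count=100 conv=fsync; echo "dd exit status: $?"
With conv=fsync, dd fsync()s the output file before exiting, so an ENOSPC that is only reported at flush time should show up as a non-zero exit status rather than an apparent success.)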
Best regards,
Håkan
------------------------------
Subject: Digest Footer
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
------------------------------
End of ceph-users Digest, Vol 84, Issue 44
******************************************