Hi all,
We're cancelling our Ceph tech talk today. In the meantime, check out
our archive and consider reaching out to me about giving your own talk
in the upcoming months.
https://ceph.com/ceph-tech-talks/
--
Mike Perez (thingee)
On Wed, Aug 21, 2019 at 9:33 PM Zaharo Bai (白战豪)-云数据中心集团
<baizhanhao01(a)inspur.com> wrote:
>
> I tested and reviewed the current migration process. If I read and write the new image during the migration process and then use migration_abort, the newly written data is lost. Do we have a solution to this problem?
That's a good point that we didn't consider. I've opened a tracker
ticket against the issue [1].
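For anyone following along, the live-migration flow under discussion is roughly the one below (a sketch only: the pool and image names are placeholders, and, as far as I understand, omitting the target spec in "prepare" keeps the migration in-place):

  rbd migration prepare mypool/src-image mypool/dst-image  # link the source to a new target image
  # clients are re-pointed at the target image here and can keep reading/writing during the copy
  rbd migration execute mypool/dst-image                   # deep-copy the data in the background
  rbd migration commit mypool/dst-image                    # finalize and remove the source
  # rbd migration abort mypool/dst-image                   # roll back to the source -- per the report
  #                                                        # above, writes made to the target are lost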
>
> -----Original Message-----
> From: Jason Dillaman [mailto:jdillama@redhat.com]
> Sent: August 22, 2019 8:38
> To: Zaharo Bai (白战豪)-云数据中心集团 <baizhanhao01(a)inspur.com>
> Cc: ceph-users <ceph-users(a)ceph.com>
> Subject: Re: About image migration
>
> On Wed, Aug 21, 2019 at 8:35 PM Zaharo Bai (白战豪)-云数据中心集团
>
> <baizhanhao01(a)inspur.com> wrote:
> >
> > So, what is the current usage scenario for our online migration? My understanding is that the community's current online migration is not truly online: the upper layer (iSCSI target, OpenStack, etc.) is still required to perform operations such as switching and data maintenance.
> > Is there a way to achieve fully online migration in the RBD layer, so that the upper application is unaware and only needs to call librbd's CLI or API? Or is the current approach necessary because doing otherwise would inevitably change the architecture of Ceph?
>
> If using RBD under a higher-level layer, the upper layers would need to know about the migration to update their internal data structures to point to the correct (new) image. The only case where this really isn't necessary is when live-migrating an image "in-place" (i.e. you keep it in the same pool w/ the same name).
>
> > -----Original Message-----
> > From: Jason Dillaman [mailto:jdillama@redhat.com]
> > Sent: August 21, 2019 20:44
> > To: Zaharo Bai (白战豪)-云数据中心集团 <baizhanhao01(a)inspur.com>
> > Cc: ceph-users <ceph-users(a)ceph.com>
> > Subject: Re: About image migration
> >
> > On Tue, Aug 20, 2019 at 10:04 PM Zaharo Bai (白战豪)-云数据中心集团
> > <baizhanhao01(a)inspur.com> wrote:
> > >
> > > Hi jason:
> > >
> > > I have a question I would like to ask you: has the current image migration been adapted for OpenStack? According to my understanding, OpenStack's existing live-migration logic is implemented in Cinder, which simply calls the librbd rbd_read/write API to migrate the data.
> >
> > I believe the existing Cinder volume block live-migration is just a wrapper around QEMU's block live-migration functionality, so indeed it would just be a series of RBD read/write API calls between two volumes. The built-in RBD live-migration is similar but it copies snapshots and preserves image sparseness during the migration process.
> > Because Cinder is using the QEMU block live-migration functionality, it's not tied into RBD's live-migration.
> >
> > >
> > >
> > >
> > > Best wishes
> > >
> > > Zaharo
> >
> >
> >
> > --
> > Jason
>
>
>
> --
> Jason
[1] https://tracker.ceph.com/issues/41394
--
Jason
Hi
Yesterday one of our customers came to us with a strange request. He asked us to use a SAN as the Ceph storage space, i.e. to add the SAN storage he currently has to the cluster and reduce the cost of purchasing other disks.
Does anybody know whether we can do this or not? And if it is possible, how should we start to architect this strange Ceph setup? Is it a good idea or not?
Thanks for your help.
Mohsen Mottaghi
Hi everyone,
A few days ago I reduced the number of PGs on a small pool. The cluster
runs 14.2.2; it was upgraded from Jewel to 14.2.1 and then to 14.2.2. I did
a ceph-bluestore-tool repair on all OSDs to update the statistics.
Today I got a scrub error reporting:
4.3 scrub : stat mismatch, got 68/68 objects, 0/0 clones, 68/68 dirty,
2/2 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 whiteouts,
545259538/511705106 bytes, 0/0 manifest objects, 0/0
hit_set_archive bytes.
This message appears on the primary OSD only.
rados list-inconsistent-obj 4.3
{"epoch":228717,"inconsistents":[]}
What can I do to find the reason for the scrub error? Is this a hardware
failure or a bug? Is it safe to do a pg repair?
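For reference, what I would try next is roughly the following (a sketch only, assuming a repair is actually deemed safe for this PG):

  ceph pg deep-scrub 4.3                                # re-run the deep scrub and see if the mismatch persists
  rados list-inconsistent-obj 4.3 --format=json-pretty  # check again for inconsistent objects
  ceph pg repair 4.3                                    # only once the repair is confirmed to be safe here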
Thanks for your help.
Daniel
--
Daniel Schreiber
Facharbeitsgruppe Systemsoftware
Universitaetsrechenzentrum
Technische Universität Chemnitz
Straße der Nationen 62 (Raum B303)
09111 Chemnitz
Germany
Tel: +49 371 531 35444
Fax: +49 371 531 835444
Hello,
I have set up two separate Ceph clusters, each with an RGW instance, and I am trying to achieve multisite data synchronization. The primary runs 13.2.5 and the slave runs 14.2.2 (I upgraded the slave side from 14.2.1 because of the known data corruption during transfer caused by curl errors). I emptied the slave zone and allowed the sync to run from beginning to end. Then I recalculated MD5 hashes over the original data and over the data in the slave zone and found that in some cases they do not match. Data corruption is evident.
A byte-for-byte comparison shows that some parts of the data simply got moved around (for example, a file has the correct bytes from 0 to 513k and then exactly the same bytes repeated from 513k up to 1026k - it feels like some sort of buffer issue). The file size is correct. I have read the RGW sources and could not find anything that would cause this kind of behavior; however, I also could not find one piece of code that I would consider critical: during FetchRemote, RGW obtains the object's etag from the remote RGW, but apparently nothing recalculates the MD5 from the actual data and compares it to the received etag to ensure that the data was transferred correctly. Even broken data is then stored to the local cluster and into the bucket index, and nothing further prevents the wrong data from reaching an end user downloading from the slave zone.
Such a check should happen regardless of the sync backend in use: objects with broken data should not be stored in the bucket index at all. No one needs broken data, and it is better to simply fail the object sync so that it can be retried later, instead of storing faulty data and then allowing end users to download it.
Is it just me who cannot see such a check, or is it actually missing? If it is not there at all, I think it should be quite high on the TODO list.
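For reference, this is roughly how I compared the two copies (the endpoint URLs and the bucket/object names below are placeholders for my actual setup):

  aws --endpoint-url http://primary-rgw:8080 s3 cp s3://mybucket/myobject - | md5sum    # hash of the primary zone copy
  aws --endpoint-url http://slave-rgw:8080 s3 cp s3://mybucket/myobject - | md5sum      # hash of the slave zone copy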
Regards,
Vladimir
Oh no, it's not that bad. It's
$ ping -s 65000 dest.inati.on
on a VPN connection that has an MTU of 1300 via IPv6. So I suspect that I
only get an answer when all 51 fragments are fully returned. It's clear
that big packets with lots of fragments are more affected by packet loss
than 64-byte pings.
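(For completeness, the comparison I am running is simply the two packet sizes side by side, using the same placeholder host as above:)

  ping -6 -c 100 -s 64 dest.inati.on      # unfragmented baseline
  ping -6 -c 100 -s 65000 dest.inati.on   # roughly 51 fragments at an MTU of 1300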
I just (at 9 o'clock in the morning) repeated this ping test and got
hardly any drops (less than 1%), even with the size of 64k. So it's
really dependent on the time of the day. Seems like some ISPs are
dropping some packets, especially in the evening...
A few minutes ago I restarted all down-marked OSDs, but they are getting
marked down again... It seems that Ceph is tolerant of packet loss
(it surely affects performance, but that is irrelevant for me).
Could erasure coded pools pose some problems?
Thank you all for every hint!
Lorenz
On 15.08.19 at 08:51, Janne Johansson wrote:
> On Wed, 14 Aug 2019 at 17:46, Lorenz Kiefner
> <root+cephusers(a)deinadmin.de> wrote:
>
> Is ceph sensitive to packet loss? On some VPN links I have up to 20%
> packet loss on 64k packets but less than 3% on 5k packets in the
> evenings.
>
>
> 20% seems crazy high, there must be something really wrong there.
>
> At 20%, you would get tons of packet timeouts to wait for on all those
> lost frames,
> then resends of (at least!) those 20% extra, which in turn would lead
> to 20% of those
> resends getting lost, all while the main streams of data try to move
> forward when some
> older packets do get over. This is a really bad situation to design for.
>
> I think you should look for a link solution that doesn't drop that
> many packets, instead of changing
> the software you try to run over that link, all others will notice
> this too and act badly in some way or other.
>
> Heck, 20% is like taking a math schoolbook and remove all instances of
> "3" and "8" and see if kids can learn to count from it. 8-/
>
> --
> May the most significant bit of your life be positive.
On Mon, Aug 12, 2019 at 10:03 PM yangjun(a)cmss.chinamobile.com
<yangjun(a)cmss.chinamobile.com> wrote:
>
> Hi Jason,
>
> I was recently testing the RBD mirror feature (Ceph 12.2.8). My test environment is a single-node cluster, which includes 10 x 3 TB HDD OSDs + an 800 GB PCIe SSD + BlueStore, and the WAL and DB partitions of each OSD are 30 GB.
> The test result of a 100G image is as follows:
>
>            disable journal   enable journal   decline percentage
> iops:      1000              877              12.3%
> bw:        402 MB/s          129 MB/s         67%
>
>
> Why does the bandwidth decline so much after enabling the journal on the RBD image? I would really appreciate it if you could give me some suggestions for optimization. Thank you very much.
The use of the journal requires first writing to the journal and, once
committed, writing to the image (i.e. doubling the latency).
Therefore, the expected worst-case performance should be around 2x
slower [1]. There was a recent bug fix [2] in the master branch that
will be backported to older releases which greatly increases small IO
journal performance -- since it was nearly 10x slower due to the bug
instead of the expected 2x [3].
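If you want to isolate the journal overhead yourself, a comparison along these lines should reproduce the gap (a sketch only -- the pool/image names are placeholders, and it assumes the image is not actively being mirrored while you toggle the feature):

  rbd feature disable mypool/myimage journaling
  rbd bench --io-type write --io-size 4K --io-total 1G mypool/myimage   # baseline, no journal
  rbd feature enable mypool/myimage journaling
  rbd bench --io-type write --io-size 4K --io-total 1G mypool/myimage   # same workload with journaling enabled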
> ________________________________
> yangjun(a)cmss.chinamobile.com
[1] https://www.slideshare.net/JasonDillaman/disaster-recovery-and-ceph-block-s…
[2] https://tracker.ceph.com/issues/40072
[3] https://youtu.be/ZifNGprBUTA?t=1687
--
Jason
Since you have already deleted the stack, this is moot. You can simply
delete the volumes from the pool using rbd.
The proper way is to delete the volumes before destroying the stack.
If the stack is alive and you have issues deleting, you can take two
approaches (see the sketch below):
1) Run openstack volume delete with --debug to see what happens.
2) Use rbd status pool/volume to see whether there is a watcher on the
volume. If there is, the instance has not been removed, or not removed
properly.
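For example, something along these lines (the "volumes" pool name and the UUID are placeholders for whatever your deployment uses):

  openstack --debug volume delete <volume-uuid>    # step 1: watch the API calls and Cinder's response
  rbd status volumes/volume-<volume-uuid>          # step 2: any watchers still attached to the RBD image?
  rbd rm volumes/volume-<volume-uuid>              # last resort, once the stack is gone and no watchers remain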
Thanks,
Suresh
On Fri, Aug 16, 2019, 8:21 PM Nerurkar, Ruchir (Nokia - US/Mountain View) <
ruchir.nerurkar(a)nokia.com> wrote:
> Hello,
>
>
>
> So I am able to install Ceph on CentOS 7.4 and I can successfully
> integrate my Openstack testbed with it.
>
>
>
> However, I have recently been facing an issue where, after deleting a stack, my
> Cinder volumes are not getting deleted and are getting stuck.
>
>
>
> Any idea on this issue?
>
>
>
> Best Regards,
>
> Ruchir Nerurkar
>
> 857-701-3405
>
>
>
> From: Götz Reinicke <goetz.reinicke(a)filmakademie.de>
> Sent: Monday, August 12, 2019 1:17 PM
> To: Nerurkar, Ruchir (Nokia - US/Mountain View) <
> ruchir.nerurkar(a)nokia.com>
> Cc: ceph-users(a)ceph.io
> Subject: Re: [ceph-users] Request to guide on ceph-deploy install
> command for luminous 12.2.12 release
>
>
> Hi,
>
>
>
> On 07.08.2019 at 20:20, Nerurkar, Ruchir (Nokia - US/Mountain View)
> <ruchir.nerurkar(a)nokia.com> wrote:
>
>
>
> Hello,
>
>
>
> I work at Nokia as a software QA engineer and I am trying to install Ceph
> on CentOS 7.4.
>
>
>
> Do you have to stick at 7.4?
>
>
>
>
>
> But I am getting this error with the following output: -
>
>
>
[mvmgoscephcontrollerluminous][WARNIN] Error: Package:
> 2:ceph-common-12.2.12-0.el7.x86_64 (Ceph)
>
> [mvmgoscephcontrollerluminous][WARNIN] Requires:
> liblz4.so.1()(64bit)
>
> [mvmgoscephcontrollerluminous][WARNIN] Error: Package:
> 2:ceph-osd-12.2.12-0.el7.x86_64 (Ceph)
>
> [mvmgoscephcontrollerluminous][WARNIN] Requires:
> liblz4.so.1()(64bit)
>
> [mvmgoscephcontrollerluminous][WARNIN] Error: Package:
> 2:ceph-base-12.2.12-0.el7.x86_64 (Ceph)
>
> [mvmgoscephcontrollerluminous][WARNIN] Requires:
> gperftools-libs >= 2.6.1
>
> [mvmgoscephcontrollerluminous][WARNIN] Available:
> gperftools-libs-2.4-8.el7.i686 (base)
>
> [mvmgoscephcontrollerluminous][DEBUG ] You could try using --skip-broken
> to work around the problem
>
> [mvmgoscephcontrollerluminous][WARNIN] gperftools-libs =
> 2.4-8.el7
>
> [mvmgoscephcontrollerluminous][WARNIN] Error: Package:
> 2:ceph-mon-12.2.12-0.el7.x86_64 (Ceph)
>
> [mvmgoscephcontrollerluminous][WARNIN] Requires:
> liblz4.so.1()(64bit)
>
> [mvmgoscephcontrollerluminous][WARNIN] Error: Package:
> 2:ceph-base-12.2.12-0.el7.x86_64 (Ceph)
>
> [mvmgoscephcontrollerluminous][WARNIN] Requires:
> liblz4.so.1()(64bit)
>
> [mvmgoscephcontrollerluminous][DEBUG ] You could try running: rpm -Va
> --nofiles --nodigest
>
> [mvmgoscephcontrollerluminous][ERROR ] RuntimeError: command returned
> non-zero exit status: 1
>
> [ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y
> install ceph ceph-radosgw
>
>
>
> Does anyone know about this issue, i.e. how I can resolve these package
> dependencies?
>
>
>
> My first shot: update CentOS to at least 7.5 (which
> includes gperftools-libs-2.6.1).
>
>
>
> And install lz4 too.
>
>
>
> Regards . Götz
>
>
>
>
>
>
>
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>