Say one is forced to move a production cluster (4 nodes) to a different
datacenter. What options do I have, other than just turning it off at
the old location and turning it on at the new location?
Maybe buy some extra nodes and move one node at a time?
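For the node-at-a-time idea, the per-node sequence I have in mind is roughly the sketch below (assuming a Luminous-or-later cluster so that "ceph osd purge" exists; the OSD id is just a placeholder):

  # after a replacement node has been added and backfilled at the new site,
  # drain one OSD at a time on the node that is about to be moved
  ceph osd crush reweight osd.12 0              # empty this OSD onto the rest of the cluster
  ceph -s                                       # wait until all PGs are active+clean again
  ceph osd out osd.12
  ceph osd purge osd.12 --yes-i-really-mean-it  # remove it from the CRUSH and OSD maps
  # then power the node down, ship it, and re-deploy it at the new datacenter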
We are looking to roll out an all-flash Ceph cluster as storage for our cloud solution. The OSDs will be on slightly slower Micron 5300 PROs, with WAL/DB on Micron 7300 MAX NVMe drives.
My main concern about whether Ceph can fit the bill is its snapshot capabilities.
For each RBD we would like the following snapshots:
8x 30-minute snapshots (covering the latest 4 hours)
With our current solution (HPE Nimble) we simply pause all write IO on the 10-minute mark for roughly 2 seconds and then take a snapshot of the entire Nimble volume. Each VM within the Nimble volume sits on a Linux logical volume, so it's easy for us to take one big snapshot and only get access to a specific client's data.
Are there any options for automating snapshot management/retention within Ceph besides some bash scripts? Is there any way to take snapshots of all RBDs within a pool at a given point in time?
Is anyone successfully running with this many snapshots? If anyone is running a similar setup, I would love to hear how you're doing it.
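For context, the kind of thing we would otherwise script ourselves is a cron job along these lines; this is only a sketch (the pool name and the "auto-" prefix are made up), and a per-image loop like this does not give crash consistency across images the way our one big Nimble volume snapshot does:

  #!/bin/bash
  # snapshot every RBD image in a pool, keep only the newest $KEEP "auto-" snapshots per image
  POOL=cloud-rbd        # placeholder pool name
  KEEP=8                # 8 x 30-minute snapshots = the latest 4 hours
  STAMP=$(date +%Y%m%d-%H%M)

  for IMG in $(rbd ls "$POOL"); do
      rbd snap create "$POOL/$IMG@auto-$STAMP"
      # snapshot names are timestamped, so a lexical sort is oldest-first;
      # drop everything except the newest $KEEP
      rbd snap ls "$POOL/$IMG" | awk '$2 ~ /^auto-/ {print $2}' | sort | head -n -"$KEEP" |
          while read -r SNAP; do
              rbd snap rm "$POOL/$IMG@$SNAP"
          done
  done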
Hello,
We have a medium-sized Ceph Luminous cluster that, up until now, has been the
RBD image backend solely for an OpenStack Newton cluster that's marked for
upgrade to Stein later this year.
Recently, however, we deployed a brand new Stein cluster, and I'm curious
whether pointing the new OpenStack cluster at the same Cinder/Glance/Nova
RBD pools on the Luminous cluster would be considered bad practice, or even
potentially dangerous.
One argument for doing it may be that multiple Cinder/Glance/Nova pools
serving disparate groups of clients would come at a PG cost to the cluster,
though the separation into multiple, distinct pools also has its advantages.
The UUIDs generated for RBD images in the pools by the OpenStack services
*should*, in theory, be unique and collision-free between the two OpenStack
clusters.
One other point I was curious about was RBD image feature sets; the Stein
Ceph clients will be running later versions of the Ceph libraries than the
Newton clients. If the two sets of clients were to share pools (with neither
set needing to share RBD images within the pools, only the pools themselves),
would having images with different feature lists in the same pool cause
problems?
--
*******************
Paul Browne
Research Computing Platforms
University Information Services
Roger Needham Building
JJ Thompson Avenue
University of Cambridge
Cambridge
United Kingdom
E-Mail: pfb29(a)cam.ac.uk
Tel: 0044-1223-746548
*******************
Hi Ceph Community (I'm new here :),
I'm learning Ceph in a virtual environment with Vagrant/VirtualBox (I understand
this is far from a real environment in several ways, mainly performance,
but I'm OK with that at this point :)
I have 3 nodes, and after a few *vagrant halt/up* cycles, when I do *ceph -s* I get
the following message:
[vagrant@ceph-node1 ~]$ sudo ceph -s
  cluster:
    id:     7f8cb5f0-1989-4ab1-8fb9-d5c08aa96658
    health: HEALTH_WARN
            Reduced data availability: 512 pgs inactive
            4 slow ops, oldest one blocked for 1576 sec, daemons [osd.6,osd.7,osd.8] have slow ops.
  services:
    mon: 3 daemons, quorum ceph-node1,ceph-node2,ceph-node3 (age 7m)
    mgr: ceph-node1(active, since 26m), standbys: ceph-node2, ceph-node3
    osd: 9 osds: 9 up (since 25m), 9 in (since 2d)
  data:
    pools:   1 pools, 512 pgs
    objects: 0 objects, 0 B
    usage:   9.1 GiB used, 162 GiB / 171 GiB avail
    pgs:     100.000% pgs unknown
             512 unknown
Here is the output of *ceph health detail*:
[vagrant@ceph-node1 ~]$ sudo ceph health detail
HEALTH_WARN Reduced data availability: 512 pgs inactive; 4 slow ops, oldest one blocked for 1810 sec, daemons [osd.6,osd.7,osd.8] have slow ops.
PG_AVAILABILITY Reduced data availability: 512 pgs inactive
    pg 2.1cd is stuck inactive for 1815.881027, current state unknown, last acting []
    pg 2.1ce is stuck inactive for 1815.881027, current state unknown, last acting []
    pg 2.1cf is stuck inactive for 1815.881027, current state unknown, last acting []
    [... pgs 2.1d0 through 2.1ff all report the same "stuck inactive, current state unknown, last acting []" message ...]
SLOW_OPS 4 slow ops, oldest one blocked for 1810 sec, daemons [osd.6,osd.7,osd.8] have slow ops.
Do you have any guidance on how to proceed with this? I'm trying to
understand why the cluster is HEALTH_WARN and what I need to do in order to
make it healthy again.
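So far the only next steps I can think of are along these lines; the mgr restart at the end is just a guess on my part, based on every PG showing as "unknown" (which I understand can mean the mgr has no PG stats), so please correct me if that's the wrong direction:

  ceph osd tree                                # check the OSDs are up and under the expected hosts
  ceph pg dump_stuck inactive                  # list the stuck PGs
  ceph pg 2.1cd query                          # ask one PG directly (may hang if no OSD reports it)
  sudo systemctl restart ceph-mgr@ceph-node1   # restart the active mgr on node1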
Thanks!
--
Ignacio Ocampo
All,
I've just spent a significant amount of time unsuccessfully chasing the
_read_fsid unparsable uuid error on Debian 10 / Nautilus 14.2.6. Since
this is a brand-new cluster, last night I gave up and moved back to
Debian 9 / Luminous 12.2.11. In both cases I'm using the packages from
Debian Backports, with ceph-ansible as my deployment tool.
Note that above I said 'the _read_fsid unparsable uuid' error. I've
searched around a bit and found some previously reported issues, but I
did not see any conclusive resolutions.
I would like to get to Nautilus as quickly as possible, so I'd gladly
provide additional information to help track down the cause of this
symptom. I can confirm that, looking at ceph-volume.log on the OSD
host, I see no difference between the ceph-volume lvm batch commands
generated by the ceph-ansible versions associated with these two Ceph
releases:
ceph-volume --cluster ceph lvm batch --bluestore --yes
--block-db-size 133358734540 /dev/sdc /dev/sdd /dev/sde /dev/sdf
/dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/nvme0n1
Note that I'm using --block-db-size to divide my NVMe into 12 segments
as I have 4 empty drive bays on my OSD servers that I may eventually be
able to fill.
My OSD hardware is:
Disk /dev/nvme0n1: 1.5 TiB, 1600321314816 bytes, 3125627568 sectors
Disk /dev/sdc: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdd: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sde: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdf: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdg: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdh: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdi: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
Disk /dev/sdj: 10.9 TiB, 12000138625024 bytes, 23437770752 sectors
I'd send the output of ceph-volume inventory on Luminous, but I'm
getting -->: KeyError: 'human_readable_size'.
Please let me know if I can provide any further information.
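One thing I plan to try before the next Nautilus attempt, in case leftover LVM/partition metadata from the earlier runs is what trips up _read_fsid (just a guess on my part), is zapping the devices completely before re-running the batch command:

  # wipe any previous LVM volumes / partition data before re-running ceph-volume lvm batch
  for dev in /dev/sd{c..j} /dev/nvme0n1; do
      ceph-volume lvm zap --destroy "$dev"
  done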
Thanks.
-Dave
--
Dave Hall
Binghamton University
Yes, but we are offering our RBD volumes in another cloud product, which lets them migrate their volumes to OpenStack when they want.
On 29 Jan 2020, at 18:38, Matthew H <matthew.heler(a)hotmail.com> wrote:
You should have used separate pool name schemes for each OpenStack cluster.
________________________________
From: tdados(a)hotmail.com <tdados(a)hotmail.com>
Sent: Wednesday, January 29, 2020 12:29 PM
To: ceph-users(a)ceph.io <ceph-users(a)ceph.io>
Subject: [ceph-users] Re: Servicing multiple OpenStack clusters from the same Ceph cluster
Hello,
We have recently deployed exactly that and it's working fine. We deployed different keys for the different OpenStack clusters, of course, and they are using the same cinder/nova/glance pools.
The only risk is that a client from one OpenStack cluster creates a volume and the generated ID ends up being the same as that of an existing volume from the other OpenStack cluster. But that's a probability of something like 1 in 5 billion.
We took the risk.
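For what it's worth, the per-cluster keys on our side look roughly like the sketch below (the client and pool names are illustrative, following the usual OpenStack/RBD cap profiles):

  # one cinder key per OpenStack cluster, both pointed at the same shared pools
  ceph auth get-or-create client.cinder-newton mon 'profile rbd' \
      osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images'
  ceph auth get-or-create client.cinder-stein mon 'profile rbd' \
      osd 'profile rbd pool=volumes, profile rbd pool=vms, profile rbd-read-only pool=images'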
Regards
Hello Igor,
I updated all servers to the latest 4.19.97 kernel, but this doesn't fix the
situation.
I can provide you with all those logs - any idea where to upload them / how
to send them to you?
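In the meantime we keep sampling the read-retry counter you suggested; for reference, the quick per-node check we run looks like this (assuming the default admin socket paths):

  # report the BlueStore read-retry counter for every OSD on this host
  for sock in /var/run/ceph/ceph-osd.*.asok; do
      echo -n "$sock: "
      ceph daemon "$sock" perf dump | grep -o '"bluestore_reads_with_retries": [0-9]*'
  done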
Greets,
Stefan
Am 20.01.20 um 13:12 schrieb Igor Fedotov:
> Hi Stefan,
>
> these lines are the result of a transaction dump performed on a failure
> during transaction submission (which is shown as
>
> "submit_transaction error: Corruption: block checksum mismatch code = 2")
>
> Most probably they are of no interest (checksum errors are unlikely to
> be caused by transaction content), and hence we need earlier stuff to
> learn what caused that checksum mismatch.
>
> It's hard to give any formal overview of what you should look for; from
> my troubleshooting experience, generally one may try to find:
>
> - some previous error/warning indications (e.g. allocation, disk access,
> etc)
>
> - prior OSD crashes (sometimes they might have different causes/stack
> traces/assertion messages)
>
> - any timeout or retry indications
>
> - any uncommon log patterns which aren't present during regular running
> but happen each time before the crash/failure.
>
> Anyway I think the inspection depth should be much(?) deeper than
> presumably it is (from what I can see from your log snippets).
>
> Ceph keeps the last 10000 log events at an increased log level and dumps
> them on crash, with a negative index starting at -9999 up to -1 as a prefix.
>
> -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
>
>
> It would be great if you could share several log snippets for different
> crashes containing these last 10000 lines.
>
>
> Thanks,
>
> Igor
>
>
> On 1/19/2020 9:42 PM, Stefan Priebe - Profihost AG wrote:
>> Hello Igor,
>>
>> there's absolutely nothing in the logs before.
>>
>> What do those lines mean:
>> Put( Prefix = O key =
>> 0x7f8000000000000001cc45c881217262'd_data.4303206b8b4567.0000000000009632!='0xfffffffffffffffeffffffffffffffff6f00120000'x'
>>
>> Value size = 480)
>> Put( Prefix = O key =
>> 0x7f8000000000000001cc45c881217262'd_data.4303206b8b4567.0000000000009632!='0xfffffffffffffffeffffffffffffffff'o'
>>
>> Value size = 510)
>>
>> on the right side I always see 0xfffffffffffffffeffffffffffffffff on all
>> failed OSDs.
>>
>> greets,
>> Stefan
>> Am 19.01.20 um 14:07 schrieb Stefan Priebe - Profihost AG:
>>> Yes, except that this happens on 8 different clusters with different
>>> hw but same ceph version and same kernel version.
>>>
>>> Greets,
>>> Stefan
>>>
>>>> Am 19.01.2020 um 11:53 schrieb Igor Fedotov <ifedotov(a)suse.de>:
>>>>
>>>> So the intermediate summary is:
>>>>
>>>> Any OSD in the cluster can experience an interim RocksDB checksum
>>>> failure, which isn't present after an OSD restart.
>>>>
>>>> No HW issues observed, no persistent artifacts (except OSD log)
>>>> afterwards.
>>>>
>>>> And looks like the issue is rather specific to the cluster as no
>>>> similar reports from other users seem to be present.
>>>>
>>>>
>>>> Sorry, I'm out of ideas other than to collect all the failure logs and
>>>> try to find something common in them. Maybe this will shed some
>>>> light...
>>>>
>>>> BTW, from my experience it might make sense to inspect the OSD log prior
>>>> to the failure (any error messages and/or prior restarts, etc.); sometimes
>>>> this might provide some hints.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>>> On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote:
>>>>> HI Igor,
>>>>>
>>>>>> Am 17.01.20 um 12:10 schrieb Igor Fedotov:
>>>>>> hmmm..
>>>>>>
>>>>>> Just in case - suggest to check H/W errors with dmesg.
>>>>> this happens on around 80 nodes - I don't expect all of those to have
>>>>> unidentified HW errors. Also, all of them are monitored - no dmesg output
>>>>> contains any errors.
>>>>>
>>>>>> Also there are some (not very much though) chances this is another
>>>>>> incarnation of the following bug:
>>>>>> https://tracker.ceph.com/issues/22464
>>>>>> https://github.com/ceph/ceph/pull/24649
>>>>>>
>>>>>> The corresponding PR works around it for main device reads (user data
>>>>>> only!) but theoretically it might still happen
>>>>>>
>>>>>> either for DB device or DB data at main device.
>>>>>>
>>>>>> Can you observe any BlueFS spillovers? Is there any correlation
>>>>>> between failing OSDs and spillover presence, e.g. do failing OSDs always
>>>>>> have a spillover, while OSDs without spillovers never face the
>>>>>> issue...
>>>>>>
>>>>>> To validate this hypothesis one can try to monitor/check (e.g. once a
>>>>>> day for a week or something) "bluestore_reads_with_retries"
>>>>>> counter over
>>>>>> OSDs to learn if the issue is happening
>>>>>>
>>>>>> in the system. Non-zero values mean it's there for user data/main
>>>>>> device and hence is likely to happen for DB ones as well (which
>>>>>> doesn't
>>>>>> have any workaround yet).
>>>>> OK i checked bluestore_reads_with_retries on 360 osds but all of
>>>>> them say 0.
>>>>>
>>>>>
>>>>>> Additionally you might want to monitor memory usage, as the
>>>>>> above-mentioned PR denotes high memory pressure as a potential trigger
>>>>>> for these read errors. So if such pressure happens, the hypothesis
>>>>>> becomes more valid.
>>>>> we already do this heavily and have around 10GB of memory per OSD.
>>>>> Also
>>>>> none of those machines show any IO pressure at all.
>>>>>
>>>>> All hosts show a constant rate of around 38GB to 45GB mem available in
>>>>> /proc/meminfo.
>>>>>
>>>>> Stefan
>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Igor
>>>>>>
>>>>>> PS. Everything above is rather speculation for now. The available
>>>>>> information is definitely not enough for extensively troubleshooting
>>>>>> cases which happen that rarely.
>>>>>>
>>>>>> You might want to start collecting failure-related information
>>>>>> (including but not limited to failure logs, perf counter dumps,
>>>>>> system
>>>>>> resource reports etc) for future analysis.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 1/16/2020 11:58 PM, Stefan Priebe - Profihost AG wrote:
>>>>>>> Hi Igor,
>>>>>>>
>>>>>>> answers inline.
>>>>>>>
>>>>>>> Am 16.01.20 um 21:34 schrieb Igor Fedotov:
>>>>>>>> you may want to run fsck against failing OSDs. Hopefully it will
>>>>>>>> shed
>>>>>>>> some light.
>>>>>>> fsck just says everything fine:
>>>>>>>
>>>>>>> # ceph-bluestore-tool --command fsck --path
>>>>>>> /var/lib/ceph/osd/ceph-27/
>>>>>>> fsck success
>>>>>>>
>>>>>>>
>>>>>>>> Also wondering if OSD is able to recover (startup and proceed
>>>>>>>> working)
>>>>>>>> after facing the issue?
>>>>>>> no recover needed. It just runs forever after restarting.
>>>>>>>
>>>>>>>> If so, do you have any that failed multiple times? Do you have logs
>>>>>>>> for these occurrences?
>>>>>>> Maybe, but there are most probably weeks or months between those
>>>>>>> failures - most probably the logs are already deleted.
>>>>>>>
>>>>>>>> Also please note that the patch you mentioned doesn't fix previous
>>>>>>>> issues (i.e. duplicate allocations); it only prevents new ones.
>>>>>>>>
>>>>>>>> But fsck should show them if any...
>>>>>>> None showed.
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Igor
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 1/16/2020 10:04 PM, Stefan Priebe - Profihost AG wrote:
>>>>>>>>> Hi Igor,
>>>>>>>>>
>>>>>>>>> ouch sorry. Here we go:
>>>>>>>>>
>>>>>>>>> -1> 2020-01-16 01:10:13.404090 7f3350a14700 -1 rocksdb:
>>>>>>>>> submit_transaction error: Corruption: block checksum mismatch
>>>>>>>>> code = 2
>>>>>>>>> Rocksdb transaction:
>>>>>>>>> Put( Prefix = M key =
>>>>>>>>> 0x0000000000000402'.OBJ_0000000000000002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e0000000000001f2e..'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 97)
>>>>>>>>> Put( Prefix = M key =
>>>>>>>>> 0x0000000000000402'.MAP_00000000000BB85C_0000000000000002.953BFD0A.bb85c.rbd%udata%e3e8eac6b8b4567%e0000000000001f2e..'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 93)
>>>>>>>>> Put( Prefix = M key =
>>>>>>>>> 0x0000000000000916'.0000823257.00000000000073922044' Value size
>>>>>>>>> = 196)
>>>>>>>>> Put( Prefix = M key =
>>>>>>>>> 0x0000000000000916'.0000823257.00000000000073922045' Value size
>>>>>>>>> = 184)
>>>>>>>>> Put( Prefix = M key = 0x0000000000000916'._info' Value size = 899)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00000000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 418)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00030000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 474)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f0007c000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 392)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00090000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 317)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f000a0000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 521)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f000f4000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 558)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00130000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 649)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00194000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 449)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f001cc000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 580)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00200000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 435)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00240000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 569)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00290000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 465)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f002e0000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 710)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f00300000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 599)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f0036c000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 372)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f003a6000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 130)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f003b4000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 540)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff6f003fc000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 47)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0x00000000000bb85cffffffffffffffff'o'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 1731)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0xfffffffffffffffeffffffffffffffff6f00040000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 675)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0xfffffffffffffffeffffffffffffffff6f00080000'x'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 395)
>>>>>>>>> Put( Prefix = O key =
>>>>>>>>> 0x7f80000000000000029acdfb05217262'd_data.3e8eac6b8b4567.0000000000001f2e!='0xfffffffffffffffeffffffffffffffff'o'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Value size = 1328)
>>>>>>>>> Put( Prefix = X key = 0x0000000018a38deb Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x0000000018a38dea Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a035b Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a035c Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0355 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0356 Value size = 17)
>>>>>>>>> Put( Prefix = X key = 0x000000001a54f6e4 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000001b1c061e Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a038f Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0389 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0358 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a035f Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0357 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0387 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a038a Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0388 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x00000000134c3fbe Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x00000000134c3fb5 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a036e Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a036d Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x00000000134c3fb8 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a0371 Value size = 14)
>>>>>>>>> Put( Prefix = X key = 0x000000000d7a036a Value size = 14)
>>>>>>>>> 0> 2020-01-16 01:10:13.413759 7f3350a14700 -1
>>>>>>>>> /build/ceph/src/os/bluestore/BlueStore.cc: In function 'void
>>>>>>>>> BlueStore::_kv_sync_thread()' thread 7f3350a14700 time 2020-01-16
>>>>>>>>> 01:10:13.404113
>>>>>>>>> /build/ceph/src/os/bluestore/BlueStore.cc: 8808: FAILED
>>>>>>>>> assert(r == 0)
>>>>>>>>>
>>>>>>>>> ceph version 12.2.12-11-gd3eae83543
>>>>>>>>> (d3eae83543bffc0fc6c43823feb637fa851b6213) luminous (stable)
>>>>>>>>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int,
>>>>>>>>> char
>>>>>>>>> const*)+0x102) [0x55c9a712d232]
>>>>>>>>> 2: (BlueStore::_kv_sync_thread()+0x24c5) [0x55c9a6fb54b5]
>>>>>>>>> 3: (BlueStore::KVSyncThread::entry()+0xd) [0x55c9a6ff608d]
>>>>>>>>> 4: (()+0x7494) [0x7f33615f9494]
>>>>>>>>> 5: (clone()+0x3f) [0x7f3360680acf]
>>>>>>>>>
>>>>>>>>> I already picked those:
>>>>>>>>> https://github.com/ceph/ceph/pull/28644
>>>>>>>>>
>>>>>>>>> Greets,
>>>>>>>>> Stefan
>>>>>>>>> Am 16.01.20 um 17:00 schrieb Igor Fedotov:
>>>>>>>>>> Hi Stefan,
>>>>>>>>>>
>>>>>>>>>> would you please share a log snippet prior to the assertions? Looks
>>>>>>>>>> like RocksDB is failing during transaction submission...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Igor
>>>>>>>>>>
>>>>>>>>>> On 1/16/2020 11:56 AM, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> does anybody know a fix for this ASSERT / crash?
>>>>>>>>>>>
>>>>>>>>>>> 2020-01-16 02:02:31.316394 7f8c3f5ab700 -1
>>>>>>>>>>> /build/ceph/src/os/bluestore/BlueStore.cc: In function 'void
>>>>>>>>>>> BlueStore::_kv_sync_thread()' thread 7f8c3f5ab700 time
>>>>>>>>>>> 2020-01-16
>>>>>>>>>>> 02:02:31.304993
>>>>>>>>>>> /build/ceph/src/os/bluestore/BlueStore.cc: 8808: FAILED assert(r
>>>>>>>>>>> == 0)
>>>>>>>>>>>
>>>>>>>>>>> ceph version 12.2.12-11-gd3eae83543
>>>>>>>>>>> (d3eae83543bffc0fc6c43823feb637fa851b6213) luminous (stable)
>>>>>>>>>>> 1: (ceph::__ceph_assert_fail(char const*, char const*,
>>>>>>>>>>> int, char
>>>>>>>>>>> const*)+0x102) [0x55e6df9d9232]
>>>>>>>>>>> 2: (BlueStore::_kv_sync_thread()+0x24c5) [0x55e6df8614b5]
>>>>>>>>>>> 3: (BlueStore::KVSyncThread::entry()+0xd) [0x55e6df8a208d]
>>>>>>>>>>> 4: (()+0x7494) [0x7f8c50190494]
>>>>>>>>>>> 5: (clone()+0x3f) [0x7f8c4f217acf]
>>>>>>>>>>>
>>>>>>>>>>> all bluestore OSDs are randomly crashing sometimes (once a
>>>>>>>>>>> week).
>>>>>>>>>>>
>>>>>>>>>>> Greets,
>>>>>>>>>>> Stefan
Hello,
I am currently evaluating Ceph for our needs and I have a question
about the 'object append' feature. I note that the rados core API
supports an 'append' operation, and the S3-compatible interface does
too.
My question is: does Ceph support concurrent append? I would like to
use Ceph as a temporary store, a "buffer" if you will, for incoming
data from a variety of sources. Each object would hold data for a
particular identifier. I'd like to know whether two or more different
clients can 'append' to the same object without the data overwriting
each other, with each 'append' added to the end of the object.
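To make the intended behaviour concrete, here is the kind of quick test I have in mind (using the rados CLI against a scratch pool; the pool and object names are made up):

  echo "record-from-source-A" > a.dat
  echo "record-from-source-B" > b.dat
  # two "clients" appending to the same object at the same time
  rados -p scratch append ingest-buffer-42 a.dat &
  rados -p scratch append ingest-buffer-42 b.dat &
  wait
  rados -p scratch stat ingest-buffer-42           # size should be the sum of both payloads
  rados -p scratch get ingest-buffer-42 out.dat && cat out.dat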
Performance-wise we'd likely be performing 15-20 thousand writes per
second, so we'd be building a pretty big cluster on very fast flash
disk. Data would only reside on the system for about an hour at most
before being read and deleted.
Cheers,
David Bell
Hi,
The command "ceph daemon mds.$mds perf dump" does not give the
collection with MDS specific data anymore. In Mimic I get the following
MDS specific collections:
- mds
- mds_cache
- mds_log
- mds_mem
- mds_server
- mds_sessions
But those are not available anymore in Nautilus (14.2.4). They are also
not listed in a "perf schema".
Where did these metrics go?
Thanks,
Stefan
--
| BIT BV https://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info(a)bit.nl