March 2021 - ceph-users - lists.ceph.io

A practical approach to efficiently store 100 billions small objects in Ceph

by Loïc Dachary

Bonjour, In the past weeks a few mailing list threads[0][1][2] explored the problem of storing billions of small objects in Ceph. There was great feedback (I learned at lot) and it turns out the solution is a rather simple aggregation of the ideas that were suggested during these discussions. It is described in detail here: https://wiki.softwareheritage.org/wiki/A_practical_approach_to_efficiently_… The next step will be to write and run benchmarks[3]. Although I'm convinced it is a good solution, I've been wrong before and confirmation is required :-) Many thanks to all the participants in the discussions on behalf of the Software Heritage project[4]. Cheers [0] Storing 20 billions of immutable objects in Ceph, 75% <16KB https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/JSG2TXKNXPX… [1] Small RGW objects and RADOS 64KB minimun size https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/AEMW6O7WVJF… [2] Using RBD to pack billions of small files https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/RHQ5ZCHJISX… [3] https://forge.softwareheritage.org/T3054 [4] https://www.softwareheritage.org/ -- Loïc Dachary, Artisan Logiciel Libre

3 years, 2 months

4
8
1 0

RadosGW unable to start resharding

by Ansgar Jazdzewski

Hi Folks, We are running ceph 14.2.16 and I like to reshard a bucket because I have a large object warning! so I did: radosgw-admin bucket reshard --tenant="..." --bucket="..." --uid="..." --num-shards=512 but I got receive an error: ERROR: the bucket is currently undergoing resharding and cannot be added to the reshard list at this time `radosgw-admin reshard list` is empty so I assume I have to delete some leftovers from the old resharding!? did someone has had this before? thanks for your input, Ansgar

3 years, 2 months

2
3
0 0

Bluestore OSD Layout - WAL, DB, Journal

by Dave Hall

Hello, I'm in the process of doubling the number of OSD nodes in my Nautilus cluster - from 3 to 6. Based on answers receive from earlier posts to this list, the new nodes have more NVMe that the old nodes. More to the point, on the original nodes the amount of NVMe allocated to each OSD was about 120GB, so the RocksDB was limited to 30GB. However, for my workload 300GB is probably recommended. As I prepare to lay out the NVMe on these new nodes, I'm still trying to understand how to size the DB and WAL for my OSDs and whether Journal is even needed. According to https://docs.ceph.com/en/nautilus/ceph-volume/lvm/prepare/ <https://docs.ceph.com/en/nautilus/ceph-volume/lvm/prepare/> > Bluestore supports the following configurations: > > * A block device, a block.wal, and a block.db device > * A block device and a block.wal device > * A block device and a block.db device > * A single block device > First question: On my first nodes I managed to get a DB, but no WAL. My current perception is that WAL and DB occupy separate physical/logical partitions. By specifying a WAL size and a DB size, ceph-volume will create the corresponding logical volumes on the NVMe. Is this correct? It is also possible to lay these out as basic logical partitions? Second question: How do I decide whether I need WAL, DB, or both? Third question: Once I answer the above WAL/DB question, what are the guidelines for sizing them? Thanks. -Dave -- Dave Hall Binghamton University

3 years, 2 months

2
1
0 0

Some confusion around PG, OSD and balancing issue

by Darrin Hodges

HI all, Just looking for clarification around the relationship between PGs,OSDs and balancing on a ceph (octopus) cluster. We have pg autobalance on and balancing is set to upmap. There are 2 pools, one is the default metric pool with 1 pg, the other is the pool we are using for everything, it has 512 PG. There are 60 OSD's split across 4 hosts. The USD usage ranges between 39% and 68%: * current cluster score 0.048908 ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS 0 hdd 2.74599 1.00000 2.7 TiB 1.3 TiB 1.3 TiB 2.3 MiB 2.3 GiB 1.5 TiB 47.08 0.86 19 up 1 hdd 2.74550 1.00000 2.7 TiB 1.4 TiB 1.3 TiB 3.2 MiB 2.3 GiB 1.4 TiB 49.46 0.90 20 up 2 hdd 2.74550 1.00000 2.7 TiB 1.4 TiB 1.4 TiB 2.8 MiB 2.4 GiB 1.3 TiB 51.81 0.95 21 up 3 hdd 2.74550 1.00000 2.7 TiB 1.4 TiB 1.4 TiB 589 KiB 2.4 GiB 1.3 TiB 51.73 0.95 21 up 4 hdd 2.74550 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 2.2 MiB 2.7 GiB 1.1 TiB 59.05 1.08 24 up 5 hdd 9.00000 1.00000 9.1 TiB 5.2 TiB 5.2 TiB 6.9 MiB 8.2 GiB 3.9 TiB 56.75 1.04 77 up 6 hdd 2.74550 1.00000 2.7 TiB 1.5 TiB 1.5 TiB 1.9 MiB 2.8 GiB 1.3 TiB 54.21 0.99 22 up 7 hdd 2.74550 1.00000 2.7 TiB 1.4 TiB 1.3 TiB 1.1 MiB 2.7 GiB 1.4 TiB 49.32 0.90 20 up 8 hdd 2.74550 1.00000 2.7 TiB 1.4 TiB 1.3 TiB 1.1 MiB 2.3 GiB 1.4 TiB 49.36 0.90 20 up 9 hdd 2.74550 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 3.3 MiB 3.0 GiB 1.1 TiB 59.21 1.08 24 up 10 hdd 2.74550 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 2.2 MiB 3.4 GiB 941 GiB 66.53 1.22 27 up 11 hdd 2.74550 1.00000 2.7 TiB 1.1 TiB 1.1 TiB 3.9 MiB 2.0 GiB 1.7 TiB 39.69 0.73 16 up 12 hdd 2.74550 1.00000 2.7 TiB 1.2 TiB 1.2 TiB 1.2 MiB 2.1 GiB 1.5 TiB 44.69 0.82 18 up 13 hdd 2.74550 1.00000 2.7 TiB 1.1 TiB 1.1 TiB 2.4 MiB 1.9 GiB 1.7 TiB 39.59 0.72 16 up 14 hdd 2.74550 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 2.1 MiB 3.1 GiB 945 GiB 66.37 1.21 27 up 15 hdd 2.74599 1.00000 2.7 TiB 1.6 TiB 1.5 TiB 1.7 MiB 2.6 GiB 1.2 TiB 56.68 1.04 23 up 16 hdd 2.74599 1.00000 2.7 TiB 1.3 TiB 1.3 TiB 3.3 MiB 2.3 GiB 1.5 TiB 46.76 0.86 19 up 17 hdd 2.74599 0.95001 2.7 TiB 1.8 TiB 1.8 TiB 2.8 MiB 3.1 GiB 953 GiB 66.12 1.21 26 up 18 hdd 2.74599 0.95001 2.7 TiB 1.8 TiB 1.7 TiB 1.5 MiB 3.0 GiB 1010 GiB 64.08 1.17 26 up 19 hdd 2.74599 1.00000 2.7 TiB 1.8 TiB 1.7 TiB 3.0 MiB 2.9 GiB 1016 GiB 63.88 1.17 26 up 20 hdd 9.00000 1.00000 9.1 TiB 4.7 TiB 4.7 TiB 28 MiB 7.5 GiB 4.4 TiB 51.47 0.94 70 up 21 hdd 2.74599 1.00000 2.7 TiB 1.2 TiB 1.2 TiB 2.8 MiB 2.2 GiB 1.5 TiB 44.43 0.81 18 up 22 hdd 2.74599 1.00000 2.7 TiB 1.4 TiB 1.4 TiB 653 KiB 2.4 GiB 1.3 TiB 51.89 0.95 21 up 23 hdd 2.74599 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 2.3 MiB 2.7 GiB 1.1 TiB 59.26 1.08 24 up 24 hdd 2.74599 1.00000 2.7 TiB 1.6 TiB 1.5 TiB 3.1 MiB 3.0 GiB 1.2 TiB 56.67 1.04 23 up 25 hdd 2.74599 0.90002 2.7 TiB 1.7 TiB 1.7 TiB 5.3 MiB 4.0 GiB 1.1 TiB 61.47 1.12 25 up 26 hdd 2.74599 1.00000 2.7 TiB 1.4 TiB 1.4 TiB 1.5 MiB 2.4 GiB 1.3 TiB 51.82 0.95 21 up 27 hdd 2.74599 1.00000 2.7 TiB 1.3 TiB 1.3 TiB 1.1 MiB 2.2 GiB 1.5 TiB 46.93 0.86 19 up 28 hdd 2.74599 1.00000 2.7 TiB 1.3 TiB 1.3 TiB 2.6 MiB 2.2 GiB 1.5 TiB 46.88 0.86 19 up 29 hdd 2.74599 1.00000 2.7 TiB 1.4 TiB 1.3 TiB 557 KiB 2.6 GiB 1.4 TiB 49.37 0.90 20 up 45 hdd 2.74599 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 2.3 MiB 2.7 GiB 1.1 TiB 59.18 1.08 24 up 46 hdd 2.74599 1.00000 2.7 TiB 1.4 TiB 1.4 TiB 28 MiB 2.5 GiB 1.3 TiB 51.76 0.95 22 up 47 hdd 2.74599 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 977 KiB 3.1 GiB 1.1 TiB 59.07 1.08 24 up 48 hdd 2.74599 1.00000 2.7 TiB 1.3 TiB 1.3 TiB 625 KiB 2.6 GiB 1.5 TiB 46.86 0.86 19 up 49 hdd 2.74599 1.00000 2.7 TiB 1.7 TiB 1.7 TiB 1.9 MiB 3.2 GiB 1.1 TiB 61.68 1.13 25 up 50 hdd 2.74599 1.00000 2.7 TiB 1.7 TiB 1.7 TiB 1.0 MiB 2.7 GiB 1.1 TiB 61.53 1.13 25 up 51 hdd 2.74599 1.00000 2.7 TiB 1.2 TiB 1.1 TiB 2.5 MiB 2.3 GiB 1.6 TiB 41.88 0.77 17 up 52 hdd 2.74599 0.95001 2.7 TiB 1.8 TiB 1.8 TiB 1.8 MiB 3.8 GiB 942 GiB 66.49 1.22 27 up 53 hdd 9.00000 1.00000 9.1 TiB 4.3 TiB 4.3 TiB 6.2 MiB 6.8 GiB 4.8 TiB 47.46 0.87 64 up 54 hdd 2.74599 0.95001 2.7 TiB 1.8 TiB 1.7 TiB 1.3 MiB 2.9 GiB 1008 GiB 64.13 1.17 26 up 55 hdd 2.74599 1.00000 2.7 TiB 1.8 TiB 1.7 TiB 1.5 MiB 3.0 GiB 1014 GiB 63.95 1.17 26 up 56 hdd 2.74599 1.00000 2.7 TiB 1.8 TiB 1.7 TiB 1.3 MiB 2.8 GiB 1013 GiB 63.99 1.17 26 up 57 hdd 2.74599 1.00000 2.7 TiB 1.6 TiB 1.5 TiB 2.9 MiB 2.6 GiB 1.2 TiB 56.67 1.04 23 up 58 hdd 2.74599 1.00000 2.7 TiB 1.5 TiB 1.5 TiB 1.2 MiB 2.5 GiB 1.3 TiB 54.41 1.00 22 up 59 hdd 2.74599 0.95001 2.7 TiB 1.8 TiB 1.7 TiB 1.3 MiB 2.9 GiB 1015 GiB 63.92 1.17 26 up 30 hdd 2.74599 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 1.6 MiB 3.1 GiB 878 GiB 68.79 1.26 28 up 31 hdd 2.74599 1.00000 2.7 TiB 1.3 TiB 1.3 TiB 2.3 MiB 2.2 GiB 1.5 TiB 47.03 0.86 19 up 32 hdd 2.74599 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 3.2 MiB 3.3 GiB 944 GiB 66.45 1.22 27 up 33 hdd 2.74599 1.00000 2.7 TiB 1.1 TiB 1.1 TiB 3.1 MiB 1.9 GiB 1.7 TiB 39.88 0.73 16 up 34 hdd 2.74599 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 28 MiB 3.0 GiB 874 GiB 68.93 1.26 29 up 35 hdd 2.74599 1.00000 2.7 TiB 1.3 TiB 1.3 TiB 1.3 MiB 2.2 GiB 1.5 TiB 47.01 0.86 19 up 36 hdd 2.74599 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 1.6 MiB 3.0 GiB 947 GiB 66.31 1.21 27 up 37 hdd 2.74599 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 1.8 MiB 2.9 GiB 1.1 TiB 58.96 1.08 24 up 38 hdd 2.74599 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 2.8 MiB 3.4 GiB 877 GiB 68.82 1.26 28 up 39 hdd 2.74599 0.85004 2.7 TiB 1.6 TiB 1.6 TiB 2.4 MiB 3.5 GiB 1.1 TiB 59.20 1.08 24 up 40 hdd 2.74599 1.00000 2.7 TiB 1.4 TiB 1.3 TiB 1.5 MiB 2.3 GiB 1.4 TiB 49.35 0.90 20 up 41 hdd 2.74599 1.00000 2.7 TiB 1.4 TiB 1.4 TiB 2.3 MiB 2.4 GiB 1.3 TiB 51.73 0.95 21 up 42 hdd 9.00000 1.00000 9.1 TiB 4.5 TiB 4.5 TiB 2.1 MiB 7.2 GiB 4.6 TiB 49.76 0.91 67 up 43 hdd 2.74599 0.90002 2.7 TiB 1.6 TiB 1.6 TiB 2.9 MiB 3.2 GiB 1.1 TiB 59.17 1.08 24 up 44 hdd 2.74599 1.00000 2.7 TiB 1.2 TiB 1.2 TiB 4.0 MiB 2.2 GiB 1.5 TiB 44.34 0.81 18 up 54 of the OSD are 3TiB and there are four 8TiB - can this cause imbalance? Or do I need to increase the PG from 512? or is this as good as it gets? many thanks Darrin -- CONFIDENTIALITY NOTICE: This email is intended for the named recipients only. It may contain privileged, confidential or copyright information. If you are not the named recipients, any use, reliance upon, disclosure or copying of this email or any attachments is unauthorised. If you have received this email in error, please reply via email or telephone +61 2 8004 5928.

3 years, 2 months

1
0
0 0

how smart is ceph recovery?

by Marc

1. is backfilling/remapping so smart that it will do what ever it can? Or are there situations like a pg a. is scheduled to be moved, but can not be moved because of min_size. Now another pg b. cannot be moved because pg a. allocated osd space and the backfill ratio will be met. Yet if the order was reversed if the pg b. would be moved and pg a. would not be moved because it is stuck anyway and the backfill ratio will be met. 2. If a down host comes up again and it's osd are started. Is data still being copied, or does ceph see that checksums(?) are the same and just sets a pointer(?) back to the old location? > -----Original Message----- > Sent: 09 March 2021 23:59 > To: ceph-users(a)ceph.io > Subject: [ceph-users] node down pg with backfill_wait waiting for > incomplete? > > > I have a node down and pg's are remapping/backfilling. I have also a lot > of pg's in backfill_wait. > > I was wondering if there is a specific order that this is being > executed. Eg I have a large'garbage' pool ec21 that is stuck. I could > resolve that by changing the min size. However I rather have this not > remapped, and wait for this node to be back on line. > > Question is: is other remapping/backfilling waiting for this stuck to be > fixed, or is backfilling/remapping so smart that it will do what ever it > can? > > > > > > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -. > F1 Outsourcing Development Sp. z o.o. > Poland > > t: +48 (0)12 4466 845 > f: +48 (0)12 4466 843 > e: marc(a)f1-outsourcing.eu > > > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io > To unsubscribe send an email to ceph-users-leave(a)ceph.io

3 years, 2 months

1
0
0 0

How to speed up removing big rbd pools

by huxiaoyu＠horebdata.cn

Dear Cepher， For some reasons, i had a cluster with several 20 TB pools and 100TB ones, which were previously linked with iSCSI for virtual machines. When deleting those big rbd images, it turns out to be extremely slow, taking hours if not days. The Ceph cluster is running on Luminous 12.2.13 with bluestore How to speed up removing big rbd pools? thanks, samuel huxiaoyu(a)horebdata.cn

3 years, 2 months

1
0
0 0

PG inactive when host is down despite CRUSH failure domain being host

by Janek Bevendorff

Hi, I am having a weird phenomenon, which I am having trouble to debug. We have 16 OSDs per host, so when I reboot one node, 16 OSDs will be missing for a short time. Since our minimum CRUSH failure domain is host, this should not cause any problems. Unfortunately, I always have handful (1-5) PGs that become inactive nonetheless and are stuck in the state undersized+degraded+peered until the host and its OSDs are back up. The other 2000+ PGs that are also on these OSDs do not have this problem. In total, we have between 110 and 150 PGs per OSD with a configured maximum of 250, which should give us enough headspace. The affected pools always seem to be RBD pools or at least I haven't seen it on our much larger RGW pool yet. The pool's CRUSH rule looks like this: rule rbd-data { id 8 type replicated min_size 2 max_size 10 step take default step chooseleaf firstn 0 type host step emit } ceph pg dump_stuck inactive gives me this: PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY 115.3 undersized+degraded+peered [194,267] 194 [194,267] 194 115.13 undersized+degraded+peered [151,1122] 151 [151,1122] 151 116.12 undersized+degraded+peered [288,726] 288 [288,726] 288 and when I query one of the inactive PGs, I see (among other things): "up": [ 288, 726 ], "acting": [ 288, 726 ], "acting_recovery_backfill": [ "288", "726" ], "recovery_state": [ { "name": "Started/Primary/Active", "enter_time": "2021-03-10T16:23:09.301174+0100", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "pull_from_peer": [], "pushing": [] } } }, { "name": "Started", "enter_time": "2021-03-10T16:23:08.297622+0100" } ], So you can see that two out of three OSDs on other hosts are indeed up and active and the . I also see the ceph-osd daemons running on those hosts, so the data is definitely there and the PG should be available. Do you have any idea why these PGs may be becoming inactive nonetheless? I am suspecting some kind of concurrency limit, but I wouldn't know which one that could be. Thanks Janek

3 years, 2 months

2
2
0 0

OSD crashes create_aligned_in_mempool in 15.2.9 and 14.2.16

by Andrej Filipcic

Hi, under heavy load our cluster is experiencing frequent OSD crashes. Is this a known bug or should I report it? Any workarounds? It looks to be highly correlated with memory tuning. it happens with both nautilus 14.2.16 and octopus 15.2.9. I have forced the bitmap bluefs and bluestore allocator. the cluster is ~60 nodes with 256GB ram and 25Gb NICs, ~1600 OSDs on 100g network. Typically it is happening when the traffic is above 100GB/s. Best regards, Andrej -14> 2021-03-09T14:10:30.105+0100 7fc128e05700 10 monclient: tick -13> 2021-03-09T14:10:30.105+0100 7fc128e05700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2021-03-09T14:10:00.107344+0100) -12> 2021-03-09T14:10:30.210+0100 7fc119412700 5 osd.209 9539 heartbeat osd_stat(store_statfs(0xe68762a0000/0x40000000/0xe8d7fc00000, data 0x24b195d923/0x24c97c0000, compress 0x0/0x0/0x0, omap 0xf5dca, meta 0x3ff0a236), peers [6,8,11,21,22,23,24,25,26,27,28,29,34,35,37,38,65,86,87,90,120,128,129,135,136,140,150,153,154,160,184,188,192,193,203,208,210,217,229,233,242,248,254,25 6,275,277,282,290,311,313,324,326,331,339,348,369,409,411,413,466,477,532,535,538,539,542,544,546,548,552,554,556,558,561,576,580,600,601,604,612,614,624,625,631,657,689,695,704,717,738,739,740,766,790,810,812,833,839,890,891,895,903,909,916,926,927,946,960,965,991,1050,1055,1062,1064,1067,1069,1072,1073,1075,1077,1078,1079,1095,1100,1117,1125,1127,1141,1148,1149,1153,1155,1195,12 01,1202,1215,1229,1238,1253,1260,1283,1290,1298,1303,1329,1330,1349,1350,1388,1389,1422,1423,1430,1431,1434,1448,1455,1478,1479,1485,1488,1494,1497,1506,1516,1561,1564,1573,1574,1580] op hist [0,0,0,1,3,4,15,24,43,64,102,117]) -11> 2021-03-09T14:10:30.468+0100 7fc137cc0700 10 monclient: handle_auth_request added challenge on 0x55c08becf800 -10> 2021-03-09T14:10:30.543+0100 7fc1374bf700 10 monclient: handle_auth_request added challenge on 0x55c08becf400 -9> 2021-03-09T14:10:30.712+0100 7fc1384c1700 10 monclient: handle_auth_request added challenge on 0x55c08becec00 -8> 2021-03-09T14:10:31.029+0100 7fc137cc0700 10 monclient: handle_auth_request added challenge on 0x55c08becfc00 -7> 2021-03-09T14:10:31.033+0100 7fc12ca33700 5 prioritycache tune_memory target: 7264711979 mapped: 1564606464 unmapped: 47874048 heap: 1612480512 old mem: 5369698813 new mem: 5369698813 -6> 2021-03-09T14:10:31.105+0100 7fc128e05700 10 monclient: tick -5> 2021-03-09T14:10:31.105+0100 7fc128e05700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2021-03-09T14:10:01.107451+0100) -4> 2021-03-09T14:10:31.574+0100 7fc1374bf700 10 monclient: handle_auth_request added challenge on 0x55c0b69a8000 -3> 2021-03-09T14:10:32.036+0100 7fc12ca33700 5 prioritycache tune_memory target: 7264711979 mapped: 1708449792 unmapped: 46637056 heap: 1755086848 old mem: 5369698813 new mem: 5369698813 -2> 2021-03-09T14:10:32.106+0100 7fc128e05700 10 monclient: tick -1> 2021-03-09T14:10:32.106+0100 7fc128e05700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2021-03-09T14:10:02.107524+0100) 0> 2021-03-09T14:10:32.661+0100 7fc1384c1700 -1 *** Caught signal (Aborted) ** in thread 7fc1384c1700 thread_name:msgr-worker-0 ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable) 1: (()+0x12b20) [0x7fc13cc20b20] 2: (gsignal()+0x10f) [0x7fc13b8847ff] 3: (abort()+0x127) [0x7fc13b86ec35] 4: (()+0x9009b) [0x7fc13c23a09b] 5: (()+0x9653c) [0x7fc13c24053c] 6: (()+0x96597) [0x7fc13c240597] 7: (()+0x967f8) [0x7fc13c2407f8] 8: (ceph::buffer::v15_2_0::create_aligned_in_mempool(unsigned int, unsigned int, int)+0x229) [0x55c0669dce49] 9: (ceph::buffer::v15_2_0::create_aligned(unsigned int, unsigned int)+0x26) [0x55c0669dcf46] 10: (ceph::buffer::v15_2_0::create_small_page_aligned(unsigned int)+0x55) [0x55c0669dd8e5] 11: (ProtocolV1::read_message_data_prepare()+0x368) [0x55c066b78ac8] 12: (ProtocolV1::read_message_middle()+0x130) [0x55c066b78c90] 13: (ProtocolV1::handle_message_front(char*, int)+0x2f4) [0x55c066b79624] 14: (()+0xf72bed) [0x55c066b72bed] 15: (AsyncConnection::process()+0x8a9) [0x55c066b6fa39] 16: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xcb7) [0x55c0669c41b7] 17: (()+0xdc979c) [0x55c0669c979c] 18: (()+0xc2ba3) [0x7fc13c26cba3] 19: (()+0x814a) [0x7fc13cc1614a] 20: (clone()+0x43) [0x7fc13b949f23] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- logging levels --- 0/ 5 none 0/ 1 lockdep 0/ 1 context 1/ 1 crush 1/ 5 mds 1/ 5 mds_balancer 1/ 5 mds_locker 1/ 5 mds_log 1/ 5 mds_log_expire 1/ 5 mds_migrator 0/ 1 buffer 0/ 1 timer 0/ 1 filer 0/ 1 striper 0/ 1 objecter 0/ 5 rados 0/ 5 rbd 0/ 5 rbd_mirror 0/ 5 rbd_replay 0/ 5 rbd_rwl 0/ 5 journaler 0/ 5 objectcacher 0/ 5 immutable_obj_cache 0/ 5 client 1/ 5 osd 0/ 5 optracker 0/ 5 objclass 1/ 3 filestore 1/ 3 journal 0/ 0 ms 1/ 5 mon 0/10 monc 1/ 5 paxos 0/ 5 tp 1/ 5 auth 1/ 5 crypto 1/ 1 finisher 1/ 1 reserver 1/ 5 heartbeatmap 1/ 5 perfcounter 1/ 5 rgw 1/ 5 rgw_sync 1/10 civetweb 1/ 5 javaclient 1/ 5 asok 1/ 1 throttle 0/ 0 refs 1/ 5 compressor 1/ 5 bluestore 1/ 5 bluefs 1/ 3 bdev 1/ 5 kstore 4/ 5 rocksdb 4/ 5 leveldb 4/ 5 memdb 1/ 5 fuse 1/ 5 mgr 1/ 5 mgrc 1/ 5 dpdk 1/ 5 eventtrace 1/ 5 prioritycache 0/ 5 test -2/-2 (syslog threshold) -1/-1 (stderr threshold) --- pthread ID / name mapping for recent threads --- 7fc119412700 / osd_srv_heartbt 7fc119c13700 / tp_osd_tp 7fc11a414700 / tp_osd_tp 7fc11ac15700 / tp_osd_tp 7fc11b416700 / tp_osd_tp 7fc11bc17700 / tp_osd_tp 7fc124c29700 / ms_dispatch 7fc125c2b700 / rocksdb:dump_st 7fc12822a700 / bstore_kv_sync 7fc128e05700 / safe_timer 7fc129e07700 / ms_dispatch 7fc12ca33700 / bstore_mempool 7fc133446700 / safe_timer 7fc1374bf700 / msgr-worker-2 7fc137cc0700 / msgr-worker-1 7fc1384c1700 / msgr-worker-0 max_recent 10000 max_new 1000 -- _____________________________________________________________ prof. dr. Andrej Filipcic, E-mail: Andrej.Filipcic(a)ijs.si Department of Experimental High Energy Physics - F9 Jozef Stefan Institute, Jamova 39, P.o.Box 3000 SI-1001 Ljubljana, Slovenia Tel.: +386-1-477-3674 Fax: +386-1-425-7074 -------------------------------------------------------------

3 years, 2 months

3
4
0 0

Re: Rados gateway basic pools missing

by St-Germain, Sylvain (SSC/SPC)

Ok I fix it works -----Message d'origine----- De : St-Germain, Sylvain (SSC/SPC) <sylvain.st-germain(a)canada.ca> Envoyé : 9 mars 2021 17:41 À : St-Germain, Sylvain (SSC/SPC) <sylvain.st-germain(a)canada.ca>; ceph-users(a)ceph.io Objet : RE: Rados gateway basic pools missing Ok in the interface when I create a bucket the index in created automatically 1 device_health_metrics 2 cephfs_data 3 cephfs_metadata 4 .rgw.root 5 default.rgw.log 6 default.rgw.control 7 default.rgw.meta 8 default.rgw.buckets.index * I think I just could not make an insertion using s3cmd List command - connection problem # s3cmd la !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! An unexpected error has occurred. Please try reproducing the error using the latest s3cmd code from the git master branch found at: https://github.com/s3tools/s3cmd and have a look at the known issues list: https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutio… If the error persists, please report the following lines (removing any private info as necessary) to: s3tools-bugs(a)lists.sourceforge.net !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Invoked as: /usr/bin/s3cmd la Problem: <class 'ConnectionRefusedError: [Errno 111] Connection refused S3cmd: 2.0.2 python: 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0] environment LANG=en_CA.UTF-8 Traceback (most recent call last): File "/usr/bin/s3cmd", line 3092, in <module> rc = main() File "/usr/bin/s3cmd", line 3001, in main rc = cmd_func(args) File "/usr/bin/s3cmd", line 164, in cmd_all_buckets_list_all_content response = s3.list_all_buckets() File "/usr/lib/python3/dist-packages/S3/S3.py", line 302, in list_all_buckets response = self.send_request(request) File "/usr/lib/python3/dist-packages/S3/S3.py", line 1258, in send_request conn = ConnMan.get(self.get_hostname(resource['bucket'])) File "/usr/lib/python3/dist-packages/S3/ConnMan.py", line 253, in get conn.c.connect() File "/usr/lib/python3.8/http/client.py", line 921, in connect self.sock = self._create_connection( File "/usr/lib/python3.8/socket.py", line 808, in create_connection raise err File "/usr/lib/python3.8/socket.py", line 796, in create_connection sock.connect(sa) ConnectionRefusedError: [Errno 111] Connection refused !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! An unexpected error has occurred. Please try reproducing the error using the latest s3cmd code from the git master branch found at: https://github.com/s3tools/s3cmd and have a look at the known issues list: https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutio… If the error persists, please report the above lines (removing any private info as necessary) to: s3tools-bugs(a)lists.sourceforge.net !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! -----Message d'origine----- De : St-Germain, Sylvain (SSC/SPC) <sylvain.st-germain(a)canada.ca> Envoyé : 9 mars 2021 17:19 À : ceph-users(a)ceph.io Objet : [ceph-users] Rados gateway basic pools missing Hi everyone, I just rebuild a (test) cluster using : OS : Ubuntu 20.04.2 LTS CEPH : ceph version 15.2.9 (357616cbf726abb779ca75a551e8d02568e15b17) octopus (stable) 3 nodes : monitor/storage 1. The cluster looks good : # ceph -s cluster: id: 9a89aa5a-1702-4f87-a99c-f94c9f2cdabd health: HEALTH_OK services: mon: 3 daemons, quorum dao-wkr-04,dao-wkr-05,dao-wkr-06 (age 7m) mgr: dao-wkr-05(active, since 8m), standbys: dao-wkr-04, dao-wkr-06 mds: cephfs:1 {0=dao-wkr-04=up:active} 2 up:standby osd: 9 osds: 9 up (since 7m), 9 in (since 4h) rgw: 3 daemons active (dao-wkr-04.rgw0, dao-wkr-05.rgw0, dao-wkr-06.rgw0) task status: data: pools: 7 pools, 121 pgs objects: 234 objects, 16 KiB usage: 9.0 GiB used, 2.0 TiB / 2.0 TiB avail pgs: 121 active+clean 2. except that the main pools for the radosgw are not there # sudo ceph osd lspools 1 device_health_metrics 2 cephfs_data 3 cephfs_metadata 4 .rgw.root 5 default.rgw.log 6 default.rgw.control 7 default.rgw.meta Missing : default.rgw.buckets.index & default.rgw.buckets.data What do you think ? Thx ! Sylvain _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

3 years, 2 months

1
0
0 0

buckets with negative num_objects

by Boris Behrens

Hi, I am in the process of resharding large buckets and to find them I ran radosgw-admin bucket limit check | grep '"fill_status": "OVER' -B5 and I see that there are two buckets with negative num_objects "bucket": "ncprod", "tenant": "", "num_objects": -482, "num_shards": 0, "objects_per_shard": -482, "fill_status": "OVER 100.000000%" -- "bucket": "fileshare-s3", "tenant": "", "num_objects": -137, "num_shards": 0, "objects_per_shard": -137, "fill_status": "OVER 100.000000%" Is this an error? -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groÃƒ¼en Saal.

3 years, 2 months

1
0
0 0

2024

2023

2022

2021

2020

2019

ceph-users March 2021