Hi,
Is the creation of RBD volumes and RGW buckets audited? If so, what do
the audit logs look like? Is there any documentation about it? I tried to
find the related audit logs in the "/var/log/ceph/ceph.audit.log" file
but didn't find any.
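For context: ceph.audit.log normally records mon commands (things issued through the `ceph` CLI), not librbd or RGW data-path operations, which may be why nothing showed up there. A hedged sketch of the RGW side, assuming the ops-log facility is what is wanted here (verify the option names against your release's documentation):

```shell
# Enable the RGW ops log, which records per-request S3/Swift operations
# (including bucket creation) separately from the cluster audit log.
ceph config set client.rgw rgw_enable_ops_log true

# The mon audit log only covers explicit CLI-driven actions, e.g.:
grep 'rbd' /var/log/ceph/ceph.audit.log
```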
Thanks,
Jinhao
Are there alternatives to TheJJ balancer? I have a (temporary) rebalance
problem, and that code chokes[1].
Essentially, I have a few pgs in remapped+backfill_toofull, but plenty of
space in the parent's parent bucket(s).
[1] https://github.com/TheJJ/ceph-balancer/issues/23
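In case it helps while looking for a balancer: for remapped+backfill_toofull specifically, the per-OSD backfillfull threshold (not free space higher up the CRUSH tree) is what gates backfill, so checking and cautiously raising it can unstick such PGs. A sketch with an illustrative value, not a recommendation:

```shell
# Show the current full / backfillfull / nearfull ratios from the OSDMap.
ceph osd dump | grep -i ratio

# Temporarily raise the backfill threshold so stuck PGs can move;
# lower it back once backfill completes.
ceph osd set-backfillfull-ratio 0.92
```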
On Wed, Dec 14, 2022 at 6:55 AM Denis Polom <denispolom(a)gmail.com> wrote:
> Hi,
>
> looks like TheJJ balancer solved the issue!
>
> Thx!
>
>
> On 11/9/22 13:35, Denis Polom wrote:
> > Hi Stefan,
> >
> > thank you for the help. Looks very interesting, and the command you
> > sent gives better insight into that. Still wondering why some OSDs
> > remain primary for more PGs than others. I was thinking that the
> > balancer and CRUSH should take care of that.
> >
> > I will try the balancer you linked and will post the result. But this
> > will take more time, as I first have to test it on a non-production
> > Ceph cluster.
> >
> > Thx!
> >
> >
> > On 11/9/22 08:20, Stefan Kooman wrote:
> >> On 11/1/22 13:45, Denis Polom wrote:
> >>> Hi
> >>>
> >>> I observed on my Ceph cluster running the latest Pacific that
> >>> same-size OSDs are utilized differently even though the balancer is
> >>> running and reports the status as perfectly balanced.
> >>>
> >>
> >> That might be true, because the primary PGs are not evenly balanced.
> >> You can check that with "ceph pg dump". The last part of the output
> >> is an overview of how many PGs each OSD is primary for. To get a
> >> per-pool breakdown you can run this (source: unknown, but it works :-)):
> >>
> >> ceph pg dump | awk '
> >> BEGIN { IGNORECASE = 1 }
> >> /^PG_STAT/ { col=1; while($col!="UP") {col++}; col++ }
> >> /^[0-9a-f]+\.[0-9a-f]+/ { match($0,/^[0-9a-f]+/);
> >>   pool=substr($0, RSTART, RLENGTH); poollist[pool]=0;
> >>   up=$col; i=0; RSTART=0; RLENGTH=0; delete osds;
> >>   while(match(up,/[0-9]+/)>0) { osds[++i]=substr(up,RSTART,RLENGTH);
> >>     up = substr(up, RSTART+RLENGTH) }
> >>   for(i in osds) { array[osds[i],pool]++; osdlist[osds[i]]; }
> >> }
> >> END {
> >>   printf("\n");
> >>   printf("pool :\t"); for (i in poollist) printf("%s\t",i); printf("| SUM \n");
> >>   for (i in poollist) printf("--------"); printf("----------------\n");
> >>   for (i in osdlist) { printf("osd.%i\t", i); sum=0;
> >>     for (j in poollist) { printf("%i\t", array[i,j]); sum+=array[i,j];
> >>       sumpool[j]+=array[i,j] }; printf("| %i\n",sum) }
> >>   for (i in poollist) printf("--------"); printf("----------------\n");
> >>   printf("SUM :\t"); for (i in poollist) printf("%s\t",sumpool[i]);
> >>   printf("|\n");
> >> }'
> >>
> >> On 11/15/2022 at 14:35 UTC there is a talk about this: "New
> >> workload balancer in Ceph" (Ceph Virtual 2022).
> >>
> >> The balancer made by Jonas Jelten works very well for us (though it
> >> does not balance primary PGs): https://github.com/TheJJ/ceph-balancer.
> >> It outperforms the ceph-balancer module by far and converges faster.
> >> This is true up to and including the Octopus release.
> >>
> >> Gr. Stefan
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
--
Jeremy Austin
jhaustin(a)gmail.com
Hello,
I'm currently investigating a downed ceph cluster that I cannot communicate with.
Setup is:
3 hosts with each 12 disks (osd/mon)
3 vm's with mon/mds/mgr
The VMs are unavailable at the moment, and one of the hosts is online with osd/mon running.
When issuing the command "ceph -s", nothing happens; after 5 minutes the following appears:
2023-01-26T12:40:08.111+0100 7f68b4b8b700 0 monclient(hunting): authenticate timed out after 300
What would be the best way of troubleshooting this?
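Since the normal client path is timing out, one option is to query the surviving mon directly over its admin socket, which bypasses authentication; a sketch assuming the default socket location and a mon named after the short hostname:

```shell
# Ask the local monitor for its own view of its state and the quorum.
ceph daemon mon.$(hostname -s) mon_status

# Equivalent, naming the admin socket explicitly:
ceph --admin-daemon /var/run/ceph/ceph-mon.$(hostname -s).asok quorum_status
```

With only one of several monitors reachable there may simply be no quorum, which would explain the authenticate timeout.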
Venlig hilsen - Mit freundlichen Grüßen - Kind Regards,
Jens Galsgaard
Hi everyone,
Ceph Days are coming to Southern California, co-located with our friends at
SCALE - the Southern California Linux Expo! The event will be a full day of
Ceph content on March 9.
The CFP ends on *2023-02-02*, so start drafting and completing your
proposals quickly.
https://survey.zohopublic.com/zs/3UBUpC
https://ceph.io/en/community/events/2023/ceph-days-socal/
Here are some suggested topics:
- Ceph operations, management, and development
- New and proposed Ceph features, development status
- Ceph development roadmap
- Best practices
- Ceph use-cases, solution architectures, and user experiences
- Ceph performance and optimization
- Platform Integrations
- Kubernetes, OpenShift
- OpenStack (Cinder, Manila, etc.)
- Spark
- Multi-site and multi-cluster data services
- Persistent memory, ZNS SSDs, SMR HDDs, DPUs, and other new hardware
technologies
- Storage management, monitoring, and deployment automation
- Experiences deploying and operating Ceph in production and/or at scale
- Small-scale or edge deployments
- Long-term, archival storage
- Data compression, deduplication, and storage optimization
- Developer processes, tools, challenges
- Ceph testing infrastructure, tools
- Ceph community issues, outreach, and project governance
- Ceph documentation, training, and learner experience
--
Mike Perez
There are other Ceph speaking opportunities to consider:
- Ceph Days NYC <https://ceph.io/en/community/events/2023/ceph-days-nyc/> -
February 21st, 2023 - Schedule and registration are available
- Ceph Days Southern California
<https://ceph.io/en/community/events/2023/ceph-days-socal/> - March 9th,
2023 - CFP open until *February 2nd*.
- Cephalocon 2023 <https://events.linuxfoundation.org/cephalocon/>
(co-located
with KubeCon in Amsterdam) - April 16 - 18 - CFP now available!
- Ceph Tech Talk (virtual) <https://ceph.io/en/community/tech-talks/> -
Monthly
Make sure to join our Announcement list or social media for further updates
on events
- Ceph Announcement list
<https://lists.ceph.io/postorius/lists/ceph-announce.ceph.io/>
- Twitter <https://twitter.com/ceph>
- LinkedIn <https://www.linkedin.com/company/ceph/>
- FaceBook <https://www.facebook.com/cephstorage/>
Hello,
We have a couple of RBD images in a pool that cannot be deleted. The user attempted to delete these volumes while we were in the middle of a Ceph minor version upgrade (during which Ceph processes restart). I suspect that during one of the service restarts (probably a monitor?), the image deletion only got halfway and left these images in a bad state. It looks like the images were scheduled to move to the trash (from the "rbd rm") but did not make it. Their omap values still exist in the trash, though.
Any suggestions on how these can be cleaned up? We are running Ceph 15.2.17 (now, after the upgrade from Ceph 15.2.15/16). Thank you
#rbd rm 418824e0-576a-4167-957a-5f3fa6f2693a -p <pool name>
Removing image: 0% complete...failed.
rbd: delete error: (117) Structure needs cleaning
2023-01-26T15:14:40.258+0000 7f2677dfb380 -1 librbd::api::Trash: remove: error: image is pending moving to the trash.
#rbd -p <pool name> trash ls
no output
#rbd -p <pool> ls | grep -i 418824e0-576a-4167-957a-5f3fa6f2693a
418824e0-576a-4167-957a-5f3fa6f2693a
#rbd info <pool name>/418824e0-576a-4167-957a-5f3fa6f2693a
rbd: error opening image 418824e0-576a-4167-957a-5f3fa6f2693a: (2) No such file or directory
#rados -p <pool name> listomapvals rbd_trash
id_db434b7d01af76
value (64 bytes) :
00000000 02 01 3a 00 00 00 03 24 00 00 00 34 31 38 38 32 |..:....$...41882|
00000010 34 65 30 2d 35 37 36 61 2d 34 31 36 37 2d 39 35 |4e0-576a-4167-95|
00000020 37 61 2d 35 66 33 66 61 36 66 32 36 39 33 61 0e |7a-5f3fa6f2693a.|
00000030 41 22 63 06 f7 bf 23 0e 41 22 63 06 f7 bf 23 01 |A"c...#.A"c...#.|
00000040
id_db5431429f149d
value (64 bytes) :
00000000 02 01 3a 00 00 00 03 24 00 00 00 33 65 31 63 65 |..:....$...3e1ce|
00000010 61 34 34 2d 30 31 63 35 2d 34 62 65 39 2d 39 31 |a44-01c5-4be9-91|
00000020 32 30 2d 37 65 64 64 62 66 37 65 39 65 32 63 0e |20-7eddbf7e9e2c.|
00000030 41 22 63 97 2b 3e 20 0e 41 22 63 97 2b 3e 20 01 |A"c.+> .A"c.+> .|
00000040
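If "rbd trash ls" shows nothing yet the keys persist, one low-level cleanup that has been used in similar situations is deleting the stale omap keys from the rbd_trash object directly with rados. This bypasses librbd's safety checks, so treat it as a last resort: take a "rados listomapvals" dump first and verify the ids. A sketch using a key name from the listing above:

```shell
# Double-check which keys are present.
rados -p <pool name> listomapkeys rbd_trash

# Remove one stale trash entry (destructive; verify the id first).
rados -p <pool name> rmomapkey rbd_trash id_db434b7d01af76
```

Depending on how far the deletion got, matching entries in rbd_directory may also need attention.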
Dear all,
I have a two-host setup, and I recently rebooted a mgr machine without
running the "ceph osd set noout" and "ceph osd set norebalance" commands.
The "darkside2" machine is the cephadm machine, and "darkside3" is the
improperly rebooted mgr.
Now the darkside3 machine does not rejoin the Ceph configuration:
[root@darkside2 ~]# ceph orch host ls
HOST ADDR LABELS STATUS
darkside2 darkside2
darkside3 172.22.132.189 Offline
If I understood the docs correctly, I should run
[root@darkside2 ~]# ceph orch host add darkside3
but this fails because darkside3 doesn't accept root ssh connections.
I presume this has been discussed before, but I couldn't find the
correct thread. Could someone please point me in the right direction?
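For the ssh part specifically, cephadm manages hosts over ssh as root by default, and the usual fix is to install cephadm's public key on the target host and then re-add it with an explicit address; a sketch using the hostnames/address from the output above:

```shell
# Export the public key cephadm uses and authorize it on darkside3.
ceph cephadm get-pub-key > ~/ceph.pub
ssh-copy-id -f -i ~/ceph.pub root@darkside3

# Re-add the host, this time with its address.
ceph orch host add darkside3 172.22.132.189
```

If root logins are disabled by policy, a non-root ssh user can be configured instead with "ceph cephadm set-user <user>" (that user needs passwordless sudo).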
Cordially,
Renata.
Hi,
Trying to dist-upgrade an OSD server this morning, and lots of necessary
packages were removed!
Start-Date: 2023-01-26 10:04:57
Commandline: apt dist-upgrade
Install: linux-image-5.10.0-21-amd64:amd64 (5.10.162-1, automatic)
Upgrade: librados2:amd64 (16.2.10-1~bpo11+1, 16.2.11-1~bpo11+1),
ceph-mgr-modules-core:amd64 (16.2.10-1~bpo11+1, 16.2.11-1~bpo11+1),
librbd1:amd64 (16.2.10-1~bpo11+1, 16.2.11-1~bpo11+1), librgw2:amd64
(16.2.10-1~bpo11+1, 16.2.11-1~bpo11+1), linux-image-amd64:amd64
(5.10.158-2, 5.10.162-1), libradosstriper1:amd64 (16.2.10-1~bpo11+1,
16.2.11-1~bpo11+1), cephadm:amd64 (16.2.10-1~bpo11+1,
16.2.11-1~bpo11+1), linux-libc-dev:amd64 (5.10.158-2, 5.10.162-1)
Remove: ceph-base:amd64 (16.2.10-1~bpo11+1),
linux-image-5.10.0-15-amd64:amd64 (5.10.120-1), ceph-mgr-cephadm:amd64
(16.2.10-1~bpo11+1), ceph-common:amd64 (16.2.10-1~bpo11+1),
radosgw:amd64 (16.2.10-1~bpo11+1), linux-image-5.10.0-17-amd64:amd64
(5.10.136-1), ceph-mds:amd64 (16.2.10-1~bpo11+1), ceph-mgr:amd64
(16.2.10-1~bpo11+1), ceph-mon:amd64 (16.2.10-1~bpo11+1), ceph-osd:amd64
(16.2.10-1~bpo11+1), ceph-mgr-diskprediction-local:amd64
(16.2.10-1~bpo11+1), ceph:amd64 (16.2.10-1~bpo11+1),
ceph-mgr-dashboard:amd64 (16.2.10-1~bpo11+1), ceph-mgr-k8sevents:amd64
(16.2.10-1~bpo11+1)
End-Date: 2023-01-26 10:06:26
It seems ceph-common depends on packages which aren't (yet) in the repo:
The following packages have unmet dependencies:
ceph-common : Depends: python3-cephfs (= 16.2.11-1~bpo11+1) but it is
not going to be installed
Depends: python3-rados (= 16.2.11-1~bpo11+1) but
16.2.10-1~bpo11+1 is to be installed
Depends: python3-rbd (= 16.2.11-1~bpo11+1) but
16.2.10-1~bpo11+1 is to be installed
Depends: python3-rgw (= 16.2.11-1~bpo11+1) but
16.2.10-1~bpo11+1 is to be installed
E: Unable to correct problems, you have held broken packages.
Is anyone aware of this?
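Until the repo catches up, it may help to confirm exactly which versions are actually published and reinstall consistently; a sketch (package set abbreviated, versions taken from the log above):

```shell
# See which versions the configured repos actually offer.
apt-cache policy python3-rados python3-rbd python3-rgw python3-cephfs

# If the 16.2.11 Python bindings are missing, pin the removed daemons back
# to the still-available 16.2.10 version; this may also require
# downgrading the already-upgraded libraries (librados2, librbd1, ...).
apt install ceph-osd=16.2.10-1~bpo11+1 ceph-common=16.2.10-1~bpo11+1
```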
Thanks
--
All postal correspondence to:
The Positive Internet Company, 24 Ganton Street, London. W1F 7QY
Cephalocon Is Coming to Amsterdam
Cephalocon 2023 <https://events.linuxfoundation.org/cephalocon> is coming
to Amsterdam on April 16-18, co-located with KubeCon!
Cephalocon is the premier yearly event that brings together the global
community of operators, developers, and researchers for Ceph, the open
source distributed storage system designed to provide excellent
performance, reliability, and scalability. Join new and existing community
members from around the world to learn more about Ceph and the future of
the project from the developers writing the code and the operators
deploying it at scale.
The CFP is now open
<https://events.linuxfoundation.org/cephalocon/program/cfp/>, and
registration will open soon!
Here are some important dates:
- *CFP Closes:* Sunday, February 12 at 11:59 pm PST
- *CFP Notifications:* Wednesday, February 22
- *Schedule Announcement:* Monday, February 27
- *Presentation Slide Due Date:* Wednesday, April 12
- *Event Dates:* Monday, April 17 – Tuesday, April 18 (Developer Summit
Sunday, April 16)
The sponsorship prospectus is available
<https://events.linuxfoundation.org/wp-content/uploads/2023/01/sponsor-ceph-…>.
Contact sponsorships(a)ceph.foundation to secure your sponsorship, request
additional details, or discuss custom options.
Thank you to our event planners and everyone who has contributed to
planning discussions. We look forward to seeing you all soon!
------
There are other Ceph speaking opportunities to consider:
- Ceph Days NYC <https://ceph.io/en/community/events/2023/ceph-days-nyc/> -
February 21st, 2023 - Schedule and registration are available
- Ceph Days Southern California
<https://ceph.io/en/community/events/2023/ceph-days-socal/> - March 9th,
2023 - CFP open until *February 2nd*.
- Ceph Tech Talk (virtual) <https://ceph.io/en/community/tech-talks/> -
Monthly
SUBMIT YOUR PROPOSAL <https://events.linuxfoundation.org/cephalocon/>
We're happy to announce the 11th backport release in the Pacific series.
We recommend that users update to this release. For detailed release
notes with links & changelog please refer to the official blog entry
at https://ceph.io/en/news/blog/2023/v16-2-11-pacific-released
Notable Changes
---------------
* CephFS: The 'AT_NO_ATTR_SYNC' macro is deprecated; please use the standard
'AT_STATX_DONT_SYNC' macro instead. The 'AT_NO_ATTR_SYNC' macro will be
removed in the future.
* Trimming of PGLog dups is now controlled by the size instead of the version.
This fixes the PGLog inflation issue that was happening when the on-line
(in OSD) trimming got jammed after a PG split operation. Also, a new off-line
mechanism has been added: `ceph-objectstore-tool` got `trim-pg-log-dups` op
that targets situations where OSD is unable to boot due to those
inflated dups.
If that is the case, in OSD logs the "You can be hit by THE DUPS BUG" warning
will be visible.
Relevant tracker: https://tracker.ceph.com/issues/53729
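A sketch of what the off-line invocation looks like (the data path, OSD id, and PG id are placeholders; the OSD must be stopped first):

```shell
# Stop the affected OSD before touching its store.
systemctl stop ceph-osd@2

# Trim the inflated pg_log dups for one PG.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
    --op trim-pg-log-dups --pgid 2.7
```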
* RBD: The `rbd device unmap` command gained a `--namespace` option. Support
for namespaces was added to RBD in Nautilus 14.2.0, and it has been possible
to map and unmap images in namespaces using the `image-spec` syntax since
then, but the corresponding option was missing from most other commands.
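For illustration, both forms for an image in a namespace (pool, namespace, and image names below are placeholders):

```shell
# image-spec syntax, which has worked since Nautilus:
rbd device unmap mypool/ns1/myimage

# the explicit option form added in this release:
rbd device unmap --pool mypool --namespace ns1 --image myimage
```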
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-16.2.11.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: 3cf40e2dca667f68c6ce3ff5cd94f01e711af894
Hi team,
We have a ceph cluster with 3 storage nodes:
1. storagenode1 - abcd:abcd:abcd::21
2. storagenode2 - abcd:abcd:abcd::22
3. storagenode3 - abcd:abcd:abcd::23
We have a DNS server with IP abcd:abcd:abcd::31 which resolves the above IPs to a single hostname.
The resolution is as follows:
```
$TTL 1D
@ IN SOA storage.com root (
6 ; serial
1D ; refresh
1H ; retry
1W ; expire
3H ) ; minimum
IN NS master
master IN A 10.0.1.31
storagenode IN AAAA abcd:abcd:abcd::21
storagenode IN AAAA abcd:abcd:abcd::22
storagenode IN AAAA abcd:abcd:abcd::23
```
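Before mounting, it is worth confirming the round-robin records resolve as intended from the client (nameserver address taken from the setup above):

```shell
# All three AAAA records should come back for the round-robin name.
dig AAAA storagenode.storage.com @abcd:abcd:abcd::31 +short
```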
We want to mount the ceph storage on a node using this hostname.
For this we are using the command:
```
mount -t ceph [storagenode.storage.com]:6789:/ /backup -o name=admin,secret=AQCM+8hjqzuZEhAAcuQc+onNKReq7MV+ykFirg==
```
We are getting the following logs in /var/log/messages:
```
Jan 24 17:23:17 localhost kernel: libceph: resolve 'storagenode.storage.com' (ret=-3): failed
Jan 24 17:23:17 localhost kernel: libceph: parse_ips bad ip 'storagenode.storage.com:6789'
```
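Those errors come from the kernel client: libceph parses the mon list itself and does not perform DNS lookups, so name resolution has to happen in userspace before the address reaches the kernel. One workaround sketch, resolving the name first and passing a literal bracketed IPv6 address (the helper pipeline and secretfile path are illustrative):

```shell
# Resolve the AAAA record in userspace, then hand the kernel a literal
# address; IPv6 addresses must be wrapped in [] in the mon list.
ADDR=$(getent ahostsv6 storagenode.storage.com | awk '{print $1; exit}')
mount -t ceph "[$ADDR]:6789:/" /backup \
    -o name=admin,secretfile=/etc/ceph/admin.secret
```

Newer mount.ceph helpers from ceph-common resolve names before calling the kernel, so making sure that helper is installed and used may also be enough.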
We also tried mounting ceph storage by removing the dns server and resolving the ip as follows:
```
abcd:abcd:abcd::21 storagenode1
```
But we are getting similar results.
Also, kindly note that we are able to perform the mount operation if we use IPs instead of the domain name.
Could you please help us out with how we can mount Ceph using the FQDN?
Kindly let me know if any other information is required.
My ceph.conf configuration is as follows:
```
[global]
ms bind ipv6 = true
ms bind ipv4 = false
mon initial members = storagenode1,storagenode2,storagenode3
osd pool default crush rule = -1
fsid = 7969b8a3-1df7-4eae-8ccf-2e5794de87fe
mon host = [v2:[abcd:abcd:abcd::21]:3300,v1:[abcd:abcd:abcd::21]:6789],[v2:[abcd:abcd:abcd::22]:3300,v1:[abcd:abcd:abcd::22]:6789],[v2:[abcd:abcd:abcd::23]:3300,v1:[abcd:abcd:abcd::23]:6789]
public network = abcd:abcd:abcd::/64
cluster network = eff0:eff0:eff0::/64
[osd]
osd memory target = 4294967296
[client.rgw.storagenode1.rgw0]
host = storagenode1
keyring = /var/lib/ceph/radosgw/ceph-rgw.storagenode1.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-storagenode1.rgw0.log
rgw frontends = beast endpoint=[abcd:abcd:abcd::21]:8080
rgw thread pool size = 512
```
Thanks and Regards