I have an issue with ceph-iscsi (Ubuntu 20.04 LTS and Ceph 15.2.6): after I restart rbd-target-api, it fails and does not start again:
```
sudo systemctl status rbd-target-api.service
● rbd-target-api.service - Ceph iscsi target configuration API
Loaded: loaded (/lib/systemd/system/rbd-target-api.service; enabled; vendor preset: enabled)
Active: deactivating (stop-sigterm) since Sat 2020-11-28 17:01:40 +0330; 20s ago
Main PID: 37651 (rbd-target-api)
Tasks: 55 (limit: 9451)
Memory: 141.4M
CGroup: /system.slice/rbd-target-api.service
├─15289 /usr/bin/python3 /usr/bin/rbd-target-api
└─37651 /usr/bin/python3 /usr/bin/rbd-target-api
Nov 28 14:36:53 dev11 systemd[1]: Started Ceph iscsi target configuration API.
Nov 28 14:36:54 dev11 rbd-target-api[37651]: Started the configuration object watcher
Nov 28 14:36:54 dev11 rbd-target-api[37651]: Processing osd blacklist entries for this node
Nov 28 14:36:54 dev11 rbd-target-api[37651]: Checking for config object changes every 1s
Nov 28 14:36:55 dev11 rbd-target-api[37651]: Reading the configuration object to update local LIO configuration
Nov 28 14:36:55 dev11 rbd-target-api[37651]: Processing Gateway configuration
Nov 28 14:36:55 dev11 rbd-target-api[37651]: Setting up iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 14:36:55 dev11 rbd-target-api[37651]: (Gateway.load_config) successfully loaded existing target definition
Nov 28 17:01:40 dev11 systemd[1]: Stopping Ceph iscsi target configuration API...
```
journalctl:
```
Nov 28 17:00:01 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:01 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:04 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:04 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [172.16.1.3:57002] [GET] [500] [45.074s] [admin] [513.0B] /api/health/minimal
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "68eed46b-3ece-4e60-bc17-a172358f2d76"} ']
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [172.16.1.3:60128] [GET] [500] [45.070s] [admin] [513.0B] /api/health/minimal
Nov 28 17:00:06 dev11 ceph-mgr[3184]: [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "5b6fdaa2-dc70-48a7-b01f-ca554ecfec41"} ']
Nov 28 17:00:07 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:07 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:11 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:11 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:11 dev11 ceph-mgr[3184]: ::ffff:127.0.0.1 - - [28/Nov/2020:17:00:11] "GET /metrics HTTP/1.1" 200 151419 "" "Prometheus/2.7.2"
Nov 28 17:00:14 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:14 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:17 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:17 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:20 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:20 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:22 dev11 ceph-mgr[3184]: [172.16.1.3:59834] [GET] [500] [45.062s] [admin] [513.0B] /api/health/minimal
Nov 28 17:00:22 dev11 ceph-mgr[3184]: [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "1ba61331-1dfd-43e7-8ced-9f28aeb8a39c"} ']
Nov 28 17:00:23 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:23 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:26 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:26 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:26 dev11 ceph-mgr[3184]: ::ffff:127.0.0.1 - - [28/Nov/2020:17:00:26] "GET /metrics HTTP/1.1" 200 151420 "" "Prometheus/2.7.2"
Nov 28 17:00:27 dev11 ceph-mgr[3184]: [172.16.1.3:60132] [GET] [500] [45.081s] [admin] [513.0B] /api/health/minimal
Nov 28 17:00:27 dev11 ceph-mgr[3184]: [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "9c1dd49b-07fb-4c49-a033-f5d8d82d9cbe"} ']
Nov 28 17:00:29 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:29 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:32 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:32 dev11 kernel: iSCSI Login negotiation failed.
Nov 28 17:00:35 dev11 kernel: Unable to locate Target Portal Group on iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
Nov 28 17:00:35 dev11 kernel: iSCSI Login negotiation failed.
```
What should I do to fix this problem?
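One thing I notice is that the status output lists two rbd-target-api processes in the cgroup (15289 and 37651), so maybe an old instance is stuck and blocking the restart. Would something like this be safe to try? (Just a sketch, I have not run it yet.)
```
# Check whether an old rbd-target-api instance is still hanging around
ps -ef | grep '[r]bd-target-api'

# If so, force systemd to kill the stuck unit and start it fresh
sudo systemctl kill --signal=SIGKILL rbd-target-api.service
sudo systemctl reset-failed rbd-target-api.service
sudo systemctl start rbd-target-api.service
```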
Hi all,
Since the Ceph Dashboard RESTful API
<https://docs.ceph.com/en/latest/mgr/ceph_api/> is becoming the official
RESTful API for Ceph (starting with the Pacific release), the proposal is to
mark the RESTful module <https://docs.ceph.com/en/latest/mgr/restful/> as
deprecated in Pacific and remove it in the Q release.
You may find a detailed feature-gap analysis in this tracker issue
<https://tracker.ceph.com/issues/47066>. We'd like to hear from existing
users of the RESTful module about their concerns and suggestions regarding
this proposal.
Thank you!
Kind regards,
Ernesto
On Fri, 27 Nov 2020 at 23:21, Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
> Is there a best practice or guide for backuping rbd images?
>
One would think that most things that apply to an iSCSI-mounted device
would be equally valid for an RBD mount, so you might look there for
hints and tips on how to back up remote network block devices, if you are
seeing this from the mounting client's point of view.
If not, it probably matters what you are aiming for, since "backup" is quite
a wide concept beyond "copy of my data".
Is the goal storing sparse images to conserve backup space, quick restores,
validated images, legal archiving demands, or "want to try this weird update
and be able to move back 30 minutes"?
The solution will be quite different depending on what the problem is, more
than on what the mountpoint type is.
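If the goal is just "copy of my data" plus point-in-time restores, one concrete thing to look at is rbd's own snapshot-diff export. A rough sketch (pool, image, and snapshot names below are made up, so treat it as untested):
```
# Take a point-in-time snapshot of the image
rbd snap create mypool/myimage@backup-2020-11-28

# Full export of that snapshot to a file
rbd export mypool/myimage@backup-2020-11-28 /backups/myimage-full.img

# Later: export only the changes between two snapshots
rbd snap create mypool/myimage@backup-2020-12-28
rbd export-diff --from-snap backup-2020-11-28 \
    mypool/myimage@backup-2020-12-28 /backups/myimage-inc.diff

# Restore: import the full image, recreate the base snapshot, apply diffs
rbd import /backups/myimage-full.img mypool/restored
rbd snap create mypool/restored@backup-2020-11-28
rbd import-diff /backups/myimage-inc.diff mypool/restored
```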
--
May the most significant bit of your life be positive.
Thank you for the response. How can I upload this back into the metadata? Is
this operation safe?
Regards
Mateusz Skała
On Sat, 21.11.2020 at 18:01, Amit Ghadge <amitg.b14(a)gmail.com>
wrote:
> I went through this; you need to update the bucket metadata: radosgw-admin
> metadata get bucket.instance:bucket:xxx > bucket.json, then update two
> parameters. I don't remember them exactly, but it looks like setting
> reshard to false and next_marker to empty.
>
> -AmitG
> On Sat, 21 Nov, 2020, 2:04 PM Mateusz Skała, <mateusz.skala(a)gmail.com>
> wrote:
>
>> Hello Community.
>> I need your help. A few days ago I started manually resharding one bucket
>> with large objects. Unfortunately I interrupted it with Ctrl+C, and now I
>> can't start the process again.
>> I get this message:
>> # radosgw-admin bucket reshard --bucket objects --num-shards 2
>> ERROR: the bucket is currently undergoing resharding and cannot be added
>> to the reshard list at this time
>>
>> But the list of reshard processes is empty:
>> # radosgw-admin reshard list
>> []
>>
>> # radosgw-admin reshard status --bucket objects
>> [
>> {
>> "reshard_status": "not-resharding",
>> "new_bucket_instance_id": "",
>> "num_shards": -1
>> }
>> ]
>>
>> How can I fix this situation? How can I restore the ability to reshard
>> this bucket?
>> And by the way, does the resharding process lock writes/reads on the bucket?
>> Regards
>> Mateusz Skała
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>
>
Hi,
Sorry to bother you all.
It’s a home server setup.
Three nodes (ODROID-H2+ with 32GB RAM and dual 2.5Gbit NICs), two 14TB
7200rpm SATA drives and an Optane 118GB NVMe in each node (OS boots from
eMMC).
Only CephFS; I'm anticipating having 50-200K files when the 50TB (4+2 EC)
is full.
I'm trying to address the issue of really big OSDs; not my words, but a
Redditor's:
"When you write an object to a drive with collocated db and raw space, the
disk has to read/write to both sections before acking a write. That's a lot
to ask a 7200 disk to handle gracefully. I believe Red Hat only supports up
to 8TB because of performance concerns with larger disks. I may be wrong,
but once you are shuffling through 6-10TB of data I'd think those disks are
gonna be bogged down in seek time."
So I want to have my two DBs on the Optane to avoid the above; am I making
sense?
OK, so large files have lower metadata overhead than small files; since this
is for a media library, that probably means very low overhead. One guy I
spoke to had a similar setup, and for 48TB used he had a 2.6GB DB?
Is there a rough CephFS calculation (each file uses x bytes of metadata)? I
think I should be safe with 30GB, but now I read I should double that (you
should allocate twice the size of the biggest layer to allow for
compaction); however, I only have 118GB and two OSDs, so I will have to go
for 59GB each (or whatever will fit)?
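(Back-of-the-envelope, with numbers I'm only guessing at: 50TB at the default
4MB object size is roughly 12.5 million RADOS objects; if each one costs a few
hundred bytes of DB metadata, that is on the order of single-digit GB, which
would line up with the 2.6GB figure above.)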
I'm thinking that I might not even use 3GB, but to be safe I'll make it
30GB; let's say it settles at 5GB, then when compaction comes I will only
need 20GB and therefore have no spillover?
I realise that if the Optane dies, both OSDs go with it; do I have to
configure anything special for that (or would CRUSH just handle it)?
Given that this is a home deployment, will Ceph be OK with an occasional
power outage? I mean, if a Blu-ray mkv gets corrupted, ripping it again is an
easy fix; also, I will back up the cluster's data once a month.
Below are the commands I got from the website:
Create the volume groups:
$ vgcreate ceph-block-0 /dev/sda
$ vgcreate ceph-block-1 /dev/sdb
Create the logical volumes:
$ lvcreate -l 100%FREE -n block-0 ceph-block-0
$ lvcreate -l 100%FREE -n block-1 ceph-block-1
Create the DB logical volumes (118GB Optane):
$ vgcreate ceph-db-0 /dev/sdc
$ lvcreate -L 59GB -n db-0 ceph-db-0
$ lvcreate -L 59GB -n db-1 ceph-db-0
Create the OSDs:
$ ceph-volume lvm create --bluestore --data ceph-block-0/block-0 \
    --block.db ceph-db-0/db-0
$ ceph-volume lvm create --bluestore --data ceph-block-1/block-1 \
    --block.db ceph-db-0/db-1
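After creating them, I assume I can watch for the DB spilling over onto the spinning disks with something like this (untested, and I'm assuming the release is new enough to have the BLUEFS_SPILLOVER health warning):
```
# Cluster-wide: spillover shows up as a BLUEFS_SPILLOVER health warning
ceph health detail | grep -i spillover

# Per OSD, on the OSD host: non-zero slow_used_bytes means the DB has
# overflowed onto the data disk
ceph daemon osd.0 perf dump bluefs | grep -E '"db_used_bytes"|"slow_used_bytes"'
```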
Thanks.
Richard
Hello,
In our environment we have a user with a leading whitespace in its UID. I don't know how it was created; however, I am unable to GET or DELETE it using either `radosgw-admin` or the Admin API:
# radosgw-admin user list | grep rgw
" rgw-prometheus",
"rgw-prometheus",
When I try to get the user's info directly, I get:
# radosgw-admin user info --uid=" rgw-prometheus"
could not fetch user info: no user info saved
I’ve also tried the Admin API, URL-encoding the whitespace as %20, and I get “Invalid Argument”.
This user is not important in any way, however it creates issues trying to monitor the rgw usage logs using https://github.com/blemmenes/radosgw_usage_exporter
I’ve considered modifying the script to ignore that user, but clearly Ceph is having trouble addressing it as well, so I figured I’d try to get to the bottom of how to remove this user.
We are currently running 12.2.11; however, this cluster was built on 12.2.5, and I’m 99% certain the user was created under 12.2.5.
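In case it helps, here is what I am planning to try next (untested; the pool and namespace names assume the default zone layout, and I am not sure the quoting preserves the leading space all the way through):
```
# See how the bad UID is actually stored at the RADOS level
# (pool/namespace assume the default zone layout)
rados -p default.rgw.meta --namespace=users.uid ls | grep -i prometheus

# Then try the generic metadata interface instead of 'user rm',
# quoting the key so the leading space survives
radosgw-admin metadata rm 'user: rgw-prometheus'
```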
Thanks,
Ben
Hi all,
Does this project work with the latest zipkin apis?
https://github.com/ceph/babeltrace-zipkin
Also, what do you prefer for tracing requests for rgw and rbd in Ceph?
Thanks.
Hi all,
In reference to:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/Y2KTC7RXQYW…
We are seeing similar behavior with public Swift bucket access being broken.
In this case RadosGW Nautilus integrated to OpenStack Queens Keystone.
Public Swift containers have worked fine from the Luminous era up to Nautilus
14.2.11, and started to break when upgrading RadosGW to 14.2.12 or newer.
We are unsure whether this is related to the backport of "rgw: Swift API
anonymous access should 401" (pr#37438) or to some other rgw change in 14.2.12.
I believe the following ceph.conf we use is relevant:
rgw_swift_account_in_url = true
rgw_keystone_implicit_tenants = false
As well as the configured endpoint format:
https://fqdn:443/swift/v1/AUTH_%(tenant_id)s
Steps to reproduce:
Horizon:
--------
1) Public container access
- Create a container with "Container Access" set to Public
- Click on the Horizon provided Link which is of the format https://fqdn/swift/v1/AUTH_projectUUID/public-test-container/
Expected result: Empty bucket listing
Actual result: "AccessDenied"
2) Public object access
- Upload an object to the public container
- Try to access the object via unauthenticated browser session
Expected result: Object downloaded or loaded into browser
Actual result: "NoSuchBucket"
From what I can see, we get similar behavior with the Swift CLI tools
(ACL '.r:*').
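For reference, the CLI reproduction is roughly this (container and object names below are made up; the read ACL is the same one Horizon sets):
```
# Make the container world-readable
swift post -r '.r:*' public-test-container
swift stat public-test-container    # confirm the Read ACL is .r:*

# Then from an unauthenticated session:
curl -i "https://fqdn/swift/v1/AUTH_<tenant_id>/public-test-container/some-object"
# 14.2.11 returns the object; 14.2.12+ gives us "NoSuchBucket"
```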
Any suggestions on how to troubleshoot further?
We are happy to provide more debug logs and configuration details if need be,
and to take pointers if something is actually wrong in our configuration.
Also, apologies for the possible double post - I tried to first submit via the
hyperkitty web form but that post seems to have gone into a black hole.
BR,
Jukka
Hi.
Thank you, I will try this solution, probably on Sunday, and will write the results here.
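For my own notes, the full sequence as I understand it is below (untested on my side yet; the instance id is a placeholder):
```
# 1) Find both instance entries for the bucket; note the older (stuck) one
radosgw-admin metadata list bucket.instance | grep objects

# 2) Dump the stuck instance's metadata
radosgw-admin metadata get bucket.instance:objects:<instance_id> > bucket.json

# 3) Edit bucket.json: set "reshard_status" to 0 and "new_bucket_instance_id" to ""

# 4) Write it back
radosgw-admin metadata put bucket.instance:objects:<instance_id> < bucket.json

# 5) Resharding should then work again
radosgw-admin bucket reshard --bucket objects --num-shards 2
```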
Regards
Mateusz Skała
> On 24.11.2020 at 11:12, Amit Ghadge <amitg.b14(a)gmail.com> wrote:
>
> Sorry for the delayed reply. I never tried this in production, but after reverting the changes I was able to reshard again.
> You will see two entries for the same bucket in the metadata:
> radosgw-admin metadata list bucket.instance | grep bucket
> Once you know which is the older bucket entry, update these two parameters first:
> radosgw-admin metadata get bucket.instance:bucket:<id> > bucket.json
> Set reshard_status to 0 and new_bucket_instance_id to ""
> Then update the bucket instance with: radosgw-admin metadata put bucket.instance:bucket:<id> < bucket.json