Hello,
my cluster is currently showing a metadata imbalance. Normally, all OSDs
have around 23 GB of metadata (META column), but 4 out of 56 OSDs have 34 GB.
Compacting reduces the metadata for some OSDs, but not for others, and the
OSDs where the compaction worked quickly grow back to 34 GB.
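For reference (illustrative commands, not necessarily the exact ones used; osd.0 is just an example ID), the per-OSD usage check and a manual compaction look roughly like this:
ceph osd df tree              # per-OSD usage, including the OMAP and META columns
ceph daemon osd.0 compact     # online RocksDB compaction via the admin socket, run on the OSD's host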
Our cluster configuration:
* 8 nodes, each with 6 HDD OSDs and 1 SSD used for block.db and WAL
* k=4 m=2 EC
* v14.2.14
Normal OSD:
ID CLASS WEIGHT   REWEIGHT SIZE   RAW USE DATA    OMAP    META   AVAIL   %USE  VAR  PGS STATUS
40 hdd   11.09470  1.00000 11 TiB 8.6 TiB 8.4 TiB 1.3 GiB 23 GiB 2.5 TiB 77.15 1.01 130 up
Big OSD:
ID CLASS WEIGHT   REWEIGHT SIZE   RAW USE DATA    OMAP    META   AVAIL   %USE  VAR  PGS STATUS
 0 hdd   11.09499  1.00000 11 TiB 8.6 TiB 8.4 TiB 1.8 GiB 30 GiB 2.5 TiB 77.59 1.02 130 up
There are 56 OSDs in the cluster, 4 of which are bigger. These OSDs are all
on different hosts.
Why is that? Is it dangerous, or could it lead to problems such as
performance degradation?
Thanks,
Paul
Hi,
I'm still evaluating Ceph 15.2.5 in a lab, so the problem is not really hurting me, but I want to understand it and hopefully fix it; it's good practice. To test the resilience of the cluster I try to break it by doing all kinds of things. Today I powered off (clean shutdown) one OSD node and powered it back on. The last time I tried this there was no problem getting it back online: after a few minutes the cluster health was back to OK. This time it stayed degraded forever.
I checked and noticed that the service osd.0 on the OSD node was failing. So I googled it, and people recommended simply deleting the OSD and re-creating it. I tried that and still can't get the OSD back in service.
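For context, the failing OSD service can be checked on the node itself; a minimal sketch, assuming cephadm's ceph-<fsid>@osd.<id> systemd unit naming and the cluster fsid shown in the output below:
systemctl status ceph-d0920c36-2368-11eb-a5de-005056b703af@osd.0.service
cephadm logs --name osd.0     # fetch the logs of the containerized daemon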
First I removed the osd:
[root@gedasvl02 ~]# ceph osd out 0
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
osd.0 is already out.
[root@gedasvl02 ~]# ceph auth del 0
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
Error EINVAL: bad entity name
[root@gedasvl02 ~]# ceph auth del osd.0
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
updated
[root@gedasvl02 ~]# ceph osd rm 0
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
removed osd.0
[root@gedasvl02 ~]# ceph osd tree
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.43658 root default
-7 0.21829 host gedaopl01
2 ssd 0.21829 osd.2 up 1.00000 1.00000
-3 0 host gedaopl02
-5 0.21829 host gedaopl03
3 ssd 0.21829 osd.3 up 1.00000 1.00000
Looks OK, it's gone...
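As an aside, the three commands above can usually be collapsed into a single purge, which removes the OSD from the CRUSH map, deletes its auth entry and removes it from the OSD map in one step (sketch, assuming OSD ID 0):
ceph osd purge 0 --yes-i-really-mean-it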
Then I zapped it:
[root@gedasvl02 ~]# ceph orch device zap gedaopl02 /dev/sdb --force
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
INFO:cephadm:/usr/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
INFO:cephadm:/usr/bin/podman:stderr --> Zapping: /dev/sdb
INFO:cephadm:/usr/bin/podman:stderr --> Zapping lvm member /dev/sdb. lv_path is /dev/ceph-3bf1bb28-0858-4464-a848-d7f56319b40a/osd-block-3a79800d-2a19-45d8-a850-82c6a8113323
INFO:cephadm:/usr/bin/podman:stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/ceph-3bf1bb28-0858-4464-a848-d7f56319b40a/osd-block-3a79800d-2a19-45d8-a850-82c6a8113323 bs=1M count=10 conv=fsync
INFO:cephadm:/usr/bin/podman:stderr stderr: 10+0 records in
INFO:cephadm:/usr/bin/podman:stderr 10+0 records out
INFO:cephadm:/usr/bin/podman:stderr 10485760 bytes (10 MB, 10 MiB) copied, 0.0314447 s, 333 MB/s
INFO:cephadm:/usr/bin/podman:stderr stderr:
INFO:cephadm:/usr/bin/podman:stderr --> Only 1 LV left in VG, will proceed to destroy volume group ceph-3bf1bb28-0858-4464-a848-d7f56319b40a
INFO:cephadm:/usr/bin/podman:stderr Running command: /usr/sbin/vgremove -v -f ceph-3bf1bb28-0858-4464-a848-d7f56319b40a
INFO:cephadm:/usr/bin/podman:stderr stderr: Removing ceph--3bf1bb28--0858--4464--a848--d7f56319b40a-osd--block--3a79800d--2a19--45d8--a850--82c6a8113323 (253:0)
INFO:cephadm:/usr/bin/podman:stderr stderr: Archiving volume group "ceph-3bf1bb28-0858-4464-a848-d7f56319b40a" metadata (seqno 5).
INFO:cephadm:/usr/bin/podman:stderr stderr: Releasing logical volume "osd-block-3a79800d-2a19-45d8-a850-82c6a8113323"
INFO:cephadm:/usr/bin/podman:stderr stderr: Creating volume group backup "/etc/lvm/backup/ceph-3bf1bb28-0858-4464-a848-d7f56319b40a" (seqno 6).
INFO:cephadm:/usr/bin/podman:stderr stdout: Logical volume "osd-block-3a79800d-2a19-45d8-a850-82c6a8113323" successfully removed
INFO:cephadm:/usr/bin/podman:stderr stderr: Removing physical volume "/dev/sdb" from volume group "ceph-3bf1bb28-0858-4464-a848-d7f56319b40a"
INFO:cephadm:/usr/bin/podman:stderr stdout: Volume group "ceph-3bf1bb28-0858-4464-a848-d7f56319b40a" successfully removed
INFO:cephadm:/usr/bin/podman:stderr Running command: /usr/bin/dd if=/dev/zero of=/dev/sdb bs=1M count=10 conv=fsync
INFO:cephadm:/usr/bin/podman:stderr stderr: 10+0 records in
INFO:cephadm:/usr/bin/podman:stderr 10+0 records out
INFO:cephadm:/usr/bin/podman:stderr stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0355641 s, 295 MB/s
INFO:cephadm:/usr/bin/podman:stderr --> Zapping successful for: <Raw Device: /dev/sdb>
And re-added it:
[root@gedasvl02 ~]# ceph orch daemon add osd gedaopl02:/dev/sdb
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
Created osd(s) 0 on host 'gedaopl02'
But the osd is still out...
[root@gedasvl02 ~]# ceph osd tree
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.43658 root default
-7 0.21829 host gedaopl01
2 ssd 0.21829 osd.2 up 1.00000 1.00000
-3 0 host gedaopl02
-5 0.21829 host gedaopl03
3 ssd 0.21829 osd.3 up 1.00000 1.00000
0 0 osd.0 down 0 1.00000
Looking at the cluster log in the web UI, I see the following error:
Failed to apply osd.dashboard-admin-1606745745154 spec DriveGroupSpec(name=dashboard-admin-1606745745154->placement=PlacementSpec(host_pattern='*'), service_id='dashboard-admin-1606745745154', service_type='osd', data_devices=DeviceSelection(size='223.6GB', rotational=False, all=False), osd_id_claims={}, unmanaged=False, filter_logic='AND', preview_only=False): No filters applied
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2108, in _apply_all_services
    if self._apply_service(spec):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2005, in _apply_service
    self.osd_service.create_from_spec(cast(DriveGroupSpec, spec))
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 43, in create_from_spec
    ret = create_from_spec_one(self.prepare_drivegroup(drive_group))
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 127, in prepare_drivegroup
    drive_selection = DriveSelection(drive_group, inventory_for_host)
  File "/lib/python3.6/site-packages/ceph/deployment/drive_selection/selector.py", line 32, in __init__
    self._data = self.assign_devices(self.spec.data_devices)
  File "/lib/python3.6/site-packages/ceph/deployment/drive_selection/selector.py", line 138, in assign_devices
    if not all(m.compare(disk) for m in FilterGenerator(device_filter)):
  File "/lib/python3.6/site-packages/ceph/deployment/drive_selection/selector.py", line 138, in <genexpr>
    if not all(m.compare(disk) for m in FilterGenerator(device_filter)):
  File "/lib/python3.6/site-packages/ceph/deployment/drive_selection/matchers.py", line 410, in compare
    raise Exception("No filters applied")
Exception: No filters applied
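For reference, the OSD service specs the orchestrator is trying to apply (including the dashboard-created one from the traceback) and the daemon states can be inspected roughly like this; --export may not be available on every minor release:
ceph orch ls osd              # list OSD service specs and their status
ceph orch ls osd --export     # dump the specs as YAML, if supported
ceph orch ps                  # cephadm-managed daemons and their reported state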
I also have a "pgs undersized" warning; maybe this is causing trouble too?
[root@gedasvl02 ~]# ceph -s
INFO:cephadm:Inferring fsid d0920c36-2368-11eb-a5de-005056b703af
INFO:cephadm:Inferring config /var/lib/ceph/d0920c36-2368-11eb-a5de-005056b703af/mon.gedasvl02/config
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
  cluster:
    id:     d0920c36-2368-11eb-a5de-005056b703af
    health: HEALTH_WARN
            Degraded data redundancy: 13142/39426 objects degraded (33.333%), 176 pgs degraded, 225 pgs undersized

  services:
    mon: 1 daemons, quorum gedasvl02 (age 2w)
    mgr: gedasvl02.vqswxg(active, since 2w), standbys: gedaopl02.yrwzqh
    mds: cephfs:1 {0=cephfs.gedaopl01.zjuhem=up:active} 1 up:standby
    osd: 3 osds: 2 up (since 4d), 2 in (since 94m)

  task status:
    scrub status:
        mds.cephfs.gedaopl01.zjuhem: idle

  data:
    pools:   7 pools, 225 pgs
    objects: 13.14k objects, 77 GiB
    usage:   148 GiB used, 299 GiB / 447 GiB avail
    pgs:     13142/39426 objects degraded (33.333%)
             176 active+undersized+degraded
             49  active+undersized

  io:
    client:   0 B/s rd, 6.1 KiB/s wr, 0 op/s rd, 0 op/s wr
Best Regards,
Oliver
This is the 7th backport release in the Octopus series. This release fixes
a serious bug in RGW that has been shown to cause data loss when a read of
a large RGW object (i.e., one with at least one tail segment) takes longer than
one half the time specified in the configuration option `rgw_gc_obj_min_wait`.
The bug causes the tail segments of that read object to be added to the RGW
garbage collection queue, which will in turn cause them to be deleted after
a period of time.
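For operators assessing their exposure, the deferral window and the RGW garbage-collection queue can be inspected roughly as follows (illustrative commands; adjust the config target, e.g. client.rgw.<instance>, to your deployment):
ceph config get client.rgw rgw_gc_obj_min_wait
radosgw-admin gc list --include-all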
Changelog
---------
* rgw: during GC defer, prevent new GC enqueue
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-15.2.7.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 88e41c6c49beb18add4fdb6b4326ca466d931db8