Hi,
Any ideas for resolving an issue where an OSD crashes on start-up?
I have one (large HDD) OSD that will no longer start; it crashes while loading PGs. See the attached log file; an excerpt is below:
2019-08-02 10:08:21.021207 7fea86d7be00 0 osd.1 1844 load_pgs
2019-08-02 10:08:39.370112 7fea86d7be00 -1 *** Caught signal (Aborted) **
in thread 7fea86d7be00 thread_name:ceph-osd
ceph version 12.2.12 (39cfebf25a7011204a9876d2950e4b28aba66d11) luminous (stable)
1: (()+0xa59c94) [0x55b835a6dc94]
2: (()+0x110e0) [0x7fea843800e0]
3: (gsignal()+0xcf) [0x7fea83347fff]
4: (abort()+0x16a) [0x7fea8334942a]
5: (__gnu_cxx::__verbose_terminate_handler()+0x15d) [0x7fea83c600ad]
6: (()+0x8f066) [0x7fea83c5e066]
7: (()+0x8f0b1) [0x7fea83c5e0b1]
8: (()+0x8f2c9) [0x7fea83c5e2c9]
9: (pg_log_entry_t::decode_with_checksum(ceph::buffer::list::iterator&)+0x156) [0x55b8356f57c6]
10: (void PGLog::read_log_and_missing<pg_missing_set<true> >(ObjectStore*, coll_t, coll_t, ghobject_t, pg_info_t const&, PGLog::IndexedLog&, pg_missing_set<true>&, bool, std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&, bool, bool*, DoutPrefixProvider const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*, bool)+0x1ab4) [0x55b8355a6584]
11: (PG::read_state(ObjectStore*, ceph::buffer::list&)+0x38b) [0x55b83554b7eb]
12: (OSD::load_pgs()+0x8b8) [0x55b835496678]
13: (OSD::init()+0x2237) [0x55b8354b75c7]
14: (main()+0x3092) [0x55b8353bf1c2]
15: (__libc_start_main()+0xf1) [0x7fea833352e1]
16: (_start()+0x2a) [0x55b83544b8ca]
I have ensured that kernel.pid_max is set to a high value – sysctl reports kernel.pid_max = 4194304
This issue arose following an expansion of the ceph cluster: https://forum.proxmox.com/threads/unable-to-start-osd-crashes-while-loading…
In summary: I added a third node with extra OSDs and increased pg_num and pgp_num for one pool before the cluster had settled. The cluster has since settled, and I no longer have the global setting mon_max_pg_per_osd = 1000.
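For completeness, while the expansion was in progress that setting was applied roughly like this (it has since been removed); I believe it lived in the [global] section of the Proxmox-managed ceph.conf:

[global]
    # temporary override used only while the new PGs were being created
    mon_max_pg_per_osd = 1000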
Only the issue with the OSD that will not start remains.
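In case it is useful for diagnosis, my understanding is that the PGs on the stopped OSD can still be inspected offline with ceph-objectstore-tool; a rough sketch, assuming BlueStore and the default data path for osd.1 (the pgid below is just a placeholder):

# list the PGs stored on the down OSD (the ceph-osd daemon must be stopped)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --op list-pgs

# dump the on-disk log of one suspect PG
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-1 --pgid 2.1a --op log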
Best regards,
Jesper Stemann Andersen
Lead Software Engineer, R&D
IHP Systems A/S
+45 26 25 23 91
Hi there,
Sorry for asking a question that may be very basic and has been asked
many times before, but a lot of Googling has not given me a satisfying answer.
The question is about the RBD cache in write-back mode when using KVM/libvirt. If we
enable this, the local KVM host's RAM is used as a cache for the VM's write requests,
and the KVM host immediately tells the VM's OS that the data has been written to
disk, even though it is actually not on the OSDs yet. How can this be safe against a
power failure?
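To make the question concrete, by "enable this" I mean a client-side setting along these lines (a sketch of what I believe is the relevant option):

[client]
    # enables the librbd write-back cache in the KVM host's RAM
    rbd cache = true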
Is my understanding correct? If not, please correct me. This is very important
for me. Thank you very much in advance.
Best regards.
Muhammad Junaid
We've been using Ceph on Debian for several years now, and would really like to see official packages for buster.
We recently resorted to building our own Luminous packages for buster (based on the Ubuntu packages), because:
* We've already upgraded our clusters to 12.2.12, and the debian.org packages offer 12.2.11.
* The official ceph packages for stretch are built against libcurl3, which has been removed in buster in favor of libcurl4.
Sage stated last year that Luminous and Nautilus (and possibly Mimic) packages for buster should be provided at some point (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-June/027478.html), and we really hope that this decision hasn't been reverted - a lot of Debian users are looking forward to being able to upgrade their clusters to a newer release.
We also believe having official Luminous packages will allow most Debian users who've kept their systems up-to-date to have a clean upgrade path.
We are aware of croit's Nautilus packages for buster and have played around with them (thanks!), but would rather go with official packages in our production clusters, preferably built against python3 (for bonus points ;) ). We're also undecided about whether we should go straight to Nautilus, mainly because Mimic seems like the more stable choice for the time being.
We'd be happy and willing to assist in any way we can.
Cheers, Yuval
--
Yuval Freund
System Engineer
1&1 IONOS Cloud GmbH | Greifswalder Str. 207 | 10405 Berlin | Germany
Web: www.ionos.de
Head Office: Berlin, Germany
District Court Berlin Charlottenburg, Registration number: HRB 125506 B
Executive Management: Christoph Steffens, Matthias Steinberg, Achim Weiss
Member of United Internet
Hi guys:
I had set up a Ceph cluster and mounted an RBD image on one machine. I then deleted the Ceph cluster and reinstalled it following the manual,
but I still have RBD devices mapped on my machine, and I cannot access the mount point.
Here is the detailed info. I want to delete all of the old RBD devices; what should I do?
node1 $> rbd device list
id pool namespace image snap device
0 rbd foo - /dev/rbd0
1 kube kubernetes-dynamic-pvc-1cc43c5b-ade1-11e9-9a92-863e3c12afd1 - /dev/rbd1
node1 $> df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/rootvg-lv_root 10995712 2528388 8467324 23% /
devtmpfs 2012656 0 2012656 0% /dev
tmpfs 2023588 0 2023588 0% /dev/shm
tmpfs 2023588 207340 1816248 11% /run
tmpfs 2023588 0 2023588 0% /sys/fs/cgroup
/dev/sda1 520868 116936 403932 23% /boot
/dev/mapper/rootvg-lv_var 5232640 3226816 2005824 62% /var
/dev/mapper/rootvg-lv_tmp 5232640 33060 5199580 1% /tmp
/dev/rbd0 3997376 16392 3754888 1% /mnt
tmpfs 404720 0 404720 0% /run/user/1001
node1 $> rbd trash list
rbd: error opening default pool 'rbd'
Ensure that the default pool has been created or specify an alternate pool name.
node1 $> rbd info rbd/foo
rbd: error opening default pool 'rbd'
Ensure that the default pool has been created or specify an alternate pool name.
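Given the errors above (the old cluster is gone, so the rbd CLI cannot reach any pool), I assume what I need is something like the following, but please correct me (only a sketch; it assumes the data on /dev/rbd0 and /dev/rbd1 is no longer needed):

# the mount point hangs because the cluster behind it no longer exists,
# so lazily unmount it first
umount -l /mnt
# then force-unmap the stale krbd devices
rbd device unmap -o force /dev/rbd0
rbd device unmap -o force /dev/rbd1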
Hi All,
Just a reminder, there's only a few days left to submit talks for this
most excellent conference; the CFP is open until Sunday 28 July Anywhere
on Earth.
(I've submitted a Data Storage miniconf day, fingers crossed...)
Regards,
Tim
On 6/26/19 2:09 PM, Tim Serong wrote:
> Here we go again! As usual the conference theme is intended to
> inspire, not to restrict; talks on any topic in the world of free and
> open source software, hardware, etc. are most welcome, and Ceph talks
> definitely fit.
>
> I've added this to https://pad.ceph.com/p/cfp-coordination as well.
>
> -------- Forwarded Message --------
> Subject: [lca-announce] linux.conf.au 2020 - Call for Sessions and
> Miniconfs now open!
> Date: Tue, 25 Jun 2019 21:19:43 +1000
> From: linux.conf.au Announcements <lca-announce(a)lists.linux.org.au>
> Reply-To: lca-announce(a)lists.linux.org.au
> To: lca-announce(a)lists.linux.org.au
>
>
> The linux.conf.au 2020 organising team is excited to announce that the
> linux.conf.au 2020 Call for Sessions and Call for Miniconfs are now open!
> These will stay open from now until Sunday 28 July Anywhere on Earth
> (AoE) (https://en.wikipedia.org/wiki/Anywhere_on_Earth).
>
> Our theme for linux.conf.au 2020 is "Who's Watching", focusing on
> security, privacy and ethics.
> As big data and IoT-connected devices become more pervasive, it's no
> surprise that we're more concerned about privacy and security than ever
> before.
> We've set our sights on how open source could play a role in maximising
> security and protecting our privacy in times of uncertainty.
> With the concept of privacy continuing to blur, open source could be the
> solution to give us '2020 vision'.
>
> Call for Sessions
>
> Would you like to talk in the main conference of linux.conf.au 2020?
> The main conference runs from Wednesday to Friday, with multiple streams
> catering for a wide range of interest areas.
> We welcome you to submit a session
> (https://linux.conf.au/programme/sessions/) proposal for either a talk
> or tutorial now.
>
> Call for Miniconfs
>
> Miniconfs are dedicated day-long streams focusing on single topics,
> creating a more immersive experience for delegates than a session.
> Miniconfs are run on the first two days of the conference before the
> main conference commences on Wednesday.
> If you would like to organise a miniconf
> (https://linux.conf.au/programme/miniconfs/) at linux.conf.au, we want
> to hear from you.
>
> Have we got you interested?
>
> You can find out how to submit your session or miniconf proposals at
> https://linux.conf.au/programme/proposals/.
> If you have any other questions you can contact us via email at
> contact(a)lca2020.linux.org.au.
>
> We are looking forward to reading your submissions.
>
> linux.conf.au 2020 Organising Team
>
>
> ---
> Read this online at
> https://lca2020.linux.org.au/news/call-for-sessions-miniconfs-now-open/
> _______________________________________________
> lca-announce mailing list
> lca-announce(a)lists.linux.org.au
> http://lists.linux.org.au/mailman/listinfo/lca-announce
>
>
Hello everybody!
I tested the performance of a Ceph cluster using different osd_op_num_shards and osd_op_num_threads_per_shard configurations, and I found that using multiple threads (with a single shard) gives the same performance improvement as using multiple shards. However, the Ceph OSD config documentation (http://docs.ceph.com/docs/master/rados/configuration/osd-config-ref/…) says that using a lower shard number may have deleterious effects. I want to know what those deleterious effects are.
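For concreteness, the two kinds of configuration I compared look roughly like this in ceph.conf (the numbers are only illustrative, not my exact test values):

# variant A: multiple shards, one thread per shard
[osd]
    osd_op_num_shards = 8
    osd_op_num_threads_per_shard = 1

# variant B: a single shard with the same total number of threads
#    osd_op_num_shards = 1
#    osd_op_num_threads_per_shard = 8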
Thanks for the assistance!
Fibird
chaoyanglius(a)gmail.com
Hi,
I have an existing Luminous installation: 3 nodes, each with 8x 4TB HDDs and one 200GB SSD that was previously used as a journal. On a default Luminous installation via ceph-deploy, I forgot to prepare the OSDs with the WAL and DB on the separate SSD. The environment is running and in production, and I want to reconfigure it to use the SSD as the WAL device, or maybe for the DB as well, but since it is in production I am hesitant because it may cause problems along the way. What should I do to reconfigure it without downtime, or if downtime is unavoidable, with as little as possible? My OSD tree is below, followed by a sketch of the kind of per-OSD procedure I assume would be needed.
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 101.86957 root default
-3 29.10559 host ceph01
0 hdd 3.63820 osd.0 up 1.00000 1.00000
1 hdd 3.63820 osd.1 up 1.00000 1.00000
2 hdd 3.63820 osd.2 up 1.00000 1.00000
3 hdd 3.63820 osd.3 up 1.00000 1.00000
4 hdd 3.63820 osd.4 up 1.00000 1.00000
5 hdd 3.63820 osd.5 up 1.00000 1.00000
6 hdd 3.63820 osd.6 up 1.00000 1.00000
7 hdd 3.63820 osd.7 up 1.00000 1.00000
-5 29.10559 host ceph02
8 hdd 3.63820 osd.8 up 1.00000 1.00000
9 hdd 3.63820 osd.9 up 1.00000 1.00000
10 hdd 3.63820 osd.10 up 1.00000 1.00000
11 hdd 3.63820 osd.11 up 1.00000 1.00000
12 hdd 3.63820 osd.12 up 1.00000 1.00000
13 hdd 3.63820 osd.13 up 1.00000 1.00000
14 hdd 3.63820 osd.14 up 1.00000 1.00000
15 hdd 3.63820 osd.15 up 1.00000 1.00000
-7 29.10559 host ceph03
16 hdd 3.63820 osd.16 up 1.00000 1.00000
17 hdd 3.63820 osd.17 up 1.00000 1.00000
18 hdd 3.63820 osd.18 up 1.00000 1.00000
19 hdd 3.63820 osd.19 up 1.00000 1.00000
20 hdd 3.63820 osd.20 up 1.00000 1.00000
21 hdd 3.63820 osd.21 up 1.00000 1.00000
22 hdd 3.63820 osd.22 up 1.00000 1.00000
23 hdd 3.63820 osd.23 up 1.00000 1.00000
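To show what I mean, I assume reconfiguring would have to be done one OSD at a time, roughly like this (only a sketch; the OSD id, HDD device and SSD partition names are placeholders):

# drain one OSD and wait for its data to move elsewhere
ceph osd out 0
# ... wait until the cluster is healthy again, then ...
systemctl stop ceph-osd@0
ceph osd purge 0 --yes-i-really-mean-it

# wipe the HDD and recreate the OSD with its DB (and therefore WAL) on the SSD
ceph-volume lvm zap /dev/sdb --destroy
ceph-volume lvm create --data /dev/sdb --block.db /dev/sdj1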
- Vlad
Hi!
I am trying to set up a Ceph cluster on existing Ubuntu boxes (I'll purchase them from a VSP).
How can I create a cluster from them using an existing folder like "/cephfs" on my
Ubuntu box?
Thanks,
Konstantin