Hey Wido,
We upgraded a 550-osd cluster from 14.2.4 to 14.2.6 and everything seems to
be working fine. Here's top:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
1432693 ceph 20 0 3246580 2.0g 18260 S 78.4 13.9 2760:58
ceph-mgr
2075038 ceph 20 0 2235072 1.1g 16408 S 11.6 7.6 176:15.30
ceph-mon
And the balancer is quick:
# ceph balancer status
{
"last_optimize_duration": "0:00:02.806449",
"plans": [],
"mode": "upmap",
"active": true,
"optimize_result": "Optimization plan created successfully",
"last_optimize_started": "Thu Jan 16 11:26:19 2020"
}
Cheers, Dan
On Thu, Jan 16, 2020 at 11:19 AM Wido den Hollander <wido(a)42on.com> wrote:
Anybody upgraded to 14.2.6 yet?
On a 1800 OSD cluster I see that ceph-mgr is consuming 200 to 450% CPU
on a 4C/8T system (Intel Xeon E3-1230 3.3Ghz CPU).
The logs don't show anything very special, it's just that the mgr is
super busy.
I noticed this when I executed:
$ ceph balancer status
That command wouldn't return and then I checked the mgr. Only after
restarting ceph-mgr the balancer module returned results again. It
didn't change the CPU usage, it's still consuming a lot of CPU, but at
least the balancer seems to work again.
Wido
On 1/9/20 10:21 AM, Lars Täuber wrote:
yesterday:
https://ceph.io/releases/v14-2-6-nautilus-released/
Cheers,
Lars
Thu, 9 Jan 2020 10:10:12 +0100
Wido den Hollander <wido(a)42on.com> ==> Neha Ojha <nojha(a)redhat.com>om>,
Sasha Litvak <alexander.v.litvak(a)gmail.com> :
> On 12/24/19 9:19 PM, Neha Ojha wrote:
>> The root cause of this issue is the overhead added by the network ping
>> time monitoring feature for the mgr to process.
>> We have a fix that disables sending the network ping times related
>> stats to the mgr and Eric has helped verify the fix(Thanks Eric!) -
>>
https://tracker.ceph.com/issues/43364#note-9. We'll get this fix out
>> in 14.2.6 after the holidays.
>>
>
> It's after the holidays now and this is affecting a lot of deployments.
> Can people expect 14.2.6 soon?
>
> Wido
>
>>
>>
>> On Fri, Dec 20, 2019 at 6:24 PM Neha Ojha <nojha(a)redhat.com> wrote:
>>>
>>> Not yet, but we have a theory and a test build in
>>>
https://tracker.ceph.com/issues/43364#note-6, if anybody would like
to
>>> give it a try.
>>>
>>> Thanks,
>>> Neha
>>>
>>> On Fri, Dec 20, 2019 at 2:31 PM Sasha Litvak
>>> <alexander.v.litvak(a)gmail.com> wrote:
>>>>
>>>> Was the root cause found and fixed? If so, will the fix be
available in 14.2.6 or sooner?
>>>>
>>>> On Thu, Dec 19, 2019 at 5:48 PM Mark Nelson <mnelson(a)redhat.com>
wrote:
>>>>>
>>>>> Hi Paul,
>>>>>
>>>>>
>>>>> Thanks for gathering this! It looks to me like at the very least we
>>>>> should redo the fixed_u_to_string and fixed_to_string functions in
>>>>> common/Formatter.cc. That alone looks like it's having a pretty
>>>>> significant impact.
>>>>>
>>>>>
>>>>> Mark
>>>>>
>>>>>
>>>>> On 12/19/19 2:09 PM, Paul Mezzanini wrote:
>>>>>> Based on what we've seen with perf, we think this is the
relevant
section. (attached is also the whole file)
>>>>>>
>>>>>> Thread: 73 (mgr-fin) - 1000 samples
>>>>>>
>>>>>> + 100.00% clone
>>>>>> + 100.00% start_thread
>>>>>> + 100.00% Finisher::finisher_thread_entry()
>>>>>> + 99.40% Context::complete(int)
>>>>>> | + 99.40% FunctionContext::finish(int)
>>>>>> | + 99.40% ActivePyModule::notify(std::string
const&,
std::string const&)
>>>>>> | + 91.30%
PyObject_CallMethod
>>>>>> | | + 91.30% call_function_tail
>>>>>> | | + 91.30% PyObject_Call
>>>>>> | | + 91.30% instancemethod_call
>>>>>> | | + 91.30% PyObject_Call
>>>>>> | | + 91.30% function_call
>>>>>> | | + 91.30% PyEval_EvalCodeEx
>>>>>> | | + 88.40% PyEval_EvalFrameEx
>>>>>> | | | + 88.40% PyEval_EvalFrameEx
>>>>>> | | | + 88.40%
ceph_state_get(BaseMgrModule*, _object*)
>>>>>> | |
| + 88.40%
ActivePyModules::get_python(std::string const&)
>>>>>> | |
| + 51.10%
PGMap::dump_osd_stats(ceph::Formatter*) const
>>>>>> | |
| | + 51.10%
osd_stat_t::dump(ceph::Formatter*) const
>>>>>> | |
| | + 22.50%
ceph::fixed_u_to_string(unsigned long, int)
>>>>>> | |
| | | + 10.50%
std::basic_ostringstream<char,
std::char_traits<char>, std::allocator<char>
::basic_ostringstream(std::_Ios_Openmode)
>>>>>> | | | | | | + 9.30%
std::basic_ios<char, std::char_traits<char>
::init(std::basic_streambuf<char,
std::char_traits<char> >*)
>>>>>> | | | | | | | + 7.00%
std::basic_ios<char, std::char_traits<char> >::_M_cache_locale(std::locale
const&)
>>>>>> | |
| | | | | | + 1.60%
std::ctype<char> const&
std::use_facet<std::ctype<char> >(std::locale
const&)
>>>>>> | |
| | | | | | | + 1.50%
__dynamic_cast
>>>>>> | |
| | | | | | | + 0.80%
__cxxabiv1::__vmi_class_type_info::__do_dyncast(long,
__cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info
const*, void const*, __cxxabiv1::__class_type_info const*, void const*,
__cxxabiv1::__class_type_info::__dyncast_result&) const
>>>>>> | |
| | | | | | + 1.40% bool
std::has_facet<std::ctype<char>
>(std::locale const&)
>>>>>> | |
| | | | | | | + 1.30%
__dynamic_cast
>>>>>> | |
| | | | | | | + 0.90%
__cxxabiv1::__vmi_class_type_info::__do_dyncast(long,
__cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info
const*, void const*, __cxxabiv1::__class_type_info const*, void const*,
__cxxabiv1::__class_type_info::__dyncast_result&) const
>>>>>> | |
| | | | | | + 1.10% bool
std::has_facet<std::num_put<char,
std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | | + 0.90%
__dynamic_cast
>>>>>> | |
| | | | | | + 1.00% bool
std::has_facet<std::num_get<char,
std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | | + 0.70%
__dynamic_cast
>>>>>> | |
| | | | | | | + 0.10%
std::locale::id::_M_id() const
>>>>>> | |
| | | | | | | + 0.10%
_ZNKSt6locale2id5_M_idEv@plt
>>>>>> | |
| | | | | | + 0.80%
std::num_put<char,
std::ostreambuf_iterator<char, std::char_traits<char> >
const&
std::use_facet<std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | + 0.70%
std::num_get<char,
std::istreambuf_iterator<char, std::char_traits<char> >
const&
std::use_facet<std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | + 0.10%
_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
>>>>>> | |
| | | | | + 2.00%
std::ios_base::_M_init()
>>>>>> | |
| | | | | | + 0.80%
std::locale::operator=(std::locale const&)
>>>>>> | |
| | | | | | + 0.80%
std::locale::locale()
>>>>>> | |
| | | | | | + 0.30%
std::locale::~locale()
>>>>>> | |
| | | | | | + 0.10%
_ZNSt6localeC1Ev@plt
>>>>>> | |
| | | | | + 0.20%
_ZNSt8ios_base7_M_initEv@plt
>>>>>> | |
| | | | + 0.90%
std::locale::locale()
>>>>>> | |
| | | | + 0.10%
std::ios_base::ios_base()
>>>>>> | |
| | | | + 0.10%
_ZNSt9basic_iosIcSt11char_traitsIcEE4initEPSt15basic_streambufIcS1_E@plt
>>>>>> | |
| | | + 2.80% std::ostream&
std::ostream::_M_insert<unsigned
long>(unsigned long)
>>>>>> | |
| | | | + 2.40%
std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> >
::do_put(std::ostreambuf_iterator<char,
std::char_traits<char> >,
std::ios_base&, char, unsigned long) const
>>>>>> | |
| | | | | + 2.10%
std::ostreambuf_iterator<char,
std::char_traits<char> > std::num_put<char,
std::ostreambuf_iterator<char, std::char_traits<char> >
::_M_insert_int<unsigned
long>(std::ostreambuf_iterator<char,
std::char_traits<char> >,
std::ios_base&, char, unsigned long) const
>>>>>> | |
| | | | | | + 1.60%
std::basic_streambuf<char,
std::char_traits<char> >::xsputn(char const*,
long)
>>>>>> | |
| | | | | | | + 1.40%
std::basic_stringbuf<char,
std::char_traits<char>, std::allocator<char>
::overflow(int)
>>>>>> | | | | | | | | | | + 0.90%
std::string::reserve(unsigned long)
>>>>>> | |
| | | | | | | | + 0.10%
std::basic_stringbuf<char,
std::char_traits<char>, std::allocator<char>
>::_M_sync(char*, unsigned long, unsigned long)
>>>>>> | |
| | | | | | | | + 0.10%
_ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE7_M_syncEPcmm@plt
>>>>>> | |
| | | | | | | + 0.20%
__memcpy_ssse3_back
>>>>>>> | | | | | | | | + 0.20% ???
>>>>>> | |
| | | | | | + 0.10%
std::num_put<char,
std::ostreambuf_iterator<char, std::char_traits<char> >
::_M_pad(char, long, std::ios_base&, char*,
char const*, int&) const
>>>>>> | | | | | | | + 0.10%
_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intImEES3_S3_RSt8ios_basecT_@plt
>>>>>> | |
| | | | + 0.10%
std::ostream::sentry::sentry(std::ostream&)
>>>>>> | |
| | | + 2.80%
std::basic_stringbuf<char, std::char_traits<char>,
std::allocator<char>
::str() const
>>>>>> | | | | | | + 1.00%
std::string::assign(std::string const&)
>>>>>> | |
| | | | + 0.90% char*
std::string::_S_construct<char*>(char*, char*,
std::allocator<char> const&,
std::forward_iterator_tag) [clone .part.1796]
>>>>>> | |
| | | + 1.50%
std::string::append(char const*, unsigned long)
>>>>>> | |
| | | | + 1.20%
std::string::reserve(unsigned long)
>>>>>> | |
| | | | + 0.60%
std::string::_Rep::_M_clone(std::allocator<char>
const&, unsigned long)
>>>>>> | |
| | | | + 0.10% tc_free
>>>>>> | | | | | + 1.20%
std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone .isra.97]
[clone .part.98]
>>>>>> | |
| | | + 1.00%
std::string::append(std::string const&)
>>>>>> | |
| | | | + 0.70%
std::string::reserve(unsigned long)
>>>>>> | |
| | | | + 0.10%
__memcpy_ssse3_back
>>>>>> | |
| | | + 1.00%
std::basic_string<char, std::char_traits<char>,
std::allocator<char>
::basic_string(std::string const&, unsigned
long, unsigned long)
>>>>>> | | | | | | + 0.80% char*
std::string::_S_construct<char*>(char*, char*, std::allocator<char>
const&,
std::forward_iterator_tag) [clone .part.220]
>>>>>> | |
| | | + 0.40%
std::locale::~locale()
>>>>>> | |
| | | + 0.20% tc_free
>>>>>> | | | | | + 0.20%
__strlen_sse2_pminub
>>>>>> | |
| | | + 0.10%
std::ios_base::~ios_base()
>>>>>> | |
| | | + 0.10%
_ZNSt8ios_baseD2Ev@plt
>>>>>> | |
| | | + 0.10%
_ZNKSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@plt
>>>>>> | |
| | + 18.20%
PyFormatter::open_object_section(char const*)
>>>>>> | |
| | | + 17.10% PyDict_New
>>>>>> | | | | | | + 16.70%
_PyObject_GC_New
>>>>>> | |
| | | | + 16.70%
_PyObject_GC_Malloc
>>>>>> | |
| | | | + 16.60% collect
>>>>>> | | | | | | | + 8.10%
dict_traverse
>>>>>> | |
| | | | | | + 3.20%
visit_reachable
>>>>>> | |
| | | | | | | + 0.10%
type_is_gc
>>>>>> | |
| | | | | | + 2.80%
visit_decref
>>>>>> | |
| | | | | | + 1.60%
PyDict_Next
>>>>>> | |
| | | | | + 1.30%
list_traverse
>>>>>> | |
| | | | | | + 0.40%
visit_decref
>>>>>> | |
| | | | | | + 0.30%
visit_reachable
>>>>>> | |
| | | | | + 0.60%
func_traverse
>>>>>> | |
| | | | | + 0.40%
_PyDict_MaybeUntrack
>>>>>> | |
| | | | | + 0.10%
type_traverse
>>>>>> | |
| | | | | + 0.10%
subtype_traverse
>>>>>> | |
| | | | | + 0.10%
set_traverse
>>>>>> | |
| | | | | + 0.10%
class_traverse
>>>>>> | |
| | | | | + 0.10%
_PyDict_MaybeUntrack@plt
>>>>>> | |
| | | | + 0.10%
PyObject_Malloc
>>>>>> | |
| | | + 1.00%
PyFormatter::dump_pyobject(char const*, _object*)
>>>>>> | |
| | | + 0.40%
PyString_FromString
>>>>>> | |
| | | + 0.20%
dict_set_item_by_hash_or_entry
>>>>>> | |
| | | + 0.20% PyDict_SetItem
>>>>>> | | | | | + 0.10% app1
>>>>>> | | | | + 6.60%
ceph::Formatter::dump_format_unquoted(char const*, char const*, ...)
>>>>>> | |
| | | + 6.60%
PyFormatter::dump_format_va(char const*, char const*, bool,
char const*,
__va_list_tag*)
>>>>>> | |
| | | + 3.90% __vsnprintf_chk
>>>>>> | | | | | | + 3.40% vfprintf
>>>>>> | | | | | | | + 0.50% strchrnul
>>>>>> | | | | | | | + 0.40%
__GI__IO_default_xsputn
>>>>>> | |
| | | | | + 0.20% tc_free
>>>>>> | | | | | | | + 0.10% free@plt
>>>>>> | | | | | | | + 0.10% (anonymous
namespace)::free_null_or_invalid(void*, void (*)(void*)) [clone
.constprop.41]
>>>>>> | |
| | | | + 0.20% _IO_no_init
>>>>>> | | | | | | + 0.10%
_IO_str_init_static_internal
>>>>>> | |
| | | + 1.50%
PyFormatter::dump_pyobject(char const*, _object*)
>>>>>> | |
| | | | + 0.50%
PyString_FromString
>>>>>> | |
| | | | + 0.40% PyDict_SetItem
>>>>>> | | | | | | + 0.10%
dict_set_item_by_hash_or_entry
>>>>>> | |
| | | | + 0.10%
PyDict_SetItem@plt
>>>>>> | |
| | | + 1.20%
PyString_FromString
>>>>>> | |
| | | + 0.60%
PyObject_Malloc
>>>>>> | |
| | | + 0.20%
__strlen_sse2_pminub
>>>>>> | |
| | | + 0.10%
__memcpy_ssse3_back
>>>>>> | |
| | + 0.90% ctime_r
>>>>>> | | | | + 0.80%
PyFormatter::open_array_section(char const*)
>>>>>> | |
| | + 0.40%
std::string::_Rep::_M_dispose(std::allocator<char>
const&) [clone
.isra.846] [clone .part.847]
>>>>>> | |
| | + 0.30%
PyFormatter::dump_int(char const*, long)
>>>>>> | |
| | + 0.20%
PyFormatter::close_section()
>>>>>> | |
| | + 0.10% tc_free
>>>>>> | | | | + 0.10%
std::basic_string<char, std::char_traits<char>, std::allocator<char>
::basic_string(char const*,
std::allocator<char> const&)
>>>>>> | | | | + 0.10%
std::_Rb_tree_increment(std::_Rb_tree_node_base const*)
>>>>>> | |
| | + 0.10%
pow2_hist_t::dump(ceph::Formatter*) const
>>>>>> | |
| | + 0.10%
objectstore_perf_stat_t::dump(ceph::Formatter*) const
>>>>>> | |
| | + 0.10%
PyFormatter::dump_string(char const*,
std::basic_string_view<char,
std::char_traits<char> >)
>>>>>> | |
| | + 0.10%
PyFormatter::dump_pyobject(char const*, _object*)
>>>>>> | |
| + 21.80% Mutex::lock(bool)
>>>>>> | | | | + 21.80% pthread_mutex_lock
>>>>>> | | | | + 21.80% _L_lock_883
>>>>>> | | | | + 21.80%
__lll_lock_wait
>>>>>> | | | + 11.70%
PGMap::dump(ceph::Formatter*) const
>>>>>> | |
| | + 11.70%
PGMap::dump_pg_stats(ceph::Formatter*, bool) const
>>>>>> | |
| | + 10.90%
pg_stat_t::dump(ceph::Formatter*) const
>>>>>> | |
| | | + 4.20%
PyFormatter::dump_stream(char const*)
>>>>>> | |
| | | | + 2.80%
std::basic_ios<char, std::char_traits<char>
>::init(std::basic_streambuf<char, std::char_traits<char> >*)
>>>>>> | |
| | | | | + 2.10%
std::basic_ios<char, std::char_traits<char>
>::_M_cache_locale(std::locale
const&)
>>>>>> | |
| | | | | | + 0.50%
std::num_put<char,
std::ostreambuf_iterator<char, std::char_traits<char> >
const&
std::use_facet<std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | + 0.50% bool
std::has_facet<std::num_put<char,
std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | + 0.40% bool
std::has_facet<std::num_get<char,
std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | + 0.20%
std::num_get<char,
std::istreambuf_iterator<char, std::char_traits<char> >
const&
std::use_facet<std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
>>>>>> | |
| | | | | | + 0.20%
std::ctype<char> const&
std::use_facet<std::ctype<char> >(std::locale
const&)
>>>>>> | |
| | | | | | + 0.20% bool
std::has_facet<std::ctype<char>
>(std::locale const&)
>>>>>> | |
| | | | | | + 0.10%
_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
>>>>>> | |
| | | | | + 0.70%
std::ios_base::_M_init()
>>>>>> | |
| | | | + 0.50%
tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int,
int)
>>>>>> | |
| | | | + 0.40%
std::string::assign(char const*, unsigned long)
>>>>>> | |
| | | | + 0.20%
std::locale::locale()
>>>>>> | |
| | | | + 0.10%
std::ios_base::ios_base()
>>>>>> | |
| | | + 1.80%
object_stat_collection_t::dump(ceph::Formatter*) const
>>>>>> | |
| | | | + 1.70%
object_stat_sum_t::dump(ceph::Formatter*) const
>>>>>> | |
| | | | | + 1.40%
PyFormatter::dump_pyobject(char const*, _object*)
>>>>>>> | | | | | | | | + 0.60%
dictresize
>>>>>> | |
| | | | | | + 0.30%
PyString_FromString
>>>>>> | |
| | | | | | + 0.20%
PyDict_SetItem
>>>>>> | |
| | | | | | + 0.10%
dict_set_item_by_hash_or_entry
>>>>>> | |
| | | | | + 0.20%
PyFormatter::dump_int(char const*, long)
>>>>>> | |
| | | | + 0.10%
PyFormatter::open_object_section(char const*)
>>>>>> | |
| | | + 1.80%
PyFormatter::open_array_section(char const*)
>>>>>> | |
| | | | + 1.60% PyList_New
>>>>>> | | | | | | | + 1.60%
_PyObject_GC_New
>>>>>> | |
| | | | | + 1.60%
_PyObject_GC_Malloc
>>>>>> | |
| | | | | + 1.60% collect
>>>>>> | | | | | | | + 0.80%
dict_traverse
>>>>>> | |
| | | | | + 0.10%
subtype_traverse
>>>>>> | |
| | | | | + 0.10%
list_traverse
>>>>>> | |
| | | | | + 0.10%
func_traverse
>>>>>> | |
| | | | | + 0.10%
_PyDict_MaybeUntrack
>>>>>> | |
| | | | + 0.20%
PyFormatter::dump_pyobject(char const*, _object*)
>>>>>> | |
| | | + 1.70%
utime_t::localtime(std::ostream&) const
>>>>>> | |
| | | | + 1.00% std::ostream&
std::ostream::_M_insert<long>(long)
>>>>>> | |
| | | | | + 0.60%
std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> >
::do_put(std::ostreambuf_iterator<char,
std::char_traits<char> >,
std::ios_base&, char, long) const
>>>>>> | |
| | | | + 0.30%
std::basic_ostream<char, std::char_traits<char>
>&
std::__ostream_insert<char, std::char_traits<char>
(std::basic_ostream<char,
std::char_traits<char> >&, char const*, long)
>>>>>> | | | | | | + 0.20% __tz_convert
>>>>>> | | | | | + 0.90%
PyFormatter::dump_pyobject(char const*, _object*)
>>>>>> | |
| | | + 0.20%
pg_state_string(unsigned long)
>>>>>> | |
| | | + 0.20%
operator<<(std::ostream&, eversion_t const&)
[clone .isra.103]
>>>>>> | |
| | | + 0.10% std::ostream&
std::ostream::_M_insert<unsigned
long>(unsigned long)
>>>>>> | |
| | + 0.40%
PyFormatter::dump_stream(char const*)
>>>>>> | |
| | + 0.30%
operator<<(std::ostream&, pg_t const&)
>>>>>> | |
| | + 0.10%
PyFormatter::open_object_section(char const*)
>>>>>> | |
| + 2.70%
PyFormatter::finish_pending_streams()
>>>>>> | |
| | + 1.00%
std::_List_base<std::shared_ptr<PyFormatter::PendingStream>,
std::allocator<std::shared_ptr<PyFormatter::PendingStream> >
>::_M_clear()
>>>>>> | |
| | | + 0.40%
std::_Sp_counted_ptr_inplace<PyFormatter::PendingStream,
std::allocator<PyFormatter::PendingStream>,
(__gnu_cxx::_Lock_policy)2>::_M_dispose()
>>>>>> | |
| | | + 0.20%
tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*,
unsigned int)
>>>>>> | |
| | + 0.70%
PyFormatter::dump_pyobject(char const*, _object*)
>>>>>> | |
| | + 0.50%
std::string::replace(unsigned long, unsigned long, char const*,
unsigned
long)
>>>>>> | |
| | + 0.30% PyString_FromString
>>>>>> | | | + 1.10% PyEval_RestoreThread
>>>>>> | | | + 1.10%
PyThread_acquire_lock
>>>>>> | | | + 1.10% sem_wait@
@GLIBC_2.2.5
>>>>>> | |
| + 1.10%
__new_sem_wait_slow.constprop.0
>>>>>> | |
| + 1.10%
do_futex_wait.constprop.1
>>>>>> | |
+ 2.90% frame_dealloc
>>>>>> | | + 2.90% dict_dealloc
>>>>>> | | + 2.90% list_dealloc
>>>>>> | | + 2.90% dict_dealloc
>>>>>> | | + 1.90% list_dealloc
>>>>>> | | | + 1.90% dict_dealloc
>>>>>> | | | + 1.70% list_dealloc
>>>>>> | | | + 1.50% dict_dealloc
>>>>>> | | | | + 0.90% dict_dealloc
>>>>>> | | | | + 0.10% PyObject_Free
>>>>>> | | | + 0.10%
tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*,
unsigned int)
>>>>>> | |
+ 0.30% PyObject_Free
>>>>>> | | + 0.20% dict_dealloc
>>>>>> | + 8.10% Gil::Gil(SafeThreadState&, bool)
>>>>>> | + 8.10% PyEval_RestoreThread
>>>>>> | + 8.10% PyThread_acquire_lock
>>>>>> | + 8.10% sem_wait@@GLIBC_2.2.5
>>>>>> | + 8.10% __new_sem_wait_slow.constprop.0
>>>>>> | + 8.10% do_futex_wait.constprop.1
>>>>>> + 0.60%
std::condition_variable::wait(std::unique_lock<std::mutex>&)
>>>>>>
>>>>>> --
>>>>>> Paul Mezzanini
>>>>>> Sr Systems Administrator / Engineer, Research Computing
>>>>>> Information & Technology Services
>>>>>> Finance & Administration
>>>>>> Rochester Institute of Technology
>>>>>> o:(585) 475-3245 | pfmeec(a)rit.edu
>>>>>>
>>>>>> CONFIDENTIALITY NOTE: The information transmitted, including
attachments, is
>>>>>> intended only for the
person(s) or entity to which it is addressed
and may
>>>>>> contain confidential
and/or privileged material. Any review,
retransmission,
>>>>>> dissemination or other
use of, or taking of any action in reliance
upon this
>>>>>> information by persons or
entities other than the intended
recipient is
>>>>>> prohibited. If you
received this in error, please contact the
sender and
>>>>>> destroy any copies of
this information.
>>>>>> ------------------------
>>>>>>
>>>>>> ________________________________________
>>>>>> From: Mark Nelson <mnelson(a)redhat.com>
>>>>>> Sent: Thursday, December 19, 2019 11:47 AM
>>>>>> To: ceph-users(a)ceph.io
>>>>>> Subject: [ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5
>>>>>>
>>>>>> If you can get a wallclock profiler on the mgr process we might
be
able
>>>>>> to figure out specifics
of what's taking so much time (ie
processing
>>>>>> pg_summary or something
else). Assuming you have gdb with the
python
>>>>>> bindings and the ceph
debug packages installed, if you (are anyone)
>>>>>> could try gdbpmp on the 100% mgr process that would be
fantastic.
>>>>>>
>>>>>>
>>>>>>
https://github.com/markhpc/gdbpmp
>>>>>>
>>>>>>
>>>>>> gdbpmp.py -p`pidof ceph-mgr` -n 1000 -o mgr.gdbpmp
>>>>>>
>>>>>>
>>>>>> If you want to view the results:
>>>>>>
>>>>>>
>>>>>> gdbpmp.py -i mgr.gdbpmp -t 1
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>>
>>>>>> On 12/19/19 6:29 AM, Paul Emmerich wrote:
>>>>>>> We're also seeing unusually high mgr CPU usage on some
setups, the
>>>>>>> only thing they have in common seem to > 300 OSDs.
>>>>>>>
>>>>>>> Threads using the CPU are "mgr-fin" and and
"ms_dispatch"
>>>>>>>
>>>>>>>
>>>>>>> Paul
>>>>>>>
>>>>>>> --
>>>>>>> Paul Emmerich
>>>>>>>
>>>>>>> Looking for help with your Ceph cluster? Contact us at
https://croit.io
>>>>>>>
>>>>>>> croit GmbH
>>>>>>> Freseniusstr. 31h
>>>>>>> 81247 München
>>>>>>>
www.croit.io <http://www.croit.io>
>>>>>>> Tel: +49 89 1896585 90
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Dec 19, 2019 at 9:40 AM Serkan Çoban <
cobanserkan(a)gmail.com
>>>>>>>
<mailto:cobanserkan@gmail.com>> wrote:
>>>>>>>
>>>>>>> +1
>>>>>>> 1500 OSDs, mgr is constant %100 after upgrading from
14.2.2
to 14.2.5.
>>>>>>>
>>>>>>> On Thu, Dec 19, 2019 at 11:06 AM Toby Darling
>>>>>>> <toby(a)mrc-lmb.cam.ac.uk
<mailto:toby@mrc-lmb.cam.ac.uk>>
wrote:
>>>>>>> >
>>>>>>> > On 18/12/2019 22:40, Bryan Stillwell wrote:
>>>>>>> > > That's how we noticed it too. Our graphs
went silent
after
>>>>>>> the upgrade
>>>>>>> > > completed. Is your large cluster over 350
OSDs?
>>>>>>> >
>>>>>>> > A 'me too' on this - graphs have gone
quiet, and mgr is
using
>>>>>>> 100% CPU.
>>>>>>> > This happened when we grew our 14.2.5 cluster from
328 to
436 OSDs.
>>>>> >
>>>>> > Cheers
>>>>> > Toby
>>>>> > --
>>>>> > Toby Darling, Scientific Computing (2N249)
>>>>> > MRC Laboratory of Molecular Biology
>>>>> > Francis Crick Avenue
>>>>> > Cambridge Biomedical Campus
>>>>> > Cambridge CB2 0QH
>>>>> > Phone 01223 267070
>>>>> > _______________________________________________
>>>>> > ceph-users mailing list -- ceph-users(a)ceph.io
>>>>> <mailto:ceph-users@ceph.io>
>>>>> > To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>> <mailto:ceph-users-leave@ceph.io>
>>>>> _______________________________________________
>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>> <mailto:ceph-users@ceph.io>
>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>> <mailto:ceph-users-leave@ceph.io>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>> _______________________________________________
>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io