Was the root cause found and fixed? If so, will the fix be available in
14.2.6 or sooner?
On Thu, Dec 19, 2019 at 5:48 PM Mark Nelson <mnelson(a)redhat.com> wrote:
Hi Paul,
Thanks for gathering this! It looks to me like at the very least we
should redo the fixed_u_to_string and fixed_to_string functions in
common/Formatter.cc. That alone looks like it's having a pretty
significant impact.
Mark
On 12/19/19 2:09 PM, Paul Mezzanini wrote:
Based on what we've seen with perf, we think
this is the relevant
section. (attached is also the whole file)
Thread: 73 (mgr-fin) - 1000 samples
+ 100.00% clone
+ 100.00% start_thread
+ 100.00% Finisher::finisher_thread_entry()
+ 99.40% Context::complete(int)
| + 99.40% FunctionContext::finish(int)
| + 99.40% ActivePyModule::notify(std::string const&,
std::string
const&)
| + 91.30% PyObject_CallMethod
| | + 91.30% call_function_tail
| | + 91.30% PyObject_Call
| | + 91.30% instancemethod_call
| | + 91.30% PyObject_Call
| | + 91.30% function_call
| | + 91.30% PyEval_EvalCodeEx
| | + 88.40% PyEval_EvalFrameEx
| | | + 88.40% PyEval_EvalFrameEx
| | | + 88.40% ceph_state_get(BaseMgrModule*,
_object*)
| | | + 88.40%
ActivePyModules::get_python(std::string const&)
| | | + 51.10%
PGMap::dump_osd_stats(ceph::Formatter*) const
| | | | + 51.10%
osd_stat_t::dump(ceph::Formatter*) const
| | | | + 22.50%
ceph::fixed_u_to_string(unsigned long, int)
| | | | | + 10.50%
std::basic_ostringstream<char, std::char_traits<char>,
std::allocator<char>
::basic_ostringstream(std::_Ios_Openmode)
| | | | | | + 9.30% std::basic_ios<char,
std::char_traits<char> >::init(std::basic_streambuf<char,
std::char_traits<char> >*)
| | | | | | | +
7.00%
std::basic_ios<char, std::char_traits<char>
>::_M_cache_locale(std::locale
const&)
| | | | | | | | +
1.60% std::ctype<char>
const& std::use_facet<std::ctype<char>
>(std::locale const&)
| | | | | | | | |
+ 1.50% __dynamic_cast
| | | | | | | | | + 0.80%
__cxxabiv1::__vmi_class_type_info::__do_dyncast(long,
__cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info
const*, void const*, __cxxabiv1::__class_type_info const*, void const*,
__cxxabiv1::__class_type_info::__dyncast_result&) const
| | | | | | | | +
1.40% bool
std::has_facet<std::ctype<char> >(std::locale const&)
| | | | | | | | |
+ 1.30% __dynamic_cast
| | | | | | | | | + 0.90%
__cxxabiv1::__vmi_class_type_info::__do_dyncast(long,
__cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info
const*, void const*, __cxxabiv1::__class_type_info const*, void const*,
__cxxabiv1::__class_type_info::__dyncast_result&) const
| | | | | | | | +
1.10% bool
std::has_facet<std::num_put<char,
std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | |
+ 0.90% __dynamic_cast
| | | | | | | | + 1.00% bool
std::has_facet<std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | |
+ 0.70% __dynamic_cast
| | | | | | | | | + 0.10%
std::locale::id::_M_id()
const
| | | | | | | | |
+ 0.10%
_ZNKSt6locale2id5_M_idEv@plt
| | | | | | | | +
0.80%
std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> >
const&
std::use_facet<std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | +
0.70%
std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> >
const&
std::use_facet<std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | +
0.10%
_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
| | | | | | | +
2.00%
std::ios_base::_M_init()
| | | | | | | | +
0.80%
std::locale::operator=(std::locale const&)
| | | | | | | | +
0.80%
std::locale::locale()
| | | | | | | | +
0.30%
std::locale::~locale()
| | | | | | | | +
0.10%
_ZNSt6localeC1Ev@plt
| | | | | | | +
0.20%
_ZNSt8ios_base7_M_initEv@plt
| | | | | | +
0.90% std::locale::locale()
| | | | | | + 0.10%
std::ios_base::ios_base()
| | | | | | +
0.10%
_ZNSt9basic_iosIcSt11char_traitsIcEE4initEPSt15basic_streambufIcS1_E@plt
| | | | | + 2.80%
std::ostream&
std::ostream::_M_insert<unsigned long>(unsigned long)
| | | | | | +
2.40% std::num_put<char,
std::ostreambuf_iterator<char,
std::char_traits<char> >
::do_put(std::ostreambuf_iterator<char,
std::char_traits<char> >,
std::ios_base&, char, unsigned long) const
| | | | | | | +
2.10%
std::ostreambuf_iterator<char, std::char_traits<char> >
std::num_put<char,
std::ostreambuf_iterator<char, std::char_traits<char> >
::_M_insert_int<unsigned
long>(std::ostreambuf_iterator<char,
std::char_traits<char> >,
std::ios_base&, char, unsigned long) const
| | | | | | | | +
1.60%
std::basic_streambuf<char, std::char_traits<char> >::xsputn(char
const*,
long)
| | | | | | | | |
+ 1.40%
std::basic_stringbuf<char, std::char_traits<char>,
std::allocator<char>
::overflow(int)
| | | | | | | | | | + 0.90%
std::string::reserve(unsigned long)
| | | | | | | | |
| + 0.10%
std::basic_stringbuf<char, std::char_traits<char>,
std::allocator<char>
>::_M_sync(char*, unsigned long, unsigned long)
| | | | | | | | |
| + 0.10%
_ZNSt15basic_stringbufIcSt11char_traitsIcESaIcEE7_M_syncEPcmm@plt
| | | | | | | | |
+ 0.20%
__memcpy_ssse3_back
> | | | | | | | | + 0.20% ???
| | | | | | | | +
0.10%
std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> >
::_M_pad(char, long, std::ios_base&, char*,
char const*, int&) const
| | | | | | | + 0.10%
_ZNKSt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE13_M_insert_intImEES3_S3_RSt8ios_basecT_@plt
| | | | | | +
0.10%
std::ostream::sentry::sentry(std::ostream&)
| | | | | + 2.80%
std::basic_stringbuf<char, std::char_traits<char>,
std::allocator<char>
::str() const
| | | | | | + 1.00%
std::string::assign(std::string
const&)
| | | | | | +
0.90% char*
std::string::_S_construct<char*>(char*, char*,
std::allocator<char> const&,
std::forward_iterator_tag) [clone .part.1796]
| | | | | + 1.50%
std::string::append(char const*, unsigned long)
| | | | | | +
1.20%
std::string::reserve(unsigned long)
| | | | | | +
0.60%
std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned
long)
| | | | | | +
0.10% tc_free
| | | | | + 1.20%
std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone .isra.97]
[clone .part.98]
| | | | | + 1.00%
std::string::append(std::string const&)
| | | | | | +
0.70%
std::string::reserve(unsigned long)
| | | | | | +
0.10% __memcpy_ssse3_back
| | | | | + 1.00% std::basic_string<char,
std::char_traits<char>, std::allocator<char> >::basic_string(std::string
const&, unsigned long, unsigned long)
| | | | | | +
0.80% char*
std::string::_S_construct<char*>(char*, char*,
std::allocator<char> const&,
std::forward_iterator_tag) [clone .part.220]
| | | | | + 0.40%
std::locale::~locale()
| | | | | + 0.20% tc_free
| | | | | + 0.20% __strlen_sse2_pminub
| | | | | + 0.10%
std::ios_base::~ios_base()
| | | | | + 0.10%
_ZNSt8ios_baseD2Ev@plt
| | | | | + 0.10%
_ZNKSt15basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@plt
| | | | + 18.20%
PyFormatter::open_object_section(char const*)
| | | | | + 17.10%
PyDict_New
| | | | | | + 16.70% _PyObject_GC_New
| | | | | | + 16.70% _PyObject_GC_Malloc
| | | | | | + 16.60% collect
| | | | | | | + 8.10% dict_traverse
| | | | | | | | + 3.20%
visit_reachable
| | | | | | |
| | + 0.10% type_is_gc
| | | | | | | | + 2.80% visit_decref
| | | | | | | | + 1.60% PyDict_Next
| | | | | | | + 1.30% list_traverse
| | | | | | | | + 0.40% visit_decref
| | | | | | | | + 0.30%
visit_reachable
| | | | | | |
+ 0.60% func_traverse
| | | | | | | + 0.40%
_PyDict_MaybeUntrack
| | | | | | |
+ 0.10% type_traverse
| | | | | | | + 0.10% subtype_traverse
| | | | | | | + 0.10% set_traverse
| | | | | | | + 0.10% class_traverse
| | | | | | | + 0.10%
_PyDict_MaybeUntrack@plt
> | | | | | | + 0.10% PyObject_Malloc
| | | | | + 1.00%
PyFormatter::dump_pyobject(char const*, _object*)
| | | | | +
0.40% PyString_FromString
| | | | | + 0.20%
dict_set_item_by_hash_or_entry
| | | | | +
0.20% PyDict_SetItem
| | | | | + 0.10% app1
| | | | + 6.60%
ceph::Formatter::dump_format_unquoted(char const*, char const*, ...)
| | | | | + 6.60%
PyFormatter::dump_format_va(char const*, char const*, bool, char const*,
__va_list_tag*)
| | | | | +
3.90% __vsnprintf_chk
| | | | | | + 3.40% vfprintf
| | | | | | | + 0.50% strchrnul
| | | | | | | + 0.40%
__GI__IO_default_xsputn
| | | | | | | +
0.20% tc_free
| | | | | | | + 0.10% free@plt
| | | | | | | + 0.10% (anonymous
namespace)::free_null_or_invalid(void*, void (*)(void*)) [clone
.constprop.41]
| | | | | | +
0.20% _IO_no_init
| | | | | | + 0.10%
_IO_str_init_static_internal
| | | | | +
1.50%
PyFormatter::dump_pyobject(char const*, _object*)
| | | | | | +
0.50% PyString_FromString
| | | | | | + 0.40% PyDict_SetItem
| | | | | | + 0.10%
dict_set_item_by_hash_or_entry
| | | | | | +
0.10% PyDict_SetItem@plt
| | | | | + 1.20% PyString_FromString
| | | | | + 0.60% PyObject_Malloc
| | | | | + 0.20% __strlen_sse2_pminub
| | | | | + 0.10% __memcpy_ssse3_back
| | | | + 0.90% ctime_r
| | | | + 0.80%
PyFormatter::open_array_section(char const*)
| | | | + 0.40%
std::string::_Rep::_M_dispose(std::allocator<char> const&) [clone
.isra.846] [clone .part.847]
| | | | + 0.30%
PyFormatter::dump_int(char const*, long)
| | | | + 0.20%
PyFormatter::close_section()
| | | | + 0.10%
tc_free
| | | | + 0.10% std::basic_string<char,
std::char_traits<char>, std::allocator<char> >::basic_string(char const*,
std::allocator<char> const&)
| | | | + 0.10%
std::_Rb_tree_increment(std::_Rb_tree_node_base const*)
| | | | + 0.10%
pow2_hist_t::dump(ceph::Formatter*) const
| | | | + 0.10%
objectstore_perf_stat_t::dump(ceph::Formatter*) const
| | | | + 0.10%
PyFormatter::dump_string(char const*, std::basic_string_view<char,
std::char_traits<char> >)
| | | | + 0.10%
PyFormatter::dump_pyobject(char const*, _object*)
| | | + 21.80%
Mutex::lock(bool)
| | | | + 21.80% pthread_mutex_lock
| | | | + 21.80% _L_lock_883
| | | | + 21.80% __lll_lock_wait
| | | + 11.70%
PGMap::dump(ceph::Formatter*) const
| | | | + 11.70%
PGMap::dump_pg_stats(ceph::Formatter*, bool) const
| | | | + 10.90%
pg_stat_t::dump(ceph::Formatter*) const
| | | | | + 4.20%
PyFormatter::dump_stream(char const*)
| | | | | | +
2.80% std::basic_ios<char,
std::char_traits<char>
>::init(std::basic_streambuf<char,
std::char_traits<char> >*)
| | | | | | | +
2.10%
std::basic_ios<char, std::char_traits<char>
>::_M_cache_locale(std::locale
const&)
| | | | | | | | +
0.50%
std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> >
const&
std::use_facet<std::num_put<char, std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | +
0.50% bool
std::has_facet<std::num_put<char,
std::ostreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | +
0.40% bool
std::has_facet<std::num_get<char,
std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | +
0.20%
std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> >
const&
std::use_facet<std::num_get<char, std::istreambuf_iterator<char,
std::char_traits<char> > > >(std::locale const&)
| | | | | | | | +
0.20% std::ctype<char>
const& std::use_facet<std::ctype<char>
>(std::locale const&)
| | | | | | | | +
0.20% bool
std::has_facet<std::ctype<char> >(std::locale const&)
| | | | | | | | +
0.10%
_ZSt9has_facetISt7num_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEEEbRKSt6locale@plt
| | | | | | | +
0.70%
std::ios_base::_M_init()
| | | | | | +
0.50%
tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, int)
| | | | | | +
0.40%
std::string::assign(char const*, unsigned long)
> | | | | | | + 0.20% std::locale::locale()
| | | | | | +
0.10%
std::ios_base::ios_base()
| | | | | + 1.80%
object_stat_collection_t::dump(ceph::Formatter*) const
| | | | | | +
1.70%
object_stat_sum_t::dump(ceph::Formatter*) const
| | | | | | | +
1.40%
PyFormatter::dump_pyobject(char const*, _object*)
> | | | | | | | | + 0.60% dictresize
| | | | | | | | +
0.30%
PyString_FromString
> | | | | | | | | + 0.20% PyDict_SetItem
| | | | | | | | +
0.10%
dict_set_item_by_hash_or_entry
| | | | | | | +
0.20%
PyFormatter::dump_int(char const*, long)
| | | | | | +
0.10%
PyFormatter::open_object_section(char const*)
| | | | | + 1.80%
PyFormatter::open_array_section(char const*)
| | | | | | +
1.60% PyList_New
| | | | | | | + 1.60% _PyObject_GC_New
| | | | | | | + 1.60%
_PyObject_GC_Malloc
| | | | | | |
+ 1.60% collect
| | | | | | | + 0.80% dict_traverse
| | | | | | | + 0.10%
subtype_traverse
| | | | | | |
+ 0.10% list_traverse
| | | | | | | + 0.10% func_traverse
| | | | | | | + 0.10%
_PyDict_MaybeUntrack
| | | | | | +
0.20%
PyFormatter::dump_pyobject(char const*, _object*)
| | | | | + 1.70%
utime_t::localtime(std::ostream&) const
| | | | | | +
1.00% std::ostream&
std::ostream::_M_insert<long>(long)
| | | | | | | +
0.60% std::num_put<char,
std::ostreambuf_iterator<char,
std::char_traits<char> >
::do_put(std::ostreambuf_iterator<char,
std::char_traits<char> >,
std::ios_base&, char, long) const
| | | | | | +
0.30%
std::basic_ostream<char, std::char_traits<char> >&
std::__ostream_insert<char, std::char_traits<char>
(std::basic_ostream<char,
std::char_traits<char> >&, char const*, long)
| | | | | | + 0.20% __tz_convert
| | | | | + 0.90%
PyFormatter::dump_pyobject(char
const*, _object*)
| | | | | + 0.20%
pg_state_string(unsigned long)
| | | | | + 0.20%
operator<<(std::ostream&, eversion_t const&) [clone .isra.103]
| | | | | + 0.10%
std::ostream&
std::ostream::_M_insert<unsigned long>(unsigned long)
| | | | + 0.40%
PyFormatter::dump_stream(char const*)
| | | | + 0.30%
operator<<(std::ostream&,
pg_t const&)
| | | | + 0.10%
PyFormatter::open_object_section(char const*)
| | | + 2.70%
PyFormatter::finish_pending_streams()
| | | | + 1.00%
std::_List_base<std::shared_ptr<PyFormatter::PendingStream>,
std::allocator<std::shared_ptr<PyFormatter::PendingStream> >
>::_M_clear()
| | | | | + 0.40%
std::_Sp_counted_ptr_inplace<PyFormatter::PendingStream,
std::allocator<PyFormatter::PendingStream>,
(__gnu_cxx::_Lock_policy)2>::_M_dispose()
| | | | | + 0.20%
tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*,
unsigned int)
| | | | + 0.70%
PyFormatter::dump_pyobject(char const*, _object*)
| | | | + 0.50%
std::string::replace(unsigned long, unsigned long, char const*, unsigned
long)
| | | | + 0.30%
PyString_FromString
| | | + 1.10% PyEval_RestoreThread
| | | + 1.10% PyThread_acquire_lock
| | | + 1.10% sem_wait@@GLIBC_2.2.5
| | | + 1.10%
__new_sem_wait_slow.constprop.0
| | | +
1.10%
do_futex_wait.constprop.1
| | + 2.90% frame_dealloc
| | + 2.90% dict_dealloc
| | + 2.90% list_dealloc
| | + 2.90% dict_dealloc
| | + 1.90% list_dealloc
| | | + 1.90% dict_dealloc
| | | + 1.70% list_dealloc
| | | + 1.50% dict_dealloc
| | | | + 0.90% dict_dealloc
| | | | + 0.10% PyObject_Free
| | | + 0.10%
tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*,
unsigned int)
| | + 0.30%
PyObject_Free
| | + 0.20% dict_dealloc
| + 8.10% Gil::Gil(SafeThreadState&, bool)
| + 8.10% PyEval_RestoreThread
| + 8.10% PyThread_acquire_lock
| + 8.10% sem_wait@@GLIBC_2.2.5
| + 8.10% __new_sem_wait_slow.constprop.0
| + 8.10% do_futex_wait.constprop.1
+ 0.60%
std::condition_variable::wait(std::unique_lock<std::mutex>&)
--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfmeec(a)rit.edu
CONFIDENTIALITY NOTE: The information transmitted, including
attachments, is
intended only for the person(s) or entity to
which it is addressed and
may
contain confidential and/or privileged material.
Any review,
retransmission,
dissemination or other use of, or taking of any
action in reliance upon
this
information by persons or entities other than the
intended recipient is
prohibited. If you received this in error, please contact the sender and
destroy any copies of this information.
------------------------
________________________________________
From: Mark Nelson <mnelson(a)redhat.com>
Sent: Thursday, December 19, 2019 11:47 AM
To: ceph-users(a)ceph.io
Subject: [ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5
If you can get a wallclock profiler on the mgr process we might be able
to figure out specifics of what's taking so much time (ie processing
pg_summary or something else). Assuming you have gdb with the python
bindings and the ceph debug packages installed, if you (are anyone)
could try gdbpmp on the 100% mgr process that would be fantastic.
https://github.com/markhpc/gdbpmp
gdbpmp.py -p`pidof ceph-mgr` -n 1000 -o mgr.gdbpmp
If you want to view the results:
gdbpmp.py -i mgr.gdbpmp -t 1
Thanks,
Mark
On 12/19/19 6:29 AM, Paul Emmerich wrote:
> We're also seeing unusually high mgr CPU usage on some setups, the
> only thing they have in common seem to > 300 OSDs.
>
> Threads using the CPU are "mgr-fin" and and "ms_dispatch"
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at
https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
>
www.croit.io <http://www.croit.io>
> Tel: +49 89 1896585 90
>
>
> On Thu, Dec 19, 2019 at 9:40 AM Serkan Çoban <cobanserkan(a)gmail.com
> <mailto:cobanserkan@gmail.com>> wrote:
>
> +1
> 1500 OSDs, mgr is constant %100 after upgrading from 14.2.2 to
14.2.5.
>
> On Thu, Dec 19, 2019 at 11:06 AM Toby Darling
> <toby(a)mrc-lmb.cam.ac.uk <mailto:toby@mrc-lmb.cam.ac.uk>> wrote:
> >
> > On 18/12/2019 22:40, Bryan Stillwell wrote:
> > > That's how we noticed it too. Our graphs went silent after
> the upgrade
> > > completed. Is your large cluster over 350 OSDs?
> >
> > A 'me too' on this - graphs have gone quiet, and mgr is using
> 100% CPU.
> > This happened when we grew our 14.2.5 cluster from 328 to 436
OSDs.
Cheers
Toby
--
Toby Darling, Scientific Computing (2N249)
MRC Laboratory of Molecular Biology
Francis Crick Avenue
Cambridge Biomedical Campus
Cambridge CB2 0QH
Phone 01223 267070
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
<mailto:ceph-users@ceph.io>
To unsubscribe send an email to
ceph-users-leave(a)ceph.io
<mailto:ceph-users-leave@ceph.io>
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
<mailto:ceph-users@ceph.io>
To unsubscribe send an email to ceph-users-leave(a)ceph.io
<mailto:ceph-users-leave@ceph.io>
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io