Hi all,
I am testing my "multifs auth caps" PR [1] with teuthology. The issue
is that just to discover mistakes in my patch to files like
kernel_mount.py [2], mount.py [3], and fuse_mount.py [4], I need to push
the branch to ceph-ci, wait several hours for the build to complete,
trigger the teuthology tests, wait again for them to run, and then
repeat all of these steps until every issue in my patch is fixed.
Since [2][3][4] are Python programs, they play no role in the build
process, AFAIK. If that is indeed the case, is there a way to skip the
"waiting for the build to complete" part and trigger the tests directly
using the binaries from a previous build? This would save me several
hours and also take the boredom out of testing these changes. If
previous builds are wiped out when I update my copy of the PR branch on
ceph-ci, I can maintain two branches on ceph-ci instead: one for builds
and the other for the Python changes.
I did run my tests locally with vstart_runner.py to reduce the number
of round trips to pulpito.ceph.com, but the changes in [2][3][4] don't
get tested that way, since vstart_runner.py uses its own classes for
handling CephFS mounts (see the sketch below).
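Purely to illustrate the structure, here is a minimal, self-contained
sketch of the problem; the class names below are simplified stand-ins,
not the real ones from the qa tree:

    # Schematic only: why local runs never exercise the edited mount helpers.
    class KernelMount:                      # stands in for qa/tasks/cephfs/kernel_mount.py
        def mount(self):
            print("remote mount logic - the code my PR touches")

    class LocalKernelMount(KernelMount):    # stands in for vstart_runner's local variant
        def mount(self):
            print("local mount logic - never calls the remote helper")

    LocalKernelMount().mount()              # vstart_runner effectively runs this override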
Thanks,
- Rishabh
[1] https://github.com/ceph/ceph/pull/32581
[2] https://github.com/rishabh-d-dave/ceph/blob/wip-djf-15070/qa/tasks/cephfs/k…
[3] https://github.com/rishabh-d-dave/ceph/blob/wip-djf-15070/qa/tasks/cephfs/m…
[4] https://github.com/rishabh-d-dave/ceph/blob/wip-djf-15070/qa/tasks/cephfs/f…
Hi,
For almost a week now, I have been hitting this vstart.sh issue; I have
run into it 5-6 times. After rebuilding a branch a few times, running
vstart.sh fails with:
Populating config ...
Traceback (most recent call last):
File "/home/rishabh/repos/ceph/multifs-auth/build/bin/ceph", line
151, in <module>
from ceph_daemon import admin_socket, DaemonWatcher, Termsize
File "/home/rishabh/repos/ceph/multifs-auth/src/pybind/ceph_daemon.py",
line 24, in <module>
from prettytable import PrettyTable, HEADER
ModuleNotFoundError: No module named 'prettytable'
It's odd to get this error because, first, vstart.sh was running fine
until the last time I ran make and, second, prettytable seems to be
present on the system:
$ pip list | grep prettytable
DEPRECATION: Python 2.7 reached the end of its life on January 1st,
2020. Please upgrade your Python as Python 2.7 is no longer
maintained. A future version of pip will drop support for Python 2.7.
More details about Python 2 support in pip, can be found at
https://pip.pypa.io/en/latest/development/release-process/#python-2-support
prettytable 0.7.2
$ pip2 list | grep prettytable
DEPRECATION: Python 2.7 reached the end of its life on January 1st,
2020. Please upgrade your Python as Python 2.7 is no longer
maintained. A future version of pip will drop support for Python 2.7.
More details about Python 2 support in pip, can be found at
https://pip.pypa.io/en/latest/development/release-process/#python-2-support
prettytable 0.7.2
$
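Both of those listings come from Python 2, going by the deprecation
warning, while I build with -DWITH_PYTHON2=OFF (full do_cmake.sh
arguments below). A quick check like the following, run with the
python3 that build/bin/ceph actually uses, should show whether that
interpreter can see prettytable; this is only a guess at the mismatch,
I have not confirmed it:

    import importlib.util
    import sys

    # Which interpreter is this, and can it import prettytable?
    print(sys.executable, sys.version)
    print(importlib.util.find_spec("prettytable"))  # None means prettytable is missing here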
Simply running make or install-deps.sh again doesn't fix this issue; I
have to delete the build directory and build from scratch again. I am
using Fedora 31. The following are my arguments to the do_cmake.sh
script:
-DWITH_PYTHON2=OFF -DWITH_CEPHFS_SHELL=ON -DWITH_BABELTRACE=OFF
-DWITH_MANPAGE=OFF -DWITH_RBD=OFF -DWITH_RADOSGW=OFF -DWITH_KRBD=OFF
I last rebased my branch onto the latest master (the day before
yesterday, IIRC) to check whether this has already been fixed, but
that's not the case. Is this a known issue? How do I avoid building
from scratch repeatedly?
Thanks,
- Rishabh
Hi all,
We are working on the RBD persistent writeback cache, which caches data
on block devices.
Currently, we operate the cache device using the driver in BlueStore:
src/os/bluestore/KernelDevice.
We may also need to use other block device drivers from BlueStore, so
we want to reuse that code in librbd.
Thanks to Jason, who gave us some suggestions, and Changcheng has
submitted a PR for this work: https://github.com/ceph/ceph/pull/34622/
More suggestions and comments are welcome.
Best wishes
Lisa
Hello! I am seeing a TypeError when configuring a cluster handle, and it seems like it is *probably* some sort of versioning/build related issue, but I wanted to get some clarity here. Also, if it is a bug, I am not exactly sure how to register it, so some pointers there would be great.
The description of the issue is that the rados.py module (<path-to-pythonlib>/python3.6/site-packages/rados.py) has code of the following nature:
...
if clustername is not None and not isinstance(clustername, str):
    raise TypeError('clustername must be a string or None')
...
ret = run_in_thread(
    self.librados.rados_create2,
    (
        byref(self.cluster),
        c_char_p(clustername),
        c_char_p(name),
        c_uint64(flags)
    )
)
but calling `c_char_p(<str>)` throws a TypeError ("bytes or integer address expected instead of str instance"). This leads me to believe that at some point, or for some versions, c_char_p took a string and now for my environment it doesn't. I went through the rados.py file and byte-encoded the strings (using "string".encode('utf-8')), and the TypeError is no longer thrown, but I haven't yet tested much further in order to know if I've shot myself in the foot. I tried looking for the source, but couldn't quite figure out where this code is generated from, or kept. The closest I seem to be to finding the source is this file: https://github.com/ceph/ceph/blob/master/src/pybind/rados/rados.pyx
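For what it's worth, here is a minimal standalone illustration of what
I believe is happening, using plain ctypes with no rados involved; it is
what pushed me towards the byte-encoding workaround:

    from ctypes import c_char_p

    c_char_p(b"ceph")                   # fine: ctypes expects bytes under Python 3
    c_char_p("ceph".encode("utf-8"))    # fine: the workaround I applied throughout rados.py
    try:
        c_char_p("ceph")                # fails under Python 3
    except TypeError as e:
        print(e)                        # "bytes or integer address expected instead of str instance"

Under Python 2 a str is already a byte string, which would explain why
the same code path works there but not under Python 3.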
I would appreciate some help/insight/guidance as to a proper fix,
whether this bug has been filed before (how could it not have been? yet
I couldn't find it when searching the issue tracker), and where to
report it if it is unreported.
Thanks!
------------------------------
Verbose, relevant information below:
A minimal script to reproduce the errors:
https://gist.github.com/drin/461942b0a361053203607cb5eb17cac4
The error:
TypeError: bytes or integer address expected instead of str instance
The environment:
cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"
rados package installed:
python3-rados_12.2.12-0ubuntu0.18.04.5_amd64.deb
A 2-node cluster (1 client, 1 osd), setup in the following way on cloudlab:
https://github.com/uccross/skyhookdm-ceph/wiki/Ceph-SkyhookDM-cluster-setup…
The python interpreter version and dependencies:
Python 3.6.9
atomicwrites 1.3.0
attrs 19.3.0
awscli 1.18.39
botocore 1.15.39
colorama 0.4.3
cython 0.29.16
docutils 0.15.2
flatbuffers 1.11
h5py 2.10.0
importlib-metadata 1.6.0
jmespath 0.9.5
more-itertools 8.2.0
numpy 1.18.2
owlready2 0.23
pandas 1.0.3
pluggy 0.13.1
py 1.8.1
pyarrow 0.15.1
pyasn1 0.4.8
pytest 3.10.1
python-dateutil 2.8.1
pytz 2019.3
pyyaml 5.3.1
rsa 3.4.2
s3transfer 0.3.3
scipy 1.4.1
six 1.14.0
urllib3 1.25.9
zipp 3.1.0
Hello there,
I am trying to understand the process flow of the RGW part of Ceph with
the help of gdb, but I am stuck at a point where the program exits when
it calls global_init_prefork(). After this function, gdb shows that the
main thread has exited successfully (detail is given below) and gdb
stops; I am never able to get past this step. To reach the function I
want to study, I have to get past this one first in gdb.
Here is the image (attached screenshot: Screenshot from 2020-04-19 16-13-17.png).
A strange thing: when I add an output marker after that function (by
"output marker" I mean writing some text to a file to check whether the
program reaches that part of the code) and run the cluster normally,
the output appears; but when running under gdb, the output never shows
up.
Any suggestions will be of great help :)
We're glad to announce the availability of the ninth, and very likely
the last, stable release in the Ceph Mimic stable release series. This
release fixes bugs across all components and also contains an RGW
security fix. We recommend that all Mimic users upgrade to this version.
We thank everyone for making this release possible.
Notable Changes
---------------
* CVE-2020-1760: Fixed XSS due to RGW GetObject header-splitting
* The configuration value `osd_calc_pg_upmaps_max_stddev` used for upmap
balancing has been removed. Instead use the mgr balancer config
`upmap_max_deviation` which now is an integer number of PGs of deviation
from the target PGs per OSD. This can be set with a command like
`ceph config set mgr mgr/balancer/upmap_max_deviation 2`. The default
`upmap_max_deviation` is 1. There are situations where CRUSH rules
would not allow a pool to ever have completely balanced PGs, for
example if CRUSH requires 1 replica on each of 3 racks but there are
fewer OSDs in one of the racks. In those cases, the configuration value
can be increased.
* The `cephfs-data-scan scan_links` command now automatically repairs
inotables and the snaptable.
For the full changelog please refer to the official release blog entry
at https://ceph.io/releases/v13-2-9-mimic-released
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-13.2.9.tar.gz
* For packages, see
http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 58a2a9b31fd08d8bb3089fce0e312331502ff945
--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
GF: Felix Imendörffer
Hi Folks,
Perf meeting for today is starting in 20 minutes! Topics today include
ShardedOpWQ, switching to the 4K min_alloc size for HDD (with Igor's
recent work), and perf CI. See you there!
Etherpad:
https://pad.ceph.com/p/performance_weekly
Bluejeans:
https://bluejeans.com/908675367
Thanks,
Mark
Hi everyone,
A couple of weeks ago the first online Ceph Developer Summit was held
to plan and discuss features for the next Ceph release - Pacific. Here
are some of the highlights from the RADOS session.
Recording: https://www.youtube.com/watch?reload=9&v=4KZCB-XzFCY
General improvements
- Adaptive recovery settings for smarter throttling of client vs recovery I/O
- Reflect severity of degradedness in ceph health and improve
mechanism to discover unfound objects that can help recover PGs
- Distributed tracing in the OSD I/O path to identify performance
bottlenecks and improve debugging capabilities -
https://github.com/ceph/ceph/pull/31358
- Telemetry: mapping device errors and osd crashes back to devices to
help improve disk prediction models
- New score-based leader election system in the monitors - first step
to reliable stretched Ceph clusters -
https://github.com/ceph/ceph/pull/32336
- Controlled osdmap trimming - https://github.com/ceph/ceph/pull/19076
- Automated OSD re-provisioning by the manager to help with repair and
migrating to new BlueStore formats
- Dedup support starting with RGW -
https://github.com/athanatos/ceph/blob/sjust/wip-refcount/doc/dev/deduplica…
- Manager and python subinterpreters - leave subinterpreters as they
are now, evaluate the need to return to a single interpreter and
review mgr modules for problematic dependencies
BlueStore and Performance related improvements
- Lower Write Amplification and improve space-amplification with
rocksdb column family sharding -
https://github.com/ceph/ceph/pull/34006
- Lower Space Amplification
- New hybrid allocator - https://github.com/ceph/ceph/pull/33365
- Deferred "big" writes - https://github.com/ceph/ceph/pull/33434
- Memory Management
- Onode double cache fix - https://github.com/ceph/ceph/pull/27705
- Cache age-based binning - https://github.com/ceph/ceph/pull/23710
- Onode data structure diet - https://github.com/ceph/ceph/pull/32697
- Fix onode cache pinning - https://github.com/ceph/ceph/pull/32852
- io_uring
- Code is already in ceph, io_uring kernel support backported to centos 8.2.
- Consider making it default for kernels that support it
- Should we continue to depend on the current aio_t for io_uring?
- Ongoing QoS work
- Balancer and PG autoscaler
- Turn balancer on in upmap mode by default and set
min_compat_client to luminous for new clusters -
https://github.com/ceph/ceph/pull/34541
- Smarter balancing of PGs - take into account bytes used, omap used
to balance in addition to pg_num, balance based on primariness for
read performance
- PG autoscaler only scale down under pressure
More details in:
https://pad.ceph.com/p/cds-pacific
https://trello.com/b/ugTc2QFH/ceph-backlog (look for the blue pacific label)
Thanks,
Core RADOS Team
Hi,
I got some questions from people asking whether ceph-fuse would build
with libfuse3... which it doesn't, as it generates compile-time errors.
So I checked the code and found one location where it says:
#if FUSE_VERSION >= FUSE_MAKE_VERSION(2, 9)
static void fuse_ll_fallocate(fuse_req_t req, fuse_ino_t ino, int mode,
                              off_t offset, off_t length,
                              struct fuse_file_info *fi)
{
  CephFuse::Handle *cfuse = fuse_ll_req_prepare(req);
  Fh *fh = (Fh*)fi->fh;
  int r = cfuse->client->ll_fallocate(fh, mode, offset, length);
  fuse_reply_err(req, -r);
}
#endif
But that looks more like a check for whether we are running the latest
versions of libfuse2...
So is there no libfuse3 code in ceph-fuse?
Thanx,
--WjW