- Debian Jessie 8.x. Debian Wheezy 7.x’s g++ has incomplete support for C++11 (and no systemd).
V10.1.0 JEWEL (RELEASE CANDIDATE)
This major release of Ceph will be the foundation for the next long-term stable release. There have been many major changes since the Infernalis (9.2.x) and Hammer (0.94.x) releases, and the upgrade process is non-trivial. Please read these release notes carefully.
There are a few known issues with this release candidate; see below.
KNOWN ISSUES WITH V10.1.0
- While running a mixed-version cluster of Jewel and Infernalis or Hammer monitors, any MDSMap update will cause the pre-Jewel monitors to crash. The workaround is to upgrade all monitors. There is a fix, but it is still being tested.
- Some of the rbd-mirror functionality for switching between active and replica images is not yet merged.
MAJOR CHANGES FROM INFERNALIS
-
CephFS:
- This is the first release in which CephFS is declared stable and production ready! Several features are disabled by default, including snapshots and multiple active MDS servers.
- The repair and disaster recovery tools are now feature complete.
- A new cephfs-volume-manager module is included that provides a high-level interface for creating “shares” for OpenStack Manila and similar projects.
- There is now experimental support for multiple CephFS file systems within a single cluster.
-
RGW:
- The multisite feature has been almost completely rearchitected and rewritten to support any number of clusters/sites, bidirectional fail-over, and active/active configurations.
- You can now access radosgw buckets via NFS (experimental).
- The AWS4 authentication protocol is now supported.
- There is now support for S3 request payer buckets.
- The new multitenancy infrastructure improves compatibility with Swift, which provides a separate container namespace for each user/tenant.
- The OpenStack Keystone v3 API is now supported. There are a range of other small Swift API features and compatibility improvements as well, including bulk delete and SLO (static large objects).
-
RBD:
- There is new support for mirroring (asynchronous replication) of RBD images across clusters. This is implemented as a per-RBD image journal that can be streamed across a WAN to another site, and a new rbd-mirror daemon that performs the cross-cluster replication.
- The exclusive-lock, object-map, fast-diff, and journaling features can be enabled or disabled dynamically. The deep-flatten feature can be disabled dynamically but not re-enabled.
- The RBD CLI has been rewritten to provide command-specific help and full bash completion support.
- RBD snapshots can now be renamed.
-
RADOS:
- BlueStore, a new OSD backend, is included as an experimental feature. The plan is for it to become the default backend in the K or L release.
- The OSD now persists scrub results and provides a librados API to query results in detail, including the nature of inconsistencies found, the ability to fetch alternate versions of the same object (if any), and fine-grained control over repair.
MAJOR CHANGES FROM HAMMER
-
General:
- Ceph daemons are now managed via systemd (with the exception of Ubuntu Trusty, which still uses upstart).
- Ceph daemons run as the ‘ceph’ user instead of root.
- On Red Hat distros, there is also an SELinux policy.
-
RADOS:
- The RADOS cache tier can now proxy write operations to the base tier, allowing writes to be handled without forcing migration of an object into the cache.
- The SHEC erasure coding support is no longer flagged as experimental. SHEC trades some additional storage space for faster repair.
- There is now a unified queue (and thus prioritization) of client IO, recovery, scrubbing, and snapshot trimming.
- There have been many improvements to low-level repair tooling (ceph-objectstore-tool).
- The internal ObjectStore API has been significantly cleaned up in order to facilitate new storage backends like NewStore.
-
RGW:
- The Swift API now supports object expiration.
- There are many Swift API compatibility improvements.
-
RBD:
- The rbd du command shows actual usage (quickly, when object-map is enabled).
- The object-map feature has seen many stability improvements.
- Object-map and exclusive-lock features can be enabled or disabled dynamically.
- You can now store user metadata and set persistent librbd options associated with individual images.
- The new deep-flatten feature allows flattening of a clone and all of its snapshots. (Previously snapshots could not be flattened.)
- The export-diff command is now faster (it uses aio). There is also a new fast-diff feature.
- The --size argument can be specified with a suffix for units (e.g., --size 64G).
- There is a new rbd status command that, for now, shows who has the image open/mapped.
-
CephFS:
- You can now rename snapshots.
- There have been ongoing improvements around administration, diagnostics, and the check and repair tools.
- The caching and revocation of client cache state due to unused inodes has been dramatically improved.
- The ceph-fuse client behaves better on 32-bit hosts.
DISTRO COMPATIBILITY
Starting with Infernalis, we have dropped support for many older distributions so that we can move to a newer compiler toolchain (e.g., C++11). Although it is still possible to build Ceph on older distributions by installing backported development tools, we are not building and publishing release packages for them on ceph.com.
We now build packages for:
- CentOS 7.x. We have dropped support for CentOS 6 (and other RHEL 6 derivatives, like Scientific Linux 6).
- Debian Jessie 8.x. Debian Wheezy 7.x’s g++ has incomplete support for C++11 (and no systemd).
- Ubuntu Trusty 14.04 and Ubuntu Xenial. Ubuntu Precise 12.04 is no longer supported.
- Fedora 22 or later.
UPGRADING FROM FIREFLY
Upgrading directly from Firefly v0.80.z is not recommended. It is possible to do a direct upgrade, but not without downtime. We recommend that clusters are first upgraded to Hammer v0.94.6 or a later v0.94.z release; only then is it possible to upgrade to Jewel 10.2.z for an online upgrade (see below).
To do an offline upgrade directly from Firefly, all Firefly OSDs must be stopped and marked down before any Jewel OSDs will be allowed to start up. This fencing is enforced by the Jewel monitor, so use an upgrade procedure like:
Upgrade Ceph on monitor hosts
Restart all ceph-mon daemons
Upgrade Ceph on all OSD hosts
Stop all ceph-osd daemons
Mark all OSDs down with something like:
ceph osd down `seq 0 1000`
Start all ceph-osd daemons
Upgrade and restart remaining daemons (ceph-mds, radosgw)
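The mark-down step in the procedure above relies on shell command substitution to expand a range of OSD ids into separate arguments. A minimal sketch, assuming the highest OSD id in the cluster is at most 1000 (the example value from the text); the variable names are illustrative, and the command is printed rather than executed so the snippet can be tried anywhere:

```shell
# Highest OSD id to cover; 1000 is the example value from the text.
max_osd_id=1000

# seq prints 0..max_osd_id one per line; leaving the expansion unquoted
# below turns each id into its own command-line argument.
osd_ids=$(seq 0 "$max_osd_id")

# Printed, not executed. Remove the leading "echo" to run it for real
# on a host with an admin keyring.
echo ceph osd down $osd_ids
```

Ids that do not correspond to an existing OSD are harmless here, which is why a generous upper bound is acceptable.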
UPGRADING FROM HAMMER
-
All cluster nodes must first upgrade to Hammer v0.94.4 or a later v0.94.z release; only then is it possible to upgrade to Jewel 10.2.z.
-
For all distributions that support systemd (CentOS 7, Fedora, Debian Jessie 8.x, OpenSUSE), Ceph daemons are now managed using native systemd files instead of the legacy sysvinit scripts.
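As a sketch of what this looks like in practice, assuming the unit names `ceph.target` (all daemons on a host) and `ceph-<type>@<id>` (a single daemon instance), and using osd.12 purely as an illustrative id. The snippet prints the `systemctl` invocations rather than running them, and `ceph_unit` is a hypothetical helper, not part of Ceph:

```shell
# Compose the systemd instance name for one daemon, e.g. "ceph-osd@12".
# ceph_unit is a hypothetical helper, not part of Ceph.
ceph_unit() {
    # $1: daemon type (mon, osd, mds); $2: daemon id
    printf 'ceph-%s@%s\n' "$1" "$2"
}

# Printed, not executed; copy the lines to a Ceph host to run them.
echo "systemctl start ceph.target"             # start all Ceph daemons on this host
echo "systemctl status $(ceph_unit osd 12)"    # check the status of osd.12
```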
The main notable distro that is not yet using systemd is Ubuntu trusty 14.04. (The next Ubuntu LTS, 16.04, will use systemd instead of upstart.)
-
Ceph daemons now run as user and group ceph by default. The ceph user has a static UID assigned by Fedora and Debian (also used by derivative distributions like RHEL/CentOS and Ubuntu). On SUSE the ceph user will currently get a dynamically assigned UID when the user is created.
If your systems already have a ceph user, upgrading the package will cause problems. We suggest you first remove or rename the existing ‘ceph’ user and ‘ceph’ group before upgrading.
When upgrading, administrators have two options:
-
Add the following line to ceph.conf on all hosts:
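In the published Jewel release notes this line sets the `setuser match path` option. A sketch, assuming the default data directory layout under /var/lib/ceph; placing it in the `[global]` section is an assumption, and `$type`, `$cluster` and `$id` are Ceph metavariables expanded by each daemon, not shell variables:

```ini
[global]
# Assumption: default data directory layout; adjust if your paths differ.
setuser match path = /var/lib/ceph/$type/$cluster-$id
```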
This will make the Ceph daemons run as root (i.e., not drop privileges and switch to user ceph) if the daemon’s data directory is still owned by root. Newly deployed daemons will be created with data owned by user ceph and will run with reduced privileges, but upgraded daemons will continue to run as root.
-
Fix the data ownership during the upgrade. This is the preferred option, but it is more work and can be very time consuming. The process for each host is to:
-
Upgrade the ceph package. This creates the ceph user and group.
-
Stop the daemon(s).
-
Fix the ownership.
-
Restart the daemon(s).
Alternatively, the same process can be done with a single daemon type, for example by stopping only monitors and chowning only /var/lib/ceph/mon.
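The stop, chown and restart steps above can be sketched as a small dry-run script. It assumes the package has already been upgraded, that the running daemons were started by the legacy sysvinit script (hence `service ceph stop`), and that the host uses systemd afterwards; `run` and `fix_ceph_ownership` are hypothetical helper names. By default nothing is executed, the commands are only printed:

```shell
# Dry-run by default: each command is printed with a "+ " prefix instead of run.
# Set DRY_RUN=0 to execute for real (needs root on an already-upgraded Ceph host).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: directory to hand over to the ceph user, e.g. /var/lib/ceph
fix_ceph_ownership() {
    run service ceph stop               # assumption: daemons were started via sysvinit
    run chown -R ceph:ceph "$1"         # user and group created by the upgraded package
    run systemctl start ceph.target     # assumption: systemd host
}

fix_ceph_ownership /var/lib/ceph
```

For the single-daemon-type variant, pass a narrower path such as /var/lib/ceph/mon and narrow the stop and start commands to that daemon type as well. On large OSDs the recursive chown is the slow part, which is why this option is described as time consuming.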
-
The on-disk format for the experimental KeyValueStore OSD backend has changed. You will need to remove any OSDs using that backend before you upgrade any test clusters that use it.
-
When a pool quota is reached, librados operations now block indefinitely, the same way they do when the cluster fills up. (Previously they would return -ENOSPC). By default, a full cluster or pool will now block. If your librados application can handle ENOSPC or EDQUOT errors gracefully, you can get error returns instead by using the new librados OPERATION_FULL_TRY flag.
-
librbd’s rbd_aio_read and Image::aio_read API methods no longer return the number of bytes read upon success. Instead, they return 0 upon success and a negative value upon failure.
-
‘ceph scrub’, ‘ceph compact’ and ‘ceph sync force’ are now DEPRECATED. Users should instead use ‘ceph mon scrub’, ‘ceph mon compact’ and ‘ceph mon sync force’.
-
‘ceph mon_metadata’ should now be used as ‘ceph mon metadata’. There is no need to deprecate this command (same major release since it was first introduced).
-
The --dump-json option of “osdmaptool” is replaced by --dump json.
-
The “pg ls-by-{pool,primary,osd}” and “pg ls” commands now take “recovering” instead of “recovery” to include recovering PGs in the listing.
UPGRADING FROM INFERNALIS
-
There are no major compatibility changes since Infernalis. Simply upgrading the daemons on each host and restarting all daemons is sufficient.
-
The rbd CLI no longer accepts the deprecated ‘--image-features’ option during create, import, and clone operations. The ‘--image-feature’ option should be used instead.
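A sketch of how a script that used to pass a single feature set might build the replacement arguments, assuming one `--image-feature` flag per feature name. The pool/image name `mypool/myimage`, the 1G size and the chosen feature list are placeholders (layering is a standard RBD feature not otherwise named in these notes), and the command is printed rather than executed:

```shell
# Features to request for the new image; the list is illustrative.
features="layering exclusive-lock object-map"

# Build one "--image-feature <name>" pair per feature.
feature_args=""
for f in $features; do
    feature_args="$feature_args --image-feature $f"
done

# Printed, not executed; mypool/myimage and 1G are placeholders.
echo "rbd create --size 1G$feature_args mypool/myimage"
```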
-
The rbd legacy image format (version 1) is deprecated with the Jewel release. Attempting to create a new version 1 RBD image will result in a warning. Future releases of Ceph will remove support for version 1 RBD images.
-
The ‘send_pg_creates’ and ‘map_pg_creates’ mon CLI commands are obsolete and no longer supported.
-
A new configuration option, ‘mon_election_timeout’, has been added to limit the maximum waiting time of the monitor election process, which was previously bounded by ‘mon_lease’.
-
CephFS filesystems created using versions older than Firefly (0.80) must use the new “cephfs-data-scan tmap_upgrade” command after upgrading to Jewel. See ‘Upgrading’ in the CephFS documentation for more information.
-
The ‘ceph mds setmap’ command has been removed.
-
The default RBD image features for new images have been updated to enable the following: exclusive lock, object map, fast-diff, and deep-flatten. These features are not currently supported by the RBD kernel driver nor older RBD clients. These features can be disabled on a per-image basis via the RBD CLI or the default features can be updated to the pre-Jewel setting by adding the following to the client section of the Ceph configuration file:
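A sketch of the configuration fragment, assuming the pre-Jewel setting is expressed as the value 1, which is the bit for the layering feature alone:

```ini
[client]
# 1 = layering only; assumption: this is the intended pre-Jewel feature set
rbd default features = 1
```

This only affects images created after the change; existing images keep the features they were created with.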
-
The rbd legacy image format (version 1) is deprecated with the Jewel release.