Ceph RADOS Gateway (RGW) Cloud Sync
Ceph RGW has a module called Cloud Sync, which syncs zone data to a remote
cloud service. The sync is unidirectional: data is not synced back from the
remote zone. The goal of this module is to enable syncing data to multiple
cloud providers. The currently supported cloud providers are those that are
compatible with AWS (S3).
More info about Ceph Cloud Sync here:
https://docs.ceph.com/en/latest/radosgw/cloud-sync-module.
The Cloud Sync module is built atop the multi-site framework, which allows
forwarding data and metadata to a different external tier.
More info about Ceph sync modules here: https://docs.ceph.com/en/quincy/radosgw/sync-modules.
Problem Description
The current ceph-radosgw charm does not support the cloud-sync module.
Because the cloud-sync module is built atop the multi-site framework, we can
leverage the existing radosgw-multisite Juju relation interface.
The cloud-sync module is enabled via a new relation with the primary
ceph-radosgw application. The deployment is similar to the existing RGW
multi-site replication steps:
https://ubuntu.com/ceph/docs/setting-up-multi-site.
The only differences are:
- Both primary-ceph-radosgw and secondary-ceph-radosgw (with the cloud-sync
  enabled zone) are related to the same Ceph cluster.
- When data is replicated from the primary-ceph-radosgw zone, the
  secondary-ceph-radosgw zone will write into a remote S3, instead of Ceph
  storage. The secondary-ceph-radosgw zone will have the appropriate S3
  credentials configured for this task.
- Data sync is unidirectional, therefore the secondary-ceph-radosgw zone will
  be read-only.
More info about how to configure the cloud-sync module here:
https://docs.ceph.com/en/latest/radosgw/cloud-sync-module/#how-to-configure.
Proposed Change
Add a new relation, called cloud-sync, to the ceph-radosgw charm. The new
relation will implement the existing radosgw-multisite relation interface.
In the new cloud-sync relation, when the secondary multi-site zone is
created, we need to pass --tier-type=cloud to the radosgw-admin zone create
command in order to have the cloud-sync module enabled. Besides this, we need
to add the S3 target credentials via the --tier-config parameter of the
radosgw-admin zone modify command.
These steps are documented at: https://docs.ceph.com/en/latest/radosgw/cloud-sync-module.
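The two radosgw-admin invocations described above can be sketched as small argv builders. This is only an illustration, not the charm's implementation: the helper names zone_create_cmd and zone_modify_cmd are hypothetical, the endpoint and key values in the usage lines are placeholders, and the dotted connection.* keys follow the form shown in the upstream cloud-sync documentation.

```python
import shlex
from typing import Dict, List


def zone_create_cmd(zonegroup: str, zone: str, endpoints: List[str]) -> List[str]:
    """Build the argv that creates a zone with the cloud tier type."""
    return [
        "radosgw-admin", "zone", "create",
        f"--rgw-zonegroup={zonegroup}",
        f"--rgw-zone={zone}",
        f"--endpoints={','.join(endpoints)}",
        "--tier-type=cloud",  # this flag is what enables the cloud-sync module
    ]


def zone_modify_cmd(zonegroup: str, zone: str, tier_config: Dict[str, str]) -> List[str]:
    """Build the argv that sets tier config entries as key=value pairs."""
    pairs = ",".join(f"{key}={value}" for key, value in tier_config.items())
    return [
        "radosgw-admin", "zone", "modify",
        f"--rgw-zonegroup={zonegroup}",
        f"--rgw-zone={zone}",
        f"--tier-config={pairs}",
    ]


# Placeholder values, for illustration only.
create = zone_create_cmd("east", "primary-cloud-sync", ["http://rgw.example:80"])
modify = zone_modify_cmd(
    "east",
    "primary-cloud-sync",
    {"connection.access_key": "ACCESS_KEY", "connection.secret": "SECRET_KEY"},
)
print(shlex.join(create))
print(shlex.join(modify))
```

Building argument lists rather than shell strings keeps credentials out of shell interpolation when the commands are later passed to a subprocess call.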
The Ceph cloud-sync module allows multiple S3 targets to be configured in
the same zone tier config. For this, we have profiles in the tier config.
Each profile maps a single source bucket (or multiple buckets via prefix) to
one S3 destination (Many-To-One mapping). The profiles in the tier config
are optional.
A profile contains info about:
- source_bucket, either a bucket name, or a bucket prefix (if it ends with *)
  that defines the source bucket(s) for this profile.
- target_path, a string that defines how the target path is created. The
  target path specifies a prefix to which the source object name is appended.
  The target path is configurable, and it can include any of the following
  variables:
  - $sid: unique string that represents the sync instance ID
  - $zonegroup: the zonegroup name
  - $zonegroup_id: the zonegroup ID
  - $zone: the zone name
  - $zone_id: the zone ID
  - $bucket: source bucket name
  - $owner: source bucket owner ID
  For example:
  target_path = rgwx-${zone}-${sid}/${owner}/${bucket}
- connection_id, ID of the connection (with credentials) that will be used
  for this profile.
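To make the profile fields concrete, the following sketch shows how a source_bucket pattern could be matched and how a target_path could be expanded. It only illustrates the semantics described above and is not Ceph's implementation: the function names are hypothetical, joining the prefix and the object name with "/" is an assumption, and so is picking the first matching profile in list order.

```python
from string import Template
from typing import Dict, List, Optional


def bucket_matches(source_bucket: str, bucket: str) -> bool:
    """A trailing '*' makes source_bucket a prefix; otherwise it is an exact name."""
    if source_bucket.endswith("*"):
        return bucket.startswith(source_bucket[:-1])
    return bucket == source_bucket


def find_profile(profiles: List[Dict[str, str]], bucket: str) -> Optional[Dict[str, str]]:
    """Return the first profile whose source_bucket matches (assumed ordering)."""
    for profile in profiles:
        if bucket_matches(profile["source_bucket"], bucket):
            return profile
    return None


def expand_target_path(target_path: str, variables: Dict[str, str], object_name: str) -> str:
    """Substitute ${var} placeholders, then append the source object name."""
    prefix = Template(target_path).substitute(variables)
    return f"{prefix}/{object_name}"  # "/" separator is an assumption


# Illustrative values only.
variables = {"zone": "primary-cloud-sync", "sid": "abc123", "owner": "alice", "bucket": "photos"}
print(expand_target_path("rgwx-${zone}-${sid}/${owner}/${bucket}", variables, "2024/cat.jpg"))
```

string.Template is used here because it accepts both the $name and ${name} spellings that appear in the variable list and in the example path.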
A new charm config, called cloud-sync-target-path, will be added to configure
the target path for all the profiles. This allows a consistent target path
for the configured cloud-sync zone.
The profiles are configured through the use of s3-integrator Juju
applications together with the new config option cloud-sync-target-path.
It is mandatory to have a default S3 target for all the buckets that don’t
have a profile configured. The rationale is that every bucket needs to have
a sync target, and the default target is the fallback for any bucket that
doesn’t have a profile configured. A new charm config, called
cloud-sync-default-s3-target, will be added for this purpose.
We need to handle S3 credentials for the S3 targets configured in the
cloud-sync zone. For this purpose, we will use the s3-integrator charm
(https://github.com/canonical/s3-integrator). The ceph-radosgw charm will
have a new relation with the s3-integrator charm.
Each deployed application of the s3-integrator charm will handle credentials
for a single S3 target. When relating multiple s3-integrator applications to
the same secondary-ceph-radosgw cloud-sync application, the tier config will
be updated with profiles for each S3 target.
For example, the following Juju deployment commands:
#
# Assuming ceph-mon is already deployed
#
juju deploy ceph-radosgw primary-ceph-radosgw \
    --config realm=eu \
    --config zonegroup=east \
    --config zone=primary
juju relate primary-ceph-radosgw:mon ceph-mon:radosgw
juju deploy ceph-radosgw secondary-ceph-radosgw \
    --config realm=eu \
    --config zonegroup=east \
    --config zone=primary-cloud-sync \
    --config 'cloud-sync-target-path=${bucket}' \
    --config cloud-sync-default-s3-target=minio-dev
juju relate secondary-ceph-radosgw:mon ceph-mon:radosgw
juju deploy s3-integrator minio-dev \
    --config endpoint=http://10.7.133.248:9000 \
    --config region=us-east-1 \
    --config s3-uri-style=path
juju deploy s3-integrator minio-production \
    --config endpoint=http://10.7.133.250:9000 \
    --config region=us-east-2 \
    --config s3-uri-style=path \
    --config 'bucket=production*'
juju relate secondary-ceph-radosgw:s3-credentials minio-dev:s3-credentials
juju relate secondary-ceph-radosgw:s3-credentials minio-production:s3-credentials
#
# After all applications' units are idle
#
juju relate secondary-ceph-radosgw:cloud-sync primary-ceph-radosgw:primary
juju run minio-dev/leader sync-s3-credentials --string-args access-key=MY_DEV_ACCESS_KEY secret-key=MY_DEV_SECRET_KEY
juju run minio-production/leader sync-s3-credentials --string-args access-key=MY_PROD_ACCESS_KEY secret-key=MY_PROD_SECRET_KEY
will render the following tier config in the cloud sync zone:
{
    // ...
    "name": "primary-cloud-sync",
    // ...
    "tier_config": {
        "connections": [
            {
                "id": "minio-dev",
                "endpoint": "http://10.7.133.248:9000",
                "region": "us-east-1",
                "host_style": "path",
                "access_key": "MY_DEV_ACCESS_KEY",
                "secret": "MY_DEV_SECRET_KEY"
            },
            {
                "id": "minio-production",
                "endpoint": "http://10.7.133.250:9000",
                "region": "us-east-2",
                "host_style": "path",
                "access_key": "MY_PROD_ACCESS_KEY",
                "secret": "MY_PROD_SECRET_KEY"
            }
        ],
        "profiles": [
            {
                "connection_id": "minio-production",
                "source_bucket": "production*",
                "target_path": "${bucket}"
            }
        ],
        "connection_id": "minio-dev",
        "target_path": "${bucket}"
    },
    // ...
}
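The mapping from related S3 targets to the tier config shown above could be sketched roughly as follows. This is a hedged illustration rather than the charm's code: render_tier_config is a hypothetical helper, and the input keys (name, endpoint, region, s3-uri-style, bucket, access-key, secret-key) simply mirror the option names used in the example commands, so the real relation data fields may differ. The sketch also assumes that a non-default target without a bucket setting gets a connection but no profile.

```python
import json
from typing import Any, Dict, List


def render_tier_config(
    targets: List[Dict[str, str]], default_target: str, target_path: str
) -> Dict[str, Any]:
    """Build the tier_config dict from S3 targets and the two charm options."""
    if default_target not in {target["name"] for target in targets}:
        raise ValueError(f"default S3 target '{default_target}' is not related")

    # One connection per related S3 target, keyed by the application name.
    connections = [
        {
            "id": target["name"],
            "endpoint": target["endpoint"],
            "region": target["region"],
            "host_style": target["s3-uri-style"],
            "access_key": target["access-key"],
            "secret": target["secret-key"],
        }
        for target in targets
    ]
    # Only non-default targets that name source bucket(s) become profiles.
    profiles = [
        {
            "connection_id": target["name"],
            "source_bucket": target["bucket"],
            "target_path": target_path,
        }
        for target in targets
        if target["name"] != default_target and target.get("bucket")
    ]
    return {
        "connections": connections,
        "profiles": profiles,
        "connection_id": default_target,  # fallback for buckets without a profile
        "target_path": target_path,
    }


targets = [
    {"name": "minio-dev", "endpoint": "http://10.7.133.248:9000", "region": "us-east-1",
     "s3-uri-style": "path", "access-key": "MY_DEV_ACCESS_KEY", "secret-key": "MY_DEV_SECRET_KEY"},
    {"name": "minio-production", "endpoint": "http://10.7.133.250:9000", "region": "us-east-2",
     "s3-uri-style": "path", "bucket": "production*",
     "access-key": "MY_PROD_ACCESS_KEY", "secret-key": "MY_PROD_SECRET_KEY"},
]
print(json.dumps(render_tier_config(targets, "minio-dev", "${bucket}"), indent=4))
```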
Alternatives
None
Implementation
Assignee(s)
Primary assignee: ionutbalutoiu
Gerrit Topic
Use Gerrit topic “ceph-radosgw-cloud-sync” for all patches related to this spec.
git-review -t ceph-radosgw-cloud-sync
Work Items
- Add two new charm configs to ceph-radosgw:
  - cloud-sync-default-s3-target, the default S3 target for buckets that
    don’t have a profile configured in the tier config.
  - cloud-sync-target-path, a string that defines how the target path is
    created. The target path specifies a prefix to which the source object
    name is appended.
- Add a new relation called cloud-sync to the ceph-radosgw charm. The new
  relation implements the existing radosgw-multisite interface. The
  cloud-sync secondary zone will be configured with --tier-type=cloud, and
  connection info for the S3 targets will be fetched from the relation with
  the s3-integrator charm.
- When the cloud-sync relation is established, the ceph-radosgw cloud-sync
  application will be blocked until a relation with the s3-integrator
  application is created, which provides S3 credentials for the configured
  cloud-sync-default-s3-target.
- Add a new relation called s3-credentials, implementing the s3 interface,
  used to fetch S3 credentials for each S3 target in the cloud-sync tier
  config.
- The name of the related s3-integrator application will be used as the
  name (connection ID) of the profile configured in the tier config. From
  the relation data, we also fetch the source bucket(s) for each profile.
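The blocking rule in the work items can be sketched as a small pure function. This is an assumption-laden illustration, not the charm's status handling: cloud_sync_status is a hypothetical name, the status strings and messages are invented for the example, and treating an unset cloud-sync-default-s3-target as blocked is an assumption derived from the default target being mandatory.

```python
from typing import Iterable, Tuple


def cloud_sync_status(
    default_target: str, targets_with_credentials: Iterable[str]
) -> Tuple[str, str]:
    """Return (status, message) for the cloud-sync application.

    targets_with_credentials holds the names of related s3-integrator
    applications that have already provided S3 credentials.
    """
    if not default_target:
        return ("blocked", "cloud-sync-default-s3-target is not set")
    if default_target not in set(targets_with_credentials):
        return ("blocked", f"waiting for S3 credentials from '{default_target}'")
    return ("active", "")


# Only minio-production has provided credentials so far: still blocked.
print(cloud_sync_status("minio-dev", ["minio-production"]))
# The default target has provided credentials: the zone can be configured.
print(cloud_sync_status("minio-dev", ["minio-dev", "minio-production"]))
```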
Repositories
Documentation
The config options (cloud-sync-default-s3-target and cloud-sync-target-path)
will be documented in the ceph-radosgw charm.
Additional documentation for the new cloud-sync relation should also be added
to the charm deployment guide.
Security
ceph-radosgw: The Ceph Cloud Sync module requires S3 connection credentials
for the configured S3 targets. These credentials are fetched from the
s3-credentials relation with an application that implements the s3 relation
interface.
Testing
Code written or changed will be covered by unit tests; functional testing will
be implemented using the Zaza framework.
Dependencies
No new dependencies.