ceph最基础的组件是mon,mgr,osd,有了这3个组件之后ceph就可以提供块存储的能力,但是仅仅只有这一个能力,如果需要提供文件系统存储,那么就需要部署mds
如果还需要提供对象存储的话,那么就需要部署rgw (rados gateway)对外提供对象存储
文件系统是允许用户进行
增删改查
这4个操作的,但是对象存储只允许
增,删,查
,并不允许修改,在ceph集群中,对象存储默认会将文件切分成4M的大小,不足4M的按原大小
对象存储的概念最早是由
AWS
提出,并且也搞了一个标准叫做
s3(Simple Storage Service )
桶是对象存储里的一个逻辑概念,所有的对象都是存放在桶里的,但是之前说过,ceph上所有的数据都属存放在pg上,那他到底是存在桶里还是存在pg里?那pg是在桶上呢还是桶在pg上呢?
要搞清楚这个问题还得回到ceph,因为pg这个概念是存在于ceph上的,而如果我们底层的存储系统不是用的ceph呢?是minio呢,他是不是也有pg呢?很显然,答案是不对的,所以对象存储不管你底层存储是把数据往哪放,反正我,就只往桶里放,至于你把我的桶放在哪,那是你的事,你自己安排就好
rgw不属于ceph的服务端组件,是属于ceph的客户端组件,这样做是因为对象存储是提供的http地址,然而ceph服务端是没有提供http地址的,所以将rgw放在客户端,当用户请求http地址时,用户此时并不需要安装ceph客户端,只需要发起http请求,客户端此时再去请求服务端。
对象存储认证是基于桶的认证,这个与cephx的认证是不一样的
他需要用户提供AK,SK。
也就是说现在某个用户需要存放一个文件到ceph的对象存储里面,他需要请求ceph客户端,并于ceph客户端rgw完成认证,此时ceph客户端再去与ceph服务端完成cephx的认证
[root@ceph01 ~]# radosgw-admin realm create --rgw-realm=for_swift
{
"id": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"name": "for_swift",
"current_period": "89a157fc-baf4-4f5f-a4c7-884dc87c0be0",
"epoch": 1
}
现在创建了realm,有realm他后面肯定会有他的zonegroup,那么这些信息记录在哪呢?肯定是配置文件里面啊,每一个realm会有自己的配置文件来记录他这里面的zonegroup,zone,rgw。他的配置文件就叫peri,而peri会有一个ID,而现在这个配置文件里面是没有东西的,当后面创建了zonegroup之后这个配置文件就需要更新,那么这个ID也是会随着更新的,那么会更新就会有一个东西来记录他的版本,也就是更新了多少次
这里显示的current_period 就是他配置文件的ID
epoch就是配置文件的版本
[root@ceph01 ~]# radosgw-admin realm list
{
"default_info": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"realms": [
"for_swift"
]
}
[root@ceph01 ~]# radosgw-admin realm get --rgw-realm for_swift
{
"id": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"name": "for_swift",
"current_period": "89a157fc-baf4-4f5f-a4c7-884dc87c0be0",
"epoch": 1
}
# 先创建一个
[root@ceph01 ~]# radosgw-admin realm create --rgw-realm test
{
"id": "3d366d9c-b948-4d4d-8e15-d19a72579d77",
"name": "test",
"current_period": "d38a8be8-0d85-4244-8879-9e222761defb",
"epoch": 1
}
# 删掉他
[root@ceph01 ~]# radosgw-admin realm delete --rgw-realm test
[root@ceph01 ~]# radosgw-admin zonegroup create --rgw-realm for_swift --rgw-zonegroup for_swift
{
"id": "2b382a98-0c66-4ec3-b7d5-8cdbf1f6b93c",
"name": "for_swift",
"api_name": "for_swift",
"is_master": "false",
"endpoints": [],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "",
"zones": [],
"placement_targets": [],
"default_placement": "",
"realm_id": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"sync_policy": {
"groups": []
}
}
[root@ceph01 ~]# radosgw-admin zonegroup modify --rgw-realm for_swift --rgw-zonegroup for_swift --default --master
{
"id": "2b382a98-0c66-4ec3-b7d5-8cdbf1f6b93c",
"name": "for_swift",
"api_name": "for_swift",
"is_master": "true",
"endpoints": [],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "",
"zones": [],
"placement_targets": [],
"default_placement": "",
"realm_id": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"sync_policy": {
"groups": []
}
}
现在他的is-master就是true
[root@ceph01 ~]# radosgw-admin zonegroup get --rgw-realm for_swift --rgw-zonegroup for_swift
{
"id": "2b382a98-0c66-4ec3-b7d5-8cdbf1f6b93c",
"name": "for_swift",
"api_name": "for_swift",
"is_master": "true",
"endpoints": [],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "",
"zones": [],
"placement_targets": [],
"default_placement": "",
"realm_id": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"sync_policy": {
"groups": []
}
}
[root@ceph01 ~]# radosgw-admin zone create --rgw-realm for_swift --rgw-zonegroup for_swift --rgw-zone for_swift --master --default
{
"id": "5ee18ff4-9b1d-41fd-83bb-3f363634fb1f",
"name": "for_swift",
"domain_root": "for_swift.rgw.meta:root",
"control_pool": "for_swift.rgw.control",
"gc_pool": "for_swift.rgw.log:gc",
"lc_pool": "for_swift.rgw.log:lc",
"log_pool": "for_swift.rgw.log",
"intent_log_pool": "for_swift.rgw.log:intent",
"usage_log_pool": "for_swift.rgw.log:usage",
"roles_pool": "for_swift.rgw.meta:roles",
"reshard_pool": "for_swift.rgw.log:reshard",
"user_keys_pool": "for_swift.rgw.meta:users.keys",
"user_email_pool": "for_swift.rgw.meta:users.email",
"user_swift_pool": "for_swift.rgw.meta:users.swift",
"user_uid_pool": "for_swift.rgw.meta:users.uid",
"otp_pool": "for_swift.rgw.otp",
"system_key": {
"access_key": "",
"secret_key": ""
},
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": "for_swift.rgw.buckets.index",
"storage_classes": {
"STANDARD": {
"data_pool": "for_swift.rgw.buckets.data"
}
},
"data_extra_pool": "for_swift.rgw.buckets.non-ec",
"index_type": 0
}
}
],
"realm_id": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"notif_pool": "for_swift.rgw.log:notif"
}
可以看到他自己创建了很多存储池
[root@ceph01 ~]# ceph osd pool ls
device_health_metrics
test_pool
test02
rbd
mv_rbd
cephfs_data
cephfs_metadata
.rgw.root
default.rgw.log
default.rgw.control
default.rgw.meta
[root@ceph01 ~]# radosgw-admin period update --commit --rgw-realm for_swift
如果你的realm是默认的那么就可以不加上 –rgw-realm,不是默认的就需要指定
[root@ceph01 ~]# radosgw-admin realm get
{
"id": "286f9296-569d-45e0-b690-4069e3f0d6bc",
"name": "for_swift",
"current_period": "ace39846-04f4-46ba-8601-52f98aafeb1a",
"epoch": 2
}
[root@ceph01 ~]# ceph orch apply rgw for_swift --placement="1"
Scheduled rgw.for_swift update...
–placement=”1″ 指的是部署几个rgw
默认监听在80端口,如果想修改端口在创建的时候加上 –port
[root@ceph01 ~]# curl 10.104.45.241
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyBucketsResult>
可以有正常的返回值
[root@ceph01 ~]# radosgw-admin user create --uid=testuser --display-name=testuser
省略其他输出
"keys": [
{
"user": "testuser",
"access_key": "ZAKFA4A9WRJYI4J059NK",
"secret_key": "6TIZZpCMJVd5fcJcblBwgD7RepMk0gkglLONIBLj"
}
],
创建用户的时候会返回AK SK,但是AK SK 是s3标准的,如果使用swift的话这个是用不了的,因为这个用户就是s3用户
如果想要使用swift则还需要创建子用户
[root@ceph01 ~]# radosgw-admin key create --uid=testuser --key-type s3 --gen-access-key --gen-secret
{
"user_id": "testuser",
"display_name": "testuser",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"subusers": [],
"keys": [
{
"user": "testuser",
"access_key": "2EQ7K1XY4GU19C6IK694",
"secret_key": "nWoV1B4DZ40bh5CI6BxqkzgAbH3WhFDCHGLEHq6G"
},
{
"user": "testuser",
"access_key": "ZAKFA4A9WRJYI4J059NK",
"secret_key": "6TIZZpCMJVd5fcJcblBwgD7RepMk0gkglLONIBLj"
}
],
这个时候这个用户就有2个key了
[root@ceph01 ~]# radosgw-admin user list
[
"dashboard",
"testuser"
]
使用user list也是可以查看到用户的
[root@ceph01 ~]# radosgw-admin subuser create --uid=testuser --subuser testuser:swift
{
"user_id": "testuser",
"display_name": "testuser",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"subusers": [
{
"id": "testuser:swift",
"permissions": "<none>"
}
],
"keys": [
{
"user": "testuser",
"access_key": "2EQ7K1XY4GU19C6IK694",
"secret_key": "nWoV1B4DZ40bh5CI6BxqkzgAbH3WhFDCHGLEHq6G"
},
{
"user": "testuser",
"access_key": "ZAKFA4A9WRJYI4J059NK",
"secret_key": "6TIZZpCMJVd5fcJcblBwgD7RepMk0gkglLONIBLj"
}
],
"swift_keys": [
{
"user": "testuser:swift",
"secret_key": "gkstooSlkc7Ylf5Czrvzo8TxrgUY3egOW1lKBvSg"
}
],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"default_storage_class": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"temp_url_keys": [],
"type": "rgw",
"mfa_ids": []
}
因为创建的是子用户,所以需要指定主用户 –uid
再去指定子用户的类型 –subuser 主用户:子用户名
现在他的swift—key里面就有数据了
[root@ceph01 ~]# radosgw-admin subuser modify --uid testuser --subuser testuser:swift --access full
{
"user_id": "testuser",
"display_name": "testuser",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"subusers": [
{
"id": "testuser:swift",
"permissions": "full-control"
}
],
"keys": [
{
"user": "testuser",
"access_key": "2EQ7K1XY4GU19C6IK694",
"secret_key": "nWoV1B4DZ40bh5CI6BxqkzgAbH3WhFDCHGLEHq6G"
},
{
"user": "testuser",
"access_key": "ZAKFA4A9WRJYI4J059NK",
"secret_key": "6TIZZpCMJVd5fcJcblBwgD7RepMk0gkglLONIBLj"
}
],
"swift_keys": [
{
"user": "testuser:swift",
"secret_key": "gkstooSlkc7Ylf5Czrvzo8TxrgUY3egOW1lKBvSg"
}
],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"default_storage_class": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"temp_url_keys": [],
"type": "rgw",
"mfa_ids": []
}
默认创建出来时候是没有权限的,所以需要给他对bucket的权限
buket创建必须使用rgw客户端来创建,所以必须创建客户端
[root@ceph01 ~]# yum install python3-pip -y
[root@ceph01 ~]# pip install python-swiftclient -i https://pypi.tuna.tsinghua.edu.cn/simple
[root@ceph01 ~]# swift -A http://10.104.45.241/auth/1.0 -U 'testuser:swift' -K gkstooSlkc7Ylf5Czrvzo8TxrgUY3egOW1lKBvSg stat -v --insecure
StorageURL: http://10.104.45.241/swift/v1
Auth Token: AUTH_rgwtk0e00000074657374757365723a73776966746f562e87af90cca359615566a440ca284b82d38246a602414f5ea2aae59df914a4751ba2
Account: v1
Containers: 0
Objects: 0
Bytes: 0
Objects in policy "default-placement-bytes": 0
Bytes in policy "default-placement-bytes": 0
Containers in policy "default-placement": 0
Objects in policy "default-placement": 0
Bytes in policy "default-placement": 0
X-Timestamp: 1716785113.68986
X-Account-Bytes-Used-Actual: 0
X-Trans-Id: tx00000fd31d0a72e2fdcc4-0066540fd9-1005d-for_swift
X-Openstack-Request-Id: tx00000fd31d0a72e2fdcc4-0066540fd9-1005d-for_swift
Accept-Ranges: bytes
Content-Type: text/plain; charset=utf-8
Connection: Keep-Alive
# 在客户端
[root@ceph01 ~]# swift -A http://10.104.45.241/auth/1.0 -U 'testuser:swift' -K gkstooSlkc7Ylf5Czrvzo8TxrgUY3egOW1lKBvSg list
hosts
# 在服务端
[root@ceph01 ~]# radosgw-admin bucket list
[
"hosts"
]
将/etc/hosts 上传到桶,其实在swift里面是叫做容器,不过只是名字不同
[root@ceph01 ~]# swift -A http://10.104.45.241/auth/1.0 -U 'testuser:swift' -K gkstooSlkc7Ylf5Czrvzo8TxrgUY3egOW1lKBvSg upload hosts /etc/hosts
etc/hosts
[root@ceph01 ~]# swift -A http://10.104.45.241/auth/1.0 -U 'testuser:swift' -K gkstooSlkc7Ylf5Czrvzo8TxrgUY3egOW1lKBvSg list hosts -v
etc/hosts
这个非常的简单,下载一个s3客户端,
s3 browser
直接连接
多区域就是如果还有另一个ceph集群,想要跟目前的ceph集群里的zone数据保持同步,那么另一个ceph集群也需要有realm、onegroup,需要2个ceph集群内名称保持一致,zone里面又会有rgw,你有你的rgw,我有我的rgw,数据就是通过这两个rgw来同步的,无论是从ceph1集群还是从ceph2集群来操作数据,这2个集群内的数据都会发生对应的改变,这就是多区域网关。
应用场景:
异地容灾
大致流程
ceph02集群需要与ceph01集群同步信息,那么ceph02需要先同步ceph01的realm和zonegroup,然后自行创建zone就可以了,创建完zone之后还需要在zone里面创建网关(rgw),当这两个创建完成之后,2个集群的数据会开始同步,第一次是完全同步,后续则是增量同步
主zone端的配置:
需要创建用户的原因是为了安全,不然的话别人只要知道集群的IP就能够同步的话数据就会泄露,所以需要认证,并且用户还必须是系统用户
[root@ceph01 ~]# radosgw-admin user create --uid=system-user --display-name=system-user --system
会显示AK SK
[root@ceph01 ~]# radosgw-admin zone modify --rgw-realm s3_realm --rgw-zonegroup s3_zonegroup --rgw-zone s3_zone --master --default --access-key GR135X3HXX6PQQRXI8QW --secret 7Z9SMnsaL92j7HDyEcihtTsPBPVzqYglek4m6Qk3 --endpoints "http://10.104.45.240,http://10.104.45.241,http://10.104.45.243"
只看这部分信息
"system_key": {
"access_key": "GR135X3HXX6PQQRXI8QW",
"secret_key": "7Z9SMnsaL92j7HDyEcihtTsPBPVzqYglek4m6Qk3"
},
有几个rgw就需要操作几次,我这里是3个
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph01.msvmsr rgw_realm s3_realm
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph01.msvmsr rgw_zonegroup s3_zonegroup
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph01.msvmsr rgw_zone s3_zone
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph02.vqvztu rgw_realm s3_realm
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph02.vqvztu rgw_zonegroup s3_zonegroup
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph02.vqvztu rgw_zone s3_zone
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph03.cexvyj rgw_realm s3_realm
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph03.cexvyj rgw_zonegroup s3_zonegroup
[root@ceph01 ~]# ceph config set client.rgw.s3_realm.ceph03.cexvyj rgw_zone s3_zone
[root@ceph01 ~]# radosgw-admin period update --commit
# 重启之前查看一下 ceph orch ls
[root@ceph01 ~]# ceph orch restart rgw.s3_realm
Scheduled to restart rgw.s3_realm.ceph01.msvmsr on host 'ceph01'
Scheduled to restart rgw.s3_realm.ceph02.vqvztu on host 'ceph02'
Scheduled to restart rgw.s3_realm.ceph03.cexvyj on host 'ceph03'
secondary zone的配置
[root@m-rgw ~]# radosgw-admin realm pull --url http://10.104.45.240 --access-key GR135X3HXX6PQQRXI8QW --secret 7Z9SMnsaL92j7HDyEcihtTsPBPVzqYglek4m6Qk3
2024-06-01T14:51:02.345+0800 7f449d3fa2c0 1 error read_lastest_epoch .rgw.root:periods.699f662a-c436-4179-8f7e-8a6d40b188f3.latest_epoch
2024-06-01T14:51:02.359+0800 7f449d3fa2c0 1 Set the period's master zonegroup 09b8465b-a516-4464-b01d-9d8f8b6a1629 as the default
{
"id": "09c4ff84-8900-48a5-99ea-5a403754ca4f",
"name": "s3_realm",
"current_period": "699f662a-c436-4179-8f7e-8a6d40b188f3",
"epoch": 2
}
# 拉取完成之后在本地可以使用 radosgw-admin realm list 去查看
[root@m-rgw ~]# radosgw-admin period pull --url http://10.104.45.240 --access-key GR135X3HXX6PQQRXI8QW --secret 7Z9SMnsaL92j7HDyEcihtTsPBPVzqYglek4m6Qk3
[root@m-rgw ~]# radosgw-admin zone create --rgw-realm s3_realm --rgw-zonegroup s3_zonegroup --rgw-zone s3_zone_backup --access-key GR135X3HXX6PQQRXI8QW --secret 7Z9SMnsaL92j7HDyEcihtTsPBPVzqYglek4m6Qk3 --endpoints "http://10.104.45.244"
[root@m-rgw ~]# ceph orch apply rgw s3_backup --placement=1
Scheduled rgw.s3_backup update...
[root@m-rgw ~]# ceph config set client.rgw.s3_backup.m-rgw.ftwpgm rgw_realm s3_realm
[root@m-rgw ~]# ceph config set client.rgw.s3_backup.m-rgw.ftwpgm rgw_zonegroup s3_zonegroup
[root@m-rgw ~]# ceph config set client.rgw.s3_backup.m-rgw.ftwpgm rgw_zone s3_zone_bakcup
[root@m-rgw ~]# radosgw-admin period update --commit
[root@m-rgw ~]# ceph orch restart rgw.s3_backup
# 直接查看用户,bucket是否同步
[root@m-rgw ~]# radosgw-admin user list
[
"system-user",
"dashboard",
"s3"
]
[root@m-rgw ~]# radosgw-admin bucket list
[
"media"
]
可以看到s3用户已经同步过来了,bucket也有了
也可以直接使用命令查看同步状态
[root@m-rgw ~]# radosgw-admin sync status
[root@m-rgw ~]# radosgw-admin sync status
realm 09c4ff84-8900-48a5-99ea-5a403754ca4f (s3_realm)
zonegroup 09b8465b-a516-4464-b01d-9d8f8b6a1629 (s3_zonegroup)
zone abee3a3c-4789-4669-82e5-7219785cd76a (s3_zone_backup)
metadata sync syncing
full sync: 0/64 shards
incremental sync: 64/64 shards
metadata is caught up with master
data sync source: ab90bfa5-4dec-4136-b804-939c80e15d5b (s3_zone)
syncing
full sync: 8/128 shards
full sync: 8 buckets to sync
incremental sync: 120/128 shards
data is behind on 8 shards
behind shards: [82,83,84,85,87,88,90,92]