Commands

Mount an RBD partition

Create the RBD pool and the image via the web UI

On a server in the Ceph cluster, add a user and retrieve its key:


ceph auth add client.restobdd mon 'allow r' osd 'allow rwx pool=rbd'

ceph auth get client.restobdd
  [client.restobdd]
    	key = AQAmSNXXXXXxxxxxxxxUckw==
    	caps mon = "allow r"
    	caps osd = "allow rwx pool=rbd"

Copy /etc/ceph/ceph.conf to the client machine. Create /etc/ceph/ceph.keyring containing the credentials of the user created above.
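A minimal /etc/ceph/ceph.keyring for the client, reusing the (redacted) output of ceph auth get above:

```ini
[client.restobdd]
    key = AQAmSNXXXXXxxxxxxxxUckw==
```

The caps lines are not required on the client side; only the key is read.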

If the machine is not on the same network as ceph-mon, go through the backup VLAN by creating a route on the client machine:

ip r add 192.168.1.1 via 192.168.0.1
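The route above does not survive a reboot. A sketch of one way to persist it, assuming a Debian-style client using ifupdown (interface name eth0 is hypothetical; adapt to your network stack):

```text
# /etc/network/interfaces, under the relevant iface stanza (eth0 assumed)
post-up ip route add 192.168.1.1 via 192.168.0.1
```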

Map the device => this creates a /dev/rbd0 that must then be formatted and mounted.

rbd -n client.restobdd map rbd/restobdd -m 192.168.1.1
mkfs.ext4 /dev/rbd0
mount /dev/rbd0 /mnt
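To re-map and mount the image automatically at boot, the rbdmap service can be used; a sketch, assuming the user and keyring path from above and the /mnt mount point:

```text
# /etc/ceph/rbdmap
rbd/restobdd id=restobdd,keyring=/etc/ceph/ceph.keyring

# /etc/fstab — noauto lets the rbdmap service handle mounting after the map
/dev/rbd/rbd/restobdd  /mnt  ext4  noauto  0 0
```

Then enable the unit with systemctl enable rbdmap.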

Upgrade

ceph orch upgrade start --ceph-version 18.2.4
ceph orch upgrade status
ceph -s

cephadm install cephadm ceph ceph-volume ceph-fuse

OSD

Create two OSDs per disk

# create the logical volumes on the SSD partition for the BlueStore DB storage
lvcreate -L 10G -n db-0 vg
lvcreate -L 10G -n db-4 vg

# create the two OSDs + BlueStore DB storage on the SSD
ceph orch daemon add osd ov002:data_devices=/dev/sda,db_devices=/dev/vg/db-0,/dev/vg/db-4,osds_per_device=2
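The same layout can also be expressed declaratively as an OSD service spec and applied with ceph orch apply -i osd-spec.yml; a sketch (the service_id is arbitrary):

```yaml
service_type: osd
service_id: ov002_two_per_device
placement:
  hosts:
    - ov002
spec:
  data_devices:
    paths:
      - /dev/sda
  db_devices:
    paths:
      - /dev/vg/db-0
      - /dev/vg/db-4
  osds_per_device: 2
```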

Remove one or more OSDs

ceph orch osd rm 1 --zap
ceph orch osd rm 1 2 3 --zap
ceph orch osd rm status

If it is stuck in “draining”:

ceph orch osd rm status
OSD  HOST   STATE     PGS  REPLACE  FORCE  ZAP   DRAIN STARTED AT            
4    ov003  draining   41  False    False  True  2024-07-03 22:20:13.543747  
5    ov003  draining   47  False    False  True  2024-07-03 22:20:14.556255  
6    ov003  draining   76  False    False  True  2024-07-03 22:20:15.567319  
7    ov003  draining  115  False    False  True  2024-07-03 22:20:16.581260

# Then check per-OSD usage:
ceph osd df
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP    META     AVAIL    %USE  VAR   PGS  STATUS
 8    hdd  12.78220   1.00000   13 TiB   50 GiB   13 MiB   5 KiB  882 MiB   13 TiB  0.38  1.25  141      up
 9    hdd  12.78220   1.00000   13 TiB   50 GiB   13 MiB   4 KiB  597 MiB   13 TiB  0.38  1.25  139      up
10    hdd  12.78220   1.00000   13 TiB   50 GiB   14 MiB   4 KiB  1.1 GiB   13 TiB  0.38  1.25  149      up
11    hdd  12.78220   1.00000   13 TiB   50 GiB   15 MiB   9 KiB  1.1 GiB   13 TiB  0.38  1.25  154      up
 0    hdd   6.37650   1.00000  6.4 TiB   10 GiB   13 MiB     0 B  135 MiB  6.4 TiB  0.15  0.50  107      up
 1    hdd   6.37650   1.00000  6.4 TiB   10 GiB   14 MiB     0 B  117 MiB  6.4 TiB  0.15  0.50  104      up
 2    hdd   6.37650   1.00000  6.4 TiB   10 GiB   13 MiB     0 B  163 MiB  6.4 TiB  0.15  0.50   96      up
 3    hdd   6.37650   1.00000  6.4 TiB   10 GiB   13 MiB     0 B  181 MiB  6.4 TiB  0.15  0.50   96      up
12    hdd   6.37650   1.00000  6.4 TiB   10 GiB   13 MiB     0 B  113 MiB  6.4 TiB  0.15  0.50  100      up
13    hdd   6.37650   1.00000  6.4 TiB   10 GiB   14 MiB     0 B  113 MiB  6.4 TiB  0.15  0.50   90      up
14    hdd   6.37650   1.00000  6.4 TiB   10 GiB   13 MiB     0 B   91 MiB  6.4 TiB  0.15  0.50   74      up
15    hdd   6.37650   1.00000  6.4 TiB   10 GiB   13 MiB     0 B   91 MiB  6.4 TiB  0.15  0.50   58      up
 4    hdd         0   1.00000   13 TiB   50 GiB   13 MiB   5 KiB  112 MiB   13 TiB  0.39  1.26   41      up
 5    hdd         0   1.00000   13 TiB   50 GiB   13 MiB   4 KiB  1.1 GiB   13 TiB  0.38  1.25   47      up
 6    hdd         0   1.00000   13 TiB   50 GiB   13 MiB   2 KiB  790 MiB   13 TiB  0.38  1.25   76      up
 7    hdd         0   1.00000   13 TiB   50 GiB   14 MiB  12 KiB  651 MiB   13 TiB  0.38  1.25  115      up
                        TOTAL  153 TiB  481 GiB  215 MiB  50 KiB  7.3 GiB  153 TiB  0.31   

# Adjust the weight of one of the OSDs:
ceph osd crush reweight osd.4 12.78220

Show the OSD metadata

ceph osd metadata | grep bluefs_db_devices

Grow the NVMe partition used for the BlueStore DB

ceph-bluestore-tool show-label --path /var/lib/ceph/9e2d3cee-4d0c-11ef-ba6d-047c16f1285e/osd.0
inferring bluefs devices from bluestore path
{
    "/var/lib/ceph/9e2d3cee-4d0c-11ef-ba6d-047c16f1285e/osd.0/block": {
        "osd_uuid": "de8d5d36-0bde-4c82-9e26-7285aee5fdc3",
        "size": 22000965779456,
        "btime": "2024-07-28T21:03:49.248304+0200",
        "description": "main",
        "bfm_blocks": "5371329536",
        "bfm_blocks_per_key": "128",
        "bfm_bytes_per_block": "4096",
        "bfm_size": "22000965779456",
        "bluefs": "1",
        "ceph_fsid": "9e2d3cee-4d0c-11ef-ba6d-047c16f1285e",
        "ceph_version_when_created": "ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)",
        "created_at": "2024-07-28T19:03:52.834820Z",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "osd_key": "AQATlqZmn24KLRAAF9k3hSaSBA8bK12KKYKZjg==",
        "osdspec_affinity": "None",
        "ready": "ready",
        "require_osd_release": "19",
        "whoami": "0"
    },
    "/var/lib/ceph/9e2d3cee-4d0c-11ef-ba6d-047c16f1285e/osd.0/block.db": {
        "osd_uuid": "de8d5d36-0bde-4c82-9e26-7285aee5fdc3",
        "size": 75161927680,
        "btime": "2024-07-28T21:03:49.268953+0200",
        "description": "bluefs db"
    }
}

ls -al /var/lib/ceph/9e2d3cee-4d0c-11ef-ba6d-047c16f1285e/osd.0/block.db
lrwxrwxrwx 1 167 167 20 Jan 15 22:02 /var/lib/ceph/9e2d3cee-4d0c-11ef-ba6d-047c16f1285e/osd.0/block.db -> /dev/mapper/vg-db--0

ceph osd add-noout osd.0
systemctl stop ceph-9e2d3cee-4d0c-11ef-ba6d-047c16f1285e@osd.0.service

lvextend -L+10G /dev/vg/db-0
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/9e2d3cee-4d0c-11ef-ba6d-047c16f1285e/osd.0

systemctl start ceph-9e2d3cee-4d0c-11ef-ba6d-047c16f1285e@osd.0.service
ceph osd rm-noout osd.0

Delete a pool

ceph config set mon mon_allow_pool_delete true
ceph osd pool delete .rgw.root .rgw.root --yes-i-really-really-mean-it
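Pool deletion is gated on purpose; once the pool is gone, it is safer to turn the guard back on:

```text
ceph config set mon mon_allow_pool_delete false
```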

Dump an OSD's historic (slow) ops

ceph tell osd.8 dump_historic_ops

S3 bucket

export AWS_ACCESS_KEY_ID='XXXXXX'
export AWS_SECRET_ACCESS_KEY='XXXXX'
s3cmd del --verbose --host=pkgdata.backup --force --host-bucket="" --recursive s3://ov102.pkgdata.net
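The endpoint flags can instead live in ~/.s3cfg so they need not be repeated on every call; a sketch reusing the endpoint from the command above (use_https = False is an assumption about the endpoint):

```ini
[default]
access_key = XXXXXX
secret_key = XXXXX
host_base = pkgdata.backup
host_bucket =
use_https = False
```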

Show the service map

ceph report | jq '.servicemap'

Troubleshooting

# ceph orch host ls
HOST   ADDR         LABELS  STATUS   
ov001  192.168.1.1  _admin  Offline  
ov002  192.168.1.2  _admin           
ov003  192.168.1.3

# ceph cephadm check-host ov001 192.168.1.1
Error EIO: Module 'cephadm' has experienced an error and cannot handle commands: Command '['rados', '-n', 'mgr.ov002.sauxwj', '-k', '/var/lib/ceph/mgr/ceph-ov002.sauxwj/keyring', '-p', '.nfs', '--namespace', 'new_nfs', 'rm', 'grace']' timed out after 10 seconds

The problem is on ov002: restart it, or mark the MGR as failed so it fails over:

ceph mgr fail
# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon node-exporter.ov001 on ov001 is in error state

Restart the service:

ceph orch daemon restart node-exporter.ov001

Enable debugging on the Dashboard:

ceph dashboard debug enable
ceph tell mgr config set debug_mgr 20

Show logs:

cephadm logs --name mgr.ov001.wsyuis -- -f