Upgrade of GKE with Anthos fails on Trident storage
Applies to
- Astra Trident
- Google Kubernetes Engine (GKE) clusters with Anthos
Issue
When running gkectl diagnose in preparation for an upgrade, the storage check fails with output similar to:
user@hostname:~$ gkectl diagnose cluster --kubeconfig kubeconfig --cluster-name gke-anthos-cluster
Preparing for the diagnose tool...
Diagnosing the cluster...... DONE
Diagnose result is saved successfully in /home/user/diagnose-user-gke-anthos-cluster-20230130155819.json
- Validation Category: Cluster Healthiness
Checking user cluster and node pools...SUCCESS
Checking user cluster certificates...SUCCESS
Checking cluster object...SUCCESS
...
Checking GKE Hub Membership...SUCCESS
Checking all poddisruptionbudgets...SUCCESS
Checking storage...FAILURE
Reason: 3 storage error(s).
Unhealthy Resources:
PersistentVolume kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12: virtual disk "kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12" IS NOT attached to machine "hostname-of-node-01" but IS listed in the Node.Status
PersistentVolume kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12: virtual disk "kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12" IS NOT attached to machine "hostname-of-node-02" but IS listed in the Node.Status
PersistentVolume kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12: virtual disk "kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12" IS NOT attached to machine "hostname-of-node-03" but IS listed in the Node.Status
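When the check reports many volumes, it can help to reduce the output to a list of volume and node pairs. The following is a minimal sketch, not part of gkectl or Trident: it assumes the terminal output has been saved as text and that the failure lines follow exactly the wording shown above. The function name stale_attachments and the regular expression are illustrative assumptions.

```python
import re
from typing import List, Tuple

# Assumption: the line wording matches the sample output above.
# Other gkectl versions may phrase the message differently.
LINE_RE = re.compile(
    r'PersistentVolume (?P<pv>\S+): virtual disk "[^"]+" IS NOT attached to '
    r'machine "(?P<node>[^"]+)" but IS listed in the Node\.Status'
)


def stale_attachments(output: str) -> List[Tuple[str, str]]:
    """Return (volume, node) pairs named in the storage-check failure lines."""
    return [(m.group("pv"), m.group("node")) for m in LINE_RE.finditer(output)]


if __name__ == "__main__":
    # Hypothetical sample built from the output shown above.
    volume = "kubernetes.io/csi/csi.trident.netapp.io^pvc-1234abcd-1234-abcd-1234-12345abcde12"
    sample = "\n".join(
        f'PersistentVolume {volume}: virtual disk "{volume}" IS NOT attached to '
        f'machine "hostname-of-node-0{i}" but IS listed in the Node.Status'
        for i in (1, 2, 3)
    )
    for pv, node in stale_attachments(sample):
        print(f"{node}: {pv}")
```

Each printed line names a node whose Node.Status still lists the volume, which makes it easier to see which nodes are affected.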