Movatterモバイル変換


[0]ホーム

URL:


Skip to main content

This browser is no longer supported.

Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.

Download Microsoft EdgeMore info about Internet Explorer and Microsoft Edge
Table of contentsExit editor mode

Troubleshoot platform issues for Azure Arc-enabled Kubernetes clusters

Feedback

In this article

This document provides troubleshooting guides for issues with Azure Arc-enabled Kubernetes connectivity, permissions, and agents. It also provides troubleshooting guides for Azure GitOps, which can be used in either Azure Arc-enabled Kubernetes or Azure Kubernetes Service (AKS) clusters.

For help troubleshooting issues related to extensions, such as GitOps (Flux v2), Azure Monitor Container Insights, Open Service Mesh, seeTroubleshoot extension issues for Azure Arc-enabled Kubernetes clusters.

Azure CLI

Before usingaz connectedk8s oraz k8s-configuration CLI commands, ensure that Azure CLI is set to work against the correct Azure subscription.

az account set --subscription 'subscriptionId'az account show

If you see an error such ascli.azext_connectedk8s.custom: Failed to download and install kubectl, runaz aks install-cli --install-location ~/.azure/kubectl-client/kubectl before trying to runaz connectedk8s connect again. This command installs the kubectl client, which is required for the command to work.

Azure Arc agents

Allagents for Azure Arc-enabled Kubernetes are deployed as pods in theazure-arc namespace. All pods should be running and passing their health checks.

First, verify the Azure Arc Helm Chart release:

$ helm --namespace default status azure-arcNAME: azure-arcLAST DEPLOYED: Fri Apr  3 11:13:10 2020NAMESPACE: defaultSTATUS: deployedREVISION: 5TEST SUITE: None

If the Helm Chart release isn't found or missing, tryconnecting the cluster to Azure Arc again.

If the Helm Chart release is present withSTATUS: deployed, check the status of the agents usingkubectl:

$ kubectl -n azure-arc get deployments,podsNAME                                         READY   UP-TO-DATE   AVAILABLE   AGEdeployment.apps/cluster-metadata-operator    1/1     1            1           3d19hdeployment.apps/clusterconnect-agent         1/1     1            1           3d19hdeployment.apps/clusteridentityoperator      1/1     1            1           3d19hdeployment.apps/config-agent                 1/1     1            1           3d19hdeployment.apps/controller-manager           1/1     1            1           3d19hdeployment.apps/extension-events-collector   1/1     1            1           3d19hdeployment.apps/extension-manager            1/1     1            1           3d19hdeployment.apps/flux-logs-agent              1/1     1            1           3d19hdeployment.apps/kube-aad-proxy               1/1     1            1           3d19hdeployment.apps/metrics-agent                1/1     1            1           3d19hdeployment.apps/resource-sync-agent          1/1     1            1           3d19hNAME                                              READY   STATUS    RESTARTS        AGEpod/cluster-metadata-operator-74747b975-9phtz     2/2     Running   0               3d19hpod/clusterconnect-agent-cf4c7849c-88fmf          3/3     Running   0               3d19hpod/clusteridentityoperator-79bdfd945f-pt2rv      2/2     Running   0               3d19hpod/config-agent-67bcb94b7c-d67t8                 1/2     Running   0               3d19hpod/controller-manager-559dd48b64-v6rmk           2/2     Running   0               3d19hpod/extension-events-collector-85f4fbff69-55zmt   2/2     Running   0               3d19hpod/extension-manager-7c7668446b-69gps            3/3     Running   0               3d19hpod/flux-logs-agent-fc7c6c959-vgqvm               1/1     Running   0               3d19hpod/kube-aad-proxy-84d668c44b-j457m               2/2     Running   0               3d19hpod/metrics-agent-58fb8554df-5ll67                2/2     Running   0               3d19hpod/resource-sync-agent-dbf5db848-c9lg8           2/2     Running   0               3d19h

All pods should showSTATUS asRunning with either3/3 or2/2 under theREADY column. Fetch logs and describe the pods returning anError orCrashLoopBackOff. If any pods are stuck inPending state, there might be insufficient resources on cluster nodes.Scaling up your cluster can get these pods to transition toRunning state.

Resource provisioning failed/Service timeout error

If you see these errors, checkAzure status to see if there are any active events impacting the status of the Azure Arc-enabled Kubernetes service. If so, wait until the service event is resolved, then try onboarding again afterdeleting the existing connected cluster resource. If there are no service events, and you continue to face issues while onboarding,open a support request so we can investigate the problem.

Overage claims error

If you receive an overage claim, make sure that your service principal isn't part of more than 200 Microsoft Entra groups. If this is the case, you must create and use another service principal that isn't a member of more than 200 groups, or remove the original service principal from some of its groups and try again.

An overage claim may also occur if you have configured an outbound proxy environment without allowing the endpointhttps://<region>.obo.arc.azure.com:8084/ for outbound traffic.

If neither of these apply,open a support request so we can look into the issue.

Issues when connecting Kubernetes clusters to Azure Arc

Connecting clusters to Azure Arc requires access to an Azure subscription andcluster-admin access to a target cluster. If you can't reach the cluster, or if you have insufficient permissions, connecting the cluster to Azure Arc will fail. Make sure you've met all of theprerequisites to connect a cluster.

Tip

For a visual guide to troubleshooting connection issues, seeDiagnose connection issues for Arc-enabled Kubernetes clusters.

DNS resolution issues

For help with issues related to DNS resolution on your cluster, seeDebugging DNS Resolution.

Outbound network connectivity issues

Issues with outbound network connectivity from the cluster may arise for different reasons. First make sure all of thenetwork requirements have been met.

If you encounter connectivity issues, and your cluster is behind an outbound proxy server, make sure you passed proxy parameters during the onboarding of your cluster and that the proxy is configured correctly. For more information, seeConnect using an outbound proxy server.

You may see an error similar to the following:

An exception has occurred while trying to execute the cluster diagnostic checks in the cluster. Exception: Unable to pull cluster-diagnostic-checks helm chart from the registry 'mcr.microsoft.com/azurearck8s/helmchart/stable/clusterdiagnosticchecks:0.1.2': Error: failed to do request: Head "https://mcr.microsoft.com/v2/azurearck8s/helmchart/stable/clusterdiagnosticchecks/manifests/0.1.2": dial tcp xx.xx.xx.219:443: i/o timeout

This error occurs when themcr.microsoft.com endpoint is blocked. Be sure that your network allows connectivity to this endpoint and meets all of the othernetworking requirements.

Unable to retrieve MSI certificate

Problems retrieving the MSI certificate are usually due to network issues. Check to make sure all of thenetwork requirements have been met, then try again.

Insufficient cluster permissions

If the provided kubeconfig file doesn't have sufficient permissions to install the Azure Arc agents, the Azure CLI command returns an error:Error: list: failed to list: secrets is forbidden: User "myuser" cannot list resource "secrets" in API group "" at the cluster scope

To resolve this issue, ensure that the user connecting the cluster to Azure Arc has thecluster-admin role assigned.

Unable to connect OpenShift cluster to Azure Arc

Ifaz connectedk8s connect is timing out and failing when connecting an OpenShift cluster to Azure Arc:

  1. Ensure that the OpenShift cluster meets the version prerequisites: 4.5.41+ or 4.6.35+ or 4.7.18+.

  2. Before you runaz connectedk8s connnect, run this command on the cluster:

    oc adm policy add-scc-to-user privileged system:serviceaccount:azure-arc:azure-arc-kube-aad-proxy-sa

Installation timeouts

Connecting a Kubernetes cluster to Azure Arc-enabled Kubernetes requires installation of Azure Arc agents on the cluster. If the cluster is running over a slow internet connection, the container image pull for agents may take longer than the Azure CLI timeouts.

Helm timeout error

You may see the errorUnable to install helm release: Error: UPGRADE Failed: time out waiting for the condition. To resolve this issue, try the following steps:

  1. Run the following command:

    kubectl get pods -n azure-arc
  2. Check if theclusterconnect-agent or theconfig-agent pods are showingcrashloopbackoff, or if not all containers are running:

    NAME                                        READY   STATUS             RESTARTS   AGEcluster-metadata-operator-664bc5f4d-chgkl   2/2     Running            0          4m14sclusterconnect-agent-7cb8b565c7-wklsh       2/3     CrashLoopBackOff   0          1m15sclusteridentityoperator-76d645d8bf-5qx5c    2/2     Running            0          4m15sconfig-agent-65d5df564f-lffqm               1/2     CrashLoopBackOff   0          1m14s
  3. If theazure-identity-certificate isn't present, the system assigned managed identity hasn't been installed.

    kubectl get secret -n azure-arc -o yaml | grep name:
    name: azure-identity-certificate

    To resolve this issue, try deleting the Arc deployment by running theaz connectedk8s delete command and reinstalling it. If the issue continues to happen, it could be an issue with your proxy settings. In that case,try connecting your cluster to Azure Arc via a proxy to connect your cluster to Arc via a proxy. Also verify that all of thenetwork prerequisites have been met.

  4. If theclusterconnect-agent and theconfig-agent pods are running, but thekube-aad-proxy pod is missing, check your pod security policies. This pod uses theazure-arc-kube-aad-proxy-sa service account, which doesn't have admin permissions but requires the permission to mount host path.

  5. If thekube-aad-proxy pod is stuck inContainerCreating state, check whether the kube-aad-proxy certificate has been downloaded onto the cluster.

    kubectl get secret -n azure-arc -o yaml | grep name:
    name: kube-aad-proxy-certificate

    If the certificate is missing,delete the deployment and try onboarding again, using a different name for the cluster. If the problem continues,open a support request.

CryptoHash module error

When attempting to onboard Kubernetes clusters to the Azure Arc platform, the local environment (for example, your client console) may return the following error message:

Cannot load native module 'Crypto.Hash._MD5'

Sometimes, dependent modules fail to download successfully when adding the extensionsconnectedk8s andk8s-configuration through Azure CLI or Azure PowerShell. To fix this problem, manually remove and then add the extensions in the local environment.

To remove the extensions, use:

az extension remove --name connectedk8saz extension remove --name k8s-configuration

To add the extensions, use:

az extension add --name connectedk8saz extension add --name k8s-configuration

Cluster connect issues

If your cluster is behind an outbound proxy or firewall, verify that websocket connections are enabled for*.servicebus.windows.net, which is required specifically for theCluster Connect feature. Additionally, make sure you're using the latest version of theconnectedk8s Azure CLI extension if you're experiencing problems using cluster connect.

If theclusterconnect-agent andkube-aad-proxy pods are missing, then the cluster connect feature is likely disabled on the cluster. If so,az connectedk8s proxy fails to establish a session with the cluster, and you may see an error readingCannot connect to the hybrid connection because no agent is connected in the target arc resource.

To resolve this error, enable the cluster connect feature on your cluster:

az connectedk8s enable-features --features cluster-connect -n $CLUSTER_NAME -g $RESOURCE_GROUP

For more information, seeUse cluster connect to securely connect to Azure Arc-enabled Kubernetes clusters.

Enable custom locations using service principal

When connecting your cluster to Azure Arc or enabling custom locations on an existing cluster, you may see the following warning:

Unable to fetch oid of 'custom-locations' app. Proceeding without enabling the feature. Insufficient privileges to complete the operation.

This warning occurs when you use a service principal to log into Azure, and the service principal doesn't have the necessary permissions. To avoid this error, follow these steps:

  1. Sign in into Azure CLI using your user account. Retrieve the Object ID of the Microsoft Entra application used by Azure Arc service:

    az ad sp show --id bc313c14-388c-4e7d-a58e-70017303ee3b --query objectId -o tsv
  2. Sign in into Azure CLI using the service principal. Use the<objectId> value from the previous step to enable custom locations on the cluster:

    • To enable custom locations when connecting the cluster to Arc, runaz connectedk8s connect -n <cluster-name> -g <resource-group-name> --custom-locations-oid <objectId>
    • To enable custom locations on an existing Azure Arc-enabled Kubernetes cluster, runaz connectedk8s enable-features -n <cluster-name> -g <resource-group-name> --custom-locations-oid <objectId> --features cluster-connect custom-locations

Next steps


Feedback

Was this page helpful?

YesNoNo

Need help with this topic?

Want to try using Ask Learn to clarify or guide you through this topic?

Suggest a fix?

  • Last updated on

In this article

Was this page helpful?

YesNo
NoNeed help with this topic?

Want to try using Ask Learn to clarify or guide you through this topic?

Suggest a fix?