Issue
The OpenShift Docker registry returns 503 Service Unavailable.
$ docker login -u $(oc whoami) -p $(oc whoami -t) docker-registry-default.example.com
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Error response from daemon: login attempt to https://docker-registry-default.example.com/v2/ failed with status: 503 Service Unavailable
In this state, images cannot be pushed to or pulled from the Docker registry.
Solution
The immediate cause is fairly clear: the Docker registry is unavailable and is not accepting connections. To find out why, we’ll dig into the Docker registry container logs and look for clues.
Step 1: Check docker registry pod status
First things first, let’s have a quick look at the status of the Docker registry pods in the default namespace:
oc get pods -n default | grep docker-registry
Docker Registry pods should be running.
$ oc get pods -n default | grep docker-registry
docker-registry-1-6moc9 1/1 Running 0 2d
docker-registry-1-xpydw 1/1 Running 0 2d
If your pods aren’t running, check the pod events and the container logs.
oc describe pod <docker-registry-pod> -n default
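When there are more than a couple of replicas, it can help to filter the pod list down to the unhealthy ones. The following is only a sketch: the function name unhealthy_registry_pods is hypothetical, and it assumes the usual NAME / READY / STATUS column layout of oc get pods shown above.

```shell
# Hypothetical helper: reads `oc get pods` output on stdin and prints
# docker-registry pods that are not Running or not fully ready.
unhealthy_registry_pods() {
  awk '$1 ~ /^docker-registry/ {
         split($2, ready, "/")            # READY column, e.g. "1/1"
         if ($3 != "Running" || ready[1] != ready[2]) print $1, $3
       }'
}

# Usage (needs a logged-in oc session):
#   oc get pods -n default | unhealthy_registry_pods
```

Any pod it prints is a candidate for the oc describe pod command above. Note that finished deployer pods, if present, would also be listed because their status is not Running.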
Step 2: Check the pods/containers logs
The Docker registry pod may have been running for a long time and accumulated a lot of logs, so use the --tail option to show only the most recent entries.
oc logs docker-registry-1-6moc9 --tail=50
Look for error messages. In my case, I saw the following certificate error:
logs.go:41] http: TLS handshake error from 10.120.6.1:52774: remote error: tls: expired certificate
I could also observe the following error:
Get : x509: certificate has expired or is not yet valid
These logs can help you to solve the puzzle.
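If the log is noisy, a small filter can surface only the certificate-related lines. This is a minimal sketch, assuming the errors look like the two shown above; the function name registry_cert_errors is hypothetical and the patterns may need adjusting for other failure modes.

```shell
# Hypothetical helper: keeps only log lines that mention TLS or x509
# certificate problems. Reads log text on stdin.
registry_cert_errors() {
  grep -E 'x509:|tls:|TLS handshake error'
}

# Usage (needs a logged-in oc session; pod name taken from Step 1):
#   oc logs docker-registry-1-6moc9 -n default --tail=500 | registry_cert_errors
```

The function returns a non-zero status when nothing matches, which makes it easy to use in a script as a quick "any certificate errors?" check.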
In my case, the internal Docker registry certificates had expired while the public route certificate was still valid. The certificates for docker-registry.default.svc and docker-registry.default.svc.cluster.local need to be renewed.
You can verify this by checking the registry certificate’s expiry date on the master:
openssl x509 -in /etc/origin/master/registry.crt -text -noout | grep 'Not After'
Output:
Not After : Oct 1 09:12:10 2021 GMT
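Reading the date by eye works, but for scripting it may be easier to let openssl do the comparison. The sketch below relies on the -checkend option of openssl x509, which exits non-zero when the certificate expires within the given number of seconds. The function name cert_valid_for_days is hypothetical.

```shell
# Hypothetical helper: succeeds if the certificate in $1 is still valid
# $2 days from now (default 0, i.e. "not expired right now").
cert_valid_for_days() {
  cert_file=$1
  days=${2:-0}
  # -checkend N exits non-zero if the certificate expires within N seconds.
  openssl x509 -in "$cert_file" -noout -checkend "$((days * 86400))" >/dev/null
}

# Usage on the master, with the path used in this article:
#   cert_valid_for_days /etc/origin/master/registry.crt 30 || echo "renew soon"
```

Running it with a generous margin, such as 30 days, gives some warning before the registry starts failing again.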
Step 3: Renew Docker Registry certs
SSH to the first master and execute the following commands:
Get the registry’s hostname and IP.
REGISTRY_IP=`oc get service docker-registry -o jsonpath='{.spec.clusterIP}'`
REGISTRY_HOSTNAME=`oc get route/docker-registry -o jsonpath='{.spec.host}'`
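If either lookup fails, for example because the current project is not default, the variable ends up empty and the hostname list for the new certificate would contain a blank entry. A small guard can catch that before any certificate is generated. This is a sketch only; the function name registry_hostnames is hypothetical, and the two service names are the ones used in this article.

```shell
# Hypothetical helper: builds the --hostnames value and fails early if
# either lookup returned an empty string.
registry_hostnames() {
  ip=$1
  route_host=$2
  if [ -z "$ip" ] || [ -z "$route_host" ]; then
    echo "registry IP or route hostname is empty" >&2
    return 1
  fi
  echo "$ip,docker-registry.default.svc,docker-registry.default.svc.cluster.local,$route_host"
}

# Usage:
#   HOSTNAMES=$(registry_hostnames "$REGISTRY_IP" "$REGISTRY_HOSTNAME") || exit 1
```

The resulting string has the same shape as the --hostnames value used when generating the certificate.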
Then generate the certs:
oc adm ca create-server-cert \
  --signer-cert=/etc/origin/master/ca.crt \
  --signer-key=/etc/origin/master/ca.key \
  --signer-serial=/etc/origin/master/ca.serial.txt \
  --hostnames=$REGISTRY_IP,docker-registry.default.svc,docker-registry.default.svc.cluster.local,$REGISTRY_HOSTNAME \
  --cert=/etc/origin/master/registry.crt \
  --key=/etc/origin/master/registry.key
Replace the certs:
oc create secret generic registry-certificates --from-file=/etc/origin/master/registry.crt,/etc/origin/master/registry.key -o json --dry-run | oc replace -f -
Restart docker-registry pods:
oc rollout latest dc/docker-registry
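The rollout takes a moment, so it can be useful to wait for it to finish before testing. The sketch below wraps oc rollout status, which blocks until the deployment completes or fails, and then lists the registry pods the same way as in Step 1. The function name wait_for_registry_rollout is hypothetical.

```shell
# Hypothetical helper: blocks until the new registry deployment finishes,
# then lists the registry pods. Returns non-zero if the rollout fails.
wait_for_registry_rollout() {
  oc rollout status dc/docker-registry -n default &&
    oc get pods -n default | grep docker-registry
}

# Usage (needs a logged-in oc session):
#   wait_for_registry_rollout || echo "rollout did not complete"
```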
If no errors are reported, the certificates have been updated. For more details, see the official documentation.
Step 4: Verify Docker Registry Status
Lastly, verify that the Docker registry is available again. Run docker login and check the registry container logs to confirm that the TLS error is gone and the service responds.
docker login -u $(oc whoami) -p $(oc whoami -t) docker-registry-default.example.com
You should see Login Succeeded:
$ docker login -u $(oc whoami) -p $(oc whoami -t) docker-registry-default.example.com
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
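The warning in the output can be avoided by passing the token on stdin, as Docker suggests. A minimal sketch using the --password-stdin flag of docker login follows; the function name registry_login is hypothetical, and the host is the example one used throughout this article.

```shell
# Hypothetical helper: logs in to the registry host given as $1, passing the
# OpenShift token on stdin instead of on the command line.
registry_login() {
  registry_host=$1
  oc whoami -t | docker login -u "$(oc whoami)" --password-stdin "$registry_host"
}

# Usage (needs a logged-in oc session and a running Docker daemon):
#   registry_login docker-registry-default.example.com
```

This keeps the token out of the shell history and the process list.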
If the login still fails, the Docker registry container logs are the best place to look next.
oc logs docker-registry-xxxx --tail=50
Conclusion
In this short tutorial, we covered how to approach the “503 Service Unavailable” error from the Docker registry on OpenShift Container Platform. We looked into the container logs, found the root cause, renewed the expired Docker registry certificates, and restored the docker-registry service.