Problem
When you restart the Replicated services during Terraform Enterprise maintenance or troubleshooting, the replicated, replicated-operator, and replicated-ui services may fail with exit status 125.
The systemd status shows the services are stuck in an activating (auto-restart) state with an exit-code result.
$ systemctl status replicated replicated-ui replicated-operator
● replicated.service - Replicated Service
##...
   Active: activating (auto-restart) (Result: exit-code) since Mon 2023-01-16 23:33:18 UTC; 255ms ago
##...
 Main PID: 395215 (code=exited, status=125)
● replicated-ui.service - Replicated Service
##...
   Active: activating (auto-restart) (Result: exit-code) since Mon 2023-01-16 23:33:18 UTC; 301ms ago
##...
 Main PID: 395201 (code=exited, status=125)
● replicated-operator.service - Replicated Operator Service
##...
   Active: activating (auto-restart) (Result: exit-code) since Mon 2023-01-16 23:33:18 UTC; 290ms ago
##...
 Main PID: 395202 (code=exited, status=125)
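The logs for these units show that the docker run commands failed because the required container images could not be found. Docker returns exit status 125 when the docker run command itself fails, as opposed to the command inside the container.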
$ journalctl -u replicated -u replicated-ui -u replicated-operator -n 18
## Logs begin at Mon 2023-01-16 19:16:40 UTC, end at Mon 2023-01-16 23:41:38 UTC. --
##...
replicated-ui.service: Main process exited, code=exited, status=125/n/a
replicated-ui.service: Failed with result 'exit-code'.
docker[402031]: docker: Error response from daemon: manifest for replicated/replicated:current not found: manifest unknown: manifest unknown.
docker[402031]: See 'docker run --help'.
replicated.service: Main process exited, code=exited, status=125/n/a
replicated.service: Failed with result 'exit-code'.
docker[402038]: docker: Error response from daemon: manifest for replicated/replicated-operator:current not found: manifest unknown: manifest unknown.
docker[402038]: See 'docker run --help'.
replicated-operator.service: Main process exited, code=exited, status=125/n/a
replicated-operator.service: Failed with result 'exit-code'.
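To confirm that the log messages above correspond to missing local tags, you can check each image reference directly. The following is a minimal sketch, not part of the official procedure: the function name missing_current_tags is hypothetical, and it assumes the three image names shown in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each Replicated image whose local "current" tag is missing.
missing_current_tags() {
  local repo
  for repo in replicated/replicated replicated/replicated-ui replicated/replicated-operator; do
    # "docker image inspect" exits non-zero when the reference does not exist locally.
    if ! docker image inspect "${repo}:current" >/dev/null 2>&1; then
      echo "${repo}:current"
    fi
  done
}
```

Calling missing_current_tags with no arguments prints one line per missing tag. Empty output suggests the current tags are present and the failure has a different cause.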
Prerequisites
- An existing installation of Terraform Enterprise on Replicated.
Cause
This error can occur if the Replicated container images were inadvertently removed from the local Docker image repository during system maintenance, for example, by using commands like docker system prune, docker image prune, or docker rmi.
When Replicated is installed via the Terraform Enterprise install script, its container images are pulled from a remote registry and tagged with current. The systemd units reference these current tags in their docker run commands. If these locally tagged images are missing, Docker cannot start the Replicated containers, causing the services to fail.
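To see which image references a unit actually uses, you can print the unit file and filter for the current tag. This is an illustrative sketch: the function name unit_image_refs is hypothetical, and the exact layout of the unit file on your host is an assumption, so the filter only looks for tokens ending in :current.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read unit file text on stdin and print every
# image reference that ends in ":current", de-duplicated.
unit_image_refs() {
  grep -oE '[A-Za-z0-9./_-]+:current' | sort -u
}
```

For example, systemctl cat replicated | unit_image_refs lists the references found in the replicated unit. Repeat for replicated-ui and replicated-operator.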
Solutions
There are two ways to resolve this issue.
Solution 1: Rerun the Install Script
The most direct method is to run the Terraform Enterprise install script again. This process will restore the required container images and retag them correctly in the local repository.
Solution 2: Manually Retag Local Images
If the version-tagged Replicated images still exist in the local repository but the current tags are missing, you can manually retag them.
- Verify that the version-tagged images are present. Run docker image ls and locate the replicated/replicated, replicated/replicated-operator, and replicated/replicated-ui images.

  $ docker image ls
  REPOSITORY                       TAG             IMAGE ID       CREATED       SIZE
  replicated/replicated            stable-2.54.1   3f4f8cf65a77   5 weeks ago   348MB
  replicated/replicated-ui         stable-2.54.1   24cff54683d7   5 weeks ago   138MB
  replicated/replicated-operator   stable-2.54.1   c01aa42bf17b   5 weeks ago   129MB
  ##...
- Tag each of the versioned images with the current tag. Replace stable-2.54.1 with the version tag found in your environment.

  Tag the main Replicated image.

  $ docker tag replicated/replicated:stable-2.54.1 replicated/replicated:current

  Tag the Replicated operator image.

  $ docker tag replicated/replicated-operator:stable-2.54.1 replicated/replicated-operator:current

  Tag the Replicated UI image.

  $ docker tag replicated/replicated-ui:stable-2.54.1 replicated/replicated-ui:current
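The three tagging commands above can also be run as a loop. The sketch below assumes all three images share the same version tag, as in the example listing. The function name retag_current is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tag the three version-tagged Replicated images as "current".
# Usage: retag_current <version-tag>   e.g. retag_current stable-2.54.1
retag_current() {
  local version="$1" repo
  if [ -z "$version" ]; then
    echo "usage: retag_current <version-tag>" >&2
    return 2
  fi
  for repo in replicated/replicated replicated/replicated-operator replicated/replicated-ui; do
    # Stop at the first failure, for example when the versioned image is absent.
    docker tag "${repo}:${version}" "${repo}:current" || return 1
  done
}
```

If any versioned image is missing, docker tag fails and the function stops. In that case, rerunning the install script is the safer option.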
Outcome
After applying one of the solutions, the replicated, replicated-operator, and replicated-ui services should start successfully on the next restart attempt.
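To check the outcome, you can query the state of each unit. This is a minimal sketch using systemctl is-active. The function name check_replicated_units is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the state of each Replicated unit and
# return non-zero if any of them is not "active".
check_replicated_units() {
  local unit state rc=0
  for unit in replicated replicated-ui replicated-operator; do
    # "systemctl is-active" prints the unit state and exits non-zero when it is not active.
    state="$(systemctl is-active "$unit" 2>/dev/null || true)"
    echo "${unit}: ${state:-unknown}"
    [ "$state" = "active" ] || rc=1
  done
  return "$rc"
}
```

A unit that still reports activating may simply be between automatic restart attempts, so it can be worth checking again after a short wait.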