Introduction
After an update or a new installation of TFE on the latest Amazon Linux 2 (AMI as of 1/1/2024), the replicated service fails to start.
Problem
When checking the replicated logs, you will see the following errors:
journalctl -u replicated
Jan 04 11:35:48 ip-xxxx.ec2.internal docker[xx]: runtime/cgo: pthread_create failed: Operation not permitted
Jan 04 11:35:48 ip-xxxx.ec2.internal docker[xx]: SIGABRT: abort
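To confirm that a host shows this specific failure, the log output can be searched for the error string above. The following is a minimal sketch: the function name has_pthread_create_failure is a hypothetical helper introduced here for illustration, not part of TFE or Replicated.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): reads log text on stdin and
# succeeds (exit status 0) if the pthread_create failure quoted in this
# article is present.
has_pthread_create_failure() {
  grep -q 'pthread_create failed: Operation not permitted'
}

# Example usage on an affected host. It is left commented out so that
# sourcing this sketch has no side effects:
# journalctl -u replicated --no-pager | has_pthread_create_failure \
#   && echo "replicated log shows the pthread_create failure"
```

If the function returns a non-zero status, the replicated service is probably failing for a different reason and the steps in this article may not apply.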
Cause
- Amazon Linux 2 has recently undergone security updates that tightened libseccomp permissions. libseccomp is a library used to restrict the system calls (syscalls) a process can make, enhancing security by limiting potential attack vectors. The default seccomp profile in the hardened AL2 configuration sets certain syscalls to SCMP_ACT_ERRNO, which means that if a process tries to execute those syscalls, the kernel denies the request and returns an Operation not permitted error (EPERM), as seen in the log above. This can break applications that rely on these syscalls.
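To see which default action a seccomp profile currently applies, the defaultAction field can be read from the profile file. This is a minimal sketch, assuming the profile is a JSON file at /etc/docker/seccomp.json (the path used in this article); seccomp_default_action is a hypothetical helper name, and the text matching assumes the field is written in the usual "defaultAction": "..." form.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): prints the defaultAction value
# found in a seccomp profile, for example SCMP_ACT_ERRNO or SCMP_ACT_ALLOW.
# The first argument is the profile path; it defaults to the path used in
# this article.
seccomp_default_action() {
  local profile="${1:-/etc/docker/seccomp.json}"
  # Extract the "defaultAction": "VALUE" pair, then keep only VALUE.
  grep -o '"defaultAction"[[:space:]]*:[[:space:]]*"[A-Z_]*"' "$profile" \
    | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Example usage (commented out so sourcing this sketch has no side effects):
# seccomp_default_action /etc/docker/seccomp.json
```

A result of SCMP_ACT_ERRNO means that any syscall the profile does not explicitly allow is rejected with an error.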
Solutions:
- Follow the steps in this documentation to amend the /etc/docker/seccomp.json file, from step 1 to step 4.
- Before continuing, make sure to change the occurrence of "defaultAction": "SCMP_ACT_ERRNO", in the file /etc/docker/seccomp.json to "defaultAction": "SCMP_ACT_ALLOW". This permits any syscall that the profile does not explicitly restrict, which resolves the Operation not permitted errors.
- Stop the Replicated Services:
sudo systemctl stop replicated replicated-operator replicated-ui
- Restart Docker:
sudo systemctl restart docker
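
The defaultAction change described above can also be scripted rather than edited by hand. The following is a minimal sketch, assuming the profile already exists at /etc/docker/seccomp.json and that the commands run in a root shell; set_default_action_allow is a hypothetical helper name, and the .bak backup is an addition for safety that the steps above do not require.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): switches defaultAction from
# SCMP_ACT_ERRNO to SCMP_ACT_ALLOW in the given seccomp profile.
# A copy of the original file is kept next to it with a .bak suffix.
# Per-syscall "action" entries are left untouched.
set_default_action_allow() {
  local profile="${1:-/etc/docker/seccomp.json}"
  cp -p "$profile" "$profile.bak" || return 1
  # Read from the backup and rewrite the profile in place.
  sed 's/\("defaultAction"[[:space:]]*:[[:space:]]*\)"SCMP_ACT_ERRNO"/\1"SCMP_ACT_ALLOW"/' \
    "$profile.bak" > "$profile"
}

# Example sequence in a root shell, followed by the two commands from the
# steps above. It is commented out so sourcing this sketch changes nothing:
# set_default_action_allow /etc/docker/seccomp.json
# systemctl stop replicated replicated-operator replicated-ui
# systemctl restart docker
```

Keeping the backup makes it possible to restore the original profile by copying the .bak file back over /etc/docker/seccomp.json.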