Introduction
After an update or a new installation of TFE on the latest Amazon Linux 2 (AMI as of 1/1/2024), the replicated service fails to start.
Problem
When checking the replicated logs, you will see the following errors:
journalctl -u replicated
Jan 04 11:35:48 ip-xxxx.ec2.internal docker[xx]: runtime/cgo: pthread_create failed: Operation not permitted
Jan 04 11:35:48 ip-xxxx.ec2.internal docker[xx]: SIGABRT: abort
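To confirm that a host is hitting this specific failure, the log output can be filtered for the pthread_create message. The following is a minimal sketch; the function name has_pthread_failure is hypothetical and simply wraps grep.

```shell
# Succeeds (exit status 0) if the log text on stdin contains the
# pthread_create failure shown above. The function name is illustrative.
has_pthread_failure() {
  grep -q 'pthread_create failed: Operation not permitted'
}

# Example use on an affected host (not run here):
#   journalctl -u replicated --no-pager | has_pthread_failure && echo "seccomp issue likely"
```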
Cause
Amazon Linux 2 has recently undergone security updates that tightened libseccomp permissions. libseccomp is a library used to restrict the system calls (syscalls) a process can make, enhancing security by limiting potential attack vectors. The default seccomp profile in the hardened AL2 configuration sets certain syscalls to SCMP_ACT_ERRNO, which means that if a process tries to execute those syscalls, the kernel denies the request and returns an error, here "Operation not permitted" (EPERM), as in the pthread_create failure above. This can break applications that rely on these syscalls.
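To see which default action a profile currently applies, a quick check along these lines may help. This is a sketch under assumptions: the function name show_default_action is hypothetical, and the default path is the /etc/docker/seccomp.json file this article refers to.

```shell
# Print the defaultAction entry of a Docker seccomp profile.
# Defaults to the path used in this article; pass another path to override.
show_default_action() {
  profile="${1:-/etc/docker/seccomp.json}"
  grep -o '"defaultAction"[[:space:]]*:[[:space:]]*"[A-Z_]*"' "$profile"
}
```

On an affected host, the output would be expected to show SCMP_ACT_ERRNO as the default action.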
Solutions:
- Follow steps 1 to 4 in this documentation to amend the /etc/docker/seccomp.json file.
- Before continuing, edit /etc/docker/seccomp.json and change the occurrence of "defaultAction": "SCMP_ACT_ERRNO", to "defaultAction": "SCMP_ACT_ALLOW". This allows any syscall that the profile does not otherwise list, which resolves the "Operation not permitted" errors.
- Stop the Replicated services:
  sudo systemctl stop replicated replicated-operator replicated-ui
- Restart Docker:
  sudo systemctl restart docker
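The defaultAction change described above can also be scripted. The following is a minimal sketch, not an official procedure: the function name patch_seccomp_default and the .bak backup suffix are illustrative choices, and it assumes the profile contains the "defaultAction": "SCMP_ACT_ERRNO" entry on a single line.

```shell
# Switch the default action of a seccomp profile from ERRNO to ALLOW,
# keeping a backup copy next to the original file.
patch_seccomp_default() {
  profile="${1:-/etc/docker/seccomp.json}"
  # Keep a backup so the change can be reverted.
  cp -p "$profile" "$profile.bak" || return 1
  # Rewrite only the defaultAction entry; per-syscall "action" rules are untouched.
  sed 's/"defaultAction":[[:space:]]*"SCMP_ACT_ERRNO"/"defaultAction": "SCMP_ACT_ALLOW"/' \
    "$profile.bak" > "$profile"
}

# Example use as root, followed by the service commands from this article (not run here):
#   patch_seccomp_default /etc/docker/seccomp.json
#   systemctl stop replicated replicated-operator replicated-ui
#   systemctl restart docker
```

Writing the result back from the backup copy avoids relying on in-place editing flags, which differ between sed implementations.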