96 cores hot with ARMv8 and Docker

I had early access to a 96-core, 128-gigabyte ARMv8 server today. Here’s what I did to get all of the CPUs and all of the memory in use at the same time.

The system: a bare-metal hosting company is working on general availability of these ARMv8 (aarch64) servers. I got early access for beta testing. Talk to me if you’d like to know more.

The software: These systems boot with Ubuntu 16.04, which is plenty modern to run lots of workloads. The challenge I had was that there was no Docker package available to download via apt-get. So, off to build my own.

A starting point: I began with this writeup by the Hypriot folks on running Docker on an ARMv8 system from HiSilicon. They lovingly and carefully document getting Docker going on a 16-core system, starting from Ubuntu 15.04. There are enough differences in the config that I had to adapt, but not so many that I was in foreign territory.

Go: The 96-core system has Go 1.6.2 installed out of the box, so I didn’t have to bootstrap that. That saved a bunch of time.

Building Docker without having a Docker server running is a trick. The Hypriot team describes it thusly:

For this purpose there is an easy, but not really well-known workaround. We have to check and install the necessary development dependencies first and then we can run the build script natively to get a first working Docker binary. So, let’s do it right away.

Follow their instructions closely, and you get a build of v1.10.2 of Docker, which you can copy into the bin directories and run directly.

time AUTO_GOPATH=1 ./hack/make.sh dynbinary

Systemd was happy after I dug out the right service files for docker.service and docker.socket. Just copy the Docker binaries into place, reload systemd, and you’re almost good to go. Almost, because you need to make sure you create /var/run/docker.sock, which allows communication between the client and server.
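Here is a rough sketch of those install steps as a script. The paths are assumptions on my part, not something recorded above: DOCKER_BIN is a guess at where the build drops the binary, UNIT_SRC is a guess at where the unit files live in the source tree, and /usr/bin/docker is an assumed install location. It only prints the commands unless you set DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch: install a locally built Docker binary and hook it into systemd.
# Every path below is an assumption; adjust to match your own build.
set -eu

DOCKER_BIN="${DOCKER_BIN:-bundles/1.10.2/dynbinary/docker-1.10.2}"  # assumed build output
UNIT_SRC="${UNIT_SRC:-contrib/init/systemd}"                        # assumed unit file location
DRY_RUN="${DRY_RUN:-1}"                                             # 1 = print only

run() {
  # In dry-run mode print the command, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_docker() {
  # Copy the binary into place with executable permissions.
  run sudo install -m 0755 "$DOCKER_BIN" /usr/bin/docker
  # Copy both unit files where systemd will find them.
  run sudo cp "$UNIT_SRC/docker.service" "$UNIT_SRC/docker.socket" /etc/systemd/system/
  # Make systemd re-read its unit files.
  run sudo systemctl daemon-reload
  # Enable the socket unit so it is set up at boot.
  run sudo systemctl enable docker.socket
  # Start the daemon.
  run sudo systemctl start docker.service
}

install_docker
```

Once it is really installed, running docker version should show both a client and a server section if the two can talk over the socket.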

My next attempt was to build from “master”, and there I wasn’t able to complete a build because the aufs tests did not all pass. (See the open ticket for the details.)

After some time of building Docker over and over from source, and not getting tests to pass, I gave up and declared victory. Hooray! Someone who knows more about file systems can push the next step forward.

To test this system, Mohan Kartha pointed me at Marek Goldmann’s excellent treatise on resource management in Docker. He uses a system testing tool called stress, running inside a Docker container, to exercise workloads. To get this test running on ARMv8, you need to change the provided Dockerfile slightly so that it picks up an aarch64 version of Fedora:

FROM resin/aarch64-fedora:latest
RUN yum -y install stress && yum clean all
ENTRYPOINT ["stress"]
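A minimal sketch of the build step, assuming the Dockerfile above sits in the current directory. The image tag stress is my assumption, chosen to match the name used in the run command; the architecture guard is also my addition, since the base image is built for aarch64.

```shell
#!/usr/bin/env bash
# Sketch: build the stress image, but only on an aarch64 host.
set -eu

IMAGE_TAG="${IMAGE_TAG:-stress}"   # assumed tag

# Succeeds only when the given machine string is aarch64.
arch_ok() {
  [ "$1" = "aarch64" ]
}

build_image() {
  local arch
  arch="$(uname -m)"
  if ! arch_ok "$arch"; then
    echo "expected aarch64, found $arch; skipping build" >&2
    return 0
  fi
  # Build from the Dockerfile in the current directory.
  docker build -t "$IMAGE_TAG" .
}
```

Call build_image from the directory that holds the Dockerfile.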

Build this Dockerfile and then run it as follows to give a 96-core system a good workout. Install htop first so you get a good colorful screen to watch.

docker run -it --rm stress --cpu 96 --io 96 --vm 96 --vm-bytes 4G --timeout 100s
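One thing to keep in mind: stress allocates --vm-bytes per vm worker, so 96 workers at 4G each can ask for up to 384G in total, three times the 128 gigabytes installed. If you would rather stay inside physical memory, here is a sketch that sizes the per-worker allocation from the core count and MemTotal. The 90 percent budget and the helper name vm_bytes_mb are my own choices for illustration, and the script only prints the command.

```shell
#!/usr/bin/env bash
# Sketch: choose --vm-bytes so that all vm workers together fit a memory budget.
set -eu

# vm_bytes_mb WORKERS MEM_TOTAL_KB PERCENT
# Prints the per-worker allocation in megabytes, rounded down.
vm_bytes_mb() {
  local workers=$1 mem_kb=$2 percent=$3
  echo $(( mem_kb * percent / 100 / 1024 / workers ))
}

# Only look at the live machine when the Linux interfaces are present.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
  cores=$(nproc)
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  per_worker=$(vm_bytes_mb "$cores" "$mem_kb" 90)   # 90 is an arbitrary budget
  # Print the command instead of running it.
  echo docker run -it --rm stress --cpu "$cores" --io "$cores" \
    --vm "$cores" --vm-bytes "${per_worker}M" --timeout 100s
fi
```

With 96 workers and exactly 128 GiB reported, the helper works out to 1228 megabytes per worker.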

I saved away some binaries on a couple of systems so that reinstalling should be straightforward once this machine gets destroyed, and then off we go.