Docker Deployment
There are many projects that allow you to configure and manage the automated deployment of a service. From our experience, we have found that the small learning curve, mature tooling, and widespread ecosystem support of Docker make for a relatively painless and sufficiently reliable deployment.
In Rosetta, blockchain teams are expected to create and maintain a single Dockerfile (referencing any number of build stages) that starts the node runtime and all of its dependent services without human intervention.
At first glance, using a single Dockerfile to start all services required by the node may sound antithetical to best practices. However, we have found that restricting blockchain node deployment to a single container makes the orchestration of multiple nodes much easier because of coordinated start/stop and single volume mounting.
Coordinated Start/Stop
Some blockchain nodes rely on a number of dependent services to function correctly. These nodes often require an explicit startup and shutdown sequence to operate properly and/or prevent state corruption. With distributed services (in multiple running containers), this sequencing of operations can require a custom deployer for each blockchain. Building and maintaining these deployers can require extensive communication with blockchain teams and complicated testing to ensure correctness.
With a single Dockerfile, blockchain teams can explicitly specify how services should be started and stopped using scripts and easily test for issues in various scenarios (as all services are confined to a single container instead of being spread across some set of systems).
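To make the sequencing concrete, here is a minimal entrypoint sketch. The service commands are hypothetical placeholders (they only log to a file so the control flow is visible); a real script would start and stop the actual database and node binaries:

```shell
#!/bin/sh
# Sketch of a coordinated start/stop entrypoint for a single container.
# Service commands are placeholders that log to a file; substitute the
# real database and node commands in practice.
LOG=/tmp/startstop.log
: > "$LOG"

start_services() {
  echo "database started" >> "$LOG"   # e.g. start the external database
  echo "node started"     >> "$LOG"   # e.g. launch the node runtime in the background
}

# Stop in reverse order of startup so the node flushes its state
# before the database it depends on goes away.
stop_services() {
  echo "node stopped"     >> "$LOG"   # e.g. signal the node's PID and wait on it
  echo "database stopped" >> "$LOG"   # e.g. stop the database last
}

# `docker stop` delivers SIGTERM to PID 1, which triggers the trap.
trap stop_services TERM INT

start_services
# A real entrypoint would `wait` on the node's PID here; for this sketch
# we deliver the shutdown signal to ourselves to show the sequence.
kill -TERM $$
```

Because everything lives in one container, this ordering can be tested locally with a plain `docker stop`, with no external orchestration involved.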
Middleware vs Embedded
Blockchain teams that do not wish to extend a core node to comply with the Node API specification can alternatively implement a middleware server that transforms native responses to the Rosetta format (as long as this additional service's startup and shutdown is coordinated by the Docker implementation). That being said, multiple teams have initially chosen this route but reversed course to avoid the increased maintenance burden of having to manage a node interface in an external package (especially across upgrades of the core node).
Single Volume Mounting
When a deployment is started from a single Dockerfile, it is straightforward to mount a single volume to the new container and manage all of its state there. Node deployments can then be scaled by duplicating this volume to any number of new hosts, without any sophisticated tooling. As mentioned previously, coordinated start/stop of all services provides strong guarantees against state corruption, which would be much more difficult to achieve with distributed services because specific ordering restrictions may apply to prevent corruption.
Running multiple instances of a node configuration can get complicated quickly if the node utilizes multiple stateful containers (e.g., a node that stores historical state in an external database). In this scenario, the node orchestration engine must track which node deployment talks to which services based on the state of the node runtime. Furthermore, scaling up a node deployment is more time-intensive: a new deployment and all of its services must either be synced from scratch to ensure correctness, or bootstrapped from the volumes of another deployment's stateful containers (which can be a manual procedure).
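Assuming a hypothetical rosetta-node image that keeps all of its state under /data, scaling by volume duplication might look like the following (image, volume, and container names are all placeholders, and these commands require a running Docker daemon):

```shell
# Create and populate a volume with the first deployment
# (image/volume/container names are hypothetical placeholders).
docker volume create node-data-1
docker run -d --name node-1 -v node-data-1:/data rosetta-node:latest

# Scale out: duplicate the already-synced volume instead of
# re-syncing a new deployment from scratch...
docker volume create node-data-2
docker run --rm \
  -v node-data-1:/from -v node-data-2:/to \
  alpine sh -c "cp -a /from/. /to/"

# ...then start an identical container on the duplicate.
docker run -d --name node-2 -v node-data-2:/data rosetta-node:latest
```

Because all state lives in one volume, this copy step is the entire bootstrap procedure; with multiple stateful containers, each service's volume would need its own coordinated duplication.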
Stateful Node API Implementations are OK
To efficiently populate responses, it may be necessary to preprocess and cache data in a Rosetta Node API implementation. For example, a Bitcoin Node API server may cache transaction outpoint information (otherwise n requests to the node may be needed to fully populate a transaction, where n is the number of inputs in the transaction).
There is no reason that additional caches (outside of what is stored by the node) cannot be used in a Node API implementation. The only expectation is that any state is stored in the /data directory (as mentioned previously) and that cache migrations are handled gracefully.
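As a sketch of the "handled gracefully" expectation, an implementation could version its cache and rebuild it on mismatch rather than crash on startup. The directory layout and version number here are hypothetical, and /tmp stands in for the container's /data so the sketch runs anywhere:

```shell
#!/bin/sh
# Sketch: versioned cache directory for a stateful Node API implementation.
# In the container this would live under /data; /tmp is used here so the
# sketch is runnable outside Docker. Names and versions are hypothetical.
CACHE_DIR=/tmp/data/indexer-cache        # in the container: /data/indexer-cache
VERSION_FILE="$CACHE_DIR/VERSION"
WANT_VERSION=2

mkdir -p "$CACHE_DIR"
HAVE_VERSION="$(cat "$VERSION_FILE" 2>/dev/null || true)"
if [ "$HAVE_VERSION" != "$WANT_VERSION" ]; then
  # Incompatible (or missing) cache format: discard and repopulate
  # instead of failing on an unreadable cache.
  rm -rf "$CACHE_DIR"
  mkdir -p "$CACHE_DIR"
  echo "$WANT_VERSION" > "$VERSION_FILE"
fi
```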
Dockerfile Expectations
Build Anywhere
It must be possible to build your Dockerfile from any location (i.e. run docker build).
For this reason, your Dockerfile must not copy any files from the directory
where it is built. Instead, any required files must be downloaded from the
internet with explicit versioning. If it is not possible to fetch a specific
version, the hash of downloaded files must be compared with a constant
defined in the Dockerfile.
Good Example
# Use multi-stage build
FROM golang:1.13 as builder

# Download from GitHub instead of using COPY
RUN git clone https://github.com/blockchain-team/node

# Checkout a specific version
WORKDIR /go/node
RUN git checkout v1.2
RUN make build

# Create final container
FROM alpine:latest

# It is ok to COPY files from a build container (when using multi-stage builds)
COPY --from=builder /go/node/bin/node /bin/node

CMD ["node", "run"]
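When no version tag can be checked out, the hash-comparison fallback described above might look like this (the archive URL and checksum are placeholders, not real artifacts):

```dockerfile
FROM golang:1.13 as builder

# Placeholder URL and checksum: pin the exact artifact when no
# specific version can be fetched directly.
ARG NODE_ARCHIVE=https://example.com/node-v1.2.tar.gz
ARG NODE_SHA256=0000000000000000000000000000000000000000000000000000000000000000

# The build fails if the downloaded file does not match the pinned hash.
RUN wget -O node.tar.gz "$NODE_ARCHIVE" \
    && echo "$NODE_SHA256  node.tar.gz" | sha256sum -c - \
    && tar -xzf node.tar.gz
```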
Bad Example: Copy Files from Build Environment
FROM golang:1.13

# Do not COPY files into container from directory where built
COPY . .

RUN make build

CMD ["node", "run"]
Compile Exclusively from Source
Your Dockerfile must build both the node you are working on and your Rosetta implementation exclusively from source. In other words, your Dockerfile must not rely on any images that contain previously compiled code relevant to the blockchain you are working on.
Good Example
FROM golang:1.13 as builder

RUN git clone https://github.com/blockchain-team/node
WORKDIR /go/node
RUN git checkout v1.2
RUN make build

FROM alpine:latest

# It is ok to rely on build containers in a multi-stage build
COPY --from=builder /go/node/bin/node /bin/node

CMD ["node", "run"]
Bad Example: Build on Pre-Compiled Image
# Do not use previously built images (even if it
# is just for the node you are working on and not your Rosetta implementation)
FROM previously-built-blockchain-node-image:latest

RUN git clone https://github.com/blockchain-team/rosetta-node
WORKDIR rosetta-node
RUN make build

CMD ["node", "run"]