Core services, resilience, and continuity
This page describes Triton core services in terms of resilience and continuity; it is provided primarily for reference. Please contact Joyent Support prior to enabling HA so that they can review your current architecture and configuration and recommend the best-practice setup for your environment.
Triton uses a service-oriented architecture. Each service performs a specific role within Triton and runs within a zone. In an initial deployment of Triton, all of these zones run on the server designated as the head node. Compute nodes hold additional copies of services for clustered or redundant configurations.
The only stateful zone/service is manatee, a storage system built on top of PostgreSQL. Manatee is designed to run in a three-instance configuration consisting of a primary, a synchronous slave, and an asynchronous slave. Manatee uses Zookeeper for leader election and state tracking, and is designed to maintain read/write access if one of its instances fails. Read-only access is maintained in the event of simultaneous failures of two instances.
Access to manatee is provided to the other core services indirectly, via either moray (a key/value store) or UFDS (LDAP).
As a consequence of this architecture, resilience to failure and continuity of service can be achieved by implementing multiple instances of each service. The current state of each core service falls into one of three categories:
- HA/Cluster: Services in this category can and should be deployed in a high-availability or clustered configuration. Failures of these components should be communicated to Joyent Support prior to attempting recovery.
- Multiple instances: Services in this category can have multiple copies deployed, and in the event of a failure they can simply be recreated. Recreating a service instance takes roughly 1-2 minutes, depending on the size and speed of your hardware and network, and is done using the `sdcadm` utility.
- Single instance: Services in this category should only ever have one instance running, and should be recreated after a failure.
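For the "multiple instances" recovery path, the sketch below wraps the `sdcadm create` invocation shown later on this page in a small helper. The helper name, its arguments, and the example UUID are our own illustrative placeholders, not part of Triton.

```shell
# Hedged sketch: recreate a failed stateless service instance with sdcadm.
# The helper name and arguments are illustrative, not part of Triton.
recreate_instance() {
  local service="$1"   # e.g. "moray"
  local server="$2"    # UUID of a setup compute node
  sdcadm create "$service" -s "$server"
}

# Usage (the UUID below is a placeholder):
# recreate_instance moray 42184f34-638f-4e75-98a6-33c26d834d3d
```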
The table below summarizes the various services and the recommended deployment scheme. Additionally, services that require Joyent Support assistance to recover are noted as such.
| Service | HA/Cluster | Multiple Instances | Operator Restorable |
| --- | --- | --- | --- |
- The only stateful instance in Triton is the manatee data store (postgres). All other instances are stateless, which greatly simplifies the resilience scheme.
- There are currently certain failure modes which can be recovered by the operator, which are noted in the table above. However, it is recommended that customers with a current Triton support contract contact Joyent Support through their normal channels in order to receive assistance on any recovery effort, regardless of the component involved.
Zookeeper is used inside the binder instances to manage leader elections and cluster state in Triton, such as for manatee. It is possible to create a zookeeper cluster (binder cluster) using the `sdcadm` utility. Note: You cannot run a zookeeper cluster in a two-server Triton installation; you must have a minimum of three servers (one head node and two compute nodes).
First, update the `sdcadm` component via the command `sdcadm self-update` to ensure that you are running the most recent version of this component.
Set up and identify two compute nodes to use for the zookeeper cluster. Warning: You must use multiple compute nodes for the cluster members; failure to do so (say, by putting multiple instances of zookeeper on one compute node) will introduce a potential point of failure into the configuration.
- Create your ZK cluster via `sdcadm post-setup ha-binder -s SERVER1_UUID -s SERVER2_UUID`, where SERVER1_UUID and SERVER2_UUID are the UUIDs of the servers that will host the additional instances. Note that one zookeeper instance runs on the head node by default.
Currently, testing a zookeeper cluster is a highly manual process; engineering is developing a tool to automate it. In the short term, however, it is possible to use the steps below to validate that your zookeeper cluster is up and running properly.
From the head node, as the root user, get the IP addresses of the zookeepers:

```shell
ZK_IPS=$(sdc-vmapi '/vms?query=(%26(tags=*smartdc_role=binder*)(state=running))' | json -Ha nics.ip)
```
See if the zookeepers are reporting as up; you should see "imok" three times, once for each ZK:

```shell
for IP in $ZK_IPS; do echo ruok | nc $IP 2181; echo ""; done
```
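If you want to script this check, a small helper (ours, not part of Triton) can count the `imok` responses; it assumes one response per line, as produced by the loop above.

```shell
# Hedged helper: verify that every zookeeper answered "imok".
# Assumes the responses are passed in with one response per line.
zk_all_ok() {
  local responses="$1" expected="$2" n
  n=$(printf '%s\n' "$responses" | grep -c '^imok$')
  [ "$n" -eq "$expected" ]
}

# e.g. zk_all_ok "$(for IP in $ZK_IPS; do echo ruok | nc $IP 2181; echo ""; done)" 3
```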
See if they form a cluster:

```shell
for IP in $ZK_IPS; do echo stat | nc $IP 2181 | egrep "(leader|follower|standalone)"; done
```
You should see one leader and two followers. Customers with a current Triton support contract should contact Joyent Support through their normal channels if they run into any issues with these tests.
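The "one leader, two followers" check can likewise be scripted. The helper below is ours, not a Triton tool; it relies on the standard `Mode: leader` / `Mode: follower` lines that zookeeper's `stat` command emits (which the `egrep` in the loop above filters for).

```shell
# Hedged helper (illustrative, not part of Triton): given the combined
# "stat" output from all binder instances, verify the cluster shape.
zk_cluster_ok() {
  local modes="$1" leaders followers
  leaders=$(printf '%s\n' "$modes" | grep -c 'Mode: leader')
  followers=$(printf '%s\n' "$modes" | grep -c 'Mode: follower')
  [ "$leaders" -eq 1 ] && [ "$followers" -eq 2 ]
}
```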
It is possible to use the `sdcadm` command to deploy the additional instances required to put manatee into HA mode. Note: You cannot run manatee in HA mode on a two-server Triton installation; you must have a minimum of three servers (one head node and two compute nodes).
As with zookeeper, you first need to identify two additional compute nodes to hold the additional manatees. If possible, these should not be the same nodes used for the clustered zookeeper configuration. Warning: co-locating an additional manatee instance on a compute node that already hosts a manatee instance will introduce a potential point of failure into your Triton installation.
- Create your HA manatee nodes via `sdcadm post-setup ha-manatee -s SERVER1_UUID -s SERVER2_UUID`.
- Check to make sure the manatee cluster is up and stable by logging into a manatee zone via `sdc-login manatee0` and then running `manatee-adm status | json`. You should see three manatee nodes, and you should see them replicating properly, i.e., primary to sync and sync to async. For more information on manatee and troubleshooting, please see the pages entitled Manatee overview and Troubleshooting manatee.
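For a scripted coarse sanity check of the status output, the sketch below (our helper, not a Triton tool) simply greps the JSON for the three roles this page describes; it does not validate replication lag or peer health, so use `manatee-adm status` itself for real troubleshooting.

```shell
# Hedged helper: coarse check that the manatee status JSON mentions
# primary, sync, and async peers. Deliberately crude and illustrative;
# it assumes the JSON names the roles with these keys.
manatee_roles_present() {
  local status_json="$1"
  printf '%s' "$status_json" | grep -q '"primary"' &&
  printf '%s' "$status_json" | grep -q '"sync"' &&
  printf '%s' "$status_json" | grep -q '"async"'
}

# e.g., inside the manatee zone:
# manatee_roles_present "$(manatee-adm status | json)"
```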
To deploy multiple moray instances, first determine which compute nodes you will use for the deployment. Ideally, these nodes will not already contain manatee or zookeeper instances.
Once you have identified the nodes to be used, use `sdcadm` to add the additional moray instances.
For example, to add two new moray instances to a new Triton installation you would run:
```shell
headnode# sdcadm create moray -s CN1_UUID
headnode# sdcadm create moray -s CN2_UUID
```
Note: The first moray created when you install Triton will be named `moray0`; you can verify this by running the following command:
```shell
headnode# sdc-vmapi '/vms?query=(%26(tags=*smartdc_role=moray*)(state=running))' | json -Hag uuid alias nics.ip state
42184f34-638f-4e75-98a6-33c26d834d3d moray0 10.1.1.17 running
```
The above command also enables you to verify that the additional moray instances are provisioned and running.
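To script that verification, a helper (ours, illustrative) can count how many lines of the `sdc-vmapi` output report a running moray instance, matching the `uuid alias nics.ip state` columns shown above.

```shell
# Hedged helper: count running moray instances from the sdc-vmapi output
# above (fields: uuid alias nics.ip state). Illustrative only.
count_running_morays() {
  printf '%s\n' "$1" |
    awk '$2 ~ /^moray/ && $NF == "running" { n++ } END { print n + 0 }'
}

# e.g. expect 3 after adding two instances to a stock installation:
# count_running_morays "$(sdc-vmapi '...' | json -Hag uuid alias nics.ip state)"
```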