Troubleshooting Triton

Modified: 22 Nov 2017 17:39 UTC

This document provides basic instruction on troubleshooting both a Triton installation and the instances running in it.

Troubleshooting the installation

Installation level troubleshooting generally involves working on the installation head node. Most troubleshooting tasks will require root (or root-equivalent) access.

Checking the health of Triton

The sdc-healthcheck script is designed to check the major components of Triton and provide a simplified report indicating the status of key services. Please see checking the health of a Triton installation for details on how to run and interpret the output of this script.

Checking compute node access

In order to check access to all defined compute nodes in an installation, you can use the sdc-oneachnode command to issue a hostname command to each configured compute node.

For example, this is the output for a Triton installation with one head node and 5 compute nodes:

[root@headnode (mxpa) /opt/custom/bin]# sdc-oneachnode -a hostname
HOST                 OUTPUT
headnode             headnode
cnode01              cnode01
cnode02              cnode02
cnode03              cnode03
cnode04              cnode04
cnode05              cnode05

Notes:

You should receive a response from each compute node in your installation. If you fail to see a compute node, you will need to investigate to ensure that you are able to reach that compute node (via ping, ssh, etc), as this could point to an issue with that compute node. It could also point to an issue with the cnapi zone on the head node.
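
The check described above can be scripted by comparing the hosts that answered against a list of hosts you expect. The following is a minimal sketch, not part of Triton: the function name missing_nodes is hypothetical, the expected host names are supplied by you, and it assumes the two-column HOST/OUTPUT layout shown in the example.

```shell
# Print every expected host that is absent from the HOST column of
# `sdc-oneachnode -a hostname` output read on stdin.
# Hypothetical helper; usage:
#   sdc-oneachnode -a hostname | missing_nodes headnode cnode01 cnode02
missing_nodes() {
    awk -v expected="$*" '
        # Record each host that produced output, skipping the header row.
        $1 != "HOST" && NF >= 2 { seen[$1] = 1 }
        END {
            n = split(expected, want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) print want[i]
        }'
}
```

An empty result means every expected host answered. Any name that is printed should be checked with ping or ssh as described above.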

Changing the root password

Because of the way that Triton boots (from USB key for the head node, and via PXE boot for the compute nodes), the following steps must be taken when changing the root password to ensure that the change persists across a reboot. To change the root password in Triton, please follow the procedure entitled changing the root password in Triton.

Troubleshooting a compute node

Issues with a compute node affect all instances that live on that particular compute node. In order to troubleshoot a compute node you will need to log onto that node with root (or root-equivalent) access.

Problems with a compute node can be broken into two broad areas:

Hardware issues should be first diagnosed by checking the output of the fmadm utility as described in Working with the Fault Management Configuration Tool fmadm.

Performance issues should always be investigated from the "bottom up"; that is, you should first check the performance within the instance that is experiencing the issue. For more information on troubleshooting an instance, please see Troubleshooting Virtual Machines / Instances below.

Once you have satisfied yourself that the instance is performing properly, you should then check the performance on the underlying compute node by reviewing the sections below and performing whatever troubleshooting is relevant to the symptoms you are seeing.

Working with the fault management configuration tool fmadm

The fmadm(1M) utility is used to administer and service problems detected by the Solaris Fault Manager, fmd(1M). If a component has been diagnosed as faulty, fmadm will report what component has failed, and the response taken for the failed device.

To view a list of failed components, run fmadm with the faulty option:

# fmadm faulty
 -------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
 -------------- ------------------------------------  -------------- ---------
May 02 20:00:34 abe52661-52aa-ec45-983e-f019e465db53  ZFS-8000-FD    Major

Host        : headnode
Platform    : MS-7850   Chassis_id  : To-be-filled-by-O.E.M.
Product_sn  :

Fault class : fault.fs.zfs.vdev.io
Affects     : zfs://pool=zones/vdev=5dbf266cd162b324
                  faulted and taken out of service
Problem in  : zfs://pool=zones/vdev=5dbf266cd162b324
                  faulted and taken out of service

Description : The number of I/O errors associated with a ZFS device exceeded
                     acceptable levels.  Refer to
              http://illumos.org/msg/ZFS-8000-FD for more information.

Response    : The device has been offlined and marked as faulted.  An attempt
                     will be made to activate a hot spare if available.

Impact      : Fault tolerance of the pool may be compromised.

Action      : Run 'zpool status -x' and replace the bad device.

The output above is an example of what you may see when a device in a zpool has been diagnosed as faulted. For more details on fmadm, please see the fmadm(1M) man page.
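
When several faults are listed, it can help to reduce the report to one line per fault. The sketch below assumes the layout shown in the example above (a summary row containing the event UUID, followed later by a "Fault class" line). The function name fault_summary is hypothetical, and real output may differ between platform versions.

```shell
# Condense `fmadm faulty` output to "severity msg-id fault-class event-id".
# Hypothetical helper; usage:
#   fmadm faulty | fault_summary
fault_summary() {
    awk '
        # Summary rows look like: "May 02 20:00:34 <event-uuid> <MSG-ID> <severity>"
        NF == 6 && $4 ~ /^[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+$/ {
            id = $4; msg = $5; sev = $6
        }
        # "Fault class : <class>" belongs to the most recent summary row.
        $1 == "Fault" && $2 == "class" { print sev, msg, $4, id }
    '
}
```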

Compute node disk performance

Disk performance on a compute node is a function of the speed of the disks, the amount of ARC cache, and the I/O load on the system.

The following sections approach disk performance from the standpoint of the compute node. For information on viewing disk performance from the standpoint of an instance, please see the section entitled Checking instance disk usage.

Compute node disk throttling

In order to handle a multi-tenant load, Triton implements a ZFS I/O throttle. This throttle is composed of two components; one component tracks and accounts for each zone's I/O request, and one component throttles each zone that exceeds its fair share of disk I/O. When the throttle detects that a zone is consuming more than is appropriate, each read or write system call is delayed by up to 100 microseconds. This allows other zones to interleave I/O requests during those delays.

Disk throttling activity can be viewed using the vfsstat(1m) command.

In this example, we pass the -Zz flags to vfsstat, which tell it to report on all zones (-Z) but to omit any zones that are not showing activity (-z). The 5 5 tells it to print 5 reports at 5-second intervals.

# vfsstat -Zz 5 5
  r/s   w/s  kr/s  kw/s ractv wactv read_t writ_t  %r  %w   d/s  del_t zone
112.9  13.7 566.0  21.9   0.0   0.0   47.2   45.8   0   0   0.0    0.0 global (0)
  0.1   0.0   0.2   0.1   0.0   0.0  530.0  220.0   0   0   0.0   10.6 f45b2cc1 (2)
  0.1   0.0   0.2   0.1   0.0   0.0  566.8  324.9   0   0   0.0    9.1 52ea67c4 (3)
  0.1   0.0   0.2   0.1   0.0   0.0  569.3  316.4   0   0   0.0    8.2 7352a1d5 (4)
  0.1   0.0   0.2   0.1   0.0   0.0  508.0  323.6   0   0   0.0   10.7 dce9c215 (5)
242.4 829.3 16292.4 15085.3   0.2   0.1  960.4  131.8   9   7 514.5  126.7 42ef02d8 (6)
  0.6   0.1   2.0   0.4   0.0   0.0  178.5  120.7   0   0   0.1   48.4 929245ee (7)
150.4   0.1 475.0   0.0   0.0   0.0    3.5  143.9   0   0   0.0    8.4 446966ef (8)
 55.9   0.1 176.0   0.0   0.0   0.0    4.3   76.7   0   0   0.0    5.0 3496feff (9)
151.8   0.1 479.5   0.0   0.0   0.0    3.6  142.6   0   0   0.0    9.7 e9e4589b (10)
147.1   0.1 465.3   0.0   0.0   0.0    5.1  139.5   0   0   0.0   24.9 211a5737 (11)
 46.2   0.1 145.7   0.0   0.0   0.0    4.8  116.0   0   0   0.0    8.8 708ec2c4 (12)
 52.5   2.5 161.7   0.7   0.0   0.0    9.6   44.3   0   0   0.0   18.3 c3de353c (13)
  0.0   0.0   0.0   0.0   0.0   0.0    7.0    7.8   0   0   1.7   61.8 3ca31d30 (15)
  0.0   0.0   0.0   0.0   0.0   0.0   86.3   13.8   0   0   0.7   34.5 119f6ffc (16)
  0.0   0.0   0.0   0.0   0.0   0.0 1903.5   16.8   0   0   0.1   48.0 e25f589d (17)
  0.0   0.0   0.0   0.0   0.0   0.0    8.5   13.8   0   0   0.1  103.6 dbaa3a0d (18)
  0.0   0.0   0.0   0.0   0.0   0.0    7.5    6.6   0   0   0.0   35.0 d451978f (20)
  0.0   0.0   0.0   0.0   0.0   0.0    7.1   12.7   0   0   0.0   79.3 f0cd9950 (21)
  0.0   0.0   0.0   0.0   0.0   0.0    9.4    7.3   0   0   0.1   24.7 b5020965 (22)
159.5   0.1 503.5   0.0   0.0   0.0    3.6  138.3   0   0   0.0    6.8 2d716614 (31)
162.5   0.1 513.0   0.0   0.0   0.0    3.4  123.7   0   0   0.0    5.5 7d11de51 (34)
  0.0   0.0   0.0   0.0   0.0   0.0   12.7   16.3   0   0   0.0    9.7 9911693c (51)
  0.0   0.0   0.0   0.0   0.0   0.0    7.2   10.7   0   0   0.1   36.5 914ed552 (55)
  0.0   0.0   0.0   0.0   0.0   0.0  109.8   13.3   0   0   0.0   30.6 529c4dc3 (56)
175.4   0.1 553.8   0.1   0.0   0.0    3.5  126.1   0   0   0.0    7.5 b89cbc9d (62)
  0.0   0.0   0.0   0.0   0.0   0.0   35.3    7.2   0   0   6.9   85.1 b94f668a (63)
174.0   0.5 547.9   0.3   0.0   0.0    3.8   24.5   0   0   0.0    7.5 4c827f59 (65)
 12.8   0.3  39.8   0.3   0.0   0.0   11.3   29.6   0   0   0.0    5.5 0f37f0e0 (66)
  r/s   w/s  kr/s  kw/s ractv wactv read_t writ_t  %r  %w   d/s  del_t zone
237.1   1.3 576.5   0.3   0.0   0.0    4.4   53.3   0   0   0.0    0.0 global (0)
1733.5  38.6 8285.1  90.4   0.0   0.0   14.6   20.1   2   0   0.0    0.0 42ef02d8 (6)
  1.1   1.8   0.1   0.2   0.0   0.0   11.5   40.2   0   0   0.0    0.0 c3de353c (13)
  0.2   0.0   0.0   0.0   0.0   0.0   28.7    0.0   0   0   0.0    0.0 7d11de51 (34)
  r/s   w/s  kr/s  kw/s ractv wactv read_t writ_t  %r  %w   d/s  del_t zone
240.2   2.0 584.2   0.5   0.0   0.0    3.6   45.9   0   0   0.0    0.0 global (0)
898.1 113.3 3533.2 315.1   0.0   0.0   14.2   17.3   1   0   0.0    0.0 42ef02d8 (6)
  0.2   0.0   0.0   0.0   0.0   0.0   20.5    0.0   0   0   0.0    0.0 929245ee (7)
  0.2   0.0   0.0   0.0   0.0   0.0   23.8    0.0   0   0   0.0    0.0 708ec2c4 (12)
  1.7   1.8   0.1   0.3   0.0   0.0   13.3   46.4   0   0   0.0    0.0 c3de353c (13)
  0.7   0.4   0.6   0.1   0.0   0.0   17.0   15.1   0   0   0.0    0.0 b89cbc9d (62)
  r/s   w/s  kr/s  kw/s ractv wactv read_t writ_t  %r  %w   d/s  del_t zone
245.6   3.3 666.2   1.2   0.0   0.0    3.9   56.3   0   0   0.0    0.0 global (0)
1317.7  27.5 6121.2  74.7   0.0   0.0   16.5   19.4   1   0   0.0    0.0 42ef02d8 (6)
  1.1   3.9   0.1   0.6   0.0   0.0   11.2   38.4   0   0   0.0    0.0 c3de353c (13)
  r/s   w/s  kr/s  kw/s ractv wactv read_t writ_t  %r  %w   d/s  del_t zone
241.6   1.7 587.8   0.4   0.0   0.0    4.1   54.2   0   0   0.0    0.0 global (0)
1533.1  45.6 6427.7 106.1   0.0   0.0   13.9   21.6   1   0   0.0    0.0 42ef02d8 (6)
  1.9   2.4   0.1   0.4   0.0   0.0   11.6   47.3   0   0   0.0    0.0 c3de353c (13)
  0.2   0.0   0.0   0.0   0.0   0.0   34.0    0.0   0   0   0.0    0.0 2d716614 (31)

In the output above you will see five sets of data, each introduced by a header line:

r/s   w/s  kr/s  kw/s ractv wactv read_t writ_t  %r  %w   d/s  del_t zone

The key fields to look at are d/s and del_t, which show the number of operations per second that were delayed by the I/O throttle and the average delay, in microseconds, added to each of those operations.

The first set of data is historical and can be safely ignored; what we want to review are the other four sets.

Based on the data above, we do see some activity on this compute node, especially in the zone whose UUID begins with 42ef02d8. However, we are not seeing any disk throttling taking place currently (although we can see from the historical data that zones on this compute node have been throttled in the past).
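
On a busy compute node it is easy to miss a throttled zone in the raw output. The following sketch skips the first (historical) sample and prints only zones with a non-zero d/s value. It assumes the column order shown in the header line above, and the function name throttled_zones is hypothetical.

```shell
# List zones that were throttled in any sample after the first one.
# Hypothetical helper; usage:
#   vfsstat -Zz 5 5 | throttled_zones
throttled_zones() {
    awk '
        # Every header line starts a new sample.
        $1 == "r/s" { sample++; next }
        # Column 11 is d/s, column 12 is del_t, column 13 is the zone name.
        sample > 1 && NF >= 13 && $11 > 0 {
            print $13, "d/s=" $11, "del_t=" $12
        }'
}
```

Run against the example output above, this prints nothing, which matches the conclusion that no throttling is currently taking place.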

For more detailed information on how Triton manages disk throttling, please see Our ZFS IO Throttle.

Compute node ARC cache

The ARC, or Adaptive Replacement Cache, is used by Triton to improve file system and disk performance, with the goal of driving down overall system latency. In order to function smoothly, ARC requires a percentage of the available RAM in the system. This value, which can be set on the compute node level via the reservation ratio value, is typically between 10-15% of the available RAM on the compute node.

Systems that have a low amount of ARC available will tend to exhibit symptoms of poor performance, such as stalls and overall lack of disk responsiveness.

The utilization and performance of the ARC cache can be viewed by using the arcstat perl script.

A list of options and field definitions can be viewed by running arcstat with the -v flag:

# arcstat -v
Usage: arcstat [-hvx] [-f fields] [-o file] [interval [count]]

Field definitions are as follows:
     mtxmis : mutex_miss per second
      arcsz : ARC Size
       mrug : MRU Ghost List hits per second
     l2hit% : L2ARC access hit percentage
        mh% : Metadata hit percentage
    l2miss% : L2ARC access miss percentage
       read : Total ARC accesses per second
          c : ARC Target Size
       mfug : MFU Ghost List hits per second
       miss : ARC misses per second
        dm% : Demand Data miss percentage
       dhit : Demand Data hits per second
      pread : Prefetch accesses per second
      dread : Demand data accesses per second
       pmis : Prefetch misses per second
     l2miss : L2ARC misses per second
       time : Time
    l2bytes : bytes read per second from the L2ARC
        pm% : Prefetch miss percentage
        mm% : Metadata miss percentage
       hits : ARC reads per second
        mfu : MFU List hits per second
     l2read : Total L2ARC accesses per second
       mmis : Metadata misses per second
       rmis : recycle_miss per second
       mhit : Metadata hits per second
       dmis : Demand Data misses per second
        mru : MRU List hits per second
        ph% : Prefetch hits percentage
      eskip : evict_skip per second
     l2size : Size of the L2ARC
     l2hits : L2ARC hits per second
       hit% : ARC Hit percentage
      miss% : ARC miss percentage
        dh% : Demand Data hit percentage
      mread : Metadata accesses per second
       phit : Prefetch hits per second

In the example below, we see that the compute node we are viewing currently has 14GB allocated to ARC:

# arcstat 5 5
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
14:23:03     0     0      0     0    0     0    0     0    0    14G   14G
14:23:08    44     1      2     1    2     0    0     0    0    14G   14G
14:23:13   224     1      0     1    0     0    0     0    0    14G   14G
14:23:18   125     0      0     0    0     0    0     0    0    14G   14G
14:23:23    81     0      0     0    0     0    0     0    0    14G   14G

Based on the stats, we can see this is a fairly quiet compute node. Key areas to examine are:

It is possible to customize the fields shown by arcstat as shown below.

# arcstat -f arcsz,read,hit%,miss% 5 5
arcsz  read  hit%  miss%
  14G     0     0      0
  14G  1.3K    99      0
  14G    60   100      0
  14G    71    99      0
  14G    50    95      4

The example above shows the ARC size, the total ARC accesses per second, the ARC hit percentage, and the ARC miss percentage. Again, this shows that the compute node in question is performing very well.

For an in-depth discussion of ARC, please see Activity of the ZFS ARC.

Compute node memory utilization

The memory on a compute node is used for provisioned instances (virtual machines), ARC, and the running OS. Although the bulk of the memory is reserved for instances, the compute node needs sufficient free space for both ARC and OS in order to function optimally.

ARC and memory

Please see the section above entitled Compute Node ARC Cache for information on how compute node memory is used for ARC.

Using zonememstat

The zonememstat(1m) command is used to show memory utilization by zone on a compute node.

This command can be run in three ways:

Show all zones and their usage:
  ```
  [jschmidt@30Q7HS1 (us-east-1) ~]$ zonememstat
                                   ZONE  RSS(MB)  CAP(MB)    NOVER  POUT(MB)
                                 global     1179        -        -         -
   f45b2cc1-4bec-c504-8661-bcb984796771       42      256        0         0
   52ea67c4-65d6-4072-8442-8a035727ae85       68      256        0         0
   7352a1d5-aff7-c206-f3d9-d5992ba5d713       66      256        0         0
   dce9c215-2200-c27a-c339-94a8af9b2706       49      256        0         0
   42ef02d8-1f25-4636-b964-cf054691d3f2       73    16384        0         0
   929245ee-fcee-676d-fad7-c80c72e73487      252      256    12733      7821
   446966ef-c957-ebfb-f667-f8ddafec7075      163      640        0         0
   3496feff-1f80-c176-dd10-a7e36b53482e       36     4096        0         0
   e9e4589b-4895-463c-e37a-d4a3720b900b      138      640        0         0
   211a5737-e006-cb83-bca2-aa07e7d16411      226      256     2059     17171
   708ec2c4-6bb0-4570-eed2-96de1ac4413f       24      256        0         0
   c3de353c-5001-67ea-e6d9-d1a08eafc046       97     4096        0         0
   3ca31d30-1613-404b-9db1-0dea9c5228fc     8309     9216        0         0
   119f6ffc-6fde-4cf7-80ba-daa176a33da6     1863     2048        0         0
   e25f589d-0fd9-6008-b27b-d088d9c6333c      295      512        0         0
   dbaa3a0d-8254-4fa3-8869-bf596a9283b8     4141     4352        0         0
   d451978f-a325-618c-ae03-b765657ce82d    17695    17792        0         0
   f0cd9950-f6ab-e8b5-c2c1-fa54c408f297     1848     2048        0         0
   b5020965-b563-cc4e-9d9d-8b83a54d3348     1867     2048        0         0
   2d716614-0610-cc15-a2f3-b23a001d9e84       47      256        0         0
   7d11de51-2a4d-6a89-a284-e1d0e4211f0c       47     4096        0         0
   9911693c-af0d-6b05-9e16-ff0826b8ce32      285      512        0         0
   914ed552-6aa6-e6dc-f380-9594645f96ac     2126     2304        0         0
   529c4dc3-d470-e168-fa57-c00cf6c2da21     1073     1280        0         0
   b89cbc9d-4948-6185-d6c3-b2f6661c9b7f       48      256        0         0
   b94f668a-01b6-cc70-e0b0-d7934bdfda25     7739     7936        0         0
   4c827f59-3318-c24f-c86c-d426c08a121a       38      256        0         0
   0f37f0e0-8c93-4f5f-bd9d-135a173bb901       73     1024        0         0
  ```
Show only zones that have gone over their cap:
  ```
  # zonememstat -o
                                 ZONE  RSS(MB)  CAP(MB)    NOVER  POUT(MB)
  929245ee-fcee-676d-fad7-c80c72e73487      197      256    12733      7821
  211a5737-e006-cb83-bca2-aa07e7d16411      226      256     2059     17171
 ```
Show only one zone:
  ```
  # zonememstat -z 929245ee-fcee-676d-fad7-c80c72e73487
                                   ZONE  RSS(MB)  CAP(MB)    NOVER  POUT(MB)
   929245ee-fcee-676d-fad7-c80c72e73487      233      256    12733      7821
  ```

The key fields to look at are:

| Field | Meaning |
| --- | --- |
| ZONE | The UUID (name) of the zone |
| RSS(MB) | The current RSS (resident set size) of the zone, in MB |
| CAP(MB) | The memory cap of the zone, in MB |
| NOVER | The number of times the zone has gone over its memory cap since booting |
| POUT(MB) | The total amount of memory paged out while the zone was over its memory cap since booting, in MB |

Looking at our example above, we can see that zone 929245ee-fcee-676d-fad7-c80c72e73487 has a 256MB cap and is currently using 233MB of memory. It has gone over its cap 12,733 times since the instance was booted, with a total of 7.8GB paged out. This instance should likely be investigated, and a resize suggested to the instance owner.
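
Zones that have not yet exceeded their cap but are running close to it are also worth a look. The sketch below reports each zone whose RSS is at or above a chosen percentage of its cap. The function name near_cap and the default threshold of 90 are assumptions made for illustration, and the five-column layout is taken from the example output above.

```shell
# Report zones whose RSS is at least PCT percent of their memory cap.
# Hypothetical helper; usage:
#   zonememstat | near_cap 90
near_cap() {
    awk -v pct="${1:-90}" '
        # Data rows have five columns. The header and the uncapped global
        # zone are skipped because their CAP column is not a number.
        NF == 5 && $3 ~ /^[0-9]+$/ && $3 > 0 {
            used = 100 * $2 / $3
            if (used >= pct)
                printf "%s %d%% of cap, over %s times\n", $1, used, $4
        }'
}
```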

Zone memory usage calculation

For additional information on how memory usage/capping works in Triton and SmartOS, please see About Calculating Memory Usage and Capping.

Compute node CPU utilization

Although it does show you how to investigate at the zone/instance level, the documentation below primarily concerns overall CPU utilization on the compute node, and does not discuss the Fair Share scheduler and how it manages CPU resources. For more information on Fair Share please see About CPU Usage on our wiki.

Load averages

The uptime(1) command can be used to provide a quick overview of the server's activity. The numbers reported by this command provide a rough count of the number of processes currently executing plus the number of processes in the run queue. The first number is the average over the last minute, the second over the last 5 minutes, and the third over the last 15 minutes.

Although a high load average may be evidence of problems, it is not enough by itself to diagnose a problem. For that you will need to run additional diagnostics.
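
One common way to put a load average in context is to divide it by the number of CPUs on the server. The sketch below does that for the three averages. It is a rough heuristic rather than a diagnosis, the function name load_per_cpu is hypothetical, and the CPU count is passed in by the caller (for example, counted from psrinfo output).

```shell
# Divide the three load averages by a CPU count supplied by the caller.
# Hypothetical helper; usage:
#   uptime | load_per_cpu "$(psrinfo | wc -l)"
load_per_cpu() {
    awk -v ncpu="$1" '
        # Strip everything up to "load average:" or "load averages:";
        # lines without that text are ignored.
        sub(/.*load averages?: */, "") {
            n = split($0, la, /, */)
            if (n == 3 && ncpu > 0)
                printf "%.2f %.2f %.2f\n", la[1] / ncpu, la[2] / ncpu, la[3] / ncpu
        }'
}
```

Because the pattern accepts both spellings, the same function can also read the "Total:" line that prstat prints.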

The prstat command

SmartOS uses prstat(1M) instead of top; it understands SmartOS better and has lower overhead. The man pages contain a great deal of detail on the various options available, but the most common invocations are:

prstat -Z, which adds a per-zone summary below the process list.

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 15971 root     7749M 7737M cpu5     1    0 108:26:25 5.4% qemu-system-x86/11
 16622 root     8320M 8307M cpu7    59    0 423:50:23 2.9% qemu-system-x86/10
 82497 103        91M   90M cpu10    4    0   1:44:35 0.8% node/5
 15554 root     1876M 1861M sleep   59    0  96:05:36 0.6% qemu-system-x86/5
 82499 103       176M  135M sleep   14    0   3:46:18 0.4% node/7
 15755 root     4150M 4138M sleep    1    0  82:44:55 0.3% qemu-system-x86/5
 16429 root       17G   17G sleep   59    0  17:43:16 0.1% qemu-system-x86/5
 16483 root     1878M 1865M sleep   59    0  14:35:47 0.1% qemu-system-x86/6
  3133 root      180M  151M sleep    1    0   6:56:48 0.1% node/7
    85 root        0K    0K sleep   99  -20  86:20:47 0.1% zpool-zones/166
 86339 root     1084M 1071M sleep   59    0   3:41:59 0.1% qemu-system-x86/4
 64300 root     2137M 2124M sleep   59    0   4:35:36 0.1% qemu-system-x86/5
 15838 root     1861M 1846M sleep   59    0   8:53:51 0.0% qemu-system-x86/4
 80341 root      296M  283M sleep   59    0   2:46:39 0.0% qemu-system-x86/4
 42078 root       28M   20M sleep    1    0   0:09:07 0.0% nscd/35
 16065 root      305M  292M sleep   59    0   7:37:14 0.0% qemu-system-x86/4
 34728 zabbix   6152K 3944K sleep   59    0   4:06:40 0.0% zabbix_agentd/1
 12526 root       54M   36M sleep   58    0   2:20:22 0.0% node/7
 18025 root       52M   25M sleep   59    0   4:39:01 0.0% node/7
 11053 zabbix   5592K 3184K sleep   59    0   2:44:30 0.0% zabbix_agentd/1
 72139 829       210M 7708K sleep   59    0   1:29:42 0.0% mongod/11
 16466 829       210M 7588K sleep   59    0   1:22:50 0.0% mongod/11
 22362 829       210M 8500K sleep   59    0   0:37:48 0.0% mongod/11
 17703 829       211M 8440K sleep   59    0   1:48:13 0.0% mongod/11
 17856 829       211M 7372K sleep   59    0   1:44:26 0.0% mongod/11
  7185 jschmidt 4732K 3984K cpu15   59    0   0:00:00 0.0% prstat/1
 34733 zabbix   6192K 2940K sleep   59    0   0:55:59 0.0% zabbix_agentd/1
  4306 root       70M   48M sleep    1    0   2:29:19 0.0% node/7
 16418 root     1852K  904K sleep   59    0   0:00:00 0.0% pfexecd/2
  4292 root     2144K  772K sleep    1    0   0:00:00 0.0% iscsid/2
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE
    63        2 7749M 7737M   7.9% 108:26:25 5.4% b94f668a-01b6-cc70-e0b0-d79*
    15        2 8320M 8307M   8.5% 423:50:23 2.9% 3ca31d30-1613-404b-9db1-0de*
     7       18  327M  249M   0.2%   5:40:51 1.2% 929245ee-fcee-676d-fad7-c80*
    16        2 1876M 1861M   1.9%  96:05:36 0.6% 119f6ffc-6fde-4cf7-80ba-daa*
    18        2 4150M 4138M   4.2%  82:44:55 0.3% dbaa3a0d-8254-4fa3-8869-bf5*
     0      116 2047M 1373M   1.2% 174:06:38 0.3% global
    20        2   17G   17G    18%  17:43:16 0.1% d451978f-a325-618c-ae03-b76*
    22        2 1878M 1865M   1.9%  14:35:47 0.1% b5020965-b563-cc4e-9d9d-8b8*
    56        2 1084M 1071M   1.1%   3:41:59 0.1% 529c4dc3-d470-e168-fa57-c00*
    55        2 2137M 2124M   2.2%   4:35:36 0.1% 914ed552-6aa6-e6dc-f380-959*
    21        2 1861M 1846M   1.9%   8:53:51 0.0% f0cd9950-f6ab-e8b5-c2c1-fa5*
Total: 488 processes, 2310 lwps, load averages: 3.97, 5.51, 9.82

prstat -z <UUID>, which limits the output to a single zone, given the UUID of the instance you are interested in.

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 22362 829       210M 8500K sleep   59    0   0:37:48 0.0% mongod/11
 22189 root       14M   10M sleep    1    0   0:00:19 0.0% nscd/26
 22364 root     2280K 1084K sleep   59    0   0:00:00 0.0% ttymon/1
 22370 root     6528K 2640K sleep   58    0   0:00:00 0.0% inetd/3
 21852 root     8584K 7012K sleep   43    0   0:00:18 0.0% svc.configd/13
 21744 root        0K    0K sleep   60    -   0:00:00 0.0% zsched/1
 22363 root     2144K 1056K sleep   59    0   0:00:00 0.0% sac/1
 21810 root     2680K 1652K sleep   59    0   0:00:01 0.0% init/1
 21850 root     6244K 4296K sleep   59    0   0:00:09 0.0% svc.startd/13
 22405 root     4344K 1776K sleep   59    0   0:00:06 0.0% sshd/1
 22252 root     2492K 1260K sleep   59    0   0:00:00 0.0% pfexecd/3
 22371 root     1940K 1024K sleep   59    0   0:00:00 0.0% ttymon/1
 22359 root     6072K 3944K sleep   59    0   0:00:17 0.0% rsyslogd/6
 22355 root     1872K 1068K sleep   59    0   0:00:00 0.0% cron/1
 22369 root     1652K  816K sleep   59    0   0:00:01 0.0% utmpd/1
 21916 netadm   4448K 3124K sleep   59    0   0:00:05 0.0% ipmgmtd/4

Total: 16 processes, 87 lwps, load averages: 3.66, 5.25, 9.55

prstat -mLc, which includes microstate information and information on LWPs (lightweight processes).

The -c flag tells prstat to print each new set of information below the previous set. This is ideal for writing prstat information to a log file, for example.

# prstat -mLc
Please wait...
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 15971 root     0.7  20 0.0 0.0 0.0 0.2  74 4.6  1K 542  9K   6 qemu-system-/3
 16622 root     9.7 8.1 0.0 0.0 0.0 0.3  79 2.5 19K  71 .1M   0 qemu-system-/1
 82499 103       16 0.9 0.1 0.0 0.0 0.0  73  10 111 645 976   0 node/1
 15971 root     0.6  16 0.0 0.0 0.0 0.1  79 3.9  1K 283  6K   5 qemu-system-/4
  7334 jschmidt 2.9  13 0.0 0.0 0.0 0.0  84 0.0  41  41 49K   0 prstat/1
 16622 root     0.3 8.7 0.0 0.0 0.0 0.2  87 4.2  2K 260  2K   0 qemu-system-/3
 16622 root     0.3 8.7 0.0 0.0 0.0 0.2  87 4.2  2K 261  2K   0 qemu-system-/5
 16622 root     0.3 8.7 0.0 0.0 0.0 0.2  87 4.2  2K 267  2K   0 qemu-system-/4
 16622 root     0.3 8.7 0.0 0.0 0.0 0.2  87 4.2  2K 262  2K   0 qemu-system-/6
 15971 root     4.9 3.6 0.0 0.0 0.0 0.2  89 2.6 10K  84 81K   0 qemu-system-/1
 82497 103      8.1 0.0 0.0 0.0 0.0 0.0  90 2.2   9 112 100   0 forza/1
 15554 root     3.5 3.0 0.0 0.0 0.0 0.0  92 1.8  7K  54 58K   0 qemu-system-/1
  5155 root     0.0 6.3 0.0 0.0 0.0  94 0.1 0.0  29   2 302   0 zoneadmd/5
 15755 root     3.6 2.1 0.0 0.0 0.0 0.0  91 2.9  6K 200 44K   0 qemu-system-/1
 15554 root     0.2 4.5 0.0 0.0 0.0 0.0  94 0.8  1K  47  1K   0 qemu-system-/3
Total: 489 processes, 2334 lwps, load averages: 3.39, 4.73, 8.93
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 15971 root     2.0  58 0.0 0.0 0.0 0.1  40 0.4  4K 288 24K   5 qemu-system-/3
 15971 root     1.3  53 0.0 0.0 0.0 0.1  45 0.3  4K  88 17K   8 qemu-system-/4
 82499 103       37 2.5 0.0 0.0 0.0 0.0  44  16 256 573  2K   0 node/1
 16622 root      12 9.5 0.0 0.0 0.0 0.0  78 1.2 22K  24 .2M   0 qemu-system-/1
 82497 103       18 0.0 0.0 0.0 0.0 0.0  78 3.5   9 150 109   0 forza/1
 16622 root     0.5  12 0.0 0.0 0.0 0.1  87 0.2  3K   3  3K   0 qemu-system-/3
 16622 root     0.5  12 0.0 0.0 0.0 0.2  87 0.4  3K 217  3K   0 qemu-system-/6
 16622 root     0.5  11 0.0 0.0 0.0 0.2  88 0.4  3K 261  3K   0 qemu-system-/4
 15971 root     6.1 4.8 0.0 0.0 0.0 0.0  88 1.0 12K  92 .1M   0 qemu-system-/1
 16622 root     0.4  10 0.0 0.0 0.0 0.1  89 0.2  3K   7  3K   0 qemu-system-/5
 15554 root     4.4 3.8 0.0 0.0 0.0 0.0  91 0.6  9K   9 74K   0 qemu-system-/1
 15554 root     0.4 5.9 0.0 0.0 0.0 0.1  93 0.2  2K  27  3K   0 qemu-system-/3
 15755 root     3.7 1.8 0.0 0.0 0.0 0.0  94 0.5  5K 118 40K   0 qemu-system-/1
 16429 root     0.0 4.4 0.0 0.0 0.0 0.0  96 0.0  71  16  56   0 qemu-system-/4
 16622 root     0.9 2.5 0.0 0.0 0.0 0.0  95 1.2 18K   5 36K 18K qemu-system-/7
Total: 489 processes, 2319 lwps, load averages: 3.39, 4.71, 8.90
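
In the microstate output, the LAT column is the percentage of time a thread spent waiting for a CPU, so it is a useful first filter when you suspect CPU contention. The following sketch prints only the threads at or above a chosen LAT value. The function name high_latency and the default threshold of 10 are assumptions for illustration, and the 15-column layout is taken from the output above.

```shell
# Print threads whose LAT value is at or above a threshold.
# Hypothetical helper; usage:
#   prstat -mLc 5 5 | high_latency 10
high_latency() {
    awk -v min="${1:-10}" '
        # Thread rows start with a numeric PID and have 15 columns.
        # Column 10 is LAT and column 15 is PROCESS/LWPID.
        NF == 15 && $1 ~ /^[0-9]+$/ && $10 + 0 >= min + 0 {
            print $1, $15, "LAT=" $10
        }'
}
```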

The kstat command

The kstat utility permits you to read available kernel statistics; to access this facility you can call kstat(1M) with a set of criteria which it will then try to match against available statistics. Data that matches is printed with its module, instance, and name fields, as well as its actual value.

To view information on all zones, use kstat -p caps::cpucaps_zone_*

# kstat -p caps::cpucaps_zone_*
caps:2:cpucaps_zone_2:above_base_sec    52
caps:2:cpucaps_zone_2:above_sec 22
caps:2:cpucaps_zone_2:baseline  7
caps:2:cpucaps_zone_2:below_sec 2659222
caps:2:cpucaps_zone_2:burst_limit_sec   0
caps:2:cpucaps_zone_2:bursting_sec  0
caps:2:cpucaps_zone_2:class zone_caps
caps:2:cpucaps_zone_2:crtime    2007.585090139
caps:2:cpucaps_zone_2:effective 50
caps:2:cpucaps_zone_2:maxusage  65
caps:2:cpucaps_zone_2:nwait 0
caps:2:cpucaps_zone_2:snaptime  2661232.852473399
caps:2:cpucaps_zone_2:usage 0
caps:2:cpucaps_zone_2:value 50
caps:2:cpucaps_zone_2:zonename  f45b2cc1-4bec-c504-8661-bcb984796771
<-------- SNIP -------->
caps:66:cpucaps_zone_66:above_base_sec  20
caps:66:cpucaps_zone_66:above_sec   1
caps:66:cpucaps_zone_66:baseline    30
caps:66:cpucaps_zone_66:below_sec   146678
caps:66:cpucaps_zone_66:burst_limit_sec 0
caps:66:cpucaps_zone_66:bursting_sec    0
caps:66:cpucaps_zone_66:class   zone_caps
caps:66:cpucaps_zone_66:crtime  2514572.257500655
caps:66:cpucaps_zone_66:effective   200
caps:66:cpucaps_zone_66:maxusage    217
caps:66:cpucaps_zone_66:nwait   0
caps:66:cpucaps_zone_66:snaptime    2661241.997279293
caps:66:cpucaps_zone_66:usage   0
caps:66:cpucaps_zone_66:value   200
caps:66:cpucaps_zone_66:zonename    0f37f0e0-8c93-4f5f-bd9d-135a173bb901

Viewing information on a specific zone is a two step process.

  1. Get the zone_id by running zoneadm list -v and then grep'ing for the UUID of the zone you want information on.

    # zoneadm list -v | grep f0cd9950-f6ab-e8b5-c2c1-fa54c408f297
      21 f0cd9950-f6ab-e8b5-c2c1-fa54c408f297 running    /zones/f0cd9950-f6ab-e8b5-c2c1-fa54c408f297 kvm      excl
  2. Call kstat with the zone_id from above.

    # kstat -p caps::cpucaps_zone_21
    caps:21:cpucaps_zone_21:above_base_sec  0
    caps:21:cpucaps_zone_21:above_sec   5
    caps:21:cpucaps_zone_21:baseline    0
    caps:21:cpucaps_zone_21:below_sec   2659508
    caps:21:cpucaps_zone_21:burst_limit_sec 0
    caps:21:cpucaps_zone_21:bursting_sec    0
    caps:21:cpucaps_zone_21:class   zone_caps
    caps:21:cpucaps_zone_21:crtime  2031.315195651
    caps:21:cpucaps_zone_21:effective   100
    caps:21:cpucaps_zone_21:maxusage    162
    caps:21:cpucaps_zone_21:nwait   0
    caps:21:cpucaps_zone_21:snaptime    2661537.715075964
    caps:21:cpucaps_zone_21:usage   1
    caps:21:cpucaps_zone_21:value   100
    caps:21:cpucaps_zone_21:zonename    f0cd9950-f6ab-e8b5-c2c1-fa54c408f297
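
The two steps above can be combined. The sketch below extracts the zone ID from zoneadm list -v output for a given UUID, assuming the column order shown in step 1 (ID first, zone name second). The function name zone_id_for is hypothetical.

```shell
# Print the numeric zone ID for a zone UUID, reading `zoneadm list -v`
# output on stdin.
# Hypothetical helper; usage:
#   zid=$(zoneadm list -v | zone_id_for f0cd9950-f6ab-e8b5-c2c1-fa54c408f297)
#   kstat -p "caps::cpucaps_zone_${zid}"
zone_id_for() {
    awk -v uuid="$1" '$2 == uuid { print $1 }'
}
```

Matching on the whole second column, rather than using grep, avoids picking up other lines that happen to contain the same text.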

The key values you want to check are listed in the table below.

| Name | Description |
| --- | --- |
| usage | Current CPU usage |
| maxusage | High watermark of CPU usage |
| value | CPU cap value. This is the most CPU the instance can use while bursting. |
| baseline | CPU minimum value. This is the guaranteed minimum CPU usage for the instance. |
| above_base_sec | Number of seconds the instance was bursting (above baseline) |

These values are percentages of a single CPU, totaled across all CPUs (as listed by psrinfo). So, a value of 200 is equivalent to 2 virtual CPUs (a virtual CPU is either a core or a hyper-thread).
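
To compare the key values across zones, the per-statistic lines can be folded into one line per zone. The following is a minimal sketch that assumes the module:instance:name:statistic layout shown above. The function name cap_summary is hypothetical.

```shell
# Fold `kstat -p` CPU cap statistics into one summary line per zone.
# Hypothetical helper; usage (quote the pattern so the shell does not expand it):
#   kstat -p 'caps::cpucaps_zone_*' | cap_summary
cap_summary() {
    awk '
        NF >= 2 {
            split($1, key, ":")      # key[2] = zone ID, key[4] = statistic name
            val[key[2], key[4]] = $2
            if (!(key[2] in ids)) { ids[key[2]] = 1; order[++n] = key[2] }
        }
        END {
            # Print zones in the order they first appeared in the input.
            for (i = 1; i <= n; i++) {
                z = order[i]
                printf "%s usage=%s cap=%s max=%s burst_sec=%s\n",
                    val[z, "zonename"], val[z, "usage"], val[z, "value"],
                    val[z, "maxusage"], val[z, "above_base_sec"]
            }
        }'
}
```

A zone whose max figure is well above its cap, or whose burst_sec figure keeps growing between runs, is a reasonable candidate for closer inspection.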

Compute node hard drive failures

Hard drive failures on compute nodes affect all instances that live on the server. Three key tools to help troubleshoot and identify potential problems with the underlying storage are iostat(1m), kstat(1m), and zpool(1m).

Verifying failures using iostat / kstat

For a quick look at potential disk errors, you can run the iostat(1m) command with the -En option as shown below:

# iostat -En
c0t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: Generic  Product: STORAGE DEVICE   Revision: 9451 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: Kingston Product: DataTraveler 2.0 Revision: PMAP Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 82 Predictive Failure Analysis: 0
c2t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD10EZEX-00K Revision: 1H15 Serial No: WD-WCC1S5975038
Size: 1000.20GB <1000204886016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 116 Predictive Failure Analysis: 0
c2t1d0           Soft Errors: 0 Hard Errors: 104 Transport Errors: 0
Vendor: ATA      Product: WDC WD5000AVVS-0 Revision: 1B01 Serial No: WD-WCASU7437291
Size: 500.11GB <500107862016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 121 Predictive Failure Analysis: 0
c2t2d0           Soft Errors: 0 Hard Errors: 26 Transport Errors: 0
Vendor: HL-DT-ST Product: DVDRAM GH24NS95  Revision: RN01 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 26 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

In the example above, device c2t1d0 is showing 104 hard errors. This indicates a potential failure, and the disk should be monitored closely.
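When many disks are present, scanning this output by eye is error-prone. The minimal Python sketch below flags devices whose summary line reports a non-zero soft, hard, or transport error count. The function name devices_with_errors and the SAMPLE text are hypothetical, and the sketch assumes the summary-line layout shown in the iostat -En output above.

```python
import re

# Matches the per-device summary line of `iostat -En`, for example:
# c2t1d0           Soft Errors: 0 Hard Errors: 104 Transport Errors: 0
HEADER = re.compile(
    r"^(?P<device>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)


def devices_with_errors(text):
    """Return {device: counts} for devices with any non-zero error count."""
    flagged = {}
    for line in text.splitlines():
        match = HEADER.match(line.strip())
        if not match:
            continue  # detail lines (Vendor, Size, Media Error, ...) are ignored
        counts = {k: int(match.group(k)) for k in ("soft", "hard", "transport")}
        if any(counts.values()):
            flagged[match.group("device")] = counts
    return flagged


# Summary lines copied from the example output above.
SAMPLE = """\
c2t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD10EZEX-00K Revision: 1H15 Serial No: WD-WCC1S5975038
c2t1d0           Soft Errors: 0 Hard Errors: 104 Transport Errors: 0
Vendor: ATA      Product: WDC WD5000AVVS-0 Revision: 1B01 Serial No: WD-WCASU7437291
c2t2d0           Soft Errors: 0 Hard Errors: 26 Transport Errors: 0
Vendor: HL-DT-ST Product: DVDRAM GH24NS95  Revision: RN01 Serial No:
"""

flagged = devices_with_errors(SAMPLE)
for device, counts in flagged.items():
    print(device, counts)
```

Note that the sketch also flags c2t2d0, which the Vendor/Product line identifies as a DVD drive whose hard error count equals its Device Not Ready count. Flagged devices still need to be interpreted in context before a disk is treated as failing.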

The values reported by iostat -E can also be pulled from the kstat command:

[root@headnode (cak-1) ~]# kstat -n sd3,err
module: sderr                           instance: 3
name:   sd3,err                         class:    device_error
        crtime                          48.741894257
        Device Not Ready                0
        Hard Errors                     104
        Illegal Request                 62
        Media Error                     0
        No Device                       0
        Predictive Failure Analysis     0
        Product                         WDC WD5000AVVS-09
        Recoverable                     0
        Revision                        1B01
        Serial No                       WD-WCASU7437291
        Size                            500107862016
        snaptime                        766415.722265268
        Soft Errors                     0
        Transport Errors                0
        Vendor                          ATA

Verifying the status of zpools

The zpool command manages and configures ZFS storage pools. A storage pool is a collection of virtual devices (physical drives or LUNs) that provides storage to ZFS datasets (zones). The physical configuration of the disks comprising your zpool will depend on numerous factors, including your hardware vendor, the type and number of disks, and the presence of a RAID card.

To view the health and status of storage pools, run the following zpool command:

# zpool status
  pool: zones
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zones       ONLINE       0     0     0
          c2t0d0    ONLINE       0     0     0
        cache
          c2t1d0    ONLINE       0     0     0

errors: No known data errors

The above output indicates a healthy storage pool with no errors or disk maintenance activities in progress.

The output below shows a zpool that is currently in a degraded state.

# zpool status
  pool: zones
 state: DEGRADED
status: One or more devices is an an offline state.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zones       DEGRADED     0     0     0
          mirror-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
          mirror-1  DEGRADED     0     0     0
            c3t0d0  OFFLINE      0     0     0
            c3t1d0  ONLINE       0     0     0
        spares
          c4t0d0    AVAIL
          c4t1d0    AVAIL

As the output above shows, the action line describes how to correct the issue. In this case, it is enough to bring the offline device (c3t0d0) back online with zpool online. Once that is done, zpool status reports the following:

# zpool status
  pool: zones
 state: ONLINE
  scan: resilvered 143K in 0h0m with 0 errors on Fri Jun 27 13:30:37 2014
config:

        NAME        STATE     READ WRITE CKSUM
        zones       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
          mirror-1  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
        spares
          c4t0d0    AVAIL
          c4t1d0    AVAIL

errors: No known data errors

The zpool is now showing as healthy. The scan line shows that the resilvering process, which brought the onlined device back to a functional state, completed without errors.
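The check described in this section can also be scripted. The following minimal Python sketch reads zpool status text and lists every pool, vdev, or device in the config section whose state is neither ONLINE nor AVAIL. The function name unhealthy_devices, the HEALTHY set, and the SAMPLE text are hypothetical, and the sketch assumes the column layout shown in the outputs above.

```python
# States treated as healthy in this sketch (an assumption for illustration).
HEALTHY = {"ONLINE", "AVAIL"}


def unhealthy_devices(status_text):
    """Return (name, state) pairs from the config section that are not healthy."""
    problems = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].endswith(":"):
            # Labels such as pool:, state:, scan:, errors: end the device list;
            # only config: starts it.
            in_config = fields[0] == "config:"
            continue
        if not in_config or fields[0] == "NAME" or len(fields) < 2:
            continue  # skip the column header and group labels (cache, spares)
        name, state = fields[0], fields[1]
        if state not in HEALTHY:
            problems.append((name, state))
    return problems


# Config section copied from the degraded example above.
SAMPLE = """\
  pool: zones
 state: DEGRADED
config:

        NAME        STATE     READ WRITE CKSUM
        zones       DEGRADED     0     0     0
          mirror-0  ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t1d0  ONLINE       0     0     0
          mirror-1  DEGRADED     0     0     0
            c3t0d0  OFFLINE      0     0     0
            c3t1d0  ONLINE       0     0     0
        spares
          c4t0d0    AVAIL
          c4t1d0    AVAIL
"""

problems = unhealthy_devices(SAMPLE)
for name, state in problems:
    print(name, state)
```

For the degraded example, the result names the pool, the degraded mirror, and the offline disk, which narrows the problem down to c3t0d0. An empty result corresponds to the healthy outputs shown above.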

Recovering a compute node in Triton

In extreme cases where a hardware fault renders a compute node non-functional, it is generally possible to recover the compute node. To do so, please see recovering a compute node in Triton.

Troubleshooting Manatee

Manatee is an automated failover, fault monitoring, and leader-election system built for managing a set of replicated Postgres servers. It is written entirely in Node.js and is used by Triton to store persistent data. Please see Manatee overview and troubleshooting manatee for more information on Manatee, how to check its health, and how to recover from its failure modes.

Troubleshooting Cloud Firewall (FWAPI)

Triton comes with Cloud Firewall, a service that manages firewalls within the installation for infrastructure containers running SmartOS. Please see Troubleshooting Cloud Firewall for more information on Cloud Firewall (FWAPI) and how to troubleshoot it.

Troubleshooting virtual machines / instances

Instance (or Virtual Machine) level troubleshooting will require access to the instance in question. This access can come in one of four ways:

Most of these troubleshooting processes will require root (or root-equivalent) access to the virtual machine / instance in question.

Checking instance disk usage

There are numerous tools that can be used to manage containers and hardware virtual machines in Triton. Please see instance disk usage for a discussion of those tools as well as some of the key differences between how disk space is allocated, used, and managed between the two types of instances.

Checking instance networking

The analysis of network utilization inside instances varies by instance type.

Instance Type   Tools
SmartOS         netstat(1m), netcat, nmap
Linux           netstat(8), netcat, nmap
FreeBSD         netstat(1), netcat, nmap
Windows         Performance Monitor

Triton provides tools to view and manage instance level networking in order to troubleshoot, diagnose, and solve problems at the instance level. For more information please see checking instance networking.

Checking instance CPU usage

The analysis of CPU utilization inside instances varies by instance type. The links in the table below provide additional information on the listed tools.

Instance Type   Tools
SmartOS         prstat(1m), kstat(1m)
Linux           top(1)
FreeBSD         top(1)
Windows         Performance Monitor

Checking instance memory usage

The analysis of memory utilization inside instances varies by instance type. The links in the table below provide additional information on the listed tools.

Instance Type   Tools
SmartOS         prstat(1m), kstat(1m)
Linux           top(1)
FreeBSD         top(1)
Windows         Performance Monitor

Clearing an instance stuck in “provisioning”

There are certain job failure modes which will leave an instance stuck in a provisioning state. In order to clear this, you will need to follow the procedure outlined in the document Clearing an instance stuck in “provisioning”.

Forcing a re-sync of VMAPI for an instance

There are certain rare job failure modes which can cause the output of vmadm(1m) on a compute node and the data contained in VMAPI to differ. It is possible to force VMAPI to synchronize with the actual state of the instance as reported by vmadm. To do this, you will need to follow the procedure outlined in the document Forcing a re-sync of VMAPI.

Advanced troubleshooting

The following section outlines more advanced troubleshooting procedures.

General provision troubleshooting

At times, the provisioning process will fail. When this happens, the cause for the failure should be investigated. Please see provision troubleshooting for more information.

Changing NTP and/or DNS servers post install

The DNS and NTP settings that are configured at installation time for Triton are intended to remain static. However, if your circumstances dictate that either or both of these must be changed, this can be achieved as described in the document entitled Changing global NTP and DNS settings post configuration.

Sending files to Joyent

In the case of severe or chronic problems, Joyent support may request that you submit a crash dump or support bundle. These files provide valuable information that is used by Joyent Support and Engineering to investigate the particular issue you are experiencing.

Generating a crash dump

Any time a system becomes nonfunctional and has to be rebooted, Joyent support requires that a crash dump be written. This can be done via the console, or by generating an NMI (non-maskable interrupt). Without a crash dump, diagnosing system hangs is effectively impossible.

Additionally, there may be other situations where you are asked to generate a crash dump in order to supply Joyent support with information to help debug the issue you are experiencing.

To generate a crash dump, please see generating a crash dump.

Sending support bundles to Joyent

Triton DataCenter includes a feature to create and optionally send a support bundle to Joyent, which helps Joyent Support troubleshoot problems with your installation. This bundle includes a number of logs and configuration files from your Triton DataCenter installation. To generate and send a support bundle to Joyent, please see sending support bundles to Joyent.

Additional resources