        Daemontools and runit

Tired of PID files, needing root access, and writing init scripts just
to have your UNIX apps start when your server boots? Want a simpler,
better alternative that will also restart them if they crash? If so,
this is an introduction to process supervision with runit/daemontools.


        Background

Classic init scripts, e.g. /etc/init.d/apache, are widely used for
starting processes at system boot time, when they are executed by init.
Sadly, init scripts are cumbersome and error-prone to write, they must
typically be edited and run as root, and the processes they launch do
not get restarted automatically if they crash.

In an alternative scheme called "process supervision", each important
process is looked after by a tiny supervising process, which deals with
starting and stopping the important process on request, and re-starting
it when it exits unexpectedly. Those supervising processes can in turn
be supervised by other supervising processes.

Dan Bernstein wrote the process supervision toolkit, "daemontools",
which is a set of small, reliable programs that cooperate in the
UNIX tradition to manage process supervision trees.

Runit is a more conveniently licensed and more actively maintained
reimplementation of daemontools, written by Gerrit Pape.

Here I’ll use runit; however, the ideas are the same for other
daemontools-like projects (there are several).


        Service directories and scripts

In runit parlance a "service" is simply a directory containing a script
named "run".

There are just two key programs in runit. Firstly, runsv supervises the
process for an individual service. Service directories themselves sit
inside a containing directory, and the runsvdir program supervises that
directory, running one child runsv process for the service in each
subdirectory.
A typical choice is to start an instance of runsvdir
which supervises services in subdirectories of /var/service/.

If SERVICE_DIR/log/ exists, runsv will supervise two services (the main
service and its log service), and will connect stdout of the main service
to the stdin of the log service. This is primarily used for logging.

You can debug an individual service by running its SERVICE_DIR/run script.
In this case, its stdout and stderr go to your terminal.

You can also run "runsv SERVICE_DIR", which runs both the service
and its logger service (SERVICE_DIR/log/run) if the logger service exists.
If the logger service exists, the output will go to it instead of the terminal.

"runsvdir /var/service" merely runs "runsv SERVICE_DIR" for every subdirectory
in /var/service.


        Examples

This directory contains some examples of services:

    var_service/getty_<tty>

Runs a getty on <tty>. (The run script looks at $PWD and extracts the suffix
after "_" as the tty name.) Create copies (or symlinks) of this directory
with different names to run many gettys on many ttys.

    var_service/gpm

Runs gpm, the cut and paste utility and mouse server for text consoles.

    var_service/inetd

Runs inetd. This is an example of a service with a log. The log service
writes timestamped, rotated log data to /var/log/service/inetd/*
using "svlogd -tt". The p_log and w_log scripts demonstrate how you can
"page log" and "watch log".

Other services which have logs handle them in the same way.

    var_service/nmeter

Runs nmeter '%t %c ....' with output to /dev/tty9. This gives you
a 1-second sampling of server load and health on a dedicated text console.


        Networking examples

In many cases, network configuration makes it necessary to run several daemons:
dhcp, zeroconf, ppp, openvpn and such. They need to be controlled,
and in many cases you also want to babysit them.

They present a case where different services need to control (start, stop,
restart) each other.

    var_service/dhcp_if

Controls a udhcpc instance which provides a DHCP-assigned IP
address on the interface named "if". Copy/rename this directory as needed to run
udhcpc on other interfaces (var_service/dhcp_if/run script uses the _foo suffix
of the parent directory as the interface name).

When an IP address is obtained or lost, var_service/dhcp_if/dhcp_handler is run.
It saves new config data to /var/run/service/fw/dhcp_if.ipconf and (re)starts
the /var/service/fw service. This example can be used as a template for other
dynamic network link services (ppp/vpn/zcip).

This is an example of a service which has a "finish" script. If downed ("sv d"),
"finish" is executed. For this service, it removes the DHCP address from
the interface. This is useful when ifplugd detects that the link is dead
(cable is no longer attached anywhere) and downs us: keeping DHCP-configured
addresses on the interface would make the kernel still try to use it.

    var_service/zcip_if

Zeroconf IP service: assigns a 169.254.x.y/16 address to interface "if".
This allows you to talk to other devices on a network without a DHCP server
(if they also assign 169.254 addresses to themselves).

    var_service/ifplugd_if

Watches link status of interface "if". Downs and ups the /var/service/dhcp_if
service accordingly. In effect, it allows you to unplug/plug-to-different-network
and have your IP properly re-negotiated at once.

    var_service/dhcp_if_pinger

Uses var_service/dhcp_if's data to determine the router IP. Pings it.
If ping fails, restarts the /var/service/dhcp_if service.
Basically, an example of a watchdog service for networks which are not reliable
and need babysitting.

    var_service/supplicant_if

Wireless supplicant (wifi association and encryption daemon) service for
interface "if".
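
The "_if" naming convention used by the services above can be sketched in
plain POSIX shell. This is only an illustration, not the actual run scripts:
iface_from_dir is a hypothetical helper name, and it assumes the interface is
everything after the first "_" in the directory name (so it would not suit
a name like dhcp_if_pinger without adjustment).

```shell
#!/bin/sh
# Sketch: derive the interface name from a service directory name.
# iface_from_dir is a hypothetical helper, not part of runit or busybox.
iface_from_dir() {
    dir=${1%/}          # drop a trailing slash, if any
    name=${dir##*/}     # last path component, e.g. dhcp_eth0
    echo "${name#*_}"   # text after the first "_", e.g. eth0
}

iface_from_dir /var/service/dhcp_eth0        # prints eth0
iface_from_dir /var/service/ifplugd_wlan0    # prints wlan0
```

Inside a run script, the same helper could be called as
iface_from_dir "$PWD", since runsv starts the script in the service directory.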

    var_service/fw

"Firewall" script, although it is tasked with much more than setting up a firewall.
It is responsible for all aspects of network configuration.

This is an example of a *one-shot* service.

It reconfigures the network based on the current known state of ALL interfaces.
Uses conf/*.ipconf (static config) and /var/run/service/fw/*.ipconf
(dynamic config from dhcp/ppp/vpn/etc) to determine what to do.

One-shot-ness of this service means that it shuts itself off after a single run.
IOW: it is not a constantly running daemon sort of thing.
It starts, it configures the network, it shuts down, all done
(unlike infamous NetworkManagers which sit in RAM forever).

However, any dhcp/ppp/vpn or similar service can restart it anytime
when it senses a change in network configuration.
This even works while the fw service runs: if dhcp signals fw to (re)start
while fw runs, fw will not stop after its execution, but will re-execute once,
picking up dhcp's new configuration.
This is achieved very simply by having
	# Make ourself one-shot
	sv o .
at the very beginning of the fw/run script, not at the end.

Therefore, any "sv u fw" command by any other script "undoes" the o(ne-shot)
command if fw still runs, thus runsv will rerun it; or starts it
in the normal way if fw is not running.

This mechanism is the reason why fw is a service, not just a script.

System administrators are expected to edit the fw/run script, since
network configuration needs are likely to be very complex and different
for non-trivial installations.

    var_service/ftpd
    var_service/httpd
    var_service/tftpd
    var_service/ntpd

Examples of typical network daemons.


        Process tree

Here is an example of the process tree from a live system with these services
(and a few others). An interesting detail is the ftpd and vpnc services, where
you can see only the logger process.
These services are "downed" at the moment:
their daemons are not launched.

PID TIME COMMAND
553 0:04 runsvdir -P /var/service
561 0:00   runsv sshd
576 0:00     svlogd -tt /var/log/service/sshd
589 0:00     /usr/sbin/sshd -D -e -p22 -u0 -h /var/service/sshd/ssh_host_rsa_key
562 0:00   runsv dhcp_eth0
568 0:00     svlogd -tt /var/log/service/dhcp_eth0
850 0:00     udhcpc -vv --foreground --interface=eth0
                --pidfile=/var/service/dhcp_eth0/udhcpc.pid
                --script=/var/service/dhcp_eth0/dhcp_handler
                -x hostname bbox
563 0:00   runsv ntpd
573 0:01     svlogd -tt /var/log/service/ntpd
845 0:00     busybox ntpd -dddnNl -S ./ntp.script -p 10.x.x.x -p 10.x.x.x
564 0:00   runsv ifplugd_wlan0
598 0:00     svlogd -tt /var/log/service/ifplugd_wlan0
614 0:05     ifplugd -apqns -t3 -u0 -d0 -i wlan0
                -r /var/service/ifplugd_wlan0/ifplugd_handler
565 0:08   runsv dhcp_wlan0_pinger
911 0:00     sleep 67
566 0:00   runsv unscd
583 0:03     svlogd -tt /var/log/service/unscd
599 0:02     nscd -dddd
567 0:00   runsv dhcp_wlan0
591 0:00     svlogd -tt /var/log/service/dhcp_wlan0
802 0:00     udhcpc -vv -C -o -V --foreground --interface=wlan0
                --pidfile=/var/service/dhcp_wlan0/udhcpc.pid
                --script=/var/service/dhcp_wlan0/dhcp_handler
569 0:00   runsv fw
570 0:00   runsv ifplugd_eth0
597 0:00     svlogd -tt /var/log/service/ifplugd_eth0
612 0:05     ifplugd -apqns -t3 -u8 -d8 -i eth0
                -r /var/service/ifplugd_eth0/ifplugd_handler
571 0:00   runsv zcip_eth0
590 0:00     svlogd -tt /var/log/service/zcip_eth0
607 0:01     zcip -fvv eth0 /var/service/zcip_eth0/zcip_handler
572 0:00   runsv ftpd
604 0:00     svlogd -tt /var/log/service/ftpd
574 0:00   runsv vpnc
603 0:00     svlogd -tt /var/log/service/vpnc
575 0:00   runsv httpd
602 0:00     svlogd -tt /var/log/service/httpd
622 0:00     busybox httpd -p80 -vvv -f -h /home/httpd_root
577 0:00   runsv supplicant_wlan0
627 0:00     svlogd -tt /var/log/service/supplicant_wlan0
638 0:03     wpa_supplicant -i wlan0
                -c /var/service/supplicant_wlan0/wpa_supplicant.conf -d
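
Each "runsv NAME" branch in the tree above corresponds to a service directory
holding a run script and, for logged services, a log/run script. The following
is a minimal sketch of creating such a directory, under stated assumptions:
make_service is a hypothetical helper, "mydaemon" is a placeholder service
name, and /usr/sbin/mydaemon with its --foreground option is an invented
daemon, not a real program. The sketch writes into a scratch directory rather
than /var/service, so nothing gets supervised by accident.

```shell
#!/bin/sh
# Sketch: create a service directory with a logger, laid out the way
# runsv expects (SERVICE_DIR/run and SERVICE_DIR/log/run).
# make_service, "mydaemon" and /usr/sbin/mydaemon are hypothetical.
make_service() {
    svdir=$1                # e.g. /var/service/mydaemon
    name=${svdir##*/}       # service name is the directory name
    mkdir -p "$svdir/log" || return 1

    # Main service: stay in the foreground so runsv can supervise it;
    # stderr is merged into stdout so both reach the logger.
    cat >"$svdir/run" <<EOF
#!/bin/sh
exec 2>&1
exec /usr/sbin/$name --foreground
EOF

    # Log service: reads the main service's stdout on its stdin.
    cat >"$svdir/log/run" <<EOF
#!/bin/sh
mkdir -p /var/log/service/$name
exec svlogd -tt /var/log/service/$name
EOF

    chmod +x "$svdir/run" "$svdir/log/run"
}

root=$(mktemp -d)
make_service "$root/mydaemon"
```

To try the result by hand, one could run "runsv $root/mydaemon" after pointing
the run script at a real foreground program.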