	Daemontools and runit

Tired of PID files, needing root access, and writing init scripts just
to have your UNIX apps start when your server boots? Want a simpler,
better alternative that will also restart them if they crash? If so,
this is an introduction to process supervision with runit/daemontools.


	Background

Classic init scripts, e.g. /etc/init.d/apache, are widely used for
starting processes at system boot time, when they are executed by init.
Sadly, init scripts are cumbersome and error-prone to write, they must
typically be edited and run as root, and the processes they launch do
not get restarted automatically if they crash.

In an alternative scheme called "process supervision", each important
process is looked after by a tiny supervising process, which deals with
starting and stopping the important process on request, and re-starting
it when it exits unexpectedly. Those supervising processes can in turn
be supervised by other supervising processes.

Dan Bernstein wrote the process supervision toolkit, "daemontools",
which is a set of small, reliable programs that cooperate in the
UNIX tradition to manage process supervision trees.

Runit is a more conveniently licensed and more actively maintained
reimplementation of daemontools, written by Gerrit Pape.

Here I’ll use runit, but the ideas are the same for other
daemontools-like projects (there are several).


	Service directories and scripts

In runit parlance a "service" is simply a directory containing a script
named "run".

There are just two key programs in runit. Firstly, runsv supervises the
process for an individual service. Service directories themselves sit
inside a containing directory, and the runsvdir program supervises that
directory, running one child runsv process for the service in each
subdirectory. A typical choice is to start an instance of runsvdir
which supervises services in subdirectories of /var/service/.
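As a minimal sketch of the layout described above, the following creates
a service directory with a run script. The service name "myapp", the SVDIR
variable and the sleep stand-in for a daemon are illustrative assumptions,
not part of the examples in this directory; a scratch directory is used so
that nothing under /var/service is touched.

```shell
#!/bin/sh
# Sketch only: "myapp" and SVDIR are hypothetical names.
# A scratch directory stands in for /var/service.
SVDIR="${SVDIR:-$(mktemp -d)}"

mkdir -p "$SVDIR/myapp"
cat > "$SVDIR/myapp/run" <<'EOF'
#!/bin/sh
# Send stderr to the same place as stdout.
exec 2>&1
# exec replaces the shell, so runsv supervises the daemon itself.
# The program must stay in the foreground (no self-daemonizing).
exec sleep 3600   # stand-in for a real foreground daemon
EOF
chmod +x "$SVDIR/myapp/run"

echo "created $SVDIR/myapp/run"
```

Running runsvdir on "$SVDIR" would then start one runsv for myapp.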

If SERVICE_DIR/log/ exists, runsv will supervise two services,
and will connect the stdout of the main service to the stdin of the log
service. This is primarily used for logging.
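A log service is itself just a run script in the log/ subdirectory.
The sketch below adds one to a hypothetical service "myapp", using svlogd
in the same way as the examples in this directory; SVDIR, LOGDIR and the
scratch paths are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: "myapp", SVDIR and LOGDIR are hypothetical.
SVDIR="${SVDIR:-$(mktemp -d)}"
# A real setup might use /var/log/service/myapp here.
LOGDIR="${LOGDIR:-$(mktemp -d)}"

mkdir -p "$SVDIR/myapp/log" "$LOGDIR"
# Unquoted EOF: $LOGDIR is expanded now and written into the script.
cat > "$SVDIR/myapp/log/run" <<EOF
#!/bin/sh
# svlogd reads lines on stdin (the main service's stdout) and writes
# them to rotated files in the directory; -tt adds readable timestamps.
exec svlogd -tt "$LOGDIR"
EOF
chmod +x "$SVDIR/myapp/log/run"
```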

You can debug an individual service by running its SERVICE_DIR/run script.
In this case, its stdout and stderr go to your terminal.

You can also run "runsv SERVICE_DIR", which runs both the service
and its logger service (SERVICE_DIR/log/run) if a logger service exists.
If a logger service exists, the output will go to it instead of the terminal.

"runsvdir /var/service" merely runs "runsv SERVICE_DIR" for every subdirectory
in /var/service.


	Examples

This directory contains some examples of services:

    var_service/getty_<tty>

Runs a getty on <tty>. (The run script looks at $PWD and extracts the suffix
after "_" as the tty name.) Create copies (or symlinks) of this directory
with different names to run many gettys on many ttys.
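The suffix extraction can be done with plain shell parameter expansion.
The following is a sketch of the idea, not necessarily the exact code of
the run script; the function name suffix_of is made up for illustration.

```shell
#!/bin/sh
# suffix_of PATH: print the part of the last path component after the
# first "_". (Hypothetical helper, for illustration.)
suffix_of() {
    name=${1##*/}               # /var/service/getty_tty1 -> getty_tty1
    printf '%s\n' "${name#*_}"  # getty_tty1 -> tty1
}

# Inside a run script, the service directory is the current directory:
tty=$(suffix_of "$PWD")
echo "would run getty on: $tty"
```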

    var_service/gpm

Runs gpm, the cut and paste utility and mouse server for text consoles.

    var_service/inetd

Runs inetd. This is an example of a service with a log. The log service
writes timestamped, rotated log data to /var/log/service/inetd/*
using "svlogd -tt". The p_log and w_log scripts demonstrate how you can
"page log" and "watch log".
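The idea behind "page log" and "watch log" can be sketched as follows.
This is an illustration rather than the contents of p_log and w_log; it
assumes svlogd's usual layout, where the active file is named "current"
and rotated files have names starting with "@". The function names are
made up.

```shell
#!/bin/sh
# page_log DIR: show rotated logs (oldest first), then the current one.
page_log() {
    (
        cd "$1" || exit 1
        # Rotated files are named @<timestamp>...; the glob expands in
        # sorted order, which puts the oldest file first.
        for f in @* current; do
            if [ -f "$f" ]; then cat "$f"; fi
        done
    ) | ${PAGER:-less}
}

# watch_log DIR: follow new lines as svlogd appends them.
watch_log() {
    tail -f "$1/current"
}
```

For example, "page_log /var/log/service/inetd" would page through the
whole inetd log history.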

Other services which have logs handle them in the same way.

    var_service/nmeter

Runs nmeter '%t %c ....' with output to /dev/tty9. This gives you
a 1-second sampling of server load and health on a dedicated text console.


	Networking examples

In many cases, network configuration makes it necessary to run several daemons:
dhcp, zeroconf, ppp, openvpn and such. They need to be controlled,
and in many cases you also want to babysit them.

They present a case where different services need to control (start, stop,
restart) each other.

    var_service/dhcp_if

Controls a udhcpc instance which provides a DHCP-assigned IP address
on the interface named "if". Copy/rename this directory as needed to run
udhcpc on other interfaces (the var_service/dhcp_if/run script uses the _foo
suffix of its containing directory as the interface name).

When an IP address is obtained or lost, var_service/dhcp_if/dhcp_handler is run.
It saves the new config data to /var/run/service/fw/dhcp_if.ipconf and (re)starts
the /var/service/fw service. This example can be used as a template for other
dynamic network link services (ppp/vpn/zcip).
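A handler of this kind might look like the sketch below. It relies on
udhcpc passing the event name as the first argument and the lease details
in environment variables such as $interface, $ip, $subnet, $router and $dns.
The key=value file format and the FW_STATE_DIR and SV override variables are
assumptions made for illustration; the real dhcp_handler may differ.

```shell
#!/bin/sh
# Sketch of a udhcpc event handler. The key=value file format and the
# FW_STATE_DIR / SV override variables are assumptions for illustration.
handle_dhcp_event() {
    dir=${FW_STATE_DIR:-/var/run/service/fw}
    conf=$dir/dhcp_$interface.ipconf
    case "$1" in
    bound|renew)
        # Lease obtained or refreshed: save it for fw to pick up.
        mkdir -p "$dir" || return 1
        {
            echo "if=$interface"
            echo "ip=$ip"
            echo "subnet=$subnet"
            echo "router=$router"
            echo "dns=$dns"
        } > "$conf.tmp" && mv "$conf.tmp" "$conf"
        ;;
    deconfig)
        # Lease lost or interface being reset: forget the old data.
        rm -f "$conf"
        ;;
    *)
        return 0   # other events: nothing to do
        ;;
    esac
    # Ask runsv to (re)run the fw service with the new state.
    ${SV:-sv} u /var/service/fw
}

# udhcpc passes the event name as the first argument.
handle_dhcp_event "$@"
```

Writing to a temporary file and renaming it means fw never reads
a half-written config.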

This is an example of a service which has a "finish" script. When the service
is downed ("sv d"), "finish" is executed. For this service, it removes the DHCP
address from the interface. This is useful when ifplugd detects that the link
is dead (cable is no longer attached anywhere) and downs us: keeping
DHCP-configured addresses on the interface would make the kernel still try
to use it.

    var_service/zcip_if

Zeroconf IP service: assigns a 169.254.x.y/16 address to interface "if".
This lets you talk to other devices on a network without a DHCP server
(if they also assign 169.254 addresses to themselves).

    var_service/ifplugd_if

Watches the link status of interface "if". Downs and ups the /var/service/dhcp_if
service accordingly. In effect, it allows you to unplug/plug-to-different-network
and have your IP properly re-negotiated at once.
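A link handler for this can be very small. The sketch below assumes the
handler is called with the interface name and the new state ("up" or "down")
as its two arguments, as busybox ifplugd does for the program given with -r.
The function name and the SV override variable are illustrative.

```shell
#!/bin/sh
# handle_link IFACE STATE: bring the matching dhcp service up or down.
handle_link() {
    case "$2" in
    up)   ${SV:-sv} u "/var/service/dhcp_$1" ;;
    down) ${SV:-sv} d "/var/service/dhcp_$1" ;;
    esac
}

handle_link "$@"
```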

    var_service/dhcp_if_pinger

Uses var_service/dhcp_if's data to determine the router IP, and pings it.
If the ping fails, restarts the /var/service/dhcp_if service.
Basically, an example of a watchdog service for networks which are not reliable
and need babysitting.
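One pass of such a watchdog might look like the sketch below. The location
of the data file, its "router=" line format, the ping options and the
FW_STATE_DIR, PING and SV override variables are all assumptions; the real
service may obtain the router address differently.

```shell
#!/bin/sh
# check_router IFACE: ping the router recorded for IFACE; if it does
# not answer, ask runsv to restart the dhcp service for IFACE.
check_router() {
    conf=${FW_STATE_DIR:-/var/run/service/fw}/dhcp_$1.ipconf
    # Take the value of the first "router=" line, if any.
    router=$(sed -n 's/^router=//p' "$conf" 2>/dev/null | head -n 1)
    [ -n "$router" ] || return 0      # no lease yet: nothing to check
    if ! ${PING:-ping} -c 3 "$router" >/dev/null 2>&1; then
        # "sv t" sends TERM; runsv then restarts the service.
        ${SV:-sv} t "/var/service/dhcp_$1"
    fi
}
```

A run script could call check_router once, sleep for a while and exit,
leaving it to runsv to start the next pass.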

    var_service/supplicant_if

Wireless supplicant (wifi association and encryption daemon) service for
interface "if".

    var_service/fw

"Firewall" script, although it is tasked with much more than setting up
a firewall. It is responsible for all aspects of network configuration.

This is an example of a *one-shot* service.

It reconfigures the network based on the current known state of ALL interfaces.
It uses conf/*.ipconf (static config) and /var/run/service/fw/*.ipconf
(dynamic config from dhcp/ppp/vpn/etc) to determine what to do.

One-shot-ness of this service means that it shuts itself off after a single run.
IOW: it is not a constantly running daemon sort of thing.
It starts, it configures the network, it shuts down, all done
(unlike infamous NetworkManagers which sit in RAM forever).

However, any dhcp/ppp/vpn or similar service can restart it anytime
when it senses a change in network configuration.
This even works while the fw service runs: if dhcp signals fw to (re)start
while fw runs, fw will not stop after its execution, but will re-execute once,
picking up dhcp's new configuration.
This is achieved very simply by having
	# Make ourself one-shot
	sv o .
at the very beginning of the fw/run script, not at the end.

Therefore, any "sv u fw" command by any other script "undoes" the o(ne-shot)
command if fw still runs, thus runsv will rerun it; or starts it
in the normal way if fw is not running.
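Since the ordering matters, here is a skeleton of a one-shot run script.
The configuration step is reduced to a placeholder that only lists the
files it would read; the function name fw_run and the SV override variable
are made up for illustration.

```shell
#!/bin/sh
# Skeleton of a one-shot run script (sketch; a real fw/run does much more).
fw_run() {
    # Make ourself one-shot, FIRST: a later "sv u" from another service
    # cancels this, so runsv will run us again after we exit.
    ${SV:-sv} o .

    # Placeholder for the real work: visit every static and dynamic
    # config file. Unmatched globs stay literal, hence the -f test.
    for f in conf/*.ipconf /var/run/service/fw/*.ipconf; do
        if [ -f "$f" ]; then echo "would apply $f"; fi
    done
}

# A real run script would end by calling: fw_run
```

Had "sv o ." been placed at the end instead, it would cancel any "sv u fw"
that arrived during the run, and the new configuration would be missed.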

This mechanism is the reason why fw is a service, not just a script.

System administrators are expected to edit the fw/run script, since
network configuration needs are likely to be very complex and different
for non-trivial installations.

    var_service/ftpd
    var_service/httpd
    var_service/tftpd
    var_service/ntpd

Examples of typical network daemons.


	Process tree

Here is an example of the process tree from a live system with these services
(and a few others). An interesting detail is the ftpd and vpnc services, where
you can see only the logger process. These services are "downed" at the moment:
their daemons are not launched.

PID TIME COMMAND
553 0:04 runsvdir -P /var/service
561 0:00   runsv sshd
576 0:00     svlogd -tt /var/log/service/sshd
589 0:00     /usr/sbin/sshd -D -e -p22 -u0 -h /var/service/sshd/ssh_host_rsa_key
562 0:00   runsv dhcp_eth0
568 0:00     svlogd -tt /var/log/service/dhcp_eth0
850 0:00     udhcpc -vv --foreground --interface=eth0
                --pidfile=/var/service/dhcp_eth0/udhcpc.pid
                --script=/var/service/dhcp_eth0/dhcp_handler
                -x hostname bbox
563 0:00   runsv ntpd
573 0:01     svlogd -tt /var/log/service/ntpd
845 0:00     busybox ntpd -dddnNl -S ./ntp.script -p 10.x.x.x -p 10.x.x.x
564 0:00   runsv ifplugd_wlan0
598 0:00     svlogd -tt /var/log/service/ifplugd_wlan0
614 0:05     ifplugd -apqns -t3 -u0 -d0 -i wlan0
                -r /var/service/ifplugd_wlan0/ifplugd_handler
565 0:08   runsv dhcp_wlan0_pinger
911 0:00     sleep 67
566 0:00   runsv unscd
583 0:03     svlogd -tt /var/log/service/unscd
599 0:02     nscd -dddd
567 0:00   runsv dhcp_wlan0
591 0:00     svlogd -tt /var/log/service/dhcp_wlan0
802 0:00     udhcpc -vv -C -o -V  --foreground --interface=wlan0
                --pidfile=/var/service/dhcp_wlan0/udhcpc.pid
                --script=/var/service/dhcp_wlan0/dhcp_handler
569 0:00   runsv fw
570 0:00   runsv ifplugd_eth0
597 0:00     svlogd -tt /var/log/service/ifplugd_eth0
612 0:05     ifplugd -apqns -t3 -u8 -d8 -i eth0
                -r /var/service/ifplugd_eth0/ifplugd_handler
571 0:00   runsv zcip_eth0
590 0:00     svlogd -tt /var/log/service/zcip_eth0
607 0:01     zcip -fvv eth0 /var/service/zcip_eth0/zcip_handler
572 0:00   runsv ftpd
604 0:00     svlogd -tt /var/log/service/ftpd
574 0:00   runsv vpnc
603 0:00     svlogd -tt /var/log/service/vpnc
575 0:00   runsv httpd
602 0:00     svlogd -tt /var/log/service/httpd
622 0:00     busybox httpd -p80 -vvv -f -h /home/httpd_root
577 0:00   runsv supplicant_wlan0
627 0:00     svlogd -tt /var/log/service/supplicant_wlan0
638 0:03     wpa_supplicant -i wlan0
                -c /var/service/supplicant_wlan0/wpa_supplicant.conf -d