<HTML>
<HEAD>
<link rel="SHORTCUT ICON" href="http://www.cons.org/favicon.ico">
<TITLE>Proper handling of SIGINT/SIGQUIT [http://www.cons.org/cracauer/sigint.html]</TITLE>
<!-- Created by: GNU m4 using $Revision: 1.20 $ of crawww.m4lib on 11-Feb-2005 -->
<BODY BGCOLOR="#fff8e1">
<CENTER><H2>Proper handling of SIGINT/SIGQUIT</H2></CENTER>
<img src=linie.png width="100%" alt=" ">
<P>

<table border=1 cellpadding=4>
<tr><th valign=top align=left>Abstract: </th>
<td valign=top align=left>
In UNIX terminal sessions, you usually have a key like
<code>C-c</code> (Control-C) to immediately end whatever program you
have running in the foreground. This should work even when the program
you called has called other programs in turn. Everything should be
aborted, giving you your command prompt back, no matter how deep the
call stack is.

<p>Basically, it's trivial. But the existence of interactive
applications that use SIGINT and/or SIGQUIT for purposes other than a
complete immediate abort makes matters complicated, and - as was to
be expected - has left us with several ways to solve the problem. Of
course, existing shells and applications follow different ways.

<P>This web page outlines different ways to solve the problem and
argues that only one of them can do everything right, although it
means that we have to fix some existing software.

</td></tr><tr><th valign=top align=left>Intended audience: </th>
<td valign=top align=left>Programmers who implement programs that catch SIGINT/SIGQUIT.
<BR>Programmers who implement shells or shell-like programs that
execute batches of programs.

<p>Users who have problems getting rid of runaway shell
scripts using <code>Control-C</code>, or who have interactive
applications that don't behave right when SIGINT is sent.
Examples are emacs sessions
that die on Control-g, or shellscript statements that are sometimes
executed and sometimes not, apparently independent of the user's
intention.

</td></tr><tr><th valign=top align=left>Required knowledge: </th>
<td valign=top align=left>You have to know what it means to catch SIGINT or SIGQUIT and how
processes wait for other processes (children) they spawned.

</td></tr></table>
<img src=linie.png width="100%" alt=" ">

<H3>Basic concepts</H3>

What technically happens when you press Control-C is that all programs
running in the foreground in your current terminal (or virtual
terminal) are sent the signal SIGINT.

<p>You may change the key that triggers the signal using
<code>stty</code>, and running programs may remap the SIGINT-sending
key at any time they like, without your intervention and without
asking you first.

<p>The usual reaction of a running program to SIGINT is to exit.
However, not all programs exit on SIGINT; programs are free to
use the signal for other actions or to ignore it altogether.

<p>All programs running in the foreground receive the signal. This may
be a nested "stack" of programs: You started a program that started
another and the outer is waiting for the inner to exit. This nesting
may be arbitrarily deep.

<p>The innermost program is the one that decides what to do on SIGINT.
It may exit, do something else or do nothing. Still, when the user
sends SIGINT, all the outer programs are woken up, get the signal and
may react to it.

<H3>What we try to achieve</H3>

The problem is with shell scripts (or similar programs that call
several subprograms one after another).

<p>Let us consider the most basic script:
<PRE>
#! /bin/sh
program1
program2
</PRE>
and the usual run looks like this:
<PRE>
$ sh myscript
[output of program1]
[output of program2]
$
</PRE>

<p>Let us assume that both programs do nothing special on SIGINT, they
just exit.

<p>Now imagine the user hits C-c while a shellscript is executing its
first program. The following programs receive SIGINT: program1 and
also the shell executing the script. program1 exits.

<p>But what should the shell do? If we say that it is only the
innermost program's business to react to SIGINT, the shell will do
nothing special (not exit) and it will continue the execution of the
script and run program2. But this is wrong: The user's intention in
hitting C-c is to abort the whole script, to get his prompt back. If
he hits C-c while the first program is running, he does not want
program2 to be even started.

<p>Here is what would happen if the shell didn't do anything:
<PRE>
$ sh myscript
[first half of program1's output]
C-c   [user presses C-c]
[second half of program1's output will not be displayed]
[output of program2 will appear]
</PRE>

<p>Consider a more annoying example:
<pre>
#! /bin/sh
# let's assume there are 300 *.dat files
for file in *.dat ; do
	dat2ascii "$file"
done
</pre>

If your shell didn't exit when the user hits <code>C-c</code>,
<code>C-c</code> would just end <strong>one</strong> dat2ascii run and
the script would continue. Thus, you would have to hit <code>C-c</code>
up to 300 times to end this script.

<H3>Alternatives to do so</H3>

<p>There are several ways to handle abortion of shell scripts when
SIGINT is received while a foreground child runs:

<menu>

<li>As just outlined, the shellscript may just continue, ignoring the
fact that the user hit <code>C-c</code>.
That way, your shellscript -
including any loops - would continue and you would have no chance of
aborting it except by using the kill command after finding out the
outermost shell's PID. This "solution" will not be discussed further,
as it is obviously not desirable.

<p><li>The shell itself exits immediately when it receives SIGINT. Not
only the called program will exit, but also the calling (the
script-executing) shell. In this variant the shell exits (and
therefore discontinues execution of the script) immediately, while
the called program may still be executing (remember that although
the shell is just waiting for the called program to exit, it is woken
up and may act). I will call this way of doing things "IUE" (for
"immediate unconditional exit") for the rest of this document.

<p><li>As a variant of the former, when the shell receives SIGINT
while it is waiting for a child to exit, the shell does not exit
immediately, but it remembers the fact that a SIGINT happened. After
the called program exits and the shell's wait ends, the shell will
exit itself and hence discontinue the script. I will call this way of
doing things "WUE" (for "wait and unconditional exit") for the
rest of this document.

<p><li>There is also a way for the calling shell to tell whether the
called program exited on SIGINT or whether it ignored SIGINT (or used
it for other purposes). As in the <sl>WUE</sl> way, the shell waits for
the child to complete. It then figures out whether the program was
ended by SIGINT and if so, it discontinues the script. If the program
exited in any other way, the script will be continued. I will call this
way of doing things "WCE" (for "wait and cooperative exit") for the
rest of this document.

</menu>

<H3>The problem</H3>

At first sight, all three solutions (IUE, WUE and WCE) seem to do
what we want: If C-c is hit while the first program of the shell
script runs, the script is discontinued. The user gets his prompt back
immediately. So what are the differences between these ways of handling
SIGINT?

<p>There are programs that use the signal SIGINT for purposes other
than exiting. They use it as a normal keystroke. The user is expected
to use the key that sends SIGINT during a perfectly normal program
run. As a result, the user sends SIGINT in situations where he/she
does not want the program or the script to end.

<p>The primary example is the emacs editor: C-g does what ESC does in
other applications: It cancels a partially executed or prepared
operation. Technically, emacs remaps the key that sends SIGINT from
C-c to C-g and catches SIGINT.

<p>Remember that the SIGINT is sent to all programs running in the
foreground. If emacs is executing from a shell script, both emacs and
the shell get SIGINT. emacs is the program that decides what to do:
Exit on SIGINT or not. emacs decides not to exit. The problem arises
when the shell draws its own conclusions from receiving SIGINT without
consulting emacs for its opinion.

<p>Consider this script:
<PRE>
#! /bin/sh
emacs /tmp/foo
cp /tmp/foo /home/user/mail/sent
</PRE>

<p>If C-g is used in emacs, both the shell and emacs will receive
SIGINT. Emacs will not exit: the user used C-g as a normal editing
keystroke, and he/she does not want the script to be aborted on C-g.

<p>The central problem is that the second command (cp) may
unintentionally be skipped when the shell draws its own conclusion
about the user's intention. The innermost program is the only one to
judge.

<H3>One more example</H3>

<p>Imagine a mail session using a curses mailer in a tty.
You called
your mailer and started to compose a message. Your mailer calls emacs.
<code>C-g</code> is a normal editing key in emacs. Technically it
sends SIGINT (it was <code>C-c</code>, but emacs remapped the key) to
<menu>
<li>emacs
<li>the shell between your mailer and emacs, the one from your mailer's
    system("emacs /tmp/bla.44") command
<li>the mailer itself
<li>possibly another shell if your mailer was called by a shell script
or from another application using system(3)
<li>your interactive shell (which ignores it since it is interactive
and hence is not relevant to this discussion)
</menu>

<p>If everyone just exits on SIGINT, you will be left with nothing but
your login shell, without being asked.

<p>But you certainly don't want to be dropped out of your editor and
out of your mailer back to the command line, with your edited data
and mailer status deleted.

<p>Understand the difference: While <code>C-g</code> is used as a kind
of abort key in emacs, it isn't the major "abort everything" key. When
you use <code>C-g</code> in emacs, you want to end some internal emacs
command. You don't want your whole emacs and mailer session to end.

<p>So, if the shell exits immediately when the user sends SIGINT (the
second of the four ways shown above), the parent of emacs would die,
leaving emacs without the controlling tty. The user will lose his/her
editing session immediately and unrecoverably. If the "main" shell of
the operating system defaults to this behavior, every editor session
that is spawned from a mailer or such will break (because it is
usually executed by system(3), which calls /bin/sh). This was the case
in FreeBSD before Bruce Evans and I changed it in 1998.

<p>If the shell recognizes that SIGINT was sent and exits after the
current foreground process has exited (the third way of the four), the
editor session will not be disturbed, but things will still not work
right.

<H3>A further look at the alternatives</H3>

<p>We still consider this script to examine the shell's actions in the
IUE, WUE and WCE ways of handling SIGINT:
<PRE>
#! /bin/sh
emacs /tmp/foo
cp /tmp/foo /home/user/mail/sent
</PRE>

<p>The IUE ("immediate unconditional exit") way does not work at all:
emacs wants to survive the SIGINT (it's a normal editing key for
emacs), but its parent shell unconditionally thinks "We received
SIGINT. Abort everything. Now.". The shell will exit even before emacs
exits. But this will leave emacs in an unusable state, since the death
of its calling shell will leave it without required resources (file
descriptors). This way does not work at all for shellscripts that call
programs that use SIGINT for purposes other than immediate exit. Even
programs that exit on SIGINT, but want to do some cleanup between
the signal and the exit, may fail before they complete their cleanup.

<p>It should be noted that this way has one advantage: If a child
blocks SIGINT and does not exit at all, this way will give control back
to the user's terminal. Since such programs should be banned from your
system anyway, I don't think that outweighs the disadvantages.

<p>WUE ("wait and unconditional exit") is a little more clever: If C-g
was used in emacs, the shell will get SIGINT. It will not immediately
exit, but remember the fact that a SIGINT happened. When emacs ends
(maybe a long time after the SIGINT), it will say "Ok, a SIGINT
happened sometime while the child was executing, the user wants the
script to be discontinued". It will then exit. The cp will not be
executed. But that's bad.
The "cp" will be executed when the emacs
session ended without the C-g key ever being used, but it will not be
executed when the user used C-g at least once. That is clearly not
desired. Since C-g is a normal editing key in emacs, the user expects
the rest of the script to behave identically no matter what keys he
used.

<p>As a result, the "WUE" way is better than the "IUE" way in that it
does not break SIGINT-using programs completely. The emacs session
will end undisturbed. But it still does not support scripts where
other actions should be performed after a program that uses SIGINT for
non-exit purposes. Since the behavior is basically unpredictable for
the user, this can lead to nasty surprises.

<p>The "WCE" way fixes this by "asking" the called program whether it
exited on SIGINT or not. While emacs receives SIGINT, it does not exit
on it and a calling shell waiting for its exit will not be told that
it exited on SIGINT. (Although it receives SIGINT at some point in
time, the system does not enforce that emacs will exit with
"I-exited-on-SIGINT" status. This is under emacs' control, see below).

<p>This still works for the normal script without SIGINT-using
programs:</p>
<PRE>
#! /bin/sh
program1
program2
</PRE>

Unless program1 and program2 mess around with signal handling, the
system will tell the calling shell whether the programs exited
normally or as a result of SIGINT.

<p>The "WCE" way then has an easy way to do things right: When a called
program exits with "I-exited-on-SIGINT" status, the shell will
discontinue the script after this program. If the program ends without
this status, the next command in the script is started.

<p>It is important to understand that a shell in "WCE" mode does not
need to listen to the SIGINT signal at all.
Both in the
"emacs-then-cp" script and in the "several-normal-programs" script, it
will be woken up and receive SIGINT when the user hits the
corresponding key. But the shell does not need to react to this event
and it doesn't need to remember the event of any SIGINT, either.
Telling whether the user wants to end a script is done by asking the
program that has to decide, the program that interprets keystrokes
from the user: the innermost program.

<H3>So everything is well with WCE?</H3>

Well, almost.

<p>The problem with the "WCE" mode is that there are broken programs
that do not properly communicate the required information up to the
calling program.

<p>Unless a program messes with signal handling, the system does this
automatically.

<p>There are programs that want to exit on SIGINT, but they don't let
the system do the automatic exit, because they want to do some
cleanup. To do so, they catch SIGINT, do the cleanup and then exit by
themselves.

<p>And here is where the problem arises: Once they catch the signal,
the system will no longer communicate the "I-exited-on-SIGINT" status
to the calling program automatically, even if the program exits
immediately in the signal handler of SIGINT. Once it catches the
signal, it has to take care of communicating the signal status
itself.

<p>Some programs don't do this. On SIGINT, they do cleanup and exit
immediately, but the calling shell isn't told about the non-normal exit
and it will call the next program in the script.

<p>As a result, the user sends SIGINT and while one program exits, the
shellscript continues. To him/her it looks like the shell fails to
obey his/her abort command.

<p>Both IUE and WUE shells would not have this problem, since they
discontinue the script on their own.
But as I said, they don't support
programs using SIGINT for non-exiting purposes, no matter whether
these programs properly communicate their signal status to the calling
shell or not.

<p>Since some shells in wide use implement the WUE way (and some even
IUE), there is a considerable number of broken programs out there that
break WCE shells. The programmers just don't notice it if their
shell isn't WCE.

<H3>How to be a proper program</H3>

<p>(Short note in advance: What you need to achieve is that
WIFSIGNALED(status) is true in the calling program and that
WTERMSIG(status) returns SIGINT.)

<p>If you don't catch SIGINT, the system automatically does the right
thing for you: Your program exits and the calling program gets the
right "I-exited-on-SIGINT" status after waiting for your exit.

<p>But once you catch SIGINT, you have to act.

<p>Decide whether the SIGINT is used for exit/abort purposes and hence
a shellscript calling this program should discontinue. This is
hopefully obvious. If you just need to do some cleanup on SIGINT, but
then exit immediately, the answer is "yes".

<p>If so, you have to tell the calling program about it by exiting
with the "I-exited-on-SIGINT" status.

<p>There is no other way of doing this than to kill yourself with a
SIGINT signal. Do it by resetting the SIGINT handler to SIG_DFL, then
send yourself the signal.

<PRE>
#include &lt;signal.h&gt;
#include &lt;unistd.h&gt;

void sigint_handler(int sig)
{
	/* do some cleanup here */
	signal(SIGINT, SIG_DFL);	/* restore the default action ... */
	kill(getpid(), SIGINT);		/* ... and die from the signal itself */
}
</PRE>

Notes:

<MENU>

<LI>You cannot "fake" the proper exit status by an exit(3) with a
special numeric value. People often assume this since the manuals for
shells often list some return value for exactly this. But this is just
a convention for your shell script. It does not work from one UNIX API
program to another.

<P>All that happens is that the shell sets the "$?" variable to a
special numeric value for the convenience of your script, because your
script does not have access to the lower-level UNIX status evaluation
functions. This is just an agreement between your script and the
executing shell; it does not have any meaning in other contexts.

<P><LI>Do not use kill(0, SIGINT) without consulting the manual for
your OS implementation. E.g. on BSD, this would send the signal not
just to the current process, but to all processes in its process group.

<P><LI>POSIX 1003.1 allows all these calls to appear in signal
handlers, so it is portable.

</MENU>

<p>In a bourne shell script, you can catch signals using the
<code>trap</code> command. Here, the same applies as for C programs. If
the intention of SIGINT is to end your program, you have to exit in a
way that the calling program "sees" that you have been killed. If
you don't catch SIGINT, this happens automatically, but if you catch
SIGINT, e.g. to do cleanup work, you have to end the program by
killing yourself, not by calling exit.

<p>Consider this example from FreeBSD's <code>mkdep</code>, which is a
bourne shell script.

<pre>
TMP=_mkdep$$
trap 'rm -f $TMP ; trap 2 ; kill -2 $$' 1 2 3 13 15
</pre>

Yes, you have to do it the hard way. It's even more annoying in shell
scripts than in C programs since you can't "pre-delete" temporary
files (which isn't really portable in C, though).

<P>All this applies to programs in all languages, not only C and
bourne shell. Every language implementation that lets you catch SIGINT
should also give you the option to reset the signal and kill yourself.

<P>It is always desirable to exit the right way: even if you don't
expect your usual callers to depend on it, some unusual one will come
along.
This proper exit status will be needed for WCE and will not
hurt when the calling shell uses IUE or WUE.

<H3>How to be a proper shell</H3>

All this applies only to the script-executing case. Most shells will
also have interactive modes where things are different.

<MENU>

<LI>Do nothing special when SIGINT appears while you wait for a child.
You don't even have to remember that one happened.

<P><LI>Wait for the child to exit, get the exit status. Do not truncate
it to type char.

<P><LI>Look at WIFSIGNALED(status) and WTERMSIG(status) to tell
whether the child says "I exited on SIGINT: in my opinion the user
wants the shellscript to be discontinued".

<P><LI>If the latter applies, discontinue the script.

<P><LI>Exit. But since a shellscript may in turn be called by a
shellscript, you need to make sure that you properly communicate the
discontinue intention to the calling program. As in any other program
(see above), do

<PRE>
    signal(SIGINT, SIG_DFL);
    kill(getpid(), SIGINT);
</PRE>

</MENU>

<H3>Other remarks</H3>

Although this web page talks about SIGINT only, almost the same issues
apply to SIGQUIT, including proper exiting by killing yourself after
catching the signal and proper reaction to the WIFSIGNALED(status)
value. One notable difference for SIGQUIT is that you have to make
sure that the whole call tree does not dump core.

<H3>What to fight</H3>

Make sure all programs <em>really</em> kill themselves if they react
to SIGINT or SIGQUIT and intend to abort their operation as a result
of this signal. Programs that don't use SIGINT/SIGQUIT as a
termination trigger - but as part of normal operation - should not kill
themselves, but do a normal exit instead.

<p>Make sure people understand why you can't fake an exit-on-signal by
doing exit(...) using any numerical status.
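<p>To see the difference from the calling program's side, here is a
minimal sketch in C. The helper names <code>run_child</code> and
<code>report</code> are made up for this illustration, and 130 is only
the value that many shells show in "$?" after a SIGINT death. One child
tries to fake the status with a numeric exit value, the other resets the
handler and kills itself as described above; the parent then inspects
both with the wait macros.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that ends in one of two ways and
 * return the raw wait status as seen by the parent.
 *   fake_exit != 0: the child calls _exit(130), a purely numeric value.
 *   fake_exit == 0: the child resets SIGINT and kills itself with it.
 */
int run_child(int fake_exit)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid == 0) {
        if (fake_exit)
            _exit(130);
        signal(SIGINT, SIG_DFL);
        kill(getpid(), SIGINT);
        _exit(EXIT_FAILURE);      /* only reached if SIGINT is blocked */
    }
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        exit(EXIT_FAILURE);
    }
    return status;
}

/* Print what a calling program can learn from a wait status. */
void report(const char *label, int status)
{
    /* Without WUNTRACED, a child is either exited or signaled. */
    assert(WIFEXITED(status) || WIFSIGNALED(status));

    if (WIFSIGNALED(status))
        printf("%s: killed by signal %d\n", label, WTERMSIG(status));
    else
        printf("%s: normal exit, code %d\n", label, WEXITSTATUS(status));
}
```

<p>Called as <code>report("exit(130)", run_child(1))</code> and
<code>report("kill self", run_child(0))</code>, the first child should be
reported as a normal exit with code 130, and only the second one as
killed by SIGINT. A WCE shell would continue the script after the first
and discontinue it after the second.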

<p>Make sure you use a shell that behaves right. This matters especially
if you develop programs, since it will help you see problems.

<H3>Concrete examples of how to fix programs:</H3>
<ul>

<li>The fix for FreeBSD's
<A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/time/time.c.diff?r1=1.10&r2=1.11">time(1)</A>. This fix is the best example: it's quite short and clear and
it fixes a case where someone tried to fake signal exit status by a
numerical value. And the complete program is small.

<p><li>The fix for FreeBSD's
<A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/truss/main.c.diff?r1=1.9&r2=1.10">truss(1)</A>.

<p><li>The fix for FreeBSD's
<A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/mkdep/mkdep.gcc.sh.diff?r1=1.8.2.1&r2=1.8.2.2">mkdep(1)</A>, a shell script.

<p><li>The fix for FreeBSD's make(1), <A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/make/job.c.diff?r1=1.9&r2=1.10">part 1</A>,
<A HREF="http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/make/compat.c.diff?r1=1.10&r2=1.11">part 2</A>.

</ul>

<H3>Testsuite for shells</H3>

I have a collection of shellscripts that test shells for this
behavior. See my <A HREF="download/">download dir</A> to get the newest
"sh-interrupt" files, either as a tarfile or as individual files for
online browsing. This isn't really documented, apart from the
comments the scripts echo.
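<p>To make the checklist from "How to be a proper shell" more concrete,
the core of a WCE-style script runner could look roughly like the
following C sketch. The function names (<code>run_foreground</code>,
<code>run_script</code>, <code>die_from_sigint</code>) are invented for
this illustration and error handling is minimal. Ignoring SIGINT in the
parent while it waits is only one simple way to "do nothing special",
and resetting SIGINT to the default in the child is a simplification; a
real shell also has to deal with traps, job control and signals that
were already ignored on entry.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run one command in the foreground; store its raw wait status in
 * *status. Returns 0 on success, -1 if fork or waitpid failed. */
int run_foreground(char *const argv[], int *status)
{
    struct sigaction ign, old;
    pid_t pid;
    int rc = 0;

    assert(argv != NULL && argv[0] != NULL);

    /* The runner itself must survive the SIGINT while it waits. */
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    ign.sa_flags = 0;
    sigaction(SIGINT, &ign, &old);

    pid = fork();
    if (pid == 0) {
        signal(SIGINT, SIG_DFL);  /* the child decides for itself */
        execvp(argv[0], argv);
        _exit(127);               /* exec failed */
    }
    if (pid < 0) {
        rc = -1;
    } else {
        while (waitpid(pid, status, 0) < 0) {
            if (errno != EINTR) {
                rc = -1;
                break;
            }
        }
    }
    sigaction(SIGINT, &old, NULL);
    return rc;
}

/* Run the commands in order. Stop as soon as a child reports
 * "I exited on SIGINT". Returns the number of commands started and
 * sets *interrupted to 1 if the script was discontinued that way. */
size_t run_script(char *const *cmds[], size_t n, int *interrupted)
{
    size_t started = 0;

    *interrupted = 0;
    for (size_t i = 0; i < n; i++) {
        int status = 0;

        started++;
        if (run_foreground(cmds[i], &status) != 0)
            break;
        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGINT) {
            *interrupted = 1;
            break;
        }
    }
    return started;
}

/* Pass the "discontinue" decision on to our own caller. */
void die_from_sigint(void)
{
    signal(SIGINT, SIG_DFL);
    kill(getpid(), SIGINT);
}
```

<p>A caller would run <code>run_script()</code> and, if
<code>*interrupted</code> was set, finish with
<code>die_from_sigint()</code> instead of a normal exit, so that a
script-executing parent sees the same signal status in turn.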

<H3>Appendix 1 - table of implementation choices</H3>

<table border cellpadding=2>

<tr valign=top>
<th>Method</th>
<th>Does what?</th>
<th>Example shells that implement it:</th>
<th>What happens when a shellscript called emacs, the user used
<code>C-g</code> and the script has additional commands in it?</th>
<th>What happens when a shellscript called emacs, the user did not use
<code>C-g</code> and the script has additional commands in it?</th>
<th>What happens if a non-interactive child catches SIGINT?</th>
<th>To behave properly, children must do what?</th>
</tr>

<tr valign=top align=left>
<td>IUE</td>
<td>The shell executing a script exits immediately if it receives
SIGINT.</td>
<td>4.4BSD ash (ash), NetBSD, FreeBSD prior to 3.0/2.2.8</td>
<td>The editor session is lost and subsequent commands are not
executed.</td>
<td>The editor continues as normal and the subsequent commands are
executed. </td>
<td>The script ends immediately, returning to the caller even before
the current foreground child of the shell exits. </td>
<td>It doesn't matter what the child does or how it exits; even if the
child continues to operate, the shell returns. </td>
</tr>

<tr valign=top align=left>
<td>WUE</td>
<td>If the shell executing a script received SIGINT while a foreground
process was running, it will exit after that child's exit.</td>
<td>pdksh (OpenBSD /bin/sh)</td>
<td>The editor continues as normal, but subsequent commands from the
script are not executed.</td>
<td>The editor continues as normal and subsequent commands are
executed. </td>
<td>The script returns to its caller after the current foreground
child exits, no matter how the child exited. </td>
<td>It doesn't matter how the child exits (signal status or not), but
if it doesn't return at all, the shell will not return.
In no case
will further commands from the script be executed. </td>
</tr>

<tr valign=top align=left>
<td>WCE</td>
<td>The shell exits if a child signaled that it was killed by a
signal (either it had the default handler for SIGINT or it killed
itself). </td>
<td>bash (Linux /bin/sh), most commercial /bin/sh, FreeBSD /bin/sh
from 3.0/2.2.8.</td>
<td>The editor continues as normal and subsequent commands are
executed. </td>
<td>The editor continues as normal and subsequent commands are
executed. </td>
<td>The script returns to its caller after the current foreground
child exits, but only if the child exited with signal status. If
the child did a normal exit (even if it received SIGINT, but caught
it), the script will continue. </td>
<td>The child must be implemented right, or the user will not be able
to break shell scripts reliably.</td>
</tr>

</table>

<P><img src=linie.png width="100%" alt=" ">
<BR>©2005 Martin Cracauer &lt;cracauer @ cons.org&gt;
<A HREF="http://www.cons.org/cracauer/">http://www.cons.org/cracauer/</A>
<BR>Last changed: $Date: 2005/02/11 21:44:43 $
</BODY></HTML>