===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and the "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_usercopy

  injects failures in user memory access functions. (copy_from_user(), get_user(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_sunrpc

  injects kernel RPC client and server failures.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request.

- fail_function

  injects error returns into specific functions, which are marked by the
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option is supported.

- NVMe fault injection

  injects an NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via debugfs.

- Null test block driver fault injection

  injects IO timeouts by setting config items under
  /sys/kernel/config/nullb/<disk>/timeout_inject,
  injects requeue requests by setting config items under
  /sys/kernel/config/nullb/<disk>/requeue_inject, and
  injects init_hctx() errors by setting config items under
  /sys/kernel/config/nullb/<disk>/init_hctx_fault_inject.
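
Which of the debugfs-based capabilities listed above are present on a given
system depends on how the kernel was built. The following is a minimal sketch,
not part of the kernel tree, that lists the fail* directories found under a
debugfs root. The function name list_fault_caps and its optional root argument
are hypothetical, and the check for a "probability" file is an assumption used
to tell capability directories apart from unrelated entries.

```shell
#!/bin/bash

# Hypothetical helper: print the name of every fail* directory under the
# given debugfs root (default: /sys/kernel/debug) that has a "probability"
# file. Prints nothing if no such directory exists.
list_fault_caps()
{
    local root=${1:-/sys/kernel/debug}
    local d

    for d in "$root"/fail*
    do
        # An unmatched glob stays literal and fails the -d test.
        if [ -d "$d" ] && [ -e "$d/probability" ]
        then
            basename "$d"
        fi
    done
    return 0
}
```

Running "list_fault_caps" as root on a kernel with only failslab and
fail_page_alloc enabled would be expected to print those two names, one per
line.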

Configure fault-injection capabilities behavior
-----------------------------------------------

debugfs entries
^^^^^^^^^^^^^^^

The fault-inject-debugfs kernel module provides some debugfs entries for
runtime configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

  likelihood of failure injection, in percent.

  Format: <percent>

  Note that one-failure-per-hundred is a very high error rate
  for some testcases. Consider setting probability=100 and configuring
  /sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

  specifies the interval between failures, for calls to
  should_fail() that pass all the other tests.

  Note that if you enable this, by setting interval>1, you will
  probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

  specifies how many times failures may happen at most. A value of -1
  means "no limit".

- /sys/kernel/debug/fail*/space:

  specifies an initial resource "budget", decremented by "size"
  on each call to should_fail(,size). Failure injection is
  suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

  Format: { 0 | 1 | 2 }

  specifies the verbosity of the messages when a failure is
  injected. '0' means no messages; '1' will print only a single
  log line per failure; '2' will print a call trace too, which is useful
  to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

  Format: { 'Y' | 'N' }

  A value of 'N' disables filtering by process (default).
  A value of 'Y' limits failures to only processes indicated by
  /proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

  specify the ranges of virtual addresses tested during
  stacktrace walking. Failure is injected only if some caller
  in the walked stacktrace lies within the required range, and
  none lies within the rejected range.
  Default required range is [0,ULONG_MAX) (whole of virtual address space).
  Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

  specifies the maximum stacktrace depth walked during search
  for a caller within [require-start,require-end) OR
  [reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

  Format: { 'Y' | 'N' }

  Default is 'Y'. Setting it to 'N' will also inject failures into
  highmem/user allocations (__GFP_HIGHMEM allocations).

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

  Format: { 'Y' | 'N' }

  Default is 'Y'. Setting it to 'N' will also inject failures
  into allocations that can sleep (__GFP_DIRECT_RECLAIM allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

  specifies the minimum page allocation order at which failures
  are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

  Format: { 'Y' | 'N' }

  Default is 'N'. Setting it to 'Y' will disable failure injections
  when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_sunrpc/ignore-client-disconnect:

  Format: { 'Y' | 'N' }

  Default is 'N'. Setting it to 'Y' will disable disconnect
  injection on the RPC client.

- /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect:

  Format: { 'Y' | 'N' }

  Default is 'N'. Setting it to 'Y' will disable disconnect
  injection on the RPC server.

- /sys/kernel/debug/fail_sunrpc/ignore-cache-wait:

  Format: { 'Y' | 'N' }

  Default is 'N'. Setting it to 'Y' will disable cache wait
  injection on the RPC server.

- /sys/kernel/debug/fail_function/inject:

  Format: { 'function-name' | '!function-name' | '' }

  specifies the target function of error injection by name.
  If the function name is prefixed with '!', the given function is
  removed from the injection list. If nothing is specified (''),
  the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

  (read only) shows error injectable functions and what type of
  error values can be specified. The error type will be one of
  the following:

  - NULL:     retval must be 0.
  - ERRNO:    retval must be -1 to -MAX_ERRNO (-4095).
  - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

  specifies the "error" return value to inject into the given function.
  This file is created when the user specifies a new injection entry.
  Note that this file only accepts unsigned values. So, if you want to
  use a negative errno, you had better use 'printf' instead of 'echo', e.g.::

      $ printf %#x -12 > retval

Boot option
^^^^^^^^^^^

In order to inject faults while debugfs is not available (early boot time),
use the boot option::

    failslab=
    fail_page_alloc=
    fail_usercopy=
    fail_make_request=
    fail_futex=
    mmc_core.fail_request=<interval>,<probability>,<space>,<times>

proc entries
^^^^^^^^^^^^

- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

  Writing an integer N to this file makes the N-th call in the task fail.
  Reading from this file returns an integer value. A value of '0' indicates
  that the fault set up with a previous write to this file was injected.
  A positive integer N indicates that the fault wasn't yet injected.
  Note that this file enables all types of faults (slab, futex, etc).
  This setting takes precedence over all other generic debugfs settings
  like probability, interval, times, etc. But per-capability settings
  (e.g. fail_futex/ignore-private) take precedence over it.

  This feature is intended for systematic testing of faults in a single
  system call. See an example below.


Error Injectable Functions
--------------------------

This part is for kernel developers considering marking a function with the
ALLOW_ERROR_INJECTION() macro.

Requirements for the Error Injectable Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Since function-level error injection forcibly changes the code path
and returns an error even if the input and conditions are proper, it can
cause an unexpected kernel crash if you allow error injection on a function
which is NOT error injectable. Thus, you (and reviewers) must ensure that:

- The function returns an error code if it fails, and the callers check
  it correctly (they need to recover from it).

- The function does not execute any code which can change any state before
  the first error return. The state includes global, local, and input
  variables. Examples are clearing output address storage
  (e.g. `*ret = NULL`), incrementing or decrementing a counter, setting a
  flag, disabling preemption or IRQs, or taking a lock (if these are undone
  before returning the error, that is OK).

The first requirement is important, and it means that release
(object-freeing) functions are usually harder to inject errors into than
allocation functions. If errors from such release functions are not handled
correctly, a memory leak results easily (the caller cannot tell whether the
object has been released or corrupted).

The second requirement protects callers that expect the function to always
do something.
Thus, if error injection skips the whole function, that expectation is
violated and an unexpected error results.

Type of the Error Injectable Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each error injectable function has the error type specified by the
ALLOW_ERROR_INJECTION() macro. You have to choose it carefully if you add
a new error injectable function. If the wrong error type is chosen, the
kernel may crash because it may not be able to handle the error.
There are 4 types of errors defined in include/asm-generic/error-injection.h:

EI_ETYPE_NULL
  This function will return `NULL` if it fails. e.g. return an allocated
  object address.

EI_ETYPE_ERRNO
  This function will return an `-errno` error code if it fails. e.g. return
  -EINVAL if the input is wrong. This includes functions which return
  an address that encodes `-errno` with the ERR_PTR() macro.

EI_ETYPE_ERRNO_NULL
  This function will return an `-errno` or `NULL` if it fails. If the caller
  of this function checks the return value with the IS_ERR_OR_NULL() macro,
  this type will be appropriate.

EI_ETYPE_TRUE
  This function will return `true` (non-zero positive value) if it fails.

If you specify a wrong type, for example, EI_ETYPE_ERRNO for a function
which returns an allocated object, it may cause a problem because the returned
value is not an object address and the caller cannot access the address.


How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes::

      DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

  - boot option

    If you need to enable the fault injection capability from boot time, you
    can provide a boot option to configure it. There is a helper function for
    it::

        setup_fault_attr(attr, str);

  - debugfs entries

    failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this
    way. Helper function::

        fault_create_debugfs_attr(name, parent, attr);

  - module parameters

    If the scope of the fault injection capability is limited to a
    single kernel module, it is better to provide module parameters to
    configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure::

      should_fail(attr, size);

Application Examples
--------------------

- Inject slab allocation failures into module init/exit code::

    #!/bin/bash

    FAILTYPE=failslab
    # Only fail tasks that have /proc/<pid>/make-it-fail set to 1.
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    # Mark the shell as failable, then exec the given command in it.
    faulty_system()
    {
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 modulename [ modulename ... ]"
        exit 1
    fi

    for m in "$@"
    do
        echo "inserting $m..."
        faulty_system modprobe "$m"

        echo "removing $m..."
        faulty_system modprobe -r "$m"
    done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z "$module" ]
    then
        echo "Usage: $0 <modulename>"
        exit 1
    fi

    modprobe "$module"

    if [ ! -d "/sys/module/$module/sections" ]
    then
        echo "Module $module is not loaded"
        exit 1
    fi

    # Require a caller between the module's .text and .data sections.
    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    # Turn injection off again when the script is interrupted or exits.
    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000

------------------------------------------------------------------------------

- Inject an open_ctree error during btrfs mount::

    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    # retval only accepts unsigned values, so pass -12 in hex form.
    printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    # The mount is expected to fail because of the injected error.
    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
        echo "SUCCESS!"
    else
        echo "FAILED!"
        umount tmpmnt
    fi

    # Clear the injection list.
    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img


Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures::

    # ./tools/testing/fault-injection/failcmd.sh \
        -- make -C tools/testing/selftests/ run_tests

Same as above, except allow at most 100 failures instead of the default of
at most one::

    # ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Same as above, except inject page allocation failures instead of slab
allocation failures::

    # env FAILCMD_TYPE=fail_page_alloc \
        ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
--------------------------------

The following code systematically injects a fault into the 1st, 2nd, 3rd and
so on fault-capable call in the socketpair() system call::

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int i, err, res, fail_nth, fds[2];
        ssize_t len;
        char buf[128];

        /* Also inject failures into allocations that can sleep. */
        system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
        sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
        fail_nth = open(buf, O_RDWR);
        if (fail_nth < 0) {
            perror("open fail-nth");
            return 1;
        }
        for (i = 1;; i++) {
            /* Arm the i-th fault-capable call in this task. */
            sprintf(buf, "%d", i);
            write(fail_nth, buf, strlen(buf));
            res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
            err = errno;
            /* Reading back 0 means the fault was injected. */
            len = pread(fail_nth, buf, sizeof(buf) - 1, 0);
            if (len <= 0) {
                perror("pread fail-nth");
                return 1;
            }
            buf[len] = '\0';
            if (res == 0) {
                close(fds[0]);
                close(fds[1]);
            }
            printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
                   res, err);
            /* A non-zero value means no fault was injected: stop. */
            if (atoi(buf))
                break;
        }
        return 0;
    }

An example output::

    1-th fault Y: res=-1/23
    2-th fault Y: res=-1/23
    3-th fault Y: res=-1/12
    4-th fault Y: res=-1/12
    5-th fault Y: res=-1/23
    6-th fault Y: res=-1/23
    7-th fault Y: res=-1/23
    8-th fault Y: res=-1/12
    9-th fault Y: res=-1/12
    10-th fault Y: res=-1/12
    11-th fault Y: res=-1/12
    12-th fault Y: res=-1/12
    13-th fault Y: res=-1/12
    14-th fault Y: res=-1/12
    15-th fault Y: res=-1/12
    16-th fault N: res=0/12
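
In the output above, the number after the slash is the errno value saved right
after socketpair() returned; on Linux, 12 is ENOMEM and 23 is ENFILE. The value
on the final "N" line is presumably left over from the previous iteration,
since that call succeeded. For longer runs it may help to tally the injected
faults per errno. The following is a minimal sketch; the function name
summarize_fail_nth is hypothetical, and it assumes input in exactly the format
printed by the program above.

```shell
#!/bin/bash

# Hypothetical helper: read "<i>-th fault <Y|N>: res=<res>/<errno>" lines on
# stdin and print one "errno <value>: <count>" line per errno, counting only
# the lines marked "Y" (fault injected).
summarize_fail_nth()
{
    awk '$3 == "Y:" {
        split($4, res, "/")     # res[2] is the errno after the slash
        count[res[2]]++
    }
    END {
        for (e in count)
            printf "errno %s: %d\n", e, count[e]
    }' | sort
}
```

Fed the example output above, this would be expected to print
"errno 12: 10" followed by "errno 23: 5".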