---
title: Locking Block Device Access
category: Interfaces
layout: default
SPDX-License-Identifier: LGPL-2.1-or-later
---

# Locking Block Device Access

*TL;DR: Use BSD file locks
[(`flock(2)`)](http://man7.org/linux/man-pages/man2/flock.2.html) on block
device nodes to synchronize access for partitioning and file system formatting
tools.*

`systemd-udevd` probes all block devices showing up for file system superblock
and partition table information (utilizing `libblkid`). If another program
concurrently modifies a superblock or partition table this probing might be
affected, which is bad in itself, but also might in turn result in undesired
effects in programs subscribing to `udev` events.

Applications manipulating a block device can temporarily stop `systemd-udevd`
from processing rules on it (and thus bar it from probing the device) by
taking a BSD file lock on the block device node. Specifically, whenever
`systemd-udevd` starts processing a block device it takes a `LOCK_SH|LOCK_NB`
lock using [`flock(2)`](http://man7.org/linux/man-pages/man2/flock.2.html) on
the main block device (i.e. never on any partition block device, but on the
device the partition belongs to). If this lock cannot be taken (i.e. `flock()`
fails with `EAGAIN`), it refrains from processing the device. If it manages to
take the lock, it keeps it for the entire time the device is processed.

Note that `systemd-udevd` also watches all block device nodes it manages for
`inotify` `IN_CLOSE_WRITE` events: whenever such an event is seen, this is
used as a trigger to re-run the rule-set for the device.

These two concepts allow tools such as disk partitioners or file system
formatting tools to safely and easily take exclusive ownership of a block
device while operating: before starting work on the block device, they should
take a `LOCK_EX` lock on it.
This has two effects: first of all, in case
`systemd-udevd` is still processing the device the tool will wait for it to
finish. Second, after the lock is taken, it can be sure that `systemd-udevd`
will refrain from processing the block device, and thus all other client
applications subscribed to it won't get device notifications from potentially
half-written data either. After the operation is complete the
partitioner/formatter can simply close the device node. This again has two
effects: first, it implicitly releases the lock, so that `systemd-udevd` can
process events on the device node again. Second, it results in an
`IN_CLOSE_WRITE` event, which causes `systemd-udevd` to immediately re-process
the device (seeing all changes the tool made) and notify subscribed clients
about it.

Ideally, `systemd-udevd` would explicitly watch block devices for `LOCK_EX`
locks being released. Such monitoring is not supported on Linux however, which
is why it watches for `IN_CLOSE_WRITE` instead, i.e. for `close()` calls to
writable file descriptors referring to the block device. In almost all cases,
the difference between these two events does not matter much, as any locks
taken are implicitly released by `close()`. However, it should be noted that if
an application unlocks a device after completing its work without closing it,
i.e. while keeping the file descriptor open for a longer time, then
`systemd-udevd` will not notice this, and will therefore neither retrigger nor
reprobe the device.

Besides synchronizing block device access between `systemd-udevd` and such
tools this scheme may also be used to synchronize access between those tools
themselves. However, do note that `flock()` locks are advisory only. This means
if one tool honours this scheme and another tool does not, they will of course
not be synchronized properly, and might interfere with each other's work.
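
The lock, modify, close sequence described above can be sketched in Python
roughly as follows. This is a minimal illustration, not code taken from
systemd: the helper name `locked_block_device` is hypothetical, and the caller
is assumed to pass the whole-disk device node rather than a partition node,
since that is where `systemd-udevd` takes its own lock.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def locked_block_device(path):
    """Open `path` read-write and hold a LOCK_EX BSD lock while the body runs.

    Hypothetical helper for illustration; `path` is assumed to name the
    whole-disk device node, not one of its partitions.
    """
    # The descriptor must be writable: closing a writable descriptor is what
    # generates the IN_CLOSE_WRITE event that systemd-udevd watches for.
    fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    try:
        # Blocking call: waits until any LOCK_SH held by systemd-udevd
        # (or a lock held by another cooperating tool) has been released.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # close() both releases the lock and generates IN_CLOSE_WRITE,
        # so no explicit LOCK_UN is needed.
        os.close(fd)
```

A tool would wrap its modifications in `with locked_block_device(path) as fd:`
and write through `fd`. Leaving the `with` block closes the descriptor, which
is the step that lets `systemd-udevd` re-probe the device.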

Note that the file locks follow the usual access semantics of BSD locks: since
`systemd-udevd` never writes to such block devices it only takes a `LOCK_SH`
*shared* lock. A program intending to make changes to the block device should
take a `LOCK_EX` *exclusive* lock instead. For further details, see the
`flock(2)` man page.

And please keep in mind: BSD file locks (`flock()`) and POSIX file locks
(`lockf()`, `F_SETLK`, …) are different concepts, and in their effect
orthogonal. The scheme discussed above uses the former and not the latter,
because these types of locks more closely match the required semantics.

If multiple devices are to be locked at the same time (for example in order to
format a RAID file system), the devices should be locked in the order of the
device nodes' major numbers (primary ordering key, ascending) and minor
numbers (secondary ordering key, ditto), in order to avoid ABBA locking issues
between subsystems.

Note that the locks should only be taken while the device is repartitioned,
file systems formatted or `dd`'ed in, and in similar cases that
apply/remove/change superblocks/partition information. They should not be held
during normal operation, i.e. while file systems on the device are mounted for
application use.

The [`udevadm
lock`](https://www.freedesktop.org/software/systemd/man/udevadm.html) command
is provided to lock block devices following this scheme from the command line,
for use in scripts and similar. (Note though that it's typically preferable
to use native support for block device locking in tools where that's
available.)

Summarizing: it is recommended to take `LOCK_EX` BSD file locks when
manipulating block devices in all tools that change file system block devices
(`mkfs`, `fsck`, …) or partition tables (`fdisk`, `parted`, …), right after
opening the node.
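
For the multi-device case mentioned above, the major/minor ordering rule could
be implemented along these lines. This is only a sketch under stated
assumptions: `device_number` and `lock_in_order` are hypothetical helper names,
and the paths passed in are assumed to be whole-disk device nodes.

```python
import fcntl
import os


def device_number(path):
    """Return the (major, minor) pair of the device node at `path`."""
    rdev = os.stat(path).st_rdev
    return (os.major(rdev), os.minor(rdev))


def lock_in_order(paths):
    """Take LOCK_EX on every device, lowest (major, minor) first.

    Hypothetical helper for illustration. Returns the open descriptors;
    the caller closes them when done, which also releases the locks.
    """
    fds = []
    try:
        # Tuples compare element-wise, so major is the primary key and
        # minor the secondary key, both ascending.
        for path in sorted(paths, key=device_number):
            fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
            fds.append(fd)
            fcntl.flock(fd, fcntl.LOCK_EX)
    except BaseException:
        # On failure, close (and thereby unlock) whatever was opened so far.
        for fd in fds:
            os.close(fd)
        raise
    return fds
```

Because every cooperating tool sorts by the same key, two tools locking an
overlapping set of devices always contend on the lowest-numbered shared device
first, which is what rules out the ABBA deadlock.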