1Devfs (Device File System) FAQ
2
3
4Linux Devfs (Device File System) FAQ
5Richard Gooch
621-JUL-2002
7
8
16
17-----------------------------------------------------------------------------
18
19NOTE: the master copy of this document is available online at:
20
21http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html
22and looks much better than the text version distributed with the
23kernel sources. A mirror site is available at:
24
25http://www.ras.ucalgary.ca/~rgooch/linux/docs/devfs.html
26
27There is also an optional daemon that may be used with devfs. You can
28find out more about it at:
29
30http://www.atnf.csiro.au/~rgooch/linux/
31
A mailing list is available which you may subscribe to. Send email to
majordomo@oss.sgi.com with the following line in the body of the
message:

subscribe devfs

To unsubscribe, send the following message body instead:

unsubscribe devfs

The list is archived at

http://oss.sgi.com/projects/devfs/archive/.
42
43-----------------------------------------------------------------------------
44
45Contents
46
47
48What is it?
49
50Why do it?
51
52Who else does it?
53
54How it works
55
56Operational issues (essential reading)
57
58Instructions for the impatient
Permissions persistence across reboots
60Dealing with drivers without devfs support
61All the way with Devfs
62Other Issues
63Kernel Naming Scheme
64Devfsd Naming Scheme
65Old Compatibility Names
66SCSI Host Probing Issues
67
68
69
70Device drivers currently ported
71
72Allocation of Device Numbers
73
74Questions and Answers
75
76Making things work
77Alternatives to devfs
78What I don't like about devfs
79How to report bugs
80Strange kernel messages
81Compilation problems with devfsd
82
83
84Other resources
85
86Translations of this document
87
88
89-----------------------------------------------------------------------------
90
91
92What is it?
93
94Devfs is an alternative to "real" character and block special devices
95on your root filesystem. Kernel device drivers can register devices by
96name rather than major and minor numbers. These devices will appear in
97devfs automatically, with whatever default ownership and
98protection the driver specified. A daemon (devfsd) can be used to
99override these defaults. Devfs has been in the kernel since 2.3.46.
100
101NOTE that devfs is entirely optional. If you prefer the old
102disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the
103default). In this case, nothing will change. ALSO NOTE that if you do
104enable devfs, the defaults are such that full compatibility is
maintained with the old device names.
106
107There are two aspects to devfs: one is the underlying device
108namespace, which is a namespace just like any mounted filesystem. The
109other aspect is the filesystem code which provides a view of the
110device namespace. The reason I make a distinction is because devfs
111can be mounted many times, with each mount showing the same device
112namespace. Changes made are global to all mounted devfs filesystems.
113Also, because the devfs namespace exists without any devfs mounts, you
114can easily mount the root filesystem by referring to an entry in the
115devfs namespace.
116
117
The cost of devfs is a small increase in kernel code size and memory
usage: about 7 pages of code (some of that in __init sections) and 72
bytes for each entry in the namespace. A modest system has only a
couple of hundred device entries, so this costs a few more
pages. Compare this with the suggestion to put /dev on a ramdisc.
124
125On a typical machine, the cost is under 0.2 percent. On a modest
126system with 64 MBytes of RAM, the cost is under 0.1 percent. The
127accusations of "bloatware" levelled at devfs are not justified.
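
The figures quoted above can be checked with a little shell arithmetic.
This is only a rough sketch: the 4096 byte page size and the count of
200 device entries are assumptions chosen for illustration, and the
function names are made up for this example.

```shell
#!/bin/sh
# Rough devfs memory cost, using the figures quoted in the text.
# ASSUMPTIONS (not from the FAQ): 4096-byte pages, 200 device entries.

PAGE_BYTES=4096      # assumed page size
ENTRY_BYTES=72       # per-entry cost quoted in the text

# devfs_cost_bytes CODE_PAGES ENTRIES -> total bytes used
devfs_cost_bytes() {
    echo $(( $1 * PAGE_BYTES + $2 * ENTRY_BYTES ))
}

# cost_millipercent BYTES RAM_MBYTES -> cost in thousandths of a percent
cost_millipercent() {
    echo $(( $1 * 100000 / ($2 * 1024 * 1024) ))
}

total=$(devfs_cost_bytes 7 200)
echo "total bytes: $total"
echo "thousandths of a percent of 64 MBytes: $(cost_millipercent "$total" 64)"
```

Under these assumptions the total is 43072 bytes, or roughly 0.064
percent of 64 MBytes, which is consistent with the "under 0.1 percent"
claim.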
128
129-----------------------------------------------------------------------------
130
131
132Why do it?
133
134There are several problems that devfs addresses. Some of these
135problems are more serious than others (depending on your point of
136view), and some can be solved without devfs. However, the totality of
137these problems really calls out for devfs.
138
139The choice is a patchwork of inefficient user space solutions, which
140are complex and likely to be fragile, or to use a simple and efficient
141devfs which is robust.
142
143There have been many counter-proposals to devfs, all seeking to
144provide some of the benefits without actually implementing devfs. So
145far there has been an absence of code and no proposed alternative has
146been able to provide all the features that devfs does. Further,
147alternative proposals require far more complexity in user-space (and
148still deliver less functionality than devfs). Some people have the
149mantra of reducing "kernel bloat", but don't consider the effects on
150user-space.
151
152A good solution limits the total complexity of kernel-space and
153user-space.
154
155
156Major&minor allocation
157
158The existing scheme requires the allocation of major and minor device
159numbers for each and every device. This means that a central
160co-ordinating authority is required to issue these device numbers
161(unless you're developing a "private" device driver), in order to
162preserve uniqueness. Devfs shifts the burden to a namespace. This may
163not seem like a huge benefit, but actually it is. Since driver authors
164will naturally choose a device name which reflects the functionality
165of the device, there is far less potential for namespace conflict.
166Solving this requires a kernel change.
167
168/dev management
169
170Because you currently access devices through device nodes, these must
171be created by the system administrator. For standard devices you can
172usually find a MAKEDEV programme which creates all these (hundreds!)
173of nodes. This means that changes in the kernel must be reflected by
174changes in the MAKEDEV programme, or else the system administrator
175creates device nodes by hand.
176
177The basic problem is that there are two separate databases of
178major and minor numbers. One is in the kernel and one is in /dev (or
179in a MAKEDEV programme, if you want to look at it that way). This is
180duplication of information, which is not good practice.
181Solving this requires a kernel change.
182
183/dev growth
184
185A typical /dev has over 1200 nodes! Most of these devices simply don't
186exist because the hardware is not available. A huge /dev increases the
187time to access devices (I'm just referring to the dentry lookup times
188and the time taken to read inodes off disc: the next subsection shows
189some more horrors).
190
191An example of how big /dev can grow is if we consider SCSI devices:
192
Field      Bits  Notes
host       6     say up to 64 hosts on a really big machine
channel    4     say up to 16 SCSI buses per host
id         4
lun        3
partition  6
TOTAL      23
199
200
201This requires 8 Mega (1024*1024) inodes if we want to store all
202possible device nodes. Even if we scrap everything but id,partition
203and assume a single host adapter with a single SCSI bus and only one
204logical unit per SCSI target (id), that's still 10 bits or 1024
205inodes. Each VFS inode takes around 256 bytes (kernel 2.1.78), so
206that's 256 kBytes of inode storage on disc (assuming real inodes take
207a similar amount of space as VFS inodes). This is actually not so bad,
208because disc is cheap these days. Embedded systems would care about
209256 kBytes of /dev inodes, but you could argue that embedded systems
210would have hand-tuned /dev directories. I've had to do just that on my
211embedded systems, but I would rather just leave it to devfs.
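
The node counts above follow directly from the bit widths. A minimal
sketch of the arithmetic, using the 256 bytes per inode figure from the
text (the function names are invented for this example):

```shell
#!/bin/sh
# Device node counts implied by the SCSI addressing bit widths.

INODE_BYTES=256      # approximate VFS inode size quoted in the text

# nodes_for_bits BITS -> number of distinct device nodes
nodes_for_bits() {
    echo $(( 1 << $1 ))
}

# inode_kbytes NODES -> inode storage in kBytes
inode_kbytes() {
    echo $(( $1 * INODE_BYTES / 1024 ))
}

full=$(nodes_for_bits $(( 6 + 4 + 4 + 3 + 6 )))   # host, channel, id, lun, partition
small=$(nodes_for_bits $(( 4 + 6 )))              # id and partition only

echo "all fields: $full nodes"
echo "id + partition only: $small nodes, $(inode_kbytes "$small") kBytes"
```

The first figure is 8388608 (8 Mega) nodes; the reduced case gives 1024
nodes and 256 kBytes.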
212
213Another issue is the time taken to lookup an inode when first
214referenced. Not only does this take time in scanning through a list in
215memory, but also the seek times to read the inodes off disc.
216This could be solved in user-space using a clever programme which
217scanned the kernel logs and deleted /dev entries which are not
218available and created them when they were available. This programme
219would need to be run every time a new module was loaded, which would
220slow things down a lot.
221
222There is an existing programme called scsidev which will automatically
223create device nodes for SCSI devices. It can do this by scanning files
224in /proc/scsi. Unfortunately, to extend this idea to other device
225nodes would require significant modifications to existing drivers (so
226they too would provide information in /proc). This is a non-trivial
227change (I should know: devfs has had to do something similar). Once
228you go to this much effort, you may as well use devfs itself (which
229also provides this information). Furthermore, such a system would
230likely be implemented in an ad-hoc fashion, as different drivers will
231provide their information in different ways.
232
233Devfs is much cleaner, because it (naturally) has a uniform mechanism
234to provide this information: the device nodes themselves!
235
236
237Node to driver file_operations translation
238
239There is an important difference between the way disc-based character
240and block nodes and devfs entries make the connection between an entry
241in /dev and the actual device driver.
242
243With the current 8 bit major and minor numbers the connection between
244disc-based c&b nodes and per-major drivers is done through a
245fixed-length table of 128 entries. The various filesystem types set
246the inode operations for c&b nodes to {chr,blk}dev_inode_operations,
247so when a device is opened a few quick levels of indirection bring us
248to the driver file_operations.
249
250For miscellaneous character devices a second step is required: there
251is a scan for the driver entry with the same minor number as the file
252that was opened, and the appropriate minor open method is called. This
253scanning is done *every time* you open a device node. Potentially, you
254may be searching through dozens of misc. entries before you find your
255open method. While not an enormous performance overhead, this does
256seem pointless.
257
258Linux *must* move beyond the 8 bit major and minor barrier,
259somehow. If we simply increase each to 16 bits, then the indexing
260scheme used for major driver lookup becomes untenable, because the
261major tables (one each for character and block devices) would need to
262be 64 k entries long (512 kBytes on x86, 1 MByte for 64 bit
263systems). So we would have to use a scheme like that used for
264miscellaneous character devices, which means the search time goes up
265linearly with the average number of major device drivers on your
system. Not all "devices" are hardware; some are higher-level drivers
like KGI, so you can get more "devices" without adding hardware.
You can improve this by creating an ordered (balanced:-)
269binary tree, in which case your search time becomes log(N).
270Alternatively, you can use hashing to speed up the search.
271But why do that search at all if you don't have to? Once again, it
272seems pointless.
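
To see where the table sizes quoted above could come from, here is a
small sketch. It ASSUMES that each table entry consists of two pointers
(a name and a file_operations pointer); that assumption is not stated
in this document, but it reproduces the quoted figures per table.

```shell
#!/bin/sh
# Size of one directly-indexed major table.
# ASSUMPTION: each entry is two pointers (name + file_operations).

# table_kbytes MAJOR_BITS POINTER_BYTES -> size of one table in kBytes
table_kbytes() {
    echo $(( (1 << $1) * 2 * $2 / 1024 ))
}

echo "8 bit majors, 32 bit pointers:  $(table_kbytes 8 4) kBytes"
echo "16 bit majors, 32 bit pointers: $(table_kbytes 16 4) kBytes"
echo "16 bit majors, 64 bit pointers: $(table_kbytes 16 8) kBytes"
```

With 16 bit majors this gives 512 kBytes with 32 bit pointers and 1024
kBytes with 64 bit pointers, compared with 2 kBytes for an 8 bit
table.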
273
274Note that devfs doesn't use the major&minor system. For devfs
entries, the connection is done when you look up the /dev entry. When
devfs_register() is called, a record holding the entry name and the
file_operations is appended to an internal table. If the dentry cache doesn't
278have the /dev entry already, this internal table is scanned to get the
279file_operations, and an inode is created. If the dentry cache already
280has the entry, there is *no lookup time* (other than the dentry scan
281itself, but we can't avoid that anyway, and besides Linux dentries
282cream other OS's which don't have them:-). Furthermore, the number of
283node entries in a devfs is only the number of available device
284entries, not the number of *conceivable* entries. Even if you remove
285unnecessary entries in a disc-based /dev, the number of conceivable
286entries remains the same: you just limit yourself in order to save
287space.
288
289Devfs provides a fast connection between a VFS node and the device
290driver, in a scalable way.
291
292/dev as a system administration tool
293
294Right now /dev contains a list of conceivable devices, most of which I
295don't have. Devfs only shows those devices available on my
296system. This means that listing /dev is a handy way of checking what
297devices are available.
298
299Major&minor size
300
301Existing major and minor numbers are limited to 8 bits each. This is
302now a limiting factor for some drivers, particularly the SCSI disc
303driver, which consumes a single major number. Only 16 discs are
304supported, and each disc may have only 15 partitions. Maybe this isn't
305a problem for you, but some of us are building huge Linux systems with
306disc arrays. With devfs an arbitrary pointer can be associated with
307each device entry, which can be used to give an effective 32 bit
308device identifier (i.e. that's like having a 32 bit minor
309number). Since this is private to the kernel, there are no C library
310compatibility issues which you would have with increasing major and
311minor number sizes. See the section on "Allocation of Device Numbers"
312for details on maintaining compatibility with userspace.
313
314Solving this requires a kernel change.
315
316Since writing this, the kernel has been modified so that the SCSI disc
317driver has more major numbers allocated to it and now supports up to
318128 discs. Since these major numbers are non-contiguous (a result of
319unplanned expansion), the implementation is a little more cumbersome
320than originally.
321
322Just like the changes to IPv4 to fix impending limitations in the
323address space, people find ways around the limitations. In the long
324run, however, solutions like IPv6 or devfs can't be put off forever.
325
326Read-only root filesystem
327
328Having your device nodes on the root filesystem means that you can't
329operate properly with a read-only root filesystem. This is because you
330want to change ownerships and protections of tty devices. Existing
331practice prevents you using a CD-ROM as your root filesystem for a
332*real* system. Sure, you can boot off a CD-ROM, but you can't change
333tty ownerships, so it's only good for installing.
334
335Also, you can't use a shared NFS root filesystem for a cluster of
336discless Linux machines (having tty ownerships changed on a common
337/dev is not good). Nor can you embed your root filesystem in a
338ROM-FS.
339
340You can get around this by creating a RAMDISC at boot time, making
341an ext2 filesystem in it, mounting it somewhere and copying the
342contents of /dev into it, then unmounting it and mounting it over
343/dev.
344
345A devfs is a cleaner way of solving this.
346
347Non-Unix root filesystem
348
349Non-Unix filesystems (such as NTFS) can't be used for a root
350filesystem because they variously don't support character and block
351special files or symbolic links. You can't have a separate disc-based
352or RAMDISC-based filesystem mounted on /dev because you need device
353nodes before you can mount these. Devfs can be mounted without any
354device nodes. Devlinks won't work because symlinks aren't supported.
355An alternative solution is to use initrd to mount a RAMDISC initial
356root filesystem (which is populated with a minimal set of device
357nodes), and then construct a new /dev in another RAMDISC, and finally
358switch to your non-Unix root filesystem. This requires clever boot
359scripts and a fragile and conceptually complex boot procedure.
360
361Devfs solves this in a robust and conceptually simple way.
362
363PTY security
364
365Current pseudo-tty (pty) devices are owned by root and read-writable
366by everyone. The user of a pty-pair cannot change
367ownership/protections without being suid-root.
368
369This could be solved with a secure user-space daemon which runs as
370root and does the actual creation of pty-pairs. Such a daemon would
371require modification to *every* programme that wants to use this new
372mechanism. It also slows down creation of pty-pairs.
373
374An alternative is to create a new open_pty() syscall which does much
375the same thing as the user-space daemon. Once again, this requires
376modifications to pty-handling programmes.
377
378The devfs solution allows a device driver to "tag" certain device
379files so that when an unopened device is opened, the ownerships are
380changed to the current euid and egid of the opening process, and the
381protections are changed to the default registered by the driver. When
382the device is closed ownership is set back to root and protections are
383set back to read-write for everybody. No programme need be changed.
384The devpts filesystem provides this auto-ownership feature for Unix98
385ptys. It doesn't support old-style pty devices, nor does it have all
386the other features of devfs.
387
388Intelligent device management
389
390Devfs implements a simple yet powerful protocol for communication with
391a device management daemon (devfsd) which runs in user space. It is
392possible to send a message (either synchronously or asynchronously) to
393devfsd on any event, such as registration/unregistration of device
394entries, opening and closing devices, looking up inodes, scanning
395directories and more. This has many possibilities. Some of these are
396already implemented. See:
397
398
399http://www.atnf.csiro.au/~rgooch/linux/
400
401Device entry registration events can be used by devfsd to change
402permissions of newly-created device nodes. This is one mechanism to
403control device permissions.
404
405Device entry registration/unregistration events can be used to run
406programmes or scripts. This can be used to provide automatic mounting
407of filesystems when a new block device media is inserted into the
408drive.
409
410Asynchronous device open and close events can be used to implement
411clever permissions management. For example, the default permissions on
412/dev/dsp do not allow everybody to read from the device. This is
413sensible, as you don't want some remote user recording what you say at
414your console. However, the console user is also prevented from
415recording. This behaviour is not desirable. With asynchronous device
416open and close events, you can have devfsd run a programme or script
417when console devices are opened to change the ownerships for *other*
418device nodes (such as /dev/dsp). On closure, you can run a different
419script to restore permissions. An advantage of this scheme over
420modifying the C library tty handling is that this works even if your
421programme crashes (how many times have you seen the utmp database with
422lingering entries for non-existent logins?).
423
424Synchronous device open events can be used to perform intelligent
425device access protections. Before the device driver open() method is
426called, the daemon must first validate the open attempt, by running an
427external programme or script. This is far more flexible than access
428control lists, as access can be determined on the basis of other
429system conditions instead of just the UID and GID.
430
431Inode lookup events can be used to authenticate module autoload
432requests. Instead of using kmod directly, the event is sent to
433devfsd which can implement an arbitrary authentication before loading
434the module itself.
435
436Inode lookup events can also be used to construct arbitrary
437namespaces, without having to resort to populating devfs with symlinks
438to devices that don't exist.
439
440Speculative Device Scanning
441
442Consider an application (like cdparanoia) that wants to find all
443CD-ROM devices on the system (SCSI, IDE and other types), whether or
444not their respective modules are loaded. The application must
445speculatively open certain device nodes (such as /dev/sr0 for the SCSI
446CD-ROMs) in order to make sure the module is loaded. This requires
447that all Linux distributions follow the standard device naming scheme
448(last time I looked RedHat did things differently). Devfs solves the
449naming problem.
450
451The same application also wants to see which devices are actually
452available on the system. With the existing system it needs to read the
453/dev directory and speculatively open each /dev/sr* device to
454determine if the device exists or not. With a large /dev this is an
455inefficient operation, especially if there are many /dev/sr* nodes. A
456solution like scsidev could reduce the number of /dev/sr* entries (but
457of course that also requires all that inefficient directory scanning).
458
459With devfs, the application can open the /dev/sr directory
460(which triggers the module autoloading if required), and proceed to
461read /dev/sr. Since only the available devices will have
entries, there are no inefficiencies in directory scanning or device
463openings.
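
As an illustration of the devfs approach, an application (or a script)
only has to read one directory. The sketch below is a hypothetical
helper, not part of devfs or devfsd; it simply lists whatever entries
exist in the directory it is given.

```shell
#!/bin/sh
# List the entries present in a device directory such as /dev/sr.
# With devfs, only devices that are actually available have entries.

# list_devices DIR -> print one path per entry found in DIR
list_devices() {
    dir=$1
    [ -d "$dir" ] || return 1          # no directory: no such devices
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue    # an unmatched glob stays literal; skip it
        printf '%s\n' "$entry"
    done
}

list_devices /dev/sr || echo "/dev/sr: no such directory" >&2
```

No speculative open of each node is needed: an empty listing means
there are no such devices.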
464
465-----------------------------------------------------------------------------
466
467Who else does it?
468
469FreeBSD has a devfs implementation. Solaris and AIX each have a
470pseudo-devfs (something akin to scsidev but for all devices, with some
471unspecified kernel support). BeOS, Plan9 and QNX also have it. SGI's
472IRIX 6.4 and above also have a device filesystem.
473
474While we shouldn't just automatically do something because others do
475it, we should not ignore the work of others either. FreeBSD has a lot
476of competent people working on it, so their opinion should not be
477blithely ignored.
478
479-----------------------------------------------------------------------------
480
481
482How it works
483
484Registering device entries
485
486For every entry (device node) in a devfs-based /dev a driver must call
487devfs_register(). This adds the name of the device entry, the
488file_operations structure pointer and a few other things to an
489internal table. Device entries may be added and removed at any
time. When a device entry is registered, it automagically appears in
every mounted devfs instance.
492
493Inode lookup
494
When a lookup operation on an entry is performed and there is no
driver information for that entry, devfs will attempt to call
devfsd. If still no driver information can be found, then a negative
498dentry is yielded and the next stage operation will be called by the
499VFS (such as create() or mknod() inode methods). If driver information
500can be found, an inode is created (if one does not exist already) and
501all is well.
502
503Manually creating device nodes
504
505The mknod() method allows you to create an ordinary named pipe in the
506devfs, or you can create a character or block special inode if one
507does not already exist. You may wish to create a character or block
508special inode so that you can set permissions and ownership. Later, if
509a device driver registers an entry with the same name, the
510permissions, ownership and times are retained. This is how you can set
511the protections on a device even before the driver is loaded. Once you
512create an inode it appears in the directory listing.
513
514Unregistering device entries
515
516A device driver calls devfs_unregister() to unregister an entry.
517
518Chroot() gaols
519
5202.2.x kernels
521
522The semantics of inode creation are different when devfs is mounted
523with the "explicit" option. Now, when a device entry is registered, it
524will not appear until you use mknod() to create the device. It doesn't
525matter if you mknod() before or after the device is registered with
526devfs_register(). The purpose of this behaviour is to support
527chroot(2) gaols, where you want to mount a minimal devfs inside the
528gaol. Only the devices you specifically want to be available (through
529your mknod() setup) will be accessible.
530
5312.4.x kernels
532
533As of kernel 2.3.99, the VFS has had the ability to rebind parts of
534the global filesystem namespace into another part of the namespace.
535This now works even at the leaf-node level, which means that
536individual files and device nodes may be bound into other parts of the
537namespace. This is like making links, but better, because it works
538across filesystems (unlike hard links) and works through chroot()
539gaols (unlike symbolic links).
540
541Because of these improvements to the VFS, the multi-mount capability
542in devfs is no longer needed. The administrator may create a minimal
543device tree inside a chroot(2) gaol by using VFS bindings. As this
544provides most of the features of the devfs multi-mount capability, I
545removed the multi-mount support code (after issuing an RFC). This
546yielded code size reductions and simplifications.
547
548If you want to construct a minimal chroot() gaol, the following
549command should suffice:
550
551mount --bind /dev/null /gaol/dev/null
552
553
554Repeat for other device nodes you want to expose. Simple!
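
For several device nodes, the bind commands can be generated with a
small loop. The sketch below only prints the commands rather than
running them, since they need root. The /gaol path and the list of
nodes are examples, and the helper name is invented. Note that the
target of a bind mount must already exist, hence the touch.

```shell
#!/bin/sh
# Print the commands needed to expose selected device nodes in a gaol.

# gaol_bind_cmds GAOL NODE... -> print the commands, one per line
gaol_bind_cmds() {
    gaol=$1
    shift
    echo "mkdir -p $gaol/dev"
    for node in "$@"; do
        echo "touch $gaol/dev/$node"                     # bind target must exist
        echo "mount --bind /dev/$node $gaol/dev/$node"
    done
}

gaol_bind_cmds /gaol null zero tty
```

After reviewing the output, it could be piped to sh as root.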
555
556-----------------------------------------------------------------------------
557
558
559Operational issues
560
561
562Instructions for the impatient
563
564Nobody likes reading documentation. People just want to get in there
565and play. So this section tells you quickly the steps you need to take
566to run with devfs mounted over /dev. Skip these steps and you will end
567up with a nearly unbootable system. Subsequent sections describe the
568issues in more detail, and discuss non-essential configuration
569options.
570
571Devfsd
572OK, if you're reading this, I assume you want to play with
573devfs. First you should ensure that /usr/src/linux contains a
574recent kernel source tree. Then you need to compile devfsd, the device
575management daemon, available at
576
http://www.atnf.csiro.au/~rgooch/linux/.

Because the kernel has a naming scheme
579which is quite different from the old naming scheme, you need to
580install devfsd so that software and configuration files that use the
581old naming scheme will not break.
582
583Compile and install devfsd. You will be provided with a default
584configuration file /etc/devfsd.conf which will provide
585compatibility symlinks for the old naming scheme. Don't change this
586config file unless you know what you're doing. Even if you think you
587do know what you're doing, don't change it until you've followed all
588the steps below and booted a devfs-enabled system and verified that it
589works.
590
591Now edit your main system boot script so that devfsd is started at the
592very beginning (before any filesystem
593checks). /etc/rc.d/rc.sysinit is often the main boot script
594on systems with SysV-style boot scripts. On systems with BSD-style
595boot scripts it is often /etc/rc. Also check
596/sbin/rc.
597
598NOTE that the line you put into the boot
599script should be exactly:
600
601/sbin/devfsd /dev
602
603DO NOT use some special daemon-launching
604programme, otherwise the boot script may not wait for devfsd to finish
605initialising.
606
607System Libraries
608There may still be some problems because of broken software making
609assumptions about device names. In particular, some software does not
610handle devices which are symbolic links. If you are running a libc 5
611based system, install libc 5.4.44 (if you have libc 5.4.46, go back to
612libc 5.4.44, which is actually correct). If you are running a glibc
613based system, make sure you have glibc 2.1.3 or later.
614
615/etc/securetty
616PAM (Pluggable Authentication Modules) is supposed to be a flexible
617mechanism for providing better user authentication and access to
618services. Unfortunately, it's also fragile, complex and undocumented
619(check out RedHat 6.1, and probably other distributions as well). PAM
620has problems with symbolic links. Append the following lines to your
621/etc/securetty file:
622
623vc/1
624vc/2
625vc/3
626vc/4
627vc/5
628vc/6
629vc/7
630vc/8
631
632This will not weaken security. If you have a version of util-linux
633earlier than 2.10.h, please upgrade to 2.10.h or later. If you
634absolutely cannot upgrade, then also append the following lines to
635your /etc/securetty file:
636
6371
6382
6393
6404
6415
6426
6437
6448
645
646This may potentially weaken security by allowing root logins over the
647network (a password is still required, though). However, since there
648are problems with dealing with symlinks, I'm suspicious of the level
649of security offered in any case.
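
The vc/N entries can be appended with a short script instead of by
hand. This is a minimal sketch with a made-up function name; it only
adds lines that are not already present, so it is safe to run more than
once. Try it on a copy of the file first.

```shell
#!/bin/sh
# Append vc/1 .. vc/COUNT to a securetty-style file, skipping
# entries that are already there.

# add_vc_entries FILE [COUNT]
add_vc_entries() {
    file=$1
    count=${2:-8}
    n=1
    while [ "$n" -le "$count" ]; do
        # -x: match the whole line, -F: fixed string, -q: quiet
        grep -qxF "vc/$n" "$file" 2>/dev/null || echo "vc/$n" >> "$file"
        n=$(( n + 1 ))
    done
}

# Usage (as root): add_vc_entries /etc/securetty
```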
650
651XFree86
652While not essential, it's probably a good idea to upgrade to XFree86
6534.0, as patches went in to make it more devfs-friendly. If you don't,
654you'll probably need to apply the following patch to
655/etc/security/console.perms so that ordinary users can run
656startx. Note that not all distributions have this file (e.g. Debian),
657so if it's not present, don't worry about it.
658
659--- /etc/security/console.perms.orig Sat Apr 17 16:26:47 1999
660+++ /etc/security/console.perms Fri Feb 25 23:53:55 2000
661@@ -14,7 +14,7 @@
662 # man 5 console.perms
663
664 # file classes -- these are regular expressions
665-<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
666+<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
667
668 # device classes -- these are shell-style globs
669 <floppy>=/dev/fd[0-1]*
670
If the patch does not apply, then replace the line:
672
673<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
674
675with:
676
677<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
678
679
680Disable devpts
681I've had a report of devpts mounted on /dev/pts not working
682correctly. Since devfs will also manage /dev/pts, there is no
683need to mount devpts as well. You should either edit your
684/etc/fstab so devpts is not mounted, or disable devpts from
685your kernel configuration.
686
687Unsupported drivers
688Not all drivers have devfs support. If you depend on one of these
689drivers, you will need to create a script or tarfile that you can use
at boot time to create device nodes as appropriate. The section
"Dealing with drivers without devfs support" describes this, and the
section "Device drivers currently ported" lists the drivers which have
devfs support.
694
695/dev/mouse
696
Many distributions configure /dev/mouse to be the mouse device
698for XFree86 and GPM. I actually think this is a bad idea, because it
699adds another level of indirection. When looking at a config file, if
700you see /dev/mouse you're left wondering which mouse
701is being referred to. Hence I recommend putting the actual mouse
702device (for example /dev/psaux) into your
703/etc/X11/XF86Config file (and similarly for the GPM
704configuration file).
705
706Alternatively, use the same technique used for unsupported drivers
707described above.
708
709The Kernel
710Finally, you need to make sure devfs is compiled into your kernel. Set
CONFIG_EXPERIMENTAL=y, CONFIG_DEVFS_FS=y and CONFIG_DEVFS_MOUNT=y
using your favourite configuration tool (e.g. make config or
make xconfig), then run make dep; make clean and
recompile your kernel and modules. At boot, devfs will be mounted onto
715/dev.
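
Before rebuilding, you may want to confirm that all three options
really are set. The following is a sketch with an invented helper name;
it assumes the configuration tool has written its usual .config file at
the top of the kernel source tree.

```shell
#!/bin/sh
# Check that the three options named in the text are set to y.

# check_devfs_config FILE -> returns 0 only if all three are set
check_devfs_config() {
    cfg=$1
    status=0
    for opt in CONFIG_EXPERIMENTAL CONFIG_DEVFS_FS CONFIG_DEVFS_MOUNT; do
        if grep -qxF "$opt=y" "$cfg" 2>/dev/null; then
            echo "$opt=y"
        else
            echo "$opt is not set to y" >&2
            status=1
        fi
    done
    return $status
}

# Usage: check_devfs_config /usr/src/linux/.config
```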
716
717If you encounter problems booting (for example if you forgot a
718configuration step), you can pass devfs=nomount at the kernel
719boot command line. This will prevent the kernel from mounting devfs at
720boot time onto /dev.
721
722In general, a kernel built with CONFIG_DEVFS_FS=y but without mounting
723devfs onto /dev is completely safe, and requires no
724configuration changes. One exception to take note of is when
725LABEL= directives are used in /etc/fstab. In this
726case you will be unable to boot properly. This is because the
727mount(8) programme uses /proc/partitions as part of
728the volume label search process, and the device names it finds are not
729available, because setting CONFIG_DEVFS_FS=y changes the names in
730/proc/partitions, irrespective of whether devfs is mounted.
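
To find out whether this exception affects you, check /etc/fstab for
LABEL= directives before booting the new kernel. A minimal sketch (the
function name is made up for this example):

```shell
#!/bin/sh
# Print fstab lines whose first field starts with LABEL=.
# Commented-out lines are not matched.

# fstab_label_lines FILE -> matching lines; non-zero status if none
fstab_label_lines() {
    grep -E '^[[:space:]]*LABEL=' "$1"
}

# Usage: fstab_label_lines /etc/fstab
```

If this prints anything, those entries would need to name the device
directly instead of using a label.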
731
732Now you've finished all the steps required. You're now ready to boot
733your shiny new kernel. Enjoy.
734
735Changing the configuration
736
737OK, you've now booted a devfs-enabled system, and everything works.
738Now you may feel like changing the configuration (common targets are
739/etc/fstab and /etc/devfsd.conf). Since you have a
740system that works, if you make any changes and it doesn't work, you
741now know that you only have to restore your configuration files to the
742default and it will work again.
743
744
745Permissions persistence across reboots
746
747If you don't use mknod(2) to create a device file, nor use chmod(2) or
748chown(2) to change the ownerships/permissions, the inode ctime will
749remain at 0 (the epoch, 12 am, 1-JAN-1970, GMT). Anything with a ctime
later than this has had its ownership/permissions changed. Hence, a
751simple script or programme may be used to tar up all changed inodes,
752prior to shutdown. Although effective, many consider this approach a
753kludge.
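
The ctime test described above might be sketched as follows. This is
an illustration rather than a finished tool: it assumes GNU stat (for
the -c %Z format, which prints the ctime in seconds since the epoch),
and the function name list_changed_nodes is hypothetical.

```shell
# list_changed_nodes DIR
# Print every entry under DIR whose inode ctime is later than the
# epoch, i.e. whose ownership or permissions have been changed.
# Assumes GNU stat; "%Z" is the ctime in seconds since the epoch.
list_changed_nodes() {
    find "$1" | while IFS= read -r entry; do
        ctime=$(stat -c %Z "$entry") || continue
        if [ "$ctime" -gt 0 ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

The resulting list could then be handed to tar prior to shutdown. On
an ordinary disc-based filesystem every entry will be listed, since
all inodes there have a ctime later than the epoch.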
754
755A much better approach is to use devfsd to save and restore
756permissions. It may be configured to record changes in permissions and
757will save them in a database (in fact a directory tree), and restore
758these upon boot. This is an efficient method and results in immediate
759saving of current permissions (unlike the tar approach, which saves
760permissions at some unspecified future time).
761
762The default configuration file supplied with devfsd has config entries
763which you may uncomment to enable persistence management.
764
765If you decide to use the tar approach anyway, be aware that tar will
766first unlink(2) an inode before creating a new device node. The
767unlink(2) has the effect of breaking the connection between a devfs
768entry and the device driver. If you use the "devfs=only" boot option,
769you lose access to the device driver, requiring you to reload the
770module. I consider this a bug in tar (there is no real need to
771unlink(2) the inode first).
772
773Alternatively, you can use devfsd to provide more sophisticated
774management of device permissions. You can use devfsd to store
775permissions for whole groups of devices with a single configuration
entry, rather than the conventional single entry per device.
777
778Permissions database stored in mounted-over /dev
779
780If you wish to save and restore your device permissions into the
781disc-based /dev while still mounting devfs onto /dev
782you may do so. This requires a 2.4.x kernel (in fact, 2.3.99 or
783later), which has the VFS binding facility. You need to do the
784following to set this up:
785
786
787
788make sure the kernel does not mount devfs at boot time
789
790
791make sure you have a correct /dev/console entry in your
792root file-system (where your disc-based /dev lives)
793
794create the /dev-state directory
795
796
797add the following lines near the very beginning of your boot
798scripts:
799
800mount --bind /dev /dev-state
801mount -t devfs none /dev
802devfsd /dev
803
804
805
806
807add the following lines to your /etc/devfsd.conf file:
808
809REGISTER ^pt[sy] IGNORE
810CREATE ^pt[sy] IGNORE
811CHANGE ^pt[sy] IGNORE
812DELETE ^pt[sy] IGNORE
813REGISTER .* COPY /dev-state/$devname $devpath
814CREATE .* COPY $devpath /dev-state/$devname
815CHANGE .* COPY $devpath /dev-state/$devname
816DELETE .* CFUNCTION GLOBAL unlink /dev-state/$devname
817RESTORE /dev-state
818
819Note that the sample devfsd.conf file contains these lines,
820as well as other sample configurations you may find useful. See the
devfsd distribution.
822
823
824reboot.
825
826
827
828
829Permissions database stored in normal directory
830
831If you are using an older kernel which doesn't support VFS binding,
832then you won't be able to have the permissions database in a
833mounted-over /dev. However, you can still use a regular
834directory to store the database. The sample /etc/devfsd.conf
835file above may still be used. You will need to create the
836/dev-state directory prior to installing devfsd. If you have
837old permissions in /dev, then just copy (or move) the device
838nodes over to the new directory.
839
840Which method is better?
841
842The best method is to have the permissions database stored in the
843mounted-over /dev. This is because you will not need to copy
844device nodes over to /dev-state, and because it allows you to
845switch between devfs and non-devfs kernels, without requiring you to
846copy permissions between /dev-state (for devfs) and
847/dev (for non-devfs).
848
849
850Dealing with drivers without devfs support
851
852Currently, not all device drivers in the kernel have been modified to
853use devfs. Device drivers which do not yet have devfs support will not
854automagically appear in devfs. The simplest way to create device nodes
855for these drivers is to unpack a tarfile containing the required
856device nodes. You can do this in your boot scripts. All your drivers
857will now work as before.
858
859Hopefully for most people devfs will have enough support so that they
860can mount devfs directly over /dev without losing most functionality
861(i.e. losing access to various devices). As of 22-JAN-1998 (devfs
862patch version 10) I am now running this way. All the devices I have
863are available in devfs, so I don't lose anything.
864
865WARNING: if your configuration requires the old-style device names
(e.g. /dev/hda1 or /dev/sda1), you must install devfsd and configure
867it to maintain compatibility entries. It is almost certain that you
868will require this. Note that the kernel creates a compatibility entry
869for the root device, so you don't need initrd.
870
871Note that you no longer need to mount devpts if you use Unix98 PTYs,
872as devfs can manage /dev/pts itself. This saves you some RAM, as you
873don't need to compile and install devpts. Note that some versions of
874glibc have a bug with Unix98 pty handling on devfs systems. Contact
875the glibc maintainers for a fix. Glibc 2.1.3 has the fix.
876
877Note also that apart from editing /etc/fstab, other things will need
878to be changed if you *don't* install devfsd. Some software (like the X
server) hard-wires device names in its source. It really is much
880easier to install devfsd so that compatibility entries are created.
881You can then slowly migrate your system to using the new device names
882(for example, by starting with /etc/fstab), and then limiting the
883compatibility entries that devfsd creates.
884
885IF YOU CONFIGURE TO MOUNT DEVFS AT BOOT, MAKE SURE YOU INSTALL DEVFSD
886BEFORE YOU BOOT A DEVFS-ENABLED KERNEL!
887
888Now that devfs has gone into the 2.3.46 kernel, I'm getting a lot of
889reports back. Many of these are because people are trying to run
890without devfsd, and hence some things break. Please just run devfsd if
891things break. I want to concentrate on real bugs rather than
892misconfiguration problems at the moment. If people are willing to fix
bugs/false assumptions in other code (e.g. glibc, X server) and submit
894that to the respective maintainers, that would be great.
895
896
897All the way with Devfs
898
899The devfs kernel patch creates a rationalised device tree. As stated
900above, if you want to keep using the old /dev naming scheme,
you just need to configure devfsd appropriately (see the man
902page). People who prefer the old names can ignore this section. For
903those of us who like the rationalised names and an uncluttered
904/dev, read on.
905
906If you don't run devfsd, or don't enable compatibility entry
907management, then you will have to configure your system to use the new
908names. For example, you will then need to edit your
909/etc/fstab to use the new disc naming scheme. If you want to
910be able to boot non-devfs kernels, you will need compatibility
911symlinks in the underlying disc-based /dev pointing back to
912the old-style names for when you boot a kernel without devfs.
913
914You can selectively decide which devices you want compatibility
915entries for. For example, you may only want compatibility entries for
BSD pseudo-terminal devices (otherwise you'll have to patch your C
917library or use Unix98 ptys instead). It's just a matter of putting in
the correct regular expression into /etc/devfsd.conf.
919
920There are other choices of naming schemes that you may prefer. For
921example, I don't use the kernel-supplied
922names, because they are too verbose. A common misconception is
923that the kernel-supplied names are meant to be used directly in
924configuration files. This is not the case. They are designed to
925reflect the layout of the devices attached and to provide easy
926classification.
927
928If you like the kernel-supplied names, that's fine. If you don't then
929you should be using devfsd to construct a namespace more to your
930liking. Devfsd has built-in code to construct a
931namespace that is both logical and easy to
932manage. In essence, it creates a convenient abbreviation of the
933kernel-supplied namespace.
934
935You are of course free to build your own namespace. Devfsd has all the
936infrastructure required to make this easy for you. All you need do is
937write a script. You can even write some C code and devfsd can load the
938shared object as a callable extension.
939
940
941Other Issues
942
943The init programme
944Another thing to take note of is whether your init programme
945creates a Unix socket /dev/telinit. Some versions of init
946create /dev/telinit so that the telinit programme can
947communicate with the init process. If you have such a system you need
948to make sure that devfs is mounted over /dev *before* init
949starts. In other words, you can't leave the mounting of devfs to
950/etc/rc, since this is executed after init. Other
951versions of init require a named pipe /dev/initctl
952which must exist *before* init starts. Once again, you need to
953mount devfs and then create the named pipe *before* init
954starts.
955
956The default behaviour now is not to mount devfs onto /dev at
957boot time for 2.3.x and later kernels. You can correct this with the
958"devfs=mount" boot option. This solves any problems with init,
959and also prevents the dreaded:
960
961Cannot open initial console
962
963message. For 2.2.x kernels where you need to apply the devfs patch,
964the default is to mount.
965
966If you have automatic mounting of devfs onto /dev then you
967may need to create /dev/initctl in your boot scripts. The
968following lines should suffice:
969
970mknod /dev/initctl p
971kill -SIGUSR1 1 # tell init that /dev/initctl now exists
972
Alternatively, if you don't want the kernel to mount devfs onto
/dev then you could use the following procedure as a
guideline for how to get around /dev/initctl problems:
976
977# cd /sbin
978# mv init init.real
979# cat > init
980#! /bin/sh
981mount -n -t devfs none /dev
982mknod /dev/initctl p
exec /sbin/init.real "$@"
984[control-D]
985# chmod a+x init
986
987Note that newer versions of init create /dev/initctl
988automatically, so you don't have to worry about this.
989
990Module autoloading
991You will need to configure devfsd to enable module
992autoloading. The following lines should be placed in your
993/etc/devfsd.conf file:
994
995LOOKUP .* MODLOAD
996
997
998As of devfsd-v1.3.10, a generic /etc/modules.devfs
999configuration file is installed, which is used by the MODLOAD
1000action. This should be sufficient for most configurations. If you
1001require further configuration, edit your /etc/modules.conf
file. The way module autoloading works with devfs is:
1003
1004
1005a process attempts to lookup a device node (e.g. /dev/fred)
1006
1007
1008if that device node does not exist, the full pathname is passed to
1009devfsd as a string
1010
1011
1012devfsd will pass the string to the modprobe programme (provided the
1013configuration line shown above is present), and specifies that
1014/etc/modules.devfs is the configuration file
1015
1016
1017/etc/modules.devfs includes /etc/modules.conf to
1018access local configurations
1019
modprobe will search its configuration files, looking for an alias
1021that translates the pathname into a module name
1022
1023
1024the translated pathname is then used to load the module.
1025
1026
1027If you wanted a lookup of /dev/fred to load the
1028mymod module, you would require the following configuration
1029line in /etc/modules.conf:
1030
1031alias /dev/fred mymod
1032
1033The /etc/modules.devfs configuration file provides many such
1034aliases for standard device names. If you look closely at this file,
1035you will note that some modules require multiple alias configuration
1036lines. This is required to support module autoloading for old and new
1037device names.
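
To make the alias translation step concrete, here is a minimal sketch
of the lookup. It only handles exact matches on lines of the form
"alias <pathname> <module>"; the real modprobe does considerably more.
The function name lookup_alias is hypothetical.

```shell
# lookup_alias CONF PATHNAME
# Print the module named by the first "alias PATHNAME MODULE" line in
# CONF. Returns non-zero if there is no such line.
lookup_alias() {
    awk -v path="$2" '
        $1 == "alias" && $2 == path { print $3; found = 1; exit }
        END { exit !found }
    ' "$1"
}
```

With the alias line shown earlier in /etc/modules.conf, calling
lookup_alias /etc/modules.conf /dev/fred would print mymod.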
1038
1039Mounting root off a devfs device
1040If you wish to mount root off a devfs device when you pass the
1041"devfs=only" boot option, then you need to pass in the
1042"root=<device>" option to the kernel when booting. If you use
1043LILO, then you must have this in lilo.conf:
1044
1045append = "root=<device>"
1046
1047Surprised? Yep, so was I. It turns out if you have (as most people
1048do):
1049
1050root = <device>
1051
1052
1053then LILO will determine the device number of <device> and will
1054write that device number into a special place in the kernel image
1055before starting the kernel, and the kernel will use that device number
1056to mount the root filesystem. So, using the "append" variety ensures
1057that LILO passes the root filesystem device as a string, which devfs
1058can then use.
1059
1060Note that this isn't an issue if you don't pass "devfs=only".
1061
1062TTY issues
1063The ttyname(3) function in some versions of the C library makes
1064false assumptions about device entries which are symbolic links. The
1065tty(1) programme is one that depends on this function. I've
1066written a patch to libc 5.4.43 which fixes this. This has been
1067included in libc 5.4.44 and a similar fix is in glibc 2.1.3.
1068
1069
1070Kernel Naming Scheme
1071
1072The kernel provides a default naming scheme. This scheme is designed
1073to make it easy to search for specific devices or device types, and to
1074view the available devices. Some device types (such as hard discs),
1075have a directory of entries, making it easy to see what devices of
1076that class are available. Often, the entries are symbolic links into a
1077directory tree that reflects the topology of available devices. The
1078topological tree is useful for finding how your devices are arranged.
1079
1080Below is a list of the naming schemes for the most common drivers. A
1081list of reserved device names is
1082available for reference. Please send email to
1083rgooch@atnf.csiro.au to obtain an allocation. Please be
1084patient (the maintainer is busy). An alternative name may be allocated
1085instead of the requested name, at the discretion of the maintainer.
1086
1087Disc Devices
1088
1089All discs, whether SCSI, IDE or whatever, are placed under the
1090/dev/discs hierarchy:
1091
1092 /dev/discs/disc0 first disc
1093 /dev/discs/disc1 second disc
1094
1095
1096Each of these entries is a symbolic link to the directory for that
1097device. The device directory contains:
1098
1099 disc for the whole disc
1100 part* for individual partitions
1101
1102
1103CD-ROM Devices
1104
1105All CD-ROMs, whether SCSI, IDE or whatever, are placed under the
1106/dev/cdroms hierarchy:
1107
1108 /dev/cdroms/cdrom0 first CD-ROM
1109 /dev/cdroms/cdrom1 second CD-ROM
1110
1111
1112Each of these entries is a symbolic link to the real device entry for
1113that device.
1114
1115Tape Devices
1116
1117All tapes, whether SCSI, IDE or whatever, are placed under the
1118/dev/tapes hierarchy:
1119
1120 /dev/tapes/tape0 first tape
1121 /dev/tapes/tape1 second tape
1122
1123
1124Each of these entries is a symbolic link to the directory for that
1125device. The device directory contains:
1126
1127 mt for mode 0
1128 mtl for mode 1
1129 mtm for mode 2
1130 mta for mode 3
1131 mtn for mode 0, no rewind
1132 mtln for mode 1, no rewind
1133 mtmn for mode 2, no rewind
1134 mtan for mode 3, no rewind
1135
1136
1137SCSI Devices
1138
1139To uniquely identify any SCSI device requires the following
1140information:
1141
1142 controller (host adapter)
1143 bus (SCSI channel)
1144 target (SCSI ID)
1145 unit (Logical Unit Number)
1146
1147
1148All SCSI devices are placed under /dev/scsi (assuming devfs
1149is mounted on /dev). Hence, a SCSI device with the following
1150parameters: c=1,b=2,t=3,u=4 would appear as:
1151
1152 /dev/scsi/host1/bus2/target3/lun4 device directory
1153
1154
1155Inside this directory, a number of device entries may be created,
1156depending on which SCSI device-type drivers were installed.
1157
1158See the section on the disc naming scheme to see what entries the SCSI
1159disc driver creates.
1160
1161See the section on the tape naming scheme to see what entries the SCSI
1162tape driver creates.
1163
1164The SCSI CD-ROM driver creates:
1165
1166 cd
1167
1168
1169The SCSI generic driver creates:
1170
1171 generic
1172
1173
1174IDE Devices
1175
1176To uniquely identify any IDE device requires the following
1177information:
1178
1179 controller
1180 bus (aka. primary/secondary)
1181 target (aka. master/slave)
1182 unit
1183
1184
All IDE devices are placed under /dev/ide, and use a similar
1186naming scheme to the SCSI subsystem.
1187
1188XT Hard Discs
1189
1190All XT discs are placed under /dev/xd. The first XT disc has
1191the directory /dev/xd/disc0.
1192
1193TTY devices
1194
1195The tty devices now appear as:
1196
1197 New name Old-name Device Type
1198 -------- -------- -----------
1199 /dev/tts/{0,1,...} /dev/ttyS{0,1,...} Serial ports
1200 /dev/cua/{0,1,...} /dev/cua{0,1,...} Call out devices
1201 /dev/vc/0 /dev/tty Current virtual console
1202 /dev/vc/{1,2,...} /dev/tty{1...63} Virtual consoles
1203 /dev/vcc/{0,1,...} /dev/vcs{1...63} Virtual consoles
1204 /dev/pty/m{0,1,...} /dev/ptyp?? PTY masters
1205 /dev/pty/s{0,1,...} /dev/ttyp?? PTY slaves
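
The numbered rows of the table can be expressed as a small
translation function, which may be handy when converting old
configuration files. This is a sketch only: it deliberately leaves out
the pseudo-terminal rows and the unnumbered /dev/tty entry, and the
function name old_to_new_tty is hypothetical.

```shell
# old_to_new_tty OLDNAME
# Print the new-style name for an old-style serial port, call out
# device, virtual console or virtual console capture device.
# Returns non-zero for names it does not cover.
old_to_new_tty() {
    case $1 in
        /dev/ttyS[0-9]*) echo "/dev/tts/${1#/dev/ttyS}" ;;
        /dev/cua[0-9]*)  echo "/dev/cua/${1#/dev/cua}" ;;
        /dev/tty[1-9]*)  echo "/dev/vc/${1#/dev/tty}" ;;
        /dev/vcs[1-9]*)  echo "/dev/vcc/${1#/dev/vcs}" ;;
        *)               return 1 ;;
    esac
}
```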
1206
1207
1208RAMDISCS
1209
1210The RAMDISCS are placed in their own directory, and are named thus:
1211
1212 /dev/rd/{0,1,2,...}
1213
1214
1215Meta Devices
1216
1217The meta devices are placed in their own directory, and are named
1218thus:
1219
1220 /dev/md/{0,1,2,...}
1221
1222
1223Floppy discs
1224
1225Floppy discs are placed in the /dev/floppy directory.
1226
1227Loop devices
1228
1229Loop devices are placed in the /dev/loop directory.
1230
1231Sound devices
1232
1233Sound devices are placed in the /dev/sound directory
1234(audio, sequencer, ...).
1235
1236
1237Devfsd Naming Scheme
1238
1239Devfsd provides a naming scheme which is a convenient abbreviation of
1240the kernel-supplied namespace. In some
1241cases, the kernel-supplied naming scheme is quite convenient, so
1242devfsd does not provide another naming scheme. The convenience names
1243that devfsd creates are in fact the same names as the original devfs
1244kernel patch created (before Linus mandated the Big Name
1245Change). These are referred to as "new compatibility entries".
1246
1247In order to configure devfsd to create these convenience names, the
1248following lines should be placed in your /etc/devfsd.conf:
1249
1250REGISTER .* MKNEWCOMPAT
1251UNREGISTER .* RMNEWCOMPAT
1252
1253This will cause devfsd to create (and destroy) symbolic links which
1254point to the kernel-supplied names.
1255
1256SCSI Hard Discs
1257
1258All SCSI discs are placed under /dev/sd (assuming devfs is
1259mounted on /dev). Hence, a SCSI disc with the following
1260parameters: c=1,b=2,t=3,u=4 would appear as:
1261
1262 /dev/sd/c1b2t3u4 for the whole disc
1263 /dev/sd/c1b2t3u4p5 for the 5th partition
1264 /dev/sd/c1b2t3u4p5s6 for the 6th slice in the 5th partition
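
Since these convenience names are abbreviations of the kernel-supplied
names, the correspondence can be written down mechanically. The sketch
below covers whole discs and partitions only (slices are left out,
because the kernel naming section above does not describe them). The
function name sd_to_kernel_name is hypothetical.

```shell
# sd_to_kernel_name NAME
# Translate a devfsd-style SCSI disc name into the kernel-supplied
# name. Prints nothing for names it does not recognise.
sd_to_kernel_name() {
    printf '%s\n' "$1" | sed -n \
        -e 's|^/dev/sd/c\([0-9][0-9]*\)b\([0-9][0-9]*\)t\([0-9][0-9]*\)u\([0-9][0-9]*\)$|/dev/scsi/host\1/bus\2/target\3/lun\4/disc|p' \
        -e 's|^/dev/sd/c\([0-9][0-9]*\)b\([0-9][0-9]*\)t\([0-9][0-9]*\)u\([0-9][0-9]*\)p\([0-9][0-9]*\)$|/dev/scsi/host\1/bus\2/target\3/lun\4/part\5|p'
}
```

For example, /dev/sd/c1b2t3u4p5 translates to
/dev/scsi/host1/bus2/target3/lun4/part5.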
1265
1266
1267SCSI Tapes
1268
1269All SCSI tapes are placed under /dev/st. A similar naming
1270scheme is used as for SCSI discs. A SCSI tape with the
parameters: c=1,b=2,t=3,u=4 would appear as:
1272
1273 /dev/st/c1b2t3u4m0 for mode 0
1274 /dev/st/c1b2t3u4m1 for mode 1
1275 /dev/st/c1b2t3u4m2 for mode 2
1276 /dev/st/c1b2t3u4m3 for mode 3
1277 /dev/st/c1b2t3u4m0n for mode 0, no rewind
1278 /dev/st/c1b2t3u4m1n for mode 1, no rewind
1279 /dev/st/c1b2t3u4m2n for mode 2, no rewind
1280 /dev/st/c1b2t3u4m3n for mode 3, no rewind
1281
1282
1283SCSI CD-ROMs
1284
1285All SCSI CD-ROMs are placed under /dev/sr. A similar naming
1286scheme is used as for SCSI discs. A SCSI CD-ROM with the
parameters: c=1,b=2,t=3,u=4 would appear as:
1288
1289 /dev/sr/c1b2t3u4
1290
1291
1292SCSI Generic Devices
1293
The generic (aka. raw) interfaces for all SCSI devices are placed under
/dev/sg. A similar naming scheme is used as for SCSI discs. A
SCSI generic device with the parameters: c=1,b=2,t=3,u=4 would appear
as:
1298
1299 /dev/sg/c1b2t3u4
1300
1301
1302IDE Hard Discs
1303
1304All IDE discs are placed under /dev/ide/hd, using a similar
1305convention to SCSI discs. The following mappings exist between the new
1306and the old names:
1307
1308 /dev/hda /dev/ide/hd/c0b0t0u0
1309 /dev/hdb /dev/ide/hd/c0b0t1u0
1310 /dev/hdc /dev/ide/hd/c0b1t0u0
1311 /dev/hdd /dev/ide/hd/c0b1t1u0
1312
1313
1314IDE Tapes
1315
1316A similar naming scheme is used as for IDE discs. The entries will
1317appear in the /dev/ide/mt directory.
1318
1319IDE CD-ROM
1320
1321A similar naming scheme is used as for IDE discs. The entries will
1322appear in the /dev/ide/cd directory.
1323
1324IDE Floppies
1325
1326A similar naming scheme is used as for IDE discs. The entries will
1327appear in the /dev/ide/fd directory.
1328
1329XT Hard Discs
1330
1331All XT discs are placed under /dev/xd. The first XT disc
1332would appear as /dev/xd/c0t0.
1333
1334
1335Old Compatibility Names
1336
1337The old compatibility names are the legacy device names, such as
1338/dev/hda, /dev/sda, /dev/rtc and so on.
1339Devfsd can be configured to create compatibility symlinks so that you
1340may continue to use the old names in your configuration files and so
1341that old applications will continue to function correctly.
1342
1343In order to configure devfsd to create these legacy names, the
1344following lines should be placed in your /etc/devfsd.conf:
1345
1346REGISTER .* MKOLDCOMPAT
1347UNREGISTER .* RMOLDCOMPAT
1348
1349This will cause devfsd to create (and destroy) symbolic links which
1350point to the kernel-supplied names.
1351
1352
1353SCSI Host Probing Issues
1354
1355Devfs allows you to identify SCSI discs based in part on SCSI host
1356numbers. If you have only one SCSI host (card) in your computer, then
1357clearly it will be given host number 0. Life is not always that easy
if you have multiple SCSI hosts. Unfortunately, it can sometimes be
1359difficult to guess what the probing order of SCSI hosts is. You need
1360to know the probe order before you can use device names. To make this
1361easy, there is a kernel boot parameter called "scsihosts". This allows
1362you to specify the probe order for different types of SCSI hosts. The
1363syntax of this parameter is:
1364
1365scsihosts=<name_1>:<name_2>:<name_3>:...:<name_n>
1366
1367where <name_1>,<name_2>,...,<name_n> are the names
1368of drivers used in the /proc filesystem. For example:
1369
1370 scsihosts=aha1542:ppa:aha1542::ncr53c7xx
1371
1372
1373means that devices connected to
1374
1375- first aha1542 controller - will be /dev/scsi/host0/bus#/target#/lun#
1376- first parallel port ZIP - will be /dev/scsi/host1/bus#/target#/lun#
1377- second aha1542 controller - will be /dev/scsi/host2/bus#/target#/lun#
1378- first NCR53C7xx controller - will be /dev/scsi/host4/bus#/target#/lun#
1379- any extra controller - will be /dev/scsi/host5/bus#/target#/lun#,
1380 /dev/scsi/host6/bus#/target#/lun#, etc
- if any of the above controllers is not found - its reserved names will
  not be used by any other device.
1383- /dev/scsi/host3/bus#/target#/lun# names will never be used
1384
1385
1386You can use ',' instead of ':' as the separator character if you
wish. I have used the kernel naming scheme
here.
1389
1390Note that this scheme does not address the SCSI host order if you have
1391multiple cards of the same type (such as NCR53c8xx). In this case you
1392need to use the driver-specific boot parameters to control this.
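
To check a "scsihosts" value before putting it on the boot command
line, the host numbering it implies can be listed with a few lines of
shell. This is a sketch of the numbering rule described above, not of
what the kernel actually executes. The function name scsihosts_map is
hypothetical.

```shell
# scsihosts_map LIST
# Print one line per position in LIST giving the host number reserved
# for that driver name. An empty field reserves a number that is
# never used.
scsihosts_map() {
    rest=$(printf '%s' "$1" | tr ',' ':')   # ',' is accepted as a separator too
    host=0
    while :; do
        name=${rest%%:*}                    # text before the first ':'
        if [ -n "$name" ]; then
            echo "host$host $name"
        else
            echo "host$host (never used)"
        fi
        host=$((host + 1))
        case $rest in
            *:*) rest=${rest#*:} ;;         # drop the field just handled
            *)   break ;;
        esac
    done
}
```

Running scsihosts_map aha1542:ppa:aha1542::ncr53c7xx lists host0 to
host4, with host3 marked as never used, matching the example above.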
1393
1394-----------------------------------------------------------------------------
1395
1396
1397Device drivers currently ported
1398
1399- All miscellaneous character devices support devfs (this is done
1400 transparently through misc_register())
1401
1402- SCSI discs and generic hard discs
1403
1404- Character memory devices (null, zero, full and so on)
1405 Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
1406
1407- Loop devices (/dev/loop?)
1408
1409- TTY devices (console, serial ports, terminals and pseudo-terminals)
1410 Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
1411
1412- SCSI tapes (/dev/scsi and /dev/tapes)
1413
1414- SCSI CD-ROMs (/dev/scsi and /dev/cdroms)
1415
1416- SCSI generic devices (/dev/scsi)
1417
1418- RAMDISCS (/dev/ram?)
1419
1420- Meta Devices (/dev/md*)
1421
1422- Floppy discs (/dev/floppy)
1423
1424- Parallel port printers (/dev/printers)
1425
1426- Sound devices (/dev/sound)
1427 Thanks to Eric Dumas <dumas@linux.eu.org> and
1428 C. Scott Ananian <cananian@alumni.princeton.edu>
1429
1430- Joysticks (/dev/joysticks)
1431
1432- Sparc keyboard (/dev/kbd)
1433
1434- DSP56001 digital signal processor (/dev/dsp56k)
1435
1436- Apple Desktop Bus (/dev/adb)
1437
1438- Coda network file system (/dev/cfs*)
1439
1440- Virtual console capture devices (/dev/vcc)
1441 Thanks to Dennis Hou <smilax@mindmeld.yi.org>
1442
1443- Frame buffer devices (/dev/fb)
1444
1445- Video capture devices (/dev/v4l)
1446
1447
1448-----------------------------------------------------------------------------
1449
1450
1451Allocation of Device Numbers
1452
1453Devfs allows you to write a driver which doesn't need to allocate a
1454device number (major&minor numbers) for the internal operation of the
1455kernel. However, there are a number of userspace programmes that use
1456the device number as a unique handle for a device. An example is the
1457find programme, which uses device numbers to determine whether
1458an inode is on a different filesystem than another inode. The device
1459number used is the one for the block device which a filesystem is
1460using. To preserve compatibility with userspace programmes, block
1461devices using devfs need to have unique device numbers allocated to
1462them. Furthermore, POSIX specifies device numbers, so some kind of
1463device number needs to be presented to userspace.
1464
1465The simplest option (especially when porting drivers to devfs) is to
1466keep using the old major and minor numbers. Devfs will take whatever
1467values are given for major&minor and pass them onto userspace.
1468
1469Alternatively, you can have devfs choose unique device numbers for
1470you. When you register a character or block device using
1471devfs_register you can provide the optional
1472DEVFS_FL_AUTO_DEVNUM flag, which will then automatically allocate a
1473unique device number (the allocation is separated for the character
1474and block devices).
1475
1476This device number is a 16 bit number, so this leaves plenty of space
1477for large numbers of discs and partitions. This scheme can also be
1478used for character devices, in particular the tty devices, which are
1479currently limited to 256 pseudo-ttys (this limits the total number of
1480simultaneous xterms and remote logins). Note that the device number
1481is limited to the range 36864-61439 (majors 144-239), in order to
1482avoid any possible conflicts with existing official allocations.
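
The correspondence between the range 36864-61439 and majors 144-239
follows from the 16 bit device number being split into an 8 bit major
and an 8 bit minor. A minimal sketch of the arithmetic (the function
name devnum_split is hypothetical):

```shell
# devnum_split DEVNUM
# Print "MAJOR MINOR" for a 16 bit device number, assuming the upper
# 8 bits are the major and the lower 8 bits are the minor.
# Returns non-zero if DEVNUM lies outside the automatic allocation range.
devnum_split() {
    devnum=$1
    if [ "$devnum" -lt 36864 ] || [ "$devnum" -gt 61439 ]; then
        echo "$devnum is outside the automatic allocation range" >&2
        return 1
    fi
    echo "$((devnum / 256)) $((devnum % 256))"
}
```

The two ends of the range come out as 144 0 and 239 255.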
1483
1484Please note that using dynamically allocated block device numbers may
1485break the NFS daemons (both user and kernel mode), which expect dev_t
1486for a given device to be constant over the lifetime of remote mounts.
1487
1488A final note on this scheme: since it doesn't increase the size of
1489device numbers, there are no compatibility issues with userspace.
1490
1491-----------------------------------------------------------------------------
1492
1493
1494Questions and Answers
1495
1496
1497Making things work
1498Alternatives to devfs
1499What I don't like about devfs
1500How to report bugs
1501Strange kernel messages
1502Compilation problems with devfsd
1503
1504
1505
1506Making things work
1507
1508Here are some common questions and answers.
1509
1510
1511
1512Devfsd is not managing all my permissions
1513
1514Make sure you are capturing the appropriate events. For example,
1515device entries created by the kernel generate REGISTER events,
1516but those created by devfsd generate CREATE events.
1517
1518
1519Devfsd is not capturing all REGISTER events
1520
1521See the previous entry: you may need to capture CREATE events.
1522
1523
1524X will not start
1525
1526Make sure you followed the steps
1527outlined above.
1528
1529
1530Why don't my network devices appear in devfs?
1531
1532This is not a bug. Network devices have their own, completely separate
1533namespace. They are accessed via socket(2) and
1534setsockopt(2) calls, and thus require no device nodes. I have
raised the possibility of moving network devices into the device
1536namespace, but have had no response.
1537
1538
1539How can I test if I have devfs compiled into my kernel?
1540
1541All filesystems built-in or currently loaded are listed in
1542/proc/filesystems. If you see a devfs entry, then
1543you know that devfs was compiled into your kernel. If you have
1544correctly configured and rebuilt your kernel, then devfs will be
1545built-in. If you think you've configured it in, but
1546/proc/filesystems doesn't show it, you've made a mistake.
1547Common mistakes include:
1548
- Using a 2.2.x kernel without applying the devfs patch (if you
  don't know how to patch your kernel, use 2.4.x instead; don't bother
  asking me how to patch)
- Forgetting to set CONFIG_EXPERIMENTAL=y
- Forgetting to set CONFIG_DEVFS_FS=y
- Forgetting to set CONFIG_DEVFS_MOUNT=y (if you want devfs
  to be automatically mounted at boot)
- Editing your .config manually, instead of using make
  config or make xconfig
- Forgetting to run make dep; make clean after changing the
  configuration and before compiling
- Forgetting to compile your kernel and modules
- Forgetting to install your kernel
- Forgetting to install your modules
1563
1564Please check twice that you've done all these steps before sending in
1565a bug report.
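
The /proc/filesystems check can be scripted along these lines. This
is a sketch; the function name has_filesystem is hypothetical, and the
optional second argument exists only so that the check can be tried on
a copy of the file.

```shell
# has_filesystem NAME [FILE]
# Succeed if NAME is listed in FILE (default: /proc/filesystems).
# The filesystem name is the last field on each line; some lines
# carry a leading "nodev" field.
has_filesystem() {
    awk -v fs="$1" '$NF == fs { found = 1 } END { exit !found }' \
        "${2:-/proc/filesystems}"
}
```

For example: has_filesystem devfs && echo "devfs is compiled in"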
1566
1567
1568
1569How can I test if devfs is mounted on /dev?
1570
1571The device filesystem will always create an entry called
1572".devfsd", which is used to communicate with the daemon. Even
1573if the daemon is not running, this entry will exist. Testing for the
1574existence of this entry is the approved method of determining if devfs
is mounted or not. Note that the type of entry (e.g. regular file,
1576character device, named pipe, etc.) may change without notice. Only
1577the existence of the entry should be relied upon.
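
In a boot script, the test might look like the sketch below. It
relies only on the existence of the entry, as recommended above. The
function name devfs_is_mounted is hypothetical.

```shell
# devfs_is_mounted [DIR]
# Succeed if DIR (default: /dev) contains a ".devfsd" entry.
# Only existence is tested, never the type of the entry.
devfs_is_mounted() {
    dir=${1:-/dev}
    [ -e "$dir/.devfsd" ]
}
```

For example: devfs_is_mounted || echo "devfs is not mounted on /dev"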
1578
1579
1580When I start devfsd, I see the error:
1581Error opening file: ".devfsd" No such file or directory?
1582
1583This means that devfs is not mounted. Make sure you have devfs mounted.
1584
1585
1586How do I mount devfs?
1587
1588First make sure you have devfs compiled into your kernel (see
above). Then you will need to do one of the following:

- set CONFIG_DEVFS_MOUNT=y in your kernel config
- pass devfs=mount to your boot loader
- mount devfs manually in your boot scripts with:
  mount -t devfs none /dev
1595
1596
1597
1598Mount by volume LABEL=<label> doesn't work with
1599devfs
1600
1601Most probably you are not mounting devfs onto /dev. What
1602happens is that if your kernel config has CONFIG_DEVFS_FS=y
1603then the contents of /proc/partitions will have the devfs
1604names (such as scsi/host0/bus0/target0/lun0/part1). The
1605contents of /proc/partitions are used by mount(8) when
1606mounting by volume label. If devfs is not mounted on /dev,
1607then mount(8) will fail to find devices. The solution is to
1608make sure that devfs is mounted on /dev. See above for how to
1609do that.
1610
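To confirm this diagnosis, you can check whether the names listed in
/proc/partitions actually exist under /dev. The function below is a
rough sketch, not a tool shipped with devfs: its name is invented, and
it assumes the device name is the fourth column of the file, after a
header line and a blank line.

```shell
# Hypothetical check: print every name listed in a partitions file
# (default /proc/partitions) that has no matching entry under a device
# directory (default /dev). The name is assumed to be the 4th column,
# after a header line and a blank line.
missing_partition_nodes() {
    partitions=${1:-/proc/partitions}
    dev_dir=${2:-/dev}
    awk 'NR > 2 && NF >= 4 { print $4 }' "$partitions" |
    while read -r name; do
        [ -e "$dev_dir/$name" ] || echo "$name"
    done
}
```

If this prints any names, mount(8) would not be able to find those
devices either.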
1611
1612I have extra or incorrect entries in /dev
1613
1614You may have stale entries in your dev-state area. Check for a
1615RESTORE configuration line in your devfsd configuration
1616(typically /etc/devfsd.conf). If you have this line, check
1617the contents of the specified directory for stale entries. Remove
1618any entries which are incorrect, then reboot.
1619
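Finding the dev-state directory can be scripted. This sketch assumes
the RESTORE line has the form "RESTORE <directory>"; the function name
restore_dirs is hypothetical.

```shell
# Hypothetical helper: print the directory named by each RESTORE line
# in a devfsd configuration file (default /etc/devfsd.conf).
# Comment lines starting with "#" do not match.
restore_dirs() {
    awk '$1 == "RESTORE" { print $2 }' "${1:-/etc/devfsd.conf}"
}
```

Listing the directory it prints (for example with ls -lR) shows the
entries that devfsd will restore into /dev.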
1620
1621I get "Unable to open initial console" messages at boot
1622
1623This usually happens when you don't have devfs automounted onto
1624/dev at boot time, and there is no valid
1625/dev/console entry on your root file-system. Create a valid
1626/dev/console device node.
1627
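As a sketch of how the node could be created: the console is
character device major 5, minor 1. The function name and the 600
permission mode below are illustrative choices, not requirements. It
must be run as root, against the /dev directory on the root
file-system itself (for example from a rescue system with the root
file-system mounted elsewhere), not against a mounted devfs.

```shell
# Hypothetical helper: create the console node under a given root
# directory (default: the running system) if it is missing.
# Character device major 5, minor 1 is the console.
make_console_node() {
    root=${1:-}
    if [ ! -c "$root/dev/console" ]; then
        mknod -m 600 "$root/dev/console" c 5 1
    fi
}
```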
1628
1629
1630
1631
1632Alternatives to devfs
1633
1634I've attempted to collate all the anti-devfs proposals and explain
1635their limitations. Under construction.
1636
1637
1638Why not just pass device create/remove events to a daemon?
1639
1640Here the suggestion is to develop an API in the kernel so that devices
1641can register create and remove events, and a daemon listens for those
1642events. The daemon would then populate/depopulate /dev (which
1643resides on disc).
1644
1645This has several limitations:
1646
1647
1648it only works for modules loaded and unloaded (or devices inserted
1649and removed) after the kernel has finished booting. Without a database
1650of events, there is no way the daemon could fully populate
1651/dev
1652
1653
1654if you add a database to this scheme, the question is then how to
1655present that database to user-space. If you make it a list of strings
1656with embedded event codes which are passed through a pipe to the
1657daemon, then this is only of use to the daemon. I would argue that the
1658natural way to present this data is via a filesystem (since many of
1659the events will be of a hierarchical nature), such as devfs.
1660Presenting the data as a filesystem makes it easy for the user to see
1661what is available and also makes it easy to write scripts to scan the
1662"database"
1663
1664
1665the tight binding between device nodes and drivers is no longer
1666possible (requiring the otherwise perfectly avoidable
1667table lookups)
1668
1669
1670you cannot catch inode lookup events on /dev which means
1671that module autoloading requires device nodes to be created. This is a
1672problem, particularly for drivers where only a few inodes are created
1673from a potentially large set
1674
1675
1676this technique can't be used when the root FS is mounted
1677read-only
1678
1679
1680
1681
1682Just implement a better scsidev
1683
This suggestion involves taking the scsidev programme and
extending it to scan for all devices, not just SCSI devices. The
scsidev programme works by scanning /proc/scsi.
1687
1688Problems:
1689
1690
1691the kernel does not currently provide a list of all devices
1692available. Not all drivers register entries in /proc or
1693generate kernel messages
1694
1695
1696there is no uniform mechanism to register devices other than the
1697devfs API
1698
1699
1700implementing such an API is then the same as the
1701proposal above
1702
1703
1704
1705
1706Put /dev on a ramdisc
1707
1708This suggestion involves creating a ramdisc and populating it with
1709device nodes and then mounting it over /dev.
1710
1711Problems:
1712
1713
1714
1715this doesn't help when mounting the root filesystem, since you
1716still need a device node to do that
1717
1718
1719if you want to use this technique for the root device node as
1720well, you need to use initrd. This complicates the booting sequence
1721and makes it significantly harder to administer and configure. The
1722initrd is essentially opaque, robbing the system administrator of easy
1723configuration
1724
1725
1726insufficient information is available to correctly populate the
1727ramdisc. So we come back to the
1728proposal above to "solve" this
1729
1730
1731a ramdisc-based solution would take more kernel memory, since the
1732backing store would be (at best) normal VFS inodes and dentries, which
1733take 284 bytes and 112 bytes, respectively, for each entry. Compare
1734that to 72 bytes for devfs
1735
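To put the memory figures quoted above in perspective, the arithmetic
can be written out. This is only a back-of-the-envelope sketch using
those per-entry sizes; the function name is made up.

```shell
# Rough comparison using the per-entry sizes quoted above:
# VFS inode (284 bytes) + dentry (112 bytes) versus a devfs entry
# (72 bytes). Prints "<ramdisc bytes> <devfs bytes>" for a given
# number of entries.
dev_memory_bytes() {
    entries=$1
    echo "$(( entries * (284 + 112) )) $(( entries * 72 ))"
}
```

For 1000 entries this gives 396000 bytes against 72000 bytes, a
factor of 5.5.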
1736
1737
1738
1739Do nothing: there's no problem
1740
1741Sometimes people can be heard to claim that the existing scheme is
1742fine. This is what they're ignoring:
1743
1744
1745device number size (8 bits each for major and minor) is a real
1746limitation, and must be fixed somehow. Systems with large numbers of
1747SCSI devices, for example, will continue to consume the remaining
1748unallocated major numbers. USB will also need to push beyond the 8 bit
1749minor limitation
1750
1751
1752simply increasing the device number size is insufficient. Apart
1753from causing a lot of pain, it doesn't solve the management issues
1754of a /dev with thousands or more device nodes
1755
1756
1757ignoring the problem of a huge /dev will not make it go
1758away, and dismisses the legitimacy of a large number of people who
1759want a dynamic /dev
1760
1761
1762the standard response then becomes: "write a device management
1763daemon", which brings us back to the
1764proposal above
1765
1766
1767
1768
1769What I don't like about devfs
1770
1771Here are some common complaints about devfs, and some suggestions and
1772solutions that may make it more palatable for you. I can't please
1773everybody, but I do try :-)
1774
1775I hate the naming scheme
1776
1777First, remember that no naming scheme will please everybody. You hate
1778the scheme, others love it. Who's to say who's right and who's wrong?
1779Ultimately, the person who writes the code gets to choose, and what
1780exists now is a combination of the choices made by the
1781devfs author and the
1782kernel maintainer (Linus).
1783
1784However, not all is lost. If you want to create your own naming
1785scheme, it is a simple matter to write a standalone script, hack
1786devfsd, or write a script called by devfsd. You can create whatever
1787naming scheme you like.
1788
1789Further, if you want to remove all traces of the devfs naming scheme
1790from /dev, you can mount devfs elsewhere (say
1791/devfs) and populate /dev with links into
1792/devfs. This population can be automated using devfsd if you
1793wish.
1794
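A standalone script for this could be as simple as the sketch below.
It assumes devfs is mounted on /devfs and that you supply your own
list of "name target" pairs; the function name link_into_devfs is
hypothetical, and devfsd is not involved.

```shell
# Hypothetical sketch: read "name target" pairs from standard input and
# create <dev dir>/<name> as a symbolic link to
# <devfs mount point>/<target>.
link_into_devfs() {
    devfs_dir=${1:-/devfs}
    dev_dir=${2:-/dev}
    while read -r name target; do
        [ -n "$name" ] || continue    # skip blank lines
        ln -sf "$devfs_dir/$target" "$dev_dir/$name"
    done
}
```

For example, feeding it the line
"hda1 ide/host0/bus0/target0/lun0/part1" would make /dev/hda1 point at
/devfs/ide/host0/bus0/target0/lun0/part1.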
You can even use the VFS binding facility to make the links, rather
than using symbolic links. This way, you don't even have to see the
"destination" of the links.
1798
1799Devfs puts policy into the kernel
1800
1801There's already policy in the kernel. Device numbers are in fact
1802policy (why should the kernel dictate what device numbers I use?).
Face it, some policy has to be in the kernel. The real difference
between device names as policy and device numbers as policy is that
no one will use device numbers directly, because device
numbers are devoid of meaning to humans and are ugly. With devfs
device names, on the other hand, some people will use the
devfs-supplied names directly (even though you can add your own
naming scheme). This offends some people :-)
1810
1811Devfs is bloatware
1812
1813This is not even remotely true. As shown above,
1814both code and data size are quite modest.
1815
1816
1817How to report bugs
1818
1819If you have (or think you have) a bug with devfs, please follow the
1820steps below:
1821
1822
1823
1824make sure you have enabled debugging output when configuring your
1825kernel. You will need to set (at least) the following config options:
1826
1827CONFIG_DEVFS_DEBUG=y
1828CONFIG_DEBUG_KERNEL=y
1829CONFIG_DEBUG_SLAB=y
1830
1831
1832
1833please make sure you have the latest devfs patches applied. The
1834latest kernel version might not have the latest devfs patches applied
1835yet (Linus is very busy)
1836
1837
1838save a copy of your complete kernel logs (preferably by
1839using the dmesg programme) for later inclusion in your bug
1840report. You may need to use the -s switch to increase the
1841internal buffer size so you can capture all the boot messages.
1842Don't edit or trim the dmesg output
1843
1844
1845
1846
1847try booting with devfs=dall passed to the kernel boot
1848command line (read the documentation on your bootloader on how to do
1849this), and save the result to a file. This may be quite verbose, and
1850it may overflow the messages buffer, but try to get as much of it as
1851you can
1852
1853
1854if you get an Oops, run ksymoops to decode it so that the
1855names of the offending functions are provided. A non-decoded Oops is
1856pretty useless
1857
1858
1859send a copy of your devfsd configuration file(s)
1860
1861send the bug report to me first.
1862Don't expect that I will see it if you post it to the linux-kernel
1863mailing list. Include all the information listed above, plus
1864anything else that you think might be relevant. Put the string
1865devfs somewhere in the subject line, so my mail filters mark
1866it as urgent
1867
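Gathering these files can be scripted. The following is a sketch
only: the function name, the output file names and the buffer size
passed to dmesg are arbitrary illustrative choices, and the kernel
version is included as an example of other relevant information.

```shell
# Hypothetical helper: gather the files requested above into a
# directory for attaching to a bug report.
collect_devfs_report() {
    out=${1:-devfs-report}
    conf=${2:-/etc/devfsd.conf}
    mkdir -p "$out" || return 1
    # -s enlarges the buffer dmesg uses to read the kernel log;
    # 131072 is an arbitrary illustrative size
    dmesg -s 131072 > "$out/dmesg.txt" 2>&1 || true
    uname -r > "$out/kernel-version.txt"
    if [ -r "$conf" ]; then
        cp "$conf" "$out/devfsd.conf"
    fi
}
```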
1868
1869
1870
1871Here is a general guide on how to ask questions in a way that greatly
1872improves your chances of getting a reply:
1873
1874http://www.tuxedo.org/~esr/faqs/smart-questions.html. If you have
1875a bug to report, you should also read
1876
1877http://www.chiark.greenend.org.uk/~sgtatham/bugs.html.
1878
1879
1880Strange kernel messages
1881
1882You may see devfs-related messages in your kernel logs. Below are some
1883messages and what they mean (and what you should do about them, if
1884anything).
1885
1886
1887
1888devfs_register(fred): could not append to parent, err: -17
1889
1890You need to check what the error code means, but usually 17 means
1891EEXIST. This means that a driver attempted to create an entry
1892fred in a directory, but there already was an entry with that
1893name. This is often caused by flawed boot scripts which untar a bunch
1894of inodes into /dev, as a way to restore permissions. This
1895message is harmless, as the device nodes will still
1896provide access to the driver (unless you use the devfs=only
1897boot option, which is only for dedicated souls:-). If you want to get
1898rid of these annoying messages, upgrade to devfsd-v1.3.20 and use the
1899recommended RESTORE directive to restore permissions.
1900
1901
1902devfs_mk_dir(bill): using old entry in dir: c1808724 ""
1903
1904This is similar to the message above, except that a driver attempted
1905to create a directory named bill, and the parent directory
1906has an entry with the same name. In this case, to ensure that drivers
1907continue to work properly, the old entry is re-used and given to the
driver. In 2.5 kernels, the driver is given a NULL entry, and thus,
under rare circumstances, may not create the required device nodes.
1910The solution is the same as above.
1911
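To see which entries trigger the "err: -17" message, you can filter
the kernel log. This is a sketch that assumes the message format
shown above; the function name is made up.

```shell
# Hypothetical filter: read kernel log text on standard input and print
# the entry name from each "could not append to parent, err: -17" line.
devfs_duplicate_entries() {
    sed -n 's/.*devfs_register(\([^)]*\)): could not append to parent, err: -17.*/\1/p'
}
```

Typical use would be to pipe the output of dmesg through it and then
through sort -u.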
1912
1913
1914
1915
1916Compilation problems with devfsd
1917
1918Usually, you can compile devfsd just by typing in
1919make in the source directory, followed by a make
1920install (as root). Sometimes, you may have problems, particularly
1921on broken configurations.
1922
1923
1924
1925error messages relating to DEVFSD_NOTIFY_DELETE
1926
This happens because you have an ancient set of kernel headers
installed in /usr/include/linux or /usr/src/linux.
1929Install kernel 2.4.10 or later. You may need to pass the
1930KERNEL_DIR variable to make (if you did not install
1931the new kernel sources as /usr/src/linux), or you may copy
1932the devfs_fs.h file in the kernel source tree into
1933/usr/include/linux.
1934
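Before building, you can check whether a kernel source tree has new
enough headers. This sketch assumes the header lives at
include/linux/devfs_fs.h inside the tree; the function name is
hypothetical.

```shell
# Hypothetical check: succeed if the devfs header under a kernel source
# tree (default /usr/src/linux) defines DEVFSD_NOTIFY_DELETE.
headers_have_notify_delete() {
    grep -q 'DEVFSD_NOTIFY_DELETE' \
        "${1:-/usr/src/linux}/include/linux/devfs_fs.h" 2>/dev/null
}
```

If the check succeeds for a tree that is not /usr/src/linux, pass that
tree to make through the KERNEL_DIR variable.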
1935
1936
1937
1938-----------------------------------------------------------------------------
1939
1940
1941Other resources
1942
1943
1944
1945Douglas Gilbert has written a useful document at
1946
1947http://www.torque.net/sg/devfs_scsi.html which
1948explores the SCSI subsystem and how it interacts with devfs
1949
1950
1951Douglas Gilbert has written another useful document at
1952
1953http://www.torque.net/scsi/scsihosts.html which
1954discusses the scsihosts= boot option
1955
1956
1957Douglas Gilbert has written yet another useful document at
1958
1959http://www.torque.net/scsi/SCSI-2.4-HOWTO/ which
1960discusses the Linux SCSI subsystem in 2.4.
1961
1962
1963Johannes Erdfelt has started a discussion paper on Linux and
1964hot-swap devices, describing what the requirements are for a scalable
1965solution and how and why he's used devfs+devfsd. Note that this is an
1966early draft only, available in plain text form at:
1967
1968http://johannes.erdfelt.com/hotswap.txt.
Johannes has promised an HTML version will follow.
1970
1971
1972I presented an invited
1973paper
1974at the
1975
2nd Annual Storage Management Workshop held in Miami, Florida,
U.S.A. in October 2000.
1978
1979
1980
1981
1982-----------------------------------------------------------------------------
1983
1984
1985Translations of this document
1986
1987This document has been translated into other languages.
1988
1989
1990
1991
1992The document master (in English) by rgooch@atnf.csiro.au is
1993available at
1994
1995http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html
1996
1997
1998
1999A Korean translation by viatoris@nownuri.net is available at
2000
2001http://your.destiny.pe.kr/devfs/devfs.html
2002
2003
2004
2005
2006-----------------------------------------------------------------------------
2007Most flags courtesy of ITA's
2008Flags of All Countries
2009used with permission.
2010