<!-- -*- sgml -*- -->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V3.1//EN"[
<!ENTITY procfsexample SYSTEM "procfs_example.sgml">
]>

<book id="LKProcfsGuide">
  <bookinfo>
    <title>Linux Kernel Procfs Guide</title>

    <authorgroup>
      <author>
	<firstname>Erik</firstname>
	<othername>(J.A.K.)</othername>
	<surname>Mouw</surname>
	<affiliation>
	  <orgname>Delft University of Technology</orgname>
	  <orgdiv>Faculty of Information Technology and Systems</orgdiv>
	  <address>
            <email>J.A.K.Mouw@its.tudelft.nl</email>
            <pob>PO BOX 5031</pob>
            <postcode>2600 GA</postcode>
            <city>Delft</city>
            <country>The Netherlands</country>
          </address>
	</affiliation>
      </author>
    </authorgroup>

    <revhistory>
      <revision>
	<revnumber>1.0&nbsp;</revnumber>
	<date>May 30, 2001</date>
	<revremark>Initial revision posted to linux-kernel</revremark>
      </revision>
      <revision>
	<revnumber>1.1&nbsp;</revnumber>
	<date>June 3, 2001</date>
	<revremark>Revised after comments from linux-kernel</revremark>
      </revision>
    </revhistory>

    <copyright>
      <year>2001</year>
      <holder>Erik Mouw</holder>
    </copyright>


    <legalnotice>
      <para>
        This documentation is free software; you can redistribute it
        and/or modify it under the terms of the GNU General Public
        License as published by the Free Software Foundation; either
        version 2 of the License, or (at your option) any later
        version.
      </para>

      <para>
        This documentation is distributed in the hope that it will be
        useful, but WITHOUT ANY WARRANTY; without even the implied
        warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
        PURPOSE.  See the GNU General Public License for more details.
      </para>

      <para>
        You should have received a copy of the GNU General Public
        License along with this program; if not, write to the Free
        Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
        MA 02111-1307 USA
      </para>

      <para>
        For more details see the file COPYING in the source
        distribution of Linux.
      </para>
    </legalnotice>
  </bookinfo>




  <toc>
  </toc>




  <preface>
    <title>Preface</title>

    <para>
      This guide describes the use of the procfs file system from
      within the Linux kernel. The idea to write this guide came up on
      the #kernelnewbies IRC channel (see <ulink
      url="http://www.kernelnewbies.org/">http://www.kernelnewbies.org/</ulink>),
      when Jeff Garzik explained the use of procfs and forwarded me a
      message Alexander Viro wrote to the linux-kernel mailing list. I
      agreed to write it up nicely, so here it is.
    </para>

    <para>
      I'd like to thank Jeff Garzik
      <email>jgarzik@pobox.com</email> and Alexander Viro
      <email>viro@math.psu.edu</email> for their input, Tim Waugh
      <email>twaugh@redhat.com</email> for his <ulink
      url="http://people.redhat.com/twaugh/docbook/selfdocbook/">Selfdocbook</ulink>,
      and Marc Joosen <email>marcj@historia.et.tudelft.nl</email> for
      proofreading.
    </para>

    <para>
      This documentation was written while working on the LART
      computing board (<ulink
      url="http://www.lart.tudelft.nl/">http://www.lart.tudelft.nl/</ulink>),
      which is sponsored by the Mobile Multi-media Communications
      (<ulink
      url="http://www.mmc.tudelft.nl/">http://www.mmc.tudelft.nl/</ulink>)
      and Ubiquitous Communications (<ulink
      url="http://www.ubicom.tudelft.nl/">http://www.ubicom.tudelft.nl/</ulink>)
      projects.
    </para>

    <para>
      Erik
    </para>
  </preface>




  <chapter id="intro">
    <title>Introduction</title>

    <para>
      The <filename class="directory">/proc</filename> file system
      (procfs) is a special file system in the Linux kernel. It's a
      virtual file system: it is not associated with a block device
      but exists only in memory. The files in the procfs are there to
      allow userland programs access to certain information from the
      kernel (like process information in <filename
      class="directory">/proc/[0-9]+/</filename>), but also for debug
      purposes (like <filename>/proc/ksyms</filename>).
    </para>

    <para>
      This guide describes the use of the procfs file system from
      within the Linux kernel. It starts by introducing all relevant
      functions to manage the files within the file system. After that
      it shows how to communicate with userland and points out some
      tips and tricks. Finally, it presents a complete example.
    </para>

    <para>
      Note that the files in <filename
      class="directory">/proc/sys</filename> are sysctl files: they
      don't belong to procfs and are governed by a completely
      different API described in the Kernel API book.
    </para>
  </chapter>




  <chapter id="managing">
    <title>Managing procfs entries</title>

    <para>
      This chapter describes the functions that various kernel
      components use to populate the procfs with files, symlinks,
      device nodes, and directories.
    </para>

    <para>
      A minor note before we start: if you want to use any of the
      procfs functions, be sure to include the correct header file!
      This should be one of the first lines in your code:
    </para>

    <programlisting>
#include &lt;linux/proc_fs.h&gt;
    </programlisting>




    <sect1 id="regularfile">
      <title>Creating a regular file</title>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>struct proc_dir_entry* <function>create_proc_entry</function></funcdef>
	  <paramdef>const char* <parameter>name</parameter></paramdef>
	  <paramdef>mode_t <parameter>mode</parameter></paramdef>
	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        This function creates a regular file with the name
        <parameter>name</parameter> and file mode
        <parameter>mode</parameter> in the directory
        <parameter>parent</parameter>. To create a file in the root of
        the procfs, use <constant>NULL</constant> as the
        <parameter>parent</parameter> parameter. When successful, the
        function will return a pointer to the freshly created
        <structname>struct proc_dir_entry</structname>; otherwise it
        will return <constant>NULL</constant>. <xref
        linkend="userland"> describes how to do something useful with
        regular files.
      </para>

      <para>
        Note that passing a path that spans multiple directories is
        explicitly supported. For example,
        <function>create_proc_entry</function>(<parameter>"drivers/via0/info"</parameter>)
        will create the <filename class="directory">via0</filename>
        directory if necessary, with standard
        <constant>0755</constant> permissions.
      </para>

      <para>
        If you only want to be able to read the file, the function
        <function>create_proc_read_entry</function> described in <xref
        linkend="convenience"> may be used to create and initialise
        the procfs entry in one single call.
      </para>
    </sect1>




    <sect1>
      <title>Creating a symlink</title>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>struct proc_dir_entry* <function>proc_symlink</function></funcdef>
	  <paramdef>const char* <parameter>name</parameter></paramdef>
	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
	  <paramdef>const char* <parameter>dest</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        This creates a symlink in the procfs directory
        <parameter>parent</parameter> that points from
        <parameter>name</parameter> to
        <parameter>dest</parameter>. This translates in userland to
        <literal>ln -s</literal> <parameter>dest</parameter>
        <parameter>name</parameter>.
      </para>
    </sect1>




    <sect1>
      <title>Creating a device</title>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>struct proc_dir_entry* <function>proc_mknod</function></funcdef>
	  <paramdef>const char* <parameter>name</parameter></paramdef>
	  <paramdef>mode_t <parameter>mode</parameter></paramdef>
	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
	  <paramdef>kdev_t <parameter>rdev</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        Creates a device file <parameter>name</parameter> with mode
        <parameter>mode</parameter> in the procfs directory
        <parameter>parent</parameter>. The device file will work on
        the device <parameter>rdev</parameter>, which can be generated
        by using the <literal>MKDEV</literal> macro from
        <literal>linux/kdev_t.h</literal>. The
        <parameter>mode</parameter> parameter
        <emphasis>must</emphasis> contain <constant>S_IFBLK</constant>
        or <constant>S_IFCHR</constant> to create a device
        node. Compare with userland <literal>mknod
        --mode=</literal><parameter>mode</parameter>
        <parameter>name</parameter> <parameter>rdev</parameter>.
      </para>
    </sect1>




    <sect1>
      <title>Creating a directory</title>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>struct proc_dir_entry* <function>proc_mkdir</function></funcdef>
	  <paramdef>const char* <parameter>name</parameter></paramdef>
	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        Creates a directory <parameter>name</parameter> in the procfs
        directory <parameter>parent</parameter>.
      </para>
    </sect1>




    <sect1>
      <title>Removing an entry</title>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>void <function>remove_proc_entry</function></funcdef>
	  <paramdef>const char* <parameter>name</parameter></paramdef>
	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        Removes the entry <parameter>name</parameter> in the directory
        <parameter>parent</parameter> from the procfs. Entries are
        removed by their <emphasis>name</emphasis>, not by the
        <structname>struct proc_dir_entry</structname> returned by the
        various create functions. Note that this function doesn't
        recursively remove entries.
      </para>

      <para>
        Be sure to free the <structfield>data</structfield> entry from
        the <structname>struct proc_dir_entry</structname> before
        <function>remove_proc_entry</function> is called (that is: if
        there was some <structfield>data</structfield> allocated, of
        course). See <xref linkend="usingdata"> for more information
        on using the <structfield>data</structfield> entry.
      </para>
    </sect1>
  </chapter>




  <chapter id="userland">
    <title>Communicating with userland</title>

    <para>
      Instead of reading (or writing) information directly from
      kernel memory, procfs works with <emphasis>call back
      functions</emphasis> for files: functions that are called when
      a specific file is being read or written. Such functions have
      to be initialised after the procfs file is created by setting
      the <structfield>read_proc</structfield> and/or
      <structfield>write_proc</structfield> fields in the
      <structname>struct proc_dir_entry*</structname> that the
      function <function>create_proc_entry</function> returned:
    </para>

    <programlisting>
struct proc_dir_entry* entry;

/* entry is the pointer returned by create_proc_entry() */
entry->read_proc = read_proc_foo;
entry->write_proc = write_proc_foo;
    </programlisting>

    <para>
      If you only want to use the
      <structfield>read_proc</structfield>, the function
      <function>create_proc_read_entry</function> described in <xref
      linkend="convenience"> may be used to create and initialise the
      procfs entry in one single call.
    </para>



    <sect1>
      <title>Reading data</title>

      <para>
        The read function is a call back function that allows userland
        processes to read data from the kernel. The read function
        should have the following format:
      </para>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>int <function>read_func</function></funcdef>
	  <paramdef>char* <parameter>page</parameter></paramdef>
	  <paramdef>char** <parameter>start</parameter></paramdef>
	  <paramdef>off_t <parameter>off</parameter></paramdef>
	  <paramdef>int <parameter>count</parameter></paramdef>
	  <paramdef>int* <parameter>eof</parameter></paramdef>
	  <paramdef>void* <parameter>data</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        The read function should write its information into the
        <parameter>page</parameter>. For proper use, the function
        should start writing at an offset of
        <parameter>off</parameter> in <parameter>page</parameter> and
        write at most <parameter>count</parameter> bytes, but because
        most read functions are quite simple and only return a small
        amount of information, these two parameters are usually
        ignored (this breaks pagers like <literal>more</literal> and
        <literal>less</literal>, but <literal>cat</literal> still
        works).
      </para>

      <para>
        If the <parameter>off</parameter> and
        <parameter>count</parameter> parameters are properly used,
        <parameter>eof</parameter> should be used to signal that the
        end of the file has been reached by writing
        <literal>1</literal> to the memory location
        <parameter>eof</parameter> points to.
      </para>

      <para>
        The parameter <parameter>start</parameter> doesn't seem to be
        used anywhere in the kernel. The <parameter>data</parameter>
        parameter can be used to create a single call back function for
        several files; see <xref linkend="usingdata">.
      </para>

      <para>
        The <function>read_func</function> function must return the
        number of bytes written into the <parameter>page</parameter>.
      </para>

      <para>
        <xref linkend="example"> shows how to use a read call back
        function.
      </para>
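
      <para>
        To make the calling convention a bit more concrete, here is a
        minimal sketch of a simple read call back, written as plain
        userland C so it can be compiled outside the kernel. It
        takes the easy route described above and ignores
        <parameter>off</parameter> and <parameter>count</parameter>.
        The names <function>read_proc_foo</function> and
        <literal>foo_value</literal> are made up for this
        illustration, and setting <parameter>eof</parameter> here is
        my assumption of what a one-shot function would do.
      </para>

      <programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical value reported by the file. */
static int foo_value = 42;

/*
 * Simple read call back: ignores off and count and assumes
 * page is large enough for one short line of text.
 */
int read_proc_foo(char *page, char **start, off_t off,
                  int count, int *eof, void *data)
{
        int len;

        (void)start; (void)off; (void)count; (void)data;
        assert(page != NULL && eof != NULL);

        len = sprintf(page, "foo = %d\n", foo_value);
        *eof = 1;       /* everything delivered in one go */

        return len;     /* bytes written into page */
}
```
]]></programlisting>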
    </sect1>




    <sect1>
      <title>Writing data</title>

      <para>
        The write call back function allows a userland process to write
        data to the kernel, so it has some kind of control over the
        kernel. The write function should have the following format:
      </para>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>int <function>write_func</function></funcdef>
	  <paramdef>struct file* <parameter>file</parameter></paramdef>
	  <paramdef>const char* <parameter>buffer</parameter></paramdef>
	  <paramdef>unsigned long <parameter>count</parameter></paramdef>
	  <paramdef>void* <parameter>data</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        The write function should read at most
        <parameter>count</parameter> bytes from the
        <parameter>buffer</parameter>. Note
        that the <parameter>buffer</parameter> doesn't live in the
        kernel's memory space, so it should first be copied to kernel
        space with <function>copy_from_user</function>. The
        <parameter>file</parameter> parameter is usually
        ignored. <xref linkend="usingdata"> shows how to use the
        <parameter>data</parameter> parameter.
      </para>

      <para>
        Again, <xref linkend="example"> shows how to use this call back
        function.
      </para>
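
      <para>
        As a rough illustration of the copy step, the userland
        sketch below mimics a write call back. Since
        <function>copy_from_user</function> only exists in the
        kernel, a stand-in called
        <function>fake_copy_from_user</function> is used; it, the
        buffer <literal>foo_buf</literal> and the limit
        <literal>FOO_LEN</literal> are assumptions made for this
        example. Returning the number of bytes consumed (or a
        negative error code) follows what I understand to be the
        usual kernel convention.
      </para>

      <programlisting><![CDATA[
```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define FOO_LEN 8

/* Hypothetical kernel-side storage for the written value. */
static char foo_buf[FOO_LEN + 1];

/*
 * Userland stand-in for copy_from_user(). Like the real one
 * it returns the number of bytes NOT copied: 0 is success.
 */
static unsigned long fake_copy_from_user(void *to,
                const void *from, unsigned long n)
{
        memcpy(to, from, n);
        return 0;
}

struct file;    /* opaque here; the call back ignores it */

int write_proc_foo(struct file *file, const char *buffer,
                   unsigned long count, void *data)
{
        unsigned long len = count;

        (void)file; (void)data;
        assert(buffer != NULL);

        if (len > FOO_LEN)
                len = FOO_LEN;  /* never overrun foo_buf */

        if (fake_copy_from_user(foo_buf, buffer, len))
                return -EFAULT;

        foo_buf[len] = '\0';

        return (int)len;        /* bytes actually consumed */
}
```
]]></programlisting>

      <para>
        Clamping <parameter>count</parameter> before the copy is the
        important part: userland decides how much it writes, so the
        call back has to decide how much it accepts.
      </para>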
    </sect1>




    <sect1 id="usingdata">
      <title>A single call back for many files</title>

      <para>
        When a large number of almost identical files is used, it's
        quite inconvenient to use a separate call back function for
        each file. A better approach is to have a single call back
        function that distinguishes between the files by using the
        <structfield>data</structfield> field in <structname>struct
        proc_dir_entry</structname>. First of all, the
        <structfield>data</structfield> field has to be initialised:
      </para>

      <programlisting>
struct proc_dir_entry* entry;
struct my_file_data *file_data;

file_data = kmalloc(sizeof(struct my_file_data), GFP_KERNEL);
/* kmalloc can fail; bail out of the (int) init function */
if (file_data == NULL)
        return -ENOMEM;
entry->data = file_data;
      </programlisting>

      <para>
        The <structfield>data</structfield> field is a <type>void
        *</type>, so it can be initialised with anything.
      </para>

      <para>
        Now that the <structfield>data</structfield> field is set, the
        <function>read_proc</function> and
        <function>write_proc</function> can use it to distinguish
        between files because they get it passed into their
        <parameter>data</parameter> parameter:
      </para>

      <programlisting>
int foo_read_func(char *page, char **start, off_t off,
                  int count, int *eof, void *data)
{
        int len = 0;    /* bytes written into page so far */

        if (data == file_data) {
                /* special case for this file */
        } else {
                /* normal processing */
        }

        return len;
}
      </programlisting>

      <para>
        Be sure to free the <structfield>data</structfield> field
        when removing the procfs entry.
      </para>
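
      <para>
        Instead of comparing pointers, the call back can also simply
        use whatever <parameter>data</parameter> points to. The
        userland sketch below shows one function serving two
        "files". The layout of <structname>struct
        my_file_data</structname> (a name and a value) is an
        assumption for the sake of the example; the guide doesn't
        prescribe one.
      </para>

      <programlisting><![CDATA[
```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical per-file data. */
struct my_file_data {
        const char *name;
        int value;
};

/* One call back for all files; data selects the file. */
int foo_read_func(char *page, char **start, off_t off,
                  int count, int *eof, void *data)
{
        struct my_file_data *d = data;

        (void)start; (void)off; (void)count;
        assert(page != NULL && eof != NULL && d != NULL);

        *eof = 1;
        return sprintf(page, "%s = %d\n", d->name, d->value);
}
```
]]></programlisting>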
    </sect1>
  </chapter>




  <chapter id="tips">
    <title>Tips and tricks</title>




    <sect1 id="convenience">
      <title>Convenience functions</title>

      <funcsynopsis>
	<funcprototype>
	  <funcdef>struct proc_dir_entry* <function>create_proc_read_entry</function></funcdef>
	  <paramdef>const char* <parameter>name</parameter></paramdef>
	  <paramdef>mode_t <parameter>mode</parameter></paramdef>
	  <paramdef>struct proc_dir_entry* <parameter>parent</parameter></paramdef>
	  <paramdef>read_proc_t* <parameter>read_proc</parameter></paramdef>
	  <paramdef>void* <parameter>data</parameter></paramdef>
	</funcprototype>
      </funcsynopsis>

      <para>
        This function creates a regular file in exactly the same way
        as <function>create_proc_entry</function> from <xref
        linkend="regularfile"> does, but also allows you to set the
        read function <parameter>read_proc</parameter> in one call.
        This function can set the <parameter>data</parameter> as
        well, as explained in <xref linkend="usingdata">.
      </para>
    </sect1>



    <sect1>
      <title>Modules</title>

      <para>
        If procfs is being used from within a module, be sure to set
        the <structfield>owner</structfield> field in the
        <structname>struct proc_dir_entry</structname> to
        <constant>THIS_MODULE</constant>.
      </para>

      <programlisting>
struct proc_dir_entry* entry;

entry->owner = THIS_MODULE;
      </programlisting>
    </sect1>




    <sect1>
      <title>Mode and ownership</title>

      <para>
        Sometimes it is useful to change the mode and/or ownership of
        a procfs entry. Here is an example that shows how to achieve
        that:
      </para>

      <programlisting>
struct proc_dir_entry* entry;

entry->mode = S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
entry->uid = 0;
entry->gid = 100;
      </programlisting>
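
      <para>
        If the symbolic constants are hard to read, it may help to
        know that the mode in the example above is the familiar
        octal <constant>0644</constant>. The small userland sketch
        below spells that out; the helper name
        <function>example_entry_mode</function> is made up for this
        illustration.
      </para>

      <programlisting><![CDATA[
```c
#include <sys/stat.h>

/* Hypothetical helper returning the mode used above. */
unsigned int example_entry_mode(void)
{
        /* owner: read + write; group and others: read only */
        return S_IWUSR | S_IRUSR | S_IRGRP | S_IROTH;
}
```
]]></programlisting>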
    </sect1>
  </chapter>




  <chapter id="example">
    <title>Example</title>

    <!-- be careful with the example code: it shouldn't be wider than
    approx. 60 columns, or otherwise it won't fit properly on a page
    -->

&procfsexample;

  </chapter>
</book>