@node I/O Overview, I/O on Streams, Pattern Matching, Top
@c %MENU% Introduction to the I/O facilities
@chapter Input/Output Overview

Most programs need to do either input (reading data) or output (writing
data), or most frequently both, in order to do anything useful. @Theglibc{}
provides such a large selection of input and output functions
that the hardest part is often deciding which function is most
appropriate!

This chapter introduces concepts and terminology relating to input
and output. Other chapters relating to the GNU I/O facilities are:

@itemize @bullet
@item
@ref{I/O on Streams}, which covers the high-level functions
that operate on streams, including formatted input and output.

@item
@ref{Low-Level I/O}, which covers the basic I/O and control
functions on file descriptors.

@item
@ref{File System Interface}, which covers functions for operating on
directories and for manipulating file attributes such as access modes
and ownership.

@item
@ref{Pipes and FIFOs}, which includes information on the basic interprocess
communication facilities.

@item
@ref{Sockets}, which covers a more complicated interprocess communication
facility with support for networking.

@item
@ref{Low-Level Terminal Interface}, which covers functions for changing
how input and output to terminals or other serial devices are processed.
@end itemize


@menu
* I/O Concepts::                Some basic information and terminology.
* File Names::                  How to refer to a file.
@end menu

@node I/O Concepts, File Names, , I/O Overview
@section Input/Output Concepts

Before you can read or write the contents of a file, you must establish
a connection or communications channel to the file. This process is
called @dfn{opening} the file. You can open a file for reading, writing,
or both.
@cindex opening a file

The connection to an open file is represented either as a stream or as a
file descriptor.
You pass this as an argument to the functions that do
the actual read or write operations, to tell them which file to operate
on. Certain functions expect streams, and others are designed to
operate on file descriptors.

When you have finished reading from or writing to the file, you can
terminate the connection by @dfn{closing} the file. Once you have
closed a stream or file descriptor, you cannot do any more input or
output operations on it.

@menu
* Streams and File Descriptors:: The GNU C Library provides two ways
                                 to access the contents of files.
* File Position::                The number of bytes from the
                                 beginning of the file.
@end menu

@node Streams and File Descriptors, File Position, , I/O Concepts
@subsection Streams and File Descriptors

When you want to do input or output to a file, you have a choice of two
basic mechanisms for representing the connection between your program
and the file: file descriptors and streams. File descriptors are
represented as objects of type @code{int}, while streams are represented
as @code{FILE *} objects.

File descriptors provide a primitive, low-level interface to input and
output operations. Both file descriptors and streams can represent a
connection to a device (such as a terminal), or a pipe or socket for
communicating with another process, as well as a normal file. But, if
you want to do control operations that are specific to a particular kind
of device, you must use a file descriptor; there are no facilities to
use streams in this way. You must also use file descriptors if your
program needs to do input or output in special modes, such as
nonblocking (or polled) input (@pxref{File Status Flags}).

Streams provide a higher-level interface, layered on top of the
primitive file descriptor facilities.
The stream interface treats all
kinds of files pretty much alike---the sole exception being the three
styles of buffering that you can choose (@pxref{Stream Buffering}).

The main advantage of using the stream interface is that the set of
functions for performing actual input and output operations (as opposed
to control operations) on streams is much richer and more powerful than
the corresponding facilities for file descriptors. The file descriptor
interface provides only simple functions for transferring blocks of
characters, but the stream interface also provides powerful formatted
input and output functions (@code{printf} and @code{scanf}) as well as
functions for character- and line-oriented input and output.
@c !!! glibc has dprintf, which lets you do printf on an fd.

Since streams are implemented in terms of file descriptors, you can
extract the file descriptor from a stream and perform low-level
operations directly on the file descriptor. You can also initially open
a connection as a file descriptor and then make a stream associated with
that file descriptor.

In general, you should stick with using streams rather than file
descriptors, unless there is some specific operation you want to do that
can only be done on a file descriptor. If you are a beginning
programmer and aren't sure what functions to use, we suggest that you
concentrate on the formatted input functions (@pxref{Formatted Input})
and formatted output functions (@pxref{Formatted Output}).

If you are concerned about portability of your programs to systems other
than GNU, you should also be aware that file descriptors are not as
portable as streams. You can expect any system running @w{ISO C} to
support streams, but @nongnusystems{} may not support file descriptors at
all, or may only implement a subset of the GNU functions that operate on
file descriptors.
Most of the file descriptor functions in @theglibc{}
are included in the POSIX.1 standard, however.

@node File Position, , Streams and File Descriptors, I/O Concepts
@subsection File Position

One of the attributes of an open file is its @dfn{file position} that
keeps track of where in the file the next character is to be read or
written. On @gnusystems{}, and all POSIX.1 systems, the file position
is simply an integer representing the number of bytes from the beginning
of the file.

The file position is normally set to the beginning of the file when it
is opened, and each time a character is read or written, the file
position is incremented. In other words, access to the file is normally
@dfn{sequential}.
@cindex file position
@cindex sequential-access files

Ordinary files permit read or write operations at any position within
the file. Some other kinds of files may also permit this. Files which
do permit this are sometimes referred to as @dfn{random-access} files.
You can change the file position using the @code{fseek} function on a
stream (@pxref{File Positioning}) or the @code{lseek} function on a file
descriptor (@pxref{I/O Primitives}). If you try to change the file
position on a file that doesn't support random access, you get the
@code{ESPIPE} error.
@cindex random-access files

Streams and descriptors that are opened for @dfn{append access} are
treated specially for output: output to such files is @emph{always}
appended sequentially to the @emph{end} of the file, regardless of the
file position. However, the file position is still used to control where in
the file reading is done.
@cindex append-access files

If you think about it, you'll realize that several programs can read a
given file at the same time.
In order for each program to be able to
read the file at its own pace, each program must have its own file
position, which is not affected by anything the other programs do.

In fact, each opening of a file creates a separate file position.
Thus, if you open a file twice even in the same program, you get two
streams or descriptors with independent file positions.

By contrast, if you open a descriptor and then duplicate it to get
another descriptor, these two descriptors share the same file position:
changing the file position of one descriptor will affect the other.

@node File Names, , I/O Concepts, I/O Overview
@section File Names

In order to open a connection to a file, or to perform other operations
such as deleting a file, you need some way to refer to the file. Nearly
all files have names that are strings---even files which are actually
devices such as tape drives or terminals. These strings are called
@dfn{file names}. You specify the file name to say which file you want
to open or operate on.

This section describes the conventions for file names and how the
operating system works with them.
@cindex file name

@menu
* Directories::                 Directories contain entries for files.
* File Name Resolution::        A file name specifies how to look up a file.
* File Name Errors::            Error conditions relating to file names.
* File Name Portability::       File name portability and syntax issues.
@end menu


@node Directories, File Name Resolution, , File Names
@subsection Directories

In order to understand the syntax of file names, you need to understand
how the file system is organized into a hierarchy of directories.

@cindex directory
@cindex link
@cindex directory entry
A @dfn{directory} is a file that contains information to associate other
files with names; these associations are called @dfn{links} or
@dfn{directory entries}.
Sometimes, people speak of ``files in a
directory'', but in reality, a directory only contains pointers to
files, not the files themselves.

@cindex file name component
The name of a file contained in a directory entry is called a @dfn{file
name component}. In general, a file name consists of a sequence of one
or more such components, separated by the slash character (@samp{/}). A
file name which is just one component names a file with respect to its
directory. A file name with multiple components names a directory, and
then a file in that directory, and so on.

Some other documents, such as the POSIX standard, use the term
@dfn{pathname} for what we call a file name, and either @dfn{filename}
or @dfn{pathname component} for what this manual calls a file name
component. We don't use this terminology because a ``path'' is
something completely different (a list of directories to search), and we
think that ``pathname'' used for something else will confuse users. We
always use ``file name'' and ``file name component'' (or sometimes just
``component'', where the context is obvious) in GNU documentation. Some
macros use the POSIX terminology in their names, such as
@code{PATH_MAX}. These macros are defined by the POSIX standard, so we
cannot change their names.

You can find more detailed information about operations on directories
in @ref{File System Interface}.

@node File Name Resolution, File Name Errors, Directories, File Names
@subsection File Name Resolution

A file name consists of file name components separated by slash
(@samp{/}) characters. On the systems that @theglibc{} supports,
multiple successive @samp{/} characters are equivalent to a single
@samp{/} character.

@cindex file name resolution
The process of determining what file a file name refers to is called
@dfn{file name resolution}.
This is performed by examining the
components that make up a file name in left-to-right order, and locating
each successive component in the directory named by the previous
component. Of course, each of the files that are referenced as
directories must actually exist, be directories instead of regular
files, and have the appropriate permissions to be accessible by the
process; otherwise the file name resolution fails.

@cindex root directory
@cindex absolute file name
If a file name begins with a @samp{/}, the first component in the file
name is located in the @dfn{root directory} of the process (usually all
processes on the system have the same root directory). Such a file name
is called an @dfn{absolute file name}.
@c !!! xref here to chroot, if we ever document chroot. -rm

@cindex relative file name
Otherwise, the first component in the file name is located in the
current working directory (@pxref{Working Directory}). This kind of
file name is called a @dfn{relative file name}.

@cindex parent directory
The file name components @file{.} (``dot'') and @file{..} (``dot-dot'')
have special meanings. Every directory has entries for these file name
components. The file name component @file{.} refers to the directory
itself, while the file name component @file{..} refers to its
@dfn{parent directory} (the directory that contains the link for the
directory in question). As a special case, @file{..} in the root
directory refers to the root directory itself, since it has no parent;
thus @file{/..} is the same as @file{/}.

Here are some examples of file names:

@table @file
@item /a
The file named @file{a}, in the root directory.

@item /a/b
The file named @file{b}, in the directory named @file{a} in the root directory.

@item a
The file named @file{a}, in the current working directory.

@item /a/./b
This is the same as @file{/a/b}.

@item ./a
The file named @file{a}, in the current working directory.

@item ../a
The file named @file{a}, in the parent directory of the current working
directory.
@end table

@c An empty string may ``work'', but I think it's confusing to
@c try to describe it. It's not a useful thing for users to use--rms.
A file name that names a directory may optionally end in a @samp{/}.
You can specify a file name of @file{/} to refer to the root directory,
but the empty string is not a meaningful file name. If you want to
refer to the current working directory, use a file name of @file{.} or
@file{./}.

Unlike some other operating systems, @gnusystems{} don't have any
built-in support for file types (or extensions) or file versions as part
of their file name syntax. Many programs and utilities use conventions
for file names---for example, files containing C source code usually
have names suffixed with @samp{.c}---but there is nothing in the file
system itself that enforces this kind of convention.

@node File Name Errors, File Name Portability, File Name Resolution, File Names
@subsection File Name Errors

@cindex file name errors
@cindex usual file name errors

Functions that accept file name arguments usually detect these
@code{errno} error conditions relating to the file name syntax or
trouble finding the named file. These errors are referred to throughout
this manual as the @dfn{usual file name errors}.

@table @code
@item EACCES
The process does not have search permission for a directory component
of the file name.

@item ENAMETOOLONG
This error is used when either the total length of a file name is
greater than @code{PATH_MAX}, or when an individual file name component
has a length greater than @code{NAME_MAX}. @xref{Limits for Files}.

On @gnuhurdsystems{}, there is no imposed limit on overall file name
length, but some file systems may place limits on the length of a
component.

@item ENOENT
This error is reported when a file referenced as a directory component
in the file name doesn't exist, or when a component is a symbolic link
whose target file does not exist. @xref{Symbolic Links}.

@item ENOTDIR
A file that is referenced as a directory component in the file name
exists, but it isn't a directory.

@item ELOOP
Too many symbolic links were resolved while trying to look up the file
name. The system has an arbitrary limit on the number of symbolic links
that may be resolved in looking up a single file name, as a primitive
way to detect loops. @xref{Symbolic Links}.
@end table


@node File Name Portability, , File Name Errors, File Names
@subsection Portability of File Names

The rules for the syntax of file names discussed in @ref{File Names},
are the rules normally used by @gnusystems{} and by other POSIX
systems. However, other operating systems may use other conventions.

There are two reasons why it can be important for you to be aware of
file name portability issues:

@itemize @bullet
@item
If your program makes assumptions about file name syntax, or contains
embedded literal file name strings, it is more difficult to get it to
run under other operating systems that use different syntax conventions.

@item
Even if you are not concerned about running your program on machines
that run other operating systems, it may still be possible to access
files that use different naming conventions. For example, you may be
able to access file systems on another computer running a different
operating system over a network, or read and write disks in formats used
by other operating systems.
@end itemize

The @w{ISO C} standard says very little about file name syntax, only that
file names are strings. In addition to varying restrictions on the
length of file names and what characters can validly appear in a file
name, different operating systems use different conventions and syntax
for concepts such as structured directories and file types or
extensions. Some concepts such as file versions might be supported in
some operating systems and not by others.

The POSIX.1 standard allows implementations to put additional
restrictions on file name syntax, concerning what characters are
permitted in file names and on the length of file name and file name
component strings. However, on @gnusystems{}, any character except
the null character is permitted in a file name string, and
on @gnuhurdsystems{} there are no limits on the length of file name
strings.
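
Because these length limits can differ from one file system to another, a
program that cares about portability may prefer to query the limit at run
time instead of assuming a fixed value. The following is only a minimal
sketch of that idea, assuming a POSIX system that provides the
@code{pathconf} function and the @code{_PC_NAME_MAX} parameter; the
helper names @code{max_component_length} and @code{component_fits} are
hypothetical and exist only for this illustration.

@verbatim
```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: return the maximum length of a file name
   component in DIR, or -1 if the file system reports no fixed limit or
   the query itself fails.  When -1 is returned, errno is still 0 if
   there is simply no limit.  */
long
max_component_length (const char *dir)
{
  errno = 0;
  return pathconf (dir, _PC_NAME_MAX);
}

/* Hypothetical helper: return 1 if NAME is short enough to be used as a
   single component in DIR, and 0 otherwise.  A result of -1 from the
   query is treated here as "no known limit".  */
int
component_fits (const char *dir, const char *name)
{
  long limit = max_component_length (dir);
  if (limit < 0)
    return 1;
  return strlen (name) <= (size_t) limit;
}
```
@end verbatim

For instance, @code{component_fits (".", "notes.txt")} reports whether
that name is short enough for the current working directory. This sketch
deliberately treats a failed query the same as the absence of a limit; a
real program would probably want to inspect @code{errno} to tell the two
cases apart.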