UNIX File APIs: General File APIs, Directory File APIs, Device File APIs, File and Record Locking, FIFO File APIs, Symbolic Link File APIs
This chapter describes how UNIX applications interface with files. After reading this
chapter, students should be able to write programs that operate on any type of file
in a UNIX system.
To illustrate the application of the UNIX APIs for files, some C++ programs are shown that
implement the UNIX commands ls, mv, chmod, chown and touch based on
these APIs. This chapter also defines a C++ class called file. The file class inherits all the
properties of the C++ fstream class and has additional member functions to create objects
of any file type as well as to display and change file object attributes.
A file in a UNIX system may be one of the following types:
➢ Regular file.
➢ Directory file.
➢ FIFO file.
➢ Character device file.
➢ Block device file.
➢ Symbolic link file.
The general file APIs perform various operations on files in a file system. They are described below.
open:
The open function opens or creates a file by establishing a connection between the calling process and
the file. It can create brand new files, and once a file exists, any process can call
open to get a file descriptor that refers to it. The file descriptor is then used in the read
and write system calls to access the file content. Its prototype is:
int open(const char *path_name, int access_mode, mode_t permission);
access_mode:
An integer value, in the form of manifest constants, that specifies how the
file is to be accessed by the calling process. The manifest constants are classified as access
mode flags (O_RDONLY, O_WRONLY, O_RDWR) and access modifier flags.
- If a file is to be opened for read-only, the file should already exist and no other modifier flags
can be used.
- O_APPEND, O_TRUNC, O_CREAT and O_EXCL are applicable to regular files, whereas
O_NONBLOCK is for FIFO and device files only, and O_NOCTTY is for terminal device
files only.
Permission:
➢ The permission argument is required only if the O_CREAT flag is set in the
access_mode argument. It specifies the access permission of the file for its owner,
group and all the other people.
➢ Its data type is int and its value is an octal integer, such as 0764. The left-most,
middle and right-most digits specify the access permission for owner, group and others
respectively.
➢ In each octal digit the left-most, middle and right-most bits specify read, write and
execute permission respectively.
For example, in 0764 the digit 7 is for owner, 6 is for group and 4 is for others.
7 = 111 specifies read, write and execute permission for owner.
6 = 110 specifies read and write permission for group.
4 = 100 specifies read permission for others.
Each bit is either 1, which means a right is granted, or 0, which means it is not.
➢ POSIX.1 defines the permission data type as mode_t and its value as manifest
constants that are aliases for octal integer values. For example, the 0764 permission
value should be specified as:
S_IRWXU | S_IRGRP | S_IWGRP | S_IROTH
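As a minimal sketch of the permission argument in use, the following C fragment creates a file with the POSIX.1 constants that correspond to 0764. The macro PERM_0764 and the helper create_with_0764 are hypothetical names introduced only for this illustration. Note that the permission actually assigned to a new file is further restricted by the umask of the calling process.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* POSIX.1 spelling of the octal permission 0764:
 * rwx for owner, rw- for group, r-- for others.
 * (PERM_0764 is a name made up for this sketch.) */
#define PERM_0764 (S_IRWXU | S_IRGRP | S_IWGRP | S_IROTH)

/* Hypothetical helper: create path_name (or truncate it if it already
 * exists) for writing and request permission 0764.
 * Returns a file descriptor, or -1 with errno set by open. */
int create_with_0764(const char *path_name)
{
    assert(path_name != NULL);
    return open(path_name, O_WRONLY | O_CREAT | O_TRUNC, PERM_0764);
}
```

The permission argument is passed here because O_CREAT is set; without O_CREAT, open would be called with two arguments only.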
creat:
The creat system call is used to create new regular files. Its prototype is:
int creat(const char *path_name, mode_t mode);
- The path_name argument is the path name of a file to be created.
- The mode argument is same as that for open API.
Since the O_CREAT flag was added to the open API, open can both create and open regular
files, so the creat API has become obsolete. It is retained for backward compatibility with
early versions of UNIX.
The creat function can be implemented using the open function as:
#define creat(path_name, mode) \
        open(path_name, O_WRONLY|O_CREAT|O_TRUNC, mode)
read:
This function fetches a fixed size block of data from a file referenced by a given file
descriptor.
Its prototype is:
ssize_t read(int fdesc, void *buf, size_t size);
➢ fdesc: is an integer file descriptor that refers to an opened file.
➢ buf: is the address of a buffer holding any data read.
➢ size: specifies how many bytes of data are to be read from the file.
Note: the read function can read text or binary files. This is why the data type of buf is a
universal pointer (void *).
For example, the following code sequentially reads one or more records of struct sample-typed
data from a file called dbase:
struct sample { int x; double y; char* a;} varX;
int fd = open("dbase", O_RDONLY);
while ( read(fd, &varX, sizeof(varX)) > 0)
{ /* process the record now stored in varX */ }
➢ The return value of read is the number of bytes of data successfully read and stored in
the buf argument. Normally it is equal to the size value.
➢ If a file contains fewer than size bytes of data remaining to be read, the return value of
read will be less than size. If end-of-file is reached, read will return zero.
➢ Where ssize_t, the return type of read, is defined as int in the <sys/types.h> header,
users should not set size to exceed INT_MAX in any read function call. In general, size
should not exceed SSIZE_MAX.
➢ If a read function call is interrupted by a caught signal and the OS does not restart the
system call automatically, POSIX.1 allows two possible behaviors:
- The read function will return -1 value, errno will be set to EINTR, and all the data will be
discarded.
- The read function will return the number of bytes of data read prior to the signal
interruption. This allows a process to continue reading the file.
The read function may block a calling process execution if it is reading a FIFO or device file
and data is not yet available to satisfy the read request. Users may specify the
O_NONBLOCK or O_NDELAY flags on a file descriptor to request nonblocking read
operations on the corresponding file.
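The return-value rules above (short reads, end-of-file, and EINTR) can be combined into one loop. The following is a minimal sketch, assuming a blocking file descriptor; read_fully is a hypothetical helper name, not part of the UNIX API.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to size bytes from fdesc into buf,
 * continuing after short reads and retrying when a caught signal
 * interrupts the call before any data is transferred (EINTR).
 * Returns the number of bytes stored (less than size only when
 * end-of-file was reached), or -1 on any error other than EINTR. */
ssize_t read_fully(int fdesc, void *buf, size_t size)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < size) {
        ssize_t n = read(fdesc, p + done, size - done);
        if (n > 0) {
            done += (size_t)n;   /* possibly a short read: keep going */
        } else if (n == 0) {
            break;               /* end-of-file */
        } else if (errno != EINTR) {
            return -1;           /* a real error */
        }
        /* n == -1 with errno == EINTR: retry the call */
    }
    return (ssize_t)done;
}
```

With a nonblocking descriptor, read may also fail with EAGAIN when no data is available; this sketch would report that case as an error.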
write:
The write function puts a fixed size block of data to a file referenced by a file descriptor.
Its prototype is:
ssize_t write(int fdesc, const void *buf, size_t size);
➢ fdesc: is an integer file descriptor that refers to an opened file.
➢ buf: is the address of a buffer which contains data to be written to the file.
➢ size: specifies how many bytes of data are in the buf argument.
Note: the write function can write text or binary files. This is why the data type of buf is a
universal pointer (void *). For example, the following code fragment writes ten records of
struct sample-typed data to a file called dbase2:
struct sample { int x; double y; char* a;} varX[10];
int fd = open("dbase2", O_WRONLY);
write(fd, (void*)varX, sizeof varX);
➢ The return value of write is the number of bytes of data successfully written to a file.
It should be equal to the size value.
➢ If the write will cause the file size to exceed a system imposed limit or if the file
system disk is full, the return value of write will be the actual number of bytes written
before the function was aborted.
➢ If a signal arrives during a write function call and the OS does not restart the system
call automatically, the write function may either return a -1 value and set errno to
EINTR or return the number of bytes of data written prior to the signal interruption.
➢ The write function may perform nonblocking operation if the O_NONBLOCK or
O_NDELAY flags are set on the fdesc argument to the function.
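Because write may return fewer bytes than requested or be interrupted by a signal, callers often wrap it in a loop. The following is a minimal sketch under the assumption of a blocking descriptor; write_fully is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write all size bytes of buf to fdesc,
 * continuing after partial writes and retrying on EINTR.
 * Returns size on success, or -1 on any error other than EINTR
 * (in which case some leading bytes may already have been written). */
ssize_t write_fully(int fdesc, const void *buf, size_t size)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL);
    while (done < size) {
        ssize_t n = write(fdesc, p + done, size - done);
        if (n > 0) {
            done += (size_t)n;   /* partial write: send the rest */
        } else if (n < 0 && errno != EINTR) {
            return -1;           /* e.g. disk full or size limit hit */
        }
        /* n == -1 with errno == EINTR: retry the call */
    }
    return (ssize_t)done;
}
```

For example, write_fully(fd, varX, sizeof varX) could replace the single write call in a fragment like the one above when every record must reach the file.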
close:
The close function disconnects a file from a process. Its prototype is:
int close(int fdesc);
➢ fdesc: is an integer file descriptor that refers to an opened file.
➢ The return value of close is zero if the call succeeds or -1 if it fails.
➢ The close function frees unused file descriptors so that they can be reused to
reference other files.
➢ The close function will deallocate system resources which reduces the memory
requirement of a process.
➢ If a process terminates without closing all the files it has opened, the kernel will close
files for the process.
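The reuse of a closed file descriptor can be observed directly. The sketch below is illustrative only, assumes a single-threaded process, and uses a hypothetical helper name, descriptor_is_reused. It relies on the POSIX rule that open returns the lowest-numbered descriptor not currently open.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path_name read-only, close it, and open
 * it again. Returns 1 if the second open received the same descriptor
 * number as the first, 0 if it did not, and -1 if any call failed. */
int descriptor_is_reused(const char *path_name)
{
    int first, second, same;

    assert(path_name != NULL);
    first = open(path_name, O_RDONLY);
    if (first == -1)
        return -1;
    if (close(first) == -1)      /* frees the descriptor number */
        return -1;
    second = open(path_name, O_RDONLY);
    if (second == -1)
        return -1;
    same = (second == first);
    if (close(second) == -1)
        return -1;
    return same;
}
```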
link:
The link function creates a new link for an existing file. This function does not create a new
file; it creates a new path name for an existing file. Its prototype is:
int link(const char *cur_link, const char *new_link);
➢ cur_link: is a path name of an existing file.
➢ new_link: is a new path name to be assigned to the same file.
➢ If this call succeeds, the hard link count attribute of the file will be increased by 1.
➢ link cannot be used to create hard links across file systems. It cannot be used on
directory files unless it is called by a process that has superuser privilege.
The ln command is implemented using the link API.
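The chapter's original ln program is not reproduced here. As a stand-in, the following minimal C sketch shows the core of what such a program does: it calls link and reports failure. The helper names make_hard_link and link_count are hypothetical and were introduced for this illustration; link_count uses stat to read the hard link count so the effect of link can be checked.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical core of an ln-like program: give the existing file
 * cur_link the additional path name new_link.
 * Returns 0 on success, or -1 after printing the reason for failure. */
int make_hard_link(const char *cur_link, const char *new_link)
{
    assert(cur_link != NULL && new_link != NULL);
    if (link(cur_link, new_link) == -1) {
        perror("link");
        return -1;
    }
    return 0;
}

/* Returns the hard link count of path_name, or -1 if stat fails. */
long link_count(const char *path_name)
{
    struct stat st;

    if (stat(path_name, &st) == -1)
        return -1;
    return (long)st.st_nlink;
}
```

A command-line wrapper would typically check that two path names were supplied and then call make_hard_link(argv[1], argv[2]).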
File and Record Locking:
UNIX systems allow multiple processes to read and write the same file concurrently, which
provides data sharing among processes. It also makes it difficult for any process to
determine when data in a file may be overwritten by another process. This is a problem in
applications such as a database manager, where no other process should read or write a file while a
process is accessing it. To overcome this drawback, UNIX and POSIX systems
support a file locking mechanism.
File locking is applicable only for regular files. It allows a process to impose a lock on a file
so that other processes cannot modify the file until it is unlocked by the process.
A process can impose a write lock or a read lock on either a portion of a file or an entire file.
The difference between write locks and read locks is that when a write lock is set, it prevents
other processes from setting any overlapping read or write locks on the locked region of a
file. On the other hand, when a read lock is set, it prevents other processes from setting any
overlapping write locks on the locked region of a file.
The intention of a write lock is to prevent other processes from both reading and writing the
locked region while the process that sets the lock is modifying the region. A write lock is also
known as an exclusive lock. The use of a read lock is to prevent other processes from writing
to the locked region while the process that sets the lock is reading data from the region. Other
processes are allowed to lock and read data from the locked regions. Hence, a read lock is
also called a shared lock.
Mandatory Lock:
Mandatory locks are enforced by an operating system kernel. If a mandatory exclusive lock is
set on a file, no process can use the read or write system calls to access data in the locked
region. If a mandatory shared lock is set on a region of a file, no process can use the write
system call to modify the locked region. Mandatory locks are used to synchronize reading and
writing of shared files by multiple processes: if a process locks a file, other processes that
attempt to write to the locked regions are blocked until the former process releases its lock.
Mandatory locks may cause problems: If a runaway process sets a mandatory exclusive lock
on a file and never unlocks it, no other processes can access the locked region of the file until
either the runaway process is killed or the system is rebooted. System V.3 and V.4 support
mandatory locks.
Advisory Lock:
An advisory lock is not enforced by a kernel at the system call level. This means that even
though a lock (read or write) may be set on a file, other processes can still use the read or write
APIs to access the file. To make use of advisory locks, processes that manipulate the same
file must cooperate such that they follow this procedure for every read or write operation to
the file:
a. Try to set a lock at the region to be accessed. If this fails, a process can either wait for the
lock request to become successful or go do something else and try to lock the file again later.
b. After a lock is acquired successfully, read or write the locked region.
c. Release the lock.
The drawback of advisory locks is that all programs that share the file must
follow the above file locking procedure to be cooperative. This may be difficult to control
when programs are obtained from different sources.
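The cooperative procedure above can be sketched in C. This example assumes the POSIX fcntl record-locking interface (struct flock with F_SETLK, F_SETLKW and F_UNLCK), which this section does not itself show. The helper names region_lock and locked_write are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set (F_RDLCK or F_WRLCK) or clear (F_UNLCK) an
 * advisory lock on len bytes starting at byte offset start.
 * If wait is nonzero the call blocks until the lock can be granted
 * (F_SETLKW); otherwise it fails at once when the region is locked. */
static int region_lock(int fd, short type, off_t start, off_t len, int wait)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;      /* start is measured from file start */
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
}

/* Lock the region, write to it, then release the lock.
 * Returns 0 on success or -1 on failure. */
int locked_write(int fd, off_t start, const void *buf, size_t size)
{
    int ok = 0;

    assert(buf != NULL && size > 0);
    if (region_lock(fd, F_WRLCK, start, (off_t)size, 1) == -1)
        return -1;                                   /* lock not set */
    if (lseek(fd, start, SEEK_SET) != (off_t)-1 &&
        write(fd, buf, size) == (ssize_t)size)       /* access region */
        ok = 1;
    if (region_lock(fd, F_UNLCK, start, (off_t)size, 0) == -1)
        return -1;                                   /* unlock failed */
    return ok ? 0 : -1;
}
```

A reader would follow the same pattern with F_RDLCK, which lets several readers hold overlapping locks at the same time while excluding writers that also follow the procedure.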
Directory File APIs:
Directory files in UNIX and POSIX systems are used to help users organize their files
into some structure based on the specific use of each file.
They are also used by the operating system to convert file path names to their inode numbers.
Directory files are created in BSD UNIX and POSIX.1 by the mkdir API:
int mkdir(const char *path_name, mode_t mode);
1. The path_name argument is the path name of a directory to be created.
2. The mode argument specifies the access permission for the owner, group and others to be
assigned to the file.
3. The return value of mkdir is 0 if it succeeds or -1 if it fails.
UNIX System V.3 uses the mknod API to create directory files.
UNIX System V.4 supports both the mkdir and mknod APIs for creating directory files.
The difference between the two APIs is that a directory created by mknod does not contain
the "." and ".." links. On the other hand, a directory created by mkdir has the "." and ".."
links created in one atomic operation, and it is ready to be used.
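As a minimal sketch of mkdir in use, the following C fragment creates a directory with permission rwxr-xr-x (subject to the umask of the process) and checks the result with stat. The helper names make_directory and is_directory are hypothetical.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create the directory path_name with permission
 * rwxr-xr-x (0755). Returns 0 on success or -1 on failure. */
int make_directory(const char *path_name)
{
    assert(path_name != NULL);
    return mkdir(path_name,
                 S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
}

/* Returns 1 if path_name names a directory, 0 if it names another
 * file type, and -1 if stat fails. */
int is_directory(const char *path_name)
{
    struct stat st;

    if (stat(path_name, &st) == -1)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

Because mkdir sets up the "." and ".." links itself, path names such as path_name/. and path_name/.. resolve as directories immediately after the call returns.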
Device File APIs
Device files are used to interface physical devices with application programs. Specifically,
when a process reads or writes to a device file, the kernel uses the major and minor device
numbers of a file to select a device driver function to carry out the actual data transfer.
Device files may be character-based or block-based. UNIX systems define the mknod API to
create device files.
- In UNIX System V.4, the major and minor device numbers are extended to fourteen and eighteen bits,
respectively.
- In UNIX, if a calling process has no controlling terminal and it opens a character device
file, the kernel will set this device file as the controlling terminal of the process. However, if
the O_NOCTTY flag is set in the open call, such action will be suppressed.
- The O_NONBLOCK flag specifies that the open call and any subsequent read or write
calls to a device file should be nonblocking to the process.
FIFO File APIs:
FIFO files are sometimes called named pipes.
➢ Pipes can be used only between related processes when a common ancestor has
created the pipe.
➢ Creating a FIFO is similar to creating a file.
➢ Indeed the pathname for a FIFO exists in the file system.
The prototype of mkfifo is:
int mkfifo(const char *pathname, mode_t mode);
➢ The first argument pathname is the pathname (filename) of a FIFO file to be
created.
➢ The second argument mode specifies the access permission for user, group and
others, as well as the S_IFIFO flag to indicate that it is a FIFO file.
➢ On success it returns 0 and on failure it returns -1.
Example:
mkfifo("divya", S_IFIFO | S_IRWXU | S_IRGRP | S_IROTH);
➢ The above statement creates a FIFO file “divya” with read-write-execute permission
for user and only read permission for group and others.
➢ Once we have created a FIFO using mkfifo, we open it using open.
➢ Indeed, the normal file I/O functions (read, write, unlink etc) all work with FIFOs.
When a process opens a FIFO file for reading, the kernel will block the process until
another process opens the same file for writing.
Similarly, whenever a process opens a FIFO file for writing, the kernel will block the process until
another process opens the same FIFO for reading.
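The blocking behavior of open on a FIFO can be seen in a small two-process sketch. The function name fifo_round_trip is hypothetical, and the FIFO here is created with owner read and write permission only. The child opens the FIFO for writing and the parent opens it for reading; each open returns only once the other side has opened the FIFO.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: create a FIFO at path, fork a child that sends
 * msg through it, and read up to size - 1 bytes into out in the
 * parent. Returns the number of bytes read, or -1 on failure.
 * The FIFO is removed again before returning. */
ssize_t fifo_round_trip(const char *path, const char *msg,
                        char *out, size_t size)
{
    pid_t pid;
    int fd;
    ssize_t n;

    assert(path && msg && out && size > 0);
    if (mkfifo(path, S_IRUSR | S_IWUSR) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* child: this open blocks until the parent opens for reading */
        int wfd = open(path, O_WRONLY);
        if (wfd == -1 || write(wfd, msg, strlen(msg)) == -1)
            _exit(1);
        close(wfd);
        _exit(0);
    }
    /* parent: this open blocks until the child opens for writing */
    fd = open(path, O_RDONLY);
    n = (fd == -1) ? -1 : read(fd, out, size - 1);
    if (n >= 0)
        out[n] = '\0';           /* make the result a C string */
    if (fd != -1)
        close(fd);
    waitpid(pid, NULL, 0);       /* reap the child */
    unlink(path);                /* FIFOs are removed like files */
    return n;
}
```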
Symbolic Link File APIs:
➢ A symbolic link is an indirect pointer to a file, unlike a hard link, which points
directly to the inode of the file.
➢ Symbolic links were developed to get around the limitations of hard links:
- Symbolic links can link files across file systems.
- Symbolic links can link directory files.
- Symbolic links always reference the latest version of the files to which they link.
➢ There are no file system limitations on a symbolic link and what it points to and
anyone can create a symbolic link to a directory.
➢ Symbolic links are typically used to move a file or an entire directory hierarchy to
some other location on a system.
➢ A symbolic link is created with the symlink function.
The prototype is:
int symlink(const char *org_link, const char *sym_link);
The org_link and sym_link arguments to a symlink call specify the original file path name
and the symbolic link path name to be created.
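As a minimal sketch, the following C fragment creates a symbolic link and then inspects it. It additionally assumes the POSIX readlink and lstat functions, which are not described in this section. The helper names make_and_read_symlink and is_symlink are hypothetical.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create sym_link as a symbolic link to org_link,
 * then read the stored target path back into target (size bytes).
 * Returns the length of the stored path, or -1 on failure. */
ssize_t make_and_read_symlink(const char *org_link, const char *sym_link,
                              char *target, size_t size)
{
    ssize_t n;

    assert(org_link && sym_link && target && size > 1);
    if (symlink(org_link, sym_link) == -1)
        return -1;
    n = readlink(sym_link, target, size - 1);
    if (n == -1)
        return -1;
    target[n] = '\0';            /* readlink does not add a terminator */
    return n;
}

/* Returns 1 if path_name is itself a symbolic link, 0 if it is not,
 * and -1 if lstat fails. lstat examines the link, not its target. */
int is_symlink(const char *path_name)
{
    struct stat st;

    if (lstat(path_name, &st) == -1)
        return -1;
    return S_ISLNK(st.st_mode) ? 1 : 0;
}
```

The org_link path may name a directory or a file on another file system, which is what distinguishes symlink from link.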