
UNIX FILE API’S

This chapter describes how UNIX applications interface with files. After reading this chapter, students should be able to write programs that perform the following functions on any type of file in a UNIX system:

Create files.
Open files.
Transfer data to and from files.
Close files.
Remove files.
Query file attributes.
Change file attributes.
Truncate files.

To illustrate the applications of UNIX APIs for files, some C++ programs are depicted to show the implementation of the UNIX commands ls, mv, chmod, chown and touch based on these APIs. This chapter defines a C++ class called file. This file class inherits all the properties of the C++ fstream class, and it has additional member functions to create objects of any file type as well as to display and change file object attributes.

General File APIs :

A file in a UNIX system may be one of the following types:

➢ Regular file.

➢ Directory file.

➢ FIFO file.

➢ Character device file.

➢ Block device file.

➢ Symbolic Link file.

The file APIs that are available to perform various operations on files in a file system are:

FILE APIs USE :

Open:

It is used to open or create a file by establishing a connection between the calling process and a file. It can be used to create brand new files, and after a file is created any process can call the open function to get a file descriptor that refers to the file. The file descriptor is used in the read and write system calls to access the file content.

The prototype of the open function is:

#include <sys/types.h>
#include <fcntl.h>
int open(const char *path_name, int access_mode, mode_t permission);

path_name:

The pathname of a file to be opened or created. It can be an absolute path name or relative path name. The pathname can also be a symbolic link name.

access_mode:

An integer value in the form of manifest constants which specifies how the file is to be accessed by the calling process. The manifest constants can be classified as access mode flags (O_RDONLY, O_WRONLY and O_RDWR) and access modifier flags (such as O_APPEND, O_CREAT, O_EXCL, O_TRUNC, O_NONBLOCK and O_NOCTTY), which may be bitwise-ORed with one access mode flag.

For example, a process is normally blocked on reading an empty pipe or on writing to a pipe that is full. The O_NONBLOCK flag may be used to specify that such read and write operations are non-blocking.

int fdesc = open("/usr/xyz/prog1", O_RDWR|O_APPEND, 0);

If a file is to be opened for read-only, the file should already exist and no other modifier flags can be used.
O_APPEND, O_TRUNC, O_CREAT and O_EXCL are applicable for regular files, whereas
O_NONBLOCK is for FIFO and device files only, and O_NOCTTY is for terminal device file only.

Permission:

➢ The permission argument is required only if the O_CREAT flag is set in the access_mode argument. It specifies the access permission of the file for its owner, group and all the other people.

➢ Its data type is int and its value is an octal integer, such as 0764. The left-most, middle and right-most octal digits specify the access permission for owner, group and others respectively.

➢ In each octal digit the left-most, middle and right-most bits specify read, write and execute permission respectively.

For example 0764 specifies 7 is for owner, 6 is for group and 4 is for other.

7 = 111 specifies read, write and execution permission for owner.
6 = 110 specifies read and write permission for group.
4 = 100 specifies read permission for others.
Each bit is either 1, which means a right is granted or zero, for no such rights.

➢ POSIX.1 defines the permission data type as mode_t, and its value is given as manifest constants which are aliases for octal integer values. For example, the 0764 permission value should be specified as:

S_IRWXU | S_IRGRP | S_IWGRP | S_IROTH
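As a minimal sketch of how the permission argument might be passed to open, the following C fragment creates a file with the symbolic equivalent of 0764. The macro PERM_0764 and the helper create_with_0764 are hypothetical names introduced for illustration. Note that the process umask may clear some of the requested bits.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Symbolic form of the octal permission 0764:
 * rwx for owner, rw- for group, r-- for others. */
#define PERM_0764 (S_IRWXU | S_IRGRP | S_IWGRP | S_IROTH)

/* Hypothetical helper: create (or truncate) a regular file for writing.
 * Returns a file descriptor, or -1 on failure with errno set by open. */
int create_with_0764(const char *path_name)
{
    /* The third argument is consulted only because O_CREAT is set.
     * The kernel masks it with the process umask before storing it. */
    return open(path_name, O_WRONLY | O_CREAT | O_TRUNC, PERM_0764);
}
```

A caller would check the return value against -1 and eventually pass the descriptor to close.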

creat:

The creat system call is used to create new regular files. Its prototype is:

#include <sys/types.h>
#include <fcntl.h>
int creat(const char *path_name, mode_t mode);

The path_name argument is the path name of a file to be created.
The mode argument is same as that for open API.

Since the O_CREAT flag was added to the open API, open can be used both to create and to open regular files, so the creat API has become obsolete. It is retained for backward compatibility with early versions of UNIX.

The creat function can be implemented using the open function as:

#define creat(path_name, mode) \
    open(path_name, O_WRONLY|O_CREAT|O_TRUNC, mode)

read:

This function fetches a fixed size block of data from a file referenced by a given file descriptor.

Its prototype is:

#include <sys/types.h>
#include <unistd.h>
ssize_t read(int fdesc, void *buf, size_t size);

➢ fdesc: is an integer file descriptor that refers to an opened file.

➢ buf: is the address of a buffer holding any data read.

➢ size: specifies how many bytes of data are to be read from the file.

Note: the read function can read text or binary files. This is why the data type of buf is a universal pointer (void *).

For example, the following code sequentially reads one or more records of struct sample-typed data from a file called dbase:

struct sample { int x; double y; char* a; } varX;
int fd = open("dbase", O_RDONLY);
while (read(fd, &varX, sizeof(varX)) > 0)
    ; /* process the data record stored in varX */

➢ The return value of read is the number of bytes of data successfully read and stored in the buf argument. It is normally equal to the size value.

➢ If a file contains fewer than size bytes of data remaining to be read, the return value of read will be less than size. If end-of-file is reached, read will return a zero value.

➢ Because the return type of read, ssize_t, is a signed type (historically defined as int in the <sys/types.h> header), users should not set size to exceed INT_MAX in any read function call.

➢ If a read function call is interrupted by a caught signal and the OS does not restart the system call automatically, POSIX.1 allows two possible behaviors:

The read function will return -1 value, errno will be set to EINTR, and all the data will be discarded.
The read function will return the number of bytes of data read prior to the signal interruption. This allows a process to continue reading the file.

The read function may block a calling process execution if it is reading a FIFO or device file and data is not yet available to satisfy the read request. Users may specify the O_NONBLOCK or O_NDELAY flags on a file descriptor to request nonblocking read operations on the corresponding file.
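The return-value rules above can be sketched as a loop that keeps calling read until the requested amount has arrived, end-of-file is reached, or a real error occurs. The helper name read_fully is hypothetical, and the sketch assumes a blocking descriptor (O_NONBLOCK not set).

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to size bytes into buf, retrying after
 * short reads and after interruptions by caught signals (EINTR).
 * Returns the number of bytes stored (less than size only at
 * end-of-file), or -1 on any other error. */
ssize_t read_fully(int fdesc, void *buf, size_t size)
{
    char *p = buf;
    size_t total = 0;

    while (total < size) {
        ssize_t n = read(fdesc, p + total, size - total);
        if (n > 0)
            total += (size_t)n;      /* partial or complete read */
        else if (n == 0)
            break;                   /* end-of-file */
        else if (errno == EINTR)
            continue;                /* interrupted before any data: retry */
        else
            return -1;               /* genuine error */
    }
    return (ssize_t)total;
}
```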

write:

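The write function puts a fixed size block of data to a file referenced by a file descriptor.

Its prototype is:

#include <sys/types.h>
#include <unistd.h>
ssize_t write(int fdesc, const void *buf, size_t size);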

fdesc: is an integer file descriptor that refers to an opened file.

buf: is the address of a buffer which contains data to be written to the file.

size: specifies how many bytes of data are in the buf argument.

Note: the write function can write text or binary files. This is why the data type of buf is a universal pointer (void *).

For example, the following code fragment writes ten records of struct sample-typed data to a file called dbase2:

struct sample { int x; double y; char* a; } varX[10];
int fd = open("dbase2", O_WRONLY);
write(fd, (void*)varX, sizeof varX);

➢ The return value of write is the number of bytes of data successfully written to a file. It should be equal to the size value.

➢ If the write will cause the file size to exceed a system imposed limit or if the file system disk is full, the return value of write will be the actual number of bytes written before the function was aborted.

➢ If a signal arrives during a write function call and the OS does not restart the system call automatically, the write function may either return a -1 value and set errno to EINTR or return the number of bytes of data written prior to the signal interruption.

➢ The write function may perform nonblocking operation if the O_NONBLOCK or O_NDELAY flags are set on the fdesc argument to the function.
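Because write may return fewer bytes than requested, callers often wrap it in a retry loop. The following is a minimal sketch of that idea; the helper name write_fully is hypothetical, and a blocking descriptor is assumed.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly size bytes from buf, retrying after
 * partial writes and after interruptions by caught signals (EINTR).
 * Returns size on success, or -1 on any other error (for example when
 * the disk is full or a file size limit is exceeded). */
ssize_t write_fully(int fdesc, const void *buf, size_t size)
{
    const char *p = buf;
    size_t total = 0;

    while (total < size) {
        ssize_t n = write(fdesc, p + total, size - total);
        if (n >= 0)
            total += (size_t)n;      /* partial or complete write */
        else if (errno == EINTR)
            continue;                /* interrupted before any data: retry */
        else
            return -1;               /* genuine error */
    }
    return (ssize_t)total;
}
```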

close:

The close function disconnects a file from a process. Its prototype is:

#include <unistd.h>
int close(int fdesc);

fdesc: is an integer file descriptor that refers to an opened file.

➢ The return value of close is zero if the call succeeds or -1 if it fails.

➢ The close function frees unused file descriptors so that they can be reused to reference other files.

➢ The close function will deallocate system resources which reduces the memory requirement of a process.

➢ If a process terminates without closing all the files it has opened, the kernel will close files for the process.
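The reuse of freed descriptors can be observed with a small experiment. This is only a sketch: the function name descriptor_is_reused is hypothetical, and it relies on the POSIX rule that open returns the lowest-numbered descriptor not currently open, with no other thread opening files in between.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a file, close it, and open it again.
 * Returns 1 if the second open received the same descriptor number
 * as the first, 0 if not, and -1 if open or close failed. */
int descriptor_is_reused(const char *path_name)
{
    int first = open(path_name, O_RDONLY);
    if (first == -1)
        return -1;
    if (close(first) == -1)          /* frees the descriptor number */
        return -1;

    int second = open(path_name, O_RDONLY);
    if (second == -1)
        return -1;

    int same = (second == first);
    close(second);
    return same;
}
```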

link:

The link function creates a new link for an existing file. This function does not create a new file. It creates a new path name for an existing file.

Its prototype is:

#include <unistd.h>
int link(const char *cur_link, const char *new_link);

cur_link: is a path name of an existing file.

new_link: is a new path name to be assigned to the same file.

➢ If this call succeeds, the hard link count attribute of the file will be increased by 1.

➢ link cannot be used to create hard links across file systems. It cannot be used on directory files unless it is called by a process that has superuser privilege.

The ln command is implemented using the link API.
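The original program listing is not reproduced here. As a stand-in, the following is a minimal sketch of the core of an ln-like operation; the helper name make_hard_link is hypothetical, and stat is used only to show the hard link count after the call.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: give the existing file cur_link the additional
 * path name new_link. Returns the file's hard link count after the
 * call, or -1 on failure (for example if new_link already exists or
 * lies on a different file system). */
long make_hard_link(const char *cur_link, const char *new_link)
{
    struct stat st;

    if (link(cur_link, new_link) == -1) {
        perror("link");
        return -1;
    }
    if (stat(new_link, &st) == -1) {
        perror("stat");
        return -1;
    }
    return (long)st.st_nlink;        /* one higher than before the call */
}
```

A command-line wrapper would pass argv[1] and argv[2] to this function and exit with a nonzero status when it returns -1.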

File and Record Locking:

UNIX systems allow multiple processes to read and write the same file concurrently, which provides data sharing among processes. It also makes it difficult for any process to determine when data in a file may be overwritten by another process. This matters in applications such as a database manager, where no other process should read or write a file while one process is accessing it. To overcome this drawback, UNIX and POSIX systems support a file locking mechanism.

File locking is applicable only for regular files. It allows a process to impose a lock on a file so that other processes cannot modify the file until it is unlocked by the process. A process can impose a write lock or a read lock on either a portion of a file or an entire file. The difference between write locks and read locks is that when a write lock is set, it prevents other processes from setting any overlapping read or write locks on the locked region of a file. On the other hand, when a read lock is set, it prevents other processes from setting any overlapping write locks on the locked region of a file.

The intention of a write lock is to prevent other processes from both reading and writing the locked region while the process that sets the lock is modifying the region. A write lock is also known as an exclusive lock. The use of a read lock is to prevent other processes from writing to the locked region while the process that sets the lock is reading data from the region. Other processes are allowed to lock and read data from the locked regions. Hence, a read lock is also called a shared lock.

Mandatory Lock :

Mandatory locks are enforced by an operating system kernel. If a mandatory exclusive lock is set on a file, no process can use the read or write system calls to access data in the locked region. If a mandatory shared lock is set on a region of a file, no process can use the write system call to modify the locked region. Mandatory locks are used to synchronize reading and writing of shared files by multiple processes: if a process locks a file, other processes that attempt to write to the locked regions are blocked until the former process releases its lock. Mandatory locks may cause problems: if a runaway process sets a mandatory exclusive lock on a file and never unlocks it, no other process can access the locked region of the file until either the runaway process is killed or the system is rebooted. System V.3 and V.4 support mandatory locks.

Advisory Lock :

An advisory lock is not enforced by a kernel at the system call level. This means that even though a lock (read or write) may be set on a file, other processes can still use the read or write APIs to access the file. To make use of advisory locks, processes that manipulate the same file must cooperate such that they follow this procedure for every read or write operation to the file:

a. Try to set a lock at the region to be accessed. If this fails, a process can either wait for the lock request to become successful or go do something else and try to lock the file again later.

b. After a lock is acquired successfully, read or write the locked region.

c. Release the lock.

The drawback of advisory locks is that all programs that share a file must follow the above file locking procedure to be cooperative. This may be difficult to control when programs are obtained from different sources.
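The cooperative procedure (lock, access, unlock) can be sketched with the POSIX fcntl record-locking interface, which the text above does not name and is used here as an assumption about the underlying API. The helper names set_region_lock and locked_write are hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to set or clear an advisory lock on len bytes
 * starting at offset start. lock_type is F_RDLCK (shared), F_WRLCK
 * (exclusive) or F_UNLCK (release). F_SETLK does not block: it returns
 * -1 at once if another process holds a conflicting lock. */
int set_region_lock(int fdesc, short lock_type, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = lock_type;
    fl.l_whence = SEEK_SET;          /* start is measured from offset 0 */
    fl.l_start = start;
    fl.l_len = len;                  /* 0 would mean "through end of file" */
    return fcntl(fdesc, F_SETLK, &fl);
}

/* Cooperative update: lock the region, write it, release the lock.
 * Returns 0 on success, -1 on failure. */
int locked_write(int fdesc, const void *buf, size_t size, off_t offset)
{
    ssize_t n = -1;

    if (set_region_lock(fdesc, F_WRLCK, offset, (off_t)size) == -1)
        return -1;                   /* lock refused: caller may retry later */
    if (lseek(fdesc, offset, SEEK_SET) != (off_t)-1)
        n = write(fdesc, buf, size);
    set_region_lock(fdesc, F_UNLCK, offset, (off_t)size);
    return n == (ssize_t)size ? 0 : -1;
}
```

Using F_SETLKW in place of F_SETLK would make the lock request wait instead of failing immediately, which corresponds to the "wait for the lock request to become successful" option in the procedure.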

Directory File APIs :

Directory files in UNIX and POSIX systems are used to help users organize their files into some structure based on the specific use of each file.

They are also used by the operating system to convert file path names to their inode numbers. Directory files are created in BSD UNIX and POSIX.1 by the mkdir API:

#include <sys/stat.h>
#include <unistd.h>
int mkdir(const char *path_name, mode_t mode);

1. The path_name argument is the path name of a directory to be created.

2. The mode argument specifies the access permission for the owner, group and others to be assigned to the file.

3. The return value of mkdir is 0 if it succeeds or -1 if it fails.

UNIX System V.3 uses the mknod API to create directory files.

UNIX System V.4 supports both the mkdir and mknod APIs for creating directory files.

The difference between the two APIs is that a directory created by mknod does not contain the "." and ".." links. On the other hand, a directory created by mkdir has the "." and ".." links created in one atomic operation, and it is ready to be used.
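A minimal sketch of creating a directory with mkdir and confirming the result follows. The helper name make_directory and the chosen permission (the symbolic form of 0755) are assumptions made for illustration.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a directory with rwxr-xr-x permission
 * (before the umask is applied) and verify that it exists.
 * Returns 0 on success, -1 on failure. */
int make_directory(const char *path_name)
{
    mode_t mode = S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH;
    struct stat st;

    if (mkdir(path_name, mode) == -1)
        return -1;                   /* e.g. the path already exists */
    if (stat(path_name, &st) == -1)
        return -1;
    return S_ISDIR(st.st_mode) ? 0 : -1;
}
```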

Device File APIs

Device files are used to interface physical devices with application programs. Specifically, when a process reads or writes to a device file, the kernel uses the major and minor device numbers of a file to select a device driver function to carry out the actual data transfer. Device files may be character-based or block-based. UNIX systems define the mknod API to create device files.

In UNIX System V.4, the major and minor device numbers are extended to fourteen and eighteen bits, respectively.

In UNIX, if a calling process has no controlling terminal and it opens a character device file, the kernel will set this device file as the controlling terminal of the process. However, if the O_NOCTTY flag is set in the open call, such action will be suppressed.

The O_NONBLOCK flag specifies that the open call and any subsequent read or write calls to a device file should be nonblocking to the process.

FIFO file API’s

FIFO files are sometimes called named pipes.

➢ Pipes can be used only between related processes when a common ancestor has created the pipe.

➢ Creating a FIFO is similar to creating a file.

➢ Indeed the pathname for a FIFO exists in the file system.

The prototype of mkfifo is:

#include <sys/types.h>
#include <sys/stat.h>
int mkfifo(const char *path_name, mode_t mode);

➢ The first argument pathname is the pathname(filename) of a FIFO file to be created.

➢ The second argument, mode, specifies the access permission for user, group and others, as well as the S_IFIFO flag to indicate that it is a FIFO file.

➢ On success it returns 0 and on failure it returns -1.

Example:

mkfifo("divya", S_IFIFO | S_IRWXU | S_IRGRP | S_IROTH);

➢ The above statement creates a FIFO file “divya” with read-write-execute permission for user and only read permission for group and others.

➢ Once we have created a FIFO using mkfifo, we open it using open.

➢ Indeed, the normal file I/O functions (read, write, unlink etc) all work with FIFOs.

When a process opens a FIFO file for reading, the kernel will block the process until there is another process that opens the same file for writing.

Similarly, whenever a process opens a FIFO file for writing, the kernel will block the process until another process opens the same FIFO for reading.
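The blocking rendezvous between a reader and a writer can be sketched with a parent and child process sharing one FIFO. The helper name fifo_round_trip is hypothetical. The S_IFIFO flag is left out of the mode here on the assumption that mkfifo implies the file type, and the message is assumed to be short enough to be written in one call.

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a FIFO, let a child process write msg into
 * it, and read the message back in the parent. Returns the number of
 * bytes read into buf, or -1 on failure. The FIFO is removed afterwards. */
ssize_t fifo_round_trip(const char *path_name, const char *msg,
                        char *buf, size_t size)
{
    ssize_t n = -1;

    if (mkfifo(path_name, S_IRUSR | S_IWUSR) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        unlink(path_name);
        return -1;
    }
    if (pid == 0) {
        /* Child: this open blocks until the parent opens for reading. */
        int wfd = open(path_name, O_WRONLY);
        if (wfd == -1)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* Parent: this open blocks until the child opens for writing. */
    int rfd = open(path_name, O_RDONLY);
    if (rfd != -1) {
        n = read(rfd, buf, size);    /* ordinary read works on a FIFO */
        close(rfd);
    } else {
        kill(pid, SIGKILL);          /* do not leave the child blocked */
    }
    waitpid(pid, NULL, 0);
    unlink(path_name);               /* the FIFO is a path name like any other */
    return n;
}
```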

Symbolic Link File API’s :

➢ A symbolic link is an indirect pointer to a file, unlike hard links, which point directly to the inode of the file.

➢ Symbolic links are developed to get around the limitations of hard links:

➢ Symbolic links can link files across file systems.

➢ Symbolic links can link directory files

➢ Symbolic links always reference the latest version of the files to which they link

➢ There are no file system limitations on a symbolic link and what it points to and anyone can create a symbolic link to a directory.

➢ Symbolic links are typically used to move a file or an entire directory hierarchy to some other location on a system.

➢ A symbolic link is created with the symlink function.

The prototype is:

#include <unistd.h>
int symlink(const char *org_link, const char *sym_link);

The org_link and sym_link arguments to a symlink call specify the original file path name and the symbolic link path name to be created.
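A minimal sketch of creating a symbolic link and inspecting it follows. The helper name make_and_read_symlink is hypothetical. The companion calls lstat and readlink are not covered in the text above and are used here only to show that the link stores a path name instead of pointing at an inode.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create sym_link pointing at org_link, then read
 * the stored path back into buf as a NUL-terminated string.
 * Returns the length of the stored path, or -1 on failure.
 * org_link need not exist: symlink records the name without checking it. */
ssize_t make_and_read_symlink(const char *org_link, const char *sym_link,
                              char *buf, size_t size)
{
    struct stat st;

    if (size == 0)
        return -1;
    if (symlink(org_link, sym_link) == -1)
        return -1;

    /* lstat examines the link itself; stat would follow it. */
    if (lstat(sym_link, &st) == -1 || !S_ISLNK(st.st_mode))
        return -1;

    ssize_t n = readlink(sym_link, buf, size - 1);
    if (n == -1)
        return -1;
    buf[n] = '\0';                   /* readlink does not add a terminator */
    return n;
}
```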