It seems that almost everything in Unix is a file. Even things that don't appear in the directory structure are often accessed through file descriptors in some way or another. In the file system, files can be many things: symbolic links, FIFOs, device files, directories, or even normal files, amongst other things.
All stat() variants return information about the file "file_name", such as its size, the date it was last modified, and lots of other goodies stored in the i-node. The information is stored in a struct stat:

    struct stat {
        dev_t         st_dev;     /* device */
        ino_t         st_ino;     /* inode */
        umode_t       st_mode;    /* protection */
        nlink_t       st_nlink;   /* number of hard links */
        uid_t         st_uid;     /* user ID of owner */
        gid_t         st_gid;     /* group ID of owner */
        dev_t         st_rdev;    /* device type (if inode device) */
        off_t         st_size;    /* total size, in bytes */
        unsigned long st_blksize; /* blocksize for filesystem I/O */
        unsigned long st_blocks;  /* number of blocks allocated */
        time_t        st_atime;   /* time of last access */
        time_t        st_mtime;   /* time of last modification */
        time_t        st_ctime;   /* time of last change */
    };
The fstat() call returns the identical information, but can be used if you already have open()'d the file since it accepts the file descriptor as the argument. Finally, lstat() will return information to you about a symbolic link, rather than information about the file it points to.
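The field st_mode contains the file permissions, as well as info that describes what the file type is. A series of macros will tell you a file's type when passed a certain st_mode. Each one returns true if the file is of the given type:

    Table 1. File type macros.

    S_ISREG(m)     regular file
    S_ISDIR(m)     directory
    S_ISLNK(m)     symbolic link
    S_ISCHR(m)     character device
    S_ISBLK(m)     block device
    S_ISFIFO(m)    FIFO
    S_ISSOCK(m)    socket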
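Additionally, the st_mode field encodes what the file permissions are. Specifically, these are the bits that make up the "rwxrwxrwx" that you see when you do an "ls -l". Each bit has its own macro:

    Table 2. Permission bit macros.

                read       write      execute
    owner       S_IRUSR    S_IWUSR    S_IXUSR
    group       S_IRGRP    S_IWGRP    S_IXGRP
    other       S_IROTH    S_IWOTH    S_IXOTH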
Another field in
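There are two other bits in the st_mode field of the struct stat: the set-user-ID (SUID) bit and the set-group-ID (SGID) bit. When a program with the SUID bit set is executed, it runs with the effective user ID of the file's owner instead of that of the user who ran it. The SGID bit does the same thing for the group.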
This is the basis for the power behind the legendary "SUID root" programs--these programs have the SUID bit set and are owned by root. Thus, they run as root no matter who executes them. For instance, if you had a copy of /bin/sh which was SUID-root, you would effectively be root when you were running shell commands! Power, indeed.
When a new file is created, the owner is set to the effective user-ID of the creating process. The group of the file is either set to the effective group-ID of the creating process, or to the group of the directory if the SGID bit is set on the directory.
A way to modify the permissions on a file that is being created is to set the process' umask with the umask() system call. Basically, you set the bits of the umask to the permissions you want to mask out of the file permissions.
For instance, the following snippet of code creat()s a file with 0600 ("-rw-------") permissions, even though the creat() call asks for 0666 permissions:
    /* don't set any of the following bits on file creation: */
    umask(S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH);  /* 0066 */

    /* creat the file: */
    creat("bar", S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH);  /* 0666 */

    /* file is now created with 0600 permissions */
On a bit level, the formula is this:
new_mask = mode & ~umask
so the following is true of the above example:
    umask    = 000110110 = ---rw-rw- = 0066
    ~umask   = 111001001
    mode     = 110110110 = rw-rw-rw- = 0666
    new_mask = 110000000 = rw------- = 0600
Use the access() system call to determine if the calling process has read, write, or execute permissions for a file, or if the process is allowed to check for the existence of a file. (This last case could fail if the file is buried past a directory that the process doesn't have execute permission for.)
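Simply set pathname to the name of the file to check, then set mode to one or more of the following values OR'd together:

    R_OK    test for read permission
    W_OK    test for write permission
    X_OK    test for execute permission
    F_OK    test for existence of the file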
If 0 is returned, it means you can access the file. If you don't have the requested permissions, -1 will be returned and errno will be set to EACCES.
This system call accomplishes the same thing as the Unix utility chmod. With chmod(), you can set the file permission bits for any particular file (if the calling process owns the file). Set path to the full path of the file to change, then set mode to whatever permissions you desire. The mode can be constructed by OR'ing together the macros in Table 2, or can be the octal representation of the permission.
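The following six defined macros are also useful in this situation:

    S_ISUID    set-user-ID bit
    S_ISGID    set-group-ID bit
    S_ISVTX    sticky bit
    S_IRWXU    read, write, and execute for owner
    S_IRWXG    read, write, and execute for group
    S_IRWXO    read, write, and execute for other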
The "sticky bit"? Well, it's a long story. In a nutshell, if the sticky bit is set on a directory, files in that directory can only be renamed or removed if the user is the owner of the file, the owner of the directory, or the superuser.
For instance, the /tmp directory often has the sticky bit set as well as world-write permissions:
drwxrwxrwt 3 root root 2048 Aug 12 22:23 /tmp
In this case, anyone can write to the directory to their heart's content, but no one else (except the superuser) can remove their files since the sticky bit is set. Feature.
fchmod() works the same way, but is useful if you already have an open file descriptor to the file in question.
These functions can be used to set the owner UID and GID of a given file. Note that on some systems, these calls are only available to superuser as they can be used to get around disk quotas. Usage is similar to chmod(). Later, we'll discuss functions that can be used to determine the necessary numerical UID and GID for a given user or group name.
These functions truncate a file to a given length. If the length is less than the current length of the file, the data past the new length is discarded. If length is greater than the length of the file, the results are system dependent; it could be that the file is extended to the new length (SVR4) or nothing at all happens (4.3BSD).
Another way to truncate a file to zero length is to pass the O_TRUNC flag to open():
open("foo", O_RDWR | O_TRUNC);
Not all systems implement the truncate() call, but many do.
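One use for this trick (unlinking a file while you still have it open) is in creating temp files: open the file with creat() or open(), then immediately unlink() it. The name vanishes from the directory, but the data stays around until the last open descriptor is closed. Now the temp file is invisible to everyone else. More importantly, if your program crashes, the temp file cleanup is automatic because the file is already unlink()'d!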
Another way to unlink a file is to use the ANSI remove() function.
rename() is the basis for the Unix "mv" command. It can be used to rename a file, or to move a file to a new directory. Note that if you want to simply move a file from one directory to another but preserve the name, you must still specify the file name in the newpath (given that "foo" is a normal file, below):
    rename("foo", "/tmp/foo");  /* valid */
    rename("foo", "/tmp");      /* invalid */
More simply, the command
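Most functions automatically follow the symlink, with the exception of chown(), lstat(), readlink(), remove(), rename(), and unlink(), all of which operate on the symlink itself. (Be careful with chown(): on current POSIX systems it does follow the symlink, and lchown() is the call that changes the link itself.)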
The return value is the length of the symlink in bytes, or -1 on error. Note that the string stored in buf is not automatically null-terminated.
Sample call:
    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[51];
        int count;

        /* leave room in buf for the terminator we add below */
        if ((count = readlink("/home", buf, 50)) == -1) {
            perror("readlink");
            exit(1);
        }

        buf[count] = '\0'; /* null terminate that puppy */

        printf("/home -> %s\n", buf);

        return 0;
    }
On my machine here at home, this produces the following:
/home -> /usr/local/home
which is absolutely correct. (I don't have enough room in my root partition for all the crap that's amassed in my home directory, so all the home directories are moved to my 2.2GB partition. 36% full--and to think I grew up on 92K floppies.)
Use this system call to set the access time and modification time of a file. The Unix touch command often uses this function.
The filename is passed as the filename argument, and the time is passed as a pointer to a struct utimbuf:

    struct utimbuf {
        time_t actime;   /* access time */
        time_t modtime;  /* modification time */
    };
Each of the time values is the number of seconds since the Epoch (on Unix, 00:00:00 UTC on January 1, 1970). In the following sample, the times of file "foo" are set to fixed values, and the times of "bar" are set to whatever time it is now (both files already exist prior to the run):
    #include <sys/types.h>
    #include <utime.h>

    int main(void)
    {
        struct utimbuf tb;

        tb.actime = 2300000;   /* Jan 27 1970 on Linux */
        tb.modtime = 2400000;  /* Jan 28 1970 */

        utime("foo", &tb);
        utime("bar", NULL);    /* set the times on bar to now */

        return 0;
    }
The following are the results immediately after running the above program:
    $ date
    Wed Aug 13 15:33:52 PDT 1997
    $ ls -l foo bar        [Shows last modification time]
    -rw-r--r--   1 beej   users   0 Aug 13 15:33 bar
    -rw-r--r--   1 beej   users   0 Jan 28  1970 foo
    $ ls -lu foo bar       [Shows last access time]
    -rw-r--r--   1 beej   users   0 Aug 13 15:33 bar
    -rw-r--r--   1 beej   users   0 Jan 27  1970 foo
As previously mentioned, you can access the values for access and modification time through the st_atime and st_mtime fields of the struct stat filled in by stat().
Directories are files, just like everything else. They consist of a number of records that contain the filename, i-node number for the corresponding file, and maybe some other things. What is in there specifically is system-dependent, so those great POSIX guys designed a bunch of routines for reading directories that are platform independent. Use these whenever you want to read a directory (unless you really really know exactly what you're doing and why you're doing it).
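The basic idea is this: open the directory with opendir(), call readdir() repeatedly to fetch one entry at a time, and call closedir() when you're done. Each call to readdir() hands you a pointer to a struct dirent (this is the Linux version):

    struct dirent {
        long            d_ino;       /* i-node number */
        __kernel_off_t  d_off;       /* kernel stuff */
        unsigned short  d_reclen;    /* more kernel stuff */
        char            d_name[256]; /* Aha! This is the name!! */
    };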
Of all the above fields, the only one that's sure to be there (and the only one you really care about) is d_name which is the null-terminated name of the file. Once you have the file name, you can do anything.
Once you're done with the directory, close it to free up the DIR*.
Returns the current offset in the directory. Like the library call ftell(), except for directories.
Seeks to a specified point in the directory stream. You should only use values returned from telldir() for your offset.
Read a directory and print out the i-node number and filename (Linux):
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <dirent.h>

    int main(void)
    {
        struct dirent *de;
        DIR *d;

        if ((d = opendir("/home")) == NULL) {
            perror("opendir");
            exit(1);
        }

        /* print the i-node and name for each file: */
        while ((de = readdir(d)) != NULL)
            printf("%7lu %s\n", (unsigned long)de->d_ino, de->d_name);

        closedir(d);

        return 0;
    }
Output on my machine (Ooh! Those lucky people with an account on Beej's computer!):
      65537 .
          2 ..
      67585 ftp
      83969 beej
     239617 becca
     307201 aaron
     309249 carl
     391169 bapper
     243719 sd
      10284 pberry
Those of you who got plenty of sleep last night will undoubtedly remember that "/home" is actually a symlink on my machine, and will notice that opendir() automatically followed the symlink.
The sync() call schedules the writing of all unflushed buffers in the buffer cache. It doesn't wait around for this to actually occur, but returns immediately. You can be pretty sure, though, that they will be written in the next few seconds.
For a given file descriptor, fd, writes all unwritten blocks to disk, then returns when done. One way to get files to do this all the time automatically is to use the O_SYNC flag when open()ing them.