We are used to thinking of a directory as containing files. This is really an illusion. Directories do not contain files. The data of the files is not stored in the directory.
A directory is really just a file. It's a special file with special rules (you can't just type "cp /dev/null directory" to erase it). It's got special bits to make sure a mere mortal can't mess it up, because if a file system gets corrupted, you can say goodbye to your data. On older UNIX systems, you actually could "read" the contents of a directory, using 'cat .'. But let me get back to that in a second...
A Unix file is "stored" in two different parts of the disk - the data blocks and the inodes. (I won't get into superblocks and other esoteric information.) The data blocks contain the "contents" of the file. The information about the file is stored elsewhere - in the inode.
Both the inodes and data blocks are stored in a "filesystem", which is how a disk partition is organized. But these inodes are strange and confusing. Let me give you an introduction.
"ls -i" lists the inode of a file
Normal Unix/Linux/MacOS users aren't even aware that inodes exist. But there's an easy way to discover them - using the "ls -i" command. Let's look at the root file system:
% cd /
% ls -i
2637825 bin   983041 etc  1572865 lib    2981889 media  2531329 root   106497 selinux    81921 usr
 196609 boot       2 home 1761281 lib64  2129921 mnt       6416 run   2457601 srv       425985 var
The "-i" option lists the inode number before the filename. Most of the numbers are large, except for "home". Now let's get more information, and list some more files by adding the "-a" and "-l" options:
% ls -lai | head -8
total 132
      2 drwxr-xr-x  24 root root  4096 Feb 26 13:31 .
      2 drwxr-xr-x  24 root root  4096 Feb 26 13:31 ..
2637825 drwxr-xr-x   2 root root  4096 Jan 14 19:02 bin
 196609 drwxr-xr-x   3 root root  4096 Feb 24 10:41 boot
      3 drwxr-xr-x  16 root root  4460 Mar  5 09:35 dev
 983041 drwxr-xr-x 206 root root 12288 Mar  5 07:45 etc
      2 drwxr-xr-x  14 root root  4096 Dec 29 09:24 home
That's interesting - three of the entries (".", ".." and "home") have the inode value of "2". But as you shall see, this makes perfect sense.
Unix systems can support many different types of file systems, but in the "classic" filesystem, inode #2 is always the root of the file system. If you want to look for a file, you start with inode #2 and work down into the directory structure. Normally the ".." directory points to the parent directory, but since "/" is the top of the tree, the parent of "/" is "/".
The "dev" directory has the inode "3". I suspect that when the filesystem was created, the "/dev" directory was the first file to be created.
But, you may wonder, why does "home" have the inode of "2"? You have sharp eyes.
The reason is simple. It happens to be a different partition, and "/home" is the root of that partition.
Inode numbers are always unique, but only within a partition. To uniquely identify a file, you need the inode and the device (the disk partition).
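To see the (device, inode) pair for yourself, here is a minimal sketch. It assumes GNU stat as found on Linux (the "-c" format option is not available in BSD/MacOS stat, which uses "-f" instead), and the paths are only examples. Two names refer to the same file only when both numbers match, which is why "/" and "/.." print the same pair, while "/tmp" may or may not share a device with "/".

```shell
# Assumes GNU stat (Linux); BSD/MacOS stat uses -f instead of -c.
# %d is the device number, %i the inode number, %n the name as given.
pairs=$(stat -c '%d:%i %n' / /.. /tmp)
echo "$pairs"
```

The actual numbers depend on your system, and inside a container the root inode may not be 2.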
What is in an inode?
Earlier I said the data blocks contain the contents of the file. The inode contains the following pieces of information:
- Mode/permission (protection)
- Owner ID
- Group ID
- Size of file
- Number of hard links to the file
- Time last accessed
- Time last modified
- Time inode last modified
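Most of the fields in the list above can be printed with stat(1). This is only a sketch, assuming GNU stat on Linux (BSD/MacOS stat uses different format letters); the scratch file and the variable names are made up for the example.

```shell
# Assumes GNU stat (Linux); BSD/MacOS stat uses -f with different format letters.
sample=$(mktemp)               # scratch file, created just for this example
echo "some data" > "$sample"   # 10 bytes, including the newline

# One inode field per line: permissions, owner, group, size, link count,
# then the access, modification, and inode-change times.
info=$(stat -c 'mode=%A
owner=%u
group=%g
size=%s
links=%h
atime=%x
mtime=%y
ctime=%z' "$sample")
echo "$info"

rm -f "$sample"
```

Note that the filename appears nowhere in this output unless you ask for it with "%n", and even then stat simply echoes back the name you typed.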
As I said, a file system is divided into two parts - the inodes and data blocks. Once the file system is created, the number of blocks of each type is fixed. You can't increase the number of inodes on a partition, or increase the number of disk blocks. (See the manual pages on making and tuning file systems - mkfs.ext2.)
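If you are curious how many inodes a file system was given, and how many are in use, "df -i" will report it. A minimal sketch, using "/tmp" purely as an example path; some file system types allocate inodes on demand and may report zeros here.

```shell
# Report inode totals (not data blocks) for the file system holding /tmp.
inode_report=$(df -i /tmp)
echo "$inode_report"
```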
Notice something missing? Where is the NAME of the file? Or the path? It's NOT in the inode. It's NOT in the data blocks. It's _in_ the directory. That's right. A "file" is really in three (or more) places on the disk.
You see, the directory is just a table that contains the filenames in the directory, and the matching inode numbers. Think of it as a table whose first two entries are always "." and "..". The first points to the inode of the current directory, and the second points to the inode of the parent directory. By Definition. As spoken by the Gods of Unix. Verily.
This inode-magic is how you can create a "hard link" - having two or more names for the same file. Think of a directory as a table, which contains the name and the inode of each file in the directory. This is an important point - the name of the file is only used in the directory. You can have another directory "containing" the same file, but it can have a different name.
When you create a hard link, it just creates a new name in the table, along with the inode, without moving the file. When you move a file (or rename it) within the same filesystem, you don't copy the data. That would be Slow. You just create the (name,inode) entry in the new directory, and delete the old entry from the table in the old directory. In other words, moving a gigabyte file takes very little time. In the same way, you can move/rename directories very easily. That's why "mv /usr /Old_usr" is so fast, even though "/usr" may contain (for example) 57981 files.
You can see this "inode" stuff if you use the "-i" option of ls. It lists the inode number. find(1) can use it as well. Let's also use the "-d" option to list information about the directory, rather than the contents of the directory.
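Before the walk-through, here is a rough sketch that ties the hard link and rename descriptions above to "ls -i" and find(1). The directory and file names are invented for the example, and it assumes the scratch directory lives on a single file system, which is the case for anything created under one mktemp directory.

```shell
# One file, two names, then a rename, with the inode checked each time.
workdir=$(mktemp -d)
mkdir "$workdir/dir1" "$workdir/dir2"
echo "same data" > "$workdir/dir1/original"

# Hard link: a second (name, inode) entry in another directory.
ln "$workdir/dir1/original" "$workdir/dir2/other_name"
inode1=$(ls -i "$workdir/dir1/original" | awk '{print $1}')
inode2=$(ls -i "$workdir/dir2/other_name" | awk '{print $1}')
echo "dir1/original   -> inode $inode1"
echo "dir2/other_name -> inode $inode2"

# find(1) can locate every name under workdir that refers to that inode.
names=$(find "$workdir" -inum "$inode1" | sort)
echo "$names"

# Rename within the same file system: only the directory entry changes.
mv "$workdir/dir1/original" "$workdir/dir1/renamed"
inode3=$(ls -i "$workdir/dir1/renamed" | awk '{print $1}')
echo "dir1/renamed    -> inode $inode3"
```

All three inode numbers should be identical, since no data was copied at any point. Remove the scratch directory with "rm -r" when you are done.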
First, let's make a new directory:
cd /tmp
mkdir junk
cd junk
If you then type
ls -id ..
cd ..
ls -id .
you will get results that look like this:
/tmp/junk$ ls -id ..
327681 ..
/tmp/junk$ cd ..
/tmp$ ls -id .
327681 .