Typically a file entry consists of a file name and its unique identifier
This identifier leads to more information such as where the file is located on disk, etc.
The OS presents the file as a series of blocks (e.g. 1024 bytes), and gives block numbers to the application in relative terms
(e.g. block 1, block 2) even though these may actually be, say, blocks 10,000 and 2,500 on the disk
This decouples the actual file storage from the application
The file itself may have an index, a map of block number to pointers to the actual block
For large files, the index may have an index, etc.
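As a rough sketch (not any particular on-disk format), such an index could map relative block numbers to disk block numbers, with an indirect block serving as the index's own index for larger files; the struct layout and the 12-direct-pointer split below are illustrative assumptions:

```c
#include <stdint.h>
#include <stdio.h>

#define DIRECT_PTRS  12     /* illustrative split between direct/indirect  */
#define PTRS_PER_BLK 256    /* pointers that fit in one 1024-byte block    */

/* Hypothetical per-file index: direct pointers plus one indirect block. */
struct file_index {
    uint32_t direct[DIRECT_PTRS];     /* logical blocks 0..11              */
    uint32_t indirect[PTRS_PER_BLK];  /* the "index of the index"          */
};

/* Translate an application-relative block number to a disk block number. */
uint32_t logical_to_physical(const struct file_index *idx, uint32_t logical)
{
    if (logical < DIRECT_PTRS)
        return idx->direct[logical];
    return idx->indirect[logical - DIRECT_PTRS];
}

int main(void)
{
    struct file_index idx = { .direct = { [1] = 10000, [2] = 2500 } };

    /* The application asks for "block 1"; on disk it is really block 10000. */
    printf("logical 1 -> physical %u\n", (unsigned)logical_to_physical(&idx, 1));
    printf("logical 2 -> physical %u\n", (unsigned)logical_to_physical(&idx, 2));
    return 0;
}
```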
For hard links, Unix keeps track of the number of entries pointing to a given file in its inode, only deleting the file
when the reference count reaches 0
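This count is visible from user space as st_nlink; a small sketch using link(2) and stat(2) (file names are made up):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    struct stat st;

    /* Create a file, then add a second directory entry (hard link) to it. */
    int fd = open("original.txt", O_CREAT | O_WRONLY, 0644);
    close(fd);
    link("original.txt", "alias.txt");   /* two names, one inode */

    stat("original.txt", &st);
    printf("link count: %lu\n", (unsigned long)st.st_nlink);   /* prints 2 */

    unlink("original.txt");   /* count drops to 1; data is still reachable */
    unlink("alias.txt");      /* count reaches 0; the inode is freed       */
    return 0;
}
```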
The OS must search its directory entries for the given file in order to perform operations on it. To prevent
the OS from having to do this every time, the system calls require a file handle returned from the open system
call. Open creates an indexed entry in an open file table for quick access and returns an index into that table as the
file handle
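A minimal sketch of the handle in action: open(2) does the directory lookup once and returns a small integer that later calls use instead of the file name (the path below is made up):

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[128];

    /* The directory search happens once, here. */
    int fd = open("notes.txt", O_RDONLY);
    if (fd < 0)
        return 1;

    /* Subsequent operations reference the handle, not the file name. */
    ssize_t n = read(fd, buf, sizeof buf);
    printf("read %zd bytes via handle %d\n", n, fd);

    close(fd);   /* releases this process's open-file-table entry */
    return 0;
}
```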
To allow for multiple processes operating on a file at once there are two levels of open file tables:
Per-process table -> This keeps current read/write pointer location, etc.
System-wide open-file table -> This keeps shared information like location on disk, file open count, etc.
Files can also be opened with locks
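On Linux/BSD one form of this is an advisory lock via flock(2); a brief sketch (file name assumed):

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
    int fd = open("shared.dat", O_RDWR | O_CREAT, 0644);   /* made-up name */
    if (fd < 0)
        return 1;

    flock(fd, LOCK_EX);   /* block until we hold an exclusive advisory lock */
    /* ... read/modify/write the file safely here ... */
    flock(fd, LOCK_UN);   /* release the lock */

    close(fd);
    return 0;
}
```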
Unix treats all files as a stream of individually addressable bytes
File systems can also be distributed (mounted across different hosts), e.g. HDFS
Transfers to disk currently happen in whole blocks (the disk cannot be byte-addressed like memory, though this could change in the future)
In Linux, the basic file system issues generic commands to device drivers to read and write from disks (e.g. retrieve
block 23)
A layer above that, the file organization module keeps track of information about files and their blocks
Each file has logical blocks numbered 0 through N
This system also tracks unallocated blocks
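One common way to track unallocated blocks is a free-block bitmap; a toy sketch (the bitmap approach and sizes here are illustrative, not a specific filesystem's format):

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_BLOCKS 1024                    /* illustrative volume size     */

static uint8_t free_map[NUM_BLOCKS / 8];   /* 1 bit per block; 0 = free    */

/* Find a free block, mark it allocated, and return its number (-1 if full). */
int alloc_block(void)
{
    for (int b = 0; b < NUM_BLOCKS; b++) {
        if (!(free_map[b / 8] & (1u << (b % 8)))) {
            free_map[b / 8] |= (uint8_t)(1u << (b % 8));
            return b;
        }
    }
    return -1;
}

/* Mark a block as free again when its file is deleted or truncated. */
void free_block(int b)
{
    free_map[b / 8] &= (uint8_t)~(1u << (b % 8));
}

int main(void)
{
    int b = alloc_block();
    printf("allocated block %d\n", b);
    free_block(b);
    return 0;
}
```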
A layer above that is the logical file system, which organizes the file system structure and keeps track of metadata
It maintains this structure through File Control Blocks (FCBs) or inodes
inodes contain information about the file - permissions, its location on disk, etc.
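stat(2) surfaces a subset of this inode information to user space; a small sketch (the path is made up):

```c
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;

    if (stat("notes.txt", &st) != 0)
        return 1;

    printf("inode:       %lu\n", (unsigned long)st.st_ino);
    printf("permissions: %o\n", st.st_mode & 0777);
    printf("link count:  %lu\n", (unsigned long)st.st_nlink);
    printf("size:        %lld bytes\n", (long long)st.st_size);
    return 0;
}
```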
A boot block is the first block of a volume and tells the system how to boot the OS on that volume (if there is one)
A volume control block comes next, which describes the number of blocks in the volume, free space, etc.
To create a file, the system allocates an inode, reads the corresponding directory entry into memory, and updates the
directory entry with the new filename and inode info
On Unix, inodes are pre-allocated on a volume
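From user space this whole sequence is triggered by a single call; a minimal sketch (the path is made up):

```c
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* Asks the logical file system to allocate an inode (on Unix, one of the
       pre-allocated ones) and to add a "newfile.txt" -> inode entry to the
       current directory. */
    int fd = open("newfile.txt", O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0)
        return 1;

    close(fd);
    return 0;
}
```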
Some systems use a buffer cache which caches file system blocks that are likely to be used again
Most systems cache file data and process information in page caches (unified virtual memory)
Synchronous writes
Ensure that data is written in the order it is issued and writes are not buffered (e.g. used for database transactions)
Asynchronous writes
Data is written to the cache (eventually flushed to disk). Most writes are asynchronous
Disk flushes happen when convenient
The device driver may send these writes in an order that minimizes seek time for HDDs, for example
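In POSIX terms the difference shows up as an explicit fsync (or opening with O_SYNC) versus a plain write that only lands in the cache; a hedged sketch (the path is made up):

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *msg = "commit record\n";

    int fd = open("journal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return 1;

    /* Asynchronous by default: write() returns once the data is in the cache. */
    if (write(fd, msg, strlen(msg)) < 0)
        return 1;

    /* Force a synchronous flush when ordering/durability matters
       (e.g. a database transaction commit). */
    fsync(fd);

    close(fd);
    return 0;
}
```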
A directory structure is just a tree that starts at the root and has pointers to files and their associated FCBs
A directory entry is just a mapping from filename to inode number
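The mapping is directly visible through readdir(3), which exposes the name and inode number of each entry:

```c
#include <dirent.h>
#include <stdio.h>

int main(void)
{
    DIR *dir = opendir(".");
    if (!dir)
        return 1;

    /* Each entry is essentially (name, inode number). */
    struct dirent *de;
    while ((de = readdir(dir)) != NULL)
        printf("%-20s -> inode %lu\n", de->d_name, (unsigned long)de->d_ino);

    closedir(dir);
    return 0;
}
```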
Mounting requires the device name and a location (mount point) within the main file system where the mounted file system attaches
Whenever macOS detects a disk, it searches for a filesystem on that disk and if it exists, mounts it under /Volumes
In Linux, specify mount points in fstab (the filesystem table)
Windows mounts each file system at a different letter (e.g. E:\)
In Unix, if a filesystem is mounted on a directory, the directory has a flag set saying that it is a mount point and
has a field pointing to the corresponding entry in the mount table
This then has a pointer to the superblock of the mounted filesystem (the metadata of the filesystem, UUID,
filesystem type, no. blocks, no. free blocks, etc.)
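Programmatically, the same information (device, mount point, filesystem type) is what mount(2) takes on Linux; a sketch with made-up device and directory names (in practice this is usually done via the mount command or fstab, and needs root):

```c
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    /* Attach the filesystem on /dev/sdb1 at /mnt/data; the kernel fills in
       the mount-table entry and reads the filesystem's superblock here.
       Device, mount point, and type are assumptions for illustration. */
    if (mount("/dev/sdb1", "/mnt/data", "ext4", 0, NULL) != 0) {
        perror("mount");
        return 1;
    }
    return 0;
}
```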
Virtual File Systems / Virtual File System Layer #
To allow things like NFS, etc. to work/be mounted within an ext4 filesystem without directly supporting every
type of filesystem, Unix implements virtual filesystems
The filesystem interface (e.g. read, write, open, close) speaks to a virtual filesystem (VFS) implementation
that abstracts the different filesystem details away from the OS
The VFS maintains vnodes, which are globally unique network-wide (inodes are only unique per filesystem)
The VFS performs the actual filesystem operations requested by the filesystem interface (handles local requests,
uses a network protocol for NFS, etc.)
All filesystems can be thought of as virtual file systems (and go through the VFS pipeline), even the root
Made up of 4 main objects:
inode
file object (open files)
superblock
dentry object (directory entry)
Each object type must implement certain functionality
For the file object, for example, this makes up the file_operations struct
As long as a filesystem implementation implements the required filesystem operations, the VFS can just call those
specific methods and be completely agnostic
The VFS is the translation layer, it does not implement filesystem-specific code
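A condensed, user-space sketch of that idea, loosely modeled on the kernel's file_operations table (fields abridged and types simplified; each concrete filesystem supplies its own function pointers and the VFS calls through them without knowing which filesystem is behind them):

```c
#include <stddef.h>
#include <sys/types.h>

struct my_file;   /* stands in for the VFS "file object" */

/* Abridged operations table; the real struct file_operations has many
   more fields and kernel-specific types. */
struct file_ops {
    ssize_t (*read)(struct my_file *f, char *buf, size_t len, off_t *pos);
    ssize_t (*write)(struct my_file *f, const char *buf, size_t len, off_t *pos);
    int     (*open)(struct my_file *f);
    int     (*release)(struct my_file *f);
};

/* A concrete filesystem provides its own implementations... */
static ssize_t myfs_read(struct my_file *f, char *buf, size_t len, off_t *pos)
{
    (void)f; (void)buf; (void)len; (void)pos;
    return 0;   /* filesystem-specific logic would go here */
}

static const struct file_ops myfs_ops = { .read = myfs_read };

/* ...and the VFS layer just calls through the pointers, staying agnostic. */
ssize_t vfs_read(struct my_file *f, const struct file_ops *ops,
                 char *buf, size_t len, off_t *pos)
{
    return ops->read(f, buf, len, pos);
}

int main(void)
{
    char buf[16];
    off_t pos = 0;
    return (int)vfs_read(NULL, &myfs_ops, buf, sizeof buf, &pos);
}
```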
NFS is a client-server filesystem (recall that with the file system interface and virtual file systems these files can
be interacted with just as if they were local, as long as the NFS protocol is implemented internally in the open, close
methods, etc.)
One machine may be both the client and the server
First the remote directory is mounted
The client then connects to the mountd daemon on the server through an RPC. If permissions are allowed, the mountd
daemon will return a file handle for the mounted file system (file system identifier and inode number of mounted directory)
The client sends the server hostname and the directory to be mounted in the request
The server will maintain an export list of directories that can be mounted along with the machines that can mount them
Whenever a client tries to access a file that is actually an NFS file, the VFS implementation makes an RPC to the server
with the file handle returned from the mount, and user and group id’s for permission checking
Thus, user and group id’s must be the same on the client and server
nfsd is the NFS daemon responding to requests on the server
The NFS protocol for file access is just a series of RPCs
All requests take a file identifier and an offset in the file
The protocol is stateless - there is no concept of open/closed files
Thus each request is idempotent - implemented by having a sequence number so the server knows whether or not
to process again (or if there are any requests missing)
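A rough sketch of what such a request might carry, based only on the fields mentioned above (file handle = filesystem identifier + inode number, an explicit offset, the caller's user/group IDs, and a sequence number); the struct names are made up and this is not the actual NFS wire format:

```c
#include <stdint.h>

/* Handle returned by mountd: enough for the server to locate the file
   without keeping any per-client open-file state. */
struct nfs_handle {
    uint32_t fs_id;      /* which exported filesystem                     */
    uint64_t inode_no;   /* inode number within that filesystem           */
};

/* Every request names the file and the absolute offset explicitly, so the
   server remembers nothing between requests and a retried request returns
   the same result - which is what makes it safe to replay (idempotent). */
struct nfs_read_request {
    struct nfs_handle handle;
    uint64_t offset;     /* absolute position, not "the next bytes"       */
    uint32_t count;      /* number of bytes requested                     */
    uint32_t uid, gid;   /* checked against the server's permissions      */
    uint32_t seq;        /* lets the server detect duplicates or gaps     */
};
```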