WHY DO WE NEED FILE LOCKING?
If several processes wish to READ the same data file at the same time, and no process is writing to it at the same time, it's easy - each reader can open and read the file independently of the others and with no special programming to take any account of the concurrent access.
However, if a process wishes to WRITE to a data file that's being concurrently read by another process, there's a risk that the act of writing to the file while another process is reading it will cause that reader to get corrupted / incomplete / mixed data. If two or more processes wish to write to a data file concurrently, there's a risk that the data in the file itself will get corrupted.
A FILE LOCKING scheme can be used to overcome these problems. Reader processes indicate that they are accessing the file (for reading) by requesting and obtaining a non-exclusive (shared) lock before they read. As many processes as need to may request and obtain non-exclusive locks at the same time. Writer processes need to request and obtain an exclusive lock before they write. Non-exclusive locks will only be granted if there is NOT an exclusive lock issued at the time, and exclusive locks will only be granted if there is not any lock (exclusive or non-exclusive) issued at the time. It is important for processes to release locks once they have completed the activity for which the lock was requested.
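The granting rules above can be seen in a small experiment. This is only a sketch, assuming a Unix-like system where Python's fcntl.flock is available; the scratch file and the helper name try_lock are made up for illustration. Each separate open() of the file behaves like an independent process as far as flock is concerned, so one script can play both readers and the writer.

```python
import fcntl
import os
import tempfile


def try_lock(f, mode):
    """Ask for a lock without waiting; True if granted, False if refused."""
    try:
        fcntl.flock(f, mode | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


# Hypothetical scratch file standing in for the shared data file.
fd, path = tempfile.mkstemp()
os.close(fd)

# Three independent handles: two readers and one writer.
reader_a = open(path, "r")
reader_b = open(path, "r")
writer = open(path, "a")

results = {
    # Any number of non-exclusive (shared) locks may coexist.
    "reader_a_shared": try_lock(reader_a, fcntl.LOCK_SH),
    "reader_b_shared": try_lock(reader_b, fcntl.LOCK_SH),
    # An exclusive lock is refused while any other lock is held.
    "writer_exclusive_while_read": try_lock(writer, fcntl.LOCK_EX),
}

# Once both readers release, the writer can obtain its exclusive lock...
fcntl.flock(reader_a, fcntl.LOCK_UN)
fcntl.flock(reader_b, fcntl.LOCK_UN)
results["writer_exclusive_after_release"] = try_lock(writer, fcntl.LOCK_EX)
# ...and a reader is now refused until the writer has finished.
results["reader_shared_while_written"] = try_lock(reader_a, fcntl.LOCK_SH)

fcntl.flock(writer, fcntl.LOCK_UN)
for f in (reader_a, reader_b, writer):
    f.close()
os.remove(path)
```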
ISSUES TO BE CONSIDERED
a) Deadlocks. If one process asks for a lock on file "a" and then for a lock on file "b", and another asks for the same two files to be locked in reverse order, then it's possible for a situation to arise in which both processes are blocked, each waiting for the other to release the lock it holds.
b) Blocking or non-blocking. If a process is refused a lock, what should it do? Wait until a lock is available (a blocking lock, as the process blocks), or simply raise a flag within the calling process and carry on but without the data access at that point (a non-blocking lock).
c) Ensuring that locks are released. This is important, as the locks form a bottleneck and if a release is overlooked or fails, you'll be left with a jammed system. Take especial care if there's any danger of a process failing between obtaining and releasing a lock.
d) Most schemes are co-operative locking schemes. In other words all the processes involved must co-operate for them to work. There's not usually anything to stop a rogue process reading or writing a file without obtaining a lock.
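Points (b) and (c) can be sketched together. The helper below is a hypothetical wrapper (the name locked and its parameters are not from any library), assuming a Unix-like system with Python's fcntl module. It offers both a blocking and a non-blocking request, and uses try / finally so that the lock is released even if the code holding it fails part way through.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked(path, exclusive=True, blocking=True):
    """Open and lock path; always unlock and close afterwards.

    Yields the open file if the lock was granted, or None if a
    non-blocking request was refused.
    """
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    if not blocking:
        mode |= fcntl.LOCK_NB
    f = open(path, "r+")
    granted = False
    try:
        try:
            fcntl.flock(f, mode)      # waits here if this is a blocking request
            granted = True
        except BlockingIOError:
            pass                      # non-blocking request was refused
        yield f if granted else None
    finally:
        if granted:
            f.flush()
            fcntl.flock(f, fcntl.LOCK_UN)   # runs even if the caller raised
        f.close()


# Hypothetical scratch file for the demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

outcome = []
with locked(path) as first:
    outcome.append(first is not None)
    # A second, non-blocking request is refused while the first is held.
    with locked(path, blocking=False) as second:
        outcome.append(second is not None)

# Simulate a failure between obtaining and releasing the lock.
try:
    with locked(path):
        raise RuntimeError("simulated failure while holding the lock")
except RuntimeError:
    pass

# The lock was still released, so a new request succeeds.
with locked(path, blocking=False) as third:
    outcome.append(third is not None)
os.remove(path)
```

Note that this remains a co-operative scheme as described in (d): nothing here stops another program opening the file without calling locked at all.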
ALTERNATIVES
a) If the data is structured such that it can logically be held as a series of data files, each of which is written by only a single process at a time, locking becomes easier. For example, in a timecard system using plain files, it's easier for the system to work with a directory of files (one per individual) than with a single file containing everyone's records.
b) Using a database engine. All requests are made through the same daemon (or closely linked copies) which provides a locking facility that's usually easier to use. A database engine will also guard against low level corruption when parallel inserts are done (but it will only guard against high level corruption if you take care when designing accesses).
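As an illustration of alternative (a), here is a minimal sketch of the timecard layout with one plain file per individual. The function names, the ".txt" suffix and the sample entries are all invented for the example, and it assumes that each person's file is only ever written by one process at a time, which is the condition that makes the approach work.

```python
import os
import tempfile


def record_path(directory, person):
    """One file per individual, named after the person."""
    return os.path.join(directory, person + ".txt")


def add_entry(directory, person, entry):
    """Append one timecard line to that person's own file."""
    with open(record_path(directory, person), "a") as f:
        f.write(entry + "\n")


def read_entries(directory, person):
    """Return all timecard lines recorded for one person."""
    with open(record_path(directory, person), "r") as f:
        return f.read().splitlines()


# Hypothetical directory and names for the demonstration.
timecards = tempfile.mkdtemp()
add_entry(timecards, "alice", "09:00 in")
add_entry(timecards, "bob", "09:15 in")
add_entry(timecards, "alice", "17:30 out")

alice = read_entries(timecards, "alice")
files = sorted(os.listdir(timecards))
```

Writes for different people never touch the same file, so they cannot interfere with each other.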
FILE LOCKING IN PERL, C AND OTHER LANGUAGES USING FLOCK
The flock function available to C programs on Unix-like systems (which is exposed to Perl through the built in flock function in that language) allows the programmer to call for an exclusive or non-exclusive lock, in either blocking or non-blocking form, or to release such a lock.
If you're opening a file for read:
1. Open the file THEN
2. Request the non-exclusive lock
If you're opening a file for write (append):
1. Open the file THEN
2. Request the exclusive lock THEN
3. When the lock is obtained, use seek / fseek to reposition to the end
and when you come to release the lock:
1. Ensure all data is written (by flushing the file or doing a seek; closing the file also works, but normally releases the lock as well) THEN
2. Release the lock
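The sequences above might look like this in practice. This is a sketch in Python rather than Perl or C, assuming a Unix-like system; Python's fcntl.flock wraps the same underlying call. The function names and the scratch file are made up for illustration.

```python
import fcntl
import os
import tempfile


def append_record(path, text):
    """Append text to path while holding an exclusive lock."""
    with open(path, "a") as f:              # 1. open the file
        fcntl.flock(f, fcntl.LOCK_EX)       # 2. request the exclusive lock (blocks)
        try:
            f.seek(0, os.SEEK_END)          # 3. reposition to the end
            f.write(text)
            f.flush()                       # ensure all data is written ...
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)   # ... then release the lock


def read_all(path):
    """Read the whole file while holding a non-exclusive lock."""
    with open(path, "r") as f:              # 1. open the file
        fcntl.flock(f, fcntl.LOCK_SH)       # 2. request the non-exclusive lock
        try:
            return f.read()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


# Hypothetical scratch file for the demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)
append_record(path, "first\n")
append_record(path, "second\n")
contents = read_all(path)
os.remove(path)
```

The seek after the lock is granted matters because another writer may have extended the file between the open and the lock being obtained.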
FILE LOCKING IN TCL
There's a flock function (and a funlock function too) built into the TclX extension, which is now supplied as standard in many distributions of Tcl. It appears to be a slightly more sophisticated system which allows you to lock parts of a file as well as the file as a whole.
FILE LOCKING ACROSS LANGUAGES / APPLICATIONS
If you wish to apply file locking across a number of languages / technologies, you may need to roll your own; even if the technologies concerned each have their own locking scheme, there's a chance that the schemes differ such that an exclusive lock under one won't deny access under another.
I'm going to describe a simple setup here using a lock file, with all locks granted being exclusive. Unless you have multiple readers frequently active and severe resource issues, this slightly less powerful but simpler scheme will be all you need.
When any (reader or writer) process needs to access the file:
a) Open a second file - a lock file of a specific name - for read/write.
b) If the lock file already contains your process ID, you hold the lock; proceed.
c) If the lock file already contains a different process ID, the lock is denied.
d) If the lock file is new / empty, write and flush your process ID to it, then go back to step (a).
Although manipulating a lock file will be a lot quicker than working with a large data file, there are still possible conflicts - thus the need to check that you really have obtained a lock at the appropriate stage.
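Steps (a) to (d) could be sketched as follows. This is an illustration rather than a finished implementation: the function names, the pid parameter (there only so that a test can pretend to be another process) and the release step, which the description above doesn't spell out, are all assumptions.

```python
import os
import tempfile


def acquire(lock_path, pid=None):
    """Return True if the lock file names this process, False if denied."""
    pid = str(os.getpid() if pid is None else pid)
    while True:
        # (a) open the lock file for read/write, creating it if need be
        fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
        with os.fdopen(fd, "r+") as f:
            holder = f.read().strip()
            if holder == pid:
                return True       # (b) it contains our process ID
            if holder:
                return False      # (c) another process holds the lock
            f.write(pid)          # (d) new / empty: write and flush our ID,
            f.flush()             #     then loop back to (a) to check it


def release(lock_path, pid=None):
    """Assumed release step: remove the lock file if we are the holder."""
    pid = str(os.getpid() if pid is None else pid)
    try:
        with open(lock_path, "r") as f:
            if f.read().strip() != pid:
                return False
    except FileNotFoundError:
        return False
    os.remove(lock_path)
    return True


# Hypothetical lock file name; "other" stands in for a second process.
lock = os.path.join(tempfile.mkdtemp(), "data.lock")
me, other = os.getpid(), os.getpid() + 1

got_first = acquire(lock)                    # new file: claimed, then verified
got_again = acquire(lock)                    # already ours
other_denied = acquire(lock, pid=other)      # different ID present: denied
other_release = release(lock, pid=other)     # a non-holder cannot release
released = release(lock)
other_after = acquire(lock, pid=other)       # free again, so the other gets it
```

The re-check in step (a) narrows the window for conflicts but doesn't remove it entirely: two processes that both see an empty file at nearly the same moment can still each end up believing they hold the lock. Where the platform offers it, creating the lock file with an atomic exclusive-create flag (os.O_EXCL alongside os.O_CREAT in Python) is a common way to close that gap.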
See also
Advanced file handling in Perl