kowh informs me that Linux has
inotify for this purpose. BSD has
kqueue. Since the novel idea in my post already exists and has been implemented for years,
I've put the rest of the post below a cut.
Certain files change from time to time, and it could be useful
for programs to know when they have changed.
- Program A changes a file.
- Procedure B would like to know when the file is changed.
Currently, one would rewrite Program A to run Procedure B
as part of its write routine.
Ideally, one could write a standalone program that would be
signaled whenever the file is written, without having to make
any changes to Program A. The operating system's write routine
would be modified to include notification logic.
On writing to a file:
    If there is a list of listeners for that file:
        Poke every listener to wake it up.
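To make the pseudocode above concrete, here is a minimal in-process sketch. It is not how a kernel would do it; the `listeners` table and the `listen` and `notifying_write` names are hypothetical stand-ins for the OS's listener list and write routine.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical stand-in for the OS's per-file listener table.
Listener = Callable[[str], None]
listeners: DefaultDict[str, List[Listener]] = defaultdict(list)


def listen(path: str, callback: Listener) -> None:
    """Register a callback to be poked whenever `path` is written."""
    listeners[path].append(callback)


def notifying_write(path: str, data: str) -> None:
    """Append data to the file, then poke every listener registered for it."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(data)
    # .get() avoids creating an empty entry for files nobody listens to.
    for callback in listeners.get(path, []):
        callback(path)
```

Program A would call `notifying_write` without knowing who is listening, and Procedure B would register itself with `listen`.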
Permissions are easy; anything that can read a file can listen
to it.
The hard part is defining what is to be sent to the listeners.
This is easy for an append since the listener will only need the
new data. It's more difficult for a write. Should the listener
receive the new data in full or a diff? What about a case where the
file is written-in-place on the disk with a low-level API, and the
old state of the data is gone and irrecoverable?
Is the listener going to be a process that runs in the background
and simply waits, or could it be a program that the OS will start up
when the event occurs?
Should reads also be listenable? Should information about the
environment of the file-access, such as date, time, user, and
process ID, also be sent to the listener? Should the listener API
work on a queryable set of files rather than one file at a time,
which would also listen to any new file that matched the query
string?
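One way the query-based API could look, as a rough sketch: listeners subscribe to a glob pattern, and the pattern is matched when the event fires, so files created later are covered automatically. `FileEvent`, `listen_pattern`, and `dispatch` are names I made up for illustration, and the event carries only the process ID and a timestamp out of the environment details mentioned above.

```python
import fnmatch
import os
import time
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class FileEvent:
    """Hypothetical event record handed to each listener."""
    path: str
    kind: str  # "read" or "write"
    pid: int = field(default_factory=os.getpid)
    timestamp: float = field(default_factory=time.time)


subscriptions: List[Tuple[str, Callable[[FileEvent], None]]] = []


def listen_pattern(pattern: str, callback: Callable[[FileEvent], None]) -> None:
    """Subscribe to every file whose path matches a glob pattern."""
    subscriptions.append((pattern, callback))


def dispatch(path: str, kind: str) -> int:
    """Deliver an event to all matching subscriptions; return how many matched."""
    event = FileEvent(path=path, kind=kind)
    delivered = 0
    for pattern, callback in subscriptions:
        # Matching happens at dispatch time, so new files need no re-registration.
        if fnmatch.fnmatchcase(path, pattern):
            callback(event)
            delivered += 1
    return delivered
```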
Then there are concurrency considerations. The file could easily
be rewritten between the time the notices are fired and the time
the listeners finish taking action.
Listener blocking action could be "friendly," in which the listener
quits and starts over if any other process edits the file; "greedy,"
in which the listener blocks the file until its routine is done;
or "ignorant," in which multiple listener threads carry on their
work and piss over each other's output if their output methods are
not prepared for concurrency.
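The "friendly" strategy is the easiest one to sketch in user space. The version below assumes that a change in modification time or size is a good enough sign that someone else edited the file; `friendly_process` and its `max_attempts` parameter are hypothetical.

```python
import os
from typing import Callable, Tuple


def _signature(path: str) -> Tuple[int, int]:
    """Cheap fingerprint of the file's state: (mtime in ns, size in bytes)."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def friendly_process(path: str, work: Callable[[str], str], max_attempts: int = 5) -> str:
    """Run work(contents); throw the result away and retry if the file changed."""
    for _ in range(max_attempts):
        before = _signature(path)
        with open(path, encoding="utf-8") as f:
            result = work(f.read())
        if _signature(path) == before:
            return result  # nobody touched the file while we worked
    raise RuntimeError(f"{path} kept changing after {max_attempts} attempts")
```

A "greedy" listener would instead take a lock before reading, which only helps if every writer honors the same lock.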
There's also the problem of an infinite event loop. The dispatcher
will need to track which triggers have started which processes,
and stop signaling if an event is later thrown on a file that has
already triggered one of the processes in the chain.
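A toy version of that loop guard, assuming each handler passes along the chain of files that led to it. `on_write` and `signal_write` are invented names, and a real dispatcher would have to track the chain itself rather than trust handlers to forward it.

```python
from typing import Callable, Dict, FrozenSet, List

# A handler receives the path that fired and the chain of files seen so far.
Handler = Callable[[str, FrozenSet[str]], None]
handlers: Dict[str, List[Handler]] = {}


def on_write(path: str, handler: Handler) -> None:
    """Register a handler for writes to `path`."""
    handlers.setdefault(path, []).append(handler)


def signal_write(path: str, chain: FrozenSet[str] = frozenset()) -> bool:
    """Fire handlers for `path`; return False if the event was suppressed."""
    if path in chain:
        return False  # this file already fired earlier in the chain: stop here
    next_chain = chain | {path}
    for handler in handlers.get(path, []):
        handler(path, next_chain)
    return True
```

If a handler on file "a" writes file "b", and a handler on "b" writes "a" again, the second signal on "a" is dropped instead of looping forever.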
Alternatives:
Have Program B poll the file's timestamp for updates every few minutes
and reread the whole file if it has changed.
This is what most developers do, but it is wasteful.
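For reference, the polling approach looks roughly like this. `MtimePoller` is a hypothetical helper; it assumes the modification time reliably changes on every write, which is not guaranteed on filesystems with coarse timestamps.

```python
import os
from typing import Optional


class MtimePoller:
    """Remember a file's modification time and reread the file when it changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.last_mtime_ns = os.stat(path).st_mtime_ns

    def check(self) -> Optional[str]:
        """Return the whole file if it changed since the last check, else None."""
        current = os.stat(self.path).st_mtime_ns
        if current == self.last_mtime_ns:
            return None
        self.last_mtime_ns = current
        with open(self.path, encoding="utf-8") as f:
            return f.read()
```

Program B would call `check()` in a loop with a `time.sleep(300)` between calls, which is exactly the waste: it wakes up and stats the file even when nothing has happened.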
Do whatever tail -f does.
The fact that developers don't do this is proof that this method is
too difficult to use, or at least more obscure than it should be.
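One plausible reading of "whatever tail -f does" is to remember how far into the file you have read and fetch only what was appended since. The sketch below is my guess at that idea, not a description of any actual tail implementation; `Follower` is a made-up name.

```python
import os


class Follower:
    """Track a read offset and return only the bytes appended since last time."""

    def __init__(self, path: str) -> None:
        self.path = path
        # Start at the current end of the file, so only new data is reported.
        self.offset = os.path.getsize(path)

    def read_new(self) -> bytes:
        """Return bytes written past the remembered offset (empty if none)."""
        size = os.path.getsize(self.path)
        if size < self.offset:
            self.offset = 0  # file was truncated or replaced: start over
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        self.offset += len(data)
        return data
```

This handles the append case nicely, since the listener only ever sees new data, but it still has to be called on a timer unless the OS provides a notification.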