Monday, August 17, 2009

Trailing a Growing File in Perl

http://oreilly.com/catalog/cookbook/chapter/ch08.html
Trailing a Growing File

Problem
You want to read from a continually growing file, but the read fails when you reach the (current) end of file.
Solution
Read until the end of file. Sleep, clear the EOF flag, and read some more. Repeat until interrupted. To clear the EOF flag, either use seek: for (;;) {
while () { .... }
sleep $SOMETIME;
seek(FH, 0, 1);
}
or the IO::Handle module's clearerr method: use IO::Seekable;

for (;;) {
while () { .... }
sleep $SOMETIME;
FH->clearerr();
}
Discussion
When you read to the end of a file, an internal flag is set that prevents further reading. The most direct way to clear this flag is the clearerr method, if supported: it's in the IO::Handle and FileHandle modules. $naptime = 1;

use IO::Handle;
open (LOGFILE, "/tmp/logfile") or die "can't open /tmp/logfile: $!";
for (;;) {
while () { print } # or appropriate processing
sleep $naptime;
LOGFILE->clearerr(); # clear stdio error flag
}
If that simple approach doesn't work on your system, you may need to use seek . The seek code given above tries to move zero bytes from the current position, which nearly always works. It doesn't change the current position, but it should clear the end-of-file condition on the handle so that the next picks up new data.
If that still doesn't work (e.g., it relies on features of your C library's (so-called) standard I/O implementation), then you may need to use the following seek code, which remembers the old file position explicitly and returns there directly. for (;;) {
for ($curpos = tell(LOGFILE); ; $curpos = tell(LOGFILE)) {
# process $_ here
}
sleep $naptime;
seek(LOGFILE, $curpos, 0); # seek to where we had been
}
On some kinds of filesystems, the file could be removed while you are reading it. If so, there's probably little reason to continue checking whether it grows. To make the program exit in that case, stat the handle and make sure its link count (the third field in the return list) hasn't gone to 0: exit if (stat(LOGFILE))[3] == 0
If you're using the File::stat module, you could write that more readably as: use File::stat;
exit if stat(*LOGFILE)->nlink == 0;

See Also
The seek function in perlfunc (1) and in Chapter 3 of Programming Perl ; your system's tail (1) and stdio (3) manpages