Wednesday, June 4, 2008

Cross-Platform Monitoring of Filesystem Events

A recent problem with deployment of the Ringlight client has been that users with a large number of folders have been experiencing annoying amounts of CPU usage. This is because the most fundamental functionality that Ringlight provides is periodically rescanning your filesystem to automatically find changes to the filesystem. Rescan too infrequently and changes won't appear on the website when expected, confusing users. Rescan too frequently and users will complain about too much CPU usage. There are many applications that require rescanning the filesystem, from virus scanners to automatic backup programs, and they all have to deal with this problem.

An attractive alternative to rescanning the filesystem is to use filesystem monitoring events. Rather than periodically scanning to see if anything has changed, instead let the operating system notify you when something has change. Very efficient! Unfortunately, unlike a simple recursive traversal of directories, this approach has to be implemented separately on each major platform and each OS has its own pitfalls and gotchas. I will focus primarily on building this in python, although the same underlying mechanisms can be used in any language with appropriate bindings.

On Windows, filesystem events are available using the Python for Windows Extensions. It is not a particularly simple API to use, being a direct binding to the Windows system calls.

On OS X, PyKQueue offers a binding to kqueue, which is also available on BSD.

On Linux, there are two kernel interfaces, dnotify and inotify, depending on whether the kernel version is lesser or greater than 2.6.13. You can call inotify directly with pyinotify. You can also use a more generic library such as Gamin, which will use either inotify or dnotify, whichever is available. Of course, really old versions of Linux don't even have dnotify, and you'll have to fall back to periodic rescanning.

Every platform's filesystem monitoring API is different and each has different issues, however they generally share a common set of issues as well:

  • Network drives don't generate events - you'll have to use periodic scanning for these
  • Every folder to be watched must be registered separately - you can't request notifications for an entire directory tree. You can to register all the directories separately and when a new directory is create you have to remember to register it as well. You sometimes need to keep a file descriptor around for each directory you're watching, so watch out for running out of file descriptors.
  • No special handling is done for shortcuts, aliases, or symlinks - if you're monitoring, say, a directory, and that directory has a shortcut (or the equivalent for that OS), you need to monitor two objects: the shortcut itself (in case its target is changed), and the targeted file or directory.
  • Sometimes deleting a directory won't send deletion events for files in that directory or subdirectories. You have to maintain a copy of this information yourself and perform a virtual recursive delete on your database when the parent directory deletion event is received.
The Ringlight client, being a cross-platform application the primary function of which is to monitor filesystem changes (and report them to the server, where the real fun happens), naturally takes all this into account. I am planning on release the filesystem monitoring core of the Ringlight client as open source, as there are no good cross-platform filesystem monitoring libraries available and it's really a shame that people have to reimplement all of this for their applications.

By the way, the users seem quite happy with the new version of the client that users filesystem monitoring events. No complaints about excessive CPU usage anymore!