Speeding Up File Modification Time Lookup on Windows
Some of this information has been revised. Be sure to read updated finding in Efficientlly Walking File Directories in Rust (Part 2).
File modification time, or mtime is a common value used to efficiently tell
if a file has changed.
Build systems will look at the mtimes of the inputs and outputs of a build
target. If any of the input’s mtimes are newer than the output(s) mtime
the build system knows to invoke the build step for the target.
Another use case is version control systems (VCS). In order to efficiently
determine which files may be out of date the version control system will look
at the mtime of the on disk files, any that are newer than it had
previously stored are likely changed compared to what’s stored in the version
control system.
In particular I’m focusing on an attempt to implement a more efficient git
status for Windows, win-git-status.
Baseline Performance
The purpose of win-git-status is to be a more efficient version of
git status on Windows. In particular when submodules are involved. To that
end it should match the performance of git status for repos without submodules.
For a large repo without submodules I chose to use llvm-project at commit 0f9f0a40. The repo and specific are more or less arbitrary, but the more than 90k files provides a nice stress test.
On my machine with 6 cores (12 threads) and an SSD, I’ve taken timings using
git status as well as libgit2.
| Command | Time (seconds) |
|---|---|
git status |
0.340 |
lg2 status |
1.228 |
The git version was 2.29.2.windows.2. lg2 was the example build from
version 1.1.0 of libgit2.
libgit2 performs ok, but I don’t think it does threadying on it’s own.
While the git status from git for Windows does use multi-threading.
Performance of win-git-status
Using std::fs::modified() resulted in 1.6 seconds to get the
mtime from the 90k files in commit 0f9f0a40. Though this isn’t too
far from the libgit2 run it was more than 5 times worse than normal git
status.
I decided that it was worth investigating how to speed this up. Looking at the profiling of running win-git-status every file was being opened. See Profiling Rust On Windows. It seemed unnecessary to be opening every file just for the file modification time.
I’ll admit I don’t know the underlying meaning of “opening” on Windows. So it may just be a generic way to access a resource and might not do extra work.
In an attempt to speed up the mtime retrieval. I attempted to write a C
wrapper that directly called _stat and friends. This resulted in more or
less the same performance. I next tried to call GetFileAttributesExA
directly in rust. This too resulted in about the same performance of 1.6
seconds.
I decided it was time to investigate what git, and in particular what git for Windows might be doing. In doing some googling I ran across fscache.
fscache
I’m pretty sure fscache stands for “File Status Cache”.
When I initially ran across fscache. I thought it was an on disk cache of
the file mtimes. After looking into it more, it’s actually an in memory
cache that is populated when an associated git operation is ran.
For example git status will look up the mtime of file foo.c. It will
request this value from the cache. The cache will not have this file yet, so
it will then get the mtime from the file system directly, storing it in the
cache as well as returning it to the caller.
fscache is enabled by default on newer versions of git for Windows. One can
see for themselves the performance. Simply time git status
normally. Then call git config core.fscache false and retime the git
status command.
Below is doing timings with and without fscache enabled on 0f9f0a40.
| fscache state | Time (seconds) |
|---|---|
core.fscache true |
0.340 |
core.fscache false |
2.231 |
Be sure to turn the cache back on if you want to maintain performance
git config core.fscache true.
If one thinks about it some, there should be little to no reason that
git status needs to get the mtime more than once per file. In fact my
timings of win-git-status were the result of getting the mtimes once.
I decided to look further into how fscache worked.
NtQueryDirectoryFile
fscache utilizes NtQueryDirectoryFile to get the file mtimes. fscache
“open”s a directory to cache and then it makes successive calls to
NtQueryDirectoryFile to populate all the mtimes of the files in that
directory.
The ntapi crate happens to provide this function. After some wrangling, I
was able to implement getting file mtimes utilizing NtQueryDirectoryFile.
With the use of NtQueryDirectoryFile, win-git-status can now get the
mtimes from commit 0f9f0a40 in 0.772 seconds. This is still a bit more
than 2x slower than git status, but I think this path has the potential to
close the gap.
My naive implementation using NtQueryDirectoryFile retrieved one directory entry at a time. Looking more closely at the interface, it’s possible to pass a buffer and NtQueryDirectoryFile will provide an offset to subsequent entries in the buffer. This means right now the prototype implementation is calling back 90k times to get each file entry in the directory. While using a buffer it’s possible to minimize these calls and possible user to kernel context switches.
If one looks closely at the implementation of fscache, it is passing a
buffer and walking entries in the buffer when possible. I’m hopeful changing
win-git-status to utilize that approach of multiple entries per call to
NtQueryDirectoryFile will bring the time closer to the 0.340 seconds that
git status achieves.
Side Note
I had mentioned in
Profiling Rust On Windows
how it was taking ~6 seconds to get the mtimes from commit 0f9f0a40. This
is not the 1.6 seconds I mention above.
I had something similar to the following:
let walk_dir = WalkDirGeneric::<((usize),(bool))>::new("foo")
.process_read_dir(|depth, path, read_dir_state, children| {
// per directory processing
});
for entry in walk_dir {
// look up mtime
}For those unfamiliar with jwalk it may not be apparent what’s happening, so we’ll update some comments
let walk_dir = WalkDirGeneric::<((usize),(bool))>::new("foo")
.process_read_dir(|depth, path, read_dir_state, children| {
// per directory processing
// happens on a separate thread per directory.
});
for entry in walk_dir {
// look up mtime
// happens on the main thread
}Basically the code I had written lost the benefit of the multithreaded directory traversal provided by jwalk and was doing a lot of work on the main thread.
Summary
If performance is a concern when getting mtimes on windows, consider using
NtQueryDirectoryFile. It takes a little more work to implement, but the
performance is 5x or more (based on git status).
At least for git status fscache isn’t so much about caching as it is
about how it gets the mtimes of files.
I’m not fond that my first decent attempt at rust has me needing to delve into unsafe code, but I think the performance in this case justifies it.