Speeding up Git using a built-in file system watcher #3251
-
Nicely said! Thank you for writing this up! Let me add that the file system watcher also greatly speeds up untracked file detection when the "untracked cache" extension is enabled. This is in addition to the |
-
Caveat for Unity users wanting to try it out right now: it breaks Unity Package Manager so you can't add packages from git repositories (via URL) to your project. |
-
Out of curiosity: The Windows kernel comes with a built-in file system watcher: https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-findfirstchangenotificationw Does the Scalar project use that? If not, wouldn't it have been beneficial to use the built-in Windows watcher? |
-
So an actual bug (maybe, please let me know!):
I'm not sure if this should be raised to an issue, please let me know. |
-
Sometimes, when I try to remove files from the repo within an editor (e.g. VS Code), this monitor blocks the system from removing those files from disk, which makes Git think I haven't removed them. I'm not sure if this is caused by the monitor alone, or by the combination of |
-
Seems like using |
-
Nice feature! |
-
Note: upstream has decided on |
-
In the port of 2.36.0-rc0 we added a patch to continue to support |
-
Thank you!! This is just amazing. |
-
Just so I understand this correctly: I just need to add the following to my .gitconfig, right?
|
-
Is this still where the link in the installer should point? |
-
tl;dr If your worktrees have many files, you can make `git status`, `git commit` and `git add` faster by using the (experimental!) config setting `core.useBuiltinFSMonitor = true`. For users’ convenience, there is a checkbox in the upcoming Git for Windows v2.32.0 installer to do that.
With this config setting, Git commands that want to refresh Git’s index will query the built-in file system watcher (“FSMonitor”) which files have been changed, and only look at those. When there are tens of thousands of files in the worktree, this makes a difference because Git would otherwise have to look at the `lastModified` times of all files (“full scan”) to figure out which ones were modified since Git looked last. If the FSMonitor is not yet running, it is automatically started, and subsequent Git commands will benefit from it.
Background
Over sixteen years ago, when Git was invented to replace the version control system used by the Linux kernel project, the fastest way to determine which worktree files were modified vs which ones were unchanged was to use Linux’ ultra-fast `lstat()` call, essentially to compare the `lastModified` times to the ones recorded in Git’s index (and also taking the `lastModified` time of the Git index itself into account).
This worked pretty well, even for the source code of the Linux kernel project (which consisted of a little over 17k files at the time).
Today, many projects in the software industry have worktrees containing many, many more files. Combined with the different performance characteristics of NTFS, refreshing the index in such worktrees can take minutes!
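To make the cost of a full scan concrete, here is a minimal Python sketch of the idea described above. It is not Git's actual implementation: the function names `snapshot`, `full_scan` and `demo` are made up for illustration, and the "index" is reduced to a dictionary of modification times, whereas Git's real index records more than that. The point is only that every refresh costs one `lstat()` per file, no matter how few files changed.

```python
import os
import tempfile


def snapshot(worktree):
    """Record the modification time of every file (a very simplified 'index')."""
    recorded = {}
    for root, dirs, files in os.walk(worktree):
        # Do not descend into the repository metadata directory.
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            path = os.path.join(root, name)
            recorded[os.path.relpath(path, worktree)] = os.lstat(path).st_mtime_ns
    return recorded


def full_scan(worktree, recorded):
    """Return (changed_paths, lstat_calls) by looking at every file."""
    current = snapshot(worktree)  # one lstat() per file in the worktree
    changed = sorted(
        path
        for path in current.keys() | recorded.keys()
        if current.get(path) != recorded.get(path)
    )
    return changed, len(current)


def demo():
    """Create three files, touch one, and run a full scan."""
    with tempfile.TemporaryDirectory() as worktree:
        for name in ("a.txt", "b.txt", "c.txt"):
            with open(os.path.join(worktree, name), "w") as handle:
                handle.write("hello\n")
        recorded = snapshot(worktree)
        # Simulate an edit by moving b.txt's modification time forward.
        newer = recorded["b.txt"] + 1_000_000_000
        os.utime(os.path.join(worktree, "b.txt"), ns=(newer, newer))
        return full_scan(worktree, recorded)
```

Calling `demo()` reports a single changed path, yet the scan still had to call `lstat()` on all three files. Scale that up to tens of thousands of files and the cost of refreshing the index becomes noticeable.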
Using file system watchers
The obvious way to accelerate refreshing Git’s index is to avoid looking at all files. Instead, run a watcher that asks the operating system to notify it whenever a file or directory within a certain directory tree is modified, added or deleted. This watcher can be queried, and Git can look at the `lastModified` times of just that set of paths.
This changes the time taken by a `git status` invocation from the order of the total file count to the order of the number of modified/added/deleted files.
Git has supported the use of file system watchers since v2.16.0, when it introduced the config setting `core.fsmonitor`. This setting points to a script or executable that talks to the file system watcher.
So why is this `core.fsmonitor` feature not in more widespread use?
There are multiple explanations, the most prominent one being: it takes effort to set up a file system monitor. Watchman, the most common file system monitor, does not regularly release Windows binaries and it is difficult to build from source for many platforms.
Another reason is that the speed still leaves much to be desired, even when using a file system watcher, because Git has to spawn a hook whenever it wants to refresh Git’s index. Spawning processes is expensive on Windows, in particular when the process in question is the Perl interpreter that then has to load a JSON library to manage the Watchman query.
As a consequence of these hurdles, adoption of this feature has been low, and therefore several bugs have been lurking in the code base for years, undetected.
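The complexity argument earlier in this section can be sketched with a toy model. Everything below is hypothetical: `ToyWatcher` is an in-memory stand-in for a file system watcher, the integer token is an assumption made for illustration, and none of this reflects the actual protocol spoken by Watchman or by Git's FSMonitor. It only shows that a refresh touches the paths reported since the last query, not the whole worktree.

```python
class ToyWatcher:
    """In-memory stand-in for a file system watcher (illustration only)."""

    def __init__(self):
        self._events = []  # paths, in the order the OS reported them

    def notify(self, path):
        """Called whenever the OS reports that `path` changed."""
        self._events.append(path)

    def token(self):
        """Opaque marker meaning 'everything up to now has been seen'."""
        return len(self._events)

    def changed_since(self, token):
        """Return (distinct paths reported after `token`, new token)."""
        new_token = len(self._events)
        paths = sorted(set(self._events[token:new_token]))
        return paths, new_token


def refresh(watcher, token, check_path):
    """Check only the paths reported since `token`; return the new token."""
    paths, new_token = watcher.changed_since(token)
    for path in paths:
        check_path(path)  # e.g. compare modification times for this path only
    return new_token


def demo():
    watcher = ToyWatcher()
    token = watcher.token()
    # The OS reports three events touching two distinct paths.
    for path in ("src/main.c", "README.md", "src/main.c"):
        watcher.notify(path)
    checked = []
    token = refresh(watcher, token, checked.append)
    first = list(checked)
    # Nothing changed since the last refresh, so nothing needs checking.
    checked.clear()
    refresh(watcher, token, checked.append)
    return first, list(checked)
```

In `demo()`, the first refresh checks two paths and the second checks none, regardless of how many files the worktree contains. The work is proportional to the number of changed paths.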
A built-in file system watcher
In 2019, we set out to change the game. We started to implement a file system watcher that is integrated into Git itself (“built-in”). This not only solves the build issues, but also lowers the bar to enable the feature by quite some margin. It is now as easy as setting the config option `core.useBuiltinFSMonitor` to `true`.
Another, crucial improvement is that Git can talk to the built-in file system watcher directly, without having to spawn a new process. This speeds up the common case noticeably.
Finally, it allows the file system monitor to be Git-aware. Things like excluding `.git` from being watched automatically are much, much easier that way.
The built-in file system watcher is a direct outcome of the Scalar project, a project whose mission is to bring as many of the improvements developed and refined in VFS for Git as possible to Git proper. Some things can obviously not be moved: the virtual file system aspect was never accepted in the Git project, and it also turned out to be impractical e.g. with recent macOS versions. Using a file system watcher to accelerate Git index refreshes, however, provided too nice of a benefit to pass up on, hence the idea to implement a Git native one.
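For anyone who wants to flip the setting from a script rather than the installer checkbox, here is a small Python sketch. The setting name `core.useBuiltinFSMonitor` is the one given in this post; it is experimental, and other Git versions may spell it differently. The helper names `enable_builtin_fsmonitor`, `read_setting` and `demo` are made up for this example, and the sketch assumes a `git` executable is on the `PATH`. It only writes and reads back the config value, it does not verify that the running Git actually honors it.

```python
import shutil
import subprocess
import tempfile


def enable_builtin_fsmonitor(repo_dir):
    """Set core.useBuiltinFSMonitor=true in the repository's local config."""
    subprocess.run(
        ["git", "-C", repo_dir, "config", "core.useBuiltinFSMonitor", "true"],
        check=True,
    )


def read_setting(repo_dir):
    """Read the value back from the repository's config."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "config", "--get", "core.useBuiltinFSMonitor"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def demo():
    """Try the helpers on a throwaway repository; return None without git."""
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as repo_dir:
        subprocess.run(["git", "init", "--quiet", repo_dir], check=True)
        enable_builtin_fsmonitor(repo_dir)
        return read_setting(repo_dir)
```

The setting is written to the repository's local config here so that it can be tried out on a single large worktree first.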
Acknowledgements
This feature is the culmination of many years of effort, by many developers. First, Ben Peart introduced the `core.fsmonitor` setting to allow for running a hook that talks to Watchman (developed at Facebook), and almost as importantly also worked with the Watchman development team to make it run better on Windows. Alex Vandiver contributed to the effort by making the feature more robust. Johannes Schindelin started implementing the built-in file system watcher, Kevin Willford continued the work on it, and then Jeff Hostetler took over, reworking and redesigning the parts that needed it.