GVFS/GVFS.Platform.Windows/Readme.md
The purpose of this document is to give a high-level overview of how virtualization on Windows works. ProjFS is the file-system-level driver that intercepts file system calls and then calls out to a user mode process, in this case GVFS.Mount.exe, to get virtualized information or to notify it of file system events. ProjFS exposes two interfaces. IVirtualizationInstance has all the notification callbacks and the methods that can be called to manipulate the state of the virtual file system. IRequiredCallbacks has the methods that are required for virtualization to work; this interface is passed to the StartVirtualizing method.
IRequiredCallbacks interface

The methods of this interface are required for virtualization. They enumerate directories and get placeholder and file data so that ProjFS can project the files or provide the file content.
IVirtualizationInstance interface

The ProjFS managed library provides the implementation of this interface with VirtualizationInstance, which is created in the constructor of the WindowsFileSystemVirtualizer. In addition to the root of the working directory, which will be the virtualization root, it allows the caller to control thread counts, the negative path cache, and notification mappings.
The negative path cache is a ProjFS feature that caches paths that VFSForGit has returned as not found. This gives a significant performance benefit because ProjFS no longer needs to call the user mode process (GVFS.Mount.exe) to find out that a path doesn't exist. The VirtualizationInstance also has a method called ClearNegativePathCache, which VFSForGit needs to call when it changes the projection so that paths that did not exist at the previous commit will now show up.
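The behavior described above can be modeled in a few lines of Python. This is a toy sketch, not the ProjFS implementation: the class `NegativePathCache`, its `lookup` and `clear` methods, and the `provider_lookup` callable are hypothetical names, and the case-insensitive key is an assumption.

```python
class NegativePathCache:
    """Toy model: remembers paths the provider reported as not found."""

    def __init__(self, provider_lookup):
        # provider_lookup(path) returns file info, or None when not found.
        self._provider_lookup = provider_lookup
        self._not_found = set()
        self.provider_calls = 0  # counts round trips to the provider

    def lookup(self, path):
        key = path.lower()  # assumption: paths compare case-insensitively
        if key in self._not_found:
            return None  # answered from the cache, no provider call
        self.provider_calls += 1
        info = self._provider_lookup(path)
        if info is None:
            self._not_found.add(key)
        return info

    def clear(self):
        """Forget all cached misses; returns how many entries were dropped."""
        dropped = len(self._not_found)
        self._not_found.clear()
        return dropped


projection = {"src/file1.txt": "blob-sha-1"}
cache = NegativePathCache(projection.get)

cache.lookup("src/new.txt")  # miss: the provider is asked once
cache.lookup("src/new.txt")  # miss: served from the negative cache

# The projection changes and now contains the path.
projection["src/new.txt"] = "blob-sha-2"
stale = cache.lookup("src/new.txt")  # still a cached miss
cache.clear()
fresh = cache.lookup("src/new.txt")  # the provider is asked again
```

The stale lookup is why the cache has to be cleared whenever the projection changes: without it, a path that now exists would keep being reported as not found.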
Notification mappings set which notification callbacks will be called for a given path. Any path in the virtualization root can have its own notifications set up using bitwise OR-ed values of NotificationType from ProjFS. VFSForGit keeps the combined values for specific files and folders in the Notifications class.
The WindowsFileSystemVirtualizer turns off notifications for the .git directory except for some specific files and folders, such as the index file or the folder refs/heads/. This helps the performance of git because most file system access to the .git directory will be close to NTFS speed.
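As an illustration of OR-ed notification masks per path, here is a small Python model. The flag names and values are an illustrative subset, not the real ProjFS constants, the `MAPPINGS` table is hypothetical, and the lookup assumes that the most specific mapping for a path wins.

```python
from enum import IntFlag


class NotificationType(IntFlag):
    # Illustrative subset with made-up values; not the real ProjFS constants.
    NONE = 0
    NEW_FILE_CREATED = 1
    PRE_DELETE = 2
    FILE_RENAMED = 4
    FILE_HANDLE_CLOSED_MODIFIED = 8


# Hypothetical mappings: path relative to the virtualization root -> mask.
MAPPINGS = {
    "": NotificationType.NEW_FILE_CREATED
    | NotificationType.PRE_DELETE
    | NotificationType.FILE_RENAMED,
    ".git": NotificationType.NONE,
    ".git/index": NotificationType.FILE_HANDLE_CLOSED_MODIFIED,
    ".git/refs/heads": NotificationType.NEW_FILE_CREATED
    | NotificationType.FILE_RENAMED,
}


def notifications_for(path):
    """Return the mask of the longest mapping root that contains path."""
    parts = path.split("/")
    # Try the full path first, then each shorter prefix, ending with the root "".
    for end in range(len(parts), -1, -1):
        root = "/".join(parts[:end])
        if root in MAPPINGS:
            return MAPPINGS[root]
    return NotificationType.NONE
```

With this table, `notifications_for(".git/objects/ab/cd")` yields no notifications, while `.git/index` and anything under `.git/refs/heads` keep theirs.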
The notification callbacks let the user mode process know about file system actions that are about to take place or have taken place. VFSForGit uses them to keep the modified paths (the files that git should be keeping up to date) and the placeholders correct based on what has happened on the file system.
There are other methods on the VirtualizationInstance that VFSForGit can use to interact with the files and folders in the virtualization root. A couple of examples are listed below.
MarkDirectoryAsPlaceholder

Used to change a directory to a placeholder so that ProjFS will start asking VFSForGit what the contents of that folder should be and merging that with what is on disk. This is called in a couple of places when a new folder is created.
DeleteFile and UpdateFileIfNeeded

Used by VFSForGit when the projection changes, either to update the placeholders that ProjFS has on disk with new file data and SHA1 or to delete them, so that the disk matches the new projection and files have the correct content when read. Since what is on disk takes priority, these methods can fail if called after a file has been marked dirty, converted to a full file, or turned into a tombstone.
Files and folders in the projection can be in various states to keep the virtual state of the file system.
Virtual

Files and folders in this state have nothing on disk. They show up when a directory is enumerated and ProjFS gets the list of files and folders from VFSForGit to satisfy the enumeration request.

Placeholder

A file or folder that is on disk with a specific reparse point, which means it doesn't have all of its data. For files, that means the attributes of the file are on disk but the content is not. A SHA1 is stored as the contentId in the placeholder so that ProjFS can pass it back to VFSForGit to get the content. For folders, it means that ProjFS will ask VFSForGit what the contents of the folder should be and merge that with what is on disk to give the view of the file system for that folder.

Hydrated Placeholder

A file that has been read, so its contents have been retrieved from VFSForGit and are now on disk. Any future reads pass through to the file system for native file system performance.

Dirty Placeholder

A placeholder, hydrated or not, whose attributes have been changed so that they differ from what the provider (VFSForGit) gave. This comes into play when trying to update or delete placeholders. When the placeholder is dirty and the code didn't pass AllowDirtyMetadata, the update or delete will fail with an UpdateFailureReason of DirtyMetadata.

Full File

A file that has been written to or opened for write. The file will no longer be updated by VFSForGit and is a regular NTFS file. The path will be added to the modified paths of VFSForGit, and git will be the process updating or deleting the file.

Tombstone

Created when an item is deleted, to track what has been deleted so that it won't get projected again; without it, ProjFS would find no item on disk and use the item from the projection. Tombstones need to be deleted when the projection changes so that the correct files and folders will be projected.
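The update and delete rules described for these states can be summarized in a small Python model. It is only a sketch: `State`, `try_update`, and the `"FullFile"` and `"Tombstone"` reason strings are hypothetical; only the `DirtyMetadata` reason and the idea of allowing dirty metadata come from the text above.

```python
from enum import Enum


class State(Enum):
    PLACEHOLDER = "placeholder"
    HYDRATED = "hydrated placeholder"
    FULL_FILE = "full file"
    TOMBSTONE = "tombstone"


def try_update(state, metadata_dirty, allow_dirty_metadata=False):
    """Return (succeeded, failure_reason) for an update or delete attempt."""
    if state is State.FULL_FILE:
        return False, "FullFile"  # hypothetical reason name
    if state is State.TOMBSTONE:
        return False, "Tombstone"  # hypothetical reason name
    if metadata_dirty and not allow_dirty_metadata:
        return False, "DirtyMetadata"  # reason named in the text above
    return True, None
```

For example, `try_update(State.HYDRATED, metadata_dirty=True)` fails with `DirtyMetadata`, while the same call with `allow_dirty_metadata=True` succeeds.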
    +-------------------------+
    |                         |
    |         Virtual         |
    |                         |
    +----+--------------------+
         |          |
         |        lstat
         |          |
         |          v
         |   +------+------+
         |   |             |
         |   | Placeholder +-------+
         |   |             |       |
         |   +-+-----+-----+       |
         |     |     |             |
         |     |   open          open
         |     |     |             |
         |     |    for           for
         |     |     |             |
         |     |   read          write
         |     |     |             |
         |     |     v             |
         |     | +-------------+   |
         |     | |             |   |
         |     | |  Hydrated   |   |
         |     | | Placeholder |   |
         |     | |             |   |
         |     | +-+------+----+   |
         +<----+   |      |        |
         |         |    open       |
         |         |      |        |
         |         |     for       |
         |         |      |        |
         |         |    write      |
         |         |      |        |
         |         |      v        v
         |         |   +---------------+
         |         |   |               |
         |         |   |   Full File   |
         |         |   |               |
         |         |   +----+----------+
         |         |        |
         |         v        |
         +----------+-------+
                    |
                    |
                 deleted
                    |
                    v
                +---+---------------------+
                |                         |
                |        Tombstone        |
                |                         |
                +-------------------------+
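The transitions in the diagram can also be written as a lookup table. The Python sketch below uses the state and event labels from the diagram; the names `TRANSITIONS`, `next_state`, and `replay` are hypothetical, and treating any event not drawn in the diagram as leaving the state unchanged is an assumption.

```python
# (current state, event) -> next state, as drawn in the diagram.
TRANSITIONS = {
    ("virtual", "lstat"): "placeholder",
    ("placeholder", "open for read"): "hydrated placeholder",
    ("placeholder", "open for write"): "full file",
    ("hydrated placeholder", "open for write"): "full file",
    ("virtual", "deleted"): "tombstone",
    ("placeholder", "deleted"): "tombstone",
    ("hydrated placeholder", "deleted"): "tombstone",
    ("full file", "deleted"): "tombstone",
}


def next_state(state, event):
    """Follow one edge; events not in the diagram leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def replay(events, state="virtual"):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = next_state(state, event)
    return state
```

For instance, `replay(["lstat", "open for read", "open for write"])` walks a file from virtual through placeholder and hydrated placeholder to full file.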
In the src folder which is the virtualization root after an initial gvfs clone there is a file (file1.txt).
Directory enumeration of src:

1. ProjFS calls StartDirectoryEnumerationCallback.
2. VFSForGit creates an ActiveEnumeration from the current projection and adds it to a list.
3. ProjFS calls GetDirectoryEnumerationCallback.
4. VFSForGit finds the ActiveEnumeration by the enumeration Guid and adds to the enumeration results via the IDirectoryEnumerationResults interface.
5. ProjFS calls EndDirectoryEnumerationCallback when done.
6. VFSForGit removes the ActiveEnumeration from the list.

The files and folders are still all virtual, the same as before the enumeration.
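The bookkeeping around the three enumeration callbacks can be sketched in Python as follows. This is a toy model under stated assumptions, not the VFSForGit code: `EnumerationTracker` and its methods are hypothetical names, a plain list stands in for ActiveEnumeration, and returning results in batches of `max_results` is an assumption about how the results interface fills up.

```python
import uuid


class EnumerationTracker:
    """Toy provider-side bookkeeping for directory enumerations."""

    def __init__(self, projection):
        # projection: dict mapping a directory path to its projected child names
        self._projection = projection
        self._active = {}  # enumeration id -> names still to be returned

    def start(self, enumeration_id, directory):
        # Snapshot the projected entries when the enumeration begins.
        self._active[enumeration_id] = sorted(self._projection.get(directory, []))

    def get(self, enumeration_id, max_results):
        # Return up to max_results names and remember where we stopped.
        pending = self._active[enumeration_id]
        batch = pending[:max_results]
        self._active[enumeration_id] = pending[max_results:]
        return batch

    def end(self, enumeration_id):
        # Drop the bookkeeping once the enumeration is finished.
        del self._active[enumeration_id]

    def active_count(self):
        return len(self._active)


tracker = EnumerationTracker({"src": ["file1.txt", "docs", "build.cmd"]})
enum_id = uuid.uuid4()  # stands in for the enumeration Guid
tracker.start(enum_id, "src")
first = tracker.get(enum_id, 2)
rest = tracker.get(enum_id, 2)
tracker.end(enum_id)
```

Keying the state by the enumeration id lets several enumerations of the same directory run at once without sharing a cursor.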
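Placeholder creation for src/file1.txt:

1. ProjFS calls GetPlaceholderInfoCallback.
2. VFSForGit looks up the path with GetProjectedFileInfo and, if the result is null, returns not found.
3. VFSForGit calls WritePlaceholderInfo to create the placeholder file.

File data request for src/file1.txt:

1. ProjFS calls GetFileDataCallback.
2. VFSForGit uses the contentId, which will be the blob's SHA1, to get the file content.
3. VFSForGit calls CreateWriteBuffer to create an IWriteBuffer.
4. VFSForGit writes the content to IWriteBuffer.Stream.

Conversion of src/file1.txt to a full file:

1. ProjFS calls OnNotifyFilePreConvertToFull.

Deletion of src/file1.txt:

1. ProjFS calls OnNotifyFileHandleClosedFileModifiedOrDeleted.

At this point file1.txt is still in the projection and will be returned by VFSForGit for enumeration requests, but because ProjFS has the tombstone file, and tombstones are given precedence over projected files, ProjFS will not return file1.txt for the enumeration.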
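The precedence rule in the last paragraph can be illustrated with a short Python function. This is a sketch of the merge as described in this document, not ProjFS code; `enumerate_directory` and its parameters are hypothetical.

```python
def enumerate_directory(projected, on_disk, tombstones):
    """Merge projected names with on-disk names, hiding tombstoned ones."""
    visible = set(on_disk)  # whatever is on disk is always shown
    # Projected names are shown only when no tombstone covers them.
    visible.update(name for name in projected if name not in tombstones)
    return sorted(visible)


# file1.txt is still projected, but a tombstone hides it after the delete.
listing = enumerate_directory(
    projected=["file1.txt", "file2.txt"],
    on_disk=["notes.txt"],
    tombstones={"file1.txt"},
)
```

Removing the tombstone, as happens when the projection changes, makes `file1.txt` appear in the listing again.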