Documentation/mm/split_page_table_lock.rst
Originally, mm->page_table_lock spinlock protected all page tables of the mm_struct. But this approach leads to poor page fault scalability of multi-threaded applications due high contention on the lock. To improve scalability, split page table lock was introduced.
With split page table lock we have separate per-table lock to serialize access to the table. At the moment we use split lock for PTE and PMD tables. Access to higher level tables protected by mm->page_table_lock.
There are helpers to lock/unlock a table and other accessor functions:
Split page table lock for PTE tables is enabled compile-time if CONFIG_SPLIT_PTLOCK_CPUS (usually 4) is less or equal to NR_CPUS. If split lock is disabled, all tables are guarded by mm->page_table_lock.
Split page table lock for PMD tables is enabled, if it's enabled for PTE tables and the architecture supports it (see below).
Hugetlb can support several page sizes. We use split lock only for PMD level, but not for PUD.
Hugetlb-specific helpers:
There's no need in special enabling of PTE split page table lock: everything required is done by pagetable_pte_ctor() and pagetable_pte_dtor(), which must be called on PTE table allocation / freeing.
Make sure the architecture doesn't use slab allocator for page table allocation: slab uses page->slab_cache for its pages. This field shares storage with page->ptl.
PMD split lock only makes sense if you have more than two page table levels.
PMD split lock enabling requires pagetable_pmd_ctor() call on PMD table allocation and pagetable_pmd_dtor() on freeing.
Allocation usually happens in pmd_alloc_one(), freeing in pmd_free() and pmd_free_tlb(), but make sure you cover all PMD table allocation / freeing paths: i.e X86_PAE preallocate few PMDs on pgd_alloc().
With everything in place you can set CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK.
NOTE: pagetable_pte_ctor() and pagetable_pmd_ctor() can fail -- it must be handled properly.
page->ptl is used to access split page table lock, where 'page' is struct page of page containing the table. It shares storage with page->private (and few other fields in union).
To avoid increasing size of struct page and have best performance, we use a trick:
The spinlock_t allocated in pagetable_pte_ctor() for PTE table and in pagetable_pmd_ctor() for PMD table.
Please, never access page->ptl directly -- use appropriate helper.