Menu

Improved exploit search engine. Try it out

"Linux - 'page->_refcount' Overflow via FUSE"

Author

"Google Security Research"

Platform

linux

Release date

2019-04-23

Release Date Title Type Platform Author
2019-05-08 "NetNumber Titan ENUM/DNS/NP 7.9.1 - Path Traversal / Authorization Bypass" webapps linux MobileNetworkSecurity
2019-05-08 "MiniFtp - 'parseconf_load_setting' Buffer Overflow" local linux strider
2019-05-03 "Blue Angel Software Suite - Command Execution" remote linux "Paolo Serracino_ Pietro Minniti_ Damiano Proietti"
2019-05-02 "Ruby On Rails - DoubleTap Development Mode secret_key_base Remote Code Execution (Metasploit)" remote linux Metasploit
2019-05-01 "CentOS Web Panel 0.9.8.793 (Free) / v0.9.8.753 (Pro) / 0.9.8.807 (Pro) - Domain Field (Add DNS Zone) Cross-Site Scripting" webapps linux DKM
2019-04-30 "Linux - Missing Locking Between ELF coredump code and userfaultfd VMA Modification" dos linux "Google Security Research"
2019-04-26 "systemd - DynamicUser can Create setuid Binaries when Assisted by Another Process" dos linux "Google Security Research"
2019-04-23 "Linux - 'page->_refcount' Overflow via FUSE" dos linux "Google Security Research"
2019-04-23 "Linux - Missing Locking in Siemens R3964 Line Discipline Race Condition" dos linux "Google Security Research"
2019-04-23 "systemd - Lack of Seat Verification in PAM Module Permits Spoofing Active Session to polkit" dos linux "Google Security Research"
2019-04-19 "SystemTap 1.3 - MODPROBE_OPTIONS Privilege Escalation (Metasploit)" local linux Metasploit
2019-04-12 "Zimbra Collaboration - Autodiscover Servlet XXE and ProxyServlet SSRF (Metasploit)" remote linux Metasploit
2019-04-08 "CentOS Web Panel 0.9.8.793 (Free) / 0.9.8.753 (Pro) - Cross-Site Scripting" webapps linux DKM
2019-04-08 "Apache 2.4.17 < 2.4.38 - 'apache2ctl graceful' 'logrotate' Local Privilege Escalation" local linux cfreal
2019-03-29 "CentOS Web Panel 0.9.8.789 - NameServer Field Persistent Cross-Site Scripting" webapps linux DKM
2019-03-28 "gnutls 3.6.6 - 'verify_crt()' Use-After-Free" dos linux "Google Security Research"
2019-03-22 "snap - seccomp BBlacklist for TIOCSTI can be Circumvented" dos linux "Google Security Research"
2019-03-19 "libseccomp < 2.4.0 - Incorrect Compilation of Arithmetic Comparisons" dos linux "Google Security Research"
2019-03-11 "Linux Kernel 4.4 (Ubuntu 16.04) - 'snd_timer_user_ccallback()' Kernel Pointer Leak" dos linux wally0813
2019-03-07 "Imperva SecureSphere 13.x - 'PWS' Command Injection (Metasploit)" remote linux Metasploit
2019-03-06 "Linux < 4.20.14 - Virtual Address 0 is Mappable via Privileged write() to /proc/*/mem" dos linux "Google Security Research"
2019-03-04 "FileZilla 3.40.0 - 'Local search' / 'Local site' Denial of Service (PoC)" dos linux "Mr Winst0n"
2019-03-01 "Linux < 4.14.103 / < 4.19.25 - Out-of-Bounds Read and Write in SNMP NAT Module" dos linux "Google Security Research"
2019-02-28 "Usermin 1.750 - Remote Command Execution (Metasploit)" webapps linux AkkuS
2019-02-28 "WebKitGTK 2.23.90 / WebKitGTK+ 2.22.6 - Denial of Service" dos linux "Dhiraj Mishra"
2019-02-22 "Micro Focus Filr 3.4.0.217 - Path Traversal / Local Privilege Escalation" webapps linux SecureAuth
2019-02-20 "MatrixSSL < 4.0.2 - Stack Buffer Overflow Verifying x.509 Certificates" dos linux "Google Security Research"
2019-02-21 "Valentina Studio 9.0.5 Linux - 'Host' Buffer Overflow (PoC)" dos linux "Alejandra Sánchez"
2019-02-13 "runc < 1.0-rc6 (Docker < 18.09.2) - Container Breakout (2)" local linux embargo
2019-02-15 "Linux - 'kvm_ioctl_create_device()' NULL Pointer Dereference" dos linux "Google Security Research"
Release Date Title Type Platform Author
2019-05-23 "Microsoft Windows 10 1809 - 'CmKeyBodyRemapToVirtualForEnum' Arbitrary Key Enumeration Privilege Escalation" local windows "Google Security Research"
2019-05-23 "Visual Voicemail for iPhone - IMAP NAMESPACE Processing Use-After-Free" dos ios "Google Security Research"
2019-05-21 "Apple macOS < 10.14.5 / iOS < 12.3 XNU - 'in6_pcbdetach' Stale Pointer Use-After-Free" dos multiple "Google Security Research"
2019-05-21 "Apple macOS < 10.14.5 / iOS < 12.3 XNU - Wild-read due to bad cast in stf_ioctl" dos multiple "Google Security Research"
2019-05-21 "Apple macOS < 10.14.5 / iOS < 12.3 JavaScriptCore - AIR Optimization Incorrectly Removes Assignment to Register" dos multiple "Google Security Research"
2019-05-21 "Apple macOS < 10.14.5 / iOS < 12.3 JavaScriptCore - Loop-Invariant Code Motion (LICM) in DFG JIT Leaves Stack Variable Uninitialized" dos multiple "Google Security Research"
2019-05-21 "Apple macOS < 10.14.5 / iOS < 12.3 DFG JIT Compiler - 'HasIndexedProperty' Use-After-Free" dos multiple "Google Security Research"
2019-05-13 "Google Chrome V8 - Turbofan JSCallReducer::ReduceArrayIndexOfIncludes Out-of-Bounds Read/Write" dos multiple "Google Security Research"
2019-04-30 "Linux - Missing Locking Between ELF coredump code and userfaultfd VMA Modification" dos linux "Google Security Research"
2019-04-26 "systemd - DynamicUser can Create setuid Binaries when Assisted by Another Process" dos linux "Google Security Research"
2019-04-24 "Google Chrome 72.0.3626.121 / 74.0.3725.0 - 'NewFixedDoubleArray' Integer Overflow" remote multiple "Google Security Research"
2019-04-24 "VirtualBox 6.0.4 r128413 - COM RPC Interface Code Injection Host Privilege Escalation" local windows "Google Security Research"
2019-04-23 "Linux - 'page->_refcount' Overflow via FUSE" dos linux "Google Security Research"
2019-04-23 "Linux - Missing Locking in Siemens R3964 Line Discipline Race Condition" dos linux "Google Security Research"
2019-04-23 "systemd - Lack of Seat Verification in PAM Module Permits Spoofing Active Session to polkit" dos linux "Google Security Research"
2019-04-17 "Oracle Java Runtime Environment - Heap Corruption During TTF font Rendering in GlyphIterator::setCurrGlyphID" dos multiple "Google Security Research"
2019-04-17 "Oracle Java Runtime Environment - Heap Corruption During TTF font Rendering in sc_FindExtrema4" dos multiple "Google Security Research"
2019-04-16 "Microsoft Windows 10 1809 - LUAFV PostLuafvPostReadWrite SECTION_OBJECT_POINTERS Race Condition Privilege Escalation" local windows "Google Security Research"
2019-04-16 "Microsoft Windows 10 1809 - LUAFV Delayed Virtualization Cache Manager Poisoning Privilege Escalation" local windows "Google Security Research"
2019-04-16 "Microsoft Windows 10 1809 - LUAFV NtSetCachedSigningLevel Device Guard Bypass" local windows "Google Security Research"
2019-04-16 "Microsoft Windows 10 1809 - LUAFV LuafvCopyShortName Arbitrary Short Name Privilege Escalation" local windows "Google Security Research"
2019-04-16 "Microsoft Windows 10 1809 - LUAFV Delayed Virtualization Cross Process Handle Duplication Privilege Escalation" local windows "Google Security Research"
2019-04-16 "Microsoft Windows 10 1809 - LUAFV Delayed Virtualization MAXIMUM_ACCESS DesiredAccess Privilege Escalation" local windows "Google Security Research"
2019-04-16 "Microsoft Windows 10 1809 / 1709 - CSRSS SxSSrv Cached Manifest Privilege Escalation" local windows "Google Security Research"
2019-04-03 "Google Chrome 72.0.3626.96 / 74.0.3702.0 - 'JSPromise::TriggerPromiseReactions' Type Confusion" remote multiple "Google Security Research"
2019-04-03 "Google Chrome 73.0.3683.39 / Chromium 74.0.3712.0 - 'ReadableStream' Internal Object Leak Type Confusion" dos multiple "Google Security Research"
2019-04-03 "Google Chrome 72.0.3626.81 - 'V8TrustedTypePolicyOptions::ToImpl' Type Confusion" dos multiple "Google Security Research"
2019-04-03 "WebKitGTK+ - 'ThreadedCompositor' Race Condition" dos multiple "Google Security Research"
2019-04-03 "WebKit JavaScriptCore - CodeBlock Dangling Watchpoints Use-After-Free" dos multiple "Google Security Research"
2019-04-03 "WebKit JavaScriptCore - Out-Of-Bounds Access in FTL JIT due to LICM Moving Array Access Before the Bounds Check" dos multiple "Google Security Research"
import requests
response = requests.get('https://www.nmmapper.com/api/exploitdetails/46745/?format=json')
                                                {"url": "https://www.nmmapper.com/api/exploitdetails/46745/?format=json", "download_file": "https://www.nmmapper.com/st/exploitdetails/46745/41184/linux-page-_refcount-overflow-via-fuse/download/", "exploit_id": "46745", "exploit_description": "\"Linux - 'page->_refcount' Overflow via FUSE\"", "exploit_date": "2019-04-23", "exploit_author": "\"Google Security Research\"", "exploit_type": "dos", "exploit_platform": "linux", "exploit_port": null}
                                            

For full documentation follow the link above

Browse exploit DB API Browse

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
Linux: page->_refcount overflow via FUSE with ~140GiB RAM usage

Tested on:
Debian Buster
distro kernel "4.19.0-1-amd64 #1 SMP Debian 4.19.12-1 (2018-12-22)"
KVM guest with 160000MiB RAM

A while back, there was some discussion about possible overflows of the
`mapcount` in `struct page`, started by Daniel Micay.
See the following threads:

https://lore.kernel.org/lkml/CAG48ez3R7XL8MX_sjff1FFYuARX_58wA_=ACbv2im-XJKR8tvA@mail.gmail.com/t/#u
"Re: [PATCH v5 07/27] mm/mmap: Create a guard area between VMAs"
Sent by me, forwarding Daniel Micay's concern about overflows of `mapcount`.

https://lore.kernel.org/lkml/20180208021112.GB14918@bombadil.infradead.org/T/
"[RFC] Warn the user when they could overflow mapcount"
from Matthew Wilcox <willy@infradead.org>


I have now noticed that the `_refcount` has a similar problem, and it is
possible to overflow it on a machine with ~140GiB of RAM (or probably also less
on kernels that have commit 5da784cce4308 ("fuse: add max_pages to init_out"),
but that's very recent, it landed in 4.20).

A FUSE request can, by default (and on kernels <4.20 always), contain up to
FUSE_DEFAULT_MAX_PAGES_PER_REQ==32 (on older kernels FUSE_MAX_PAGES_PER_REQ==32)
page references. (>=4.20 allows the user to bump that limit up to
FUSE_MAX_MAX_PAGES==256.) The page references in a FUSE request are stored as
an array whose elements are concatenations of a `struct page *` and a
`struct fuse_page_desc` (8 bytes, containing length and offset inside the page).
This means that each page reference consumes 16 bytes, so to overflow the
32-bit `_refcount` of a page, pow(2,32)*16B=64GiB of kernel memory are needed as
storage for such references allocated with fuse_req_pages_alloc(). All other
overhead is at least per-FUSE-request and distributed over
FUSE_DEFAULT_MAX_PAGES_PER_REQ==32 references.

FUSE does permit read/write operations that operate on more pages than the
maximum FUSE request page count; in this case, if direct I/O is used,
fuse_direct_io() splits the operation into multiple requests. This means that
the only limits at the VFS layer are MAX_RW_COUNT==0x7ffff000 and
UIO_MAXIOV==0x400.

This means that it is possible to create 0x7ffff references to a page that can
be freely mapped in userspace as follows:

 - Set up a virtual memory area that contains 0x200 consecutive mappings of the
   same page.
 - Create an array of UIO_MAXIOV==0x400 identical IO vectors that point to the
   area containing the 0x200 mappings.
 - Open a FUSE-backed file with O_DIRECT. (This file should ***NOT*** be served
   as FOPEN_DIRECT_IO by the FUSE filesystem, that prevents AIO from working
   AFAICS! That probably counts as a bug if I'm right...)
 - Use the UIO_MAXIOV==0x400 IO vectors for a read operation on the file.
 - Let the FUSE filesystem leave the read requests pending.

By sending 0x2000 such read operations, the _refcount can be brought close to
overflow.

(Technically, you could play games with unaligned addresses and such to increase
the number of references per read operation a bit further.)

In order to avoid needing one client-side userspace thread per read operation,
it is possible to use AIO. AIO is able to send read operations that will be
processed asynchronously by FUSE; however, FUSE limits the number of resulting
FUSE requests ***per FUSE filesystem*** to a variable number that depends on the
amount of physical memory the system has (see sanitize_global_limit(); the limit
is the amount of RAM multiplied with 2^-13). Since this limit is per-filesystem,
as long as a single filesystem operation's FUSE requests fit in the limit,
an attacker can distribute the filesystem operations across multiple FUSE
filesystems.

AIO also imposes a global limit on the number of pending operations.
The official limit for pending AIO operations across the system is
aio_max_nr==0x10000; however, as a comment in fs/aio.c explains,
the real limit is significantly higher, and up to 0x10000 *pages* of
io_event structs (minus the overhead of `struct aio_ring`)
can be used (see aio_setup_ring()); this means that the real limit is
0x10000*((0x1000-128)/32)==0x7c0000 operations.
But since the bug can be triggered with ~0x2000 parallel pread operations, that
doesn't matter here anyway.


I am attaching a crash PoC.

First, to make it possible to call dump_page() from userspace for easier
debugging:

 - Unpack dump_page_dev.tar.
 - Build the kernel module in dump_page_dev/ with "make".
 - Load the built kernel module with "sudo insmod dump_page_dev.ko".

For the actual PoC:

 - Ensure that there is no distro-specific sysctl that prevents unprivileged
   namespace creation (on Debian:
   "echo 1 > /proc/sys/kernel/unprivileged_userns_clone"). This is necessary
   to be able to create a mount namespace and mount as many FUSE filesystems as
   we want in there; the SUID fusermount helper imposes a limit of 1000 FUSE
   mounts.
 - Unpack fuse_aio.tar.
 - Build the PoC with ./compile.sh.
 - Launch a new graphical terminal with multiple tabs in a new mount namespace,
   using a command like
   `unshare -mUrp --mount-proc --fork xfce4-terminal --disable-server`.
 - Inside the namespace, run ./fuse_aio to mount 0x2000 FUSE filesystems.
 - In a second terminal tab inside the namespace, run ./aio_reader to trigger
   the bug.
 - Wait and watch `sudo dmesg -w`.

You should see debug output like this in dmesg:

[  304.782310] fuse init (API version 7.27)
[  309.607367] mmap: aio_reader (10371) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.rst.
[  309.631150] dump_page: ---------- STARTING DUMP ----------
[  309.631154] dump_page: DUMP MARKER: 0x0
[  309.631158] page:fffff7bad9e04fc0 count:8194 mapcount:8192 mapping:ffffa0f08abdb358 index:0x0
[  309.631162] flags: 0x17fffc00004007c(referenced|uptodate|dirty|lru|active|swapbacked)
[  309.631165] raw: 017fffc00004007c fffff7bad9e049c8 ffffa0f0a04e0c10 ffffa0f08abdb358
[  309.631167] raw: 0000000000000000 0000000000000000 0000200200001fff ffffa0f0a036e000
[  309.631169] page dumped because: dump requested via ioctl
[  309.631170] page->mem_cgroup:ffffa0f0a036e000
[  309.631171] dump_page: ==========  END OF DUMP  ==========
[  309.667063] dump_page: ---------- STARTING DUMP ----------
[  309.667067] dump_page: DUMP MARKER: 0x1
[  309.667070] page:fffff7bad9e04fc0 count:532481 mapcount:8192 mapping:ffffa0f08abdb358 index:0x0
[  309.667074] flags: 0x17fffc00004007c(referenced|uptodate|dirty|lru|active|swapbacked)
[  309.667078] raw: 017fffc00004007c fffff7bad9e049c8 fffff7bad9d09a08 ffffa0f08abdb358
[  309.667080] raw: 0000000000000000 0000000000000000 0008200100001fff ffffa0f0a036e000
[  309.667081] page dumped because: dump requested via ioctl
[  309.667082] page->mem_cgroup:ffffa0f0a036e000
[  309.667083] dump_page: ==========  END OF DUMP  ==========
[  423.507289] dump_page: ---------- STARTING DUMP ----------
[  423.507293] dump_page: DUMP MARKER: 0x2
[  423.507296] page:fffff7bad9e04fc0 count:-2147479550 mapcount:8192 mapping:ffffa0f08abdb358 index:0x0
[  423.507299] flags: 0x17fffc00004007c(referenced|uptodate|dirty|lru|active|swapbacked)
[  423.507302] raw: 017fffc00004007c fffff7bad9e049c8 fffff7bad9d09a08 ffffa0f08abdb358
[  423.507303] raw: 0000000000000000 0000000000000000 8000100200001fff ffffa0f0a036e000
[  423.507304] page dumped because: dump requested via ioctl
[  423.507305] page->mem_cgroup:ffffa0f0a036e000
[  423.507306] dump_page: ==========  END OF DUMP  ==========
[  608.388324] dump_page: ---------- STARTING DUMP ----------
[  608.388333] dump_page: DUMP MARKER: 0x3
[  608.388340] page:fffff7bad9e04fc0 count:2 mapcount:8192 mapping:ffffa0f08abdb358 index:0x0
[  608.388347] flags: 0x17fffc00004007c(referenced|uptodate|dirty|lru|active|swapbacked)
[  608.388353] raw: 017fffc00004007c fffff7bad9e049c8 fffff7bad9d09a08 ffffa0f08abdb358
[  608.388358] raw: 0000000000000000 0000000000000000 0000000200001fff ffffa0f0a036e000
[  608.388361] page dumped because: dump requested via ioctl
[  608.388363] page->mem_cgroup:ffffa0f0a036e000
[  608.388365] dump_page: ==========  END OF DUMP  ==========
[  608.390616] dump_page: ---------- STARTING DUMP ----------
[  608.390620] dump_page: DUMP MARKER: 0x4
[  608.390624] page:fffff7bad9e04fc0 count:-510 mapcount:7680 mapping:ffffa0f08abdb358 index:0x1
[  608.390628] flags: 0x17fffc000000004(referenced)
[  608.390632] raw: 017fffc000000004 fffff7ba54000948 ffffa0f0b35e62f8 ffffa0f08abdb358
[  608.390636] raw: 0000000000000001 0000000000000000 fffffe0200001dff 0000000000000000
[  608.390639] page dumped because: dump requested via ioctl
[  608.390641] dump_page: ==========  END OF DUMP  ==========
[...]
[  608.409077] dump_page: ---------- STARTING DUMP ----------
[  608.409079] dump_page: DUMP MARKER: 0x4
[  608.409081] page:fffff7bad9e04fc0 count:-7678 mapcount:512 mapping:ffffa0f08abdb358 index:0x1
[  608.409083] flags: 0x17fffc000000004(referenced)
[  608.409085] raw: 017fffc000000004 fffff7ba54000948 ffffa0f0b35e62f8 ffffa0f08abdb358
[  608.409086] raw: 0000000000000001 0000000000000000 ffffe202000001ff 0000000000000000
[  608.409087] page dumped because: dump requested via ioctl
[  608.409088] dump_page: ==========  END OF DUMP  ==========
[  608.409988] dump_page: ---------- STARTING DUMP ----------
[  608.409990] dump_page: DUMP MARKER: 0x5
[  608.409992] page:fffff7bad9e04fc0 count:-8189 mapcount:1 mapping:ffffa0f08abdb358 index:0x1
[  608.409994] flags: 0x17fffc000000004(referenced)
[  608.409996] raw: 017fffc000000004 fffff7ba54000948 ffffa0f0b35e62f8 ffffa0f08abdb358
[  608.409999] raw: 0000000000000001 0000000000000000 ffffe00300000000 0000000000000000
[  608.410000] page dumped because: dump requested via ioctl
[  608.410000] dump_page: ==========  END OF DUMP  ==========

As you can see, the reference count of the page (when interpreted as an unsigned
number) goes up to 2^32-1 and wraps around, then goes down again and wraps back. 
When the refcount wraps back, the page AFAIU moves onto a freelist, and you can
see that e.g. its flags change at that point.

If you interact with the system a bit at this point, you'll soon run into
various kinds of kernel BUG()s.


My guess is that most people don't have machines with >=140GiB RAM at this
point, so luckily, issues like this are probably not a big problem for most
users yet.

As far as I can tell, there are a bunch of potential ways to deal with this
issue:

1. Make refcount/mapcount bigger; but as Matthew Wilcox points out in
   <https://lore.kernel.org/lkml/20180208194235.GA3424@bombadil.infradead.org/>,
   that would cost something like 2GiB of RAM on a machine with 1TiB RAM.
2. Dirty hack: Detect refcount/mapcount overflow and freeze them at a high
   value, in order to deterministically leak references to that page.
   Downside is that memory is still going to leak permanently.
   This is what refcount_t does on X86 or when CONFIG_REFCOUNT_FULL is set.
3. Daniel Micay's suggestion: Dynamically switch from a small inline refcount to
   an out-of-line refcount in some sort of lookup structure
   (<https://lore.kernel.org/lkml/CA+DvKQKba0iU+tydbmGkAJsxCxazORDnuoe32sy-2nggyagUxQ@mail.gmail.com/>).
4. Ad-hoc fixes to keep the number of possible references down, see e.g.:
    - https://lore.kernel.org/lkml/20180208213743.GC3424@bombadil.infradead.org/
    - commit 92117d8443bc5afacc8d5ba82e541946310f106e ("bpf: fix refcnt overflow")

Number 1 is obviously correct, but probably unacceptable given its cost; number
4 is probably the next-easiest solution for any specific way to overflow some
reference counter, but as Daniel said, it smells of whack-a-mole.
That leaves numbers 2 and 3, I guess, unless someone has a better idea?


Proof of Concept:
https://github.com/offensive-security/exploit-database-bin-sploits/raw/master/bin-sploits/46745.zip