    There's a party at Ring0 and you're invited
                  BlackHat Las Vegas 2010
               Tavis Ormandy, Julien Tinnes

                                                           - digit!

                                          WANGyu aka. keen

---[ Overview ]

All systems make some assumptions about kernel security. Sometimes a single kernel flaw can break the entire security model

Things like sandboxing in Google Chrome and Android make us even more dependent on kernel security

Over the last year we've been involved in finding, fixing and mitigating some fascinating kernel bugs, and we want to share some of the details

We will discuss some of the ways to protect the kernel from malicious userland code, and mitigate unknown kernel vulnerabilities

---[ Targeting the kernel ]


---[ Local privilege escalation ]

You have arbitrary code execution on a machine

You want to escalate (or change) privileges

What can you target?

Processes with more/other privileges (running daemons, suid binaries you can execute on Unix)

The kernel

Big code base, Performs complex, error-prone tasks, Responsible for the security model of the system

---[ The Linux kernel as a local target ]

The Linux kernel has been a target for over a decade

Memory/memory management corruption vs. logical bug

The complexity of a kernel makes for more diverse and interesting logical bugs. Fun logical bugs include:

ptrace() / suidexec       (Nergal, CVE-2001-1384)
ptrace() / kernel threads (cliph / Szombierski, CVE-2003-0127)
/proc file give-away      (H00lyshit, CVE-2006-3626)
prctl suidsafe            (CVE-2006-2451)

---[ Linux kernel memory management bugs ]

Tend to be more interesting and diverse than their userland counterparts

Complexity of memory management

Interesting different paradigm (the attacker finely controls a full address space)

cliph / ihaquer do_brk()                                (CVE-2003-0961)
cliph / ihaquer / Devine / others "Lost-VMA"-style bugs (check isec.pl)
Couple of "classic" overflows
Null (or to-userland) pointer dereferences

---[ Escaping through the kernel ]

Exploiting the kernel is often the easiest way out of:

chroot() jails

Mandatory access control

Container-style segregation (vserver etc.)

Using those for segregation, you mostly expose the full kernel attack surface

Virtualization is a popular alternative

MAC makes more sense in a full security patch such as Grsecurity

---[ Windows and local kernel bugs ]

Traditionally, local kernel bugs were not considered relevant on Windows

Changed somewhat recently

Increased reliance on domain controls

Use of network services

Introduction of features like protected mode/integrity levels

This has changed in the last few years and Windows is roughly in the same situation as Linux now

With a bit less focus on advanced privilege separation and segregation (lacks MAC, for instance)

---[ Remote kernel bugs ]

Published exploits are still quite rare for Linux

Notable exceptions

Wifi drivers (big attack surface, poorly written code)

See a few exploits by Stephane Duverger, sgrakkyu or Julien

Read Stephane's paper

sgrakkyu's impressive SCTP exploit (read his article co-written with twiz in Phrack)

Few others

---[ Remote kernel bugs 2 ]

Have been quite popular on Windows for at least 6/7 years

Third party antivirus and personal firewall code

GDI-related bugs

TCP/IP stack related ones (Neel Mehta et al.)

Immunity's SMBv2 exploit

Web browsers changed the game

The threat model for in-kernel GDI is now different

See also the remotely exploitable NVidia drivers bug on Linux

Stay tuned

---[ Some bugs from the last year ]


---[ Timeline ]

See the diagram in the original slides

---[ Exposing kernel attack surface ]

There are many entry points through which attackers can reach kernel attack surface. Apart from system calls, there are also:

Ioctls, devices, kernel parsers, filesystems, network protocols, fonts, bitmaps, etc. (primarily Windows)
Executable formats (COFF, ELF, a.out, etc.)

Perhaps one underappreciated entry point is dpl3 interrupt handlers, so we decided to take a look

---[ Windows 2003 KiRaiseAssertion bug ]

In Windows Server 2003, Microsoft introduced a new dpl3 (accessible to ring3 code) IDT entry (KiRaiseAssertion in the public symbols)

This makes int 0x2c roughly equivalent to RaiseException (STATUS_ASSERTION_FAILED)

I've never seen this feature used, but analysis revealed an interesting error; interrupts were not enabled before the exception dispatch!

This bug has two interesting characteristics...

---[ Windows 2003 KiRaiseAssertion bug ]

Tiny exploit (4 bytes)

    00000000  31E4      xor esp,esp
    00000002  CD2C      int 0x2c

Tiny patch (1 byte)


---[ Page fault exceptions ]

A page fault exception occurs when code:

Attempts to access a non-present page

Has insufficient privilege to access a present page

Various other paging related errors

The handler is passed a set of flags describing the error:

    I/D - Instruction / Data Fetch
    U/S - User / Supervisor Mode
    W/R - Write / Read access
      P - Present / Not present
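
The flags listed above correspond to bits of the error code that the CPU pushes for a #PF. As a minimal sketch, assuming the bit positions documented in the Intel manuals (P = bit 0, W/R = bit 1, U/S = bit 2, I/D = bit 4), a handler could decode them as follows. The macro and function names are made up for this illustration and are not taken from any kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* x86 #PF error code bits. Names are illustrative only. */
#define PF_PRESENT (1u << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE   (1u << 1)  /* 0: read access,      1: write access         */
#define PF_USER    (1u << 2)  /* 0: supervisor mode,  1: user mode (CPL = 3)  */
#define PF_INSTR   (1u << 4)  /* 0: data access,      1: instruction fetch    */

/* Returns 1 when the faulting access was made in supervisor mode. */
int pf_from_supervisor(uint32_t error_code)
{
    return (error_code & PF_USER) == 0;
}

/* Writes a short human-readable description of the error code into buf. */
void pf_describe(uint32_t error_code, char *buf, size_t len)
{
    snprintf(buf, len, "%s %s %s %s",
             (error_code & PF_USER)    ? "user"        : "supervisor",
             (error_code & PF_WRITE)   ? "write"       : "read",
             (error_code & PF_INSTR)   ? "instruction" : "data",
             (error_code & PF_PRESENT) ? "present"     : "not-present");
}
```

Note that the U/S bit is 1 for user mode accesses, so "supervisor" is the cleared state of that bit.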

---[ Supervisor mode ]

If the processor is privileged when the exception occurs, the supervisor bit is set (see note *)

[Translator's note *] The wording above is imprecise. The U/S bit is encoded as follows:

      U/S - 0  The access causing the fault originated when the processor
               was executing in supervisor mode (CPL < 3).
            1  The access causing the fault originated when the processor
               was executing in user mode (CPL = 3).


Operating system kernels use this to detect when special conditions occur

This could mean a kernel bug is encountered. Oops, BugCheck, Panic, etc.

Or some other unusual low-level event

Can also happen in specific situations (copy-from-user etc...)

If the processor can be tricked into setting the flag incorrectly, ring3 code can confuse the privileged code handling the interrupt

---[ VMware's incorrect #PF handling ]

By studying the machine state while executing a Virtual-8086 mode task, we found a way to cause VMware to set the supervisor bit for user mode page faults

Far calls in Virtual-8086 mode were emulated incorrectly

When the cs:ip pair is pushed onto the stack, this is done with supervisor access

We were able to exploit this to gain ring0 in VMware guests

The Linux kernel checks for a magic CS value to check for PNPBIOS support

But... because we're in Virtual-8086 mode, we must be permitted any CS value

---[ Exploiting the incorrect U/S bit ]

We can exploit this error :-)

We mmap() our shellcode at NULL, then enter vm86 mode

mmap_min_addr was beginning to gain popularity at the time we were working on this, so we bypassed that as well (CVE-2009-1895)

When we far call with a non-present page at ss:sp, a #PF is delivered

Because we can spoof arbitrary CS, we set a value that the kernel recognises as a PNPBIOS fault

The kernel tries to call the PNPBIOS fault handler

But because this is not a real fault, the handler will be NULL

The shellcode we mmap()ed at NULL is executed

=> r00t

---[ Exploiting the incorrect U/S bit (demo code) ]

Triggering this issue was simple; we used a code sequence like this:

    vm.regs.esp = 0xDEADBEEF;
    vm.regs.eip = 0x00000000;
    vm.regs.cs = 0x0090;
    vm.regs.ss = 0xFFFF;

    CODE16("call 0xaabb:0xccdd", code, codesize);

    memcpy(REAL(vm.regs.cs, vm.regs.eip), code, codesize);

    vm86(Vm86Enter, &vm);
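
The snippet above places code with a REAL(segment, offset) helper whose definition is not shown. In real mode and Virtual-8086 mode a segment:offset pair resolves to the linear address segment * 16 + offset, so a helper of that kind presumably does something like the sketch below. The name real_linear is hypothetical, and the ss:sp example assumes only the low 16 bits of esp are used as sp.

```c
#include <assert.h>
#include <stdint.h>

/* Linear address of seg:off in real mode or Virtual-8086 mode.
 * Hypothetical helper, shown only to illustrate the arithmetic. */
uint32_t real_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

With the register values from the snippet, cs:ip = 0x0090:0x0000 resolves to linear address 0x900, and ss:sp = 0xFFFF:0xBEEF resolves to 0x10BEDF.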

---[ Fun with page faults ]

If the kernel ever trusts data from userspace, a security issue may exist

However, it's worth remembering that it's not just the data that users control, it's also the presence or absence of data

By claiming to have more data available than we really do, we can reach lots of unusual error paths

This is especially true on Windows where the base system types are large inter-dependent structures

We found an interesting example of this problem on Windows NT, resulting in a privilege escalation

MS10-015, a double-free in NtFilterToken()

---[ Windows NT NtFilterToken() bug ]

NtFilterToken() is the system service that makes routines like CreateRestrictedToken() work

NtFilterToken() would pass a (void **) to a helper routine, which would be used to store the captured data

I can force the capture to fail by claiming the SID is bigger than it really is, and forcing the structure to straddle a page boundary

---[ Windows NT NtFilterToken() bug ]

On error, the helper routine releases the buffer but doesn't reset the (void **) parameter, which NtFilterToken() will release again!

The kernel detects a double free and BugChecks, so we only get one attempt to exploit this...

We need to get the buffer reallocated within a small window. This is possible, but unfortunately is unavoidably unreliable
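
The pattern described here (a helper frees a buffer on error but leaves the caller's pointer untouched, and the caller's cleanup frees it again) can be modelled in user space. The sketch below is not the NtFilterToken() code; all names are hypothetical, and readable_len merely stands in for "how much of the source is really accessible".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Buggy shape: on failure the buffer is freed, but *out keeps the stale
 * pointer, so a caller that frees *out during cleanup frees it twice. */
int capture_buggy(void **out, const void *src,
                  size_t claimed_len, size_t readable_len)
{
    *out = malloc(claimed_len ? claimed_len : 1);
    if (*out == NULL)
        return -1;
    if (claimed_len > readable_len) {   /* models the faulting copy */
        free(*out);
        return -1;                      /* *out is now dangling */
    }
    memcpy(*out, src, claimed_len);
    return 0;
}

/* Fixed shape: the out parameter is reset whenever the buffer is released. */
int capture_fixed(void **out, const void *src,
                  size_t claimed_len, size_t readable_len)
{
    *out = malloc(claimed_len ? claimed_len : 1);
    if (*out == NULL)
        return -1;
    if (claimed_len > readable_len) {
        free(*out);
        *out = NULL;
        return -1;
    }
    memcpy(*out, src, claimed_len);
    return 0;
}

/* Caller-side cleanup that frees whatever the slot currently holds. */
void caller_cleanup(void **slot)
{
    free(*slot);        /* free(NULL) is a no-op */
    *slot = NULL;
}
```

With capture_fixed(), the caller's cleanup is harmless after a failed capture. With capture_buggy(), the same cleanup would release the buffer a second time.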

Example Code: http://bit.ly/b9tPqn

---[ Windows NT TTF font parsing bug ]

"Moving [...] the GDI from user mode to kernel mode has provided improved performance without any significant decrease in system stability or reliability."
(Windows Internals, 4th Ed., Microsoft Press)

GDI represents a significant kernel attack surface, and is perhaps the most easily accessible remotely

We identified font parsing as one of the likely weak points, and easily accessible via Internet Explorer's @font-face support

This resulted in perhaps our most critical discovery, remote ring0 code execution when a user visits a hostile website
(even for unprivileged or protected mode users)

---[ Windows NT TTF font parsing bug ]

The font format supported by Internet Explorer is called EOT (Embedded OpenType), essentially a trivial DRM layer added to TTF format fonts

EOT also defines optional sub-formats called CTF and MTX (in which we also identified ring3 vulnerabilities, see MS10-001 and others), but these are essentially TTF with added compression and reduced redundancy. See http://www.w3.org/Submission/2008/SUBM-EOT-20080305/

EOT also adds support for XOR encryption, and other advanced DRM techniques to stop you pirating Comic Sans

The t2embed library handles reconstructing TTF files from EOT input, including decryption and so on, at which point GDI takes over

---[ Windows NT TTF font parsing bug ]

We found multiple integer errors when GDI parses TTF directories (these directories simply describe the position of each table in the file)

This code is executed at ring0, and was essentially unchanged since at least NT4

Microsoft wasn't alone, most other implementations we tested were vulnerable,

but as the decoder ran at ring0 on Microsoft platforms, the impact was far more serious
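
The talk does not spell out the exact integer errors, so the following is only a generic illustration of the class: a bounds check on a table's offset and length that wraps around in 32-bit arithmetic, next to a variant that cannot wrap. Function names and the field layout are assumptions for this sketch, not GDI's code.

```c
#include <assert.h>
#include <stdint.h>

/* Naive check: offset + length can wrap around in 32-bit arithmetic,
 * so a huge offset plus a small length appears to fit in the file. */
int table_in_bounds_naive(uint32_t offset, uint32_t length, uint32_t file_size)
{
    return (uint32_t)(offset + length) <= file_size;
}

/* Overflow-safe check: never adds two attacker-controlled values. */
int table_in_bounds_safe(uint32_t offset, uint32_t length, uint32_t file_size)
{
    return offset <= file_size && length <= file_size - offset;
}
```

For example, offset 0xFFFFFFF0 with length 0x20 wraps to 0x10 and passes the naive check against a 0x1000-byte file, while the safe variant rejects it.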

---[ NULL pointer dereferences ]

To-userland pointer dereferences

If at any time the kernel trusts data in user space, privilege escalation is likely

NULL dereferences are a common error

NULL is a common initialization value and a common error return value for pointers

NULL is a special value in C, but has no special meaning to the underlying hardware on x86

---[ NULL pointer dereferences ]

Interestingly, they used to not be exploitable in Linux 2.0/i386

Segmentation was used

A dereferenced pointer without a segment override would not reach userland

Wrong pointer dereferences didn't become "to-userland" pointer dereferences

thus their destination would be harder to control

Interesting threads in ~2004/2005, where many Linux kernel developers did not understand the security consequences

Was still the case for some of them until recently

Will talk about mmap_min_addr later

---[ Linux kernel sock_sendpage() ]

CVE-2009-2692, we found it last August

Affected all 2.4 and 2.6 kernels to date

Every major distribution shipped vulnerable kernels

NULL function pointer dereference

Trivial to exploit

---[ Linux kernel sock_sendpage() ]

Every socket in the Linux kernel has a set of function pointers associated with it called proto_ops (Protocol Operations)

These implement the various operations that can be performed on a socket, e.g. accept, bind, shutdown, and so on

The general socket management code doesn't have to know about the underlying transport or protocol, because this is all abstracted away

The proto_ops definition is available in include/linux/net.h

---[ Linux kernel sock_sendpage() ]

Drivers implement the operations they support and point operations they don't support to pre-defined kernel stubs

This model is very fragile if you add a new operation:

You need to update all drivers and point the new operation to a stub (or implement it)

It's a lot of code to update, including macros used for initialization

When sock_sendpage() was added, it assumed the corresponding proto_ops field would always be correctly initialized

---[ Linux kernel sock_sendpage() ]

Unfortunately, a lot of drivers did not get properly updated

The SOCKOPS_WRAP macro had a bug

Used by many drivers to initialize proto_ops

Making them vulnerable in any case

.sendpage was implicitly initialized to NULL for many drivers

And sock_sendpage() would start executing code at NULL

Map your shellcode at NULL and it'll get executed

We wrote a trivial exploit that we shared with vendors; spender released one with a fully featured shellcode
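
The fragility described above comes from C initialization rules: any member omitted from a struct initializer is implicitly NULL. The sketch below is a user-space model with a two-member stand-in for proto_ops. The struct, the stub, and the checked dispatch function are all hypothetical names, and the NULL check is shown as one possible defensive option rather than as the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct proto_ops. */
struct ops_model {
    int (*sendmsg)(int len);
    int (*sendpage)(int len);   /* operation added later */
};

int demo_sendmsg(int len)     { return len; }
int no_sendpage_stub(int len) { (void)len; return -1; }  /* "not supported" */

/* A driver written before .sendpage existed: the member is implicitly NULL. */
const struct ops_model old_driver_ops = {
    .sendmsg = demo_sendmsg,
};

/* Defensive dispatch: fall back to the stub instead of calling address 0. */
int sendpage_checked(const struct ops_model *ops, int len)
{
    if (ops->sendpage == NULL)
        return no_sendpage_stub(len);
    return ops->sendpage(len);
}
```

Calling ops->sendpage(len) directly on old_driver_ops would jump to address 0, which is exactly what an attacker wants once a page is mapped there.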

---[ Linux udp_sendmsg() ]

CVE-2009-2698, released in August

It's possible to trigger a codepath in udp_sendmsg() that will result in calling ip_append_data() with a NULL routing table

This time, it's a data NULL pointer dereference

An attacker will control the kernel's data (rtable) through address NULL

Still exploitable

---[ Linux fasync use-after-free ]

Drivers which want to provide asynchronous I/O notification have a linked list of fasync_struct containing fds (and the corresponding file structure) to notify

The same file structure could be in multiple fasync_struct lists

Most notably a special one for locked files

If the file was locked, and then closed, a logical bug would remove the file structure only from the special locked files linked list and free the file structure

The driver would still have a reference to this freed file structure

Gabriel Campana wrote an exploit

Tricky to make it reliable

---[ NetBSD iret #GP handling bug ]

An inter-privilege iret can fail before the privilege switch occurs

For instance, if the restored EIP is past the code segment limit, #GP will occur while in kernel mode

No privilege switch occurs, so no stack switch

No saved stack information on the trap frame

But NetBSD expects a full trap frame

Due to the non executable stack emulation, this can happen during a legitimate program's execution

---[ Windows NT #GP trap handler bug ]

After discovering these fun bugs in interrupt handlers, we audited the remaining interrupt handlers

One section of code in KiTrap0D (the name of the #GP trap handler in the public symbols) appeared to trust the contents of the trap frame

The code itself is a component of the Virtual-8086 monitor, introducing lots of fun special cases that few people are familiar with

It took another two weeks of research to figure out how to reach the code and write a reliable exploit, but the end result was a fascinating and ancient vulnerability in the core of Windows NT

---[ BIOS calls and sensitive instructions ]

If you can remember programming MS-DOS, you'll be familiar with int 0x21 to invoke system services

BIOS calls were then used to interact with hardware; most people will remember int 0x10 was used for video related services

In Virtual-8086 mode, these services are intercepted by the monitor code

"Sensitive Instructions" is the term given by Intel to any action in Virtual-8086 mode that real mode programs expect to be able to perform, but cannot be permitted in protected mode

These actions trap, and the kernel is given an opportunity to decide how to proceed

---[ Windows NT #GP trap handler bug ]

The design of the Virtual-8086 monitor in Windows NT has barely changed since its original implementation in the early nineties

In order to support BIOS service routines, a stub exists in the #GP trap handler that restores execution context from the trap frame

Access to this code is authenticated, but by magic values that I knew we could forge from our work on VMware

However, there were several hurdles we needed to overcome before we could reach this code; each one was an interesting exercise

---[ Windows NT #GP trap handler bug ]

The Virtual-8086 monitor is exposed via the undocumented system service NtVdmControl()

This call is authenticated; a process is required to have a flag called VdmAllowed in order to access it

We found that the VdmAllowed flag can only be set with SeTcbPrivilege (which is only granted to the most privileged code)

We were able to defeat this check by requesting the NTVDM subsystem, and then using CreateRemoteThread() to execute within the authorised subsystem process

Now that we were authorised to access NtVdmControl(), we could try to reach the vulnerable code...

---[ Windows NT #GP trap handler bug ]

The vulnerable code was guarded by a test for a specific cs:eip pair in the trap frame

We can forge trap frames by making iret fail, but we still can't request iret return into arbitrary code segments,

as this would be an obvious privilege escalation (rpl0)

But... cs loses its special meaning in Virtual-8086 mode, which is guaranteed to always be cpl3, so it's reasonable to request any value

We still need to cause iret to #GP. We did this by setting eflags.TF=1 when returning. This is considered "sensitive", and we get #GP instead

This is poorly documented by Intel, but is self-evident from experimentation

---[ Automated fuzzing ]


---[ Bug hunting in system services ]

On Windows, the system call interface is complex, unstable, unsupported and undocumented

It's also vast, with ~1400 entries (cf. Linux ~300)

They are designed to only ever be called by Microsoft code

They rarely see exposure to malformed parameters, so simple fuzzing will generally expose interesting bugs

The parameters are often complex objects, multiple levels deep with large inter-dependencies

Pathological parameters will often reach rarely exercised code

Of course, the kernel also parses fonts, pixmaps, and other complex formats all at ring0...

All excellent fuzz candidates!

---[ Fuzzing system services ]

Trivial fuzzing will find Windows bugs

Fuzzing will find Linux bugs, but the task is not so trivial

We've developed some interesting techniques for fuzzing on Linux, and have had some success finding minor bugs

---[ Protecting the kernel and its attack surface ]


---[ TPE (Trusted Path Execution) ]

A reasonably old concept to prevent local privilege escalation

Aims to prevent gaining arbitrary code execution in the first place

A naive way of doing it on Linux was to mount user-writable paths "noexec"

Easy bypass by going through the dynamic loader

Grsecurity had a good gid/uid based one for years

Now it can actually work ("noexec" prevents file mappings as PROT_EXEC)

This approach is gaining popularity on the Windows platform (white listing)

---[ Drawbacks of TPE ]

"Arbitrary code execution" should not only mean "arbitrary opcodes"

You can exploit lots of bugs from a Python or Ruby interpreter

GDB

The threat model is changed for many binaries

A local vulnerability in 'nethack' now becomes useful

or those zsh / make vulnerabilities. Of course, useless if the attacker already has arbitrary code execution

Browser sandbox

OpenSSH/vsftpd 'privilege-separated' sandbox

---[ Sandboxing and attack surface reduction ]

Ideally, a process could opt out of some kernel features it does not require

Linux does not have any real "discretionary privilege dropping facility"

Most of the focus is on Mandatory Access Control

Programmer defined vs. administratively defined policies debate

Windows has more privilege-dropping-like features (control over tokens)

But still nothing to really protect the kernel's attack surface

---[ Limited options ]

On Linux, things such as chroot() to an empty directory remove a small chunk of attack surface

cf. Chrome's Linux suid sandbox design

ptrace() based sandbox

Good choice but slow (and not trivial to get right)

SECCOMP-based sandbox, the future of Chrome on Linux?

If we can't protect the kernel let's reduce its privileges

Virtualization is an interesting alternative for segregation

---[ UDEREF ]

Unexpected to-userland pointer dereferences are an issue

We've mentioned Linux/i386 used to have separate logical address spaces for kernel and userland

The kernel's segment descriptor bases were above PAGE_OFFSET

PaX's UDEREF makes data segments expand-down and limits them above PAGE_OFFSET

KERNEXEC takes care of the code segment

What to do on AMD64? No segmentation. Full address space switching (Xen does it)?

---[ mmap_min_addr ]

mmap_min_addr is a pragmatic attempt to tackle this problem portably

[Translator's note *] Many programs actually depend on mmap_min_addr being zero, for example:

      Qemu, as shipped in Debian 5.0, requires low virtual memory mmaps.
      mmap_min_addr must be set to 0 to run qemu as a non-root user.
      This limitation has been removed upstream,
      so qemu should work with an increased mmap_min_addr starting with Debian squeeze.

Focusing on NULL pointer dereferences

A system-wide minimum address that a process can map

This has been plagued with many bugs in the past

In much better shape now

We've found one bypass using personalities and suid binaries

Another one we need to investigate

---[ Bypassing mmap_min_addr with personalities ]

CVE-2009-1895

SVr4 maps page 0 as read-only, some programs depend on this behaviour

[Translator's note *] See the kernel code and comment below

    if (current->personality & MMAP_PAGE_ZERO) {
            /* Why this, you ask???  Well SVr4 maps page 0 as read-only,
               and some applications "depend" upon this behavior.
               Since we do not have the power to recompile these, we
               emulate the SVr4 behavior. Sigh. */
            down_write(&current->mm->mmap_sem);
            error = do_mmap(NULL, 0, PAGE_SIZE, PROT_READ | PROT_EXEC,
                            MAP_FIXED | MAP_PRIVATE, 0);
            up_write(&current->mm->mmap_sem);
    }

To make porting programs easier, Linux supports an SVr4 personality

The personality is per process and is kept on execve(). We could get this personality and execute a setuid binary

The process gets CAP_SYS_RAWIO since it executes as root now

Thanks to this capability, the mmap_min_addr check succeeds and a page is mapped at zero in the address space

[Translator's note *] The authors analyzed the kernel check below, switched the personality to SVr4, and satisfied each condition needed for the bypass

      if ((addr < mmap_min_addr) && !capable(CAP_SYS_RAWIO))
          return -EACCES;
      return 0;
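
To make the logic of the quoted check concrete, here is a small user-space model of it. The function name is hypothetical, and has_rawio simply stands in for the result of capable(CAP_SYS_RAWIO). It is a sketch of the decision only, not kernel code.

```c
#include <assert.h>
#include <errno.h>

/* User-space model of the low-address check quoted above.
 * has_rawio stands in for capable(CAP_SYS_RAWIO). */
int low_mapping_check(unsigned long addr, unsigned long mmap_min_addr,
                      int has_rawio)
{
    if (addr < mmap_min_addr && !has_rawio)
        return -EACCES;
    return 0;
}
```

An unprivileged request for address 0 is refused, while the same request from a process that holds the capability (such as a setuid-root binary at execve() time) is allowed, which is the condition the bypass relies on.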

---[ Bypassing mmap_min_addr with personalities ]

We now have a process we don't control with a page mapped at zero

Can we regain control of the process?

We were looking for a binary that would drop privileges, and let us regain control without going through execve

We found one: pulseaudio

[Translator's note *] Via the command-line option: pulseaudio -L

---[ Other kernel protections ]

From PaX


Permission tightening

Data in kernel non executable

Make some sensitive structures read-only

Misc: reference counter overflows, slab object size checks

---[ Conclusion ]

There are lots of bugs to find in kernels

And the attack surface is growing in general

And easier to reach from remote

Their exploitation difficulty goes from very easy to very challenging

It's hard to get rid of the kernel's attack surface

Remains even in systems designed with security in mind

May evolve soon

Userland exploitation prevention is maturing

Kernel exploitation prevention is immature

---[ Thanks / Questions ]


---[ Windows virtual path parsing ]

MS10-021 fixed an interesting bug parsing virtual paths

A core routine handling virtualized keys made some invalid assumptions about virtualized registry keys

A typical path would be something like L"\\Registry\\user\\S-x-y-z"

A registry key can be nested arbitrarily deep

But we found a routine that assumed every path would contain at least five path separators!

This is simply not the case...

---[ Windows virtual path parsing ]

    while (MaxDirectories) {
        if (*CurrentChar == '\\') {
            if (--MaxDirectories == 0)
        } else {
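
The fragment above is incomplete in the source, so the missing parts are not reconstructed here. As a purely hypothetical sketch of this class of bug, the first function below keeps walking until it has counted a fixed number of separators and never checks for the end of the string, while the second one stops at the terminator. Neither function is Microsoft's code and both names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Unsafe shape: assumes the path contains at least n separators and
 * walks past the terminator if it does not. Do not call on short paths. */
const wchar_t *skip_components_unchecked(const wchar_t *p, unsigned n)
{
    while (n) {
        if (*p == L'\\')
            --n;
        ++p;
    }
    return p;
}

/* Bounded variant: returns NULL when the path has fewer than n separators. */
const wchar_t *skip_components_checked(const wchar_t *p, unsigned n)
{
    while (n) {
        if (*p == L'\0')
            return NULL;
        if (*p == L'\\')
            --n;
        ++p;
    }
    return p;
}
```

On a path with five or more separators both variants return the same position. On a shorter path only the checked variant terminates safely.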

---[ Windows virtual path parsing ]

This assumption can be broken by simply setting the VirtualTarget flag on a key that does not have five path components

    // Set Virtual Target
    Virt.VirtualTarget = 1;

    // http://msdn.microsoft.com/en-us/library/cc512139%28VS.85%29.aspx
    ReturnCode = NtSetInformationKey(KeyHandle,

---[ Windows virtual path parsing ]

It's not immediately clear why anyone would make this error

Not even an inexperienced Windows developer would believe an arbitrary registry key would conform to these rules

Matthieu Suiche pointed out that VirtualStore keys do conform to these rules, and so it's likely Microsoft simply didn't test with any other keys

---[ MiCreatePagingFileMap() bug ]

MiCreatePagingFileMap() contained an interesting optimisation in PAE kernels

This routine accepts a PLARGE_INTEGER parameter, and is the kernel code responsible for things like CreateFileMapping()

We noticed that part of the routine realised the parameter was 64 bits, and part assumed it was 32 bits

We could bypass the sanity checks by hiding bits in the upper dword

This results in an obvious heap overflow. A minimal testcase would be something like this:

CreateFileMappingA(NULL, NULL, PAGE_WRITECOPY, 0x6C, 0, NULL);
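
In the testcase, 0x6C is passed as the high dword of the maximum size and 0 as the low dword, so the 64-bit size is 0x6C00000000 while its low 32 bits are zero. The sketch below illustrates the general mismatch between a check that looks at 32 bits and code that uses 64. The limit and both function names are arbitrary assumptions for illustration and do not come from the Windows kernel.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SECTION_SIZE 0x10000000u  /* arbitrary limit for this sketch */

/* Buggy shape: the check only sees the low dword, although later code
 * would use the full 64-bit value. */
int size_ok_truncating(uint64_t size)
{
    uint32_t low = (uint32_t)size;    /* upper 32 bits silently dropped */
    return low <= MAX_SECTION_SIZE;
}

/* Consistent shape: validate the same 64-bit value that will be used. */
int size_ok_full(uint64_t size)
{
    return size <= MAX_SECTION_SIZE;
}
```

A size of 0x6C00000000 passes the truncating check because its low dword is zero, but fails the full-width check.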

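posted on 2012-05-31 23:28 by WANGyu

# re: There's a party at Ring0 and you're invited @ 2012-07-01 10:46

Deep material. I tried the last testcase on 64-bit Windows 7 in WOW mode and could not trigger the issue...


# re: There's a party at Ring0 and you're invited @ 2012-07-16 10:24

Reply to Mr. Zhang: this is from 2010, so Microsoft has probably patched it by now. :)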

