android_kernel_google_msm/arch
H. Peter Anvin 3989298cbd x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack
commit 3891a04aaf upstream.

The IRET instruction, when returning to a 16-bit segment, only
restores the bottom 16 bits of the user space stack pointer.  This
causes some 16-bit software to break, but it also leaks kernel state
to user space.  We have a software workaround for that ("espfix") for
the 32-bit kernel, but it relies on a nonzero stack segment base which
is not available in 64-bit mode.
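
As an illustration of the leak described above, here is a minimal
user-space model in C.  It is a sketch, not kernel code: the function
names and the sample register values are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: the value user space sees in %esp after an IRET
 * to a 16-bit stack segment.  Only the low 16 bits are taken from the
 * IRET frame; bits 31:16 keep whatever the kernel stack pointer held.
 */
static uint32_t iret16_visible_esp(uint32_t kernel_esp, uint32_t user_esp)
{
    return (kernel_esp & 0xffff0000u) | (user_esp & 0x0000ffffu);
}

/* Worked example with made-up values. */
static void leak_example(void)
{
    uint32_t seen = iret16_visible_esp(0x8123fe40u, 0x00051ff0u);

    assert((seen & 0xffffu) == 0x1ff0u); /* low half: user's value   */
    assert((seen >> 16) == 0x8123u);     /* high half: kernel's bits */
}
```

In this model the upper half of the visible value comes from the
kernel stack pointer rather than from the user's saved %esp, which is
the leak in question.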

In checkin:

    b3b42ac2cb x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels

we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
the logic that 16-bit support is crippled on 64-bit kernels anyway (no
V86 support), but it turns out that people are running old Win16
binaries under Wine and expecting them to work.

This patch works around the problem by creating percpu "ministacks", each of which
is mapped 2^16 times 64K apart.  When we detect that the return SS is
on the LDT, we copy the IRET frame to the ministack and use the
relevant alias to return to userspace.  The ministacks are mapped
readonly, so if IRET faults we promote #GP to #DF which is an IST
vector and thus has its own stack; we then do the fixup in the #DF
handler.
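
The alias arithmetic implied by "2^16 times 64K apart" can be sketched
in C as follows.  This is an illustration under stated assumptions,
not the code in this patch: the function names are hypothetical, and
the base address passed in is assumed to have bits 31:16 clear.

```c
#include <assert.h>
#include <stdint.h>

#define ALIAS_STRIDE 0x10000ull /* 64K between aliases, per the text */

/*
 * Hypothetical helper: pick the alias of a ministack whose address
 * bits 31:16 equal bits 31:16 of the user's saved %esp.
 */
static uint64_t ministack_alias(uint64_t ministack_base, uint32_t user_esp)
{
    /* Assumption: the base leaves bits 31:16 free for the alias index. */
    assert((ministack_base & 0xffff0000ull) == 0);

    uint64_t index = user_esp >> 16; /* one of the 2^16 aliases */
    return ministack_base + index * ALIAS_STRIDE;
}

/* Model of a 16-bit IRET: only the low 16 bits of %esp are restored. */
static uint32_t esp_after_iret16(uint64_t kernel_rsp, uint32_t user_esp)
{
    return ((uint32_t)kernel_rsp & 0xffff0000u) | (user_esp & 0xffffu);
}
```

With a made-up base of 0xffffff0000000040 and a user %esp of
0x00051ff0, the chosen alias is 0xffffff0000050040, and the modeled
IRET leaves 0x00051ff0 in %esp: the upper bits user space sees are
its own, not the kernel's.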

(Making #GP an IST exception would make the msr_safe functions unsafe
in NMI/MC context, and quite possibly have other effects.)

Special thanks to:

- Andy Lutomirski, for the suggestion of using very small stack slots
  and copying (as opposed to mapping) the IRET frame there, and for the
  suggestion to mark them readonly and let the fault promote to #DF.
- Konrad Wilk for paravirt fixup and testing.
- Borislav Petkov for testing help and useful comments.

Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Lutomirski <amluto@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dirk Hohndel <dirk@hohndel.org>
Cc: Arjan van de Ven <arjan.van.de.ven@intel.com>
Cc: comex <comexk@gmail.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-08-07 12:00:10 -07:00
alpha alpha: makefile: don't enforce small data model for kernel builds 2013-08-20 08:26:28 -07:00
arm ARM: 8115/1: LPAE: reduce damage caused by idmap to virtual memory layout 2014-08-07 12:00:10 -07:00
avr32 avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 use 2014-03-11 16:09:57 -07:00
blackfin
c6x
cris cris: media platform drivers: fix build 2013-11-29 10:50:37 -08:00
frv frv: Use core allocator for task_struct 2013-08-20 08:26:28 -07:00
h8300 signal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorer 2013-04-05 10:04:14 -07:00
hexagon
ia64 exec/ptrace: fix get_dumpable() incorrect tests 2013-11-29 10:50:34 -08:00
m32r m32r: make memset() global for CONFIG_KERNEL_BZIP2=y 2013-09-14 06:02:11 -07:00
m68k m68k/atari: ARAnyM - Fix NatFeat module support 2013-08-20 08:26:29 -07:00
microblaze microblaze: Update microblaze defconfigs 2013-08-20 08:26:27 -07:00
mips MIPS: MSC: Prevent out-of-bounds writes to MIPS SC ioremap'd region 2014-07-06 18:49:19 -07:00
mn10300 signal: Define __ARCH_HAS_SA_RESTORER so we know whether to clear sa_restorer 2013-04-05 10:04:14 -07:00
openrisc
parisc parisc: fix epoll_pwait syscall on compat kernel 2014-06-07 16:01:57 -07:00
powerpc powerpc/perf: Never program book3s PMCs with values >= 0x80000000 2014-07-17 15:39:50 -07:00
s390 s390/ptrace: fix PSW mask check 2014-07-31 12:54:53 -07:00
score
sh sh: fix format string bug in stack tracer 2014-05-06 07:51:45 -07:00
sparc sparc64: don't treat 64-bit syscall return codes as 32-bit 2014-04-26 17:13:19 -07:00
tile tile: use a more conservative __my_cpu_offset in CONFIG_PREEMPT 2013-10-13 15:42:50 -07:00
um um: add missing declaration of 'getrlimit()' and friends 2013-12-11 22:34:11 -08:00
unicore32 mm, show_mem: suppress page counts in non-blockable contexts 2013-10-13 15:42:49 -07:00
x86 x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack 2014-08-07 12:00:10 -07:00
xtensa xtensa: don't use alternate signal stack on threads 2013-11-13 12:01:49 +09:00
.gitignore
Kconfig