android_kernel_samsung_msm8976/mm
David Herrmann d26af5a68d shm: add sealing API
If two processes share a common memory region, they usually want some
guarantees to allow safe access. This often includes:
  - one side cannot overwrite data while the other reads it
  - one side cannot shrink the buffer while the other accesses it
  - one side cannot grow the buffer beyond previously set boundaries

If there is a trust relationship between both parties, there is no need
for policy enforcement.  However, if there is no trust relationship (e.g.,
for general-purpose IPC), sharing memory regions is highly fragile and
often not possible without local copies.  Consider the following two
use cases:

  1) A graphics client wants to share its rendering-buffer with a
     graphics-server. The memory-region is allocated by the client for
     read/write access and a second FD is passed to the server. While
     scanning out from the memory region, the server has no guarantee that
     the client doesn't shrink the buffer at any time, requiring rather
     cumbersome SIGBUS handling.
  2) A process wants to perform an RPC on another process. To avoid huge
     bandwidth consumption, zero-copy is preferred. After a message is
     assembled in memory and an FD is passed to the remote side, both sides
     want to be sure that neither modifies this shared copy anymore. The
     source may have put sensitive data into the message without a separate
     copy, and the target may want to parse the message inline to avoid a
     local copy.

While SIGBUS handling, POSIX mandatory locking and MAP_DENYWRITE provide
ways to achieve most of this, SIGBUS handling is disproportionately ugly
to use in libraries, and the latter two are broken/racy or even disabled
due to denial-of-service attacks.

This patch introduces the concept of SEALING.  If you seal a file, a
specific set of operations is blocked on that file forever.  Unlike locks,
seals can only be set, never removed.  Hence, once you have verified that
a specific set of seals is set, you are guaranteed that no one can perform
the blocked operations on this file anymore.

An initial set of SEALS is introduced by this patch:
  - SHRINK: If SEAL_SHRINK is set, the file in question cannot be reduced
            in size. This affects ftruncate() and open(O_TRUNC).
  - GROW: If SEAL_GROW is set, the file in question cannot be increased
          in size. This affects ftruncate(), fallocate() and write().
  - WRITE: If SEAL_WRITE is set, no write operations (besides resizing)
           are possible. This affects fallocate(PUNCH_HOLE), mmap() and
           write().
  - SEAL: If SEAL_SEAL is set, no further seals can be added to a file.
          This prevents the F_ADD_SEALS operation on a file and can be
          set to prevent others from adding further seals that you
          don't want.
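
As a rough illustration of the seal semantics listed above, the following
userspace sketch seals a file step by step and reports what the kernel then
refuses. It assumes a kernel that also carries the follow-up memfd_create()
syscall (not part of this patch) to obtain a sealable file. The helper names
make_sealable(), resize_errno() and write_errno() are hypothetical, and the
fallback constants are only there for libc headers that lack them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for libc headers that do not expose the sealing constants;
 * the values match the Linux UAPI headers. */
#ifndef F_ADD_SEALS
#define F_ADD_SEALS   1033
#define F_GET_SEALS   1034
#define F_SEAL_SEAL   0x0001
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW   0x0004
#define F_SEAL_WRITE  0x0008
#endif
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/* Hypothetical helper: create a sealable shmem file of `size` bytes.
 * Assumes the follow-up memfd_create() syscall is available.
 * Returns the fd, or -1 on failure. */
static int make_sealable(off_t size)
{
    int fd = (int)syscall(SYS_memfd_create, "seal-demo", MFD_ALLOW_SEALING);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: try to resize; returns 0 on success, else errno. */
static int resize_errno(int fd, off_t size)
{
    return ftruncate(fd, size) == 0 ? 0 : errno;
}

/* Hypothetical helper: try to overwrite one byte at offset 0;
 * returns 0 on success, else errno. */
static int write_errno(int fd)
{
    return pwrite(fd, "x", 1, 0) == 1 ? 0 : errno;
}
```

With F_SEAL_SHRINK set, resize_errno() with a smaller size should report
EPERM while a larger size still succeeds. Adding F_SEAL_GROW blocks the
larger size as well, and adding F_SEAL_WRITE makes write_errno() report
EPERM.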

The described use-cases can easily use these seals to provide safe use
without any trust-relationship:

  1) The graphics server can verify that a passed file-descriptor has
     SEAL_SHRINK set. This allows safe scanout, while the client is
     allowed to increase buffer size for window-resizing on-the-fly.
     Concurrent writes are explicitly allowed.
  2) For general-purpose IPC, both processes can verify that SEAL_SHRINK,
     SEAL_GROW and SEAL_WRITE are set. This guarantees that neither
     process can modify the data while the other side parses it.
     Furthermore, it guarantees that even with writable FDs passed to the
     peer, it cannot increase the size to hit memory-limits of the source
     process (in case the file-storage is accounted to the source).
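
The receiving side of either use-case could perform its check along the
following lines. This is a minimal sketch, not code from this patch:
fd_has_seals() and IPC_SEALS are hypothetical names, and the fallback
constants only cover libc headers that do not define them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for libc headers that do not expose the sealing constants;
 * the values match the Linux UAPI headers. */
#ifndef F_ADD_SEALS
#define F_ADD_SEALS   1033
#define F_GET_SEALS   1034
#define F_SEAL_SEAL   0x0001
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW   0x0004
#define F_SEAL_WRITE  0x0008
#endif
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/* Hypothetical name for the seal set required in use-case 2. */
#define IPC_SEALS (F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE)

/* Hypothetical helper: true only if every seal in `required` is set on
 * the file behind `fd`. A file that does not support sealing makes
 * F_GET_SEALS fail, which is treated as "not sealed". */
static bool fd_has_seals(int fd, int required)
{
    int seals = fcntl(fd, F_GET_SEALS);
    if (seals < 0)
        return false;
    return (seals & required) == required;
}
```

A graphics server would call fd_has_seals(fd, F_SEAL_SHRINK) before
scanout. An IPC peer would call fd_has_seals(fd, IPC_SEALS) before parsing
the message in place, and reject the FD otherwise.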

The new API is an extension to fcntl(), adding two new commands:
  F_GET_SEALS: Return a bitset describing the seals on the file. This
               can be called on any FD if the underlying file supports
               sealing.
  F_ADD_SEALS: Change the seals of a given file. This requires WRITE
               access to the file, and F_SEAL_SEAL must not already be set.
               Furthermore, the underlying file must support sealing and
               there must not be any existing shared mapping of that file.
               Otherwise, EBADF/EPERM is returned.
               The given seals are _added_ to the existing set of seals
               on the file. You cannot remove seals again.
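
The additive behaviour of F_ADD_SEALS and the effect of F_SEAL_SEAL can be
sketched from userspace as follows. The wrappers add_seals() and
get_seals() are hypothetical, and the sealable file is assumed to come from
the follow-up memfd_create() syscall, which is not part of this patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for libc headers that do not expose the sealing constants;
 * the values match the Linux UAPI headers. */
#ifndef F_ADD_SEALS
#define F_ADD_SEALS   1033
#define F_GET_SEALS   1034
#define F_SEAL_SEAL   0x0001
#define F_SEAL_SHRINK 0x0002
#define F_SEAL_GROW   0x0004
#define F_SEAL_WRITE  0x0008
#endif
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif

/* Hypothetical wrapper: add `seals` to the file's existing seal set.
 * Returns 0 on success, else the errno reported by fcntl(). */
static int add_seals(int fd, int seals)
{
    return fcntl(fd, F_ADD_SEALS, seals) == 0 ? 0 : errno;
}

/* Hypothetical wrapper: current seal bitset, or -1 if the file does
 * not support sealing. */
static int get_seals(int fd)
{
    return fcntl(fd, F_GET_SEALS);
}
```

Calling add_seals() first with F_SEAL_SHRINK and then with F_SEAL_GROW
leaves both bits visible in get_seals(). Once F_SEAL_SEAL has been added, a
later add_seals(fd, F_SEAL_WRITE) is expected to fail with EPERM and the
seal set stays unchanged.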

The fcntl() handler is currently specific to shmem and disabled by default
on all files. A file needs to explicitly support sealing for this interface
to work. A separate syscall, added in a follow-up, creates files that
support sealing. There is no intention to support this on other
file-systems. Semantics are unclear for non-volatile files and we lack any
use-case right now. Therefore, the implementation is specific to shmem.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ryan Lortie <desrt@desrt.ca>
Cc: Lennart Poettering <lennart@poettering.net>
Cc: Daniel Mack <zonque@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Angelo G. Del Regno <kholk11@gmail.com>
2020-10-08 05:52:37 -07:00
..
kasan kasan, module, vmalloc: rework shadow allocation for modules 2015-05-04 14:03:58 -07:00
Kconfig Import T813XXS2BRC2 kernel source changes 2018-05-26 00:39:42 +02:00
Kconfig.debug defconfig: 8994: enable CONFIG_DEBUG_SLUB_PANIC_ON 2014-10-21 14:00:18 -07:00
Makefile mm: per-thread vma caching 2019-07-27 22:08:06 +02:00
backing-dev.c arch: Mass conversion of smp_mb__*() 2014-08-15 11:45:28 -07:00
balloon_compaction.c
bootmem.c
bounce.c mm: convert some level-less printks to pr_* 2019-07-27 22:08:13 +02:00
cleancache.c
compaction.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
debug-pagealloc.c mm/debug-pagealloc.c: print page physical address for 2015-08-23 23:19:22 -07:00
dmapool.c
early_ioremap.c mm: create generic early_ioremap() support 2014-08-15 11:45:23 -07:00
fadvise.c mm/fadvise.c: fix signed overflow UBSAN complaint 2019-07-27 21:51:37 +02:00
failslab.c
filemap.c mm, fs: check for fatal signals in do_generic_file_read() 2019-07-27 21:43:51 +02:00
filemap_xip.c
fremap.c
frontswap.c mm: frontswap: invalidate expired data on a dup-store failure 2014-12-16 09:09:41 -08:00
highmem.c
huge_memory.c mm, thp: fix collapsing of hugepages on madvise 2019-07-27 22:08:13 +02:00
hugetlb.c mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault 2015-10-22 14:37:50 -07:00
hugetlb_cgroup.c
hwpoison-inject.c
init-mm.c
internal.h mm: Enhance per process reclaim to consider shared pages 2015-04-16 10:14:27 -07:00
interval_tree.c
kmemcheck.c
kmemleak-test.c
kmemleak.c mm: kmemleak: allow safe memory scanning during kmemleak disabling 2015-06-22 10:47:32 +05:30
ksm.c mm,ksm: fix endless looping in allocating memory when ksm enable 2019-07-27 21:42:51 +02:00
maccess.c
madvise.c mm/madvise.c: fix madvise() infinite loop under special circumstances 2019-07-27 21:45:21 +02:00
memblock.c mm/memblock: add memblock_get_current_limit 2014-04-08 09:51:10 -07:00
memcontrol.c UPSTREAM: memcg: Only free spare array when readers are done 2016-05-18 14:36:06 +05:30
memory-failure.c mm: hwpoison: use do_send_sig_info() instead of force_sig() 2019-07-27 22:10:18 +02:00
memory.c mm: introduce vma_is_anonymous(vma) helper 2019-07-27 22:11:11 +02:00
memory_hotplug.c mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone() 2019-07-27 21:43:51 +02:00
mempolicy.c mm: convert some level-less printks to pr_* 2019-07-27 22:08:13 +02:00
mempool.c
memtest.c memtest: use phys_addr_t for physical addresses 2015-04-01 09:27:43 -07:00
migrate.c Sanitize 'move_pages()' permission checks 2019-07-27 21:44:50 +02:00
mincore.c mm/mincore.c: make mincore() more conservative 2019-07-27 22:11:11 +02:00
mlock.c mm: do not bug_on on incorrect length in __mm_populate() 2019-07-27 22:08:08 +02:00
mm_init.c
mmap.c coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping 2020-04-03 21:59:11 +02:00
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c mm/mprotect: add a cond_resched() inside change_pmd_range() 2019-07-27 21:46:25 +02:00
mremap.c mremap: properly flush TLB before releasing the page 2019-07-27 21:53:28 +02:00
msync.c
nobootmem.c mm/nobootmem.c: Drop __init annotation from free_bootmem_late 2014-04-21 15:28:38 -07:00
nommu.c mm: convert some level-less printks to pr_* 2019-07-27 22:08:13 +02:00
oom_kill.c mm, oom: fix use-after-free in oom_kill_process 2019-07-27 22:05:56 +02:00
page-writeback.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
page_alloc.c ANDROID: Remove conflicting Samsung options for upstream changes 2019-07-27 22:09:50 +02:00
page_cgroup.c cgroup/kmemleak: add kmemleak_free() for cgroup deallocations. 2014-11-14 08:47:59 -08:00
page_io.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
page_isolation.c mm: add zone counter for cma pages 2019-07-27 21:51:09 +02:00
pageowner.c debugging: keep track of page owners 2014-03-28 13:33:08 -07:00
pagewalk.c pagewalk: improve vma handling 2019-07-27 21:51:51 +02:00
percpu-km.c
percpu-vm.c percpu: perform tlb flush after pcpu_map_pages() failure 2014-10-05 14:54:13 -07:00
percpu.c Revert "percpu: free percpu allocation info for uniprocessor system" 2014-11-14 08:47:53 -08:00
pgtable-generic.c
process_reclaim.c Revert "lowmemorykiller: Introduce sysfs node for ALMK and PPR adj threshold" 2019-07-27 22:09:43 +02:00
process_vm_access.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-02-25 11:57:47 -08:00
quicklist.c
readahead.c readahead: make context readahead more conservative 2019-07-27 21:49:54 +02:00
rmap.c mm: fix anon_vma->degree underflow in anon_vma endless growing prevention 2019-07-27 22:08:15 +02:00
shmem.c shm: add sealing API 2020-10-08 05:52:37 -07:00
showmem.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
slab.c cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags 2019-07-27 21:44:59 +02:00
slab.h
slab_common.c mm: slub: add kernel address sanitizer support for slub allocator 2015-05-04 14:03:56 -07:00
slob.c
slub.c mm: slub: add kernel address sanitizer support for slub allocator 2015-05-04 14:03:56 -07:00
sparse-vmemmap.c
sparse.c
swap.c mm: close PageTail race 2014-04-03 12:01:05 -07:00
swap_state.c Revert "lowmemorykiller: Don't count swap cache pages twice" 2019-07-27 22:09:45 +02:00
swapfile.c swapfile: fix memory corruption via malformed swapfile 2019-07-27 21:42:14 +02:00
truncate.c mm: Remove false WARN_ON from pagecache_isize_extended() 2014-11-14 08:48:00 -08:00
util.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2019-07-27 22:08:09 +02:00
vmalloc.c mm/vmalloc.c: fix kernel BUG at mm/vmalloc.c:512! 2019-07-27 22:10:02 +02:00
vmpressure.c mm: vmpressure: fix sending wrong events on underflow 2019-07-27 21:43:56 +02:00
vmscan.c mm: convert some level-less printks to pr_* 2019-07-27 22:08:13 +02:00
vmstat.c Revert "lowmemorykiller: Don't count swap cache pages twice" 2019-07-27 22:09:45 +02:00
zbud.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
zpool.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
zsmalloc.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
zswap.c Import latest Samsung release 2017-04-18 03:43:52 +02:00