android_kernel_samsung_msm8976/kernel
Eric W. Biederman cf47f6e1ff fork: Allow CLONE_PARENT after setns(CLONE_NEWPID)
Serge Hallyn <serge.hallyn@ubuntu.com> writes:
> Hi Oleg,
>
> commit 40a0d32d1eaffe6aac7324ca92604b6b3977eb0e :
> "fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks"
> breaks lxc-attach in 3.12.  That code forks a child which does
> setns() and then does a clone(CLONE_PARENT).  That way the
> grandchild can be in the right namespaces (which the child was
> not) and be a child of the original task, which is the monitor.
>
> lxc-attach in 3.11 was working fine with no side effects that I
> could see.  Is there a real danger in allowing CLONE_PARENT
> when current->nsproxy->pidns_for_children is not our pidns,
> or was this done out of an "over-abundance of caution"?  Can we
> safely revert that new extra check?

The two fundamental things I know we can not allow are:
- A shared signal queue aka CLONE_THREAD.  Because we compute the pid
  and uid of the signal when we place it in the queue.

- Changing the pid and by extention pid_namespace of an existing
  process.

From a parents perspective there is nothing special about the pid
namespace, to deny CLONE_PARENT, because the parent simply won't know or
care.

From the childs perspective all that is special really are shared signal
queues.

User mode threading with CLONE_PARENT|CLONE_VM|CLONE_SIGHAND and tasks
in different pid namespaces is almost certainly going to break because
it is complicated.  But shared signal handlers can look at per thread
information to know which pid namespace a process is in, so I don't know
of any reason not to support CLONE_PARENT|CLONE_VM|CLONE_SIGHAND threads
at the kernel level.  It would be absolutely stupid to implement but
that is a different thing.

So hmm.

Because it can do no harm, and because it is a regression let's remove
the CLONE_PARENT check and send it stable.

Cc: stable@vger.kernel.org
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2019-07-27 22:10:35 +02:00
..
cpu sched/idle: Add missing checks to the exit condition of cpu_idle_poll() 2019-07-27 21:50:45 +02:00
debug mm: per-thread vma caching 2019-07-27 22:08:06 +02:00
events perf/core: Fix impossible ring-buffer sizes warning 2019-07-27 22:10:22 +02:00
gcov
irq
locking locking/lockdep: Use for_each_process_thread() for debug_show_all_locks() 2019-07-27 22:09:22 +02:00
power kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
rcu
sched kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
time kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
trace kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
.gitignore
acct.c
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2019-07-27 21:49:48 +02:00
audit.c BACKPORT: audit: consistently record PIDs with task_tgid_nr() 2019-07-27 21:50:56 +02:00
audit.h
audit_tree.c
audit_watch.c audit: fix use-after-free in audit_add_watch 2019-07-27 21:51:40 +02:00
auditfilter.c
auditsc.c BACKPORT: audit: consistently record PIDs with task_tgid_nr() 2019-07-27 21:50:56 +02:00
backtracetest.c
bounds.c
capability.c
cgroup.c
cgroup_freezer.c
compat.c
configs.c
context_tracking.c
cpu.c
cpu_pm.c
cpuset.c sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs 2019-07-27 22:08:19 +02:00
crash_dump.c
cred.c
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c wait/ptrace: assume __WALL if the child is traced 2019-07-27 22:09:56 +02:00
extable.c kernel/extable.c: mark core_kernel_text notrace 2019-07-27 21:44:25 +02:00
fork.c fork: Allow CLONE_PARENT after setns(CLONE_NEWPID) 2019-07-27 22:10:35 +02:00
freezer.c freezer: set PF_SUSPEND_TASK flag on tasks that call freeze_processes 2019-07-27 22:09:18 +02:00
futex.c futex: Ensure that futex address is aligned in handle_futex_death() 2019-07-27 22:08:53 +02:00
futex_compat.c
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
hrtimer.c hrtimer: Ensure POSIX compliance (relative CLOCK_REALTIME hrtimers) 2019-07-27 21:49:51 +02:00
hung_task.c kernel/hung_task.c: change hung_task.c to use for_each_process_thread() 2019-07-27 22:09:21 +02:00
irq_work.c
itimer.c
jump_label.c
kallsyms.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
kmod.c
kprobes.c
ksysfs.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
kthread.c
latencytop.c
Makefile
modsign_pubkey.c
module-internal.h
module.c
module_signing.c
notifier.c
nsproxy.c Rename nsproxy.pid_ns to nsproxy.pid_ns_for_children 2019-07-27 22:10:33 +02:00
padata.c padata: avoid race in reordering 2019-07-27 21:44:05 +02:00
panic.c
params.c kernel/params.c: align add_sysfs_param documentation with code 2019-07-27 21:45:35 +02:00
pid.c kernel/fork.c:copy_process(): don't add the uninitialized child to thread/task/pid lists 2019-07-27 22:10:32 +02:00
pid_namespace.c Rename nsproxy.pid_ns to nsproxy.pid_ns_for_children 2019-07-27 22:10:33 +02:00
posix-cpu-timers.c posix-timers: Sanitize overrun handling 2019-07-27 21:53:21 +02:00
posix-timers.c posix-timers: Sanitize overrun handling 2019-07-27 21:53:21 +02:00
printk.c printk: use rcuidle console tracepoint 2019-07-27 21:44:09 +02:00
profile.c
ptrace.c ptrace: revert "Prepare to fix racy accesses on task breakpoints" 2019-07-27 22:09:55 +02:00
range.c
relay.c treewide: Fix typo in Documentation/DocBook 2019-07-27 22:10:20 +02:00
res_counter.c
resource.c /proc/iomem: only expose physical resource addresses to privileged users 2019-07-27 22:05:58 +02:00
seccomp.c UPSTREAM: seccomp: always propagate NO_NEW_PRIVS on tsync 2019-07-27 21:51:01 +02:00
signal.c kernel: use macros from compiler.h instead of __attribute__((...)) 2019-07-27 22:10:27 +02:00
smp.c
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c
sys.c exit.c: unexport __set_special_pids() 2019-07-27 22:09:28 +02:00
sys_ni.c
sysctl.c kernel/sysctl.c: fix out-of-bounds access when setting file-max 2019-07-27 22:10:11 +02:00
sysctl_binary.c
system_certificates.S
system_keyring.c
task_work.c
taskstats.c
test_kprobes.c
time.c time: Make sure jiffies_to_msecs() preserves non-zero time periods 2019-07-27 21:52:48 +02:00
timeconst.bc
timer.c
tracepoint.c tracing: syscall_regfunc() should not skip kernel threads 2019-07-27 22:09:17 +02:00
tsacct.c
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2019-07-27 21:46:18 +02:00
up.c
user-return-notifier.c
user.c
user_namespace.c userns: move user access out of the mutex 2019-07-27 21:51:26 +02:00
utsname.c
utsname_sysctl.c
watchdog.c
workqueue.c workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq 2019-07-27 21:45:23 +02:00
workqueue_internal.h