commit 7c80eb1c7e2b8420477fbc998971d62a648035d9 upstream.
In both functions, if pfkey_xfrm_policy2msg failed we leaked the newly
allocated sk_buff. Free it on error.
Fixes: 55569ce256 ("Fix conversion between IPSEC_MODE_xxx and XFRM_MODE_xxx.")
Change-Id: I9ddc961fd70c2c2f30d56c10b5a322d1e780778d
Reported-by: syzbot+4f0529365f7f2208d9f0@syzkaller.appspotmail.com
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 903869bd10e6719b9df6718e785be7ec725df59f upstream.
ip_sf_list_clear_all() needs to be defined even if !CONFIG_IP_MULTICAST
Fixes: 3580d04aa674 ("ipv4/igmp: fix another memory leak in igmpv3_del_delrec()")
Change-Id: I486d8608b0e56a1ebff5b7e41420d7639c625b19
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit b4846fc3c8559649277e3e4e6b5cec5348a8d208 upstream.
Andrey reported a lockdep warning on non-initialized
spinlock:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x292/0x395 lib/dump_stack.c:52
register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
? 0xffffffffa0000000
__lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
__raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
_raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
spin_lock_bh ./include/linux/spinlock.h:304
ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
We miss a spin_lock_init() in igmpv3_add_delrec(), probably
because previously we never use it on this code path. Since
we already unlink it from the global mc_tomb list, it is
probably safe not to acquire this spinlock here. It does not
harm to have it although, to avoid conditional locking.
Fixes: c38b7d327aaf ("igmp: acquire pmc lock for ip_mc_clear_src()")
Change-Id: I1b59ee14a75d081dd01445afda7001163c4ead38
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit c38b7d327aafd1e3ad7ff53eefac990673b65667 upstream.
Andrey reported a use-after-free in add_grec():
for (psf = *psf_list; psf; psf = psf_next) {
...
psf_next = psf->sf_next;
where the struct ip_sf_list's were already freed by:
kfree+0xe8/0x2b0 mm/slub.c:3882
ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
sock_release+0x8d/0x1e0 net/socket.c:597
sock_close+0x16/0x20 net/socket.c:1072
This happens because we don't hold pmc->lock in ip_mc_clear_src()
and a parallel mr_ifc_timer timer could jump in and access them.
The RCU lock is there but it is merely for pmc itself, this
spinlock could actually ensure we don't access them in parallel.
Thanks to Eric and Long for discussion on this bug.
Change-Id: I53caaedd2487e58e14026c6d770380bb3dbb78c0
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 9c8bb163ae784be4f79ae504e78c862806087c54 upstream.
In function igmpv3/mld_add_delrec() we allocate pmc and put it in
idev->mc_tomb, so we should free it when we don't need it in del_delrec().
But I removed kfree(pmc) incorrectly in latest two patches. Now fix it.
Fixes: 24803f38a5c0 ("igmp: do not remove igmp souce list info when ...")
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when ...")
Change-Id: I81bad9c39912fbdd8170fafaf7f9d122cb9c52c3
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
kgsl_cmdbatch_destroy() tries to cancel all pending sync events
by taking local copy of pending list. In case of sync point timestamp
event, it goes ahead and accesses context's events list assuming that
event's context would be alive.
But at the same time, if the other context, which is of interest for
these sync point events, can be destroyed by cancelling all
events in its group.
This leads to use-after-free in kgsl_cmdbatch_destroy() path.
Fix is to give the responsibility of putting the context's ref count
to the thread which clears the pending mask.
Change-Id: I8d08ef6ddb38ca917f75088071c04727bced11d2
Signed-off-by: Rajesh Kemisetti <rajeshk@codeaurora.org>
Signed-off-by: Archana Sriram <apsrir@codeaurora.org>
commit 5f8cf712582617d523120df67d392059eaf2fc4b upstream.
If a USB sound card reports 0 interfaces, an error condition is triggered
and the function usb_audio_probe errors out. In the error path, there was a
use-after-free vulnerability where the memory object of the card was first
freed, followed by a decrement of the number of active chips. Moving the
decrement above the atomic_dec fixes the UAF.
[ The original problem was introduced in 3.1 kernel, while it was
developed in a different form. The Fixes tag below indicates the
original commit but it doesn't mean that the patch is applicable
cleanly. -- tiwai ]
Fixes: 362e4e49ab ("ALSA: usb-audio - clear chip->probing on error exit")
Reported-by: Hui Peng <benquike@gmail.com>
Reported-by: Mathias Payer <mathias.payer@nebelwelt.net>
Signed-off-by: Hui Peng <benquike@gmail.com>
Signed-off-by: Mathias Payer <mathias.payer@nebelwelt.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[surenb@google.com: resolve 3.18 differences]
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I54aecb9fe09beb178bc5d48f18ffa9ca13cf26e0
CVE-2018-19824
To prevent races with ep_remove_waitqueue() removing the
waitqueue at the same time.
Change-Id: Ib0cb4fc4549a813bc7f788961e37c1b89d318d83
Reported-by: syzbot+a2a3c4909716e271487e@syzkaller.appspotmail.com
Signed-off-by: Martijn Coenen <maco@android.com>
Cc: stable <stable@vger.kernel.org> # 4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
binder_poll() passes the thread->wait waitqueue that
can be slept on for work. When a thread that uses
epoll explicitly exits using BINDER_THREAD_EXIT,
the waitqueue is freed, but it is never removed
from the corresponding epoll data structure. When
the process subsequently exits, the epoll cleanup
code tries to access the waitlist, which results in
a use-after-free.
Prevent this by using POLLFREE when the thread exits.
(cherry picked from commit f5cb779ba16334b45ba8946d6bfa6d9834d1527f)
Change-Id: Ib34b1cbb8ab2192d78c3d9956b2f963a66ecad2e
Signed-off-by: Martijn Coenen <maco@android.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f88eb7c0d002a67ef31aeb7850b42ff69abc46dc upstream.
We currently don't validate the beacon head, i.e. the header,
fixed part and elements that are to go in front of the TIM
element. This means that the variable elements there can be
malformed, e.g. have a length exceeding the buffer size, but
most downstream code from this assumes that this has already
been checked.
Add the necessary checks to the netlink policy.
Change-Id: Ib0fc57efd6ef4bd4fd5e93de3af0a22dded6e520
Cc: stable@vger.kernel.org
Fixes: ed1b6cc7f8 ("cfg80211/nl80211: add beacon settings")
Link: https://lore.kernel.org/r/1569009255-I7ac7fbe9436e9d8733439eab8acbbd35e55c74ef@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
only the attributes are required and not the whole netlink info, as the
function accesses the attributes only anyway. This makes it easier to
parse nested beacon IEs later.
Change-Id: I1445cf91edbf018f8a1de5434f0a84acb27dbbdd
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
commit 7388afe09143210f555bdd6c75035e9acc1fab96 upstream.
Enforce the first argument to be a correct type of a pointer to struct
element and avoid unnecessary typecasts from const to non-const pointers
(the change in validate_ie_attr() is needed to make this part work). In
addition, avoid signed/unsigned comparison within for_each_element() and
mark struct element packed just in case.
Change-Id: I9351410943062797b21ef76a93b9c955f62242cb
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f3b07f027f87a38ebe5c436490095df762819be upstream.
Rather than always iterating elements from frames with pure
u8 pointers, add a type "struct element" that encapsulates
the id/datalen/data format of them.
Then, add the element iteration macros
* for_each_element
* for_each_element_id
* for_each_element_extid
which take, as their first 'argument', such a structure and
iterate through a given u8 array interpreting it as elements.
While at it and since we'll need it, also add
* for_each_subelement
* for_each_subelement_id
* for_each_subelement_extid
which instead of taking data/length just take an outer element
and use its data/datalen.
Also add for_each_element_completed() to determine if any of
the loops above completed, i.e. it was able to parse all of
the elements successfully and no data remained.
Use for_each_element_id() in cfg80211_find_ie_match() as the
first user of this.
Change-Id: I4222114545b72f91688c0d6e4ea0915842e089c9
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a possibility of use-after-free and
double free because of not marking buffer as
NULL after freeing. The patch marks buffer
as NULL after freeing in error case.
Change-Id: Iacf8f8a4a4e644f48c87d5445ccd594766f2e156
Signed-off-by: Hardik Arya <harya@codeaurora.org>
Make change to validate if there exists enough space to write a
unit64 instead of a unit32 value, in __qseecom_update_cmd_buf_64.
Change-Id: Iabf61dea240f16108e1765585aae3a12d2d651c9
Signed-off-by: jitendra thakare <jitendrathakare@codeaurora.org>
netlbl_unlabel_addrinfo_get() assumes that if it finds the
NLBL_UNLABEL_A_IPV4ADDR attribute, it must also have the
NLBL_UNLABEL_A_IPV4MASK attribute as well. However, this is
not necessarily the case as the current checks in
netlbl_unlabel_staticadd() and friends are not sufficent to
enforce this.
If passed a netlink message with NLBL_UNLABEL_A_IPV4ADDR,
NLBL_UNLABEL_A_IPV6ADDR, and NLBL_UNLABEL_A_IPV6MASK attributes,
these functions will all call netlbl_unlabel_addrinfo_get() which
will then attempt dereference NULL when fetching the non-existent
NLBL_UNLABEL_A_IPV4MASK attribute:
Unable to handle kernel NULL pointer dereference at virtual address 0
Process unlab (pid: 31762, stack limit = 0xffffff80502d8000)
Call trace:
netlbl_unlabel_addrinfo_get+0x44/0xd8
netlbl_unlabel_staticremovedef+0x98/0xe0
genl_rcv_msg+0x354/0x388
netlink_rcv_skb+0xac/0x118
genl_rcv+0x34/0x48
netlink_unicast+0x158/0x1f0
netlink_sendmsg+0x32c/0x338
sock_sendmsg+0x44/0x60
___sys_sendmsg+0x1d0/0x2a8
__sys_sendmsg+0x64/0xb4
SyS_sendmsg+0x34/0x4c
el0_svc_naked+0x34/0x38
Code: 51001149 7100113f 540000a0 f9401508 (79400108)
---[ end trace f6438a488e737143 ]---
Kernel panic - not syncing: Fatal exception
Change-Id: Ib2ec6e8c8296554b8b7394592a24e0cb2e92cbf5
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
propagation from qcacld-3.0 to qcacld-2.0
In wma_log_supported_evt_handler, events_logs_list in
wma handle is freed if previously allocated. If the
num_of_diag_events_logs exceeds the max size, we exit
from the function early without allocating memory for
events_logs_list. This can result in potential double
free scenario if we receive another DIAG_EVENT_LOG_SUPPORTED
event from firmware.
Fix is to set events_logs_list pointer to NULL after
freeing memory.
Change-Id: I9d6148dfc064d87e2947d1b5ec4492c08913dd4c
CRs-Fixed: 2482603
commit 38c73529de13e1e10914de7030b659a2f8b01c3b upstream.
In commit 19e4e768064a8 ("ipv4: Fix raw socket lookup for local
traffic"), the dif argument to __raw_v4_lookup() is coming from the
returned value of inet_iif() but the change was done only for the first
lookup. Subsequent lookups in the while loop still use skb->dev->ifIndex.
Fixes: 19e4e768064a8 ("ipv4: Fix raw socket lookup for local traffic")
Change-Id: I2e40ae96d0513cbab9332fd58d6dd96a2ac3c307
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
currently only NULL pointer check is used to validate the return
value from clk_get, this change to handle all the failures.
This snapshot is taken from msm-4.9
Ported it from 4.9 to 3.18
Change-Id: Icd8b7e33d0f235a7c5dde2307972a594908e6a60
Signed-off-by: Sumalatha Malothu <smalot@codeaurora.org>
To avoid access of variable after being freed, using
list_first_entry_safe function to iterate over list
of given type, safe against removal of list entry.
Change-Id: I70611fddf3e9b80b1affa3e5235be24eac0d0a58
Signed-off-by: Monika Singh <monising@codeaurora.org>
When reading an extra descriptor, we need to properly check the minimum
and maximum size allowed, to prevent from invalid data being sent by a
device.
Change-Id: If4dd31307e0531261c9d9a21fbea5487732f7baa
Reported-by: Hui Peng <benquike@gmail.com>
Reported-by: Mathias Payer <mathias.payer@nebelwelt.net>
Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hui Peng <benquike@gmail.com>
Signed-off-by: Mathias Payer <mathias.payer@nebelwelt.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This change fixes buffer overflows and silent data corruption with the
usbmon device driver text file read operations.
Change-Id: Ie9953b9b05863feebfe81f4d2e18f2b6af72d58d
Signed-off-by: Fredrik Noring <noring@nocrew.org>
Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sec_ts touch driver sysfs store callback had couple of userspace buffer copy
operations where it was not checking for validity of length being copied
from source buffer. This CL adds necessary boundary checks to make sure the
destination kernel buffer is not overflown.
Bug: 120211708
Bug: 120211415
Change-Id: I8bfe1ab9ae50d89ce12eeaf856204c20056a2061
Signed-off-by: Biswajit Dash <bisdash@google.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
The kernel address is used as cookie to keep track
of stats request. This address can be disclosed to
target leading to a security vulnerability.
Implement a FW stats descriptor pool, and use a
descriptor ID to keep track of stats requests,
instead of the kernel address, to prevent
kernel address leak.
Change-Id: Ib49150da899c0b9314f614868a90867f4aa92d3d
CRs-Fixed: 2276007
Propagate from qcacld3.0 to qcacld2.0
Currently variable "num_mpdu_ranges" is from message, which is used
directly without any validation which causes buffer over-write.
To avoid buffer over-write add check for the valid num_mpdu_ranges
Change-Id: I54e138d4bd63cbe7a0ae4faf0fe9d8e59ca92c71
CRs-Fixed: 2500393
Add adf_print API to print error logs from ADF module.
Add ADF_BUG implementation to warn in case crash is not
required.
Change-Id: If4ba15c669cf5d6769cb7850314cd3bd66f8fd90
CRs-Fixed: 1074129
'nRoamingTime' is 32bit integer, it can overflow when multipled
with PAL_TICKS_PER_SECOND so type cast it to 64bit before
multiplying to avoid overflow.
Change-Id: I66b303dc0631078cc442fcf3c95027bc224bf57f
[ Upstream commit 732706afe1cc46ef48493b3d2b69c98f36314ae4 ]
On policies with a transport mode template, we pass the addresses
from the flowi to xfrm_state_find(), assuming that the IP addresses
(and address family) don't change during transformation.
Unfortunately our policy template validation is not strict enough.
It is possible to configure policies with transport mode template
where the address family of the template does not match the selectors
address family. This lead to stack-out-of-bound reads because
we compare arddesses of the wrong family. Fix this by refusing
such a configuration, address family can not change on transport
mode.
We use the assumption that, on transport mode, the first templates
address family must match the address family of the policy selector.
Subsequent transport mode templates must mach the address family of
the previous template.
Change-Id: I33678e32df020045f419f38fc4d955863c42409a
Git-commit: 732706afe1cc46ef48493b3d2b69c98f36314ae4
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
When we do tunnel or beet mode, we pass saddr and daddr from the
template to xfrm_state_find(), this is ok. On transport mode,
we pass the addresses from the flowi, assuming that the IP
addresses (and address family) don't change during transformation.
This assumption is wrong in the IPv4 mapped IPv6 case, packet
is IPv4 and template is IPv6.
Fix this by catching address family missmatches of the policy
and the flow already before we do the lookup.
Change-Id: I4e3da03ed3b8f0cf0fdd01d5cdc8a69e9504240b
Git-commit: ddc47e4404b58f03e98345398fb12d38fe291512
Git-repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[tejaswit@codeaurora.org : resolved minor conflicts. ]
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Change data type for gpu ib vote to unsigned
long to suit the bw vote data type in devfreq
governor functions.
Change-Id: I6aeb201ee67d111ee527c17e051b5125968a9683
Signed-off-by: Archana Sriram <apsrir@codeaurora.org>
Remove the use of dmac_flush_range for userspace buffers and add
msm_ion_do_cache_op for flushing user space buffers.
Change-Id: Ice73eafac840bd1cabee0a2bfc8a641832a7d0c8
Acked-by: Bharath Kumar <bkumar@qti.qualcomm.com>
Signed-off-by: Tharun Kumar Merugu <mtharu@codeaurora.org>
validate structures and payload sizes in the
packet against packet size to avoid OOB access
Change-Id: Id44e5c6be4dde3e6545d453f5edd3219776a4e58
Signed-off-by: Manikanta Kanamarlapudi <kmanikan@codeaurora.org>
commit 5e86bdda41534e17621d5a071b294943cae4376e upstream.
Currently, we are releasing the indirect buffer where we are done with
it in ext4_ind_remove_space(), so we can see the brelse() and
BUFFER_TRACE() everywhere. It seems fragile and hard to read, and we
may probably forget to release the buffer some day. This patch cleans
up the code by putting of the code which releases the buffers to the
end of the function.
Change-Id: I48b5b058dfb66fc76471aa2449a466328ab56475
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 674a2b27234d1b7afcb0a9162e81b2e53aeef217 upstream.
All indirect buffers get by ext4_find_shared() should be released no
mater the branch should be freed or not. But now, we forget to release
the lower depth indirect buffers when removing space from the same
higher depth indirect block. It will lead to buffer leak and futher
more, it may lead to quota information corruption when using old quota,
consider the following case.
- Create and mount an empty ext4 filesystem without extent and quota
features,
- quotacheck and enable the user & group quota,
- Create some files and write some data to them, and then punch hole
to some files of them, it may trigger the buffer leak problem
mentioned above.
- Disable quota and run quotacheck again, it will create two new
aquota files and write the checked quota information to them, which
probably may reuse the freed indirect block(the buffer and page
cache was not freed) as data block.
- Enable quota again, it will invoke
vfs_load_quota_inode()->invalidate_bdev() to try to clean unused
buffers and pagecache. Unfortunately, because of the buffer of quota
data block is still referenced, quota code cannot read the up to date
quota info from the device and lead to quota information corruption.
This problem can be reproduced by xfstests generic/231 on ext3 file
system or ext4 file system without extent and quota features.
This patch fix this problem by releasing the missing indirect buffers,
in ext4_ind_remove_space().
Change-Id: I7279fb664e0ca8e72cbdc2662babe599090b1b58
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
commit 6f30b7e37a8239f9d27db626a1d3427bc7951908 upstream.
Commit 4f579ae7de56 (ext4: fix punch hole on files with indirect
mapping) rewrote FALLOC_FL_PUNCH_HOLE for ext4 files with indirect
mapping. However, there are bugs in several corner cases. This fixes 5
distinct bugs:
1. When there is at least one entire level of indirection between the
start and end of the punch range and the end of the punch range is the
first block of its level, we can't return early; we have to free the
intervening levels.
2. When the end is at a higher level of indirection than the start and
ext4_find_shared returns a top branch for the end, we still need to free
the rest of the shared branch it returns; we can't decrement partial2.
3. When a punch happens within one level of indirection, we need to
converge on an indirect block that contains the start and end. However,
because the branches returned from ext4_find_shared do not necessarily
start at the same level (e.g., the partial2 chain will be shallower if
the last block occurs at the beginning of an indirect group), the walk
of the two chains can end up "missing" each other and freeing a bunch of
extra blocks in the process. This mismatch can be handled by first
making sure that the chains are at the same level, then walking them
together until they converge.
4. When the punch happens within one level of indirection and
ext4_find_shared returns a top branch for the start, we must free it,
but only if the end does not occur within that branch.
5. When the punch happens within one level of indirection and
ext4_find_shared returns a top branch for the end, then we shouldn't
free the block referenced by the end of the returned chain (this mirrors
the different levels case).
Change-Id: I7d0166b64cc2efd7e53ac9c3ef65ad9306ae91af
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
commit 4f579ae7de560e5f449587a6c3f02594d53d4d51 upstream.
Currently punch hole code on files with direct/indirect mapping has some
problems which may lead to a data loss. For example (from Jan Kara):
fallocate -n -p 10240000 4096
will punch the range 10240000 - 12632064 instead of the range 1024000 -
10244096.
Also the code is a bit weird and it's not using infrastructure provided
by indirect.c, but rather creating it's own way.
This patch fixes the issues as well as making the operation to run 4
times faster from my testing (punching out 60GB file). It uses similar
approach used in ext4_ind_truncate() which takes advantage of
ext4_free_branches() function.
Also rename the ext4_free_hole_blocks() to something more sensible, like
the equivalent we have for extent mapped files. Call it
ext4_ind_remove_space().
This has been tested mostly with fsx and some xfstests which are testing
punch hole but does not require unwritten extents which are not
supported with direct/indirect mapping. Not problems showed up even with
1024k block size.
Change-Id: I50cfdf1a688774674341fef9cb3b6ba65875e6af
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
free_holes_block() passed local variable as a block pointer
to ext4_clear_blocks(). Thus ext4_clear_blocks() zeroed out this local
variable instead of proper place in inode / indirect block. We later
zero out proper place in inode / indirect block but don't dirty the
inode / buffer again which can lead to subtle issues (some changes e.g.
to inode can be lost).
Change-Id: I0e5aed175d60feecba2b37f73986988412282958
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4_ind_trans_blocks() wrongly used 'chunk' argument to decide whether
blocks mapped are logically contiguous. That is wrong since the argument
informs whether the blocks are physically contiguous. As the blocks
mapped are always logically contiguous and that's all
ext4_ind_trans_blocks() cares about, just remove the 'chunk' argument.
Change-Id: Iae320c4f355316739847743bbb54dfdb8e123123
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Denis Andzakovic discovered a potential use-after-free in older kernel
versions, using syzkaller. tcp_write_queue_purge() frees all skbs in
the TCP write queue and can leave sk->sk_send_head pointing to freed
memory. tcp_disconnect() clears that pointer after calling
tcp_write_queue_purge(), but tcp_connect() does not. It is
(surprisingly) possible to add to the write queue between
disconnection and reconnection, so this needs to be done in both
places.
This bug was introduced by backports of commit 7f582b248d0a ("tcp:
purge write queue in tcp_connect_init()") and does not exist upstream
because of earlier changes in commit 75c119afe14f ("tcp: implement
rb-tree based retransmit queue"). The latter is a major change that's
not suitable for stable.
Change-Id: I97a1a1f3f753b950984e48af6c28cfd4a346db8a
Reported-by: Denis Andzakovic <denis.andzakovic@pulsesecurity.co.nz>
Bisected-by: Salvatore Bonaccorso <carnil@debian.org>
Fixes: 7f582b248d0a ("tcp: purge write queue in tcp_connect_init()")
Cc: <stable@vger.kernel.org> # before 4.15
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently data in "pl_tgt_hdr" is used directly from firmware without
any length check which may cause buffer over-read.
To address this issue add length check before accessing data offset
Change-Id: Ia968806b765bb41e39395e15fcc3c8f880ae7335
CRs-Fixed: 2240226
This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
handler msg_name and msg_namelen logic").
DECLARE_SOCKADDR validates that the structure we use for writing the
name information to is not larger than the buffer which is reserved
for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
consistently in sendmsg code paths.
Change-Id: I0589c7ce694ef02dbc1e8b227fb51eeebf610e47
Signed-off-by: Steffen Hurrle <steffen@hurrle.net>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In tpacket_snd(), when we've discovered a first frame that is
not in status TP_STATUS_SEND_REQUEST, and return a NULL buffer,
we exit the send routine in case of MSG_DONTWAIT, since we've
finished traversing the mmaped send ring buffer and don't care
about pending frames.
While doing so, we still unconditionally call an expensive
schedule() in the packet_current_frame() "error" path, which
is unnecessary in this case since it's enough to just quit
the function.
Also, in case MSG_DONTWAIT is not set, we should rather test
for need_resched() first and do schedule() only if necessary
since meanwhile pending frames could already have finished
processing and called skb destructor.
Change-Id: Ic83ad580914e70fbffb871ded8ee9045f4679da4
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.
If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_msgnamelen to 0.
[cherry-pick of net-next bceaa90240b6019ed73b49965eac7d167610be69]
Bug: 28347599
Change-Id: Id292aa64697c5b5bccc8e0f7d9236b519454ef27
Reported-by: mpb <mpb.mail@gmail.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Git-commit: 1ad7f4b67337194d668fd8983fad0c84eff00fac
Git-repo: https://android.googlesource.com/kernel/common.git
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
[ Upstream commit b617158dc096709d8600c53b6052144d12b89fab ]
Some applications set tiny SO_SNDBUF values and expect
TCP to just work. Recent patches to address CVE-2019-11478
broke them in case of losses, since retransmits might
be prevented.
We should allow these flows to make progress.
This patch allows the first and last skb in retransmit queue
to be split even if memory limits are hit.
It also adds the some room due to the fact that tcp_sendmsg()
and tcp_sendpage() might overshoot sk_wmem_queued by about one full
TSO skb (64KB size). Note this allowance was already present
in stable backports for kernels < 4.15
Note for < 4.15 backports :
tcp_rtx_queue_tail() will probably look like :
static inline struct sk_buff *tcp_rtx_queue_tail(const struct sock *sk)
{
struct sk_buff *skb = tcp_send_head(sk);
return skb ? tcp_write_queue_prev(sk, skb) : tcp_write_queue_tail(sk);
}
Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Andrew Prout <aprout@ll.mit.edu>
Tested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Christoph Paasch <cpaasch@apple.com>
Cc: Jonathan Looney <jtl@netflix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 5917ca48053447dac0e13c51ed7d4e2471a1cbc9)
Change-Id: I8541a25d6a10934cc6d59c750b9a70c975f3b8f5