commit 3c8f7710c1c44fb650bc29b6ef78ed8b60cfaa28 upstream.
The previous fix of pxa library support, which was introduced to fix the
library dependency, broke the previous SoC behavior, where a machine
code binding pxa2xx-ac97 with a coded relied on :
- sound/soc/pxa/pxa2xx-ac97.c
- sound/soc/codecs/XXX.c
For example, the mioa701_wm9713.c machine code is currently broken. The
"select ARM" statement wrongly selects the soc/arm/pxa2xx-ac97 for
compilation, as per an unfortunate fate SND_PXA2XX_AC97 is both declared
in sound/arm/Kconfig and sound/soc/pxa/Kconfig.
Fix this by ensuring that SND_PXA2XX_SOC correctly triggers the correct
pxa2xx-ac97 compilation.
Fixes: 846172dfe3 ("ASoC: fix SND_PXA2XX_LIB Kconfig warning")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 93efac3f2e03321129de67a3c0ba53048bb53e31 upstream.
The IPv6 IPsec pre-encap path performs fragmentation for tunnel-mode
packets. That is, we perform fragmentation pre-encap rather than
post-encap.
A check was added later to ensure that proper MTU information is
passed back for locally generated traffic. Unfortunately this
check was performed on all IPsec packets, including transport-mode
packets.
What's more, the check failed to take GSO into account.
The end result is that transport-mode GSO packets get dropped at
the check.
This patch fixes it by moving the tunnel mode check forward as well
as adding the GSO check.
Fixes: dd767856a3 ("xfrm6: Don't call icmpv6_send on local error")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
[lizf: Backported to 3.4:
- adjust context
- s/ignore_df/local_df]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 275d7d44d802ef271a42dc87ac091a495ba72fc5 upstream.
Poma (on the way to another bug) reported an assertion triggering:
[<ffffffff81150529>] module_assert_mutex_or_preempt+0x49/0x90
[<ffffffff81150822>] __module_address+0x32/0x150
[<ffffffff81150956>] __module_text_address+0x16/0x70
[<ffffffff81150f19>] symbol_put_addr+0x29/0x40
[<ffffffffa04b77ad>] dvb_frontend_detach+0x7d/0x90 [dvb_core]
Laura Abbott <labbott@redhat.com> produced a patch which lead us to
inspect symbol_put_addr(). This function has a comment claiming it
doesn't need to disable preemption around the module lookup
because it holds a reference to the module it wants to find, which
therefore cannot go away.
This is wrong (and a false optimization too, preempt_disable() is really
rather cheap, and I doubt any of this is on uber critical paths,
otherwise it would've retained a pointer to the actual module anyway and
avoided the second lookup).
While its true that the module cannot go away while we hold a reference
on it, the data structure we do the lookup in very much _CAN_ change
while we do the lookup. Therefore fix the comment and add the
required preempt_disable().
Reported-by: poma <pomidorabelisima@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: a6e6abd575 ("module: remove module_text_address()")
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 03da3ff1cfcd7774c8780d2547ba0d995f7dc03d upstream.
In 2007, commit 07190a08ee ("Mark TSC on GeodeLX reliable")
bypassed verification of the TSC on Geode LX. However, this code
(now in the check_system_tsc_reliable() function in
arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was
set.
OpenWRT has recently started building its generic Geode target
for Geode GX, not LX, to include support for additional
platforms. This broke the timekeeping on LX-based devices,
because the TSC wasn't marked as reliable:
https://dev.openwrt.org/ticket/20531
By adding a runtime check on is_geode_lx(), we can also include
the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus
fixing the problem.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Andres Salomon <dilinger@queued.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1442409003.131189.87.camel@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 9b55613f42e8d40d5c9ccb8970bde6af4764b2ab upstream.
When a kernel is built covering ARMv6 to ARMv7, we omit to clear the
IT state when entering a signal handler. This can cause the first
few instructions to be conditionally executed depending on the parent
context.
In any case, the original test for >= ARMv7 is broken - ARMv6 can have
Thumb-2 support as well, and an ARMv6T2 specific build would omit this
code too.
Relax the test back to ARMv6 or greater. This results in us always
clearing the IT state bits in the PSR, even on CPUs where these bits
are reserved. However, they're reserved for the IT state, so this
should cause no harm.
Fixes: d71e1352e2 ("Clear the IT state when invoking a Thumb-2 signal handler")
Acked-by: Tony Lindgren <tony@atomide.com>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 6ecf830e50 upstream.
The ARM architecture reference specifies that the IT state bits in the
PSR must be all zeros in ARM mode or behavior is unspecified. On the
Qualcomm Snapdragon S4/Krait architecture CPUs the processor continues
to consider the IT state bits while in ARM mode. This makes it so
that some instructions are skipped by the CPU.
Signed-off-by: T.J. Purtell <tj@mobisocial.us>
[rmk+kernel@arm.linux.org.uk: fixed whitespace formatting in patch]
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit e297c939b745e420ef0b9dc989cb87bda617b399 upstream.
This fixes a race which can result in the same virtual IRQ number
being assigned to two different MSI interrupts. The most visible
consequence of that is usually a warning and stack trace from the
sysfs code about an attempt to create a duplicate entry in sysfs.
The race happens when one CPU (say CPU 0) is disposing of an MSI
while another CPU (say CPU 1) is setting up an MSI. CPU 0 calls
(for example) pnv_teardown_msi_irqs(), which calls
msi_bitmap_free_hwirqs() to indicate that the MSI (i.e. its
hardware IRQ number) is no longer in use. Then, before CPU 0 gets
to calling irq_dispose_mapping() to free up the virtal IRQ number,
CPU 1 comes in and calls msi_bitmap_alloc_hwirqs() to allocate an
MSI, and gets the same hardware IRQ number that CPU 0 just freed.
CPU 1 then calls irq_create_mapping() to get a virtual IRQ number,
which sees that there is currently a mapping for that hardware IRQ
number and returns the corresponding virtual IRQ number (which is
the same virtual IRQ number that CPU 0 was using). CPU 0 then
calls irq_dispose_mapping() and frees that virtual IRQ number.
Now, if another CPU comes along and calls irq_create_mapping(), it
is likely to get the virtual IRQ number that was just freed,
resulting in the same virtual IRQ number apparently being used for
two different hardware interrupts.
To fix this race, we just move the call to msi_bitmap_free_hwirqs()
to after the call to irq_dispose_mapping(). Since virq_to_hw()
doesn't work for the virtual IRQ number after irq_dispose_mapping()
has been called, we need to call it before irq_dispose_mapping() and
remember the result for the msi_bitmap_free_hwirqs() call.
The pattern of calling msi_bitmap_free_hwirqs() before
irq_dispose_mapping() appears in 5 places under arch/powerpc, and
appears to have originated in commit 05af7bd2d7 ("[POWERPC] MPIC
U3/U4 MSI backend") from 2007.
Fixes: 05af7bd2d7 ("[POWERPC] MPIC U3/U4 MSI backend")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[bwh: Backported to 3.2:
- powernv uses a private functions instead of msi_bitmap_free_hwirqs()
- Adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit a077224fd35b2f7fbc93f14cf67074fc792fbac2 upstream.
While working on the 32-bit ARM port of UEFI, I noticed a strange
corruption in the kernel log. The following snprintf() statement
(in drivers/firmware/efi/efi.c:efi_md_typeattr_format())
snprintf(pos, size, "|%3s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
was producing the following output in the log:
| | | | | |WB|WT|WC|UC]
| | | | | |WB|WT|WC|UC]
| | | | | |WB|WT|WC|UC]
|RUN| | | | |WB|WT|WC|UC]*
|RUN| | | | |WB|WT|WC|UC]*
| | | | | |WB|WT|WC|UC]
|RUN| | | | |WB|WT|WC|UC]*
| | | | | |WB|WT|WC|UC]
|RUN| | | | | | | |UC]
|RUN| | | | | | | |UC]
As it turns out, this is caused by incorrect code being emitted for
the string() function in lib/vsprintf.c. The following code
if (!(spec.flags & LEFT)) {
while (len < spec.field_width--) {
if (buf < end)
*buf = ' ';
++buf;
}
}
for (i = 0; i < len; ++i) {
if (buf < end)
*buf = *s;
++buf; ++s;
}
while (len < spec.field_width--) {
if (buf < end)
*buf = ' ';
++buf;
}
when called with len == 0, triggers an issue in the GCC SRA optimization
pass (Scalar Replacement of Aggregates), which handles promotion of signed
struct members incorrectly. This is a known but as yet unresolved issue.
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932). In this particular
case, it is causing the second while loop to be executed erroneously a
single time, causing the additional space characters to be printed.
So disable the optimization by passing -fno-ipa-sra.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 294ab783ad98066b87296db1311c7ba2a60206a5 upstream.
It looks like the Kconfig check that was meant to fix this (commit
fe9233fb69 [SCSI] scsi_dh: fix kconfig related
build errors) was actually reversed, but no-one noticed until the new set of
patches which separated DM and SCSI_DH).
Fixes: fe9233fb69
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit b4cc0efea4f0bfa2477c56af406cfcf3d3e58680 upstream.
Fix B-tree corruption when a new record is inserted at position 0 in the
node in hfs_brec_insert().
This is an identical change to the corresponding hfs b-tree code to Sergei
Antonov's "hfsplus: fix B-tree corruption after insertion at position 0",
to keep similar code paths in the hfs and hfsplus drivers in sync, where
appropriate.
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Cc: Sergei Antonov <saproj@gmail.com>
Cc: Joe Perches <joe@perches.com>
Reviewed-by: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 7cb74be6fd827e314f81df3c5889b87e4c87c569 upstream.
Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and
hfs_bnode_find() for finding or creating pages corresponding to an inode)
are immediately kmap()'ed and used (both read and write) and kunmap()'ed,
and should not be page_cache_release()'ed until hfs_bnode_free().
This patch fixes a problem I first saw in July 2012: merely running "du"
on a large hfsplus-mounted directory a few times on a reasonably loaded
system would get the hfsplus driver all confused and complaining about
B-tree inconsistencies, and generates a "BUG: Bad page state". Most
recently, I can generate this problem on up-to-date Fedora 22 with shipped
kernel 4.0.5, by running "du /" (="/" + "/home" + "/mnt" + other smaller
mounts) and "du /mnt" simultaneously on two windows, where /mnt is a
lightly-used QEMU VM image of the full Mac OS X 10.9:
$ df -i / /home /mnt
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/fedora-root 3276800 551665 2725135 17% /
/dev/mapper/fedora-home 52879360 716221 52163139 2% /home
/dev/nbd0p2 4294967295 1387818 4293579477 1% /mnt
After applying the patch, I was able to run "du /" (60+ times) and "du
/mnt" (150+ times) continuously and simultaneously for 6+ hours.
There are many reports of the hfsplus driver getting confused under load
and generating "BUG: Bad page state" or other similar issues over the
years. [1]
The unpatched code [2] has always been wrong since it entered the kernel
tree. The only reason why it gets away with it is that the
kmap/memcpy/kunmap follow very quickly after the page_cache_release() so
the kernel has not had a chance to reuse the memory for something else,
most of the time.
The current RW driver appears to have followed the design and development
of the earlier read-only hfsplus driver [3], where-by version 0.1 (Dec
2001) had a B-tree node-centric approach to
read_cache_page()/page_cache_release() per bnode_get()/bnode_put(),
migrating towards version 0.2 (June 2002) of caching and releasing pages
per inode extents. When the current RW code first entered the kernel [2]
in 2005, there was an REF_PAGES conditional (and "//" commented out code)
to switch between B-node centric paging to inode-centric paging. There
was a mistake with the direction of one of the REF_PAGES conditionals in
__hfs_bnode_create(). In a subsequent "remove debug code" commit [4], the
read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were
removed, but a page_cache_release() was mistakenly left in (propagating
the "REF_PAGES <-> !REF_PAGE" mistake), and the commented-out
page_cache_release() in bnode_release() (which should be spanned by
!REF_PAGES) was never enabled.
References:
[1]:
Michael Fox, Apr 2013
http://www.spinics.net/lists/linux-fsdevel/msg63807.html
("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'")
Sasha Levin, Feb 2015
http://lkml.org/lkml/2015/2/20/85 ("use after free")
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887https://bugzilla.kernel.org/show_bug.cgi?id=42342https://bugzilla.kernel.org/show_bug.cgi?id=63841https://bugzilla.kernel.org/show_bug.cgi?id=78761
[2]:
http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfs/bnode.c?id=d1081202f1d0ee35ab0beb490da4b65d4bc763db
commit d1081202f1d0ee35ab0beb490da4b65d4bc763db
Author: Andrew Morton <akpm@osdl.org>
Date: Wed Feb 25 16:17:36 2004 -0800
[PATCH] HFS rewrite
http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfsplus/bnode.c?id=91556682e0bf004d98a529bf829d339abb98bbbd
commit 91556682e0bf004d98a529bf829d339abb98bbbd
Author: Andrew Morton <akpm@osdl.org>
Date: Wed Feb 25 16:17:48 2004 -0800
[PATCH] HFS+ support
[3]:
http://sourceforge.net/projects/linux-hfsplus/http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\
fs/hfsplus/bnode.c?r1=1.4&r2=1.5
Date: Thu Jun 6 09:45:14 2002 +0000
Use buffer cache instead of page cache in bnode.c. Cache inode extents.
[4]:
http://git.kernel.org/cgit/linux/kernel/git/\
stable/linux-stable.git/commit/?id=a5e3985fa014029eb6795664c704953720cc7f7d
commit a5e3985fa0
Author: Roman Zippel <zippel@linux-m68k.org>
Date: Tue Sep 6 15:18:47 2005 -0700
[PATCH] hfs: remove debug code
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit a068acf2ee77693e0bf39d6e07139ba704f461c3 upstream.
Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g. new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else. This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.
Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use
of "sudo" is something more sneaky:
$ BASE="ovl"
$ MNT="$BASE/mnt"
$ LOW="$BASE/lower"
$ UP="$BASE/upper"
$ WORK="$BASE/work/ 0 0
none /proc fuse.pwn user_id=1000"
$ mkdir -p "$LOW" "$UP" "$WORK"
$ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
$ cat /proc/mounts
none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
none /proc fuse.pwn user_id=1000 0 0
$ fusermount -u /proc
$ cat /proc/mounts
cat: /proc/mounts: No such file or directory
This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed. Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.
[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: J. R. Okajima <hooanon05g@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[lizf: Backported to 3.4:
- adjust context
- one more place in ceph needs to be changed
- drop changes to overlayfs
- drop showing vers in cifs]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 71c6da846be478a61556717ef1ee1cea91f5d6a8 upstream.
Currently context size (cra_ctxsize) doesn't specified for
ghash_async_alg. Which means it's zero. Thus crypto_create_tfm()
doesn't allocate needed space for ghash_async_ctx, so any
read/write to ctx (e.g. in ghash_async_init_tfm()) is not valid.
Signed-off-by: Andrey Ryabinin <aryabinin@odin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit f49a26e7718dd30b49e3541e3e25aecf5e7294e2 upstream.
Update ctime and mtime when a directory is modified. (though OS/2 doesn't
update them anyway)
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit b632ffa7cee439ba5dce3b3bc4a5cbe2b3e20133 upstream.
We have many WR opcodes that are only supported in kernel space
and/or require optional information to be copied into the WR
structure. Reject all those not explicitly handled so that we
can't pass invalid information to drivers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 09bfda10e6efd7b65bcc29237bee1765ed779657 upstream.
With the radeon driver loaded the HP Compaq dc5750
Small Form Factor machine fails to resume from suspend.
Adding a quirk similar to other devices avoids
the problem and the system resumes properly.
Signed-off-by: Jeffery Miller <jmiller@neverware.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 35d4a0b63dc0c6d1177d4f532a9deae958f0662c upstream.
Fixes: 2a72f21226 ("IB/uverbs: Remove dev_table")
Before this commit there was a device look-up table that was protected
by a spin_lock used by ib_uverbs_open and by ib_uverbs_remove_one. When
it was dropped and container_of was used instead, it enabled the race
with remove_one as dev might be freed just after:
dev = container_of(inode->i_cdev, struct ib_uverbs_device, cdev) but
before the kref_get.
In addition, this buggy patch added some dead code as
container_of(x,y,z) can never be NULL and so dev can never be NULL.
As a result the comment above ib_uverbs_open saying "the open method
will either immediately run -ENXIO" is wrong as it can never happen.
The solution follows Jason Gunthorpe suggestion from below URL:
https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg25692.html
cdev will hold a kref on the parent (the containing structure,
ib_uverbs_device) and only when that kref is released it is
guaranteed that open will never be called again.
In addition, fixes the active count scheme to use an atomic
not a kref to prevent WARN_ON as pointed by above comment
from Jason.
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 5e99b139f1b68acd65e36515ca347b03856dfb5a upstream.
The mlx4 IB driver implementation for ib_query_ah used a wrong offset
(28 instead of 29) when link type is Ethernet. Fixed to use the correct one.
Fixes: fa417f7b52 ('IB/mlx4: Add support for IBoE')
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 0c78789e3a030615c6650fde89546cadf40ec2cc upstream.
In case the reconnection attempt fails.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
[lizf: Backported to 3.4: add definition of variable xprt]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 7f5dcaf1fdf289767a126a0a5cc3ef39b5254b06 upstream.
The unregister path of platform_device is broken. On registration, it
will register all resources with either a parent already set, or
type==IORESOURCE_{IO,MEM}. However, on unregister it will release
everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There
are also cases where resources don't get registered in the first place,
like with devices created by of_platform_populate()*.
Fix the unregister path to be symmetrical with the register path by
checking the parent pointer instead of the type field to decide which
resources to unregister. This is safe because the upshot of the
registration path algorithm is that registered resources have a parent
pointer, and non-registered resources do not.
* It can be argued that of_platform_populate() should be registering
it's resources, and they argument has some merit. However, there are
quite a few platforms that end up broken if we try to do that due to
overlapping resources in the device tree. Until that is fixed, we need
to solve the immediate problem.
Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Tested-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 3a496b00b6f90c41bd21a410871dfc97d4f3c7ab upstream.
If the internal call to of_address_to_resource() fails, we end up
looping forever in of_find_matching_node_by_address(). This can be
caused by a defective device tree, or calling with an incorrect
matches argument.
Fix by calling of_find_matching_node() unconditionally at the end of
the loop.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 924f92bf12bfbef3662619e3ed24a1cea7c1cbcd upstream.
Most of the time this isn't an issue since hotplugging an adaptor will
trigger a crtc mode change which in turn, causes the driver to probe
every DisplayPort for a dpcd. However, in cases where hotplugging
doesn't cause a mode change (specifically when one unplugs a monitor
from a DisplayPort connector, then plugs that same monitor back in
seconds later on the same port without any other monitors connected), we
never probe for the dpcd before starting the initial link training. What
happens from there looks like this:
- GPU has only one monitor connected. It's connected via
DisplayPort, and does not go through an adaptor of any sort.
- User unplugs DisplayPort connector from GPU.
- Change in HPD is detected by the driver, we probe every
DisplayPort for a possible connection.
- Probe the port the user originally had the monitor connected
on for it's dpcd. This fails, and we clear the first (and only
the first) byte of the dpcd to indicate we no longer have a
dpcd for this port.
- User plugs the previously disconnected monitor back into the
same DisplayPort.
- radeon_connector_hotplug() is called before everyone else,
and tries to handle the link training. Since only the first
byte of the dpcd is zeroed, the driver is able to complete
link training but does so against the wrong dpcd, causing it
to initialize the link with the wrong settings.
- Display stays blank (usually), dpcd is probed after the
initial link training, and the driver prints no obvious
messages to the log.
In theory, since only one byte of the dpcd is chopped off (specifically,
the byte that contains the revision information for DisplayPort), it's
not entirely impossible that this bug may not show on certain monitors.
For instance, the only reason this bug was visible on my ASUS PB238
monitor was due to the fact that this monitor using the enhanced framing
symbol sequence, the flag for which is ignored if the radeon driver
thinks that the DisplayPort version is below 1.1.
Signed-off-by: Stephen Chandler Paul <cpaul@redhat.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 5556e7e6d30e8e9b5ee51b0e5edd526ee80e5e36 upstream.
Consider eCryptfs dcache entries to be stale when the corresponding
lower inode's i_nlink count is zero. This solves a problem caused by the
lower inode being directly modified, without going through the eCryptfs
mount, leaving stale eCryptfs dentries cached and the eCryptfs inode's
i_nlink count not being cleared.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Richard Weinberger <richard@nod.at>
[bwh: Backported to 3.2:
- Test d_revalidate pointer directly rather than a DCACHE_OP flag
- Open-code d_inode()
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 1fb8dc36384ae1140ee6ccc470de74397606a9d5 upstream.
CustomWare uses the FTDI VID with custom PIDs for their ShipModul MiniPlex
products.
Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 0521cfd06e1ebcd575e7ae36aab068b38df23850 upstream.
The ehci platform device's drvdata is the pointer of struct usb_hcd
already, so we doesn't need to call bus_to_hcd conversion again.
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit efcbc04e16dfa95fef76309f89710dd1d99a5453 upstream.
It is unusual to combine the open flags O_RDONLY and O_EXCL, but
it appears that libre-office does just that.
[pid 3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0
[pid 3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...>
NFSv4 takes O_EXCL as a sign that a setattr command should be sent,
probably to reset the timestamps.
When it was an O_RDONLY open, the SETATTR command does not
identify any actual attributes to change.
If no delegation was provided to the open, the SETATTR uses the
all-zeros stateid and the request is accepted (at least by the
Linux NFS server - no harm, no foul).
If a read-delegation was provided, this is used in the SETATTR
request, and a Netapp filer will justifiably claim
NFS4ERR_BAD_STATEID, which the Linux client takes as a sign
to retry - indefinitely.
So only treat O_EXCL specially if O_CREAT was also given.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit fe2b592173ff0274e70dc44d1d28c19bb995aa7c upstream.
wf_unregister_client() increments the client count when a client
unregisters. That is obviously incorrect. Decrement that client count
instead.
Fixes: 75722d3992 ("[PATCH] ppc64: Thermal control for SMU based machines")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 64526370d11ce8868ca495723d595b61e8697fbf upstream.
Currently, devres_get() passes devres_free() the pointer to devres,
but devres_free() should be given with the pointer to resource data.
Fixes: 9ac7849e35 ("devres: device resource management")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit bab383de3b84e584b0f09227151020b2a43dc34c upstream.
parport_find_base() will implicitly do parport_get_port() which
increases the refcount. Then parport_register_device() will again
increment the refcount. But while unloading the module we are only
doing parport_unregister_device() decrementing the refcount only once.
We add an parport_put_port() to neutralize the effect of
parport_get_port().
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 199dc6ed5179251fa6158a461499c24bdd99c836 upstream.
When a (e.g.) RAID5 array is reshaped to RAID0, the updating
of queue parameters (e.g. max number of sectors per bio) is
done in the wrong place.
It should be part of ->run, but it is actually part of ->takeover.
This means it happens before level_store() calls:
blk_set_stacking_limits(&mddev->queue->limits);
and so it ineffective. This can lead to errors from underlying
devices.
So move all the relevant settings out of create_stripe_zones()
and into raid0_run().
As this can lead to a bug-on it is suitable for any -stable
kernel which supports reshape to RAID0. So 2.6.35 or later.
As the bug has been present for five years there is no urgency,
so no need to rush into -stable.
Fixes: 9af204cf72 ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover")
Reported-by: Yi Zhang <yizhan@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
[lizf: Backported to 3.4:
- adjust context
- remove changes to discard and write-same features]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 9d11b51ce7c150a69e761e30518f294fc73d55ff upstream.
The Linux NFS server returns garbage in the data payload of inline
NFS/RDMA READ replies. These are READs of under 1000 bytes or so
where the client has not provided either a reply chunk or a write
list.
The NFS server delivers the data payload for an NFS READ reply to
the transport in an xdr_buf page list. If the NFS client did not
provide a reply chunk or a write list, send_reply() is supposed to
set up a separate sge for the page containing the READ data, and
another sge for XDR padding if needed, then post all of the sges via
a single SEND Work Request.
The problem is send_reply() does not advance through the xdr_buf
when setting up scatter/gather entries for SEND WR. It always calls
dma_map_xdr with xdr_off set to zero. When there's more than one
sge, dma_map_xdr() sets up the SEND sge's so they all point to the
xdr_buf's head.
The current Linux NFS/RDMA client always provides a reply chunk or
a write list when performing an NFS READ over RDMA. Therefore, it
does not exercise this particular case. The Linux server has never
had to use more than one extra sge for building RPC/RDMA replies
with a Linux client.
However, an NFS/RDMA client _is_ allowed to send small NFS READs
without setting up a write list or reply chunk. The NFS READ reply
fits entirely within the inline reply buffer in this case. This is
perhaps a more efficient way of performing NFS READs that the Linux
NFS/RDMA client may some day adopt.
Fixes: b432e6b3d9 ('svcrdma: Change DMA mapping logic to . . .')
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=285
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 1c2cb594441d02815d304cccec9742ff5c707495 upstream.
The EPOW interrupt handler uses rtas_get_sensor(), which in turn
uses rtas_busy_delay() to wait for RTAS becoming ready in case it
is necessary. But rtas_busy_delay() is annotated with might_sleep()
and thus may not be used by interrupts handlers like the EPOW handler!
This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is
enabled:
BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6
Call Trace:
[c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable)
[c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180
[c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0
[c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0
[c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450
[c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300
[c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0
[c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260
[c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80
[c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200
[c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24
[c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110
[c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180
Fix this issue by introducing a new rtas_get_sensor_fast() function
that does not use rtas_busy_delay() - and thus can only be used for
sensors that do not cause a BUSY condition - known as "fast" sensors.
The EPOW sensor is defined to be "fast" in sPAPR - mpe.
Fixes: 587f83e8dd ("powerpc/pseries: Use rtas_get_sensor in RAS code")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 7aa6ca4d39edf01f997b9e02cf6d2fdeb224f351 upstream.
Set the PCI_DEV_FLAGS_VPD_REF_F0 flag on all Intel Ethernet device
functions other than function 0, so that on multi-function devices, we will
always read VPD from function 0 instead of from the other functions.
[bhelgaas: changelog]
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
[bwh: Backported to 3.2:
- Put the class check in the new function as there is no
DECLARE_PCI_FIXUP_CLASS_EARLY(
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 932c435caba8a2ce473a91753bad0173269ef334 upstream.
Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
function 0 to provide VPD access on other functions. This is for hardware
devices that provide copies of the same VPD capability registers in
multiple functions. Because the kernel expects that each function has its
own registers, both the locking and the state tracking are affected by VPD
accesses to different functions.
On such devices for example, if a VPD write is performed on function 0,
*any* later attempt to read VPD from any other function of that device will
hang. This has to do with how the kernel tracks the expected value of the
F bit per function.
Concurrent accesses to different functions of the same device can not only
hang but also corrupt both read and write VPD data.
When hangs occur, typically the error message:
vpd r/w failed. This is likely a firmware bug on this device.
will be seen.
Never set this bit on function 0 or there will be an infinite recursion.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 3633ebebab2bbe88124388b7620442315c968e8f upstream.
We already set a station to be associated when peering completes, both
in user space and in the kernel. Thus we should always have an
associated sta before sending data frames to that station.
Failure to check assoc state can cause crashes in the lower-level driver
due to transmitting unicast data frames before driver sta structures
(e.g. ampdu state in ath9k) are initialized. This occurred when
forwarding in the presence of fixed mesh paths: frames were transmitted
to stations with whom we hadn't yet completed peering.
Reported-by: Alexis Green <agreen@cococorp.com>
Tested-by: Jesse Jones <jjones@cococorp.com>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit d1541dc977d376406f4584d8eb055488655c98ec upstream.
In fixup_ti816x_class(), we assigned "class = PCI_CLASS_MULTIMEDIA_VIDEO".
But PCI_CLASS_MULTIMEDIA_VIDEO is only the two-byte base class/sub-class
and needs to be shifted to make space for the low-order interface byte.
Shift PCI_CLASS_MULTIMEDIA_VIDEO to set the correct class code.
Fixes: 63c4408074 ("PCI: Add quirk for setting valid class for TI816X Endpoint")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Hemant Pedanekar <hemantp@ti.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit a66b0c41ad277ae62a3ae6ac430a71882f899557 upstream.
The input_dev is already gone when the rc device is being unregistered
so checking for its presence only means that no remove uevent will be
generated.
Signed-off-by: David Härdeman <david@hardeman.nu>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 7cae2bedcbd4680b155999655e49c27b9cf020fa upstream.
As reported at https://bugs.launchpad.net/qemu/+bug/1494350,
it is possible to have vcpu->arch.st.last_steal initialized
from a thread other than vcpu thread, say the iothread, via
KVM_SET_MSRS.
Which can cause an overflow later (when subtracting from vcpu threads
sched_info.run_delay).
To avoid that, move steal time accumulation to vcpu entry time,
before copying steal time data to guest.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 2871c69e025e8bc507651d5a9cf81a8a7da9d24b upstream.
Commit 4c7e309340ff ("dm btree remove: fix bug in redistribute3") wasn't
a complete fix for redistribute3().
The redistribute3 function takes 3 btree nodes and shares out the entries
evenly between them. If the three nodes in total contained
(MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting
rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in
the center.
Fix this issue by being more careful about calculating the target number
of entries for the left and right nodes.
Unit tested in userspace using this program:
https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.c
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit c450960187f45d4260db87c7dd4fc0bceb5565d8 upstream.
The assignement of EP transfer resources was not handled properly in the
dwc3 driver. Commit aebda6187181 ("usb: dwc3: Reset the transfer
resource index on SET_INTERFACE") previously fixed one aspect of this
where resources may be exhausted with multiple calls to SET_INTERFACE.
However, it introduced an issue where composite devices with multiple
interfaces can be assigned the same transfer resources for different
endpoints. This patch solves both issues.
The assignment of transfer resources cannot perfectly follow the data
book due to the fact that the controller driver does not have all
knowledge of the configuration in advance. It is given this information
piecemeal by the composite gadget framework after every
SET_CONFIGURATION and SET_INTERFACE. Trying to follow the databook
programming model in this scenario can cause errors. For two reasons:
1) The databook says to do DEPSTARTCFG for every SET_CONFIGURATION and
SET_INTERFACE (8.1.5). This is incorrect in the scenario of multiple
interfaces.
2) The databook does not mention doing more DEPXFERCFG for new endpoint
on alt setting (8.1.6).
The following simplified method is used instead:
All hardware endpoints can be assigned a transfer resource and this
setting will stay persistent until either a core reset or hibernation.
So whenever we do a DEPSTARTCFG(0) we can go ahead and do DEPXFERCFG for
every hardware endpoint as well. We are guaranteed that there are as
many transfer resources as endpoints.
This patch triggers off of the calling dwc3_gadget_start_config() for
EP0-out, which always happens first, and which should only happen in one
of the above conditions.
Fixes: aebda6187181 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE")
Reported-by: Ravi Babu <ravibabu@ti.com>
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@kernel.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 42e3121d90f42e57f6dbd6083dff2f57b3ec7daa upstream.
AudioQuest DragonFly DAC reports a volume control range of 0..50
(0x0000..0x0032) which in USB Audio means a range of 0 .. 0.2dB, which
is obviously incorrect and would cause software using the dB information
in e.g. volume sliders to have a massive volume difference in 100..102%
range.
Commit 2d1cb7f658fb ("ALSA: usb-audio: add dB range mapping for some
devices") added a dB range mapping for it with range 0..50 dB.
However, the actual volume mapping seems to be neither linear volume nor
linear dB scale, but instead quite close to the cubic mapping e.g.
alsamixer uses, with a range of approx. -53...0 dB.
Replace the previous quirk with a custom dB mapping based on some basic
output measurements, using a 10-item range TLV (which will still fit in
alsa-lib MAX_TLV_RANGE_SIZE).
Tested on AudioQuest DragonFly HW v1.2. The quirk is only applied if the
range is 0..50, so if this gets fixed/changed in later HW revisions it
will no longer be applied.
v2: incorporated Takashi Iwai's suggestion for the quirk application
method
Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[lizf: Backoported to 3.4: use dev_info() instead of usb_audio_info()]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit bf1d1c9b61 upstream.
Add a DECLARE_TLV_DB_RANGE() macro so that dB range information
can be specified without having to count the items manually for
TLV_DB_RANGE_HEAD().
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit b5b9eb5467 upstream.
Add helper macros with a little bit of preprocessor magic to
automatically compute the length of a TLV item. This lets us avoid
having to compute this by hand, and will allow to use items that do
not use a fixed length.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 0d430e3fb3f7cdc13c0d22078b820f682821b45a upstream.
This was meant to print base address and entry count; make it do so
again.
Fixes: 37868fe113ff "x86/ldt: Make modify_ldt synchronous"
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/56797D8402000078000C24F0@prv-mh.provo.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit a5527dda344fff0514b7989ef7a755729769daa1 upstream.
The unix_dgram_sendmsg routine use the following test
if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
to determine if sk and other are in an n:1 association (either
established via connect or by using sendto to send messages to an
unrelated socket identified by address). This isn't correct as the
specified address could have been bound to the sending socket itself or
because this socket could have been connected to itself by the time of
the unix_peer_get but disconnected before the unix_state_lock(other). In
both cases, the if-block would be entered despite other == sk which
might either block the sender unintentionally or lead to trying to unlock
the same spin lock twice for a non-blocking send. Add a other != sk
check to guard against this.
Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue")
Reported-By: Philipp Hahn <pmhahn@pmhahn.de>
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Tested-by: Philipp Hahn <pmhahn@pmhahn.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 7bbadd2d1009575dad675afc16650ebb5aa10612 upstream.
Docbook does not like the definition of macros inside a field declaration
and adds a warning. Move the definition out.
Fixes: 79462ad02e86180 ("net: add validation for the socket syscall protocol argument")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 62572e29bc upstream.
I ran into a scenario where while one cpu was stuck and should have
panic'd because of the NMI watchdog, it didn't. The reason was another
cpu was spewing stack dumps on to the console. Upon investigation, I
noticed that when writing to the console and also when dumping the
stack, the watchdog is touched.
This causes all the cpus to reset their NMI watchdog flags and the
'stuck' cpu just spins forever.
This change causes the semantics of touch_nmi_watchdog to be changed
slightly. Previously, I accidentally changed the semantics and we
noticed there was a codepath in which touch_nmi_watchdog could be
touched from a preemtible area. That caused a BUG() to happen when
CONFIG_DEBUG_PREEMPT was enabled. I believe it was the acpi code.
My attempt here re-introduces the change to have the
touch_nmi_watchdog() code only touch the local cpu instead of all of the
cpus. But instead of using __get_cpu_var(), I use the
__raw_get_cpu_var() version.
This avoids the preemption problem. However my reasoning wasn't because
I was trying to be lazy. Instead I rationalized it as, well if
preemption is enabled then interrupts should be enabled to and the NMI
watchdog will have no reason to trigger. So it won't matter if the
wrong cpu is touched because the percpu interrupt counters the NMI
watchdog uses should still be incrementing.
Don said:
: I'm ok with this patch, though it does alter the behaviour of how
: touch_nmi_watchdog works. For the most part I don't think most callers
: need to touch all of the watchdogs (on each cpu). Perhaps a corner case
: will pop up (the scheduler?? to mimic touch_all_softlockup_watchdogs() ).
:
: But this does address an issue where if a system is locked up and one cpu
: is spewing out useful debug messages (or error messages), the hard lockup
: will fail to go off. We have seen this on RHEL also.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Ben Zhang <benzh@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
commit 2ac3ac8f86 upstream.
On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.
This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.
Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>