Some callers of request_firmware_direct may need additional
context to be able to map firmware memory. Allow private data
to be passed in with request_firmware_direct, and send this
data along with the [unmap|map]_fw_mem callbacks.
Change-Id: I05a15eb46cc663a4476b784e30e80182a28e10c3
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
dma-removed alloc hook tries to ioremap a carve-out in order
to memset it to 0s. Warn the user if it fails due to low
vmalloc space.
CRs-fixed: 652009
Change-Id: If8d1a220c8e604222a44c5bf5201fc2d03e370b8
Signed-off-by: Shiraz Hashim <shashim@codeaurora.org>
If a free CMA region cannot be found because every one is busy,
an error needs to be propegated up by returning a zero pfn.
Ensure the pfn is actually zero when returning an error.
Change-Id: I0d5a66a25c4483bf0f219cec1d7009239518f27a
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Commit 42e668f cma: Delay non-placed memblocks until after all allocations
delayed calling memblock_alloc until later but didn't move the
fixup. Make sure to call the fixup after allocation of the
memblock.
Change-Id: I305c05f1b1e0b6aeb462746e91b30560ddf5b934
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
When CMA allocator gets error return from the page
allocator framework, except for the -EBUSY case, it
will bail out. Caller depends on the 'pfn' for
confirming allocation successful or not. Return 0
for those error case.
Change-Id: Ica4e04a9f9f18b1a29035ba2bae9deecfd68a5e8
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
CMA is now responsible for almost all memory reservation/removal.
Some regions are at fixed locations, some are placed dynamically.
We need to place all fixed regions first before trying to place
dynamic regions to avoid overlap. Additionally, allow an
architectural callback after all removals/fixed location has
happened to potentially update any relevant limits.
Change-Id: Iaaffe60445ef44d432f0d87875ce2b292b717cc7
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
The lock that was locked was cma->lock not cma_mutex. Drop
the right one when breaking out of the loop.
Change-Id: I0a1831b23613c5220795fe2a63f9db7439268c3f
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
The following commits have been reverted from this merge, as they are
known to introduce new bugs and are currently incompatible with our
audio implementation. Investigation of these commits is ongoing, and
they are expected to be brought in at a later time:
86e6de7 ALSA: compress: fix drain calls blocking other compress functions (v6)
16442d4 ALSA: compress: fix drain calls blocking other compress functions
This merge commit also includes a change in block, necessary for
compilation. Upstream has modified elevator_init_fn to prevent race
conditions, requring updates to row_init_queue and test_init_queue.
* commit 'v3.10.28': (1964 commits)
Linux 3.10.28
ARM: 7938/1: OMAP4/highbank: Flush L2 cache before disabling
drm/i915: Don't grab crtc mutexes in intel_modeset_gem_init()
serial: amba-pl011: use port lock to guard control register access
mm: Make {,set}page_address() static inline if WANT_PAGE_VIRTUAL
md/raid5: Fix possible confusion when multiple write errors occur.
md/raid10: fix two bugs in handling of known-bad-blocks.
md/raid10: fix bug when raid10 recovery fails to recover a block.
md: fix problem when adding device to read-only array with bitmap.
drm/i915: fix DDI PLLs HW state readout code
nilfs2: fix segctor bug that causes file system corruption
thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only
ftrace/x86: Load ftrace_ops in parameter not the variable holding it
SELinux: Fix possible NULL pointer dereference in selinux_inode_permission()
writeback: Fix data corruption on NFS
hwmon: (coretemp) Fix truncated name of alarm attributes
vfs: In d_path don't call d_dname on a mount point
staging: comedi: adl_pci9111: fix incorrect irq passed to request_irq()
staging: comedi: addi_apci_1032: fix subdevice type/flags bug
mm/memory-failure.c: recheck PageHuge() after hugetlb page migrate successfully
GFS2: Increase i_writecount during gfs2_setattr_chown
perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h
perf scripting perl: Fix build error on Fedora 12
ARM: 7815/1: kexec: offline non panic CPUs on Kdump panic
Linux 3.10.27
sched: Guarantee new group-entities always have weight
sched: Fix hrtimer_cancel()/rq->lock deadlock
sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
sched: Fix race on toggling cfs_bandwidth_used
x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround
netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper
SCSI: sd: Reduce buffer size for vpd request
intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.
mac80211: move "bufferable MMPDU" check to fix AP mode scan
ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS
ACPI / TPM: fix memory leak when walking ACPI namespace
mfd: rtsx_pcr: Disable interrupts before cancelling delayed works
clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks
clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock
clk: samsung: exynos4: Correct SRC_MFC register
clk: clk-divider: fix divisor > 255 bug
ahci: add PCI ID for Marvell 88SE9170 SATA controller
parisc: Ensure full cache coherency for kmap/kunmap
drm/nouveau/bios: make jump conditional
ARM: shmobile: mackerel: Fix coherent DMA mask
ARM: shmobile: armadillo: Fix coherent DMA mask
ARM: shmobile: kzm9g: Fix coherent DMA mask
ARM: dts: exynos5250: Fix MDMA0 clock number
ARM: fix "bad mode in ... handler" message for undefined instructions
ARM: fix footbridge clockevent device
net: Loosen constraints for recalculating checksum in skb_segment()
bridge: use spin_lock_bh() in br_multicast_set_hash_max
netpoll: Fix missing TXQ unlock and and OOPS.
net: llc: fix use after free in llc_ui_recvmsg
virtio-net: fix refill races during restore
virtio_net: don't leak memory or block when too many frags
virtio-net: make all RX paths handle errors consistently
virtio_net: fix error handling for mergeable buffers
vlan: Fix header ops passthru when doing TX VLAN offload.
net: rose: restore old recvmsg behavior
rds: prevent dereference of a NULL device
ipv6: always set the new created dst's from in ip6_rt_copy
net: fec: fix potential use after free
hamradio/yam: fix info leak in ioctl
drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl()
net: inet_diag: zero out uninitialized idiag_{src,dst} fields
ip_gre: fix msg_name parsing for recvfrom/recvmsg
net: unix: allow bind to fail on mutex lock
ipv6: fix illegal mac_header comparison on 32bit
netvsc: don't flush peers notifying work during setting mtu
tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0
net: unix: allow set_peek_off to fail
net: drop_monitor: fix the value of maxattr
ipv6: don't count addrconf generated routes against gc limit
packet: fix send path when running with proto == 0
virtio: delete napi structures from netdev before releasing memory
macvtap: signal truncated packets
tun: update file current position
macvtap: update file current position
macvtap: Do not double-count received packets
rds: prevent BUG_ON triggered on congestion update to loopback
net: do not pretend FRAGLIST support
IPv6: Fixed support for blackhole and prohibit routes
HID: Revert "Revert "HID: Fix logitech-dj: missing Unifying device issue""
gpio-rcar: R-Car GPIO IRQ share interrupt
clocksource: em_sti: Set cpu_possible_mask to fix SMP broadcast
irqchip: renesas-irqc: Fix irqc_probe error handling
Linux 3.10.26
sh: add EXPORT_SYMBOL(min_low_pfn) and EXPORT_SYMBOL(max_low_pfn) to sh_ksyms_32.c
ext4: fix bigalloc regression
arm64: Use Normal NonCacheable memory for writecombine
arm64: Do not flush the D-cache for anonymous pages
arm64: Avoid cache flushing in flush_dcache_page()
ARM: KVM: arch_timers: zero CNTVOFF upon return to host
ARM: hyp: initialize CNTVOFF to zero
clocksource: arch_timer: use virtual counters
arm64: Remove unused cpu_name ascii in arch/arm64/mm/proc.S
arm64: dts: Reserve the memory used for secondary CPU release address
arm64: check for number of arguments in syscall_get/set_arguments()
arm64: fix possible invalid FPSIMD initialization state
...
Change-Id: Ia0e5d71b536ab49ec3a1179d59238c05bdd03106
Signed-off-by: Ian Maund <imaund@codeaurora.org>
CMA locking is currently very coarse. The cma_mutex protects both
the bitmap and avoids concurrency with alloc_contig_range. There
are several situations which may result in a deadlock on the CMA
mutex currently, mostly involving AB/BA situations with alloc and
free. Fix this issue by protecting the bitmap with a mutex per CMA
region and use the existing mutex for protecting against concurrency
with alloc_contig_range.
Change-Id: I6863ba7ab7fae07c68fee23b0aa4c869244fe2a1
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Due to a bug in code logic, the default CMA region is
reserved when CONFIG_CMA_RESERVE_DEFAULT_AREA is NOT set
instead of when it is set. Fix this logic.
Change-Id: I476594c1e3745386741f2aba1b978a436b60c7a4
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
The total number of CMA regions that can be created is statically
decided at compile time via a config option. Incoming memory map
changes increase the number of CMA regions significantly across
almost all targets. Rather than make every target set the CONFIG,
just increase the default number. The additional overhead is very
small.
Change-Id: I2f60ee56b43c256f3bfa4d1e3ab95a8f378568ab
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
CMA requires that regions be aligned to page block order or higher
to ensure migration types can be changed for whole regions. This
is not applicable if the memory is removed from the system completely.
Keep the alignment requirement at PAGE_SIZE if the memory is outside
the system allocator. Note that if there are alignment requirements
these can still be set up by manually aligning the base/size.
Change-Id: I316008095469492c150e8b69bb20b369579e3a36
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
CMA scans the flattened devicetree to reserve memory early. Check
for the status of a DT node to possibly skip initialization.
Change-Id: I58760ee5ff241a1ce3a93955b1e507d506f92162
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
A report was given of a deadlock situation on the cma_mutex:
mutex_lock(cma_mutex) (*2)
dma_release_from_contiguous
ion_secure_cma_free_chunk
ion_secure_cma_shrinker
shrink_slab
try_to_free_pages
migrate_pages
alloc_contig_range
mutex_lock(cma_mutex) (*1)
dma_alloc_from_contiguous
ion_alloc
We may need to free CMA allocations while a current CMA allocation is in
progress if CMA is freed from a shrinker. cma_mutex currently protects
two things: the bitmap indicating which pages are allocated/free and
serialization of isolation/migration calls on allocation. There is
no need to take the mutex on free calls though as the pages are
freed back into the system via the regular __free_page call. Move
the free_contig_range call outside the cma_mutex to break the
chain dependency. We can safely free the pages back into the system
before changing the bitmap without any risk of races.
Change-Id: I4989eb2891e502b08db8117a51bd86652e902778
CRs-Fixed: 619644
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
The put_device(dev) at the bottom of the loop of device_shutdown
may result in the dev being cleaned up. In device_create_release,
the dev is kfreed.
However, device_shutdown attempts to use the dev pointer again after
put_device by referring to dev->parent.
Copy the parent pointer instead to avoid this condition.
This bug was found on Chromium OS's chromeos-3.8, which is based on v3.8.11.
See bug report : https://code.google.com/p/chromium/issues/detail?id=297842
This can easily be reproduced when shutting down with
hidraw devices that report battery condition.
Two examples are the HP Bluetooth Mouse X4000b and the Apple Magic Mouse.
For example, with the magic mouse :
The dev in question is "hidraw0"
dev->parent is "magicmouse"
In the course of the shutdown for this device, the input event cleanup calls
a put on hidraw0, decrementing its reference count.
When we finally get to put_device(dev) in device_shutdown, kobject_cleanup
is called and device_create_release does kfree(dev).
dev->parent is no longer valid, and we may crash in
put_device(dev->parent).
This change should be applied on any kernel with this change :
d1c6c030fc
Change-Id: Ib8c7dbce155558aa1087349130d5d1b58c15540f
Cc: stable@vger.kernel.org
Signed-off-by: Benson Leung <bleung@chromium.org>
Reviewed-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Git-commit: f123db8e9d6c84c863cb3c44d17e61995dc984fb
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Signed-off-by: Osvaldo Banuelos <osvaldob@codeaurora.org>
Drivers that are registered at an initcall level may have to
wait until late_init before the probe deferral mechanism can
retry their probe functions. It is possible that their
dependencies were resolved much earlier, in some cases even
before the next initcall level. Invoke one probe retry cycle
at every _sync initcall level, allowing these drivers to be
probed earlier.
A real world example of how this change helps follows. On the
MSM8974, there are 3 devices that need to be probed in order
for the display driver to be able to probe and bring up a
display panel. These are the gdsc_mdss (a regulator device), the
mmsscc-dsi device (a display clock controller), and the
dsipllcc device (a PLL controller). Here is a kernel log that
shows these devices probing in the wrong order:
[0.503253] mmsscc-dsi fd8c0000.qcom,mmsscc-dsi: Failed to get pixel source. -- [1]
[0.505210] dsipllcc fd8c0000.qcom,dsipllcc: Failed to get MDSS GDSC -- [2]
[0.523264] gdsc_mdss: no parameters -- [3]
Only gdsc_mdss successfully probed at 0.52 seconds. Now without
_this_ change, the current probe deferral mechanism results in
the devices probing at or after late_init:
[9.196006] dsipllcc fd8c0000.qcom,dsipllcc: Registered DSI PLL clocks. -- [2]
[9.357440] mmsscc-dsi fd8c0000.qcom,mmsscc-dsi: Registered MMSSCC DSI clocks. -- [1]
Thus the display can only be brought up after 9.35 seconds. However,
by allowing a probe retry after each initcall level, this number
reduces drastically:
[0.608252] dsipllcc fd8c0000.qcom,dsipllcc: Registered DSI PLL clocks. -- [2]
[0.613758] mmsscc-dsi fd8c0000.qcom,mmsscc-dsi: Registered MMSSCC DSI clocks.-- [1]
Thus the display can be brought up just after 0.61 seconds.
Change-Id: I83d8ac89e591e89e27934c0402449437b61b2124
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
In addition to reserving memory from the system, there may
be uses where memory should be completely removed from control
of the linux page allocator. Add the appropriate information to
be able to remove the memory.
Change-Id: Ia2e959e0858fb240ab5c0deee49b0d6e4aecfc00
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
The current set of dma_ops only supports memory that's actually
backed by struct pages. In the general case this is what we want,
but there are some use cases where it's useful to allocate
removed memory through dma APIs (through CMA region for example).
Add a limited set of dma operations to accomplish this. This is
only the allocate/free version of the APIs which is all that is
necessary for these purposes.
Change-Id: I518e6965d63f3c95697f59c089b95219f8e6d96e
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Currently, devices are associated with CMA regions via phandle to the
CMA region. The phandle name 'linux,contiguous-region' is the same as
that which is used to designate a CMA region. This can lead to issues
of trying to treat a device only using a CMA region as an actual CMA
region. Rather than continuing to rely on this handle and the depth
to differentiate CMA regions, just create separate DT binding
linux,reserve-contiguous-region to indicate that the node is
an actual CMA node for reserving rather than a client referencing
via phandle.
Change-Id: I88b2d86054525b0569efc424da87974509ce9b25
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
The CMA code is generic enough that it can be expanded out to track
regions of memory that aren't officially managed by the regular page
allocator. This memory can't be referenced via struct page. Change the
CMA apis to track using pfn for allocation/free instead. The pfn can
be converted to a struct page as needed elsewhere.
Change-Id: I5ac3fa5e2169b2101a738177f1654faa401f7604
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Since Operating Performance Points (OPP) functions are specific
to device specific power management, be specific and rename opp.h
to pm_opp.h
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: e4db1c7439b31993a4886b273bb9235a8eea82bf
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
Since Operating Performance Points (OPP) functions are specific to
device specific power management, be specific and rename opp_*
accessors in OPP library with dev_pm_opp_* equivalent.
Affected functions are:
opp_get_voltage
opp_get_freq
opp_get_opp_count
opp_find_freq_exact
opp_find_freq_floor
opp_find_freq_ceil
opp_add
opp_enable
opp_disable
opp_get_notifier
opp_init_cpufreq_table
opp_free_cpufreq_table
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5d4879cda67b09f086807821cf594ee079d6dfbe
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: resolve trivial merge conflicts]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
The "index" field of struct cpufreq_frequency_table was never an
index and isn't used at all by the cpufreq core. It only is useful
for cpufreq drivers for their internal purposes.
Many people nowadays blindly set it in ascending order with the
assumption that the core will use it, which is a mistake.
Rename it to "driver_data" as that's what its purpose is. All of its
users are updated accordingly.
[rjw: Changelog]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Git-commit: 5070158804b5339c71809f5e673cea1cfacd804d
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[junjiew@codeaurora.org: update non-upstream files]
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
The value returned by of_get_flat_dt_prop is a pointer. Use
be32_to_cpup which properly type checks against pointers.
Change-Id: Ie75ac6776bd26537a28729902bd918e710eda512
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
commit baab52ded242c35a2290e1fa82e0cc147d0d8c1a upstream.
Commit fa180eb448 (PM / Runtime: Idle devices asynchronously after
probe|release) modified __device_release_driver() to call
pm_runtime_put(dev) instead of pm_runtime_put_sync(dev) before
detaching the driver from the device. However, that was a mistake,
because pm_runtime_put(dev) causes rpm_idle() to be queued up and
the driver may be gone already when that function is executed.
That breaks the assumptions the drivers have the right to make
about the core's behavior on the basis of the existing documentation
and actually causes problems to happen, so revert that part of
commit fa180eb448 and restore the previous behavior of
__device_release_driver().
Reported-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Fixes: fa180eb448 (PM / Runtime: Idle devices asynchronously after probe|release)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f123db8e9d6c84c863cb3c44d17e61995dc984fb upstream.
The put_device(dev) at the bottom of the loop of device_shutdown
may result in the dev being cleaned up. In device_create_release,
the dev is kfreed.
However, device_shutdown attempts to use the dev pointer again after
put_device by referring to dev->parent.
Copy the parent pointer instead to avoid this condition.
This bug was found on Chromium OS's chromeos-3.8, which is based on v3.8.11.
See bug report : https://code.google.com/p/chromium/issues/detail?id=297842
This can easily be reproduced when shutting down with
hidraw devices that report battery condition.
Two examples are the HP Bluetooth Mouse X4000b and the Apple Magic Mouse.
For example, with the magic mouse :
The dev in question is "hidraw0"
dev->parent is "magicmouse"
In the course of the shutdown for this device, the input event cleanup calls
a put on hidraw0, decrementing its reference count.
When we finally get to put_device(dev) in device_shutdown, kobject_cleanup
is called and device_create_release does kfree(dev).
dev->parent is no longer valid, and we may crash in
put_device(dev->parent).
This change should be applied on any kernel with this change :
d1c6c030fc
Signed-off-by: Benson Leung <bleung@chromium.org>
Reviewed-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Physical addresses can be greater than an unsigned long on
systems with LPAE. Print these addresses properly with %pa
Change-Id: If9f1572bb9dadcb14f2834647a4421509e90096a
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Currently, when dynamically placing regions CMA will allow the memory
to be placed anywhere, including highmem. Due to system restrictions,
regions may need to be placed in a smaller range. Add support to
devicetree to allow these regions to have an upper bound on where they
will be placed.
Change-Id: Ib4ae194cbb6389e1091e7e04cfd331e9ab67ad05
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Currently, the CMA flat device tree code does not take into account
targets that may specify size_cells and address_cells > 1. This will
lead to unsuccessful parsing. Add support for taking into account
nodes that may specify size_cells and address_cells explicitly.
Change-Id: I9ea63b8c34e903b186c29ec6555dd7a5c317c602
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
commit 4e67fb5f5e336250db944921e3c68057d6203034 upstream.
Avoid overlapping register regions by making the initial blklen of a new
node 1. If a register write occurs to a yet uncached register, that is
lower than but near an existing node's base_reg, a new node is created
and it's blklen is set to an arbitrary value (sizeof(*rbnode)). That may
cause this node to overlap with another node. Those nodes should be merged,
but this merge doesn't happen yet, so this patch at least makes the initial
blklen small enough to avoid hitting the wrong node, which may otherwise
lead to severe breakage.
Signed-off-by: David Jander <david@protonic.nl>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Zhouping Liu <zliu@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When setting CMA up at a fixed region, it is possible for the address
to be shared between multiple subsystems. If the address is placed
dynamically, there is no mechanism to be able to get the start address
of the region. Drivers may wish to do keep track of allocations relative
to the start address so add an API to get the start region of a CMA
address.
Change-Id: If0730c64496c876d3143064d767b22b984c6dc84
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
CMA provides good utilization of memory but for some use cases, the
allocation time is too costly. Add a Kconfig option to allow the
default region to be permanently set aside for contiguous use cases.
Change-Id: I1eef508d37cf6ae3b7b7a652fc59391b186fc122
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Despite all the performance optimizations, some clients are
still unable to use CMA because of the allocation latency.
Rather than make those clients use a separate set of APIs,
extend the CMA code to allow clients to keep the memory out of
the buddy allocator. Since the pages never actually go to the
buddy allocator, allocation and freeing is only based on the bitmap
allocator to find the appropriate region.
Change-Id: Ia31bb1212fd7b19280361128453c8d25369ce592
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
This reverts commit 74ab7c1e198200036a84acd88c28ee9a441e5167.
Appropriate calls have now been made to flush outstanding highmem
mappings. Place the CMA region in highmem again.
Change-Id: I66baa055ed31074f2f21c77d46e3a5fc906e9683
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
This reverts commit 6308fd5399ae93e35ac381825a279996264a5f78.
As an optimization for kmap/kunmap, the page table entries may
not be completely removed on kunmap. This causes a problem
when memory is XPU protected and the CPU speculates into the
virtual address map. There are functions to remove zero
count entries for kmap calls but kmap_atomic is handled separately.
Remove CMA from highmem until an effective strategy for removing
these entries is found.
Change-Id: Ic60a34401244182442fc6d11dd0cdf986be7a335
CRs-Fixed: 467508
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
MEMBLOCK_ALLOC_ANYWHERE allocates blocks from anywhere,
including highmem. Use this flag to allow CMA regions to
be placed in highmem as opposed to lowmem.
Change-Id: Id5fa36a96e46d60f0e898d764a1f4c8a0a37f5f8
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
CMA currently restricts region names to 'region@x'. Devicetree
does not support the same value of x to be used multiple times.
This means that the devicetree cannot have multiple regions
be dynamically placed (x = 0). Remove the naming restriction
for CMA regions.
Change-Id: If647f8d7e6323497952431ae5b8cae05ba17af50
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Currently, the devicetree lookup code assumes that all
CMA regions are present at a fixed address and uses the
fixed address for associating CMA regions to devices.
This presents a problem for dynamically assigning regions.
Device names get mangled/changed between flattened and
populated devicetree so relying on that is unworkable.
Add a separate name binding to allow lookup later between
devices.
Change-Id: Iaacd9888ea708d7293f1120e2b8c473c5c601f3d
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
The correct binding for regions is linux,contiguous-regions.
Fix it.
Change-Id: I4bbb4cd3e880c75d917b5a5a081861b3197adfa3
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Several of the CMA data structures are used after initialization
remove the __init annotations from them.
Change-Id: Iff48ed88eef7b8fffdfba4b868cc69ded3c6df42
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Add device tree support for contiguous memory regions defined in device
tree. Initialization is done in 2 steps. First, the contiguous memory is
reserved, what happens very early, when only flattened device tree is
available. Then on device initialization the corresponding cma regions are
assigned to device structures.
Change-Id: Ic242499b64875ee57a346d7cbc8a34ebd64e68d2
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
This patch cleans the initialization of dma contiguous framework. The
all-in-one dma_declare_contiguous() function is now separated into
dma_contiguous_reserve_area() which only steals the the memory from
memblock allocator and dma_contiguous_add_device() function, which
assigns given device to the specified reserved memory area. This improves
the flexibility in defining contiguous memory areas and assigning device
to them, because now it is possible to assign more than one device to
the given contiguous memory area. This split in initialization is also
required for upcoming device tree support.
Change-Id: Ibddd1c9abc6550ee62b09645e7a3355256838bfe
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Add ftrace events for ion allocations to make it easier to profile
their performance.
Change-Id: I9f32e076cd50d7d3a145353dfcef74f0f6cdf8a0
Signed-off-by: Liam Mark <lmark@codeaurora.org>
* qandroid-3.10: (636 commits)
netfilter: xt_qtaguid: Protect iface list access with necessary lock
HID: magicmouse: Fix build warning
USB: gadget: mtp: Fix OUT endpoint request length usage in read
USB: gadget: f_mtp: Fix using tx buffer pointer
msm: Fix race condition in domain lookup
msm: Add null-pointer checks for domains
base: sync: increase size of sync_timeline name
USB: gadget: mtp: Add module parameters for Tx transfer length
msm: iommu: Lock the genpool allocation
gpu: ion: fix page offset in dma_buf_kmap()
gpu: ion: Fix bug in ion_system_heap map_user
gpu: ion: Only map as much of the vma as the user requested
gpu: ion: use vmalloc to allocate page array to map kernel
gpu: ion: Remove dead comments
gpu: ion: Minimize allocation fallback delay
mmc: sd: Set the card removed if card detect fails
gpu: ion: don't fault in individual pages for the CP heap
gpu: ion: do not ask for compound pages in system heap
gpu: ion: Modify the system heap to try to allocate large/huge pages
gpu: ion: Set the dma_address of the sg list at alloc time
...
Conflicts:
arch/arm/Kconfig
arch/arm/include/asm/hardware/cache-l2x0.h
arch/arm/mm/cache-l2x0.c
drivers/mmc/card/block.c
drivers/usb/gadget/udc-core.c
On devices with low memory, using request_firmware on rather
large firmware images results in a memory usage penalty that
might be unaffordable. Introduce a new API that allows the
firmware image to be directly loaded to a destination address
without using any intermediate buffer.
Change-Id: I51b55dd9044ea669e2126a3f908028850bf76325
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
Some low memory systems with complex peripherals cannot
afford to have the relatively large firmware images taking
up valuable memory during suspend and resume. Change the
internal implementation of firmware_class to disallow
caching based on a configurable option. In the near future,
variants of request_firmware will take advantage of this
configurability.
Change-Id: I44be7ce3b308b642fb018086def99fcb800a1109
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
Introduce a firmware descriptor structure that makes it
easier to pass around various configuration options in the
internal implementation of firmware_class.
Change-Id: I5c1da222bccd568fabb26da5baccaa4035331efd
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
commit 2d49b5987561e480bdbd8692b27fc5f49a1e2f0b upstream.
regcache_sync_block_raw_flush() expects the address of the register after last
register that needs to be synced as its parameter. But the last call to
regcache_sync_block_raw_flush() in regcache_sync_block_raw() passes the address
of the last register in the block. This effectively always skips over the last
register in a block, even if it needs to be synced. In order to fix it increase
the address by one register.
The issue was introduced in commit 75a5f89 ("regmap: cache: Write consecutive
registers in a single block write").
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2e055e7c9c6084bfbaa68701e52562acf96419e upstream.
Commit f8bd822cb ("regmap: cache: Factor out block sync") made
regcache_rbtree_sync() call regmap_async_complete(), which in turn does
not check for map->bus before dereferencing it.
This causes a NULL pointer dereference on bus-less maps.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refactor the dpm suspend watchdog code and add watchdogs
on resume too. The dpm wachdog prints the stack trace and
reboots the system if a device takes more than 12 seconds
to suspend or resume.
Change-Id: If00c047a17b80bdc13a8426393c698bc450a7347
Signed-off-by: Benoit Goby <benoit@android.com>
Rather than hard-lock the kernel, dump the suspend thread stack and
BUG() when a driver takes too long to suspend. The timeout is set
to 12 seconds to be longer than the usbhid 10 second timeout.
Exclude from the watchdog the time spent waiting for children that
are resumed asynchronously and time every device, whether or not they
resumed synchronously.
Change-Id: Ifd211c06b104860c2fee6eecfe0d61774aa4508a
Original-author: San Mehat <san@google.com>
Signed-off-by: Benoit Goby <benoit@android.com>
fw_priv->buf is accessed in both request_firmware_load() and
writing to sysfs file of 'loading' context, but not protected
by 'fw_lock' entirely. The patch makes sure that access on
'fw_priv->buf' is protected by the lock.
So fixes the double abort problem reported by nirinA raseliarison:
http://lkml.org/lkml/2013/6/14/188
Reported-and-tested-by: nirinA raseliarison <nirina.raseliarison@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable <stable@vger.kernel.org> # 3.9
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A node starting before the minimum register is no reason to reject it,
since its end could be in range. The check for the end already exists
two lines lower, so we can just remove the incorrect check.
Signed-off-by: Maarten ter Huurne <maarten@treewalker.org>
Signed-off-by: Mark Brown <broonie@linaro.org>
Here are 3 tiny driver core fixes for 3.10-rc2.
A needed symbol export, a change to make it easier to track down
offending sysfs files with incorrect attributes, and a klist bugfix.
All have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iEYEABECAAYFAlGePdAACgkQMUfUDdst+ynX3wCfbodTGeimy2GTnc5psVgXV/x4
bz8AnR6G/JNCw54meAJ5UlYJRj7Dwo09
=MNP/
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg Kroah-Hartman:
"Here are 3 tiny driver core fixes for 3.10-rc2.
A needed symbol export, a change to make it easier to track down
offending sysfs files with incorrect attributes, and a klist bugfix.
All have been in linux-next for a while"
* tag 'driver-core-3.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
klist: del waiter from klist_remove_waiters before wakeup waitting process
driver core: print sysfs attribute name when warning about bogus permissions
driver core: export subsys_virtual_register
The parameter passed to the regmap lock/unlock callbacks needs to be
map->lock_arg, regcache passes just map. This works fine in the case that no
custom locking callbacks are used since in this case map->lock_arg equals map,
but will break when custom locking callbacks are used. The issue was introduced
in commit 0d4529c5("regmap: make lock/unlock functions customizable") and is
fixed by this patch.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Make it obvious to see what attribute is using bogus permissions.
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Modules want to call this function, so it needs to be exported.
Reported-by: Daniel Mack <zonque@gmail.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix dev_pm_put_subsys_data() so that it doesn't call kfree() under
a spinlock and make it return 1 whenever it leaves NULL
power.subsys_data (regardless of the reason).
Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull input updates from Dmitry Torokhov:
"Assorted fixes and cleanups to the existing drivers plus a new driver
for IMS Passenger Control Unit device they use for ther in-flight
entertainment system."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (44 commits)
Input: trackpoint - Optimize trackpoint init to use power-on reset
Input: apbps2 - convert to devm_ioremap_resource()
Input: ALPS - use %ph to print buffers
ARM - shmobile: Armadillo800EVA: Move st1232 reset pin handling
Input: st1232 - add reset pin handling
Input: st1232 - convert to devm_* infrastructure
Input: MT - handle semi-mt devices in core
Input: adxl34x - use spi_get_drvdata()
Input: ad7877 - use spi_get_drvdata() and spi_set_drvdata()
Input: ads7846 - use spi_get_drvdata() and spi_set_drvdata()
Input: ims-pcu - fix a memory leak on error
Input: sysrq - supplement reset sequence with timeout functionality
Input: tegra-kbc - support for defining row/columns based on SoC
Input: imx_keypad - switch to using managed resources
Input: arc_ps2 - add support for device tree
Input: mma8450 - fix signed 12bits to 32bits conversion
Input: eeti_ts - remove redundant null check
Input: edt-ft5x06 - remove redundant null check before kfree
Input: ad714x - add CONFIG_PM_SLEEP to suspend/resume functions
Input: adxl34x - add CONFIG_PM_SLEEP to suspend/resume functions
...
Add debugfs support to make it easier to print debug information
about the dma-buf buffers.
Cc: Dave Airlie <airlied@redhat.com>
[minor fixes on init and warning fix]
Cc: Dan Carpenter <dan.carpenter@oracle.com>
[remove double unlock in fail case]
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
For debugging purposes, it is useful to have a name-string added
while exporting buffers. Hence, dma_buf_export() is replaced with
dma_buf_export_named(), which additionally takes 'exp_name' as a
parameter.
For backward compatibility, and for lazy exporters who don't wish to
name themselves, a #define dma_buf_export() is also made available,
which adds a __FILE__ instead of 'exp_name'.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[Thanks for the idea!]
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
- ARM big.LITTLE cpufreq driver from Viresh Kumar.
- exynos5440 cpufreq driver from Amit Daniel Kachhap.
- cpufreq core cleanup and code consolidation from Viresh Kumar and
Stratos Karafotis.
- cpufreq scalability improvement from Nathan Zimmer.
- AMD "frequency sensitivity feedback" powersave bias for the ondemand
cpufreq governor from Jacob Shin.
- cpuidle code consolidation and cleanups from Daniel Lezcano.
- ARM OMAP cpuidle fixes from Santosh Shilimkar and Daniel Lezcano.
- ACPICA fixes and other improvements from Bob Moore, Jung-uk Kim,
Lv Zheng, Yinghai Lu, Tang Chen, Colin Ian King, and Linn Crosetto.
- ACPI core updates related to hotplug from Toshi Kani, Paul Bolle,
Yasuaki Ishimatsu, and Rafael J. Wysocki.
- Intel Lynxpoint LPSS (Low-Power Subsystem) support improvements
from Rafael J. Wysocki and Andy Shevchenko.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJRf8M8AAoJEKhOf7ml8uNsud4P/3cabXP5lDipzibRrpOiONse
puuvIdhtNdMRMc3t1oSDjNH/w/JA51Gc+ICGFAORiyVmqxBc85mpT6J5ibqV7hNd
pCqbKJceoB5PajHZSx22e4wG9O7YN1k3r80p38IfFzA+Ct0KNSuE0ixMEfHKYjiq
p5pXswk6TY3gtBReH9agrafHqDtXw4IMTE0asMuJ+BorPW7vQeiNlrkuA+0qmDuu
26O0Pm2TVkx1ryfTjdM9zSZ9X2G4JuM8rm1/VFZWQJTExwlv3bA2Za1nvQNJlJ99
6JZ0JXfAehcEW2Ye0sqsZ8HSEabDVHM29QvvOszJ5RpBXERiOCHOkhvFleCoTpn0
Xq0rtXPrLMH1G28Ej+cxmsAjfzOLV2Byg30CAoI/GCLuQ+xh+VMCpuNYQuld25CG
9rtYd0fWESeYsAebhDcX0E3xyzJtbrHtOb9PyGwNkbAJ8YQfhVSMCOPi2SX2wa+Q
qXLXi2VaHvjBSUKcAv5BmM+Ya57Be+88D0LxbgXbUeOnYefUK1ljldKDDshkMjgG
P4LPdm4JpoB5ncXSOO1Dz9w9QnNcFexSUySd/TtKLNMha1vEHV8ISzNPYY+9IdXf
XN0VZbFnUDzdj+Fwna7zyFb1cGihDYJKAtpXvRd8Y6RGUxKx9uGLAFJZw/xZB/cR
KZKuML5O8MgJuef37F38
=H/se
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael J Wysocki:
- ARM big.LITTLE cpufreq driver from Viresh Kumar.
- exynos5440 cpufreq driver from Amit Daniel Kachhap.
- cpufreq core cleanup and code consolidation from Viresh Kumar and
Stratos Karafotis.
- cpufreq scalability improvement from Nathan Zimmer.
- AMD "frequency sensitivity feedback" powersave bias for the ondemand
cpufreq governor from Jacob Shin.
- cpuidle code consolidation and cleanups from Daniel Lezcano.
- ARM OMAP cpuidle fixes from Santosh Shilimkar and Daniel Lezcano.
- ACPICA fixes and other improvements from Bob Moore, Jung-uk Kim, Lv
Zheng, Yinghai Lu, Tang Chen, Colin Ian King, and Linn Crosetto.
- ACPI core updates related to hotplug from Toshi Kani, Paul Bolle,
Yasuaki Ishimatsu, and Rafael J Wysocki.
- Intel Lynxpoint LPSS (Low-Power Subsystem) support improvements from
Rafael J Wysocki and Andy Shevchenko.
* tag 'pm+acpi-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (192 commits)
cpufreq: Revert incorrect commit 5800043
cpufreq: MAINTAINERS: Add co-maintainer
cpuidle: add maintainer entry
ACPI / thermal: do not always return THERMAL_TREND_RAISING for active trip points
ARM: s3c64xx: cpuidle: use init/exit common routine
cpufreq: pxa2xx: initialize variables
ACPI: video: correct acpi_video_bus_add error processing
SH: cpuidle: use init/exit common routine
ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y
ACPI: Fix wrong parameter passed to memblock_reserve
cpuidle: fix comment format
pnp: use %*phC to dump small buffers
isapnp: remove debug leftovers
ARM: imx: cpuidle: use init/exit common routine
ARM: davinci: cpuidle: use init/exit common routine
ARM: kirkwood: cpuidle: use init/exit common routine
ARM: calxeda: cpuidle: use init/exit common routine
ARM: tegra: cpuidle: use init/exit common routine for tegra3
ARM: tegra: cpuidle: use init/exit common routine for tegra2
ARM: OMAP4: cpuidle: use init/exit common routine
...
Pull workqueue updates from Tejun Heo:
"A lot of activities on workqueue side this time. The changes achieve
the followings.
- WQ_UNBOUND workqueues - the workqueues which are per-cpu - are
updated to be able to interface with multiple backend worker pools.
This involved a lot of churning but the end result seems actually
neater as unbound workqueues are now a lot closer to per-cpu ones.
- The ability to interface with multiple backend worker pools are
used to implement unbound workqueues with custom attributes.
Currently the supported attributes are the nice level and CPU
affinity. It may be expanded to include cgroup association in
future. The attributes can be specified either by calling
apply_workqueue_attrs() or through /sys/bus/workqueue/WQ_NAME/* if
the workqueue in question is exported through sysfs.
The backend worker pools are keyed by the actual attributes and
shared by any workqueues which share the same attributes. When
attributes of a workqueue are changed, the workqueue binds to the
worker pool with the specified attributes while leaving the work
items which are already executing in its previous worker pools
alone.
This allows converting custom worker pool implementations which
want worker attribute tuning to use workqueues. The writeback pool
is already converted in block tree and there are a couple others
are likely to follow including btrfs io workers.
- WQ_UNBOUND's ability to bind to multiple worker pools is also used
to make it NUMA-aware. Because there's no association between work
item issuer and the specific worker assigned to execute it, before
this change, using unbound workqueue led to unnecessary cross-node
bouncing and it couldn't be helped by autonuma as it requires tasks
to have implicit node affinity and workers are assigned randomly.
After these changes, an unbound workqueue now binds to multiple
NUMA-affine worker pools so that queued work items are executed in
the same node. This is turned on by default but can be disabled
system-wide or for individual workqueues.
Crypto was requesting NUMA affinity as encrypting data across
different nodes can contribute noticeable overhead and doing it
per-cpu was too limiting for certain cases and IO throughput could
be bottlenecked by one CPU being fully occupied while others have
idle cycles.
While the new features required a lot of changes including
restructuring locking, it didn't complicate the execution paths much.
The unbound workqueue handling is now closer to per-cpu ones and the
new features are implemented by simply associating a workqueue with
different sets of backend worker pools without changing queue,
execution or flush paths.
As such, even though the amount of change is very high, I feel
relatively safe in that it isn't likely to cause subtle issues with
basic correctness of work item execution and handling. If something
is wrong, it's likely to show up as being associated with worker pools
with the wrong attributes or OOPS while workqueue attributes are being
changed or during CPU hotplug.
While this creates more backend worker pools, it doesn't add too many
more workers unless, of course, there are many workqueues with unique
combinations of attributes. Assuming everything else is the same,
NUMA awareness costs an extra worker pool per NUMA node with online
CPUs.
There are also a couple things which are being routed outside the
workqueue tree.
- block tree pulled in workqueue for-3.10 so that writeback worker
pool can be converted to unbound workqueue with sysfs control
exposed. This simplifies the code, makes writeback workers
NUMA-aware and allows tuning nice level and CPU affinity via sysfs.
- The conversion to workqueue means that there's no 1:1 association
between a specific worker, which makes writeback folks unhappy as
they want to be able to tell which filesystem caused a problem from
backtrace on systems with many filesystems mounted. This is
resolved by allowing work items to set debug info string which is
printed when the task is dumped. As this change involves unifying
implementations of dump_stack() and friends in arch codes, it's
being routed through Andrew's -mm tree."
* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (84 commits)
workqueue: use kmem_cache_free() instead of kfree()
workqueue: avoid false negative WARN_ON() in destroy_workqueue()
workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity
workqueue: implement NUMA affinity for unbound workqueues
workqueue: introduce put_pwq_unlocked()
workqueue: introduce numa_pwq_tbl_install()
workqueue: use NUMA-aware allocation for pool_workqueues
workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq()
workqueue: map an unbound workqueues to multiple per-node pool_workqueues
workqueue: move hot fields of workqueue_struct to the end
workqueue: make workqueue->name[] fixed len
workqueue: add workqueue->unbound_attrs
workqueue: determine NUMA node of workers accourding to the allowed cpumask
workqueue: drop 'H' from kworker names of unbound worker pools
workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]
workqueue: move pwq_pool_locking outside of get/put_unbound_pool()
workqueue: fix memory leak in apply_workqueue_attrs()
workqueue: fix unbound workqueue attrs hashing / comparison
workqueue: fix race condition in unbound workqueue free path
workqueue: remove pwq_lock which is no longer used
...
Merge first batch of fixes from Andrew Morton:
- A couple of kthread changes
- A few minor audit patches
- A number of fbdev patches. Florian remains AWOL so I'm picking up
some of these.
- A few kbuild things
- ocfs2 updates
- Almost all of the MM queue
(And in the meantime, I already have the second big batch from Andrew
pending in my mailbox ;^)
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (149 commits)
memcg: take reference before releasing rcu_read_lock
mem hotunplug: fix kfree() of bootmem memory
mmKconfig: add an option to disable bounce
mm, nobootmem: do memset() after memblock_reserve()
mm, nobootmem: clean-up of free_low_memory_core_early()
fs/buffer.c: remove unnecessary init operation after allocating buffer_head.
numa, cpu hotplug: change links of CPU and node when changing node number by onlining CPU
mm: fix memory_hotplug.c printk format warning
mm: swap: mark swap pages writeback before queueing for direct IO
swap: redirty page if page write fails on swap file
mm, memcg: give exiting processes access to memory reserves
thp: fix huge zero page logic for page with pfn == 0
memcg: avoid accessing memcg after releasing reference
fs: fix fsync() error reporting
memblock: fix missing comment of memblock_insert_region()
mm: Remove unused parameter of pages_correctly_reserved()
firmware, memmap: fix firmware_map_entry leak
mm/vmstat: add note on safety of drain_zonestat
mm: thp: add split tail pages to shrink page list in page reclaim
mm: allow for outstanding swap writeback accounting
...
In user visible terms just a couple of enhancements here, though there
was a moderate amount of refactoring required in order to support the
register cache sync performance improvements.
- Support for block and asynchronous I/O during register cache syncing;
this provides a use case dependant performance improvement.
- Additional debugfs information on the memory consuption and register
set.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIbBAABAgAGBQJRfoZRAAoJELSic+t+oim9gXAP+JhAihmIQJlhUxZkXojFhClD
SKNWuFHmFC6VGndv52HPZR7nLN6hIlT4VUqk/rEw58R/RTqGuuWGc0KnKJf7ipid
6CdutuOP6q8mgs02kGKFAWRbSl++IXJ4TwvBbiyDMBmmngFoJY+gnmtnpP+PzcAd
LA3fn54jDWzBKCSlFBEC5acYxOMPmzm2uW13mO8Gy1RJrUkXfOemEFsyP0NVNJys
N0Zslp4nUUWmEu41UujuAUGZ7xXnnNQF5R4/RdS3+p22+sCEe7/mhLU1AxalUT4c
m9h9U2UKoXqRBuFQ9kRGwM2Gufjg33DoB0ExqIDEgaD2kRdAdAo/WhTHLxTiQEfq
6YXGZYwl0QUC1KcUwUWJZIq/nECibaYDAoyooNzLQNPAbbO6gdjsTIVCaZK8U/k6
D8bWAM4eRbv6xwXEd8rKW5+2f41dnsb5O3OgbdEEBZnbQ8UizI9KDGbPB3ARV2RI
Xqn+lYZV/q/99Bb3Pn0oS6Ud/tz5BqN4w3N84H0KcvcRHXvYjkdQ6ulsterRykOa
gYWfsCKTbm2C1zBLGDPXkDablodLZmzoCs4ajeIt6zIELNzuIsI3trprpT85RtrS
cjYl61ECuypPYBIW4uzxxBk/FeiEjQ4ndgQ4MgVnUfx0NpmG2N9LlDc2r6i+UgV/
EBxvYlPsEzQYLKoiJl8=
=RG1W
-----END PGP SIGNATURE-----
Merge tag 'regmap-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"In user visible terms just a couple of enhancements here, though there
was a moderate amount of refactoring required in order to support the
register cache sync performance improvements.
- Support for block and asynchronous I/O during register cache
syncing; this provides a use case dependant performance
improvement.
- Additional debugfs information on the memory consuption and
register set"
* tag 'regmap-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (23 commits)
regmap: don't corrupt work buffer in _regmap_raw_write()
regmap: cache: Fix format specifier in dev_dbg
regmap: cache: Make regcache_sync_block_raw static
regmap: cache: Write consecutive registers in a single block write
regmap: cache: Split raw and non-raw syncs
regmap: cache: Factor out block sync
regmap: cache: Factor out reg_present support from rbtree cache
regmap: cache: Use raw I/O to sync rbtrees if we can
regmap: core: Provide regmap_can_raw_write() operation
regmap: cache: Provide a get address of value operation
regmap: Cut down on the average # of nodes in the rbtree cache
regmap: core: Make raw write available to regcache
regmap: core: Warn on invalid operation combinations
regmap: irq: Clarify error message when we fail to request primary IRQ
regmap: rbtree Expose total memory consumption in the rbtree debugfs entry
regmap: debugfs: Add a registers `range' file
regmap: debugfs: Simplify calculation of `c->max_reg'
regmap: cache: Store caches in native register format where possible
regmap: core: Split out in place value parsing
regmap: cache: Use regcache_get_value() to check if we updated
...
nr_pages is not used in pages_correctly_reserved().
So remove it.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__remove_pages() is only necessary for CONFIG_MEMORY_HOTREMOVE. PowerPC
pseries will return -EOPNOTSUPP if unsupported.
Adding an #ifdef causes several other functions it depends on to also
become unnecessary, which saves in .text when disabled (it's disabled in
most defconfigs besides powerpc, including x86). remove_memory_block()
becomes static since it is not referenced outside of
drivers/base/memory.c.
Build tested on x86 and powerpc with CONFIG_MEMORY_HOTREMOVE both enabled
and disabled.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Squishes a warning which my change to hotplug_memory_notifier() added.
I want to keep that warning, because it is punishment for failnig to check
the hotplug_memory_notifier() return value.
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm-assorted:
PM / OPP: add documentation to RCU head in struct opp
PM / sleep: invalidate TEST_CPUS and TEST_CORE support for freeze state
PM / sleep: add TEST_PLATFORM support for freeze state
_regmap_raw_write() contains code to call regcache_write() to write
values to the cache. That code calls memcpy() to copy the value data to
the start of the work_buf. However, at least when _regmap_raw_write() is
called from _regmap_bus_raw_write(), the value data is in the work_buf,
and this memcpy() operation may over-write part of that value data,
depending on the value of reg_bytes + pad_bytes. At least when using
reg_bytes==1 and pad_bytes==0, corruption of the value data does occur.
To solve this, remove the memcpy() operation, and modify the subsequent
.parse_val() call to parse the original value buffer directly.
At least in the case of 8-bit register address and 16-bit values, and
writes of single registers at a time, this memcpy-then-parse combination
used to cancel each-other out; for a work-buffer containing xx 89 03,
the memcpy changed it to 89 03 03, and the parse_val changed it back to
89 89 03, thus leaving the value uncorrupted. This appears completely
accidental though. Since commit 8a819ff "regmap: core: Split out in
place value parsing", .parse_val only returns the parsed value, and does
not modify the buffer, and hence does not (accidentally) undo the
corruption caused by memcpy(). This caused bogus values to get written
to HW, thus preventing e.g. audio playback on systems with a WM8903
CODEC. This patch fixes that.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
When genpd prepares for a system suspend it will fetch a runtime
reference for the device. When returning it we now use the
asyncronous runtime PM API. Thus we don't have to wait for the
device to become idle|suspended before we move on and handle the
next device in queue.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For irq safe devices return the runtime reference for the parent
by using the asyncronous runtime PM API. Thus we don't have to
wait for it to become idle|suspended. Instead we can move on and
handle the next device in queue.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use the asyncronous runtime PM API when returning the runtime
reference for the device after the system resume is completed.
By using the asyncronous runtime PM API we don't have to wait
for each an every device to become idle|suspended. Instead we
can move on and handle the next device in queue.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Putting devices into idle|suspend in a synchronous manner means we are
waiting for each device to become idle|suspended before the probe|release
is fully done.
This patch switch to use the asynchronous runtime PM API:s instead and
thus improves the parallelism since we can move on and handle the next
device in queue in an earlier phase.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that devtmpfs is caring about uid/gid, we need to use the correct
internal types so users who have USER_NS enabled will have things work
properly for them.
Thanks to Eric for pointing this out, and the patch review.
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If CONFIG_UIDGID_STRICT_TYPE_CHECKS is enalbed, the below compile
failure will be triggered:
drivers/base/devtmpfs.c: In function 'handle_create':
drivers/base/devtmpfs.c:214:19: error: incompatible types when assigning to type 'kuid_t' from type 'uid_t'
drivers/base/devtmpfs.c:215:19: error: incompatible types when assigning to type 'kgid_t' from type 'gid_t'
make[2]: *** [drivers/base/devtmpfs.o] Error 1
This patch fixes the compile failure.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a sparse warning, and is a good idea given that the
devtmpfs_init() prototype is in this file.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit bc8ce4 (regmap: don't corrupt work buffer in
_regmap_raw_write()) since it turns out that it can cause issues when
taken in isolation from the other changes in -next that lead to its
discovery. On the basis that nobody noticed the problems for quite some
time without that subsequent work let's drop it from v3.9.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Some drivers want to tell userspace what uid and gid should be used for
their device nodes, so allow that information to percolate through the
driver core to userspace in order to make this happen. This means that
some systems (i.e. Android and friends) will not need to even run a
udev-like daemon for their device node manager and can just rely in
devtmpfs fully, reducing their footprint even more.
Signed-off-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dde8437 (PM / OPP: RCU reclaim) introduced rcu_head for
struct opp. This aids freeing using kfree_rcu. However, we missed
adding documentation for the same. This generates kernel doc warning:
Warning(drivers/base/power/opp.c:70): No description found for
parameter 'head'
Add documentation as appropriate.
[rjw: Changelog]
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix format specifier in dev_dbg and suppress the following warning
drivers/base/regmap/regcache.c: In function
‘regcache_sync_block_raw_flush’:
drivers/base/regmap/regcache.c:593:2: warning: format ‘%d’ expects
argument of type ‘int’, but argument 4 has type ‘size_t’ [-Wformat]
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
regcache_sync_block_raw is used only in this file. Hence make it static.
Silences the following warning:
drivers/base/regmap/regcache.c:608:5: warning:
symbol 'regcache_sync_block_raw' was not declared. Should it be static?
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
- Revert of a recent cpuidle change that caused Nehalem machines
to hang on boot from Alex Shi.
- USB power management fix addressing a crash in the port device
object's release routine from Rafael J. Wysocki.
- Device PM QoS fix for a potential deadlock related to sysfs
interface from Rafael J. Wysocki.
- Fix for a cpufreq crash when the /cpus Device Tree node is missing
from Paolo Pisati.
- Fix for a build issue on ia64 related to the Boot Graphics Resource
Table (BGRT) from Tony Luck.
- Two fixes for ACPI handles being set incorrectly for device
objects that don't correspond to any ACPI namespace nodes in
the I2C and SPI subsystems from Rafael J. Wysocki.
- Fix for compiler warnings related to CONFIG_PM_DEVFREQ being
unset from Rajagopal Venkat.
- Fix for a symbol definition typo in cpufreq_governor.h from
Borislav Petkov.
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJRXeiaAAoJEKhOf7ml8uNsZwUP/RmKD3wQEqHFPWk94wb0JhiI
J5nOfmmSjriLxV2XA2ajK48Zh8WDdvpVX0JHEBqPUYeVK09NVaN4L4Jr2ELAbeit
i2NKeWPtudTFEzJqZfK6d7s4TirbvD1+6TUCckI5N0Ku9Exf9R0jCGrEx4TCunOJ
UHluCxGKL9/GWc8Jc0HXto0w8cMAKgNN41quWD77u6qXOTujPafQ69DR6EFuLta6
pRz8tJ2g1oG43K0m5bZwPtIa6k+/6yP2e5zRaOQQM8ca0fxVJVuo2/kc9Warnx1m
Njayp3q9xmqfC5lt4F386BohnwFDBtqKoIbiNt1bJeiqDAnFCRpFAyFhx+sPNg3O
+D77BUdAB3fpxC8My9YgDG/YEK72HI/OVSceQ2txMssZyU5d24aH5P1ZAMwiGaJf
tAVQU/exCYMeJbD7xgjjK5Ny2yUpeUze2BYhRUCB2cseyJqdGc9j5hD/YGdnk9rC
YAxJq94HQUlcsOcFko//z/22ANda5OeLc1xtcX87sBAjsYXFNEyPirrR0b6YqqPr
gshfyE6HXNTzwbbH2uPRRuXGYVn7pGjcPxulPeP/i8pklhInj0rtWnCIq8f/bik1
uudEp/z2Zl9eNKkezAp32Eczn13ivEkz8By+SHrjluqCCpNzZVRNeGBg6uFOMTkL
Gz8dAAvK/Hb05OfgQWad
=s/w4
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- Revert of a recent cpuidle change that caused Nehalem machines to
hang on boot from Alex Shi.
- USB power management fix addressing a crash in the port device
object's release routine from Rafael J Wysocki.
- Device PM QoS fix for a potential deadlock related to sysfs interface
from Rafael J Wysocki.
- Fix for a cpufreq crash when the /cpus Device Tree node is missing
from Paolo Pisati.
- Fix for a build issue on ia64 related to the Boot Graphics Resource
Table (BGRT) from Tony Luck.
- Two fixes for ACPI handles being set incorrectly for device objects
that don't correspond to any ACPI namespace nodes in the I2C and SPI
subsystems from Rafael J Wysocki.
- Fix for compiler warnings related to CONFIG_PM_DEVFREQ being unset
from Rajagopal Venkat.
- Fix for a symbol definition typo in cpufreq_governor.h from Borislav
Petkov.
* tag 'pm+acpi-3.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / BGRT: Don't let users configure BGRT on non X86 systems
cpuidle / ACPI: recover percpu ACPI processor cstate
ACPI / I2C: Use parent's ACPI_HANDLE() in acpi_i2c_register_devices()
cpufreq: Correct header guards typo
ACPI / SPI: Use parent's ACPI_HANDLE() in acpi_register_spi_devices()
cpufreq: check OF node /cpus presence before dereferencing it
PM / devfreq: Fix compiler warnings for CONFIG_PM_DEVFREQ unset
PM / QoS: Avoid possible deadlock related to sysfs access
USB / PM: Don't try to hide PM QoS flags from usb_port_device_release()
commit eca4549f57 "sysfs: Add crash_notes_size to export percpu
note size" adds a printk that outputs a size_t value as %lu
when it should be %zu, resulting in this warning.
drivers/base/cpu.c: In function 'show_crash_notes_size':
drivers/base/cpu.c:142:2: warning: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'unsigned int' [-Wformat=]
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJRWLTrAAoJEHm+PkMAQRiGe8oH/iMy48mecVWvxVZn74Tx3Cef
xmW/PnAIj28EhSPqK49N/Ow6AfQToFKf7AP0ge20KAf5teTq95AY+tH74DAANt8F
BjKXXTZiR5xwBvRkq7CR5wDcCvEcBAAz8fgTEd6SEDB2d2VXFf5eKdKUqt1avTCh
Z6Hup5kuwX+ddtwY2DCBXtp2n6fL0Rm5yLzY1A3OOBye1E7VyLTF7M5BR603Q44P
4kRLxn8+R7jy3hTuZIhAeoS8TKUoBwVk7DmKxEzrhTHZVOmvwE9lEHybRnIyOpd/
k1JnbRbiPsLsCVFOn10SQkGDAIk00lro3tuWP2C1ljERiD/OOh5Ui9nXYAhMkbI=
=q15K
-----END PGP SIGNATURE-----
Merge tag 'v3.9-rc5' into wq/for-3.10
Writeback conversion to workqueue will be based on top of wq/for-3.10
branch to take advantage of custom attrs and NUMA support for unbound
workqueues. Mainline currently contains two commits which result in
non-trivial merge conflicts with wq/for-3.10 and because
block/for-3.10/core is based on v3.9-rc3 which contains one of the
conflicting commits, we need a pre-merge-window merge anyway. Let's
pull v3.9-rc5 into wq/for-3.10 so that the block tree doesn't suffer
from workqueue merge conflicts.
The two conflicts and their resolutions:
* e68035fb65 ("workqueue: convert to idr_alloc()") in mainline changes
worker_pool_assign_id() to use idr_alloc() instead of the old idr
interface. worker_pool_assign_id() goes through multiple locking
changes in wq/for-3.10 causing the following conflict.
static int worker_pool_assign_id(struct worker_pool *pool)
{
int ret;
<<<<<<< HEAD
lockdep_assert_held(&wq_pool_mutex);
do {
if (!idr_pre_get(&worker_pool_idr, GFP_KERNEL))
return -ENOMEM;
ret = idr_get_new(&worker_pool_idr, pool, &pool->id);
} while (ret == -EAGAIN);
=======
mutex_lock(&worker_pool_idr_mutex);
ret = idr_alloc(&worker_pool_idr, pool, 0, 0, GFP_KERNEL);
if (ret >= 0)
pool->id = ret;
mutex_unlock(&worker_pool_idr_mutex);
>>>>>>> c67bf5361e7e66a0ff1f4caf95f89347d55dfb89
return ret < 0 ? ret : 0;
}
We want locking from the former and idr_alloc() usage from the
latter, which can be combined to the following.
static int worker_pool_assign_id(struct worker_pool *pool)
{
int ret;
lockdep_assert_held(&wq_pool_mutex);
ret = idr_alloc(&worker_pool_idr, pool, 0, 0, GFP_KERNEL);
if (ret >= 0) {
pool->id = ret;
return 0;
}
return ret;
}
* eb2834285c ("workqueue: fix possible pool stall bug in
wq_unbind_fn()") updated wq_unbind_fn() such that it has single
larger for_each_std_worker_pool() loop instead of two separate loops
with a schedule() call inbetween. wq/for-3.10 renamed
pool->assoc_mutex to pool->manager_mutex causing the following
conflict (earlier function body and comments omitted for brevity).
static void wq_unbind_fn(struct work_struct *work)
{
...
spin_unlock_irq(&pool->lock);
<<<<<<< HEAD
mutex_unlock(&pool->manager_mutex);
}
=======
mutex_unlock(&pool->assoc_mutex);
>>>>>>> c67bf5361e7e66a0ff1f4caf95f89347d55dfb89
schedule();
<<<<<<< HEAD
for_each_cpu_worker_pool(pool, cpu)
=======
>>>>>>> c67bf5361e7e66a0ff1f4caf95f89347d55dfb89
atomic_set(&pool->nr_running, 0);
spin_lock_irq(&pool->lock);
wake_up_worker(pool);
spin_unlock_irq(&pool->lock);
}
}
The resolution is mostly trivial. We want the control flow of the
latter with the rename of the former.
static void wq_unbind_fn(struct work_struct *work)
{
...
spin_unlock_irq(&pool->lock);
mutex_unlock(&pool->manager_mutex);
schedule();
atomic_set(&pool->nr_running, 0);
spin_lock_irq(&pool->lock);
wake_up_worker(pool);
spin_unlock_irq(&pool->lock);
}
}
Signed-off-by: Tejun Heo <tj@kernel.org>
Commit b81ea1b (PM / QoS: Fix concurrency issues and memory leaks in
device PM QoS) put calls to pm_qos_sysfs_add_latency(),
pm_qos_sysfs_add_flags(), pm_qos_sysfs_remove_latency(), and
pm_qos_sysfs_remove_flags() under dev_pm_qos_mtx, which was a
mistake, because it may lead to deadlocks in some situations.
For example, if pm_qos_remote_wakeup_store() is run in parallel
with dev_pm_qos_constraints_destroy(), they may deadlock in the
following way:
======================================================
[ INFO: possible circular locking dependency detected ]
3.9.0-rc4-next-20130328-sasha-00014-g91a3267 #319 Tainted: G W
-------------------------------------------------------
trinity-child6/12371 is trying to acquire lock:
(s_active#54){++++.+}, at: [<ffffffff81301631>] sysfs_addrm_finish+0x31/0x60
but task is already holding lock:
(dev_pm_qos_mtx){+.+.+.}, at: [<ffffffff81f07cc3>] dev_pm_qos_constraints_destroy+0x23/0x250
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (dev_pm_qos_mtx){+.+.+.}:
[<ffffffff811811da>] lock_acquire+0x1aa/0x240
[<ffffffff83dab809>] __mutex_lock_common+0x59/0x5e0
[<ffffffff83dabebf>] mutex_lock_nested+0x3f/0x50
[<ffffffff81f07f2f>] dev_pm_qos_update_flags+0x3f/0xc0
[<ffffffff81f05f4f>] pm_qos_remote_wakeup_store+0x3f/0x70
[<ffffffff81efbb43>] dev_attr_store+0x13/0x20
[<ffffffff812ffdaa>] sysfs_write_file+0xfa/0x150
[<ffffffff8127f2c1>] __kernel_write+0x81/0x150
[<ffffffff812afc2d>] write_pipe_buf+0x4d/0x80
[<ffffffff812af57c>] splice_from_pipe_feed+0x7c/0x120
[<ffffffff812afa25>] __splice_from_pipe+0x45/0x80
[<ffffffff812b14fc>] splice_from_pipe+0x4c/0x70
[<ffffffff812b1538>] default_file_splice_write+0x18/0x30
[<ffffffff812afae3>] do_splice_from+0x83/0xb0
[<ffffffff812afb2e>] direct_splice_actor+0x1e/0x20
[<ffffffff812b0277>] splice_direct_to_actor+0xe7/0x200
[<ffffffff812b15bc>] do_splice_direct+0x4c/0x70
[<ffffffff8127eda9>] do_sendfile+0x169/0x300
[<ffffffff8127ff94>] SyS_sendfile64+0x64/0xb0
[<ffffffff83db7d18>] tracesys+0xe1/0xe6
-> #0 (s_active#54){++++.+}:
[<ffffffff811800cf>] __lock_acquire+0x15bf/0x1e50
[<ffffffff811811da>] lock_acquire+0x1aa/0x240
[<ffffffff81300aa2>] sysfs_deactivate+0x122/0x1a0
[<ffffffff81301631>] sysfs_addrm_finish+0x31/0x60
[<ffffffff812ff77f>] sysfs_hash_and_remove+0x7f/0xb0
[<ffffffff813035a1>] sysfs_unmerge_group+0x51/0x70
[<ffffffff81f068f4>] pm_qos_sysfs_remove_flags+0x14/0x20
[<ffffffff81f07490>] __dev_pm_qos_hide_flags+0x30/0x70
[<ffffffff81f07cd5>] dev_pm_qos_constraints_destroy+0x35/0x250
[<ffffffff81f06931>] dpm_sysfs_remove+0x11/0x50
[<ffffffff81efcf6f>] device_del+0x3f/0x1b0
[<ffffffff81efd128>] device_unregister+0x48/0x60
[<ffffffff82d4083c>] usb_hub_remove_port_device+0x1c/0x20
[<ffffffff82d2a9cd>] hub_disconnect+0xdd/0x160
[<ffffffff82d36ab7>] usb_unbind_interface+0x67/0x170
[<ffffffff81f001a7>] __device_release_driver+0x87/0xe0
[<ffffffff81f00559>] device_release_driver+0x29/0x40
[<ffffffff81effc58>] bus_remove_device+0x148/0x160
[<ffffffff81efd07f>] device_del+0x14f/0x1b0
[<ffffffff82d344f9>] usb_disable_device+0xf9/0x280
[<ffffffff82d34ff8>] usb_set_configuration+0x268/0x840
[<ffffffff82d3a7fc>] usb_remove_store+0x4c/0x80
[<ffffffff81efbb43>] dev_attr_store+0x13/0x20
[<ffffffff812ffdaa>] sysfs_write_file+0xfa/0x150
[<ffffffff8127f71d>] do_loop_readv_writev+0x4d/0x90
[<ffffffff8127f999>] do_readv_writev+0xf9/0x1e0
[<ffffffff8127faba>] vfs_writev+0x3a/0x60
[<ffffffff8127fc60>] SyS_writev+0x50/0xd0
[<ffffffff83db7d18>] tracesys+0xe1/0xe6
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(dev_pm_qos_mtx);
lock(s_active#54);
lock(dev_pm_qos_mtx);
lock(s_active#54);
*** DEADLOCK ***
To avoid that, remove the calls to functions mentioned above from
under dev_pm_qos_mtx and introduce a separate lock to prevent races
between functions that add or remove device PM QoS sysfs attributes
from happening.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When syncing blocks of data using raw writes combine the writes into a
single block write, saving us bus overhead for setup, addressing and
teardown.
Currently the block write is done unconditionally as it is expected that
hardware which has a register format which can support raw writes will
support auto incrementing writes, this decision may need to be revised in
future.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
For code clarity after implementing block writes split out the raw and
non-raw I/O sync implementations.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
The idea of holding blocks of registers in device format is shared between
at least rbtree and lzo cache formats so split out the loop that does the
sync from the rbtree code so optimisations on it can be reused.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
The idea of maintaining a bitmap of present registers is something that
can usefully be used by other cache types that maintain blocks of cached
registers so move the code out of the rbtree cache and into the generic
regcache code.
Refactor the interface slightly as we go to wrap the set bit and enlarge
bitmap operations (since we never do one without the other) and make it
more robust for reads of uncached registers by bounds checking before we
look at the bitmap.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Reviewed-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
For percpu notes, we are exporting only address and not size. So
the userspace tool kexec-tools is putting an upper limit of 1024
and putting the value in p_memsz and p_filesz fields. So the patch
add the new sysfile crash_notes_size to export the exact percpu
note size and let the kexec-tools parse it intead of using 1024.
The idea came from Vivek Goyal. And a later patch will be sent to
kexec-tools to let it parse the size.
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This will bring no meaningful benefit by itself, it is done as a separate
commit to aid bisection if there are problems with the following commits
adding support for coalescing adjacent writes.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Mainly useful internally but exported since this is a public API that's
being checked for.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Provide a helper to do the size based index into a block of registers and
use it when reading a value.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
This patch aims to bring down the average number of nodes
in the rbtree cache and increase the average number of registers
per node. This should improve general lookup and traversal times.
This is achieved by setting the minimum size of a block within the
rbnode to the size of the rbnode itself. This will essentially
cache possibly non-existent registers so to combat this scenario,
we keep a separate bitmap in memory which keeps track of which register
exists. The memory overhead of this change is likely in the order of
~5-10%, possibly less depending on the register file layout. On my test
system with a bitmap of ~4300 bits and a relatively sparse register
layout, the memory requirements for the entire cache did not increase
(the cutting down of nodes which was about 50% of the original number
compensated the situation).
A second patch that can be built on top of this can look at the
ratio `sizeof(*rbnode) / map->cache_word_size' in order to suitably
adjust the block length of each block.
Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
This allows the cache to sync values directly to the device when stored
in native format and also allows asynchronous I/O.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
_regmap_raw_write() contains code to call regcache_write() to write
values to the cache. That code calls memcpy() to copy the value data to
the start of the work_buf. However, at least when _regmap_raw_write() is
called from _regmap_bus_raw_write(), the value data is in the work_buf,
and this memcpy() operation may over-write part of that value data,
depending on the value of reg_bytes + pad_bytes. At least when using
reg_bytes==1 and pad_bytes==0, corruption of the value data does occur.
To solve this, remove the memcpy() operation, and modify the subsequent
.parse_val() call to parse the original value buffer directly.
At least in the case of 8-bit register address and 16-bit values, and
writes of single registers at a time, this memcpy-then-parse combination
used to cancel each-other out; for a work-buffer containing xx 89 03,
the memcpy changed it to 89 03 03, and the parse_val changed it back to
89 89 03, thus leaving the value uncorrupted. This appears completely
accidental though. Since commit 8a819ff "regmap: core: Split out in
place value parsing", .parse_val only returns the parsed value, and does
not modify the buffer, and hence does not (accidentally) undo the
corruption caused by memcpy(). This caused bogus values to get written
to HW, thus preventing e.g. audio playback on systems with a WM8903
CODEC. This patch fixes that.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Display the name for the chip rather than just the primary IRQ so it is
clearer what exactly has failed.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJRRkrbAAoJEHm+PkMAQRiGy3oH/jrbHinYs0auurANgx4TdtWT
/WNajstKBqLOJJ6cnTR7sOqwOVlptt65EbbTs+qGyZ2Z2W/Lg0BMenHvNHo4ER8C
e7UbMdBCSLKBjAMKh1XCoZscGv4Exm8WRH3Vc5yP0Hafj3EzSAVLY1dta9WKKoQi
bh7D1ErUlbU1zczA1w5YbPF0LqFKRvyZOwebMCCAKAxv5wWAxmbcPNxVR4sufkjg
k6TkQ2ysgWivZAfy3tJYOcxiEu7ahpZVEuYdlZEJQXHRQUfoNljQlOp4BqKsYUai
5A0kaf2VpKay/7pkhvTfBBcF/jFJ68pYP6gQ2ThNdr0b5kOiAfMWj030Xyngnhg=
=iO9t
-----END PGP SIGNATURE-----
Merge tag 'v3.9-rc3' into next
Merge with mainline to bring in module_platform_driver_probe() and
devm_ioremap_resource().
Whenever a struct device_attribute is registered
with mismatched permissions - read permission without
a show routine or write permission without store
routine - we will issue a big warning so we catch
those early enough.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The last register block, which falls into the specified range, is not handled
correctly. The formula which calculates the number of register which should be
synced is inverse (and off by one). E.g. if all registers in that block should
be synced only one is synced, and if only one should be synced all (but one) are
synced. To calculate the number of registers that need to be synced we need to
subtract the number of the first register in the block from the max register
number and add one. This patch updates the code accordingly.
The issue was introduced in commit ac8d91c ("regmap: Supply ranges to the sync
operations").
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@vger.kernel.org
Provide a feel of how much overhead the rbtree cache adds to
the game.
[Slightly reworded output in debugfs -- broonie]
Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Kay tells me the most appropriate place to expose workqueues to
userland would be /sys/devices/virtual/workqueues/WQ_NAME which is
symlinked to /sys/bus/workqueue/devices/WQ_NAME and that we're lacking
a way to do that outside of driver core as virtual_device_parent()
isn't exported and there's no inteface to conveniently create a
virtual subsystem.
This patch implements subsys_virtual_register() by factoring out
subsys_register() from subsys_system_register() and using it with
virtual_device_parent() as the origin directory. It's identical to
subsys_system_register() other than the origin directory but we aren't
gonna restrict the device names which should be used under it.
This will be used to expose workqueue attributes to userland.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
In the rbtree code we are exposing statistics relating to the
number of nodes/registers of the rbtree cache for each of the
devices. Ensure that `map->debugfs' has been initialized before
we attempt to initialize the debugfs entry for the rbtree cache.
Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@vger.kernel.org
- Two fixes for the new intel_pstate driver from Dirk Brandewie.
- Fix for incorrect usage of the .find_bridge() callback from struct
acpi_bus_type in the USB core and subsequent removal of that
callback from Rafael J. Wysocki.
- ACPI processor driver cleanups from Chen Gang and Syam Sidhardhan.
- ACPI initialization and error messages fix from Joe Perches.
- Operating Performance Points documentation improvement from
Nishanth Menon.
- Fixes for memory leaks and potential concurrency issues and sysfs
attributes leaks during device removal in the core device PM QoS
code from Rafael J. Wysocki.
- Calxeda Highbank cpufreq driver simplification from Emilio López.
- cpufreq comment cleanup from Namhyung Kim.
- Fix for a section mismatch in Calxeda Highbank interprocessor
communication code from Mark Langsdorf (this is not a PM fix
strictly speaking, but the code in question went in through the
PM tree).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABAgAGBQJROQy6AAoJEKhOf7ml8uNsdBMP/RhXZoPEGdPU/rO/O1g3nW7s
KY3H06aEzcXzIiYIimkZI3OvmNIGA+D+lFnd9xcFZX6hkmrCrwZtK63oTqBaU6G1
FnOg/ttDA6j5zBegqbhOcSwy5Xt5ZbiLI4WOtlr1rASeQars6KdUBs35q+KLhl6q
IjfkiLp7plC+aDasyz/KXt5vE4YRbiujk/FuKKSFfNM1/p1IeEXZP7DddygZQNPH
YQk5U8Had6AKrd2XQnSLpw+nlHTMM1KeCfhTQ5ZgDNmZgH3TVVp+TPSESiY+KJGY
4EEe1x76Cfj1/xODu2qVOPzSuCTWwDVNJRwkLML30yZkxqnTECO5YhgBUW7y3/L2
twgxXsm/IGuBMQs1CecmEq9SKwobvc/6/0oMGqPR+vQH1+vwOI3sBxo4IPj8eS6f
I9uHT6B8gieofPqJbBc7oX5PkOR7tJMD1Jg3Iqa3BF6oT4/p52mx4AjcY8v1x+bH
ykTrnpqwtggzoMgLQ86TaFZeV3jR0LHrxYl1FVq2CL/ehQpVvuP0KgtGjZKydZqo
DCN7ZLvyNGm3QSW3FIP61iHuazfL+3z30Psva9viVKcw5xzrSMx/DwIosnH3/bg0
j31RNh1PDz+hSVtczP3oH7tbTAZMSpnqf3Alu/hgcNxjJ8yO7zzvDGZCpprJNNvP
SW31Ju9W+ulnDm91vfko
=9vaj
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael J Wysocki:
- Two fixes for the new intel_pstate driver from Dirk Brandewie.
- Fix for incorrect usage of the .find_bridge() callback from struct
acpi_bus_type in the USB core and subsequent removal of that callback
from Rafael J Wysocki.
- ACPI processor driver cleanups from Chen Gang and Syam Sidhardhan.
- ACPI initialization and error messages fix from Joe Perches.
- Operating Performance Points documentation improvement from Nishanth
Menon.
- Fixes for memory leaks and potential concurrency issues and sysfs
attributes leaks during device removal in the core device PM QoS code
from Rafael J Wysocki.
- Calxeda Highbank cpufreq driver simplification from Emilio López.
- cpufreq comment cleanup from Namhyung Kim.
- Fix for a section mismatch in Calxeda Highbank interprocessor
communication code from Mark Langsdorf (this is not a PM fix strictly
speaking, but the code in question went in through the PM tree).
* tag 'pm+acpi-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq / intel_pstate: Do not load on VM that does not report max P state.
cpufreq / intel_pstate: Fix intel_pstate_init() error path
ACPI / glue: Drop .find_bridge() callback from struct acpi_bus_type
ACPI / glue: Add .match() callback to struct acpi_bus_type
ACPI / porocessor: Beautify code, pr->id is u32 which is never < 0
ACPI / processor: Remove redundant NULL check before kfree
ACPI / Sleep: Avoid interleaved message on errors
PM / QoS: Remove device PM QoS sysfs attributes at the right place
PM / QoS: Fix concurrency issues and memory leaks in device PM QoS
cpufreq: highbank: do not initialize array with a loop
PM / OPP: improve introductory documentation
cpufreq: Fix a typo in comment
mailbox, pl320-ipc: remove __init from probe function
A simple fix to stop us leaking a runtime PM reference in the case where
we fail to enable a device.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRNVQ5AAoJELSic+t+oim93WQQAJIToyJgnuoZfebD3vgT1Tey
YVGM5YY0pL+Ec5Fg91vu/ypFaY888J3UlRtQGxEM13grPunR4y/OflRYAXXTnspW
TcPbcWpkEv464iTQra2GY9Z4gqL9c6fKKBSFwrj74wRb+Jq0BQhrdmbw6U6pMnDS
iAxngfYEdlIULy8gyGnAszJFrQWjYh4U4e7wnUlsOJoZbc7JpW/6ITslwG9PWwK7
h+o7ekjn2anyjAqBStlnSOzQ12kcaam+cDh8Fa8TUmg3HTmFmuCytGA8+XwCVBSQ
ndWIhL1bqeyk7MdS84HjatNRAfPtpSZ9ouxKvLHm/tgALTNt/7CIsXeCm+2OoCQU
7uFJ01WnAstQ58ggEndgjvhr4wGRIp9VZXyVjm8tqH2CLT/UE7H+nnOAcABcd/cn
jZ+t8DQHU2ST1Rvs4Mohax8K6XcOTEQLp/kuhPEUXyqsv73VqIsjloPtqcLbUQdA
RYjMMsSFVFqlPQEOBTDNhGVjrfI4/tlkEh7Kw4VXSZXqf8cvTrAvbWYmMV/MJu2M
pvncD872/jSatRbj5qocnUbOuEyQe3UmdBNtQrdWgseI1z0fyz41X/VvZlzgt+Ll
se8iU4YojEviAUjPzKbKpFwr98r6pmMXtHqxDCYSv47YukiCC5QMenFukMGE5G9R
2qSw38quY1edJiXnq42Y
=D1mN
-----END PGP SIGNATURE-----
Merge tag 'regmap-v3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap PM fix from Mark Brown:
"A simple fix to stop us leaking a runtime PM reference in the case
where we fail to enable a device."
* tag 'regmap-v3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: irq: call pm_runtime_put in pm_runtime_get_sync failed case
Device PM QoS sysfs attributes, if present during device removal,
are removed from within device_pm_remove(), which is too late,
since dpm_sysfs_remove() has already removed the whole attribute
group they belonged to. However, moving the removal of those
attributes to dpm_sysfs_remove() alone is not sufficient, because
in theory they still can be re-added right after being removed by it
(the device's driver is still bound to it at that point).
For this reason, move the entire desctruction of device PM QoS
constraints to dpm_sysfs_remove() and make it prevent any new
constraints from being added after it has run. Also, move the
initialization of the power.qos field in struct device to
device_pm_init_common() and drop the no longer needed
dev_pm_qos_constraints_init().
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The current device PM QoS code assumes that certain functions will
never be called in parallel with each other (for example, it is
assumed that dev_pm_qos_expose_flags() won't be called in parallel
with dev_pm_qos_hide_flags() for the same device and analogously
for the latency limit), which may be overly optimistic. Moreover,
dev_pm_qos_expose_flags() and dev_pm_qos_expose_latency_limit()
leak memory in error code paths (req needs to be freed on errors)
and __dev_pm_qos_drop_user_request() forgets to free the request.
To fix the above issues put more things under the device PM QoS
mutex to make them mutually exclusive and add the missing freeing
of memory.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This file lists the register ranges in the register map. The condition
to split the range is based on whether the block is readable or not.
Ensure that we lock the `debugfs_off_cache' list whenever we access
and modify the list. There is a possible race otherwise between the
read() operations of the `registers' file and the `range' file.
Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
We don't need to use any of the file position information
to calculate the base and max register of each block. Just
use the counter directly.
Set `i = base' at the top to avoid GCC flow analysis bugs. The
value of `i' can never be undefined or 0 in the if (c) { ... }.
Signed-off-by: Dimitris Papastamos <dp@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Currently the value parsing operations both return the parsed value and
modify the passed buffer. This precludes their use in places like the cache
code so split out the in place modification into a new parse_inplace()
operation.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
It's more idiomatic to pass the map structure around and this means we
can use other bits of information from the map.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
If we're updating a value in place it's more work to read the value and
compare the value with what we're about to set than it is to just write
the value into the cache; there are no further operations after writing
in the code even though there's an early return here.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Trace when we start and complete async writes, and when we start and
finish blocking for their completion. This is useful for performance
analysis of the resulting I/O patterns.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Even in failed case of pm_runtime_get_sync, the usage_count
is incremented. In order to keep the usage_count with correct
value and runtime power management to behave correctly, call
pm_runtime_put(_sync) in such case.
Signed-off-by Liu Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Li Fei <fei.li@intel.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Callers to dma_buf_mmap expect to fput() the vma struct's vm_file
themselves on failure. Not restoring the struct's data on failure
causes a double-decrement of the vm_file's refcount.
Signed-off-by: John Sheu <sheu@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
All drivers which implement this need to have some sort of refcount to
allow concurrent vmap usage. Hence implement this in the dma-buf core.
To protect against concurrent calls we need a lock, which potentially
causes new funny locking inversions. But this shouldn't be a problem
for exporters with statically allocated backing storage, and more
dynamic drivers have decent issues already anyway.
Inspired by some refactoring patches from Aaron Plattner, who
implemented the same idea, but only for drm/prime drivers.
v2: Check in dma_buf_release that no dangling vmaps are left.
Suggested by Aaron Plattner. We might want to do similar checks for
attachments, but that's for another patch. Also fix up ERR_PTR return
for vmap.
v3: Check whether the passed-in vmap address matches with the cached
one for vunmap. Eventually we might want to remove that parameter -
compared to the kmap functions there's no need for the vaddr for
unmapping. Suggested by Chris Wilson.
v4: Fix a brown-paper-bag bug spotted by Aaron Plattner.
Cc: Aaron Plattner <aplattner@nvidia.com>
Reviewed-by: Aaron Plattner <aplattner@nvidia.com>
Tested-by: Aaron Plattner <aplattner@nvidia.com>
Reviewed-by: Rob Clark <rob@ti.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
Sometimes drivers need to execute one-off actions in their error handling
or device teardown paths. An example would be toggling a GPIO line to
reset the controlled device into predefined state.
To allow performing such actions when using managed resources let's allow
adding them to stack/group of devres resources.
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
lockdep, but it's a mechanical change.
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRJAcuAAoJENkgDmzRrbjxsw0P/3eXb+LddYnx0V0uHYdKpCUf
4vdW7X0fX3Z+aUK69IWRL/6ahoO4TpaHYGHBDjEoivyQ0GDq14X7JNWsYYt3LdMf
3wmDgRc2cn/mZOJbFeVpNV8ox5l/xc0CUvV+iQ8tMjfQItXMXgWUFZKMECsXKSO6
eex3lrw9M2jAX2uL8LQPp9W8xtKu24nSZRC6tH5riE/8fCzi1cZPPAqfxP5c8Lee
ZXtbCRSyAFENZLpKyMe1PC7HvtJyi5NDn9xwOQiXULZV/VOlvP94DGBLIKCM/6dn
4QvZxpG0P0uOlpCgRAVLyh/z7g4XY4VF/fHopLCmEcqLsvgD+V2LQpQ9zWUalLPC
Z+pUpz2vu0gIddPU1nR8R6oGpEdJ8O12aJle62p/RSXWZGx12qUQ+Tamu0tgKcv1
AsiJfbUGNDYfxgU6sHsoQjl2f68LTVckCU1C1LqEbW/S104EIORtGx30CHM4LRiO
32kDC5TtgYDBKQAIqJ4bL48ZMh+9W3uX40p7xzOI5khHQjvswUKa3jcxupU0C1uv
lx8KXo7pn8WT33QGysWC782wJCgJuzSc2vRn+KQoqoynuHGM6agaEtR59gil3QWO
rQEcxH63BBRDgHlg4FM9IkJwwsnC3PWKL8gbX0uAWXAPMbgapJkuuGZAwt0WDGVK
+GszxsFkCjlW0mK0egTb
=tiSY
-----END PGP SIGNATURE-----
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
"The sweeping change is to make add_taint() explicitly indicate whether
to disable lockdep, but it's a mechanical change."
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
MODSIGN: Add option to not sign modules during modules_install
MODSIGN: Add -s <signature> option to sign-file
MODSIGN: Specify the hash algorithm on sign-file command line
MODSIGN: Simplify Makefile with a Kconfig helper
module: clean up load_module a little more.
modpost: Ignore ARC specific non-alloc sections
module: constify within_module_*
taint: add explicit flag to show whether lock dep is still OK.
module: printk message when module signature fail taints kernel.
Apply the introduced memalloc_noio_save() and memalloc_noio_restore() to
force memory allocation with no I/O during runtime_resume/runtime_suspend
callback on device with the flag of 'memalloc_noio' set.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Decotigny <david.decotigny@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Jiri Kosina <jiri.kosina@suse.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce the flag memalloc_noio in 'struct dev_pm_info' to help PM core
to teach mm not allocating memory with GFP_KERNEL flag for avoiding
probable deadlock.
As explained in the comment, any GFP_KERNEL allocation inside
runtime_resume() or runtime_suspend() on any one of device in the path
from one block or network device to the root device in the device tree
may cause deadlock, the introduced pm_runtime_set_memalloc_noio() sets
or clears the flag on device in the path recursively.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Jiri Kosina <jiri.kosina@suse.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Decotigny <david.decotigny@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>