(cherry picked from commit 6444724939)
zram accounted for, but did not report, the number of failed read and
write queries. Make these stats available as failed_reads and
failed_writes attrs.
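For illustration, a minimal sketch of how such an attr can be wired up
(dev_to_zram() and the stats field name are assumptions here, not the
verbatim patch):

  static ssize_t failed_reads_show(struct device *dev,
                  struct device_attribute *attr, char *buf)
  {
          struct zram *zram = dev_to_zram(dev);

          /* report the accumulated count of failed read queries */
          return scnprintf(buf, PAGE_SIZE, "%llu\n",
                  (u64)atomic64_read(&zram->stats.failed_reads));
  }
  static DEVICE_ATTR(failed_reads, S_IRUGO, failed_reads_show, NULL);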
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: I5022b8e485c748dc257b7d637a1c24c918fddbe8
(cherry picked from commit 90a7806ea9)
This is a preparation patch for stats code duplication removal.
1) Use atomic64_t for the `pages_zero' and `pages_stored' zram stats.
2) The `compr_size' and `pages_zero' struct zram_stats members did not
follow the existing device attr naming scheme: zram_stats.ATTR has an
ATTR_show() function. Rename them:
-- compr_size -> compr_data_size
-- pages_zero -> zero_pages
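The resulting members look roughly like this (field set abridged, exact
layout assumed):

  struct zram_stats {
          atomic64_t compr_data_size;     /* was compr_size */
          atomic64_t pages_stored;        /* now atomic64_t */
          atomic64_t zero_pages;          /* was pages_zero, now atomic64_t */
          /* ... */
  };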
Minchan Kim's note:
If we really have trouble with atomic stat operations, we could switch
to percpu_counter, which would solve the atomic overhead and the
unnecessary memory usage by using an unsigned long instead of a 64-bit
atomic_t.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: I3a129a8fa8dd7648fc35f3c7dcf67e6b7e2d2ac6
(cherry picked from commit b7cccf8b40)
Remove the `good' and `bad' compressed sub-request stats. A RW request
may cause a number of RW sub-requests. zram used to account `good'
compressed sub-requests (with compressed size less than 50% of the
original size) and `bad' compressed sub-requests (with compressed size
greater than 75% of the original size), leaving sub-requests with a
compressed size between 50% and 75% of the original size neither
accounted nor reported. zram already accounts each sub-request's
compressed size, so we can calculate the real device compression ratio.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: I959bece2cb1473fdb65b202a6a7d4a306e29f2ee
(cherry picked from commit be257c6130)
Do not pass the rw argument down the __zram_make_request() ->
zram_bvec_rw() chain; decode it in zram_bvec_rw() instead. Besides,
this is the place where we distinguish READ and WRITE bio data
directions, so account the zram RW stats here, instead of in
__zram_make_request(). This also allows us to account the real number
of zram READ/WRITE operations, not just requests (a single RW request
may cause a number of zram RW ops with separate locking,
compression/decompression, etc.).
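A sketch of the resulting shape (helper signatures assumed, not the
exact diff):

  static int zram_bvec_rw(struct zram *zram, struct bio_vec *bvec,
                          u32 index, int offset, struct bio *bio)
  {
          int rw = bio_data_dir(bio);     /* decode READ/WRITE here */

          if (rw == READ) {
                  atomic64_inc(&zram->stats.num_reads);
                  return zram_bvec_read(zram, bvec, index, offset, bio);
          }
          atomic64_inc(&zram->stats.num_writes);
          return zram_bvec_write(zram, bvec, index, offset);
  }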
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: Icc018a740548fff44d72eba24a20527738c3b53c
(cherry picked from commit db5d711e2d)
zram_meta_alloc could fail, so the caller should check it. Otherwise,
your system will hang.
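The fix amounts to a check like the following at the call site in
disksize_store() (a sketch; surrounding code assumed):

  zram->meta = zram_meta_alloc(disksize);
  if (!zram->meta)
          return -ENOMEM; /* fail the store instead of hanging later */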
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: Iba3a0fb8b26c5d9964b7716bfee1844a3896088d
(cherry picked from commit f614a9f48d)
Commit a0c516cbfc ("zram: don't grab mutex in zram_slot_free_noity")
introduced pending-free request code to avoid scheduling on a mutex
under a spinlock, and it was a mess that made the code lengthy and
increased overhead.
Now we don't need zram->lock any more to free a slot, so this patch
reverts it; tb_lock should protect the table from then on.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: I9682d6f62d517612030e8466d3d93703793bc24b
(cherry picked from commit 92967471b6)
Currently, the zram table is protected by zram->lock, but it's a rather
coarse-grained lock that hurts scalability.
Let's use our own rwlock instead of depending on zram->lock. This patch
adds the new locking, so obviously it will make things slower, but it
is just preparation for removing the coarse-grained rw_semaphore (ie,
zram->lock), which is the hurdle for zram scalability.
The final patch in this series will remove the lock from the read path
and replace the rw_semaphore with a mutex in the write path. As a
bonus, we can drop the pending-slot-free mess in the next patch.
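Schematically, table accesses become guarded by the new lock (names per
the description above; exact call sites assumed):

  /* reader side */
  read_lock(&meta->tb_lock);
  handle = meta->table[index].handle;
  read_unlock(&meta->tb_lock);

  /* writer side, e.g. when freeing a slot */
  write_lock(&meta->tb_lock);
  zram_free_page(zram, index);
  write_unlock(&meta->tb_lock);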
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: I597475b88083d07fc45af4fd9d2c551cda4f513c
(cherry picked from commit f40ac2ae1b)
In zram_bvec_write(), the previous data at the index is already freed
by zram_free_page(). When compression or zs_malloc fails, there is no
way to restore the old data. Therefore, free the previous data only
when it is about to be updated.
Also, there is no need to check whether the table entry is non-empty
outside of zram_free_page(), because the function properly checks that
inside.
Signed-off-by: Sunghan Suh <sunghan.suh@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: Ib0361b2c60fd42ce8e03ee8b6ed5ab54b56a25c0
(cherry picked from commit da4a04126b)
Dan and Sergey reported a race between reset and the flushing of
pending work: reset could oops by freeing zram->meta while
zram_slot_free can still access zram->meta if a new request is added
during the race window.
This patch moves the flush to after taking init_lock, which prevents
new requests and thereby closes the race.
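In outline, the ordering change looks like this (function and work item
names assumed):

  static void zram_reset_device(struct zram *zram)
  {
          down_write(&zram->init_lock);   /* block new requests first */
          flush_work(&zram->free_work);   /* ...so the flush cannot race */
          /* ... free zram->meta and reset state ... */
          up_write(&zram->init_lock);
  }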
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: Ib5fb789b4fb7a5892b934ab10ebeae12fcf7742f
(cherry picked from commit 7bfb3de8a1)
Add my copyright to the zram source code, which I maintain.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: Ie6353ef599481e2681bac62f85a2673d6f51b271
(cherry picked from commit 49061236a9)
Remove the old private compcache project address; upcoming patches
should be sent to LKML, because the Linux kernel community will take
care of them.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: I00c2876712ef4c65e8b0cd2f77774d617048268f
(cherry picked from commit f0e71fcd0f)
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:
  get_online_cpus();

  for_each_online_cpu(cpu)
          init_cpu(cpu);

  register_cpu_notifier(&foobar_cpu_notifier);

  put_online_cpus();
This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).
Instead, the correct and race-free way of performing the callback
registration is:
  cpu_notifier_register_begin();

  for_each_online_cpu(cpu)
          init_cpu(cpu);

  /* Note the use of the double underscored version of the API */
  __register_cpu_notifier(&foobar_cpu_notifier);

  cpu_notifier_register_done();
Fix the zsmalloc code by using this latter form of callback registration.
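Applied to zsmalloc's init path, the registration takes roughly this
shape (zs_cpu_nb and the per-cpu init helper are assumed names; only a
sketch of the pattern, not the exact patch):

  static int __init zs_init(void)
  {
          int cpu;

          cpu_notifier_register_begin();
          for_each_online_cpu(cpu)
                  zs_cpu_up(cpu);                 /* hypothetical per-cpu init */
          __register_cpu_notifier(&zs_cpu_nb);    /* double-underscore variant */
          cpu_notifier_register_done();
          return 0;
  }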
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bug: 24810447
Change-Id: Idda192da0c2d7cb3ca581ba2916fe9b4befe312e
LZ4 compression and decompression functions differ in the signedness of
their input/output parameters: unsigned char for compression and
signed char for decompression.
Change the decompression API to require "(const) unsigned char *".
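The resulting prototypes then read roughly as follows (per the
description above; exact header wording assumed):

  int lz4_decompress(const unsigned char *src, size_t *src_len,
                     unsigned char *dest, size_t actual_dest_len);
  int lz4_decompress_unknownoutputsize(const unsigned char *src,
                     size_t src_len, unsigned char *dest,
                     size_t *dest_len);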
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yann Collet <yann.collet.73@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset adds support for LZ4 compression and for the crypto API
that uses it.
As shown below, the compressed data is a little bit bigger, but the
compression speed is faster when unaligned memory access is enabled. We
can use LZ4 de/compression through the crypto API as well. It will also
be useful for other potential users of LZ4 compression.
lz4 Compression Benchmark:
Compiler: ARM gcc 4.6.4
ARMv7, 1 GHz based board
Kernel: linux 3.4
Uncompressed data Size: 101 MB
        Compressed Size    Compression Speed
LZO     72.1MB             32.1MB/s, 33.0MB/s(UA)
LZ4     75.1MB             30.4MB/s, 35.9MB/s(UA)
LZ4HC   59.8MB              2.4MB/s,  2.5MB/s(UA)
- UA: Unaligned memory Access support
- Latest patch set for LZO applied
This patch:
Add support for LZ4 compression in the Linux kernel. The LZ4
compression APIs for the kernel are based on the LZ4 implementation by
Yann Collet and were adapted to the kernel coding style.
LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository : http://code.google.com/p/lz4/
svn revision : r90
Two APIs are added:
lz4_compress() supports basic LZ4 compression, whereas lz4hc_compress()
supports high compression, where CPU performance is lower but the
compression ratio is higher. Both require pre-allocated working memory
of the defined size, and the destination buffer must be allocated with
the size given by lz4_compressbound.
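A hedged usage sketch, assuming the contract described above (a
destination sized via lz4_compressbound() and LZ4_MEM_COMPRESS bytes of
working memory):

  size_t dst_len = lz4_compressbound(src_len);
  unsigned char *dst = vmalloc(dst_len);
  void *wrkmem = vmalloc(LZ4_MEM_COMPRESS);  /* pre-allocated work area */

  if (dst && wrkmem &&
      lz4_compress(src, src_len, dst, &dst_len, wrkmem) == 0)
          pr_info("lz4: %zu -> %zu bytes\n", src_len, dst_len);
  vfree(wrkmem);
  vfree(dst);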
[akpm@linux-foundation.org: make lz4_compresshcctx() static]
Signed-off-by: Chanho Min <chanho.min@lge.com>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Herbert Xu <herbert@gondor.hengli.com.au>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for LZ4 decompression in the Linux kernel. The LZ4
decompression APIs for the kernel are based on the LZ4 implementation
by Yann Collet.
Benchmark Results (PATCH v3)
Compiler: Linaro ARM gcc 4.6.2
1. ARMv7, 1.5GHz based board
Kernel: linux 3.4
Uncompressed Kernel Size: 14MB
      Compressed Size    Decompression Speed
LZO   6.7MB              20.1MB/s, 25.2MB/s(UA)
LZ4   7.3MB              29.1MB/s, 45.6MB/s(UA)
2. ARMv7, 1.7GHz based board
Kernel: linux 3.7
Uncompressed Kernel Size: 14MB
      Compressed Size    Decompression Speed
LZO   6.0MB              34.1MB/s, 52.2MB/s(UA)
LZ4   6.5MB              86.7MB/s
- UA: Unaligned memory Access support
- Latest patch set for LZO applied
This patch set adds support for an LZ4-compressed kernel. LZ4 is a very
fast lossless compression algorithm and it also features an extremely
fast decoder [1].
But we already have five decompressors, and one question that does
arise is where we stop adding new ones. This issue has been discussed,
and the conclusion reached was [2]:
Russell King said that we should have:
- one decompressor which is the fastest
- one decompressor for the highest compression ratio
- one popular decompressor (eg conventional gzip)
If we have a replacement for one of these, then it should do exactly
that: replace it.
The benchmark shows an 8% increase in image size versus a 66% increase
in decompression speed compared to LZO (which has been known as the
fastest decompressor in the kernel). Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].
[1] http://code.google.com/p/lz4/
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347
LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/
Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Yann Collet <yann.collet.73@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 93ae4f978c)
The following method of CPU hotplug callback registration is not safe
due to the possibility of an ABBA deadlock involving the cpu_add_remove_lock
and the cpu_hotplug.lock.
  get_online_cpus();

  for_each_online_cpu(cpu)
          init_cpu(cpu);

  register_cpu_notifier(&foobar_cpu_notifier);

  put_online_cpus();
The deadlock is shown below:
  CPU 0                               CPU 1
  -----                               -----
  Acquire cpu_hotplug.lock
  [via get_online_cpus()]

                                      CPU online/offline operation
                                      takes cpu_add_remove_lock
                                      [via cpu_maps_update_begin()]

  Try to acquire
  cpu_add_remove_lock
  [via register_cpu_notifier()]

                                      CPU online/offline operation
                                      tries to acquire cpu_hotplug.lock
                                      [via cpu_hotplug_begin()]

                        *** DEADLOCK! ***
The problem here is that callback registration takes the locks in one order
whereas the CPU hotplug operations take the same locks in the opposite order.
To avoid this issue and to provide a race-free method to register CPU hotplug
callbacks (along with initialization of already online CPUs), introduce new
variants of the callback registration APIs that simply register the callbacks
without holding the cpu_add_remove_lock during the registration. That way,
we can avoid the ABBA scenario. However, we will need to hold the
cpu_add_remove_lock throughout the entire critical section, to protect updates
to the callback/notifier chain.
This can be achieved by writing the callback registration code as follows:
  cpu_maps_update_begin();  [ or cpu_notifier_register_begin(); see below ]

  for_each_online_cpu(cpu)
          init_cpu(cpu);

  /* This doesn't take the cpu_add_remove_lock */
  __register_cpu_notifier(&foobar_cpu_notifier);

  cpu_maps_update_done();   [ or cpu_notifier_register_done(); see below ]
Note that we can't use get_online_cpus() here instead of cpu_maps_update_begin()
because the cpu_hotplug.lock is dropped during the invocation of CPU_POST_DEAD
notifiers, and hence get_online_cpus() cannot provide the necessary
synchronization to protect the callback/notifier chains against concurrent
reads and writes. On the other hand, since the cpu_add_remove_lock protects
the entire hotplug operation (including CPU_POST_DEAD), we can use
cpu_maps_update_begin/done() to guarantee proper synchronization.
Also, since cpu_maps_update_begin/done() is like a super-set of
get/put_online_cpus(), the former naturally protects the critical sections
from concurrent hotplug operations.
Since the names cpu_maps_update_begin/done() don't make much sense in CPU
hotplug callback registration scenarios, we'll introduce new APIs named
cpu_notifier_register_begin/done() and map them to cpu_maps_update_begin/done().
In summary, introduce the lockless variants of un/register_cpu_notifier() and
also export the cpu_notifier_register_begin/done() APIs for use by modules.
This way, we provide a race-free way to register hotplug callbacks as well as
perform initialization for the CPUs that are already online.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bug: 24810447
Change-Id: I5f85fcb5cfaa5f5f04a29eefc361851e9c345a99
(cherry picked from commit 31fc00bb78)
Add my copyright to the zsmalloc source code, which I maintain.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: Ic4137129666be7a6a383ed8b9c929ee97b6cc9fc
(cherry picked from commit cd67e10ac6)
Zram has lived in staging for a LONG LONG time and has been
fixed/improved by many contributors, so the code is clean and stable
now. Of course, there are lots of products using zram in real practice.
Major TV companies have used zram as swap for two years now; recently
our production team released an Android smartphone that uses zram as
swap, and Android KitKat has started to use zram on small-memory
smartphones. There was also a report that Google released ChromeOS with
zram, and CyanogenMod has used zram for a long time. I have heard that
some distros use the zram block device for tmpfs. In addition, I have
seen reports from many other people; for example, Lubuntu has started
to use it.
The benefit of zram is very clear. In my experience, one of the
benefits was removing jitter from a video application under background
memory pressure. Part of that is the effect of efficient memory usage
through compression, but the bigger issue is whether swap is present in
the system at all. Recent mobile platforms use Java, so there are many
anonymous pages. But embedded systems are normally reluctant to use
eMMC or an SD card as swap because of wear-leveling and latency issues,
so if we do not use swap, we can't reclaim anonymous pages and, in the
end, could encounter an OOM kill. :(
Even having real storage as swap is a problem, too, because slow swap
storage performance sometimes ends up making the system very
unresponsive.
A quote from Luigi at Google:
"Since Chrome OS was mentioned: the main reason why we don't use swap
to a disk (rotating or SSD) is because it doesn't degrade gracefully
and leads to a bad interactive experience. Generally we prefer to
manage RAM at a higher level, by transparently killing and restarting
processes. But we noticed that zram is fast enough to be competitive
with the latter, and it lets us make more efficient use of the
available RAM."
He announced it here:
http://www.spinics.net/lists/linux-mm/msg57717.html
Another use case is zram as a plain block device: since zram is a block
device, anyone can format it and mount on it, so some people on the
internet have started using zram for /var/tmp.
http://forums.gentoo.org/viewtopic-t-838198-start-0.html
Let's promote zram and enhance/maintain it instead of removing it.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Bob Liu <bob.liu@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luigi Semenzato <semenzato@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 24810447
Change-Id: Ie338290523af86fc4401a1560920de1b71100152
(cherry picked from commit 0f0e3ba346)
Now there's no caller of zram_get_num_devices(), so kill it. Also
change zram_devices to static, because it's only used in zram_drv.c.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: I3b72d6512bcbf5ba055b02fce68a833e28134ded
(cherry picked from commit 39a9b8ac93)
On the error recovery path of zram_init(), the zram device object that
caused the failure is leaked. Change create_device() to free allocated
resources on its error path.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: Ifb794f34d4e999ebd35b89aa2f3eeca8fd696ce8
(cherry picked from commit 1b672224d1)
As suggested by Minchan Kim and Jerome Marchand: "The code in
reset_store gets the block device (bdget_disk()) but it does not put it
(bdput()) when it's done using it. The usage count is therefore
incremented but never decremented."
This patch also adds bdput() for all error cases.
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: I034c6cc59426fee5ee6f069ae9cf9ca5395202a2
(cherry picked from commit 2b86ab9cc2)
[1] tried to fix an invalid memory access on zram->disk, but it didn't
fix it properly, because get_disk fails during the module exit path.
Actually, we don't need to reset zram->disk's capacity to zero on the
module exit path, so this patch introduces a new argument,
"reset_capacity", to zram_reset_device and resets the capacity only
when reset_store is called.
[1] 6030ea9b, zram: avoid invalid memory access in zram_exit()
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: I87e35bc5af6f82a75a9cdc620a142f86e80f2884
(cherry picked from commit 9b3bb7abcd)
Move the zram sysfs code into zram_drv and remove the zram_sysfs.c
file. This gives us the ability to make static a number of previously
exported zram functions used from zram sysfs, e.g. the internal
zram_meta_alloc/free(). We can also drop the zram_drv wrapper functions
used from zram sysfs, e.g. the zram_reset_device()/
__zram_reset_device() pair.
v2: as suggested by Greg K-H, move the MODULE description to the bottom
of the file.
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: Ic0038af190db194f7891cb27502a9b6d0d9e4042
(cherry picked from commit 12a7ad3b81)
Function valid_io_request() should verify that the entire request is
within the zram device address range. Otherwise it may cause invalid
memory access when accessing/modifying zram->meta->table[index],
because the 'index' is out of range. It could then access non-existent
memory and randomly modify memory belonging to other subsystems, which
is hard to track down.
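The stricter check is essentially (bio field names as of that kernel
era; a sketch, not the exact hunk):

  u64 start = bio->bi_sector << SECTOR_SHIFT;
  u64 end = start + bio->bi_size;

  /* the whole request, not just its first sector, must fit the device */
  if (end > zram->disksize)
          return 0;       /* reject: would index past meta->table[] */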
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: I648d6f874f49714a044f53ff111f5a22a38d38f8
(cherry picked from commit da5cc7d338)
Use atomic64_xxx() to replace the open-coded zram_stat64_xxx(). Some
architectures have native support for atomic64 operations, so we can
get rid of the spin_lock() in zram_stat64_xxx(). On the other hand, on
platforms that use the generic atomic64 implementation, it may cause an
extra save/restore of the interrupt flags, so it's a tradeoff.
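The conversion is mechanical; for example (stat field names assumed):

  /* before: zram_stat64_inc(zram, &zram->stats.num_reads); */
  atomic64_inc(&zram->stats.num_reads);

  /* before: zram_stat64_add(zram, &zram->stats.compr_size, clen); */
  atomic64_add(clen, &zram->stats.compr_size);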
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: I75d273bd6acc4cacbe55c2161144b656da2e421c
(cherry picked from commit 397c60668a)
Fixes a bug introduced by commit c8f2f0db1 ("zram: Fix handling
of incompressible pages") which caused invalid memory references
during disk write. Invalid references could occur in two cases:
- Incoming data expands on compression: In this case, reference was
made to a kunmap()'ed bio page.
- Partial (non PAGE_SIZE) write with incompressible data: In this
case, reference was made to a kfree()'ed buffer.
Fixes bug 50081:
https://bugzilla.kernel.org/show_bug.cgi?id=50081
Signed-off-by: Nitin Gupta <ngupta@vflare.org>
Cc: stable <stable@vger.kernel.org>
Reported-by: Mihail Kasadjikov <hamer.mk@gmail.com>
Reported-by: Tomas M <tomas@slax.org>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bug: 24810447
Change-Id: Ia6558c1f89896522679f99472d801b1060dc3628
When zram decompression fails, the code unnecessarily dumps failure
messages and accumulates stats in the caller of zram_decompress_page();
this work is already done inside zram_decompress_page(), so the patch
skips the redundant work.
Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lockdep complains about a recursive deadlock on zram->init_lock.
[1] made this a false positive, because we can't request IO to zram
before setting the disksize. Anyway, we should shut lockdep up to avoid
a flood of reports from users.
[1] : zram: force disksize setting before using zram
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kbuild bot whinges about a print format mismatch caused by "zram:
force disksize setting before using zram". This patch fixes it.
Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1) Users of zram normally run mkfs.xxx or mkswap before using the zram
block device (normally at boot time, for example). This ends up
allocating zram's metadata before real usage, so the benefit of lazy
initialization is mitigated.
2) Some users want to use zram when memory pressure is high (ie, load
zram dynamically, NOT at boot time). That makes sense, because people
don't want to waste memory until memory pressure is high (ie, when zram
is really helpful). In this case, lazy initialization could easily
fail, because we would have to use GFP_NOIO instead of GFP_KERNEL to
avoid deadlock. So the benefit of lazy initialization is mitigated
there, too.
3) Metadata overhead is not critical, and Nitin has a plan to put it on
a diet: 12 bytes per 4K page (on a 64-bit machine) is 192M for a 64G
device, so 0.3% isn't big overhead. If an insane user sets up as many
as 20 such big zram devices, they could consume 6% of RAM, but the
efficiency of zram will cover the waste.
So this patch gives up lazy initialization, and instead we initialize
the metadata at disksize setting time.
Acked-by: Jerome Marchand <jmarchand@redhat.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the zram documentation says "set disksize is optional", but
that is partly wrong. When you use zram for the first time after
booting, you must set the disksize; otherwise zram can't work, because
the zram gendisk's size is 0. But once you have done it, you can use
zram freely after reset, because, paradoxically, reset doesn't reset
the size to zero. So only from that point on is the disksize setting
optional. :(
That is inconsistent user-visible behavior and not straightforward.
This patch forces the disksize to always be set before using zram.
Yes, it changes current behavior, so someone could complain when they
upgrade zram. That could arguably be a problem if zram were mainline,
but it still lives in staging, so the behavior can be changed for the
right way to go. Let them excuse us.
Acked-by: Jerome Marchand <jmarchand@redhat.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
Acked-by: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently zram allocates a new page with GFP_KERNEL in the zram I/O
path if the IO is partial. Unfortunately, this may cause a deadlock
with the reclaim path, like below:
  write_page from fs
    fs_lock
      allocation(GFP_KERNEL)
        reclaim
          pageout
            write_page from fs
              fs_lock <-- deadlock
This patch fixes it by using GFP_NOIO. In the read path, we reorganize
the code flow so that kmap_atomic() is called after the GFP_NOIO
allocation.
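The core of the fix is the allocation flag plus that reordering (a
sketch of the partial-read path; variable names assumed):

  /* bounce buffer for a partial read: must not recurse into the FS */
  uncmem = kmalloc(PAGE_SIZE, GFP_NOIO);
  if (!uncmem)
          return -ENOMEM;
  user_mem = kmap_atomic(page);   /* map only after the allocation */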
Cc: stable@vger.kernel.org
Acked-by: Jerome Marchand <jmarchand@redhat.com>
Acked-by: Nitin Gupta <ngupta@vflare.org>
[ penberg@kernel.org: don't use GFP_ATOMIC ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It seems like overkill to have functions for adding and subtracting 1
from the 32-bit counters. Just do it directly.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The ->disksize variable stores values in units of bytes; print the
correct size in KB.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Simplify dealing with num_devices when initializing zram.
Also clean up some of the output messages.
Signed-off-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add missing angle brackets before and after the URL.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
zram_bvec_read() shared decompression functionality with the
zram_read_before_write() function. Factor it out into a commonly used
zram_decompress_page() function, which also simplifies error handling
in zram_bvec_read().
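The factored-out helper has roughly this signature (assumed from the
description), and both former callers now use it:

  /* decompress the object at 'index' into the page-sized buffer 'mem' */
  static int zram_decompress_page(struct zram *zram, char *mem, u32 index);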
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>