This patch splits the existing sync_node_pages into (f)sync_node_pages.
The fsync_node_pages is used for f2fs_sync_file only.
Change-Id: I207b087a54f1a0c2e994a78cd6ed475578d7044e
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Add a new helper f2fs_flush_merged_bios to clean up redundant codes.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
1. Inode mapping tree can index page in range of [0, ULONG_MAX], however,
in some places, f2fs only search or iterate page in ragne of [0, LONG_MAX],
result in miss hitting in page cache.
2. filemap_fdatawait_range accepts range parameters in unit of bytes, so
the max range it covers should be [0, LLONG_MAX], if we use [0, LONG_MAX]
as range for waiting on writeback, big number of pages will not be covered.
This patch corrects above two issues.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The D state of wait_on_all_pages_writeback should be waken by
function f2fs_write_end_io when all writeback pages have been
succesfully written to device. It's possible that wake_up comes
between get_pages and io_schedule. Maybe in this case it will
lost wake_up and still in D state even if all pages have been
write back to device, and finally, the whole system will be into
the hungtask state.
if (!get_pages(sbi, F2FS_WRITEBACK))
break;
<--------- wake_up
io_schedule();
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Biao He <hebiao6@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
>From the function name of get_valid_checkpoint, it seems to return
the valid cp or NULL for caller to check. If no valid one is found,
f2fs_fill_super will print the err log. But if get_valid_checkpoint
get one valid(the return value indicate that it's valid, however actually
it is invalid after sanity checking), then print another similar err
log. That seems strange. Let's keep sanity checking inside the procedure
of geting valid cp. Another improvement we gained from this move is
that even the large volume is supported, we check the cp in advanced
to skip the following procedure if failing the sanity checking.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In curseg cache, f2fs caches two different parts:
- datas of current summay block, i.e. summary entries, footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both of the parts of cache since we use the same mutex lock
curseg_mutex to protect the cache. 2) current summary block with last
journal info will be writebacked into device as a normal summary block
when flushing, however, we treat journal info as valid one only in current
summary, so most normal summary blocks contain junk journal data, it wastes
remaining space of summary block.
So, in order to fix above issues, we split curseg cache into two parts:
a) current summary block, protected by original mutex lock curseg_mutex
b) journal cache, protected by newly introduced r/w semaphore journal_rwsem
When loading curseg cache during ->mount, we store summary info and
journal info into different caches; When doing checkpoint, we combine
datas of two cache into current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Try to use block plug in more place as below to let process cache bios
as much as possbile, in order to reduce lock overhead of queue in IO
scheduler.
1) sync_meta_pages
2) ra_meta_pages
3) f2fs_balance_fs_bg
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
fix missing skip pages info in f2fs_writepages trace event.
Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
f2fs use single bio buffer per type data (META/NODE/DATA) for caching
writes locating in continuous block address as many as possible, after
submitting, these writes may be still cached in bio buffer, so we have
to flush cached writes in bio buffer by calling f2fs_submit_merged_bio.
Unfortunately, in the scenario of high concurrency, bio buffer could be
flushed by someone else before we submit it as below reasons:
a) there is no space in bio buffer.
b) add a request of different type (SYNC, ASYNC).
c) add a discontinuous block address.
For this condition, f2fs_submit_merged_bio will be devastating, because
it could break the following merging of writes in bio buffer, split one
big bio into two smaller one.
This patch introduces f2fs_submit_merged_bio_cond which can do a
conditional submitting with bio buffer, before submitting it will judge
whether:
- page in DATA type bio buffer is matching with specified page;
- page in DATA type bio buffer is belong to specified inode;
- page in NODE type bio buffer is belong to specified inode;
If there is no eligible page in bio buffer, we will skip submitting step,
result in gaining more chance to merge consecutive block IOs in bio cache.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Likewise f2fs_write_cache_pages, let's do for node and meta pages too.
Especially, for node blocks, we should do this before marking its fsync
and dentry flags.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch introduces lifetime IO write statistics exposed to the sysfs interface.
The write IO amount is obtained from block layer, accumulated in the file system and
stored in the hot node summary of checkpoint.
Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Pengyang Hou <houpengyang@huawei.com>
[Jaegeuk Kim: add sysfs documentation]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In write_begin, if storage supports stable_page, we don't need to wait for
writeback to update its contents.
This patch introduces to use wait_for_stable_page instead of
wait_on_page_writeback.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Sometimes we keep dumb when IO error occur in lower layer device, so user
will not receive any error return value for some operation, but actually,
the operation did not succeed.
This sould be avoided, so this patch reports such kind of error to user.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Conflicts:
fs/f2fs/data.c
do_checkpoint and write_checkpoint can fail due to reasons like triggering
in a readonly fs or encountering IO error of storage device.
So it's better to report such error info to user, let user be aware of
failure of doing checkpoint.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Add to stat dirty regular and symlink inode for showing in debugfs.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Maintain regular/symlink inode which has dirty pages in global dirty list
and record their total dirty pages count like the way of handling directory
inode.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Introduce __remove_dirty_inode to clean up codes in remove_dirty_dir_inode.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Add a new dirt list node member in inode info for linking the inode to
global dirty list in superblock, instead of old implementation which
allocate slab cache memory as an entry to inode.
It avoids memory pressure due to slab cache allocation, and also makes
codes more clean.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
remove_dirty_dir_inode will be renamed to remove_dirty_inode as a generic
function in following patch for removing directory/regular/symlink inode
in global dirty list.
Here rename ino management related functions for readability, also in
order to avoid name conflict.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
The last patch is:
commit beaa57dd986d4f398728c060692fc2452895cfd8
Author: Chao Yu <chao2.yu@samsung.com>
Date: Thu Oct 22 18:24:12 2015 +0800
f2fs: fix to skip shrinking extent nodes
In f2fs_shrink_extent_tree we should stop shrink flow if we have already
shrunk enough nodes in extent cache.
Change-Id: I7bc76a98ce99412c59435f4573ace38fca604694
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>