Commit Graph

127 Commits

Author SHA1 Message Date
Yunlei He 7765547fae f2fs: move dir data flush to write checkpoint process
[ Upstream commit b61ac5b720146c619c7cdf17eff2551b934399e5 ]

This patch move dir data flush to write checkpoint process, by
doing this, it may reduce some time for dir fsync.

pre:
	-f2fs_do_sync_file enter
		-file_write_and_wait_range  <- flush & wait
		-write_checkpoint
			-do_checkpoint	    <- wait all
	-f2fs_do_sync_file exit

now:
	-f2fs_do_sync_file enter
		-write_checkpoint
			-block_operations   <- flush dir & no wait
			-do_checkpoint	    <- wait all
	-f2fs_do_sync_file exit

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-07-27 22:06:03 +02:00
Yunlei He faf5b8adb6 f2fs: fix an error return value in truncate_partial_data_page
This patch fix a error return value in truncate_partial_data_page

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-09 19:26:20 -08:00
Yunlei He 79f03cb8af f2fs: remove redundant set_page_dirty()
This patch remove redundant set_page_dirty in truncate_blocks

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-03-08 19:13:48 -08:00
Jaegeuk Kim 9c2fbb5b93 fscrypt: catch fscrypto_get_policy in v4.10-rc6
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-25 11:48:53 -08:00
Hou Pengyang bc2f0a8dd1 f2fs: init local extent_info to avoid stale stack info in tp
To avoid such stale(fops, blk, len) info in f2fs_lookup_extent_tree_end tp

dio-23095 [005] ...1 17878.856859: f2fs_lookup_extent_tree_end:
			dev = (259,30), ino = 856, pgofs = 0,
			ext_info(fofs: 3441207040, blk: 4294967232, len: 3481143808)

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23 17:04:57 -08:00
Chao Yu 83af282d50 f2fs: change recovery policy of xattr node block
Currently, if we call fsync after updating the xattr date belongs to the
file, f2fs needs to trigger checkpoint to keep xattr data consistent. But,
this policy cause low performance as checkpoint will block most foreground
operations and cause unneeded and unrelated IOs around checkpoint.

This patch will reuse regular file recovery policy for xattr node block,
so, we change to write xattr node block tagged with fsync flag to warm
area instead of cold area, and during recovery, we search warm node chain
for fsynced xattr block, and do the recovery.

So, for below application IO pattern, performance can be improved
obviously:
- touch file
- create/update/delete xattr entry in file
- fsync file

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23 16:28:23 -08:00
Jaegeuk Kim f098fe896f f2fs: avoid out-of-order execution of atomic writes
We need to flush data writes before flushing last node block writes by using
FUA with PREFLUSH. We don't need to guarantee precedent node writes since if
those are not written, we can't reach to the last node block when scanning
node block chain during roll-forward recovery.
Afterwards f2fs_wait_on_page_writeback guarantees all the IO submission to
disk, which builds a valid node block chain.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	include/trace/events/f2fs.h
2017-02-23 16:23:14 -08:00
Chao Yu 17d500e93e f2fs: introduce FI_ATOMIC_COMMIT
This patch introduces a new flag to indicate inode status of doing atomic
write committing, so that, we can keep atomic write status for inode
during atomic committing, then we can skip GCing pages of atomic write inode,
that avoids random GCed datas being mixed with current transaction, so
isolation of transaction can be kept.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-23 16:23:01 -08:00
Jaegeuk Kim 4b9ae27137 f2fs: drop exist_data for inline_data when truncated to 0
A test program gets the SEEK_DATA with two values between
a new created file and the exist file on f2fs filesystem.

F2FS filesystem,  (the first "test1" is a new file)
SEEK_DATA size != 0 (offset = 8192)
SEEK_DATA size != 0 (offset = 4096)

PNFS filesystem, (the first "test1" is a new file)
SEEK_DATA size != 0 (offset = 4096)
SEEK_DATA size != 0 (offset = 4096)

int main(int argc, char **argv)
{
        char *filename = argv[1];
        int offset = 1, i = 0, fd = -1;

        if (argc < 2) {
                printf("Usage: %s f2fsfilename\n", argv[0]);
                return -1;
        }

        /*
        if (!access(filename, F_OK) || errno != ENOENT) {
                printf("Needs a new file for test, %m\n");
                return -1;
        }*/

        fd = open(filename, O_RDWR | O_CREAT, 0777);
        if (fd < 0) {
                printf("Create test file %s failed, %m\n", filename);
                return -1;
        }

        for (i = 0; i < 20; i++) {
                offset = 1 << i;
                ftruncate(fd, 0);
                lseek(fd, offset, SEEK_SET);
                write(fd, "test", 5);
                /* Get the alloc size by seek data equal zero*/
                if (lseek(fd, 0, SEEK_DATA)) {
                        printf("SEEK_DATA size != 0 (offset = %d)\n", offset);
                        break;
                }
        }

        close(fd);
        return 0;
}

Reported-and-Tested-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:45:30 -08:00
Jaegeuk Kim a39122a272 f2fs: show the max number of atomic operations
This patch adds to show the max number of atomic operations which are
conducting concurrently.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:45:27 -08:00
Jaegeuk Kim aa4638c4e5 Revert "f2fs: use percpu_counter for # of dirty pages in inode"
This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76.

The perpcu_counter doesn't provide atomicity in single core and consume more
DRAM. That incurs fs_mark test failure due to ENOMEM.

Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-06 13:45:18 -08:00
Yunlei He a230be6484 f2fs: fix a missing size change in f2fs_setattr
This patch fix a missing size change in f2fs_setattr

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-12 15:36:06 -08:00
Jaegeuk Kim 6769922ad1 f2fs: do not activate auto_recovery for fallocated i_size
If a file needs to keep its i_size by fallocate, we need to turn off auto
recovery during roll-forward recovery.

This will resolve the below scenario.

1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync"
2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync"
3. md5sum /mnt/f2fs/file;
4. godown /mnt/f2fs/
5. umount /mnt/f2fs/
6. mount -t f2fs /dev/sdx /mnt/f2fs
7. md5sum /mnt/f2fs/file

Reported-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-29 15:57:57 -08:00
Chao Yu a7d40a4d21 f2fs: fix fdatasync
For below two cases, we can't guarantee data consistence:

a)
1. xfs_io "pwrite 0 4195328" "fsync"
2. xfs_io "pwrite 4195328 1024" "fdatasync"
3. godown
4. umount & mount
--> isize we updated before fdatasync won't be recovered

b)
1. xfs_io "pwrite -S 0xcc 0 4202496" "fsync"
2. xfs_io "fpunch 4194304 4096" "fdatasync"
3. godown
4. umount & mount
--> dnode we punched before fdatasync won't be recovered

The reason is that normally fdatasync won't be aware of modification
of metadata in file, e.g. isize changing, dnode updating, so in ->fsync
we will skip flushing node pages for above cases, result in making
fdatasynced file being lost during recovery.

Currently we have introduced DIRTY_META global list in sbi for tracking
dirty inode selectively, so in fdatasync we can choose to flush nodes
depend on dirty state of current inode in the list.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:08:42 -08:00
Chao Yu d53789f502 f2fs: don't wait writeback for datas during checkpoint
Normally, while committing checkpoint, we will wait on all pages to be
writebacked no matter the page is data or metadata, so in scenario where
there are lots of data IO being submitted with metadata, we may suffer
long latency for waiting writeback during checkpoint.

Indeed, we only care about persistence for pages with metadata, but not
pages with data, as file system consistent are only related to metadate,
so in order to avoid encountering long latency in above scenario, let's
recognize and reference metadata in submitted IOs, wait writeback only
for metadatas.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/data.c
2016-11-23 13:07:54 -08:00
Jaegeuk Kim fa710550fd f2fs: avoid BG_GC in f2fs_balance_fs
If many threads hit has_not_enough_free_secs() in f2fs_balance_fs() at the same
time, all the threads would do FG_GC or BG_GC.
In this critical path, we totally don't need to do BG_GC at all.
Let's avoid that.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:06:59 -08:00
Jaegeuk Kim e17d88e103 f2fs: use err for f2fs_preallocate_blocks
This patch has no functional change.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/data.c
	fs/f2fs/f2fs.h
	fs/f2fs/file.c
2016-11-23 13:05:01 -08:00
Jaegeuk Kim 5c640a08fc fs/crypto: catch up 4.9-rc2
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:04:46 -08:00
Jaegeuk Kim a73939e98b f2fs: keep dirty inodes selectively for checkpoint
This is to avoid no free segment bug during checkpoint caused by a number of
dirty inodes.

The case was reported by Chao like this.
1. mount with lazytime option
2. fill 4k file until disk is full
3. sync filesystem
4. read all files in the image
5. umount

In this case, we actually don't need to flush dirty inode to inode page during
checkpoint.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/acl.c
	fs/f2fs/inode.c
	fs/f2fs/namei.c
2016-11-23 13:04:43 -08:00
Jaegeuk Kim 536c558991 f2fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
This is for backport only.

fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:04:43 -08:00
Jaegeuk Kim a270dd0f84 f2fs: call f2fs_balance_fs for setattr
If inode becomes dirty, we need to check the # of dirty inodes whether or not
further checkpoint would be required.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:04:41 -08:00
Chao Yu ce51969154 f2fs: add missing f2fs_balance_fs in f2fs_zero_range
f2fs_balance_fs should be called in between node page updating, otherwise
node page count will exceeded far beyond watermark of triggering
foreground garbage collection, result in facing high risk of hitting LFS
allocation failure.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:04:32 -08:00
Jaegeuk Kim 2777c31eae f2fs: fix overflow due to condition check order
In the last ilen case, i was already increased, resulting in accessing out-
of-boundary entry of do_replace and blkaddr.
Fix to check ilen first to exit the loop.

Fixes: 2aa8fbb9693020 ("f2fs: refactor __exchange_data_block for speed up")
Cc: stable@vger.kernel.org # 4.8+
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-23 13:04:29 -08:00
Fan Li efbee6a821 f2fs: exclude special cases for f2fs_move_file_range
When src and dst is the same file, and the latter part of source region
overlaps with the former part of destination region, current implement
will overwrite data which hasn't been moved yet and truncate data in
overlapped region.
This patch return -EINVAL when such cases occur and return 0 when
source region and destination region is actually the same part of
the same file.

Signed-off-by: Fan li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-24 11:58:53 -07:00
Fan Li 3dc3b70c1a f2fs: fix parameters of __exchange_data_block
__exchange_data_block should take block indexes as parameters
instead of offsets in bytes.

Signed-off-by: Fan li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-24 11:58:49 -07:00
Jaegeuk Kim 5cb75598cf f2fs: check free_sections for defragmentation
Fix wrong condition check for defragmentation of a file.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-24 11:58:42 -07:00
Jaegeuk Kim 4490ba3778 f2fs: avoid page allocation for truncating partial inline_data
When truncating cached inline_data, we don't need to allocate a new page
all the time. Instead, it must check its page cache only.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-24 11:58:37 -07:00
Shuoran Liu b63d01f57b f2fs: add roll-forward recovery process for encrypted dentry
Add roll-forward recovery process for encrypted dentry, so the first fsync
issued to an encrypted file does not need writing checkpoint.

This improves the performance of the following test at thousands of small
files: open -> write -> fsync -> close

Signed-off-by: Shuoran Liu <liushuoran@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: modify kernel message to show encrypted names]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/dir.c
	fs/f2fs/f2fs.h
2016-09-24 11:56:20 -07:00
Jaegeuk Kim b0bdd5a610 f2fs: fix lost xattrs of directories
This patch enhances the xattr consistency of dirs from suddern power-cuts.

Possible scenario would be:
1. dir->setxattr used by per-file encryption
2. file->setxattr goes into inline_xattr
3. file->fsync

In that case, we should do checkpoint for #1.
Otherwise we'd lose dir's key information for the file given #2.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-24 11:56:18 -07:00
Sheng Yong 02c4dec3c4 f2fs: remove unnecessary initialization
`flags' is used to save value from userspace, there is no need to
initialize it, and FS_FL_USER_VISIBLE is the mask for getflags.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-09-24 11:40:35 -07:00
Alexander Gordeev 9b7c055d70 f2fs: fix build for v3.10
kvfree() first appears in v3.15. f2fs_kvfree() should be used in the backport.

Signed-off-by: Alexander Gordeev <alex@gordick.net>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-08-16 00:29:34 +09:00
Jaegeuk Kim 92113349fc f2fs: adjust other changes
This patch changes:
- d_inode
- file_dentry
- inode_nohighmem
- ...

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:24:54 -07:00
Jaegeuk Kim e6ef2dc9ef f2fs: support an ioctl to move a range of data blocks
This patch implements moving a range of data blocks from source file to
destination file.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/file.c
2016-07-22 16:16:05 -07:00
Jaegeuk Kim fb32ebfa91 f2fs: avoid memory allocation failure due to a long length
We need to avoid ENOMEM due to unexpected long length.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:16:04 -07:00
Jaegeuk Kim 7e3733c759 f2fs: use blk_plug in all the possible paths
This patch reverts 19a5f5e2ef37 (f2fs: drop any block plugging),
and adds blk_plug in write paths additionally.

The main reason is that blk_start_plug can be used to wake up from low-power
mode before submitting further bios.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/file.c
2016-07-22 16:16:02 -07:00
Jaegeuk Kim 59ef789242 f2fs: disable extent_cache for fcollapse/finsert inodes
This reduces the elapsed time to do xfstests/generic/017.

Before: 458 s
After:  390 s

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:16:00 -07:00
Jaegeuk Kim 2b1ddfb644 f2fs: refactor __exchange_data_block for speed up
This reduces the elapsed time to do xfstests/generic/017.

Before: 715 s
After:  458 s

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:16:00 -07:00
Jaegeuk Kim 54db9e9228 f2fs: avoid mark_inode_dirty
Let's check inode's dirtiness before calling mark_inode_dirty.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:15:58 -07:00
Jaegeuk Kim 7c61a19809 f2fs: call SetPageUptodate if needed
SetPageUptodate() issues memory barrier, resulting in performance degrdation.
Let's avoid that.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/data.c
2016-07-22 16:15:54 -07:00
Jaegeuk Kim 8aed67b3a3 f2fs: avoid latency-critical readahead of node pages
The f2fs_map_blocks is very related to the performance, so let's avoid any
latency to read ahead node pages.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:15:49 -07:00
Jaegeuk Kim f7a90c9d56 f2fs: introduce mode=lfs mount option
This mount option is to enable original log-structured filesystem forcefully.
So, there should be no random writes for main area.

Especially, this supports host-managed SMR device.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:15:44 -07:00
Jaegeuk Kim 3bcbb3c54b f2fs: remove obsolete parameter in f2fs_truncate
We don't need lock parameter, which is always true.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-22 16:15:38 -07:00
Jaegeuk Kim ae3ea72467 f2fs: avoid unnecessary updating inode during fsync
If roll-forward recovery can recover i_size, we don't need to update inode's
metadata during fsync.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:44:38 -07:00
Jaegeuk Kim e25d86667b f2fs: remove syncing inode page in all the cases
This patch reduces to call them across the whole tree.
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()

Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that, this is doable, since we call mark_inode_dirty_sync() for all
inode's field change which needs to update on-disk inode as well.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/namei.c
2016-06-02 18:44:38 -07:00
Jaegeuk Kim c2baebb44e f2fs: call mark_inode_dirty_sync for i_field changes
This patch calls mark_inode_dirty_sync() for the following on-disk inode
changes.

 -> largest
 -> ctime/mtime/atime
 -> i_current_depth
 -> i_xattr_nid
 -> i_pino
 -> i_advise
 -> i_flags
 -> i_mode

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/acl.c
	fs/f2fs/inode.c
	fs/f2fs/namei.c
2016-06-02 18:44:37 -07:00
Jaegeuk Kim 72ee8a3bd1 f2fs: introduce f2fs_i_size_write with mark_inode_dirty_sync
This patch introduces f2fs_i_size_write() to call mark_inode_dirty_sync() with
i_size_write().

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:44:35 -07:00
Jaegeuk Kim 9cd207b7e0 f2fs: use inode pointer for {set, clear}_inode_flag
This patch refactors to use inode pointer for set_inode_flag and
clear_inode_flag.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/acl.c
	fs/f2fs/file.c
	fs/f2fs/namei.c
2016-06-02 18:44:31 -07:00
Jaegeuk Kim 3fb1acec3f f2fs: adjust other changes
This patch changes:
- PAGE_CACHE_*
- inode_lock

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 17:02:21 -07:00
Jaegeuk Kim 132a74a388 f2fs: flush pending bios right away when error occurs
Given errors, this patch flushes pending bios as soon as possible.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

 Conflicts:
	fs/f2fs/file.c
2016-05-21 19:36:02 -07:00
Jaegeuk Kim db7b305838 f2fs: use percpu_counter for # of dirty pages in inode
This patch adds percpu_counter for # of dirty pages in inode.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-17 17:25:40 -07:00