Commit graph

101 commits

Author SHA1 Message Date
Christoph Hellwig
f6d6d4fcd1 [XFS] Initial pass at going directly-to-bio on the buffered IO path. This
allows us to submit much larger I/Os instead of sending down lots of small
buffer_heads.  To do this we need to have a rather complicated I/O
submission and completion tracking infrastructure.  Part of the latter has
been merged already a long time ago for direct I/O support. Part of the
problem is that we need to track sub-pagesize regions and for that we
still need buffer_heads for the time being.  Long-term I hope we can move
to better data structures and/or maybe move this to fs/mpage.c instead of
having it in XFS.  Original patch from Nathan Scott with various updates
from David Chinner and Christoph Hellwig.

SGI-PV: 947118
SGI-Modid: xfs-linux-melb:xfs-kern:203822a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:40:13 +11:00
Nathan Scott
ce8e922c0e [XFS] Complete the pagebuf -> xfs_buf naming convention transition,
finally.

SGI-PV: 947038
SGI-Modid: xfs-linux-melb:xfs-kern:24866a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:39:08 +11:00
Yingping Lu
68bdb6eabc [XFS] Fixed delayed_blks assert failure during umount. The delayed_blks
was caused by ENOSPC but not reclaimed by xfs_release or xfs_inactive.
The fix changed the condition in xfs_release and xfs_inactive to invoke
xfs_inactive_free_eofblocks for this special case, changed
xfs_inactive_free_eofblocks to clean the delayed blks after eof. It also
changed xfs_write to set correct eof when ENOSPC occurs.

SGI-PV: 946267
SGI-Modid: xfs-linux-melb:xfs-kern:203788a

Signed-off-by: Yingping Lu <yingping@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:38:31 +11:00
David Chinner
a6867a6815 [XFS] Introduce per-filesystem delwri pagebuf flushing to reduce
contention between filesystems and prevent deadlocks between filesystems
when a flush dependency exists between them.

SGI-PV: 947098
SGI-Modid: xfs-linux-melb:xfs-kern:24844a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:37:58 +11:00
Tim Shimmin
216d3b2acb [XFS] take out the call to vn_mark_bad() used when acl inherit fails and
it needs to back out the inode creation. Tested by xfs_tests/077.

SGI-PV: 930841
SGI-Modid: xfs-linux-melb:xfs-kern:24842a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:37:38 +11:00
Nathan Scott
446ada4a03 [XFS] Add an XFS callout to security_inode_init_security; SE Linux is not
functional with XFS without this change.

SGI-PV: 946762
SGI-Modid: xfs-linux-melb:xfs-kern:24766a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:35:44 +11:00
Christoph Hellwig
42fe2b1f7f [XFS] fix, speedup and simplify atime handling: let the VFS handle atime
updates and only sync back to the xfs inode when necessary

SGI-PV: 946679
SGI-Modid: xfs-linux-melb:xfs-kern:203362a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:35:17 +11:00
Eric Sandeen
24ee80882d [XFS] remove unused vars, args, & unneeded intermediate vars from zeroing
code

SGI-PV: 946641
SGI-Modid: xfs-linux-melb:xfs-kern:203328a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:34:32 +11:00
Nathan Scott
0d14824c07 [XFS] Ensure max diosize reported is aligned with minimum diosize.
SGI-PV: 910890
SGI-Modid: xfs-linux-melb:xfs-kern:24689a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:33:51 +11:00
Nathan Scott
a255a7456d [XFS] Make d_maxiosz report the real maximum (INT_MAX) so we don't
incorrectly limit people using this interface to size IO buffers.

SGI-PV: 910890
SGI-Modid: xfs-linux-melb:xfs-kern:24657a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:32:30 +11:00
Christoph Hellwig
1df84c930a [XFS] Mark some lookup tables const. Thanks to Arjan van de Ven for
spotting these.

SGI-PV: 946028
SGI-Modid: xfs-linux-melb:xfs-kern:202617a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:29:52 +11:00
Christoph Hellwig
4ef19dddba [XFS] enable write barriers by default
SGI-PV: 912426
SGI-Modid: xfs-linux-melb:xfs-kern:201981a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-01-11 15:27:18 +11:00
Christoph Hellwig
7ff92053dd [PATCH] don't include ioctl32.h in drivers
These days ioctl32.h is only used for communication of fs/compat.c and
fs/compat_ioctl.c and doesn't contain anything of interest to drivers.

Remove inclusion in various drivers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:34 -08:00
Christoph Hellwig
fc33a7bb9c [PATCH] per-mountpoint noatime/nodiratime
Turn noatime and nodiratime into per-mount instead of per-sb flags.

After all the preparations this is a rather trivial patch.  The mount code
needs to treat the two options as per-mount instead of per-superblock, and
touch_atime needs to be changed to check the new MNT_ flags in addition to
the MS_ flags that are kept for filesystems that are always
noatime/nodiratime but not user settable anymore.  Besides that core code
only nfs needed an update because it's leaving atime updates to the server
and thus sets the S_NOATIME flag on every inode, but needs to know whether
it's a real noatime mount for a getattr optimization.

While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that were
only used by touch_atime.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:34 -08:00
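The check described in the message above (per-mount MNT_ flags consulted in addition to the per-superblock MS_ flags) can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions: the struct names, the flag values and the sketch_should_update_atime() helper are hypothetical and are not the kernel's definitions or the patch's actual code.

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define MS_NOATIME     0x1u  /* superblock-wide: filesystem never updates atime */
#define MS_NODIRATIME  0x2u  /* superblock-wide: never update directory atime */
#define MNT_NOATIME    0x1u  /* per-mount: this mount was mounted noatime */
#define MNT_NODIRATIME 0x2u  /* per-mount: this mount was mounted nodiratime */

/* Hypothetical stand-ins for the superblock, vfsmount and inode. */
struct sketch_sb    { unsigned int s_flags; };
struct sketch_mount { unsigned int mnt_flags; };
struct sketch_inode {
    const struct sketch_sb *sb;
    bool is_dir;
    bool s_noatime;          /* per-inode flag, e.g. set by nfs on every inode */
};

/* Returns true when an access through `mnt` should update the inode's atime. */
bool sketch_should_update_atime(const struct sketch_mount *mnt,
                                const struct sketch_inode *inode)
{
    if (inode->s_noatime)
        return false;
    /* Filesystems that are always noatime/nodiratime keep the MS_ flags. */
    if (inode->sb->s_flags & MS_NOATIME)
        return false;
    if (inode->is_dir && (inode->sb->s_flags & MS_NODIRATIME))
        return false;
    /* User-settable options are now checked on the mount, not the superblock. */
    if (mnt->mnt_flags & MNT_NOATIME)
        return false;
    if (inode->is_dir && (mnt->mnt_flags & MNT_NODIRATIME))
        return false;
    return true;
}
```

With this shape, the same superblock mounted twice can update atime through one mount and skip it through the other, which is the point of moving the options from the superblock to the mount.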
Christoph Hellwig
870f481793 [PATCH] replace inode_update_time with file_update_time
To allow various options to work per-mount instead of per-sb we need a
struct vfsmount when updating ctime and mtime.  This preparation patch
replaces the inode_update_time routine with a file_update_time routine so
we can easily get at the vfsmount.  (and the file makes more sense in this
context anyway).  Also get rid of the unused second argument - we always
want to update the ctime when calling this routine.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:30 -08:00
Christoph Hellwig
3542c6e18f [PATCH] remove xfs xattr permission checks
remove checks now in the VFS

XFS has an additional xattr interface through obscure ioctl.  it requires
raised capabilities but we need to add some read-only/immutable checks anyway

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:30 -08:00
Jes Sorensen
1b1dcc1b57 [PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.

Modified-by: Ingo Molnar <mingo@elte.hu>

(finished the conversion)

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-01-09 15:59:24 -08:00
Jes Sorensen
794ee1baee [PATCH] mutex subsystem, semaphore to mutex: XFS
This patch switches XFS over to use the new mutex code directly as
opposed to the previous workaround patch I posted earlier that avoided
the namespace clash by forcing it back to semaphores. This falls in the
'works for me<tm>' category.

Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-01-09 15:59:21 -08:00
OGAWA Hirofumi
28fd129827 [PATCH] Fix and add EXPORT_SYMBOL(filemap_write_and_wait)
This patch adds EXPORT_SYMBOL(filemap_write_and_wait) and uses it.

See mm/filemap.c:

It also changes filemap_write_and_wait() and filemap_write_and_wait_range().

The current filemap_write_and_wait() doesn't wait if filemap_fdatawrite()
returns an error.  However, even if filemap_fdatawrite() returned an
error, it may have submitted some of the data pages to the device
(e.g. in the case of -ENOSPC).

<quotation>
Andrew Morton writes,

If filemap_fdatawrite() returns an error, this might be due to some
I/O problem: dead disk, unplugged cable, etc.  Given the generally
crappy quality of the kernel's handling of such exceptions, there's a
good chance that the filemap_fdatawait() will get stuck in D state
forever.
</quotation>

So, this patch skips the wait only if filemap_fdatawrite() returns -EIO.

Trond, could you please review the nfs part?  Especially I'm not sure,
nfs must use the "filemap_fdatawrite(inode->i_mapping) == 0", or not.

Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:47 -08:00
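The control flow described in the message above (always wait after the write-out, except when the write-out itself reported -EIO) can be sketched in user-space C as follows. The sketch_mapping struct and the two stub functions are hypothetical stand-ins for an address_space, filemap_fdatawrite() and filemap_fdatawait(); this illustrates the described behaviour and is not the patch itself.

```c
#include <errno.h>

/* Hypothetical stand-in for an address_space; the fields drive the stubs. */
struct sketch_mapping {
    int write_result;  /* value the write-out stub returns */
    int wait_result;   /* value the wait stub returns */
    int waited;        /* set to 1 once the wait stub has run */
};

/* Stub for the write-out step: just reports the configured result. */
int sketch_fdatawrite(struct sketch_mapping *m)
{
    return m->write_result;
}

/* Stub for the wait step: records that it ran and reports its result. */
int sketch_fdatawait(struct sketch_mapping *m)
{
    m->waited = 1;
    return m->wait_result;
}

/*
 * Write out, then wait unless the write-out failed with -EIO.  On other
 * errors (for example -ENOSPC) some pages may already be in flight, so the
 * wait still runs; the first error seen is the one returned.
 */
int sketch_write_and_wait(struct sketch_mapping *m)
{
    int err = sketch_fdatawrite(m);

    if (err != -EIO) {
        int err2 = sketch_fdatawait(m);
        if (!err)
            err = err2;
    }
    return err;
}
```

Skipping the wait only for -EIO follows the concern quoted above: after a hard I/O failure the wait may never complete, while after -ENOSPC the pages that were submitted still need to be waited on.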
Nathan Scott
a4656391b7 [XFS] Fix a 32 bit value wraparound when providing a mapping for a large
direct write.

SGI-PV: 944820
SGI-Modid: xfs-linux-melb:xfs-kern:24351a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-25 16:41:57 +11:00
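The commit message above does not say where the wraparound occurred, so the following user-space C sketch only illustrates the general class of bug: a byte count for a large mapping computed in 32-bit arithmetic loses its high bits, while widening before the shift keeps them. Both helper names are hypothetical and do not come from the XFS source.

```c
#include <stdint.h>

/*
 * Hypothetical buggy helper: the shift is performed in 32 bits, so any
 * result of 4 GiB or more wraps around before it is widened.
 */
uint64_t sketch_bytes_narrow(uint32_t nblocks, uint32_t block_shift)
{
    uint32_t bytes = nblocks << block_shift;  /* high bits are discarded here */
    return bytes;
}

/* Hypothetical fixed helper: widen to 64 bits first, then shift. */
uint64_t sketch_bytes_wide(uint32_t nblocks, uint32_t block_shift)
{
    return (uint64_t)nblocks << block_shift;
}
```

For example, 1048576 blocks of 4096 bytes (a shift of 12) describe exactly 4 GiB; the narrow version reports 0 bytes for that mapping, while the wide version reports 4294967296.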
Olaf Hering
733482e445 [PATCH] changing CONFIG_LOCALVERSION rebuilds too much, for no good reason
This patch removes almost all inclusions of linux/version.h.  The 3
#defines are unused in most of the touched files.

A few drivers use the simple KERNEL_VERSION(a,b,c) macro, which is
unfortunately in linux/version.h.

There are also lots of #ifdefs for long obsolete kernels; these were not
touched.  In a few places, the linux/version.h include was moved to where
the LINUX_VERSION_CODE was used.

quilt vi `find * -type f -name "*.[ch]"|xargs grep -El '(UTS_RELEASE|LINUX_VERSION_CODE|KERNEL_VERSION|linux/version.h)'|grep -Ev '(/(boot|coda|drm)/|~$)'`

search pattern:
/UTS_RELEASE\|LINUX_VERSION_CODE\|KERNEL_VERSION\|linux\/\(utsname\|version\).h

Signed-off-by: Olaf Hering <olh@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:55:57 -08:00
Pekka J Enberg
2109a2d1b1 [PATCH] mm: rename kmem_cache_s to kmem_cache
This patch renames struct kmem_cache_s to kmem_cache so we can start using
it instead of kmem_cache_t typedef.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:24 -08:00
Nathan Scott
15c84a4701 [XFS] Remove no-longer-used qsort source.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-04 10:51:01 +11:00
Nathan Scott
7f248a81c5 [XFS] Cleanup cosmetic differences between source trees.
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-03 16:14:31 +11:00
Nathan Scott
19d5bcf370 [XFS] Ensure fsync does not incorrectly return EIO for pages beyond EOF.
SGI-PV: 944819
SGI-Modid: xfs-linux:xfs-kern:24236a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:14:09 +11:00
Nathan Scott
fdc7ed75c0 [XFS] Fix boundary conditions when issuing direct IOs from large userspace
buffers.

SGI-PV: 944820
SGI-Modid: xfs-linux:xfs-kern:24223a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:13:13 +11:00
Nathan Scott
c11e2c369d [XFS] Rework fid encode/decode wrt 64 bit inums interacting with NFS.
SGI-PV: 937127
SGI-Modid: xfs-linux:xfs-kern:24201a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:11:45 +11:00
Christoph Hellwig
7f14d0a013 [XFS] Simplify pagebuf_rele. Remove a conditional that can not be true
anymore and simplify the final put path a little

SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:200790a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:09:35 +11:00
Nathan Scott
6b3f6b5b87 [XFS] Rework the dquot hash sizing heuristics.
SGI-PV: 943123
SGI-Modid: xfs-linux:xfs-kern:24012a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:08:25 +11:00
Eric Sandeen
1f730e3b53 [XFS] Add ATTR_NOSIZETOK definition for xfs_vnodeops.c change
SGI-PV: 942439
SGI-Modid: xfs-linux:xfs-kern:200185a

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 15:08:10 +11:00
Nathan Scott
7b71876980 [XFS] Update license/copyright notices to match the preferred SGI
boilerplate.

SGI-PV: 913862
SGI-Modid: xfs-linux:xfs-kern:23903a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 14:58:39 +11:00
Nathan Scott
a844f4510d [XFS] Remove xfs_macros.c, xfs_macros.h, rework headers a whole lot.
SGI-PV: 943122
SGI-Modid: xfs-linux:xfs-kern:23901a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 14:38:42 +11:00
Nathan Scott
4aeb664c25 [XFS] Improve buffered read throughput by removing unnecessary timer calls
that showed in kernel profiles.

SGI-PV: 925163
SGI-Modid: xfs-linux:xfs-kern:23861a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:58 +11:00
Nathan Scott
0fdfb3757f [XFS] Remove a null CELL macro and its one caller, not useful to anyone.
SGI-PV: 942986
SGI-Modid: xfs-linux:xfs-kern:23860a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:42 +11:00
Nathan Scott
380b5dc0e5 [XFS] Fix up an internal sort function name collision issue.
SGI-PV: 942986
SGI-Modid: xfs-linux:xfs-kern:23859a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:18 +11:00
Nathan Scott
80cce77980 [XFS] Make some extended attributes routines take const parameters, for
the FreeBSD porters.

SGI-PV: 942906
SGI-Modid: xfs-linux:xfs-kern:23845a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 11:43:04 +11:00
David Chinner
e8c8b3a79d [XFS] Introduce two new mount options (nolargeio/largeio) to allow
filesystems to expose the filesystem stripe width in stat(2) rather than
the page cache size. This allows applications requiring high bandwidth to
easily determine the optimum I/O size for the underlying filesystem. The
default is to report the page cache size (i.e. "nolargeio").

SGI-PV: 942818
SGI-Modid: xfs-linux:xfs-kern:23830a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:33:05 +11:00
Nathan Scott
ee34807a65 [XFS] Provide a mechanism for flushing delalloc before quota reporting.
SGI-PV: 942815
SGI-Modid: xfs-linux:xfs-kern:23829a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:32:38 +11:00
Christoph Hellwig
c86e711ceb [XFS] only mark buffers done when all pages are uptodate. In addition,
replace PBF_NONE with an inverted PBF_DONE, so it's like all the other
flags.

SGI-PV: 942609
SGI-Modid: xfs-linux:xfs-kern:199136a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:29:39 +11:00
Christoph Hellwig
f538d4da8d [XFS] write barrier support. Issue all log sync operations as ordered
writes.  In addition flush the disk cache on fsync if the sync cached
operation didn't sync the log to disk (this requires some additional
bookkeeping in the transaction and log code). If the device doesn't claim to
support barriers, the filesystem has an external log volume or the trial
superblock write with barriers enabled failed we disable barriers and
print a warning.  We should probably fail the mount completely, but that
could lead to nasty boot failures for the root filesystem.  Not enabled by
default yet, needs more destructive testing first.

SGI-PV: 912426
SGI-Modid: xfs-linux:xfs-kern:198723a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:26:59 +11:00
Christoph Hellwig
739cafd316 [XFS] fix PBF_NONE handling
SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:198669a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:25:51 +11:00
Christoph Hellwig
88741a95af [XFS] remove unused pagebuf flags
SGI-PV: 908809
SGI-Modid: xfs-linux:xfs-kern:198656a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:21:14 +11:00
Christoph Hellwig
04d8b28416 [XFS] Make sure the threads and shaker in xfs_buf are de-initialized in
reverse startup order

SGI-PV: 942063
SGI-Modid: xfs-linux:xfs-kern:198651a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-11-02 10:15:05 +11:00
Hugh Dickins
4c21e2f244 [PATCH] mm: split page table lock
Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
a many-threaded application which concurrently initializes different parts of
a large anonymous area.

This patch corrects that, by using a separate spinlock per page table page, to
guard the page table entries in that page, instead of using the mm's single
page_table_lock.  (But even then, page_table_lock is still used to guard page
table allocation, and anon_vma allocation.)

In this implementation, the spinlock is tucked inside the struct page of the
page table page: with a BUILD_BUG_ON in case it overflows - which it would in
the case of 32-bit PA-RISC with spinlock debugging enabled.

Splitting the lock is not quite for free: another cacheline access.  Ideally,
I suppose we would use split ptlock only for multi-threaded processes on
multi-cpu machines; but deciding that dynamically would have its own costs.
So for now enable it by config, at some number of cpus - since the Kconfig
language doesn't support inequalities, let preprocessor compare that with
NR_CPUS.  But I don't think it's worth being user-configurable: for good
testing of both split and unsplit configs, split now at 4 cpus, and perhaps
change that to 8 later.

There is a benefit even for singly threaded processes: kswapd can be attacking
one part of the mm while another part is busy faulting.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:42 -07:00
Al Viro
27496a8c67 [PATCH] gfp_t: fs/*
- ->releasepage() annotated (s/int/gfp_t), instances updated
 - missing gfp_t in fs/* added
 - fixed misannotation from the original sweep caught by bitwise checks:
   XFS used __nocast both for gfp_t and for flags used by XFS allocator.
   The latter left with unsigned int __nocast; we might want to add a
   different type for those but for now let's leave them alone.  That,
   BTW, is a case when __nocast use had been actively confusing - it had
   been used in the same code for two different and similar types, with
   no way to catch misuses.  Switch of gfp_t to bitwise had caught that
   immediately...

One tricky bit is left alone to be dealt with later - mapping->flags is
a mix of gfp_t and error indications.  Left alone for now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-28 08:16:47 -07:00
Al Viro
dd0fc66fb3 [PATCH] gfp flags annotations - part 1
- added typedef unsigned int __nocast gfp_t;

 - replaced __nocast uses for gfp flags with gfp_t - it gives exactly
   the same warnings as far as sparse is concerned, doesn't change
   generated code (from gcc point of view we replaced unsigned int with
   typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-08 15:00:57 -07:00
Nishanth Aravamudan
041e0e3b19 [PATCH] fs: fix-up schedule_timeout() usage
Use schedule_timeout_{,un}interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.  Also use helper
functions to convert between human time units and jiffies rather than constant
HZ division to avoid rounding errors.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:36 -07:00
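The rounding problem mentioned in the message above can be illustrated with a small user-space C sketch. It assumes a tick rate of 250 per second (SKETCH_HZ) and uses hypothetical helper names; the round-up conversion below is an assumption made for the illustration, not a copy of the kernel's helpers.

```c
/* Assumed tick rate for this sketch; chosen so that 1000 % SKETCH_HZ == 0. */
#define SKETCH_HZ 250u

/*
 * The "constant HZ division" idiom: SKETCH_HZ / 100 is meant as "10 ms",
 * but integer division truncates, so the timeout can come out short.
 */
unsigned long sketch_hz_division(unsigned int divisor)
{
    return SKETCH_HZ / divisor;
}

/*
 * Helper-style conversion: round up so the timeout is never shorter than
 * the requested number of milliseconds.
 */
unsigned long sketch_msecs_to_ticks(unsigned int ms)
{
    const unsigned int ms_per_tick = 1000u / SKETCH_HZ;  /* 4 ms per tick */

    return (ms + ms_per_tick - 1) / ms_per_tick;
}
```

At 250 ticks per second, sketch_hz_division(100) yields 2 ticks (8 ms) for an intended 10 ms, and sketch_hz_division(1000) yields 0 ticks for an intended 1 ms, whereas sketch_msecs_to_ticks(10) yields 3 ticks and sketch_msecs_to_ticks(1) yields 1 tick.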
Nathan Scott
cde410a99d [XFS] Sort out some cosmetic differences between XFS trees.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:23719a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-05 11:47:01 +10:00
Nathan Scott
c31e887807 [XFS] Fix incorrect use of BMAPI_READ in unwritten extent handling
(luckily just cosmetic).

SGI-PV: 942232
SGI-Modid: xfs-linux-melb:xfs-kern:23718a

Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-05 10:06:55 +10:00
Christoph Hellwig
a3c476d8a1 [XFS] replace "extern inline" with "static inline". Patch from Adrian Bunk
<bunk@stusta.de>, thanks a lot!

SGI-PV: 942227
SGI-Modid: xfs-linux:xfs-kern:198642a

Signed-off-by: Christoph Hellwig <hch@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2005-09-05 08:40:49 +10:00