Commit graph

4972 commits

Author SHA1 Message Date
Oleg Nesterov
17b02695b2 [PATCH] taskstats_tgid_alloc: optimization
Every subthread (except first) does unneeded kmem_cache_alloc/kmem_cache_free.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28 11:30:54 -07:00
Oleg Nesterov
093a8e8aec [PATCH] taskstats_tgid_free: fix usage
taskstats_tgid_free() is called on copy_process's error path. This is wrong.

	IF (clone_flags & CLONE_THREAD)
		We should not clear ->signal->taskstats, current uses it,
		it probably has a valid accumulated info.
	ELSE
		taskstats_tgid_init() set ->signal->taskstats = NULL,
		there is nothing to free.

Move the callsite to __exit_signal(). We don't need any locking: the entire
thread group is exiting, so nobody should have a reference to the
soon-to-be-released ->signal.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28 11:30:54 -07:00
Stephen Rothwell
5fa3839a64 [PATCH] Constify compat_get_bitmap argument
This means we can call it when the bitmap we want to fetch is declared
const.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28 11:30:54 -07:00
Giridhar Pemmasani
52fd24ca1d [PATCH] __vmalloc with GFP_ATOMIC causes 'sleeping from invalid context'
If __vmalloc is called to allocate memory with GFP_ATOMIC in atomic
context, the chain of calls results in __get_vm_area_node allocating memory
for vm_struct with GFP_KERNEL, causing the 'sleeping from invalid context'
warning.  This patch fixes it by passing the gfp flags along so
__get_vm_area_node allocates memory for vm_struct with the same flags.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28 11:30:52 -07:00
Martin Bligh
3bb1a852ab [PATCH] vmscan: Fix temp_priority race
The temp_priority field in zone is racy, as we can walk through a reclaim
path, and just before we copy it into prev_priority, it can be overwritten
(say with DEF_PRIORITY) by another reclaimer.

The same bug is contained in both try_to_free_pages and balance_pgdat, but
it is fixed slightly differently.  In balance_pgdat, we keep a separate
priority record per zone in a local array.  In try_to_free_pages there is
no need to do this, as the priority level is the same for all zones that we
reclaim from.

Impact of this bug is that temp_priority is copied into prev_priority, and
setting this artificially high causes reclaimers to set distress
artificially low.  They then fail to reclaim mapped pages, when they are,
in fact, under severe memory pressure (their priority may be as low as 0).
This causes the OOM killer to fire incorrectly.

From: Andrew Morton <akpm@osdl.org>

__zone_reclaim() isn't modifying zone->prev_priority.  But zone->prev_priority
is used in the decision whether or not to bring mapped pages onto the inactive
list.  Hence there's a risk here that __zone_reclaim() will fail because
zone->prev_priority is large (ie: low urgency) and lots of mapped pages end up
stuck on the active list.

Fix that up by decreasing (ie making more urgent) zone->prev_priority as
__zone_reclaim() scans the zone's pages.

This bug perhaps explains why ZONE_RECLAIM_PRIORITY was created.  It should be
possible to remove that now, and to just start out at DEF_PRIORITY?

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28 11:30:50 -07:00
Nick Piggin
2ae88149a2 [PATCH] mm: clean up pagecache allocation
- Consolidate page_cache_alloc

- Fix splice: only the pagecache pages and filesystem data need to use
  mapping_gfp_mask.

- Fix grab_cache_page_nowait: same as splice, also honour NUMA placement.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-28 11:30:50 -07:00
Andrew Morton
735a7ffb73 [PATCH] drivers: wait for threaded probes between initcall levels
The multithreaded-probing code has a problem: after one initcall level (eg,
core_initcall) has been processed, we will then start processing the next
level (postcore_initcall) while the kernel threads which are handling
core_initcall are still executing.  This breaks the guarantees which the
layered initcalls previously gave us.

IOW, we want to be multithreaded _within_ an initcall level, but not between
different levels.

Fix that up by causing the probing code to wait for all outstanding probes at
one level to complete before we start processing the next level.
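
The ordering rule described above (parallel within a level, serialized between levels) can be illustrated with a minimal Python sketch. This is not the kernel's probing code; `run_initcall_levels`, the probe names, and the use of `threading` are assumptions made purely for illustration.

```python
import threading

def run_initcall_levels(levels):
    """Run each level's probes in threads; finish a level before starting the next."""
    log = []
    log_lock = threading.Lock()

    def run_probe(level_name, probe_name):
        # A real probe would initialize a device; here we only record that it ran.
        with log_lock:
            log.append((level_name, probe_name))

    for level_name, probes in levels:
        threads = [
            threading.Thread(target=run_probe, args=(level_name, probe))
            for probe in probes
        ]
        for t in threads:
            t.start()
        # Barrier: wait for every outstanding probe at this level to complete.
        for t in threads:
            t.join()
    return log

# Hypothetical levels and probes, named after the initcall levels in the text.
LEVELS = [("core_initcall", ["a", "b", "c"]), ("postcore_initcall", ["d", "e"])]
```

Probes within one level may finish in any order, but no entry for a later level can appear in the log before all entries of the earlier level.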

Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-27 15:34:51 -07:00
Christian Krafft
0344c6c538 [POWERPC] sysfs: add support for adding/removing spu sysfs attributes
This patch adds two functions to create and remove sysfs attributes and
attribute_group to all cpus.  That allows sysfs attributes to be registered in
a subdirectory like: /sys/devices/system/cpu/cpuX/group_name/what_ever
This will be used by cbe_thermal to group all attributes dealing with
thermal support in one directory.

Signed-off-by: Christian Krafft <krafft@de.ibm.com>

Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-25 14:20:22 +10:00
Russell King
04fed361da [PATCH] Remove __must_check for device_for_each_child()
Eliminate more __must_check madness.

The return code from device_for_each_child() depends on the values
which the helper function returns.  If the helper function always
returns zero, it's utterly pointless to check the return code from
device_for_each_child().

The only code which knows if the return value should be checked is
the caller itself, so forcing the return code to always be checked
is silly.  Hence, remove the __must_check annotation.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-23 11:01:33 -07:00
Linus Torvalds
cb7fabcf9d Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [PATCH] libata-sff: Allow for wacky systems
  [PATCH] ahci: readability tweak
  [PATCH] libata: typo fix
  [PATCH] ATA must depend on BLOCK
  [PATCH] libata: use correct map_db values for ICH8
2006-10-21 13:41:41 -07:00
Linus Torvalds
5d6aaf3f6d Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
  [PATCH] x86-64: Revert timer routing behaviour back to 2.6.16 state
  [PATCH] x86-64: Overlapping program headers in physical addr space fix
  [PATCH] x86-64: Put more than one cpu in TARGET_CPUS
  [PATCH] x86: Revert new unwind kernel stack termination
  [PATCH] x86-64: Use irq_domain in ioapic_retrigger_irq
  [PATCH] i386: Disable nmi watchdog on all ThinkPads
  [PATCH] x86-64: Revert interrupt backlink changes
  [PATCH] x86-64: Fix ENOSYS in system call tracing
  [PATCH] i386: Fix fake return address
  [PATCH] x86-64: x86_64 add NX mask for PTE entry
  [PATCH] x86-64: Speed up dwarf2 unwinder
  [PATCH] x86: Use -maccumulate-outgoing-args
  [PATCH] x86-64: fix page align in e820 allocator
  [PATCH] x86-64: Fix for arch/x86_64/pci/Makefile CFLAGS
  [PATCH] i386: fix .cfi_signal_frame copy-n-paste error
  [PATCH] x86-64: typo in __assign_irq_vector when updating pos for vector and offset
  [PATCH] x86-64: x86_64 hot-add memory srat.c fix
  [PATCH] i386: Update defconfig
  [PATCH] x86-64: Update defconfig
2006-10-21 13:36:46 -07:00
Paul Jackson
faf6bbcf94 [PATCH] cpuset: mempolicy migration typo fix
Mistyped an ifdef CONFIG_CPUSETS - fixed.

I doubt that anyone ever noticed.  The impact of this typo was
that if someone:
 1) was using MPOL_BIND to force off node allocations
 2) while using cpusets to constrain memory placement
 3) when that cpuset was migrating that job's memory
 4) while the tasks in that job were actively forking
then there was a rare chance that future allocations using
that MPOL_BIND policy would be node local, not off node.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:06 -07:00
Andy Whitcroft
7516795739 [PATCH] Reintroduce NODES_SPAN_OTHER_NODES for powerpc
Reintroduce NODES_SPAN_OTHER_NODES for powerpc

Revert "[PATCH] Remove SPAN_OTHER_NODES config definition"
    This reverts commit f62859bb68.
Revert "[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES"
    This reverts commit a94b3ab7ea.

Also update the comments to indicate that this is still required
and where it's used.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Kravetz <kravetz@us.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:06 -07:00
Andrew Morton
d42552c3ac [PATCH] pci: declare pci_get_device_reverse()
We seem to have lost the declaration of pci_get_device_reverse(), if we ever
had one.

Add a CONFIG_PCI=0 stub too.

Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:05 -07:00
NeilBrown
4f2e639af4 [PATCH] md: endian annotations for the bitmap superblock
And a couple of bug fixes found by sparse.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:05 -07:00
NeilBrown
1c05b4bc22 [PATCH] md: endian annotation for v1 superblock access
Includes a couple of bugfixes found by sparse.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:05 -07:00
NeilBrown
da3ed32fe5 [PATCH] md: add another COMPAT_IOCTL for md
..  so that you can use bitmaps with 32bit userspace on a 64 bit kernel.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-21 13:35:05 -07:00
Tejun Heo
3343571d9f [PATCH] libata: typo fix
Typo fix in comment.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-10-21 15:18:59 -04:00
Linus Torvalds
7b7fc708b5 Merge branch 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] Remove SUID when splicing into an inode
  [PATCH] Add lockless helpers for remove_suid()
  [PATCH] Introduce generic_file_splice_write_nolock()
  [PATCH] Take i_mutex in splice_from_pipe()
2006-10-21 10:01:52 -07:00
Andi Kleen
a1bae67243 [PATCH] i386: Disable nmi watchdog on all ThinkPads
Even newer ThinkPads have bugs in SMM code that cause hangs with
NMI watchdog.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-10-21 18:37:02 +02:00
Jan Beulich
690a973f48 [PATCH] x86-64: Speed up dwarf2 unwinder
This changes the dwarf2 unwinder to do a binary search for CIEs
instead of a linear walk. The linker is unfortunately not
able to build a proper lookup table at link time, so the code instead creates
one at runtime as soon as the bootmem allocator is usable (so you'll continue
using the linear lookup for the first [hopefully] few calls).
The code should be ready to utilize a build-time created table once
a fixed linker becomes available.
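
As a rough illustration of the idea (not the unwinder's actual code), the following Python sketch builds a sorted lookup table once and then resolves an address by binary search instead of a linear scan. The entry layout (a start address plus a name) and all function names are assumptions; real unwind tables carry more information, such as the length of each range.

```python
import bisect

def build_lookup_table(entries):
    """Sort (start_pc, name) pairs once so later lookups can use binary search."""
    ordered = sorted(entries)
    starts = [start for start, _ in ordered]
    names = [name for _, name in ordered]
    return starts, names

def lookup_linear(entries, pc):
    """Scan every entry; return the one with the greatest start address <= pc."""
    best = None
    for start, name in entries:
        if start <= pc and (best is None or start > best[0]):
            best = (start, name)
    return None if best is None else best[1]

def lookup_binary(starts, names, pc):
    """Binary search for the last entry whose start address is <= pc."""
    i = bisect.bisect_right(starts, pc) - 1
    return names[i] if i >= 0 else None

# Hypothetical unsorted entries, as they might appear before the table exists.
ENTRIES = [(0x3000, "func_c"), (0x1000, "func_a"), (0x2000, "func_b")]
STARTS, NAMES = build_lookup_table(ENTRIES)
```

Until the table has been built, only `lookup_linear` is usable, which mirrors the note above about the first few calls still taking the linear path.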

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-10-21 18:37:01 +02:00
Linus Torvalds
c144879164 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (36 commits)
  [Bluetooth] Fix HID disconnect NULL pointer dereference
  [Bluetooth] Add missing entry for Nokia DTL-4 PCMCIA card
  [Bluetooth] Add support for newer ANYCOM USB dongles
  [NET]: Can use __get_cpu_var() instead of per_cpu() in loopback driver.
  [IPV4] inet_peer: Group together avl_left, avl_right, v4daddr to speedup lookups on some CPUS
  [TCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in tcp_v4_err()
  [NETFILTER]: Missing check for CAP_NET_ADMIN in iptables compat layer
  [NETPOLL]: initialize skb for UDP
  [IPV6]: Fix route.c warnings when multiple tables are disabled.
  [TG3]: Bump driver version and release date.
  [TG3]: Add lower bound checks for tx ring size.
  [TG3]: Fix set ring params tx ring size implementation
  [NET]: reduce per cpu ram used for loopback stats
  [IPv6] route: Fix prohibit and blackhole routing decision
  [DECNET]: Fix input routing bug
  [TCP]: Bound TSO defer time
  [IPv4] fib: Remove unused fib_config members
  [IPV6]: Always copy rt->u.dst.error when copying a rt6_info.
  [IPV6]: Make IPV6_SUBTREES depend on IPV6_MULTIPLE_TABLES.
  [IPV6]: Clean up BACKTRACK().
  ...
2006-10-20 10:27:38 -07:00
Al Viro
a90b061c0b [PATCH] nfsd: nfs_replay_me
We are using NFS_REPLAY_ME as a special error value that is never leaked to
clients.  That works fine; the only problem is mixing host- and network-
endian values in the same objects.  Network-endian equivalent would work just
as fine; switch to it.
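
To show why mixing host- and network-endian values in one object is error-prone, here is a small Python sketch. The sentinel value and helper names are hypothetical and are not taken from the nfsd source; the sketch only demonstrates that the same number has different byte layouts in network (big-endian) and little-endian host order.

```python
import struct

def to_network_endian(value):
    """Encode a 32-bit unsigned integer in network (big-endian) byte order."""
    return struct.pack("!I", value)

def from_network_endian(data):
    """Decode 4 bytes in network byte order back into an integer."""
    return struct.unpack("!I", data)[0]

# Hypothetical internal-only sentinel, chosen for illustration.
HOST_SENTINEL = 11001
WIRE_SENTINEL = to_network_endian(HOST_SENTINEL)

# The same number laid out the way a little-endian host stores it in memory.
LITTLE_ENDIAN_BYTES = struct.pack("<I", HOST_SENTINEL)
```

Keeping every value in such an object in one byte order means a comparison never has to guess which representation it is looking at.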

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:43 -07:00
Al Viro
c7afef1f96 [PATCH] nfsd: misc endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:43 -07:00
Al Viro
b37ad28bca [PATCH] nfsd: nfs4 code returns error values in net-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:42 -07:00
Al Viro
6264d69d7d [PATCH] nfsd: vfs.c endianness annotations
don't use the same variable to store NFS and host error values

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:41 -07:00
Al Viro
2ebbc012a9 [PATCH] xdr annotations: NFSv4 server
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:41 -07:00
Al Viro
91f07168ce [PATCH] xdr annotations: NFSv3 server
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:41 -07:00
Al Viro
131a21c217 [PATCH] xdr annotations: NFSv2 server
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:41 -07:00
Al Viro
83b11340d6 [PATCH] nfsfh simple endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:41 -07:00
Al Viro
63f103111f [PATCH] nfsd: nfserrno() endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:41 -07:00
Al Viro
bc4785cd47 [PATCH] nfs: verifier is network-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Al Viro
0dbb4c6799 [PATCH] xdr annotations: NFS readdir entries
on-the-wire data is big-endian

[in large part pulled from Alexey's patch]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Al Viro
52921e02a4 [PATCH] lockd endianness annotations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Al Viro
7111c66e4e [PATCH] fix svc_procfunc declaration
svc_procfunc instances return __be32, not int

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Trond Myklebust
cd9ae2b6a7 [PATCH] NFS: Deal with failure of invalidate_inode_pages2()
If invalidate_inode_pages2() fails, then it should in principle just be
because the current process was signalled.  In that case, we just want to
ensure that the inode's page cache remains marked as invalid.

Also add a helper to allow the O_DIRECT code to simply mark the page cache as
invalid once it is finished writing, instead of calling
invalidate_inode_pages2() itself.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:39 -07:00
Alexey Dobriyan
8ac773b4f7 [PATCH] OOM killer meets userspace headers
Although mm.h is not an exported header, it does contain one thing
which is part of the userspace ABI -- the value that disables the OOM killer
for a given process. So,
a) create and export include/linux/oom.h
b) move OOM_DISABLE define there.
c) turn bounding values of /proc/$PID/oom_adj into defines and export
   them too.

Note: mass __KERNEL__ removal will be done later.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:38 -07:00
Ingo Molnar
145fc655a1 [PATCH] genirq: clean up irq-flow-type naming, fix
Re-add the set_irq_chip_and_handler() prototype, it's still widely used.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:37 -07:00
Ralf Baechle
34e856e6a5 [PATCH] Make <linux/personality.h> userspace proof
<linux/personality.h> contains the constants for personality(2) but also
some definitions that are useless or even harmful in userspace such as the
personality() macro.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:35 -07:00
Andrew Morton
3fcfab16c5 [PATCH] separate bdi congestion functions from queue congestion functions
Separate out the concept of "queue congestion" from "backing-dev congestion".
Congestion is a backing-dev concept, not a queue concept.

The blk_* congestion functions are retained, as wrappers around the core
backing-dev congestion functions.

This proper layering is needed so that NFS can cleanly use the congestion
functions, and so that CONFIG_BLOCK=n actually links.

Cc: "Thomas Maier" <balagi@justmail.de>
Cc: "Jens Axboe" <jens.axboe@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: David Howells <dhowells@redhat.com>
Cc: Peter Osterlund <petero2@telia.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:35 -07:00
Thomas Maier
79e2de4bc5 [PATCH] export clear_queue_congested and set_queue_congested
Export the clear_queue_congested() and set_queue_congested() functions
located in ll_rw_blk.c

The functions are renamed to blk_clear_queue_congested() and
blk_set_queue_congested().

(needed in the pktcdvd driver's bio write congestion control)

Signed-off-by: Thomas Maier <balagi@justmail.de>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:35 -07:00
Jens Axboe
01de85e057 [PATCH] Add lockless helpers for remove_suid()
Right now users have to grab i_mutex before calling remove_suid(), in the
unlikely event that a call to ->setattr() may be needed. Split up the
function in two parts:

- One to check if we need to remove suid
- One to actually remove it

The first we can call lockless.
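
The split described above can be sketched in Python as a pure check followed by a separate modification step. This is a simplified illustration, not the kernel functions: the flag constants and function names are assumptions, and checks a real implementation would make (file type, capabilities) are omitted.

```python
import stat

# Hypothetical flag values used only in this sketch.
KILL_SUID = 1
KILL_SGID = 2

def should_remove_suid(mode):
    """Pure check with no side effects, so it needs no lock.

    Returns a bitmask of the bits that would have to be cleared.
    """
    kill = 0
    if mode & stat.S_ISUID:
        kill |= KILL_SUID
    # Assumption in this sketch: setgid only counts when group-execute is set.
    if (mode & stat.S_ISGID) and (mode & stat.S_IXGRP):
        kill |= KILL_SGID
    return kill

def remove_suid(mode, kill):
    """The modifying step; a caller would take the lock only around this part."""
    if kill & KILL_SUID:
        mode &= ~stat.S_ISUID
    if kill & KILL_SGID:
        mode &= ~stat.S_ISGID
    return mode
```

A caller can run `should_remove_suid()` first and skip the locking entirely in the common case where it returns 0.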

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-19 20:53:08 +02:00
Mark Fasheh
6da6180982 [PATCH] Introduce generic_file_splice_write_nolock()
This allows file systems to manage their own i_mutex locking while
still re-using the generic_file_splice_write() logic.

OCFS2 in particular wants this so that it can order cluster locks within
i_mutex.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-19 20:53:08 +02:00
Mark Fasheh
62752ee198 [PATCH] Take i_mutex in splice_from_pipe()
The splice_actor may be calling ->prepare_write() and ->commit_write(). We
want i_mutex on the inode being written to before calling those so that we
don't race i_size changes.

The double locking behavior is done elsewhere in splice.c, and if we
eventually want _nolock variants of generic_file_splice_write(), fs modules
might have to replicate the nasty locking code. We introduce
inode_double_lock() and inode_double_unlock() to consolidate the locking
rules into one set of functions.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-19 20:53:08 +02:00
John Heffner
ae8064ac32 [TCP]: Bound TSO defer time
This patch limits the amount of time you will defer sending a TSO segment
to less than two clock ticks, or the time between two acks, whichever is
longer.
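
The bound stated above can be written as a small Python sketch. The function names and the use of integer milliseconds are assumptions for illustration; the patch itself works in kernel time units and is not reproduced here.

```python
def tso_defer_limit(tick_ms, inter_ack_ms):
    """Upper bound on deferral: two clock ticks or the ack spacing, whichever is longer."""
    return max(2 * tick_ms, inter_ack_ms)

def may_keep_deferring(deferred_ms, tick_ms, inter_ack_ms):
    """Deferral is allowed only while it stays below the bound."""
    return deferred_ms < tso_defer_limit(tick_ms, inter_ack_ms)
```

For example, with a hypothetical 10 ms tick and acks arriving 5 ms apart, the limit is 20 ms; with a 4 ms tick and acks 100 ms apart, the ack spacing dominates and the limit is 100 ms.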

On slow links, deferring causes significant bursts.  See attached plots,
which show RTT through a 1 Mbps link with a 100 ms RTT and ~100 ms queue
for (a) non-TSO, (b) current TSO, and (c) patched TSO.  This burstiness
causes significant jitter, tends to overflow queues early (bad for short
queues), and makes delay-based congestion control more difficult.

Deferring by a couple clock ticks I believe will have a relatively small
impact on performance.

Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-18 20:36:48 -07:00
Lijun Chen
eb409460b1 [TIPC]: Added subscription cancellation capability
This patch allows a TIPC application to cancel an existing
topology service subscription by re-requesting the subscription
with the TIPC_SUB_CANCEL filter bit set.  (All other bits of
the cancel request must match the original subscription request.)
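
The matching rule can be sketched as follows in Python. This is a toy model, not TIPC code: the `SubscriptionTable` class, its fields, and the numeric value given to `TIPC_SUB_CANCEL` are assumptions, and a real subscription request carries more fields than the ones shown.

```python
TIPC_SUB_CANCEL = 0x4  # assumed bit value, for illustration only

class SubscriptionTable:
    """Toy table keyed on the request fields with the cancel bit masked out."""

    def __init__(self):
        self._subs = []

    def request(self, name_type, lower, upper, filter_bits):
        # Every field except the cancel bit must match the original request.
        key = (name_type, lower, upper, filter_bits & ~TIPC_SUB_CANCEL)
        if filter_bits & TIPC_SUB_CANCEL:
            if key in self._subs:
                self._subs.remove(key)
                return "cancelled"
            return "no match"
        self._subs.append(key)
        return "subscribed"
```

Re-sending the original request with only the cancel bit added removes the subscription; changing any other field leaves it in place.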

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: Per Liden <per.liden@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-18 19:55:22 -07:00
Linus Torvalds
ce9e3d9953 Merge branch 'ubuntu-updates' of master.kernel.org:/pub/scm/linux/kernel/git/bcollins/ubuntu-2.6
* 'ubuntu-updates' of master.kernel.org:/pub/scm/linux/kernel/git/bcollins/ubuntu-2.6:
  [pci_ids] Add Quicknet XJ vendor/device ID's.
  [valkyriefb] Ifdef for when CONFIG_NVRAM isn't enabled.
  [platinumfb] Ifdef for when CONFIG_NVRAM isn't enabled.
  [igafb] Add pci dev table for module auto loading.
  [controlfb] Ifdef for when CONFIG_NVRAM isn't enabled.
  [hid-core] TurboX Keyboard needs NOGET quirk.
  [ixj] Add pci dev table for module auto loading.
  [initio] Add pci dev table for module auto loading.
  [fdomain] Add pci dev table for module auto loading.
  [BusLogic] Add pci dev table for auto module loading.
  [mv643xx] Add pci device table for auto module loading.
  [alim7101] Add pci dev table for auto module loading.
2006-10-18 18:30:00 -07:00
Greg Kroah-Hartman
7a54f25cef PCI Hotplug: move pci_hotplug.h to include/linux/
This makes it possible to build pci hotplug drivers outside of the main
kernel tree, and Sam keeps telling me to move local header files to
their proper places...

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-18 11:36:12 -07:00
Matt Domsch
6b4b78fed4 PCI: optionally sort device lists breadth-first
Problem:
New Dell PowerEdge servers have 2 embedded ethernet ports, which are
labeled NIC1 and NIC2 on the chassis, in the BIOS setup screens, and
in the printed documentation.  Assuming no other add-in ethernet ports
in the system, Linux 2.4 kernels name these eth0 and eth1
respectively.  Many people have come to expect this naming.  Linux 2.6
kernels name these eth1 and eth0 respectively (backwards from
expectations).  I also have reports that various Sun and HP servers
have similar behavior.


Root cause:
Linux 2.4 kernels walk the pci_devices list, which happens to be
sorted in breadth-first order (or pcbios_find_device order on i386,
which most often is breadth-first also).  2.6 kernels have both the
pci_devices list and the pci_bus_type.klist_devices list, the latter
is what is walked at driver load time to match the pci_id tables; this
klist happens to be in depth-first order.

On systems where, for physical routing reasons, NIC1 appears on a
lower bus number than NIC2, but NIC2's bridge is discovered first in
the depth-first ordering, NIC2 will be discovered before NIC1.  If the
list were sorted breadth-first, NIC1 would be discovered before NIC2.

A PowerEdge 1955 system has the following topology which easily
exhibits the difference between depth-first and breadth-first device
lists.

-[0000:00]-+-00.0  Intel Corporation 5000P Chipset Memory Controller Hub
           +-02.0-[0000:03-08]--+-00.0-[0000:04-07]--+-00.0-[0000:05-06]----00.0-[0000:06]----00.0  Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (labeled NIC2, 2.4 kernel name eth1, 2.6 kernel name eth0)
           +-1c.0-[0000:01-02]----00.0-[0000:02]----00.0  Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (labeled NIC1, 2.4 kernel name eth0, 2.6 kernel name eth1)
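
The effect of the two orderings on this topology can be reproduced with a short Python sketch. It models the tree above as nested tuples and walks it depth-first and breadth-first; it is a traversal illustration only and does not mirror how the patch actually sorts the kernel's device lists. The bus:device.function labels are read off the tree, with the domain prefix dropped for brevity.

```python
from collections import deque

# (device, children) pairs transcribed from the PowerEdge 1955 topology above.
TREE = [
    ("00:00.0", []),
    ("00:02.0", [("03:00.0", [("04:00.0", [("05:00.0", [("06:00.0 NIC2", [])])])])]),
    ("00:1c.0", [("01:00.0", [("02:00.0 NIC1", [])])]),
]

def depth_first(nodes):
    """Visit each device, then everything behind it, before moving to its sibling."""
    order = []
    for name, children in nodes:
        order.append(name)
        order.extend(depth_first(children))
    return order

def breadth_first(nodes):
    """Visit all devices at one depth before descending to the next depth."""
    order = []
    queue = deque(nodes)
    while queue:
        name, children = queue.popleft()
        order.append(name)
        queue.extend(children)
    return order

def nic_order(order):
    """Return just the NIC labels, in discovery order."""
    return [name.split()[1] for name in order if "NIC" in name]
```

Depth-first discovery reaches NIC2 first because its bridge (02.0) precedes NIC1's bridge (1c.0) on bus 00, while breadth-first discovery reaches NIC1 first because it sits fewer bridges away from the root.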


Other factors, such as device driver load order and the presence of
PCI slots at various points in the bus hierarchy further complicate
this problem; I'm not trying to solve those here, just restore the
device order, and thus basic behavior, that 2.4 kernels had.


Solution:

The solution can come in multiple steps.

Suggested fix #1: kernel
Patch below optionally sorts the two device lists into breadth-first
ordering to maintain compatibility with 2.4 kernels.  It adds two new
command line options:
  pci=bfsort
  pci=nobfsort
to force the sort order, or not, as you wish.  It also adds DMI checks
for the specific Dell systems which exhibit "backwards" ordering, to
make them "right".


Suggested fix #2: udev rules from userland
Many people also have the expectation that embedded NICs are always
discovered before add-in NICs (which this patch does not try to do).
Using the PCI IRQ Routing Table provided by system BIOS, it's easy to
determine which PCI devices are embedded, or if add-in, which PCI slot
they're in.  I'm working on a tool that would allow udev to name
ethernet devices in ascending embedded, slot 1 .. slot N order,
subsort by PCI bus/dev/fn breadth-first.  It'll be possible to use it
independent of udev as well for those distributions that don't use
udev in their installers.

Suggested fix #3: system board routing rules
One can constrain the system board layout to put NIC1 ahead of NIC2
regardless of breadth-first or depth-first discovery order.  This adds
a significant level of complexity to board routing, and may not be
possible in all instances (witness the above systems from several
major manufacturers).  I don't want to encourage this particular train
of thought too far, at the expense of not doing #1 or #2 above.


Feedback appreciated.  Patch tested on a Dell PowerEdge 1955 blade
with 2.6.18.

You'll also note I took some liberty and temporarily broke the klist
abstraction to simplify and speed up the sort algorithm.  I think
that's both safe and appropriate in this instance.


Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-18 11:36:12 -07:00
Alan Cox
29f3eb6463 pci: Additional search functions
In order to finish converting to pci_get_* interfaces we need to add a couple
of bits of missing functionality

pci_get_bus_and_slot() provides the equivalent to pci_find_slot()
(pci_get_slot is already taken as a name for something similar but not the
same)

pci_get_device_reverse() is the equivalent of pci_find_device_reverse but
with refcounting

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-10-18 11:36:12 -07:00
Ben Collins
74d919465a [pci_ids] Add Quicknet XJ vendor/device ID's.
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
2006-10-18 08:55:54 -04:00
Linus Torvalds
43f82216f0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: fm801-gp - handle errors from pci_enable_device()
  Input: gameport core - handle errors returned by device_bind_driver()
  Input: serio core - handle errors returned by device_bind_driver()
  Lockdep: fix compile error in drivers/input/serio/serio.c
  Input: serio - add lockdep annotations
  Lockdep: add lockdep_set_class_and_subclass() and lockdep_set_subclass()
  Input: atkbd - supress "too many keys" error message
  Input: i8042 - supress ACK/NAKs when blinking during panic
  Input: add missing exports to fix modular build
2006-10-17 08:56:43 -07:00
Jan Kara
58ff407bee [PATCH] Fix IO error reporting on fsync()
When an IO error happens on a metadata buffer, the buffer is freed from memory,
and fsync() is later called, filesystems like ext2 fail to report EIO.  We
solve the problem by introducing a pointer to the associated address space into
the buffer_head.  When a buffer is removed from a list of metadata buffers
associated with an address space, IO error is transferred from the buffer to
the address space, so that fsync can later report it.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-17 08:18:46 -07:00
NeilBrown
d343fce148 [PATCH] knfsd: Allow lockd to drop replies as appropriate
It is possible for the ->fopen callback from lockd into nfsd to find that an
answer cannot be given straight away (an upcall is needed) and so the request
has to be 'dropped', to be retried later.  That error status is not currently
propagated back.

So:
  Change nlm_fopen to return nlm error codes (rather than a private
  protocol) and define a new nlm_drop_reply code.
  Cause nlm_drop_reply to cause the rpc request to get rpc_drop_reply
  when this error comes back.
  Cause svc_process to drop a request which returns a status of
  rpc_drop_reply.

[akpm@osdl.org: fix warning storm]
Cc: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-17 08:18:46 -07:00
Miklos Szeredi
7762f5a0b7 [PATCH] document i_size_write locking rules
Unless someone reads the documentation for write_seqcount_{begin,end} it is
not obvious that i_size_write() needs locking, and especially that a lack of
such locking can result in a system hang.
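
To illustrate why the writer side needs serialization, here is a toy,
single-threaded model of a sequence counter.  The names (toy_seqcount,
toy_write_begin and so on) are made up for this sketch and are not the kernel
API; the real primitives also use memory barriers, which are omitted here.

```c
#include <stdbool.h>

/* Toy sequence counter: an odd value means a write is in progress. */
struct toy_seqcount {
	unsigned int sequence;
};

/* Writer side: callers must already hold a lock that excludes other writers. */
static void toy_write_begin(struct toy_seqcount *s)
{
	s->sequence++;
}

static void toy_write_end(struct toy_seqcount *s)
{
	s->sequence++;
}

/* Reader side: snapshot the counter before reading the protected value. */
static unsigned int toy_read_begin(const struct toy_seqcount *s)
{
	return s->sequence;
}

/* A read must be retried if a write was in progress or completed meanwhile. */
static bool toy_read_retry(const struct toy_seqcount *s, unsigned int start)
{
	return (start & 1) || s->sequence != start;
}
```

In this model, if two unserialized writers race and one increment is lost,
the counter can be left odd, and every reader then retries forever.  That is
one way the hang described above could come about.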

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-17 08:18:45 -07:00
Ingo Molnar
a460e745e8 [PATCH] genirq: clean up irq-flow-type naming
Introduce desc->name and eliminate the handle_irq_name() hack.  Add
set_irq_chip_and_handler_name() to set the flow type and name at once.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-17 08:18:45 -07:00
Stephen Hemminger
aaa248f6c9 [PATCH] rename net_random to random32
Make net_random() more widely available by calling it random32

akpm: hopefully this will permit the removal of carta_random32.  That needs
confirmation from Stephane - this code looks somewhat more computationally
expensive, and has a different (ie: callee-stateful) interface.

[akpm@osdl.org: lots of build fixes, cleanups]
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-17 08:18:43 -07:00
Linus Torvalds
0b269d8462 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (28 commits)
  ACPI: check battery status on resume for un/plug events during sleep
  ACPICA: Fix incorrect handling of PCI Express Root Bridge _HID
  ACPI: asus_acpi: don't printk on writing garbage to proc files
  ACPI: asus_acpi: fix proc files parsing
  ACPI: SCI interrupt source override
  ACPI: fix printk format warnings
  ACPI: fix section for CPU init functions
  ACPI: update comments in motherboard.c
  ACPI: acpi_pci_link_set() can allocate with either GFP_ATOMIC or GFP_KERNEL
  ACPI: fix potential OOPS in power driver with CONFIG_ACPI_DEBUG
  ACPI: ibm_acpi: delete obsolete documentation
  ACPI: created a dedicated workqueue for notify() execution
  ACPI: Remove deferred execution from global lock acquire wakeup path
  MSI S270 Laptop support: backlight, wlan, bluetooth states
  ACPI: EC: export ec_transaction() for msi-laptop driver
  ACPI: EC: Simplify acpi_hw_low_level*() with inb()/outb().
  ACPI: EC: Unify poll and interrupt gpe handlers
  ACPI: EC: Unify poll and interrupt mode transaction functions
  ACPI: EC: Remove unused variables and duplicated code
  ACPI: EC: Remove unnecessary delay added by previous transation patch.
  ...
2006-10-15 11:02:52 -07:00
Lennart Poettering
d7a76e4cb3 ACPI: consolidate functions in acpi ec driver
Unify the following functions:

    acpi_ec_poll_read()
    acpi_ec_poll_write()
    acpi_ec_poll_query()
    acpi_ec_intr_read()
    acpi_ec_intr_write()
    acpi_ec_intr_query()

into:

    acpi_ec_poll_transaction()
    acpi_ec_intr_transaction()

These new functions take as arguments an ACPI EC command, a few bytes
to write to the EC data register and a buffer for a few bytes to read
from the EC data register. The old _read(), _write(), _query() are
just special cases of these functions.

Then unified the code in acpi_ec_poll_transaction() and
acpi_ec_intr_transaction() a little more. Both functions are now just
wrappers around the new acpi_ec_transaction_unlocked() function. The
latter contains the EC access logic; the two original
functions now just do their special way of locking and call the
new function for the actual work.

This saves a lot of very similar code. The primary reason for doing
this, however, is that my driver for MSI 270 laptops needs to issue
some non-standard EC commands in a safe way. Due to this I added a new
exported function similar to ec_read()/ec_write() which is called
ec_transaction() and is essentially just a wrapper around
acpi_ec_{poll,intr}_transaction().
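
The idea that read and write are special cases of one transaction can be
sketched in user space as follows.  This is only an illustration under stated
assumptions: the toy_* names, the simulated register array and the error
convention are hypothetical and are not the driver's code.  The two command
bytes follow the ACPI embedded controller command set.

```c
#include <stddef.h>
#include <stdint.h>

/* Command bytes from the ACPI embedded controller command set. */
#define TOY_EC_CMD_READ  0x80
#define TOY_EC_CMD_WRITE 0x81

/* Simulated EC address space; a real driver talks to hardware ports. */
static uint8_t toy_ec_space[256];

/*
 * Hypothetical transaction primitive: issue a command, send wlen bytes,
 * then receive rlen bytes.  Returns 0 on success, -1 on a malformed request.
 */
static int toy_ec_transaction(uint8_t cmd, const uint8_t *wdata, size_t wlen,
			      uint8_t *rdata, size_t rlen)
{
	switch (cmd) {
	case TOY_EC_CMD_READ:
		if (wlen != 1 || rlen != 1 || !wdata || !rdata)
			return -1;
		rdata[0] = toy_ec_space[wdata[0]];
		return 0;
	case TOY_EC_CMD_WRITE:
		if (wlen != 2 || rlen != 0 || !wdata)
			return -1;
		toy_ec_space[wdata[0]] = wdata[1];
		return 0;
	default:
		return -1;
	}
}

/* Read and write become thin special cases of the transaction. */
static int toy_ec_read(uint8_t addr, uint8_t *val)
{
	return toy_ec_transaction(TOY_EC_CMD_READ, &addr, 1, val, 1);
}

static int toy_ec_write(uint8_t addr, uint8_t val)
{
	uint8_t wdata[2] = { addr, val };

	return toy_ec_transaction(TOY_EC_CMD_WRITE, wdata, 2, NULL, 0);
}
```

A non-standard command would be added as another case in the transaction
function, with no new locking or access code in the callers.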

Signed-off-by: Lennart Poettering <mzxreary@0pointer.de>
Acked-by: Luming Yu <luming.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2006-10-14 00:49:52 -04:00
Hans Verkuil
5011915cbb V4L/DVB (4746): HM12 is YUV 4:2:0, not YUV 4:1:1
Fix comment in videodev2.h

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-10-14 00:44:23 -03:00
Linus Torvalds
da79cbae39 Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] block layer: ioprio_best function fix
  [PATCH] ide-cd: fix breakage with internally queued commands
  [PATCH] block layer: elv_iosched_show should get elv_list_lock
  [PATCH] splice: fix pipe_to_file() ->prepare_write() error path
  [PATCH] block layer: elevator_find function cleanup
  [PATCH] elevator: elevator_type member not used
2006-10-12 07:49:46 -07:00
Jens Axboe
cea2885a2e [PATCH] ide-cd: fix breakage with internally queued commands
We still need to maintain a private PC style command, since it
isn't completely unified with REQ_TYPE_BLOCK_PC yet.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-12 15:08:51 +02:00
Jens Axboe
2b1191af68 [PATCH] elevator: elevator_type member not used
elevator_type field in elevator_type structure is useless:
it isn't used anywhere in kernel sources.

Signed-off-by: Vasily Tarasov <vtaras@openvz.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-12 15:08:51 +02:00
Venkat Yekkirala
5b368e61c2 IPsec: correct semantics for SELinux policy matching
Currently when an IPSec policy rule doesn't specify a security
context, it is assumed to be "unlabeled" by SELinux, and so
the IPSec policy rule fails to match to a flow that it would
otherwise match to, unless one has explicitly added an SELinux
policy rule allowing the flow to "polmatch" to the "unlabeled"
IPSec policy rules. In the absence of such an explicitly added
SELinux policy rule, the IPSec policy rule fails to match and
so the packet(s) flow in clear text without the otherwise applicable
xfrm(s) applied.

The above SELinux behavior violates the SELinux security notion of
"deny by default" which should actually translate to "encrypt by
default" in the above case.

This was first reported by Evgeniy Polyakov and the way James Morris
was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.

With this patch applied, SELinux "polmatching" of flows Vs. IPSec
policy rules will only come into play when there's a explicit context
specified for the IPSec policy rule (which also means there's corresponding
SELinux policy allowing appropriate domains/flows to polmatch to this context).

Secondly, when a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return errors other than access denied,
such as -EINVAL.  We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.

The solution for this is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.

Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely).  This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).

This patch: Fix the selinux side of things.

This makes sure SELinux polmatching of flow contexts to IPSec policy
rules comes into play only when an explicit context is associated
with the IPSec policy rule.

Also, this no longer defaults the context of a socket policy to
the context of the socket since the "no explicit context" case
is now handled properly.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
2006-10-11 23:59:37 -07:00
Andrew Morton
07646e217f Lockdep: fix compile error in drivers/input/serio/serio.c
lockdep_set_subclass() was missing in !LOCKDEP case

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-10-11 23:45:23 -04:00
David Howells
c636ebdb18 [PATCH] VFS: Destroy the dentries contributed by a superblock on unmounting
The attached patch destroys all the dentries attached to a superblock in one go
by:

 (1) Destroying the tree rooted at s_root.

 (2) Destroying every entry in the anon list, one at a time.

 (3) Each entry in the anon list has its subtree consumed from the leaves
     inwards.

This reduces the amount of work generic_shutdown_super() does, and avoids
iterating through the dentry_unused list.

Note that locking is almost entirely absent in the shrink_dcache_for_umount*()
functions added by this patch.  This is because:

 (1) at the point the filesystem calls generic_shutdown_super(), it is not
     permitted to further touch the superblock's set of dentries, and nor may
     it remove aliases from inodes;

 (2) the dcache memory shrinker now skips dentries that are being unmounted;
     and

 (3) the superblock no longer has any external references through which the VFS
     can reach it.

Given these points, the only locking we need to do is when we remove dentries
from the unused list and the name hashes, which we do a directory's worth at a
time.

We also don't need to guard against reference counts going to zero unexpectedly
and removing bits of the tree we're working on as nothing else can call dput().

A cut down version of dentry_iput() has been folded into
shrink_dcache_for_umount_subtree() function.  Apart from not needing to unlock
things, it also doesn't need to check for inotify watches.

In this version of the patch, the complaint about a dentry still being in use
has been expanded from a single BUG_ON() and now gives much more information.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: NeilBrown <neilb@suse.de>
Acked-by: Ian Kent <raven@themaw.net>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:25 -07:00
Mike Frysinger
c751c1dbb1 [PATCH] include linux/types.h in linux/nbd.h
The nbd header uses __be32 and such types but doesn't actually include the
header that defines these things (linux/types.h); so let's include it.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:24 -07:00
Matthew Wilcox
e50190a834 [PATCH] Consolidate check_signature
There's nothing arch-specific about check_signature(), so move it to
<linux/io.h>.  Use a cross between the Alpha and i386 implementations as
the generic one.
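
As a rough user-space sketch of what such a generic helper does, the function
below compares a byte signature against a memory region one byte at a time.
The name toy_check_signature and the plain-pointer argument are assumptions
for illustration; the kernel version works on I/O memory through its own
accessors.

```c
#include <stddef.h>

/*
 * Return 1 if the length bytes at addr match signature, 0 otherwise.
 * volatile forces a real byte-sized load for every comparison.
 */
static int toy_check_signature(const volatile unsigned char *addr,
			       const unsigned char *signature, size_t length)
{
	while (length--) {
		if (*addr++ != *signature++)
			return 0;
	}
	return 1;
}
```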

Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:23 -07:00
Reinette Chatre
01a3ee2b20 [PATCH] bitmap: parse input from kernel and user buffers
lib/bitmap.c:bitmap_parse() is a library function that received as input a
user buffer.  This seemed to have originated from the way the write_proc
function of the /proc filesystem operates.

This has been reworked to not use kmalloc and eliminates a lot of
get_user() overhead by performing one access_ok before using __get_user().

We need to test if we are in kernel or user space (is_user) and access the
buffer differently.  We cannot use __get_user() to access kernel addresses
in all cases, for example in architectures with separate address space for
kernel and user.

This function will be useful for other uses as well; for example, taking
input for /sysfs instead of /proc, so it was changed to accept kernel
buffers.  We have this use for the Linux UWB project, as part of the
upcoming bandwidth allocator code.

Only a few routines used this function and they were changed too.

Signed-off-by: Reinette Chatre <reinette.chatre@linux.intel.com>
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:22 -07:00
Maciej W. Rozycki
39484e53bb [PATCH] 32-bit compatibility HDIO IOCTLs
A couple of HDIO IOCTLs are not yet handled and a few others are marked
as using a pointer rather than an unsigned long.  The former include:

HDIO_GET_WCACHE, HDIO_GET_ACOUSTIC, HDIO_GET_ADDRESS and
HDIO_GET_BUSSTATE.  The latter are: HDIO_SET_MULTCOUNT,
HDIO_SET_UNMASKINTR, HDIO_SET_KEEPSETTINGS, HDIO_SET_32BIT,
HDIO_SET_NOWERR, HDIO_SET_DMA, HDIO_SET_PIO_MODE and HDIO_SET_NICE.

Additionally 0x330 used to be HDIO_GETGEO_BIG and may be issued by 32-bit
`hdparm' run on a 64-bit kernel making Linux complain loudly.

This is a fix for these issues.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:22 -07:00
Florin Malita
fa3ba2e81e [PATCH] fix Module taint flags listing in Oops/panic
Module taint flags listing in Oops/panic has a couple of issues:

* taint_flags() doesn't null-terminate the buffer after printing the flags

* per-module taints are only set if the kernel is not already tainted
  (with that particular flag) => only the first offending module gets its
  taint info correctly updated

Some additional changes:

* 'license_gplok' is no longer needed - equivalent to !(taints &
  TAINT_PROPRIETARY_MODULE) - so we can drop it from struct module

* exporting module taint info via /proc/module:

pwc 88576 0 - Live 0xf8c32000
evilmod 6784 1 pwc, Live 0xf8bbf000 (PF)

Signed-off-by: Florin Malita <fmalita@gmail.com>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:21 -07:00
Stephane Eranian
e0ab2928cc [PATCH] Add carta_random32() library routine
This is a follow-up patch based on the review for perfmon2.  This patch
adds the carta_random32() library routine + carta_random32.h header file.

This is a fast, simple, and efficient pseudo-random number generator.  We
use it in perfmon2 to randomize the sampling periods.  In this context, we
do not need any fancy randomizer.

Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Cc: David Mosberger <david.mosberger@acm.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:21 -07:00
Davide Libenzi
b611967de4 [PATCH] epoll_pwait()
Implement the epoll_pwait system call, which extends the event wait mechanism
with the same logic ppoll and pselect use.  The definition of epoll_pwait
is:

int epoll_pwait(int epfd, struct epoll_event *events, int maxevents,
                 int timeout, const sigset_t *sigmask, size_t sigsetsize);

The difference between the vanilla epoll_wait and epoll_pwait is that the
latter allows the caller to specify a signal mask to be set while waiting
for events.  Hence epoll_pwait will wait until either a monitored event
or an unmasked signal occurs.  If sigmask is NULL, the epoll_pwait system
call will act exactly like epoll_wait.  For the POSIX definition of
pselect, information is available here:

http://www.opengroup.org/onlinepubs/009695399/functions/select.html
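
A minimal user-space sketch of a call, assuming the glibc wrapper, which
takes five arguments and supplies the sigsetsize argument of the raw system
call itself.  The helper name wait_with_only_sigint_blocked is hypothetical.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/epoll.h>

/*
 * Wait for events on epfd while a temporary signal mask that blocks only
 * SIGINT is in effect.  Returns what epoll_pwait() returns: the number of
 * ready events, 0 on timeout, or -1 with errno set.
 */
static int wait_with_only_sigint_blocked(int epfd, struct epoll_event *events,
					 int maxevents, int timeout_ms)
{
	sigset_t mask;

	sigemptyset(&mask);
	sigaddset(&mask, SIGINT);

	/* The mask is installed atomically with the wait and restored after. */
	return epoll_pwait(epfd, events, maxevents, timeout_ms, &mask);
}
```

Note that the mask replaces the caller's signal mask for the duration of the
wait, so any signal not in the set can interrupt the call.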

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andi Kleen <ak@muc.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:21 -07:00
Nick Piggin
9858db504c [PATCH] mm: locks_freed fix
Move the lock debug checks below the page reserved checks.  Also, having
debug_check_no_locks_freed in kernel_map_pages is wrong.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:19 -07:00
Andrew Morton
72b64b5940 [PATCH] ext4 uninline ext4_get_group_no_and_offset()
Way too big to inline.

Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:18 -07:00
Alexandre Ratchov
8fadc14323 [PATCH] ext4: move block number hi bits
move '_hi' bits of block numbers in the larger part of the
block group descriptor structure

Signed-off-by: Alexandre Ratchov <alexandre.ratchov@bull.net>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:18 -07:00
Alexandre Ratchov
0d1ee42f27 [PATCH] ext4: allow larger descriptor size
make block group descriptor larger.

Signed-off-by: Alexandre Ratchov <alexandre.ratchov@bull.net>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:18 -07:00
Mingming Cao
18eba7aae0 [PATCH] jbd2: switch blks_type from sector_t to ull
Similar to ext4, change blocks in JBD2 from sector_t to unsigned long long.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:18 -07:00
Mingming Cao
2ae0210760 [PATCH] ext4: blk_type from sector_t to unsigned long long
Change ext4 in-kernel block type (ext4_fsblk_t) from sector_t to unsigned
long long.  Remove ext4 block type string macro E3FSBLK, replaced with "%llu"

[akpm@osdl.org: build fix]
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:18 -07:00
Laurent Vivier
bd81d8eec0 [PATCH] ext4: 64bit metadata
In-kernel super block changes to support >32 bit free blocks numbers.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Alexandre Ratchov <alexandre.ratchov@bull.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:17 -07:00
Badari Pulavarty
a1ddeb7eae [PATCH] ext4: 48bit i_file_acl
As we are planning to support 48-bit block numbers for ext4, we need to
support 48-bit block numbers for extended attributes.  In the short term, we
can do this by reusing the (on-disk) 16-bit padding (linux2.i_pad1 currently used
only by "hurd") as high order bits for xattr.  This patch basically does that.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:17 -07:00
Mingming Cao
299717696d [PATCH] jbd2: sector_t conversion
JBD layer in-kernel block variables type fixes to support >32 bit block number
and convert to sector_t type.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:17 -07:00
Zach Brown
b517bea1c7 [PATCH] 64-bit jbd2 core
Here is the patch to JBD to handle 64 bit block numbers, originally from Zach
Brown.  This patch is useful only after adding support for 64-bit block
numbers in the filesystem.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:16 -07:00
Randy Dunlap
d0d856e8bd [PATCH] ext4: clean up comments in ext4-extents patch
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:16 -07:00
Suparna Bhattacharya
471d4011a9 [PATCH] ext4: uninitialised extent handling
Make it possible to add file preallocation support in future as an RO_COMPAT
feature by recognizing uninitialized extents as holes and limiting extent
length to keep the top bit of ee_len free for marking uninitialized extents.
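
One way to picture the reserved bit is the sketch below.  It is illustrative
only: the constants and helper names are hypothetical, and the encoding the
kernel finally uses for uninitialized extents may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_UNINIT_FLAG 0x8000u  /* top bit of the 16-bit length field */
#define TOY_MAX_LEN     0x7fffu  /* longest length that leaves the bit free */

/* True if the top bit marks the extent as uninitialized. */
static bool toy_extent_is_uninit(uint16_t ee_len)
{
	return (ee_len & TOY_UNINIT_FLAG) != 0;
}

/* Number of blocks covered, with the marker bit stripped. */
static uint16_t toy_extent_len(uint16_t ee_len)
{
	return ee_len & TOY_MAX_LEN;
}
```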

Signed-off-by: Suparna Bhattacharya <suparna@in.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:16 -07:00
Alex Tomas
f65e6fba16 [PATCH] ext4: 48bit physical block number support in extents
Signed-off-by: Alex Tomas <alex@clusterfs.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:16 -07:00
Mingming Cao
3a5b2ecdd1 [PATCH] ext4: switch fsblk to sector_t
Redefine ext3 in-kernel filesystem block type (ext3_fsblk_t) from unsigned
long to sector_t, to allow kernel to handle  >32 bit ext3 blocks.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:16 -07:00
Alex Tomas
a86c618126 [PATCH] ext3: add extent map support
On disk extents format:
/*
 * this is extent on-disk structure
 * it's used at the bottom of the tree
 */
struct ext3_extent {
	__le32  ee_block;       /* first logical block extent covers */
	__le16  ee_len;         /* number of blocks covered by extent */
	__le16  ee_start_hi;    /* high 16 bits of physical block */
	__le32  ee_start;       /* low 32 bits of physical block */
};
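
A minimal sketch of how the split physical block number might be reassembled
into a 48-bit value.  struct toy_extent is a host-endian stand-in and the
helper names are hypothetical; real code would also have to convert the
little-endian on-disk fields.

```c
#include <stdint.h>

/* Host-endian stand-in for the on-disk extent (real fields are little-endian). */
struct toy_extent {
	uint32_t ee_block;
	uint16_t ee_len;
	uint16_t ee_start_hi;
	uint32_t ee_start;
};

/* Combine the high 16 bits and low 32 bits into one 48-bit block number. */
static uint64_t toy_extent_pblock(const struct toy_extent *ex)
{
	return ((uint64_t)ex->ee_start_hi << 32) | ex->ee_start;
}

/* Split a block number back into the two fields; bits above 47 are dropped. */
static void toy_extent_store_pblock(struct toy_extent *ex, uint64_t pblock)
{
	ex->ee_start = (uint32_t)(pblock & 0xffffffffu);
	ex->ee_start_hi = (uint16_t)((pblock >> 32) & 0xffffu);
}
```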

Signed-off-by: Alex Tomas <alex@clusterfs.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:16 -07:00
Dave Kleikamp
c3fcc8137c [PATCH] jbd2: cleanup ext4_jbd.h
To allow ext4 to build during the transition from jbd to jbd2, we have both
ext4_jbd.h and ext4_jbd2.h in the tree.  We no longer need the former.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:16 -07:00
Mingming Cao
f7f4bccb72 [PATCH] jbd2: rename jbd2 symbols to avoid duplication of jbd symbols
Mingming Cao originally did this work, and Shaggy reproduced it using some
scripts from her.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:15 -07:00
Dave Kleikamp
470decc613 [PATCH] jbd2: initial copy of files from jbd
This is a simple copy of the files in fs/jbd to fs/jbd2 and
/usr/include/linux/[ext4_]jbd.h to /usr/include/[ext4_]jbd2.h

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:15 -07:00
Mingming Cao
02ea2104c5 [PATCH] ext4: enable building of ext4
Originally part of a patch from Mingming Cao and Randy Dunlap.  Reorganized
by Shaggy.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Mingming Cao<cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:15 -07:00
Mingming Cao
617ba13b31 [PATCH] ext4: rename ext4 symbols to avoid duplication of ext3 symbols
Mingming Cao originally did this work, and Shaggy reproduced it using some
scripts from her.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:15 -07:00
Dave Kleikamp
ac27a0ec11 [PATCH] ext4: initial copy of files from ext3
Start of the ext4 patch series.  See Documentation/filesystems/ext4.txt for
details.

This is a simple copy of the files in fs/ext3 to fs/ext4 and
/usr/include/linux/ext3* to /usr/include/ext4*

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:15 -07:00
Chen, Kenneth W
502717f4e1 [PATCH] hugetlb: fix linked list corruption in unmap_hugepage_range()
commit fe1668ae5b causes kernel to oops with
libhugetlbfs test suite.  The problem is that hugetlb pages can be shared
by multiple mappings.  Multiple threads can fight over page->lru in the
unmap path and bad things happen.  We now serialize __unmap_hugepage_range
to avoid concurrent linked list manipulation.  Such serialization is also
needed for shared page table page on hugetlb area.  This patch fixes
the bug and also serves as a prepatch for shared page table.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-11 11:14:15 -07:00
Jiri Kosina
88aa0103e4 Input: serio - add lockdep annotations
Signed-off-by: Jiri Kosina <jikos@jikos.cz>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-10-11 01:45:31 -04:00
Peter Zijlstra
4dfbb9d8c6 Lockdep: add lockdep_set_class_and_subclass() and lockdep_set_subclass()
This annotation makes it possible to assign a subclass on lock init. This
annotation is meant to reduce the _nested() annotations by assigning a
default subclass.

One could do without this annotation and rely on lockdep_set_class()
exclusively, but that would require a manual stack of struct lock_class_key
objects.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2006-10-11 01:45:14 -04:00
Al Viro
44aa5359be [PATCH] ufs endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 16:15:34 -07:00
Alexey Dobriyan
d136fe7243 [PATCH] Finish annotations of struct vlan_ethhdr
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 16:15:34 -07:00
Al Viro
6ca1584173 [PATCH] smbfs endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 16:15:34 -07:00
Alexey Dobriyan
56052d525a [PATCH] cdrom: add endianness annotations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 16:15:33 -07:00
Al Viro
29756fa328 [PATCH] trivial iomem annotations: istallion
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 15:37:22 -07:00
Al Viro
fb136e9784 [PATCH] fix misannotation in ioc4.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 15:37:22 -07:00
Al Viro
ba46df984b [PATCH] __user annotations: futex
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 15:37:22 -07:00
Al Viro
1acc04cd4c [PATCH] dccp __user annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-10 15:37:20 -07:00
Dave Jones
b0ac3f50b8 [HEADERS] Put linux/config.h out of its misery.
Signed-off-by: Dave Jones <davej@redhat.com>
2006-10-09 19:13:51 -04:00
Bill Nottingham
659564c8ad [PATCH] Introduce vfs_listxattr
This patch moves code out of fs/xattr.c:listxattr into a new function -
vfs_listxattr. The code for vfs_listxattr was originally submitted by Bill
Nottingham <notting@redhat.com> to Unionfs.

Sorry about that.  The reason for this submission is to make the
listxattr code in fs/xattr.c a little cleaner (as well as to clean up
some code in Unionfs.)

Currently, Unionfs has vfs_listxattr defined in its code.  I think
that's very ugly, and I'd like to see it (re)moved.  The logical place
to put it, is along side of all the other vfs_*xattr functions.

Overall, I think this patch is beneficial for both the kernel.org kernel and
Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-09 14:20:38 -07:00
Al Viro
cb1055fb1b [PATCH] linux/io.h needs types.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-08 12:32:36 -07:00
Al Viro
a8f47c45ae [PATCH] missing include of scatterlist.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-08 12:32:36 -07:00
NeilBrown
c6b0a9f87b [PATCH] knfsd: tidy up up meaning of 'buffer size' in nfsd/sunrpc
There is some confusion about the meaning of 'bufsz' for a sunrpc server.
In some cases it is the largest message that can be sent or received.  In
other cases it is the largest 'payload' that can be included in a NFS
message.

In either case, it is not possible for both the request and the reply to be
this large.  One of the request or reply may only be one page long, which
fits nicely with NFS.

So we remove 'bufsz' and replace it with two numbers: 'max_payload' and
'max_mesg'.  Max_payload is the size that the server requests.  It is used
by the server to check the max size allowed on a particular connection:
depending on the protocol a lower limit might be used.

max_mesg is the largest single message that can be sent or received.  It is
calculated as the max_payload, rounded up to a multiple of PAGE_SIZE, and
with PAGE_SIZE added for overhead.  Only one of the request and reply may be
this size.  The other must be at most one page.
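
The sizing rule above can be sketched in plain C.  This is a minimal
userspace illustration, not the sunrpc code: the helper name max_mesg_for()
and the fixed 4096-byte PAGE_SIZE are assumptions made for the example.

```c
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages.  The real value is per-arch. */
#define PAGE_SIZE 4096u

/*
 * Hypothetical helper: round max_payload up to a whole number of pages,
 * then add one more page for overhead.
 */
static size_t max_mesg_for(size_t max_payload)
{
	size_t rounded = (max_payload + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE;

	return rounded + PAGE_SIZE;
}
```

With these assumptions a 32 KiB payload yields a 36 KiB max_mesg, and the
other direction of the exchange is still limited to a single page.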

Cc: Greg Banks <gnb@sgi.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-06 08:53:41 -07:00
Pierre Ossman
ec5a19dd93 [PATCH] mmc: multi sector write transfers
SD cards extend the protocol by allowing the host to query the card for how
many blocks were successfully stored on the medium.  This allows us to safely
write chunks of blocks at once.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-06 08:53:41 -07:00
Henne
3260259f00 [PATCH] sched: fix a kerneldoc error on is_init()
Fix a kerneldoc warning and reorder the description for is_init().

Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-06 08:53:41 -07:00
Jan Blunck
a666ecfbf5 [PATCH] Fix typo in "syntax error if percpu macros are incorrectly used" patch
Trivial typo fix in the "syntax error if percpu macros are incorrectly
used" patch.  I misspelled "identifier" in all places.  D'Oh!

Thanks to Dirk Mueller for pointing this out.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-06 08:53:41 -07:00
Roman Zippel
7236e978a3 [PATCH] provide tickadj define
Provide a tickadj compatibility define for archs still using it.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-06 08:53:40 -07:00
Benjamin Herrenschmidt
7f7bbbe50b [PATCH] page fault retry with NOPAGE_REFAULT
Add a way for a no_page() handler to request a retry of the faulting
instruction.  It goes back to userland on page faults and just tries again
in get_user_pages().  I added a cond_resched() in the loop in that latter
case.

The problem I have with signal and spufs is an actual bug affecting apps and I
don't see other ways of fixing it.

In addition, we are having issues with infiniband and 64k pages (related to
the way the hypervisor deals with some HV cards) that will require us to muck
around with the MMU from within the IB driver's no_page() (it's a pSeries
specific driver) and return to the caller the same way using NOPAGE_REFAULT.

And to add to this, the graphics folks have been following a new approach of
memory management that involves transparently swapping objects between video
ram and main memory.  To do that, they need to install PTEs from a no_page()
handler as well and that also requires returning with NOPAGE_REFAULT.

(For the latter, they are currently using io_remap_pfn_range to install one PTE
from no_page() which is a bit racy, we need to add a check for the PTE having
already been installed after taking the lock, but that's ok, they are only at
the proof-of-concept stage.  I'll send a patch adding a "clean" function to do
that, we can use that from spufs too and get rid of the sparsemem hacks we do
to create struct page for SPEs.  Basically, that provides a generic solution
for being able to have no_page() map hardware devices, which is something that
I think sound driver folks have been asking for some time too).

All of these things depend on having the NOPAGE_REFAULT exit path from
no_page() handlers.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-06 08:53:40 -07:00
David Howells
7d12e780e0 IRQ: Maintain regs pointer globally rather than passing to IRQ handlers
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.

The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around.  On the FRV arch, removing the regs parameter
from all the genirq functions results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).

Where appropriate, an arch may override the generic storage facility and do
something different with the variable.  On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.

Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions.  Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller.  A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.

I've built this code with allyesconfig for x86_64 and i386.  I've run-tested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.

This will affect all archs.  Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:

	struct pt_regs *old_regs = set_irq_regs(regs);

And put the old one back at the end:

	set_irq_regs(old_regs);

Don't pass regs through to generic_handle_irq() or __do_IRQ().
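
The save/restore pattern above can be modelled in userspace as follows.
This is only a sketch under stated assumptions: a single static slot stands
in for the per-CPU variable, struct pt_regs is reduced to one illustrative
field, and do_IRQ_sketch() and sample_handler() are hypothetical names that
do not appear in the patch.

```c
#include <stddef.h>

/* Stand-in for the arch register frame; the field is illustrative only. */
struct pt_regs {
	unsigned long ip;
};

/* One slot models a single CPU; the patch keeps one pointer per CPU. */
static struct pt_regs *irq_regs_slot;

static struct pt_regs *get_irq_regs(void)
{
	return irq_regs_slot;
}

/* Install new_regs and hand back whatever was stored before. */
static struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
{
	struct pt_regs *old_regs = irq_regs_slot;

	irq_regs_slot = new_regs;
	return old_regs;
}

/* Handlers no longer take a regs argument. */
typedef unsigned long (*handler_fn)(void);

/* Hypothetical handler: looks the regs up only because it needs them. */
static unsigned long sample_handler(void)
{
	struct pt_regs *regs = get_irq_regs();

	return regs ? regs->ip : 0;
}

/* Hypothetical entry point following the do_IRQ() recipe above. */
static unsigned long do_IRQ_sketch(struct pt_regs *regs, handler_fn handler)
{
	struct pt_regs *old_regs = set_irq_regs(regs);
	unsigned long seen = handler();

	set_irq_regs(old_regs);
	return seen;
}
```

Saving and restoring the old pointer, rather than clearing it, is what
keeps the scheme correct when one interrupt nests inside another.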

In timer_interrupt(), this sort of change will be necessary:

	-	update_process_times(user_mode(regs));
	-	profile_tick(CPU_PROFILING, regs);
	+	update_process_times(user_mode(get_irq_regs()));
	+	profile_tick(CPU_PROFILING);

I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().

Some notes on the interrupt handling in the drivers:

 (*) input_dev() is now gone entirely.  The regs pointer is no longer stored in
     the input_dev struct.

 (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking.  It does
     something different depending on whether it's been supplied with a regs
     pointer or not.

 (*) Various IRQ handler function pointers have been moved to type
     irq_handler_t.

Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
2006-10-05 15:10:12 +01:00
David Howells
da482792a6 IRQ: Typedef the IRQ handler function type
Typedef the IRQ handler function type.

Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1356d1e5fd256997e3d3dce0777ab787d0515c7a commit)
2006-10-05 13:28:27 +01:00
David Howells
57a58a9435 IRQ: Typedef the IRQ flow handler function type
Typedef the IRQ flow handler function type.

Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 8e973fbdf5716b93a0a8c0365be33a31ca0fa351 commit)
2006-10-05 13:28:06 +01:00
Linus Torvalds
97d41e90fe Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (54 commits)
  [SCSI] Initial Commit of qla4xxx
  [SCSI] raid class: handle component-add errors
  [SCSI] SCSI megaraid_sas: handle thrown errors
  [SCSI] SCSI aic94xx: handle sysfs errors
  [SCSI] SCSI st: fix error handling in module init, sysfs
  [SCSI] SCSI sd: fix module init/exit error handling
  [SCSI] SCSI osst: add error handling to module init, sysfs
  [SCSI] scsi: remove hosts.h
  [SCSI] scsi: Scsi_Cmnd convertion in aic7xxx_old.c
  [SCSI] megaraid_sas: sets ioctl timeout and updates version,changelog
  [SCSI] megaraid_sas: adds tasklet for cmd completion
  [SCSI] megaraid_sas: prints pending cmds before setting hw_crit_error
  [SCSI] megaraid_sas: function pointer for disable interrupt
  [SCSI] megaraid_sas: frame count optimization
  [SCSI] megaraid_sas: FW transition and q size changes
  [SCSI] qla2xxx: Update version number to 8.01.07-k2.
  [SCSI] qla2xxx: Stall mid-layer error handlers while rport is blocked.
  [SCSI] qla2xxx: Add MODULE_FIRMWARE tags.
  [SCSI] qla2xxx: Add support for host port state FC transport attribute.
  [SCSI] qla2xxx: Add support for fabric name FC transport attribute.
  ...
2006-10-04 18:57:35 -07:00
Jeff Garzik
ed542bed12 [SCSI] raid class: handle component-add errors
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-10-04 13:27:26 -05:00
Linus Torvalds
cc94dcf5f2 Merge branch 'for-2.6.19' of git://brick.kernel.dk/data/git/linux-2.6-block
* 'for-2.6.19' of git://brick.kernel.dk/data/git/linux-2.6-block:
  [PATCH] Document bi_sector and sector_t
  [PATCH] helper function for retrieving scsi_cmd given host based block layer tag
2006-10-04 10:44:01 -07:00
Linus Torvalds
5170065d8a Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] Remove remaining reference to ite_gpio.h from Kbuild
  [MIPS] PNX8550 fixups
2006-10-04 10:43:31 -07:00
Roger Gammans
2c2345c2b4 [PATCH] Document bi_sector and sector_t
Signed-Off-By: Roger Gammans <rgammans@computer-surgery.co.uk>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-04 19:32:09 +02:00
David C Somayajulu
f583f4924d [PATCH] helper function for retrieving scsi_cmd given host based block layer tag
This was necessitated by the need for a function to get back
to a scsi_cmnd when an HBA posts its (corresponding) completion
interrupt with a block layer tag as its reference.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2006-10-04 19:32:09 +02:00
Haavard Skinnemoen
9ab4f88b7f [PATCH] serial: Rename PORT_AT91 -> PORT_ATMEL
The at91_serial driver can be used with both AT32 and AT91 devices
from Atmel and has therefore been renamed atmel_serial. The only
thing left is to rename PORT_AT91 to PORT_ATMEL.

Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
Acked-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 10:25:05 -07:00
David Woodhouse
c4710e65c0 [MIPS] Remove remaining reference to ite_gpio.h from Kbuild
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-10-04 18:06:15 +01:00
Linus Torvalds
fefd26b3b8 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/configh
* master.kernel.org:/pub/scm/linux/kernel/git/davej/configh:
  Remove all inclusions of <linux/config.h>

Manually resolved trivial path conflicts due to removed files in
the sound/oss/ subdirectory.
2006-10-04 09:59:57 -07:00
Linus Torvalds
4a61f17378 Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6: (292 commits)
  [GFS2] Fix endian bug for de_type
  [GFS2] Initialize SELinux extended attributes at inode creation time.
  [GFS2] Move logging code into log.c (mostly)
  [GFS2] Mark nlink cleared so VFS sees it happen
  [GFS2] Two redundant casts removed
  [GFS2] Remove uneeded endian conversion
  [GFS2] Remove duplicate sb reading code
  [GFS2] Mark metadata reads for blktrace
  [GFS2] Remove iflags.h, use FS_
  [GFS2] Fix code style/indent in ops_file.c
  [GFS2] streamline-generic_file_-interfaces-and-filemap gfs fix
  [GFS2] Remove readv/writev methods and use aio_read/aio_write instead (gfs bits)
  [GFS2] inode-diet: Eliminate i_blksize from the inode structure
  [GFS2] inode_diet: Replace inode.u.generic_ip with inode.i_private (gfs)
  [GFS2] Fix typo in last patch
  [GFS2] Fix direct i/o logic in filemap.c
  [GFS2] Fix bug in Makefiles for lock modules
  [GFS2] Remove (extra) fs_subsys declaration
  [GFS2/DLM] Fix trailing whitespace
  [GFS2] Tidy up meta_io code
  ...
2006-10-04 09:06:16 -07:00
Linus Torvalds
d002ec481c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [XFRM]: BEET mode
  [TCP]: Kill warning in tcp_clean_rtx_queue().
  [NET_SCHED]: Remove old estimator implementation
  [ATM]: [zatm] always *pcr in alloc_shaper()
  [ATM]: [ambassador] Change the return type to reflect reality
  [ATM]: kmalloc to kzalloc patches for drivers/atm
  [TIPC]: fix printk warning
  [XFRM]: Clearing xfrm_policy_count[] to zero during flush is incorrect.
  [XFRM] STATE: Use destination address for src hash.
  [NEIGH]: always use hash_mask under tbl lock
  [UDP]: Fix MSG_PROBE crash
  [UDP6]: Fix flowi clobbering
  [NET_SCHED]: Revert "HTB: fix incorrect use of RB_EMPTY_NODE"
  [NETFILTER]: ebt_mark: add or/and/xor action support to mark target
  [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
  [NETFILTER]: Honour source routing for LVS-NAT
  [NETFILTER]: add type parameter to ip_route_me_harder
  [NETFILTER]: Kconfig: fix xt_physdev dependencies
2006-10-04 08:26:19 -07:00
Linus Torvalds
5a96c5d0c5 Merge master.kernel.org:/pub/scm/linux/kernel/git/willy/parisc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/willy/parisc-2.6: (41 commits)
  [PARISC] Kill wall_jiffies use
  [PARISC] Honour "panic_on_oops" sysctl
  [PARISC] Fix fs/binfmt_som.c
  [PARISC] Export clear_user_page to modules
  [PARISC] Make DMA routines more stubby
  [PARISC] Define pci_get_legacy_ide_irq
  [PARISC] Fix CONFIG_DEBUG_SPINLOCK
  [PARISC] Fix HPUX compat compile with current GCC
  [PARISC] Fix iounmap compile warning
  [PARISC] Add support for Quicksilver AGPGART
  [PARISC] Move LBA and SBA register defines to the common ropes.h
  [PARISC] Create shared <asm/ropes.h> header
  [PARISC] Stash the lba_device in its struct device drvdata
  [PARISC] Generalize IS_ASTRO et al to take a parisc_device like
  [PARISC] Pretty print the name of the lba type on kernel boot
  [PARISC] Remove some obsolete comments and I checked that Reo is similar to Ike
  [PARISC] Add hardware found in the rp8400
  [PARISC] Allow nested interrupts
  [PARISC] Further updates to timer_interrupt()
  [PARISC] remove halftick and copy clocktick to local var (gcc can optimize usage)
  ...
2006-10-04 08:18:34 -07:00
Linus Torvalds
13bbd8d906 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (25 commits)
  [POWERPC] Add support for the mpc832x mds board
  [POWERPC] Add initial support for the e300c2 core
  [POWERPC] Add MPC8360EMDS default dts file
  [POWERPC] Add MPC8360EMDS board support
  [POWERPC] Add QUICC Engine (QE) infrastructure
  [POWERPC] Add QE device tree node definition
  [POWERPC] Don't try to just continue if xmon has no input device
  [POWERPC] Fix a printk in pseries_mpic_init_IRQ
  [POWERPC] Get default baud rate in udbg_scc
  [POWERPC] Fix zImage.coff on oldworld PowerMac
  [POWERPC] Fix xmon=off and cleanup xmon initialisation
  [POWERPC] Cleanup include/asm-powerpc/xmon.h
  [POWERPC] Update swim3 printk after blkdev.h change
  [POWERPC] Cell interrupt rework
  POWERPC: mpc82xx merge: board-specific/platform stuff(resend)
  POWERPC: 8272ads merge to powerpc: common stuff
  POWERPC: Added devicetree for mpc8272ads board
  [POWERPC] iSeries has no legacy I/O
  [POWERPC] implement BEGIN/END_FW_FTR_SECTION
  [POWERPC] iSeries does not need pcibios_fixup_resources
  ...
2006-10-04 08:16:37 -07:00
Linus Torvalds
18e6756a6b Merge branch 'audit.b32' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b32' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] message types updated
  [PATCH] name_count array overrun
  [PATCH] PPID filtering fix
  [PATCH] arch filter lists with < or > should not be accepted
2006-10-04 08:15:55 -07:00
Linus Torvalds
e30fdb1e02 Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] pata_artop: kill gcc warning
  [PATCH] libata: turn off NCQ if queue depth is adjusted to 1
  [PATCH] libata: cosmetic changes to constants
  [libata] DocBook minor updates, fixes
  [libata] PCI ID table cleanup in various drivers
  [libata] Print out Status register, if a BSY-sleep takes too long
  [libata] init probe_ent->private_data in a common location
  [libata] minor PCI IDE probe fixes and cleanups
  [libata] Use new PCI_VDEVICE() macro to dramatically shorten ID lists
  [PATCH] Fix reference of uninitialised memory in ata_device_add()
2006-10-04 08:06:16 -07:00
Adrian Bunk
d56b9b9c46 [PATCH] The scheduled removal of some OSS drivers
This patch contains the scheduled removal of OSS drivers that:
- have ALSA drivers for the same hardware without known regressions and
- whose Kconfig options have been removed in 2.6.17.

[michal.k.k.piotrowski@gmail.com: build fix]
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:32 -07:00
Josh Triplett
595182bcdf [PATCH] RCU: CREDITS and MAINTAINERS
Add MAINTAINERS entry for Read-Copy Update (RCU), listing Dipankar Sarma as
maintainer, and giving the URL for Paul McKenney's RCU site.  Add
MAINTAINERS entry for rcutorture, listing myself as maintainer.  Add
CREDITS entries for developers of RCU, RCU variants, and rcutorture.  Use
Paul McKenney's preferred email address in include/linux/rcupdate.h .

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Cc: Paul McKenney <paulmck@us.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:31 -07:00
Oleg Nesterov
20e9751bd9 [PATCH] rcu: simplify/improve batch tuning
Kill a hard-to-calculate 'rsinterval' boot parameter and per-cpu
rcu_data.last_rs_qlen.  Instead, add a flag rcu_ctrlblk.signaled,
which records the fact that one of the CPUs has sent a resched IPI since the
last rcu_start_batch().

Roughly speaking, we need two rcu_start_batch()s in order to move callbacks
from ->nxtlist to ->donelist.  This means that when ->qlen exceeds qhimark
and continues to grow, we should send a resched IPI, and then do it again
after we have gone through a quiescent state.

On the other hand, if it was already sent, we don't need to do it again
when another CPU detects overflow of the queue.
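
The effect of the flag can be illustrated with a toy model.  This is a
sketch, not the kernel code: the struct and function names below are
hypothetical, and a counter stands in for actually sending the resched IPI.

```c
/* Toy control block: only the pieces needed to show the idea. */
struct toy_rcu_ctrlblk {
	int signaled;   /* set once an IPI has gone out since the last batch */
	int ipis_sent;  /* counter standing in for the real resched IPI */
};

/* Starting a new batch re-arms the flag. */
static void toy_rcu_start_batch(struct toy_rcu_ctrlblk *rcp)
{
	rcp->signaled = 0;
}

/* Called by any CPU that sees its queue grow past the high-water mark. */
static void toy_force_quiescent_state(struct toy_rcu_ctrlblk *rcp)
{
	if (!rcp->signaled) {
		rcp->signaled = 1;
		rcp->ipis_sent++;
	}
}
```

Two CPUs detecting overflow within the same batch cause one IPI in this
model; a second IPI is possible only after the next batch has started.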

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:31 -07:00
Alan Stern
e6a92013ba [PATCH] SRCU: report out-of-memory errors
Currently the init_srcu_struct() routine has no way to report out-of-memory
errors.  This patch (as761) makes it return -ENOMEM when the per-cpu data
allocation fails.

The patch also makes srcu_init_notifier_head() report a BUG if a notifier
head can't be initialized.  Perhaps it should return -ENOMEM instead, but
in the most likely cases where this might occur I don't think any recovery
is possible.  Notifier chains generally are not created dynamically.

[akpm@osdl.org: avoid statement-with-side-effect in macro]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:30 -07:00
Alan Stern
eabc069401 [PATCH] Add SRCU-based notifier chains
This patch (as751) adds a new type of notifier chain, based on the SRCU
(Sleepable Read-Copy Update) primitives recently added to the kernel.  An
SRCU notifier chain is much like a blocking notifier chain, in that it must
be called in process context and its callout routines are allowed to sleep.
The difference is that the chain's links are protected by the SRCU
mechanism rather than by an rw-semaphore, so calling the chain has
extremely low overhead: no memory barriers and no cache-line bouncing.  On
the other hand, unregistering from the chain is expensive and the chain
head requires special runtime initialization (plus cleanup if it is to be
deallocated).

SRCU notifiers are appropriate for notifiers that will be called very
frequently and for which unregistration occurs very seldom.  The proposed
"task notifier" scheme qualifies, as may some of the network notifiers.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Paul E. McKenney <paulmck@us.ibm.com>
Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:30 -07:00
Paul E. McKenney
621934ee7e [PATCH] srcu-3: RCU variant permitting read-side blocking
Updated patch adding a variant of RCU that permits sleeping in read-side
critical sections.  SRCU is as follows:

o	Each use of SRCU creates its own srcu_struct, and each
	srcu_struct has its own set of grace periods.  This is
	critical, as it prevents one subsystem with a blocking
	reader from holding up SRCU grace periods for other
	subsystems.

o	The SRCU primitives (srcu_read_lock(), srcu_read_unlock(),
	and synchronize_srcu()) all take a pointer to a srcu_struct.

o	The SRCU primitives must be called from process context.

o	srcu_read_lock() returns an int that must be passed to
	the matching srcu_read_unlock().  Realtime RCU avoids the
	need for this by storing the state in the task struct,
	but SRCU needs to allow a given code path to pass through
	multiple SRCU domains -- storing state in the task struct
	would therefore require either arbitrary space in the
	task struct or arbitrary limits on SRCU nesting.  So I
	kicked the state-storage problem up to the caller.

	Of course, it is not permitted to call synchronize_srcu()
	while in an SRCU read-side critical section.

o	There is no call_srcu().  It would not be hard to implement
	one, but it seems like too easy a way to OOM the system.
	(Hey, we have enough trouble with call_rcu(), which does
	-not- permit readers to sleep!!!)  So, if you want it,
	please tell me why...
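
Why the index is handed to the caller can be seen in a single-threaded toy
model.  This is a sketch under loud assumptions: there are no per-CPU
counters, no memory barriers and no real grace-period wait, and every
toy_* name is hypothetical.  It only shows that each srcu_struct-like
domain keeps its own state and that the reader must return the index it
was given.

```c
/* Toy SRCU domain: one pair of reader counts per domain. */
struct toy_srcu {
	unsigned long completed;  /* grace periods started so far */
	int readers[2];           /* active readers, one count per index */
};

/* Enter a read-side section; the caller must keep the returned index. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
	int idx = (int)(sp->completed & 1);

	sp->readers[idx]++;
	return idx;
}

/* Leave a read-side section using the index from the matching lock. */
static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
	sp->readers[idx]--;
}

/* Models the counter flip an updater performs; new readers use the other index. */
static void toy_srcu_flip(struct toy_srcu *sp)
{
	sp->completed++;
}

static int toy_srcu_readers(const struct toy_srcu *sp)
{
	return sp->readers[0] + sp->readers[1];
}
```

Because the index travels with the caller, a code path can sit inside any
number of domains at once without reserving space in the task struct.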

[josht@us.ibm.com: sparse notation]
Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:30 -07:00
Eric W. Biederman
95d77884c7 [PATCH] htirq: tidy up the htirq code
This moves the declarations for the architecture helpers into
include/linux/htirq.h from the generic include/linux/pci.h.  Hopefully this
will make this distinction clearer.

htirq.h is included where it is needed.

The dependency on the msi code is fixed and removed.

The Makefile is tidied up.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:30 -07:00
Eric W. Biederman
3b7d1921f4 [PATCH] msi: refactor and move the msi irq_chip into the arch code
It turns out msi_ops was simply not enough to abstract the architecture
specific details of msi.  So I have moved the responsibility of constructing
the struct irq_chip to the architectures, and added two architecture specific
functions, arch_setup_msi_irq and arch_teardown_msi_irq.

For simple architectures those functions can do all of the work.  For
architectures with platform dependencies they can call into the appropriate
platform code.

With this, msi.c is finally free of assuming you have an apic, and this
actually takes less code.

The helpers for the architecture specific code are declared in linux/msi.h
to keep them separate from the msi functions used by drivers in linux/pci.h.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:29 -07:00
Eric W. Biederman
1f80025e62 [PATCH] msi: simplify msi sanity checks by adding with generic irq code
Currently msi.c is doing sanity checks that make certain before an irq is
destroyed it has no more users.

By adding irq_has_action I can perform the test in a generic way, instead of
relying on an msi specific data structure.

By performing the core check in dynamic_irq_cleanup I ensure every user of
dynamic irqs has a test present and we don't free resources that are in use.

In msi.c this allows me to kill the attrib.state member of msi_desc and all of
the associated code to maintain it.

To keep from freeing data structures when irq cleanup code is called too soon,
changing dynamic_irq_cleanup is insufficient because there are msi specific
data structures that are also not safe to free.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <greg@kroah.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:29 -07:00
Eric W. Biederman
8b955b0ddd [PATCH] Initial generic hypertransport interrupt support
This patch implements two functions ht_create_irq and ht_destroy_irq for
use by drivers.  Several other functions are implemented as helpers for
arch specific irq_chip handlers.

The driver for the card I tested this on isn't yet ready to be merged.
However this code is, and hypertransport irqs are in use in a few other
places in the kernel.  Not that any of this will get merged before 2.6.19.

Because the ipath-ht400 is slightly out of spec this code will need to be
generalized to work there.

I think all of the powerpc uses are for a plain interrupt controller in a
chipset so support for native hypertransport devices is a little less
interesting.

However I think this is a half way decent model on how to separate arch
specific and generic helper code, and I think this is a functional model of
how to get the architecture dependencies out of the msi code.

[akpm@osdl.org: Kconfig fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg KH <greg@kroah.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:29 -07:00
Eric W. Biederman
e78d01693b [PATCH] Add Hypertransport capability defines
This adds defines for the hypertransport capability subtypes and starts
using them a little.

[akpm@osdl.org: fix typo]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:29 -07:00
Eric W. Biederman
23d0b8b053 [PATCH] genirq: irq: generalize the check for HARDIRQ_BITS
This patch adds support for systems that cannot receive every interrupt on a
single cpu simultaneously, in the check to see if we have enough HARDIRQ_BITS.

MAX_HARDIRQS_PER_CPU becomes the count of the maximum number of hardware
generated interrupts per cpu.

On architectures that support per cpu interrupt delivery this can be a
significant space savings and scalability bonus.
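
A sketch of what such a check can look like is given below.  The numeric
values are assumptions chosen for illustration, the fallback to NR_IRQS is
an assumption about how an arch without per-cpu delivery would be handled,
and hardirq_bits_suffice() is a hypothetical runtime twin of the
preprocessor test.

```c
/* Assumed values for this sketch; real ones come from the arch headers. */
#define HARDIRQ_BITS 12
#define NR_IRQS 256

/* Assumption: an arch that does not override this falls back to NR_IRQS. */
#ifndef MAX_HARDIRQS_PER_CPU
#define MAX_HARDIRQS_PER_CPU NR_IRQS
#endif

/* Build-time check: the hardirq count field must cover every nested irq. */
#if (1 << HARDIRQ_BITS) < MAX_HARDIRQS_PER_CPU
#error HARDIRQ_BITS is too low!
#endif

/* Hypothetical runtime form of the same comparison, handy for testing. */
static int hardirq_bits_suffice(unsigned int bits, unsigned long max_per_cpu)
{
	return (1UL << bits) >= max_per_cpu;
}
```

An arch with per-cpu delivery would define MAX_HARDIRQS_PER_CPU to a value
smaller than NR_IRQS, so the same HARDIRQ_BITS covers many more total irqs.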

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:28 -07:00
Eric W. Biederman
323a01c508 [PATCH] genirq: irq: remove msi hacks
Because of the nasty way that CONFIG_PCI_MSI was implemented we wound up with
set_irq_info and set_native_irq_info, with move_irq and move_native_irq.  Both
functions did the same thing but they were built and called under different
circumstances.  Now that the msi hacks are gone we can kill move_irq and
set_irq_info.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:28 -07:00
Eric W. Biederman
3a16d71362 [PATCH] genirq: irq: add a dynamic irq creation API
With the msi support comes a new concept in irq handling, irqs that are
created dynamically at run time.

Currently the msi code allocates irqs backwards.  First it allocates a
platform dependent routing value for an interrupt, the ``vector'', and then it
figures out from the vector which irq you are on.

This msi backwards allocator suffers from two basic problems.  The allocator
suffers because it is trying to do something that is architecture specific in
a generic way making it brittle, inflexible, and tied too tightly to the
architecture implementation.  The allocator also suffers from its very
backwards nature as it has tied things together that should have no
dependencies.

To solve the basic dynamic irq allocation problem two new architecture
specific functions are added: create_irq and destroy_irq.

create_irq takes no input and returns an unused irq number, that won't be
reused until it is returned to the free pool with destroy_irq.  The irq then
can be used for any purpose although the only initial consumer is the msi
code.

destroy_irq takes an irq number allocated with create_irq and returns it to
the free pool.
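
The contract of the two functions can be illustrated with a toy free pool.
This is a userspace sketch, not any architecture's implementation: the
toy_* names, the eight-entry pool, the first-fit search and the -1 failure
value are all assumptions made for the example.

```c
#include <stdbool.h>

/* Assumption: a tiny pool, purely for illustration. */
#define TOY_NR_IRQS 8

static bool toy_irq_in_use[TOY_NR_IRQS];

/* Takes no input; returns an unused irq number, or -1 if none is free. */
static int toy_create_irq(void)
{
	for (int irq = 0; irq < TOY_NR_IRQS; irq++) {
		if (!toy_irq_in_use[irq]) {
			toy_irq_in_use[irq] = true;
			return irq;
		}
	}
	return -1;
}

/* Returns an irq obtained from toy_create_irq() to the free pool. */
static void toy_destroy_irq(int irq)
{
	if (irq >= 0 && irq < TOY_NR_IRQS)
		toy_irq_in_use[irq] = false;
}
```

A number handed out by toy_create_irq() is never handed out again until it
has passed through toy_destroy_irq(), which is the guarantee described
above.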

Making this functionality per architecture increases the simplicity of the irq
allocation code and increases its flexibility.

dynamic_irq_init() and dynamic_irq_cleanup() are added to automate the
irq_desc initialization that should happen for dynamic irqs.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:27 -07:00
Eric W. Biederman
38bc036130 [PATCH] genirq: msi: refactor the msi_ops
The current msi_ops are short sighted in a number of ways; this patch attempts
to fix the glaring deficiencies.

- Report in msi_ops if a 64bit address is needed in the msi message, so we
  can fail 32bit only msi structures.

- Send and receive a full struct msi_msg in both setup and target.  This is
  a little cleaner and allows for architectures that need to modify the data
  to retarget the msi interrupt to a different cpu.

- In target pass in the full cpu mask instead of just the first cpu in case
  we can make use of the full cpu mask.

- Operate in terms of irqs and not vectors.  Currently there is still a 1-1
  relationship, but on architectures other than ia64 I expect this will change.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:27 -07:00
Eric W. Biederman
0366f8f713 [PATCH] genirq: msi: implement helper functions read_msi_msg and write_msi_msg
In support of this I also add a struct msi_msg that captures the two address
fields and one data field in a typical msi message, and I remember the pos and
whether the address is 64bit in struct msi_desc.

This makes the code a little more readable and easier to maintain, and paves
the way to further simplifications.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:27 -07:00
Eric W. Biederman
e7b946e98a [PATCH] genirq: irq: add moved_masked_irq
Currently move_native_irq disables and re-enables the irq we are migrating to
ensure we don't take that irq when we are actually doing the migration
operation.  Disabling the irq needs to happen, but sometimes doing the work in
move_native_irq is too late.

On x86 with ioapics the irq move sequence needs to be:
edge_triggered:
  mask irq.
  move irq.
  unmask irq.
  ack irq.
level_triggered:
  mask irq.
  ack irq.
  move irq.
  unmask irq.

We can easily perform the edge triggered sequence with the current definition
of move_native_irq.  However, the level triggered case does not map well.  For
that I have added move_masked_irq, to allow me to disable the irqs around both
the ack and the move.
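
The two orderings listed above can be written down as a small user-space
sketch.  Nothing here touches hardware; the trace buffer and the sketch_*
helpers are hypothetical and only record the order of the steps.

```c
#include <string.h>

/* Hypothetical trace buffer: records the order of the steps as text. */
static char sketch_trace[64];

static void sketch_step(const char *name)
{
    strcat(sketch_trace, name);
    strcat(sketch_trace, " ");
}

/* Edge triggered: mask, move, unmask, and only then ack. */
static const char *sketch_move_edge(void)
{
    sketch_trace[0] = '\0';
    sketch_step("mask");
    sketch_step("move");
    sketch_step("unmask");
    sketch_step("ack");
    return sketch_trace;
}

/* Level triggered: the ack has to happen while masked, before the move. */
static const char *sketch_move_level(void)
{
    sketch_trace[0] = '\0';
    sketch_step("mask");
    sketch_step("ack");
    sketch_step("move");
    sketch_step("unmask");
    return sketch_trace;
}
```

The only difference between the two is where the ack sits relative to the
move, which is why a helper that assumes the irq is already masked is useful
for the level triggered case.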

Q: Why have we not seen this problem earlier?

A: The only symptom I have been able to reproduce is that if we change
   the vector before acknowledging an irq the wrong irq is acknowledged.
   Since we currently are not reprogramming the irq vector during
   migration no problems show up.

   We have to mask the irq before we acknowledge the irq or else we could
   hit a window where an irq is asserted just before we acknowledge it.

   Edge triggered irqs do not have this problem because acknowledgements
   do not propagate in the same way.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:26 -07:00
Eric W. Biederman
a24ceab4f4 [PATCH] genirq: irq: convert the move_irq flag from a 32bit word to a single bit
The primary aim of this patchset is to remove maintenance problems caused by
the irq infrastructure.  The two big issues I address are an artificially
small cap on the number of irqs, and that MSI assumes vector == irq.  My
primary focus is on x86_64 but I have touched other architectures where
necessary to keep them from breaking.

- To increase the number of irqs I modify the code to look at the (cpu,
  vector) pair instead of just looking at the vector.

  With a large number of irqs available, systems with a large irq count no
  longer need to compress their irq numbers to fit, removing a lot of brittle
  special cases.

  For acpi guys the result is that irq == gsi.

- Addressing the fact that MSI assumes irq == vector takes a few more
  patches.  But suffice it to say when I am done none of the generic irq code
  even knows what a vector is.

In quick testing on a large Unisys x86_64 machine we stumbled over at least
one driver that assumed that NR_IRQS could always fit into an 8 bit number.
This driver is clearly buggy today.  But this has become a class of bugs that
it is now much easier to hit.

This patch:

This is a minor space optimization.  In practice I don't think this has any
effect because of our alignment constraints and the other fields, but there is
no point in chewing up an unnecessary word, and since we already read the flag
field this should improve the cache hit ratio of the irq handler.
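
As a rough illustration of folding a separate flag word into one bit of an
existing flags field, consider the sketch below.  The structure, the bit
values and the sketch_* names are assumptions and do not reproduce the real
irq_desc layout.

```c
#include <stdbool.h>

#define SKETCH_IRQ_DISABLED     0x0001u  /* hypothetical pre-existing flag */
#define SKETCH_IRQ_MOVE_PENDING 0x0002u  /* hypothetical bit replacing a 32bit word */

struct sketch_irq_desc {
    unsigned int status;  /* already read on the irq path */
};

/* Mark that a migration has been requested for this irq. */
static void sketch_set_move_pending(struct sketch_irq_desc *desc)
{
    desc->status |= SKETCH_IRQ_MOVE_PENDING;
}

/* Report whether a move was pending, and clear the bit either way. */
static bool sketch_test_and_clear_move_pending(struct sketch_irq_desc *desc)
{
    bool was_pending = (desc->status & SKETCH_IRQ_MOVE_PENDING) != 0;

    desc->status &= ~SKETCH_IRQ_MOVE_PENDING;
    return was_pending;
}
```

Because the bit lives in a word the handler reads anyway, checking it does not
need a second load from a separate field.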

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rajesh Shah <rajesh.shah@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:26 -07:00
Cedric Le Goater
f7aa2638f2 [PATCH] Fix linux/nfsd/const.h for make headers_check
make headers_check fails on linux/nfsd/const.h.

Since linux/sunrpc/msg_prot.h does not seem to export anything interesting
for userspace, this patch moves it into the __KERNEL__ protected section.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:24 -07:00
J.Bruce Fields
42ca099381 [PATCH] knfsd: nfsd4: actually use all the pieces to implement referrals
Use all the pieces set up so far to implement referral support, allowing
return of NFS4ERR_MOVED and fs_locations attribute.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:23 -07:00
J.Bruce Fields
81c3f41302 [PATCH] knfsd: nfsd4: xdr encoding for fs_locations
Encode fs_locations attribute.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:23 -07:00
Manoj Naik
933469190e [PATCH] knfsd: nfsd4: fslocations data structures
Define FS locations structures, some functions to manipulate them, and add
code to parse FS locations in downcall and add to the exports structure.

[bfields@fieldses.org: bunch of fixes and cleanups]
Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:22 -07:00
J.Bruce Fields
b009a873de [PATCH] knfsd: nfsd: store export path in export
Store the export path in the svc_export structure instead of storing only the
dentry.  This will prevent the need for additional d_path calls to provide
NFSv4 fs_locations support.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:22 -07:00
Neil Brown
89e63ef609 [PATCH] Convert lockd to use the newer mutex instead of the older semaphore
Both the (recently introduced) nsm_sema and the older f_sema are converted
over.

Cc: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:19 -07:00
Olaf Kirch
bc5fea4299 [PATCH] knfsd: register all RPC programs with portmapper by default
The NFSACL patches introduced support for multiple RPC services listening on
the same transport.  However, only the first of these services was registered
with portmapper.  This was perfectly fine for nfsacl, as you traditionally do
not want these to show up in a portmapper listing.

The patch below changes the default behavior to always register all services
listening on a given transport, but retains the old behavior for nfsacl
services.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:19 -07:00
Olaf Kirch
460f5cac1e [PATCH] knfsd: export nsm_local_state to user space via sysctl
Every NLM call includes the client's NSM state.  Currently, the Linux client
always reports 0 - which seems not to cause any problems, but is not what the
protocol says.

This patch exposes the kernel's internal variable to user space via a sysctl,
which can be set at system boot time by statd.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:18 -07:00
Olaf Kirch
39be4502cb [PATCH] knfsd: match GRANTED_RES replies using cookies
When we send a GRANTED_MSG call, we currently copy the NLM cookie provided in
the original LOCK call - because in 1996, some broken clients seemed to rely
on this bug.  However, this means the cookies are not unique, so that when the
client's GRANTED_RES message comes back, we cannot simply match it based on
the cookie, but have to use the client's IP address in addition.  Which breaks
when you have a multi-homed NFS client.

The X/Open spec explicitly mentions that clients should not expect the same
cookie; so one may hope that any clients that were broken in 1996 have either
been fixed or rendered obsolete.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:18 -07:00
Olaf Kirch
031d869d0e [PATCH] knfsd: make nlmclnt_next_cookie SMP safe
The way we incremented the NLM cookie in nlmclnt_next_cookie was not thread
safe.  This patch changes the counter to an atomic_t.
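
A user-space analogue of the fix, using C11 atomics in place of the kernel's
atomic_t, might look like the following.  It is a sketch under that
assumption; sketch_cookie and sketch_next_cookie are hypothetical names.

```c
#include <stdatomic.h>

/* Shared counter; a plain "counter++" on an ordinary integer would race. */
static atomic_uint sketch_cookie;

/* Returns a fresh cookie value; safe to call from several threads at once. */
static unsigned int sketch_next_cookie(void)
{
    /* atomic_fetch_add returns the old value, so add 1 to get the new one. */
    return atomic_fetch_add(&sketch_cookie, 1u) + 1u;
}
```

Two concurrent callers can never observe the same return value, which is the
property the unprotected increment could not guarantee.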

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
abd1f50094 [PATCH] knfsd: lockd: optionally use hostnames for identifying peers
This patch adds the nsm_use_hostnames sysctl and module param.  If set, lockd
will use the client's name (as given in the NLM arguments) to find the NSM
handle.  This makes recovery work when the NFS peer is multi-homed, and the
reboot notification arrives from a different IP than the original lock calls.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
NeilBrown
350fce8dbf [PATCH] knfsd: simplify nlmsvc_invalidate_all
As a result of previous patches, the loop in nlmsvc_invalidate_all just sets
h_expires for all client/hosts to 0 (though does it in a very complicated
way).

This was possibly meant to trigger early garbage collection but half the time
'0' is in the future and so it in fact delays garbage collection.

Pre-aging the 'hosts' is not really needed at this point anyway so we throw
out the loop and nlm_find_client which is no longer needed.
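
The remark that '0' is in the future half the time follows from
wraparound-safe tick comparison.  The sketch below models it in user space
with an unsigned long tick counter; the sketch_* names are hypothetical, and
the cast relies on the usual two's-complement conversion.

```c
/*
 * Wraparound-safe "a is later than b" for a free-running tick counter:
 * the unsigned difference is reinterpreted as a signed distance.
 */
static int sketch_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* An entry counts as expired once "now" is later than its expiry time. */
static int sketch_expired(unsigned long expires, unsigned long now)
{
    return sketch_time_after(now, expires);
}
```

With an expiry of 0, the entry looks expired while the counter is in the lower
half of its range, but looks like it expires in the future once the counter is
in the upper half.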

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
f2af793db0 [PATCH] knfsd: lockd: make nlm_traverse_* more flexible
This patch makes nlm_traverse{locks,blocks,shares} and friends use a function
pointer rather than an "action" enum.

This function pointer is given two nlm_hosts (one given by the caller, the
other taken from the lock/block/share currently visited), and is free to do
with them as it wants.  If it returns a non-zero value, the lock/block/share
is released.
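
The callback-driven traversal can be sketched in user space as follows.  The
types and sketch_* names are hypothetical stand-ins for nlm_host and the
lock/block/share lists, chosen only to show the shape of the interface.

```c
struct sketch_host { int id; };

struct sketch_lock {
    struct sketch_host *host;  /* owner of this lock */
    int released;
};

/* Callback: gets the caller's host and the owner of the visited item. */
typedef int (*sketch_match_fn)(struct sketch_host *given,
                               struct sketch_host *owner);

/* Visits every lock; releases those for which match() returns non-zero. */
static int sketch_traverse(struct sketch_lock *locks, int n,
                           struct sketch_host *given, sketch_match_fn match)
{
    int count = 0;

    for (int i = 0; i < n; i++) {
        if (!locks[i].released && match(given, locks[i].host)) {
            locks[i].released = 1;
            count++;
        }
    }
    return count;  /* number of locks released by this pass */
}

static int sketch_same_host(struct sketch_host *given, struct sketch_host *owner)
{
    return given == owner;
}

static int sketch_any_host(struct sketch_host *given, struct sketch_host *owner)
{
    (void)given;
    (void)owner;
    return 1;
}
```

New policies are added by writing another small predicate instead of
extending an enum and the switch statements that interpret it.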

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
07ba806351 [PATCH] knfsd: change nlm_file to use a hlist
This changes struct nlm_file and the nlm_files hash table to use a hlist
instead of the home-grown lists.

This allows us to remove f_hash which was only used to find the right hash
chain to delete an entry from.

It also increases the size of the nlm_files hash table from 32 to 128.
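
Why an hlist makes a back-reference to the hash chain unnecessary can be shown
with a small user-space model of the same pointer layout.  This is a sketch,
not the kernel's list.h; the sketch_* names are hypothetical.

```c
#include <stddef.h>

struct sketch_hnode {
    struct sketch_hnode *next;
    struct sketch_hnode **pprev;  /* address of the pointer that points at us */
};

struct sketch_hhead {
    struct sketch_hnode *first;
};

/* Insert n at the front of the chain h. */
static void sketch_hlist_add_head(struct sketch_hnode *n, struct sketch_hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink n without being told which chain it is on. */
static void sketch_hlist_del(struct sketch_hnode *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
}
```

Deletion only needs the node itself, so a field whose sole job was to locate
the right chain for removal can go away.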

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
68a2d76cea [PATCH] knfsd: lockd: Change list of blocked list to list_node
This patch changes the nlm_blocked list to use a list_node instead of
homegrown linked list handling.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
0cea32761a [PATCH] knfsd: lockd: make the hash chains use a hlist_node
Get rid of the home-grown singly linked lists for the nlm_host hash table.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
9502c52259 [PATCH] knfsd: lockd: make the nsm upcalls use the nsm_handle
This converts the statd upcalls to use the nsm_handle.

This means that we only register each host once with statd, rather than
registering each host/vers/protocol triple.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
5c8dd29ca7 [PATCH] knfsd: lockd: Make nlm_host_rebooted use the nsm_handle
This patch makes the SM_NOTIFY handling understand and use the nsm_handle.

To make it a bit clearer what is happening:

    nlmclnt_prepare_reclaim and nlmclnt_finish_reclaim
    get open-coded into 'reclaimer'

The result is tidied up.

Then some of that functionality is moved out into nlm_host_rebooted (which
calls nlmclnt_recovery which starts a thread which runs reclaimer).

Also host_rebooted now finds an nsm_handle rather than a host, then
iterates over all hosts and deals with each host that shares that nsm_handle.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:17 -07:00
Olaf Kirch
8dead0dbd4 [PATCH] knfsd: lockd: introduce nsm_handle
This patch introduces the nsm_handle, which is shared by all nlm_host objects
referring to the same client.

With this patch applied, all nlm_hosts from the same address will share the
same nsm_handle.  A future patch will add sharing by name.

Note: this patch changes h_name so that it is no longer guaranteed to be an IP
address of the host.  When the host represents an NFS server, h_name will be
the name passed in the mount call.  When the host represents a client, h_name
will be the name presented in the lock request received from the client.  A
h_name is only used for printing informational messages, this change should
not be significant.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:16 -07:00
Olaf Kirch
db4e4c9a9e [PATCH] knfsd: when looking up a lockd host, pass hostname & length
This patch adds the peer's hostname (and name length) to all calls to
nlm*_lookup_host functions.  A subsequent patch will make use of these (if
requested by a sysctl).

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:16 -07:00
Olaf Kirch
cf712c24d7 [PATCH] knfsd: consolidate common code for statd->lockd notification
Common code from nlm4svc_proc_sm_notify and nlmsvc_proc_sm_notify is moved
into a new nlm_host_rebooted.

This is in preparation of a patch that will change the reboot notification
handling entirely.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:16 -07:00
Greg Banks
7b2b1fee30 [PATCH] knfsd: knfsd: cache ipmap per TCP socket
Speed up high call-rate workloads by caching the struct ip_map for the peer on
the connected struct svc_sock instead of looking it up in the ip_map cache
hashtable on every call.  This helps workloads using AUTH_SYS authentication
over TCP.

Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16
synthetic client threads simulating an rsync (i.e.  recursive directory
listing) workload reading from an i386 RH9 install image (161480 regular files
in 10841 directories) on the server.  That tree is small enough to fit in the
server's RAM so no disk traffic was involved.  This setup gives a sustained
call rate in excess of 60000 calls/sec before being CPU-bound on the server.

Profiling showed strcmp(), called from ip_map_match(), was taking 4.8% of each
CPU, and ip_map_lookup() was taking 2.9%.  This patch drops both contributions
into the profile noise.

Note that the above result overstates the value of this patch for most
workloads.  The synthetic clients are all using separate IP addresses, so
there are 64 entries in the ip_map cache hash.  Because the kernel measured
contained the bug fixed in commit

commit 1f1e030bf7

and was running on a 64bit little-endian machine, probably all of those 64
entries were on a single chain, thus increasing the cost of ip_map_lookup().

With a modern kernel you would need more clients to see the same amount of
performance improvement.  This patch has helped to scale knfsd to handle a
deployment with 2000 NFS clients.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:16 -07:00
NeilBrown
596bbe53eb [PATCH] knfsd: Allow max size of NFSd payload to be configured
The max possible is the maximum RPC payload.  The default depends on amount of
total memory.

The value can be set within reason as long as no nfsd threads are currently
running.  The value can also be read, allowing the default to be determined
after nfsd has started.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:16 -07:00
Greg Banks
7adae489fe [PATCH] knfsd: Prepare knfsd for support of rsize/wsize of up to 1MB, over TCP
The limit over UDP remains at 32K.  Also, make some of the apparently
arbitrary sizing constants clearer.

The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of
the rqstp.  This allows it to be different for different protocols (udp/tcp)
and also allows it to depend on the server's declared sv_bufsz.

Note that we don't actually increase sv_bufsz for nfs yet.  That comes next.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:16 -07:00
NeilBrown
3cc03b164c [PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom
..  by allocating the array of 'kvec' in 'struct svc_rqst'.

As we plan to increase RPCSVC_MAXPAGES from 8 up to 256, we can no longer
allocate an array of this size on the stack.  So we allocate it in 'struct
svc_rqst'.

However svc_rqst contains (indirectly) an array of the same type and size
(actually several, but they are in a union).  So rather than waste space, we
move those arrays out of the separately allocated union and into svc_rqst to
share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at
different times, so there is no conflict).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:15 -07:00
NeilBrown
4452435948 [PATCH] knfsd: Replace two page lists in struct svc_rqst with one
We are planning to increase RPCSVC_MAXPAGES from about 8 to about 256.  This
means we need to be a bit careful about arrays of size RPCSVC_MAXPAGES.

struct svc_rqst contains two such arrays.  However, there are never more
than RPCSVC_MAXPAGES pages in the two arrays together, so only one array is
needed.

The two arrays are for the pages holding the request, and the pages holding
the reply.  Instead of two arrays, we can simply keep an index into where the
first reply page is.
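
The idea of one shared array plus an index to the first reply page can be
sketched as below.  The field and function names are hypothetical and the
array bound is an arbitrary small value, not the real RPCSVC_MAXPAGES.

```c
#include <stddef.h>

#define SKETCH_MAXPAGES 8  /* hypothetical bound on request + reply pages */

struct sketch_rqst {
    void *pages[SKETCH_MAXPAGES];  /* request pages first, reply pages after */
    size_t used;                   /* total number of pages in use */
    size_t resp_start;             /* index of the first reply page */
};

/* Append a page; returns -1 if the shared array is full. */
static int sketch_add_page(struct sketch_rqst *rq, void *page)
{
    if (rq->used == SKETCH_MAXPAGES)
        return -1;
    rq->pages[rq->used++] = page;
    return 0;
}

/* Everything added from now on belongs to the reply. */
static void sketch_begin_reply(struct sketch_rqst *rq)
{
    rq->resp_start = rq->used;
}

/* Number of reply pages; meaningful once sketch_begin_reply() has run. */
static size_t sketch_reply_pages(const struct sketch_rqst *rq)
{
    return rq->used - rq->resp_start;
}
```

Since the request and reply together never exceed the bound, a single array
of that size is enough and the second array can be dropped.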

This patch also removes a number of small inline functions that probably
serve to obscure what is going on rather than clarify it, and open-codes the
needed functionality.

Also remove the 'rq_restailpage' variable as it is *always* 0.  i.e.  if the
response 'xdr' structure has a non-empty tail it is always in the same pages
as the head.

 check counters are initialised and incr properly
 check for consistent usage of ++ etc
 maybe extract some inlines for common approach
 general review

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Magnus Maatta <novell@kiruna.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:15 -07:00
Alex Dubov
4020f2d7f0 [PATCH] mmc: driver for TI FlashMedia card reader - source
Driver for TI Flash Media card reader.  At present, only MMC/SD cards are
supported.

[akpm@osdl.org: cleanups, build fixes]
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Cc: Daniel Qarras <dqarras@yahoo.com>
Acked-by: Pierre Ossman <drzeus@drzeus.cx>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:14 -07:00
Jim Cromie
856fe98f16 [PATCH] scx200_hrt: fix precedence bug manifesting as 27x clock in 1 MHz mode
Fix paren-placement / precedence bug breaking initialization for 1 MHz
clock mode.

Also fix comment spelling error, and fence-post (off-by-one) error on
symbol used in request_region.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=7242

Thanks alexander.krause@erazor-zone.de, dzpost@dedekind.net, for the
reports and patch test, and phelps@mantara.com for the independent patch
and verification.

Signed-off-by:  Jim Cromie <jim.cromie@gmail.com>
Cc: <alexander.krause@erazor-zone.de>
Cc: <dzpost@dedekind.net>
Cc: <phelps@mantara.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:14 -07:00
Christoph Hellwig
1d2c8eea69 [PATCH] slab: clean up leak tracking ifdefs a little bit
- rename ____kmalloc to kmalloc_track_caller so that people have a chance
  to guess what it does just from its name.  Add a comment describing it
  for those who don't.  Also move it after kmalloc in slab.h so people get
  less confused when they are just looking for kmalloc.
- move things around in slab.c a little to reduce the ifdef mess.

[penberg@cs.helsinki.fi: Fix up reversed #ifdef]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:13 -07:00
Cedric Le Goater
b119f13f56 [PATCH] ipc: headers_check fix
Fix headers_check #ifdef __KERNEL__ stuff.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
All-the-fault-of: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:12 -07:00
Kyle McMartin
f86e45131f [PATCH] Need forward decl of task_struct in linux/debug_locks.h
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2006-10-04 06:45:23 -06:00
Steve Grubb
c8e649ba90 [PATCH] message types updated
Hi,

This patch adds a new type for 3rd party module use and cleans up a deprecated
message type.

Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-10-04 08:31:24 -04:00
Dave Jones
038b0a6d8d Remove all inclusions of <linux/config.h>
kbuild explicitly includes this at build time.

Signed-off-by: Dave Jones <davej@redhat.com>
2006-10-04 03:38:54 -04:00
Diego Beltrami
0a69452cb4 [XFRM]: BEET mode
This patch introduces the BEET mode (Bound End-to-End Tunnel) as
specified by the IETF draft at the following link:

http://www.ietf.org/internet-drafts/draft-nikander-esp-beet-mode-06.txt

The patch provides only single family support (i.e. inner family =
outer family).

Signed-off-by: Diego Beltrami <diego.beltrami@gmail.com>
Signed-off-by: Miika Komu     <miika@iki.fi>
Signed-off-by: Herbert Xu     <herbert@gondor.apana.org.au>
Signed-off-by: Abhinav Pathak <abhinav.pathak@hiit.fi>
Signed-off-by: Jeff Ahrenholz <ahrenholz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-04 00:31:09 -07:00
Bart De Schuymer
b18dfa90c0 [NETFILTER]: ebt_mark: add or/and/xor action support to mark target
The following patch adds or/and/xor functionality for the mark target,
while staying backwards compatible.
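
What set/or/and/xor actions on a mark amount to can be sketched as follows.
The enum and function names are hypothetical and are not the identifiers used
by ebt_mark itself.

```c
/* Hypothetical action selector; "set" is the pre-existing behaviour. */
enum sketch_mark_op {
    SKETCH_MARK_SET,
    SKETCH_MARK_OR,
    SKETCH_MARK_AND,
    SKETCH_MARK_XOR
};

/* Combine the current mark with the configured value. */
static unsigned long sketch_apply_mark(unsigned long mark,
                                       enum sketch_mark_op op,
                                       unsigned long value)
{
    switch (op) {
    case SKETCH_MARK_OR:
        return mark | value;
    case SKETCH_MARK_AND:
        return mark & value;
    case SKETCH_MARK_XOR:
        return mark ^ value;
    case SKETCH_MARK_SET:
    default:
        return value;  /* overwrite, as before the new actions existed */
    }
}
```

Keeping plain "set" as the default case is one way to stay compatible with
callers that know nothing about the new actions.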

Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-04 00:30:57 -07:00
Simon Horman
b4c4ed175f [NETFILTER]: add type parameter to ip_route_me_harder
By adding a type parameter to ip_route_me_harder() the
expensive call to inet_addr_type() can be avoided in some cases.
A followup patch where ip_route_me_harder() is called from within
ip_vs_out() is one such example.

Signed-off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-10-04 00:30:54 -07:00
Li Yang
9865853851 [POWERPC] Add QUICC Engine (QE) infrastructure
Add QUICC Engine (QE) configuration, header files, and
QE management and library code that are used by QE device
drivers.

Includes Leo's modifications up to, and including, the
platform_device to of_device adaptation:

"The series of patches add generic QE infrastructure called
qe_lib, and MPC8360EMDS board support.  Qe_lib is used by
QE device drivers such as ucc_geth driver.

This version updates QE interrupt controller to use new irq
mapping mechanism, addresses all the comments received with
last submission and includes some style fixes.

v2: Change to use device tree for BCSR and MURAM;
Remove I/O port interrupt handling code as it is not generic
enough.

v3: Address comments from Kumar;  Update definition of several
device tree nodes;  Copyright style change."

In addition, the following changes have been made:

o removed typedefs
o uint -> u32 conversions
o removed following defines:
  QE_SIZEOF_BD, BD_BUFFER_ARG, BD_BUFFER_CLEAR, BD_BUFFER,
  BD_STATUS_AND_LENGTH_SET, BD_STATUS_AND_LENGTH, and BD_BUFFER_SET
  because they hid sizeof/in_be32/out_be32 operations from the reader.
o fixed qe_snums_init() serial num assignment to use a const array
o made CONFIG_UCC_FAST select UCC_SLOW
o reduced NR_QE_IC_INTS from 128 to 64
o remove _IO_BASE, etc. defines (not used)
o removed irrelevant comments, added others to resemble removed BD_ defines
o realigned struct definitions in headers
o various other style fixes including things like pinMask -> pin_mask
o fixed a ton of whitespace issues
o marked ioregs as __be32/__be16
o removed platform_device code and redundant get_qe_base()
o removed redundant comments
o added cpu_relax() to qe_reset
o uncasted all get_property() assignments
o eliminated unneeded casts
o eliminated immrbar_phys_to_virt (not used)

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Shlomi Gridish <gridish@freescale.com>
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-10-04 15:24:27 +10:00
Linus Torvalds
708e16892e Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (39 commits)
  Add missing maintainer countries in CREDITS
  Fix bytes <-> kilobytes  typo in Kconfig for ramdisk
  fix a typo in Documentation/pi-futex.txt
  BUG_ON conversion for fs/xfs/
  BUG_ON() conversion in fs/nfsd/
  BUG_ON conversion for fs/reiserfs
  BUG_ON cleanups in arch/i386
  BUG_ON cleanup in drivers/net/tokenring/
  BUG_ON cleanup for drivers/md/
  kerneldoc-typo in led-class.c
  debugfs: spelling fix
  rcutorture: Fix incorrect description of default for nreaders parameter
  parport: Remove space in function calls
  Michal Wronski: update contact info
  Spelling fix: "control" instead of "cotrol"
  reboot parameter in Documentation/kernel-parameters.txt
  Fix copy&waste bug in comment in scripts/kernel-doc
  remove duplicate "until" from kernel/workqueue.c
  ite_gpio fix tabbage
  fix file specification in comments
  ...

Fixed trivial path conflicts due to removed files:
   arch/mips/dec/boot/decstation.c, drivers/char/ite_gpio.c
2006-10-03 16:35:11 -07:00
Uwe Zeisberger
f30c226954 fix file specification in comments
Many files include the filename at the beginning; several used a wrong one.

Signed-off-by: Uwe Zeisberger <Uwe_Zeisberger@digi.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-10-03 23:01:26 +02:00
Linus Torvalds
e6bf0bf374 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] Fix wreckage after removal of tickadj; convert to GENERIC_TIME.
  [MIPS] DECstation defconfig update
  [MIPS] Fix size of zones_size and zholes_size array
  [MIPS] BCM1480: Mask pending interrupts against c0_status.im.
  [MIPS] SB1250: Interrupt handler fixes
  [MIPS] Remove IT8172-based platforms, ITE 8172G and Globespan IVR support.
  [MIPS] Remove Atlas and SEAD from feature-removal-schedule.
  [MIPS] Remove Jaguar and Ocelot family from feature list.
  [MIPS] BCM1250: TRDY timeout tweaks for Broadcom SiByte systems
  [MIPS] Remove dead DECstation boot code
  [MIPS] Let gcc align 'struct pt_regs' on 8 bytes boundary
2006-10-03 13:03:40 -07:00
Mauro Carvalho Chehab
dcc29cbcec V4L/DVB (4673): Mark the two newer ioctls as experimental
The VIDIOC_ENUM_FRAMESIZES and VIDIOC_ENUM_FRAMEINTERVALS ioctls are meant
to be used to provide better support for webcams. Currently, they are not yet
used by kernel drivers.
Better to keep them marked as experimental, until we have several kernel drivers
supporting those features.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-10-03 15:14:16 -03:00
Laurent Pinchart
92b2db08b1 V4L/DVB (4672): Frame format enumeration (1/2)
Add VIDIOC_ENUM_FRAMESIZES and VIDIOC_ENUM_FRAMEINTERVALS ioctls to enumerate 
supported frame sizes and frame intervals.

Signed-off-by: Martin Rubli <martin.rubli@epfl.ch>
Signed-off-by: Laurent Pinchart <laurent.pinchart@skynet.be>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-10-03 15:14:15 -03:00
Yoichi Yuasa
af8b128719 [MIPS] Remove IT8172-based platforms, ITE 8172G and Globespan IVR support.
As per feature-removal-schedule.txt.

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2006-10-03 17:59:17 +01:00
Maciej W. Rozycki
15a1c51404 [MIPS] BCM1250: TRDY timeout tweaks for Broadcom SiByte systems
It was observed that at least one older PCI card predating the
requirement for the TRDY signal to respond within 16 clock ticks actually
does not meet this rule nor even the power-on defaults of the PCI bridges
found in development systems built around the Broadcom SiByte SOCs.  Here
is a patch that bumps up the timeout to the highest finite value supported
by these chips, which is 255 clock ticks.  The bridges affected are the
SiByte SOC itself and the SP1011.
    
 This change does not effectively affect systems only having PCI option
cards installed that meet the TRDY requirement of the current PCI spec.
The rule was introduced with PCI 2.1, so any older card may make the
system affected.  If this is the case, performance of the system will
suffer in return for the card working at all.  If this is a concern, then
the solution is not to use such cards.
    
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

2006-10-03 17:59:17 +01:00
Linus Torvalds
6f3a28f7d1 Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-serial
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-serial: (21 commits)
  [SERIAL] add PNP IDs for FPI based touchscreens
  [SERIAL] Magic SysRq SAK does nothing on serial consoles
  [SERIAL] tickle NMI watchdog on serial output.
  [SERIAL] Fix oops when removing suspended serial port
  [SERIAL] Fix resume handling bug
  [SERIAL] Remove wrong asm/serial.h inclusions
  [SERIAL] CONFIG_PM=n slim: drivers/serial/8250_pci.c
  [SERIAL] OMAP1510 serial fix for 115200 baud
  [SERIAL] returning proper error from serial core driver
  [SERIAL] Make uart_line_info() correctly tell MMIO from I/O port
  [SERIAL] suspend/resume handlers don't have level arg anymore
  [SERIAL] 8250 resourse management fixes
  [SERIAL] serial_cs: Add quirk for brainboxes 2-port RS232 card
  [SERIAL] serial_cs: handle Nokia multi->single port bodge via config quirk
  [SERIAL] serial_cs: add configuration quirk
  [SERIAL] serial_cs: Convert Oxford 950 / Possio GCC wakeup quirk
  [SERIAL] serial_cs: convert IBM post-init handling to a quirk
  [SERIAL] serial_cs: allow wildcarded quirks
  [SERIAL] serial_cs: convert multi-port table to quirk table
  [SERIAL] serial_cs: Use clean up multiport card detection
  ...
2006-10-03 09:13:29 -07:00
Linus Torvalds
ccaa36f735 Merge git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (29 commits)
  [POWERPC] Fix rheap alignment problem
  [POWERPC] Use check_legacy_ioport() for ISAPnP
  [POWERPC] Avoid NULL pointer in gpio1_interrupt
  [POWERPC] Enable generic rtc hook for the MPC8349 mITX
  [POWERPC] Add powerpc get/set_rtc_time interface to new generic rtc class
  [POWERPC] Create a "wrapper" script and use it in arch/powerpc/boot
  [POWERPC] fix spin lock nesting in hvc_iseries
  [POWERPC] EEH failure to mark pci slot as frozen.
  [POWERPC] update powerpc defconfig files after libata kconfig breakage
  [POWERPC] enable sysrq in pmac32_defconfig
  [POWERPC] UPIO_TSI cleanup
  [POWERPC] rewrite mkprep and mkbugboot in sane C
  [POWERPC] maple/pci iomem annotations
  [POWERPC] powerpc oprofile __user annotations
  [POWERPC] cell spufs iomem annotations
  [POWERPC] NULL noise removal: spufs
  [POWERPC] ppc math-emu needs -fno-builtin-fabs for math.c and fabs.c
  [POWERPC] update mpc8349_itx_defconfig and remove some debug settings
  [POWERPC] Always call cede in pseries dedicated idle loop
  [POWERPC] Fix loop logic in irq_alloc_virt()
  ...
2006-10-03 08:52:26 -07:00
Zach Brown
8b2a1fd1b3 [PATCH] pr_debug: check pr_debug() arguments
check pr_debug() arguments

When DEBUG isn't defined pr_debug() is defined away as an empty macro.  By
throwing away the arguments we allow completely incorrect code to build.

Instead let's make it an empty inline which checks arguments and mark it so gcc
can check the format specification.

This results in a small code size increase (177 bytes of text and 40 bytes
of data).  For an x86-64 allyesconfig:

    text    data     bss      dec     hex filename
25354768 7191098 4854720 37400586 23ab00a vmlinux.before
25354945 7191138 4854720 37400803 23ab0e3 vmlinux

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:20 -07:00
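A minimal userspace sketch of the idea behind 8b2a1fd1b3, assuming gcc or a compatible compiler: the disabled debug call becomes an empty inline function carrying a printf format attribute, so its arguments are still checked. The names pr_debug_off() and report_chunk() are hypothetical and exist only for this illustration.

```c
/*
 * Userspace sketch, not the kernel header.  With debugging compiled out,
 * pr_debug_off() does nothing at run time, but the format attribute lets
 * the compiler check the arguments against the format string.
 */
static inline int __attribute__((format(printf, 1, 2)))
pr_debug_off(const char *fmt, ...)
{
	(void)fmt;	/* arguments are type-checked, then discarded */
	return 0;
}

/*
 * Hypothetical caller.  A mismatched call such as
 * pr_debug_off("%s\n", 42) would be flagged when -Wformat is enabled,
 * whereas an empty macro would silently accept it.
 */
int report_chunk(int chunk)
{
	return pr_debug_off("dirtying chunk %d\n", chunk);
}
```

With an empty macro the arguments vanish before the compiler sees them; with the inline function they are parsed and checked even though no output is produced.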
Paul Clements
d19c2ee0b8 [PATCH] md: allow SET_BITMAP_FILE to work on 64bit kernel with 32bit userspace
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:19 -07:00
NeilBrown
e8703fe1f5 [PATCH] md: remove MAX_MD_DEVS which is an arbitrary limit
Once upon a time we needed a fixed limit on the number of md devices,
probably because we preallocated some array.  This need no longer exists, but
we still have an arbitrary limit.

So remove MAX_MD_DEVS and allow as many devices as we can fit into the 'minor'
part of a device number.

Also remove some useless noise at init time (which reports MAX_MD_DEVS) and
remove MD_THREAD_NAME_MAX which hasn't been used for a while.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:18 -07:00
NeilBrown
11ce99e625 [PATCH] md: Remove working_disks from raid1 state data
It is equivalent to conf->raid_disks - conf->mddev->degraded.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
Paul Clements
9b1d1dac18 [PATCH] md: new sysfs interface for setting bits in the write-intent-bitmap
Add a new sysfs interface that allows the bitmap of an array to be dirtied.
The interface is write-only, and is used as follows:

echo "1000" > /sys/block/md2/md/bitmap

(dirty the bit for chunk 1000 [offset 0] in the in-memory and on-disk
bitmaps of array md2)

echo "1000-2000" > /sys/block/md1/md/bitmap

(dirty the bits for chunks 1000-2000 in md1's bitmap)

This is useful, for example, in cluster environments where you may need to
combine two disjoint bitmaps into one (following a server failure, after a
secondary server has taken over the array).  By combining the bitmaps on
the two servers, a full resync can be avoided (This was discussed on the
list back on March 18, 2005, "[PATCH 1/2] md bitmap bug fixes" thread).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
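The two input forms accepted by the bitmap interface in 9b1d1dac18 ("1000" and "1000-2000") can be sketched as a small parser. This is a hedged userspace illustration, not the md code; parse_chunk_range() is a hypothetical name, and tolerating a trailing newline (as echo would send) is an assumption.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse "N" or "N-M" into an inclusive chunk range.
 * Returns 0 on success and -1 on malformed input.
 */
int parse_chunk_range(const char *s, unsigned long *start, unsigned long *end)
{
	char *p;

	if (*s < '0' || *s > '9')
		return -1;
	*start = strtoul(s, &p, 10);
	*end = *start;			/* single chunk unless a range follows */

	if (*p == '-') {
		const char *q = p + 1;

		if (*q < '0' || *q > '9')
			return -1;
		*end = strtoul(q, &p, 10);
	}
	if (*p == '\n')			/* assumption: allow echo's newline */
		p++;
	if (*p != '\0' || *end < *start)
		return -1;
	return 0;
}
```

A caller would then dirty every chunk from *start to *end inclusive.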
NeilBrown
76186dd8b7 [PATCH] md: remove 'working_disks' from raid10 state
It isn't needed as mddev->degraded contains equivalent info.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
NeilBrown
02c2de8cc8 [PATCH] md: remove the working_disks and failed_disks from raid5 state data.
They are not needed.  conf->failed_disks is the same as mddev->degraded and
conf->working_disks is conf->raid_disks - mddev->degraded.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
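The equivalences stated in 02c2de8cc8 (and relied on by the raid1 and raid10 patches nearby) can be written out as two accessors. This is only a sketch; the struct layouts below are simplified stand-ins, not the kernel's definitions.

```c
/* Simplified stand-ins for the md structures; only the fields used here. */
struct mddev_sketch {
	int degraded;		/* number of missing or failed devices */
};

struct raid5_conf_sketch {
	int raid_disks;		/* devices the array is configured for */
	struct mddev_sketch *mddev;
};

/* failed_disks carried the same information as mddev->degraded. */
int raid5_failed_disks(const struct raid5_conf_sketch *conf)
{
	return conf->mddev->degraded;
}

/* working_disks can be derived instead of being stored. */
int raid5_working_disks(const struct raid5_conf_sketch *conf)
{
	return conf->raid_disks - conf->mddev->degraded;
}
```

Deriving the values avoids having to keep separate counters in sync.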
NeilBrown
850b2b420c [PATCH] md: replace magic numbers in sb_dirty with well defined bit flags
Instead of magic numbers (0,1,2,3) in sb_dirty, we have
some flags instead:
MD_CHANGE_DEVS
   Some device state has changed requiring superblock update
   on all devices.
MD_CHANGE_CLEAN
   The array has transitioned from 'clean' to 'dirty' or back,
   requiring a superblock update on active devices, but possibly
   not on spares
MD_CHANGE_PENDING
   A superblock update is underway.

We wait for an update to complete by waiting for all flags to be clear.  A
flag can be set at any time, even during an update, without risk that the
change will be lost.

Stop exporting md_update_sb; it isn't needed.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:17 -07:00
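The flag scheme described in 850b2b420c can be sketched as follows. The bit numbers and helper names are assumptions made for illustration, and plain (non-atomic) bit operations are used here where kernel code would need atomic ones.

```c
/* Bit numbers are assumptions chosen for this sketch. */
enum md_change_bit {
	MD_CHANGE_DEVS    = 0,	/* device state changed: update all devices */
	MD_CHANGE_CLEAN   = 1,	/* clean/dirty transition: update active devices */
	MD_CHANGE_PENDING = 2,	/* a superblock update is underway */
};

void md_flag_set(unsigned long *flags, enum md_change_bit bit)
{
	*flags |= 1UL << bit;
}

void md_flag_clear(unsigned long *flags, enum md_change_bit bit)
{
	*flags &= ~(1UL << bit);
}

int md_flag_test(unsigned long flags, enum md_change_bit bit)
{
	return (flags >> bit) & 1UL;
}

/* An update is complete only when every flag is clear. */
int md_update_complete(unsigned long flags)
{
	return flags == 0;
}
```

Because each reason for an update has its own bit, a flag set while an update is in flight stays set after the in-flight update clears its own bits, so the change is not lost.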
NeilBrown
b5c124af69 [PATCH] md: fix a comment that is wrong in raid5.h
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:16 -07:00
Adrian Bunk
fbedac04fa [PATCH] md: the scheduled removal of the START_ARRAY ioctl for md
This patch contains the scheduled removal of the START_ARRAY ioctl for md.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:16 -07:00
Bryn Reeves
999d816851 [PATCH] dm table: add target flush
This patch adds support for a per-target dm_flush_fn method.  This is needed
to allow dm-loop to invalidate page cache mappings in response to BLKFLSBUF
ioctl commands.

Signed-off-by: Bryn Reeves <breeves@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:16 -07:00
Bryn Reeves
3cb4021453 [PATCH] dm: extract device limit setting
Separate the setting of device I/O limits from dm_get_device().  dm-loop will
use this.

Signed-off-by: Bryn Reeves <breeves@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:16 -07:00
Milan Broz
8757b7764f [PATCH] dm table: add target preresume
This patch adds a target preresume hook.

It is called before the targets are resumed and if it returns an error the
resume gets cancelled.

The crypt target will use this to indicate that it is unable to process I/O
because no encryption key has been supplied.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:15 -07:00
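The preresume hook from 8757b7764f can be pictured as a two-pass resume: first ask every target whether it can resume, and only resume if none objects. The types, names and the -EAGAIN error code below are assumptions for this sketch, not the device-mapper API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical target type; only what the sketch needs. */
struct target_sketch {
	int (*preresume)(struct target_sketch *t);	/* optional hook */
	int has_key;
	int resumed;
};

/* Stand-in for a crypt-like target that cannot do I/O without a key. */
int crypt_preresume_sketch(struct target_sketch *t)
{
	return t->has_key ? 0 : -EAGAIN;	/* error code is an assumption */
}

/* Run all preresume hooks first; any error cancels the whole resume. */
int table_resume_sketch(struct target_sketch *targets, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (targets[i].preresume) {
			int r = targets[i].preresume(&targets[i]);

			if (r)
				return r;
		}
	}
	for (i = 0; i < n; i++)
		targets[i].resumed = 1;
	return 0;
}
```

Checking before resuming anything means a failed hook leaves every target still suspended.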
Alasdair G Kergon
7006f6eca8 [PATCH] dm: export blkdev_driver_ioctl
Export blkdev_driver_ioctl for device-mapper.

If we get as far as the device-mapper ioctl handler, we know the ioctl is not
a standard block layer BLK* one, so we don't need to check for them a second
time and can call blkdev_driver_ioctl() directly.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:13 -07:00
Milan Broz
aa129a2247 [PATCH] dm: support ioctls on mapped devices
Extend the core device-mapper infrastructure to accept arbitrary ioctls on a
mapped device provided that it has exactly one target and it is capable of
supporting ioctls.

[We can't use unlocked_ioctl because we need 'inode': 'file' might be NULL.
Is it worth changing this?]

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Arnd Bergmann <arnd@arndb.de> wrote:

> On Wednesday 21 June 2006 21:31, Alasdair G Kergon wrote:
> > static struct block_device_operations dm_blk_dops = {
> > .open = dm_blk_open,
> > .release = dm_blk_close,
> > +.ioctl = dm_blk_ioctl,
> > .getgeo = dm_blk_getgeo,
> > .owner = THIS_MODULE
>
> I guess this also needs a ->compat_ioctl method, otherwise it won't
> work for ioctl numbers that have a compat_ioctl implementation in the
> low-level device driver.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:13 -07:00
Adrian Bunk
3cb340ecbb [PATCH] vt: proper prototypes for some console functions
This patch adds proper prototypes to header files for three console init
functions used in drivers/char/vt.c

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:12 -07:00
Antonino A. Daplas
1a6600be3e [PATCH] fbdev: Honor the return value of device_create_file
Check the return value of device_create_file().  If it fails, remove the
attributes by calling device_remove_file().

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:10 -07:00
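The pattern in 1a6600be3e, checking each creation and undoing earlier ones on failure, can be sketched generically. The callback types below are hypothetical stand-ins for device_create_file() and device_remove_file(), whose real signatures take a device and an attribute; fake_create() and fake_remove() are test doubles for this illustration only.

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for attribute creation/removal. */
typedef int (*attr_create_fn)(size_t idx);
typedef void (*attr_destroy_fn)(size_t idx);

/* Create attributes 0..count-1; on failure remove those already created. */
int create_attrs_or_rollback(size_t count, attr_create_fn create,
			     attr_destroy_fn destroy)
{
	size_t i;

	for (i = 0; i < count; i++) {
		int err = create(i);

		if (err) {
			while (i-- > 0)
				destroy(i);
			return err;
		}
	}
	return 0;
}

/* Test doubles: creation fails at index fail_at (-1 means never). */
int created[4];
int fail_at = -1;

int fake_create(size_t idx)
{
	if ((int)idx == fail_at)
		return -1;
	created[idx] = 1;
	return 0;
}

void fake_remove(size_t idx)
{
	created[idx] = 0;
}
```

Rolling back in reverse order leaves nothing half-registered if a later attribute cannot be created.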
Dennis Munsie
fc5891c8a3 [PATCH] fbdev: Add generic ddc read functionality
Adds functionality to read the EDID information over the DDC bus in a generic
way.  This code is based on the DDC implementation in the radeon driver.

[adaplas]
- separate from fbmon.c and place in new file fb_ddc.c
- remove dependency to CONFIG_I2C and CONFIG_I2C_ALGOBIT, otherwise, feature
  will not compile if i2c support is compiled as a module
- feature is selectable only by drivers needing it. It must have a
  'select FB_DDC if xxx' in Kconfig
- change printk's to dev_*, the i2c people prefer it

Signed-off-by: Dennis Munsie <dmunsie@cecropia.com>
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:09 -07:00
Alan Cox
913759ac90 [PATCH] ide: Fix crash on repeated reset
Michal Miroslaw reported a problem (bugzilla #7023) where a user initiated
reset while the IDE layer was already resetting the channel caused a crash,
and provided a rough fix.

This is a slightly cleaner version of the fix which tracks the reset state
and blocks further reset requests while a reset is in progress.

Note this is not a security issue - random end users can't access the
ioctl in question anyway.

Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:08 -07:00
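The approach described in 913759ac90, tracking the reset state and refusing further requests while a reset is in progress, might look roughly like this. The struct, the function names and the -EBUSY return value are assumptions; the sketch is single-threaded, whereas real driver code would need to protect the flag with a lock.

```c
#include <errno.h>

/* Hypothetical per-channel state; only the field this sketch needs. */
struct hwif_sketch {
	int resetting;		/* non-zero while a reset is in progress */
};

/* Start a reset unless one is already running. */
int begin_reset_sketch(struct hwif_sketch *hwif)
{
	if (hwif->resetting)
		return -EBUSY;	/* error code is an assumption */
	hwif->resetting = 1;
	return 0;
}

/* Called when the reset sequence has finished. */
void finish_reset_sketch(struct hwif_sketch *hwif)
{
	hwif->resetting = 0;
}
```

A second request arriving mid-reset is rejected instead of re-entering the reset path.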
Sergei Shtylyov
3f63c5e88a [PATCH] ide: remove dma_base2 field from ide_hwif_t
Remove dma_base2 field from ide_hwif_t as it's used only in 2 drivers and
without great need.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: John Keller <jpk@sgi.com>
Signed-off-by: Jeremy Higdon <jeremy@sgi.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:08 -07:00
Adrian Bunk
27ac6036f3 [PATCH] drivers/ide/: cleanups
- setup-pci.c: remove the unused ide_pci_unregister_driver()
- ide-dma.c: remove the unused EXPORT_SYMBOL_GPL(ide_in_drive_list)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:08 -07:00
Matt Mackall
83d7dbc409 [PATCH] Make number of IDE interfaces configurable
Make IDE_HWIFS configurable if EMBEDDED

This lets us lop as much as 16k off an x86 build.  It's a little ugly, but
it's dead simple.  Note the fix for HWIFS < 2.

Sizing interfaces dynamically unfortunately turns out to be pretty
major surgery.

add/remove: 0/1 grow/shrink: 0/11 up/down: 0/-16182 (-16182)
function                                     old     new   delta
ide_hwifs                                  16920    1692  -15228
init_irq                                    1113     750    -363
ideprobe_init                                283     138    -145
ide_pci_setup_ports                         1329    1193    -136
save_match                                    85       -     -85
ide_register_hw_with_fixup                   367     287     -80
ide_setup                                   1364    1308     -56
is_chipset_set                                40       4     -36
create_proc_ide_interfaces                   225     205     -20
init_ide_data                                 84      67     -17
ide_probe_for_cmd640x                       1198    1183     -15
ide_unregister                              1452    1451      -1

Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:07 -07:00
Sergei Shtylylov
020e322de3 [PATCH] IDE: claim extra DMA ports regardless of channel
- Claim extra DMA I/O ports regardless of what IDE channels are
  present/enabled.

- Remove extra ports handling from ide_mapped_mmio_dma() since it's not
  applicable to the custom-mapping IDE drivers.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:07 -07:00
Siddha, Suresh B
89c4710ee9 [PATCH] sched: cleanup sched_group cpu_power setup
Up to now, the sched group's cpu_power has been initialized independently
for each sched domain.  This made the setup code uglier as new sched domains
were added.

Make the sched group cpu_power setup code generic, by using the domain child
field and a new domain flag in sched_domain.  For most of the sched
domains (except NUMA), the sched group's cpu_power is now computed generically
using the domain properties of itself and of the child domain.

sched groups in NUMA domains are set up a little differently and hence they
don't use this generic mechanism.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:06 -07:00
Siddha, Suresh B
1a84887080 [PATCH] sched: introduce child field in sched_domain
Introduce the child field in sched_domain struct and use it in
sched_balance_self().

We will also use this field in cleaning up the sched group cpu_power
setup code (done in a different patch).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:04:06 -07:00
Franck Bui-Huu
ffc5089196 [PATCH] Create kallsyms_lookup_size_offset()
Some uses of kallsyms_lookup() do not need to find out the name of a symbol
or the name of the module it belongs to.  This is especially true in
arch-specific code, which needs to unwind the stack to show the back trace
during an oops (mips is an example).  In this specific case, we just need to
retrieve the function's size and the offset of the active instruction inside it.

Add a new entry, kallsyms_lookup_size_offset().  This new entry does
exactly the same as kallsyms_lookup() but does not require any buffers to
store any names.

It returns 0 on failure and 1 otherwise.

Signed-off-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:03:41 -07:00
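The contract described in ffc5089196 (report a symbol's size and the offset of an address within it, return 1 on success and 0 on failure, no name buffers) can be sketched against a toy symbol table. The table contents and all names here are hypothetical; the kernel's lookup works on its own compressed symbol data.

```c
#include <stddef.h>

/* Toy symbol table: start address and size of each function. */
struct sym_sketch {
	unsigned long addr;
	unsigned long size;
};

static const struct sym_sketch symtab_sketch[] = {
	{ 0x1000, 0x040 },
	{ 0x1040, 0x100 },
	{ 0x2000, 0x020 },
};

/* Returns 1 and fills *size / *offset (if non-NULL) when addr is inside
 * a known symbol; returns 0 otherwise.  No name is looked up or copied. */
int lookup_size_offset_sketch(unsigned long addr, unsigned long *size,
			      unsigned long *offset)
{
	size_t i;

	for (i = 0; i < sizeof(symtab_sketch) / sizeof(symtab_sketch[0]); i++) {
		const struct sym_sketch *s = &symtab_sketch[i];

		if (addr >= s->addr && addr - s->addr < s->size) {
			if (size)
				*size = s->size;
			if (offset)
				*offset = addr - s->addr;
			return 1;
		}
	}
	return 0;
}
```

A stack unwinder only needs these two numbers to find a function's boundaries, which is why skipping the name lookup is attractive there.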
David Howells
afefdbb28a [PATCH] VFS: Make filldir_t and struct kstat deal in 64-bit inode numbers
These patches make the kernel pass 64-bit inode numbers internally when
communicating to userspace, even on a 32-bit system.  They are required
because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
for example.  The 64-bit inode numbers are then propagated to userspace
automatically where the arch supports it.

Problems have been seen with userspace (eg: ld.so) using the 64-bit inode
number returned by stat64() or getdents64() to differentiate files, and
failing because the 64-bit inode number space was compressed to 32-bits, and
so overlaps occur.

This patch:

Make filldir_t take a 64-bit inode number and struct kstat carry a 64-bit
inode number so that 64-bit inode numbers can be passed back to userspace.

The stat functions then return the full 64-bit inode number where
available and where possible.  If it is not possible to represent the inode
number supplied by the filesystem in the field provided by userspace, then
error EOVERFLOW will be issued.

Similarly, the getdents/readdir functions now pass the full 64-bit inode
number to userspace where possible, returning EOVERFLOW instead when a
directory entry is encountered that can't be properly represented.

Note that this means that some inodes will not be stat'able on a 32-bit
system with old libraries where they were before - but it does mean that
there will be no ambiguity over what a 32-bit inode number refers to.

Note similarly that directory scans may be cut short with an error on a
32-bit system with old libraries where the scan would work before for the
same reasons.

It is judged unlikely that this situation will occur because modern glibc
uses 64-bit capable versions of stat and getdents class functions
exclusively, and that older systems are unlikely to encounter
unrepresentable inode numbers anyway.

[akpm: alpha build fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:03:40 -07:00
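The EOVERFLOW rule described in afefdbb28a amounts to a round-trip check when a 64-bit inode number has to be stored in a narrower userspace field. A minimal sketch, assuming a 32-bit destination field; copy_ino32_sketch() is a hypothetical name, not a kernel function.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Store a 64-bit inode number in a 32-bit field.  If the value does not
 * survive the narrowing, report -EOVERFLOW instead of silently truncating.
 */
int copy_ino32_sketch(uint64_t ino, uint32_t *out)
{
	uint32_t narrow = (uint32_t)ino;

	if ((uint64_t)narrow != ino)
		return -EOVERFLOW;
	*out = narrow;
	return 0;
}
```

Failing loudly is what removes the ambiguity: two distinct 64-bit inode numbers can no longer collapse onto the same 32-bit value.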
Andrew Morton
1d32849b14 [PATCH] pid.h cleanup
Make the pid.h macros look less revolting in an 80-col window.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 08:03:40 -07:00
Linus Torvalds
0235497f7a Add prototype for sigset_from_compat()
Duh.  I screwed up editing David Howells patch in commit
3f2e05e90e, and the actual declaration for
the sigset_from_compat() function went missing. My bad.

Olaf Hering saved the day and noticed that I'm a moron.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 14:05:20 -07:00
Steven Whitehouse
128e5ebaf8 [GFS2] Remove iflags.h, use FS_
Update GFS2 in the light of David Howells' patch:

[PATCH] BLOCK: Move common FS-specific ioctls to linux/fs.h [try #6]
36695673b0

which calls the filesystem-independent flags FS_..._FL. As a result
we no longer need the iflags.h file and the conversion routine is
moved into the GFS2 source code.

Userland programs which used to include iflags.h should now include
fs.h and use the new flag names.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2006-10-02 11:24:43 -04:00
Linus Torvalds
3e04767a46 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
  [MTD] Cleanup of 'ioremap balanced with iounmap for drivers/mtd subsystem'
  [MTD] fix nftl_write warning
  [MTD] fix printk warning
  [MTD ONENAND] Check OneNAND lock scheme & all block unlock command support
  [MTD ONENAND] Remove unused MTD_ONENAND_SYNC_READ configuration
  [MTD ONENAND] Fix OneNAND probe
  [MTD NAND] Provide prototype for newly-exported nand_wait_ready()
  [MTD] Remove #ifndef __KERNEL__ hack in <mtd/mtd-abi.h>
  [MTD NAND] Allow override of page read and write functions.
  [MTD NAND] Allocate chip->buffers separately to allow it to be overridden
  [MTD NAND] Split nand_scan() into two parts; allow board driver to intervene
  [MTD NAND] Export nand_wait_ready() for use by board drivers
2006-10-02 08:22:17 -07:00
Linus Torvalds
a12f66fccf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (35 commits)
  Input: wistron - add support for Acer TravelMate 2424NWXCi
  Input: wistron - fix setting up special buttons
  Input: add KEY_BLUETOOTH and KEY_WLAN definitions
  Input: add new BUS_VIRTUAL bus type
  Input: add driver for stowaway serial keyboards
  Input: make input_register_handler() return error codes
  Input: remove cruft that was needed for transition to sysfs
  Input: fix input module refcounting
  Input: constify input core
  Input: libps2 - rearrange exports
  Input: atkbd - support Microsoft Natural Elite Pro keyboards
  Input: i8042 - disable MUX mode on Toshiba Equium A110
  Input: i8042 - get rid of polling timer
  Input: send key up events at disconnect
  Input: constify psmouse driver
  Input: i8042 - add Amoi to the MUX blacklist
  Input: logips2pp - add sugnature 56 (Cordless MouseMan Wheel), cleanup
  Input: add driver for Touchwin serial touchscreens
  Input: add driver for Touchright serial touchscreens
  Input: add driver for Penmount serial touchscreens
  ...
2006-10-02 08:20:33 -07:00
David Howells
3f2e05e90e [PATCH] BLOCK: Revert patch to hack around undeclared sigset_t in linux/compat.h
Revert Andrew Morton's patch to temporarily hack around the lack of a
declaration of sigset_t in linux/compat.h to make the block-disablement
patches build on IA64.  This got accidentally pushed to Linus and should
be fixed in a different manner.

Also make linux/compat.h #include asm/signal.h to gain a definition of
sigset_t so that it can externally declare sigset_from_compat().

This has been compile-tested for i386, x86_64, ia64, mips, mips64, frv, ppc and
ppc64 and run-tested on frv.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 08:03:31 -07:00
Cedric Le Goater
9ec52099e4 [PATCH] replace cad_pid by a struct pid
There are a few places in the kernel where the init task is signaled.  The
ctrl+alt+del sequence is one of them.  It kills a task, usually init, using a
cached pid (cad_pid).

This patch replaces the pid_t by a struct pid to avoid the pid wrap-around
problem.  The struct pid is initialized at boot time in init() and can be
modified through sysctl with

	/proc/sys/kernel/cad_pid

[ I haven't found any distro using it ? ]

It also introduces a small helper routine kill_cad_pid() which is used
where it seemed ok to use cad_pid instead of pid 1.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:25 -07:00
Oleg Nesterov
1a657f78dc [PATCH] introduce get_task_pid() to fix unsafe get_pid()
proc_pid_make_inode:

	ei->pid = get_pid(task_pid(task));

I think this is not safe.  get_pid() can be preempted after checking "pid
!= NULL".  Then the task exits, does detach_pid(), and RCU frees the pid.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:25 -07:00
Arnd Bergmann
135ab6ec8f [PATCH] remove remaining errno and __KERNEL_SYSCALLS__ references
The last in-kernel user of errno is gone, so we should remove the definition
and everything referring to it.  This also removes the now-unused lib/execve.c
file that was introduced earlier.

Also remove every trace of __KERNEL_SYSCALLS__ that still remained in the
kernel.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:23 -07:00
Arnd Bergmann
3db03b4afb [PATCH] rename the provided execve functions to kernel_execve
Some architectures provide an execve function that does not set errno, but
instead returns the result code directly.  Rename these to kernel_execve to
get the right semantics there.  Moreover, there is no reason for any of these
architectures to still provide __KERNEL_SYSCALLS__ or _syscallN macros, so
remove these right away.

[akpm@osdl.org: build fix]
[bunk@stusta.de: build fix]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata.hirokazu@renesas.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp>
Cc: Chris Zankel <chris@zankel.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:23 -07:00
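The difference in calling convention behind 3db03b4afb can be illustrated in userspace: one style reports failure through errno and returns -1, the other returns the negative result code directly. Every function below is hypothetical, and do_exec_sketch() merely pretends that absolute paths succeed.

```c
#include <errno.h>

/* Pretend backend: succeeds for absolute paths, else -ENOENT. */
static int do_exec_sketch(const char *path)
{
	return (path && path[0] == '/') ? 0 : -ENOENT;
}

/* libc-style convention: -1 on failure, details in errno. */
int execve_errno_style(const char *path)
{
	int ret = do_exec_sketch(path);

	if (ret < 0) {
		errno = -ret;
		return -1;
	}
	return 0;
}

/* kernel_execve-style convention: the result code is the return value. */
int kernel_execve_style(const char *path)
{
	return do_exec_sketch(path);
}
```

Returning the code directly removes the need for a global errno inside the kernel.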
Kirill Korotaev
73ea41302b [PATCH] IPC namespace - utils
This patch adds basic IPC namespace functionality to
IPC utils:
- init_ipc_ns
- copy/clone/unshare/free IPC ns
- /proc preparations

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:22 -07:00
Kirill Korotaev
25b21cb2f6 [PATCH] IPC namespace core
This patch set allows unsharing IPCs and having a private set of IPC objects
(sem, shm, msg) inside a namespace.  Basically, it is another building block of
containers functionality.

This patch implements core IPC namespace changes:
- ipc_namespace structure
- new config option CONFIG_IPC_NS
- adds CLONE_NEWIPC flag
- unshare support

[clg@fr.ibm.com: small fix for unshare of ipc namespace]
[akpm@osdl.org: build fix]
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:22 -07:00
Serge E. Hallyn
071df104f8 [PATCH] namespaces: utsname: implement CLONE_NEWUTS flag
Implement a CLONE_NEWUTS flag, and use it at clone and sys_unshare.

[clg@fr.ibm.com: IPC unshare fix]
[bunk@stusta.de: cleanup]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:22 -07:00
Serge E. Hallyn
bf47fdcda6 [PATCH] namespaces: utsname: remove system_utsname
The system_utsname isn't needed now that kernel/sysctl.c is fixed.
Nuke it.

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn
4865ecf131 [PATCH] namespaces: utsname: implement utsname namespaces
This patch defines the uts namespace and some manipulators.
Adds the uts namespace to task_struct, and initializes a
system-wide init namespace.

It leaves a #define for system_utsname so sysctl will compile.
This define will be removed in a separate patch.

[akpm@osdl.org: build fix, cleanup]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn
e9ff3990f0 [PATCH] namespaces: utsname: switch to using uts namespaces
Replace references to system_utsname with the per-process uts namespace
where appropriate.  This includes things like uname.

Changes: Per Eric Biederman's comments, use the per-process uts namespace
	for ELF_PLATFORM, sunrpc, and parts of net/ipv4/ipconfig.c

[jdike@addtoit.com: UML fix]
[clg@fr.ibm.com: cleanup]
[akpm@osdl.org: build fix]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn
0bdd7aab7f [PATCH] namespaces: utsname: introduce temporary helpers
Define utsname() and init_utsname() which return &system_utsname.  Users of
system_utsname will be changed to use these helpers, after which
system_utsname will disappear.
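
A minimal user-space sketch of the accessor idea described above, not the
kernel code itself.  The struct layout, the field sizes, the initial values and
the set_nodename() caller are assumptions made for illustration; only the names
utsname(), init_utsname() and system_utsname come from the commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical user-space stand-in for the kernel's name structure. */
struct uts_fields {
    char sysname[65];
    char nodename[65];
};

/* The one global instance that callers used to touch directly. */
static struct uts_fields system_utsname = { "Linux", "localhost" };

/* Name data for the current context; for now simply the global. */
static struct uts_fields *utsname(void)
{
    return &system_utsname;
}

/* Name data of the initial, system-wide context. */
static struct uts_fields *init_utsname(void)
{
    return &system_utsname;
}

/* Hypothetical caller: goes through utsname() instead of naming the global. */
static void set_nodename(const char *name)
{
    assert(strlen(name) < sizeof(utsname()->nodename));
    strcpy(utsname()->nodename, name);
}
```

Once every caller uses the helpers, the body of utsname() can presumably be
changed to return per-process data without touching the callers again.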

Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:21 -07:00
Serge E. Hallyn
1651e14e28 [PATCH] namespaces: incorporate fs namespace into nsproxy
This moves the mount namespace into the nsproxy.  The mount namespace count
now refers to the number of nsproxies pointing to it, rather than the number of
tasks.  As a result, the unshare_namespace() function in kernel/fork.c no
longer checks whether it is being shared.
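
The counting change can be sketched in user space as follows.  This is a
simplified illustration under assumed names (mnt_ns, uts_ns, task,
nsproxy_create(), task_share(), task_put_nsproxy() are all hypothetical); the
real kernel structures and locking differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; the real kernel structures differ. */
struct mnt_ns { int count; };   /* counts nsproxies, not tasks */
struct uts_ns { int count; };

struct nsproxy {
    int count;                  /* number of tasks sharing this proxy */
    struct mnt_ns *mnt;
    struct uts_ns *uts;
};

struct task { struct nsproxy *nsproxy; };

/* Create a proxy that takes one reference on each namespace. */
static struct nsproxy *nsproxy_create(struct mnt_ns *mnt, struct uts_ns *uts)
{
    struct nsproxy *ns = malloc(sizeof(*ns));
    if (!ns)
        return NULL;
    ns->count = 1;
    ns->mnt = mnt;
    ns->uts = uts;
    mnt->count++;
    uts->count++;
    return ns;
}

/* A new task that shares all namespaces only bumps the proxy count. */
static void task_share(struct task *child, const struct task *parent)
{
    child->nsproxy = parent->nsproxy;
    child->nsproxy->count++;
}

/* Drop a task's reference; the last one releases the namespaces. */
static void task_put_nsproxy(struct task *t)
{
    struct nsproxy *ns = t->nsproxy;

    assert(ns && ns->count > 0);
    t->nsproxy = NULL;
    if (--ns->count == 0) {
        ns->mnt->count--;
        ns->uts->count--;
        free(ns);
    }
}
```

In this sketch two tasks sharing one proxy leave the mount namespace count at
1, which is the sense in which the count tracks nsproxies rather than tasks.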

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Serge E. Hallyn
ab516013ad [PATCH] namespaces: add nsproxy
This patch adds a nsproxy structure to the task struct.  Later patches will
move the fs namespace pointer into this structure, and introduce a new utsname
namespace into the nsproxy.

The vserver and openvz functionality, then, would be implemented in large part
by virtualizing/isolating more and more resources into namespaces, each
contained in the nsproxy.

[akpm@osdl.org: build fix]
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Andrey Savochkin <saw@sw.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Peter Zijlstra
12fd352038 [PATCH] nfsd: lockdep annotation
Seen while doing a kernel make modules_install install over an NFS mount:

  =============================================
  [ INFO: possible recursive locking detected ]
  ---------------------------------------------
  nfsd/9550 is trying to acquire lock:
   (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  but task is already holding lock:
   (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  other info that might help us debug this:
  2 locks held by nfsd/9550:
   #0:  (hash_sem){..--}, at: [<cc895223>] exp_readlock+0xd/0xf [nfsd]
   #1:  (&inode->i_mutex){--..}, at: [<c034c845>] mutex_lock+0x1c/0x1f

  stack backtrace:
   [<c0103508>] show_trace_log_lvl+0x58/0x152
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa57>] __lock_acquire+0x77a/0x9a3
   [<c012af4a>] lock_acquire+0x60/0x80
   [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034c845>] mutex_lock+0x1c/0x1f
   [<c0162edc>] vfs_unlink+0x34/0x8a
   [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
   [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033e84d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb
  DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
  Leftover inexact backtrace:
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa57>] __lock_acquire+0x77a/0x9a3
   [<c012af4a>] lock_acquire+0x60/0x80
   [<c034c6c2>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034c845>] mutex_lock+0x1c/0x1f
   [<c0162edc>] vfs_unlink+0x34/0x8a
   [<cc891d98>] nfsd_unlink+0x18f/0x1e2 [nfsd]
   [<cc89884f>] nfsd3_proc_remove+0x95/0xa2 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033e84d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb

  =============================================
  [ INFO: possible recursive locking detected ]
  ---------------------------------------------
  nfsd/9580 is trying to acquire lock:
   (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  but task is already holding lock:
   (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  other info that might help us debug this:
  2 locks held by nfsd/9580:
   #0:  (hash_sem){..--}, at: [<cc89522b>] exp_readlock+0xd/0xf [nfsd]
   #1:  (&inode->i_mutex){--..}, at: [<c034cc1d>] mutex_lock+0x1c/0x1f

  stack backtrace:
   [<c0103508>] show_trace_log_lvl+0x58/0x152
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa63>] __lock_acquire+0x77a/0x9a3
   [<c012af56>] lock_acquire+0x60/0x80
   [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034cc1d>] mutex_lock+0x1c/0x1f
   [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
   [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
   [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033ec1d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb
  DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
  Leftover inexact backtrace:
   [<c0103b8b>] show_trace+0xd/0x10
   [<c0103c2f>] dump_stack+0x19/0x1b
   [<c012aa63>] __lock_acquire+0x77a/0x9a3
   [<c012af56>] lock_acquire+0x60/0x80
   [<c034ca9a>] __mutex_lock_slowpath+0xa7/0x20e
   [<c034cc1d>] mutex_lock+0x1c/0x1f
   [<cc892ad1>] nfsd_setattr+0x2c8/0x499 [nfsd]
   [<cc893ede>] nfsd_create_v3+0x31b/0x4ac [nfsd]
   [<cc8984a1>] nfsd3_proc_create+0x128/0x138 [nfsd]
   [<cc88f0d4>] nfsd_dispatch+0xc0/0x178 [nfsd]
   [<c033ec1d>] svc_process+0x3a5/0x5ed
   [<cc88f5ba>] nfsd+0x1a7/0x305 [nfsd]
   [<c0101005>] kernel_thread_helper+0x5/0xb

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Neil Brown <neilb@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Greg Banks
bfd241600a [PATCH] knfsd: make rpc threads pools numa aware
Actually implement multiple pools.  On NUMA machines, allocate a svc_pool per
NUMA node; on SMP a svc_pool per CPU; otherwise a single global pool.  Enqueue
sockets on the svc_pool corresponding to the CPU on which the socket bh is run
(i.e.  the NIC interrupt CPU).  Threads have their cpu mask set to limit them
to the CPUs in the svc_pool that owns them.
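
The three cases above (one pool per NUMA node, one per CPU, or a single global
pool) can be sketched as a CPU-to-pool lookup.  The enum, struct and function
names below are hypothetical and chosen for illustration; this is not the
sunrpc implementation.

```c
#include <assert.h>

/* Hypothetical names; these illustrate the three cases, not the kernel's code. */
enum pool_mode { POOL_GLOBAL, POOL_PERCPU, POOL_PERNODE };

struct pool_map {
    enum pool_mode mode;
    unsigned int npools;           /* number of pools allocated */
    const unsigned int *cpu_node;  /* cpu -> NUMA node, used for POOL_PERNODE */
};

/* Pick the pool index for work that arrived on the given CPU. */
static unsigned int pool_for_cpu(const struct pool_map *m, unsigned int cpu)
{
    unsigned int idx = 0;

    switch (m->mode) {
    case POOL_PERCPU:
        idx = cpu;
        break;
    case POOL_PERNODE:
        idx = m->cpu_node[cpu];
        break;
    case POOL_GLOBAL:
        break;
    }
    assert(m->npools > 0);
    return idx % m->npools;  /* wrap defensively into the valid range */
}
```

A socket handled on a given CPU would then be queued on the pool returned by
pool_for_cpu(), so threads bound to that pool's CPUs pick it up.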

This is the patch that allows an Altix to scale NFS traffic linearly
beyond 4 CPUs and 4 NICs.

Incorporates changes and feedback from Neil Brown, Trond Myklebust, and
Christoph Hellwig.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:20 -07:00
Greg Banks
a74554429e [PATCH] knfsd: add svc_set_num_threads
Currently knfsd keeps its own list of all nfsd threads in nfssvc.c; add a new
way of managing the list of all threads in a svc_serv.  Add
svc_create_pooled() to allow creation of a svc_serv whose threads are managed
by the sunrpc code.  Add svc_set_num_threads() to manage the number of threads
in a service, either per-pool or globally across the service.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks
9a24ab5749 [PATCH] knfsd: add svc_get
Add svc_get() for those occasions when we need to temporarily bump up
svc_serv->sv_nrthreads as a pseudo refcount.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks
3262c816a3 [PATCH] knfsd: split svc_serv into pools
Split out the list of idle threads and pending sockets from svc_serv into a
new svc_pool structure, and allocate a fixed number (in this patch, 1) of
pools per svc_serv.  The new structure contains a lock which takes over
several of the duties of svc_serv->sv_lock, which is now relegated to
protecting only sv_tempsocks, sv_permsocks, and sv_tmpcnt in svc_serv.

The point is to move the hottest fields out of svc_serv and into svc_pool,
allowing a following patch to arrange for a svc_pool per NUMA node or per CPU.
This is a major step towards making the NFS server NUMA-friendly.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks
5685f0fa1c [PATCH] knfsd: convert sk_reserved to atomic_t
Convert the svc_sock->sk_reserved variable from an int protected by
svc_serv->sv_lock, to an atomic.  This reduces (by 1) the number of places we
need to take the (effectively global) svc_serv->sv_lock.
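
A user-space analogue of the same conversion, using C11 <stdatomic.h> rather
than the kernel's atomic_t.  The struct and function names are hypothetical;
the sketch only shows why the counter no longer needs a surrounding lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for a socket's reserved-space counter. */
struct sock_acct {
    atomic_int reserved;
};

/* Reserve space; no lock is taken, the add is a single atomic operation. */
static void acct_reserve(struct sock_acct *s, int bytes)
{
    assert(bytes >= 0);
    atomic_fetch_add(&s->reserved, bytes);
}

/* Give back previously reserved space, again without a lock. */
static void acct_release(struct sock_acct *s, int bytes)
{
    assert(bytes >= 0);
    atomic_fetch_sub(&s->reserved, bytes);
}

/* Read the current reservation. */
static int acct_reserved(struct sock_acct *s)
{
    return atomic_load(&s->reserved);
}
```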

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00
Greg Banks
1a68d952af [PATCH] knfsd: use new lock for svc_sock deferred list
Protect the svc_sock->sk_deferred list with a new lock svc_sock->sk_defer_lock
instead of svc_serv->sv_lock.  Using the more fine-grained lock reduces the
number of places we need to take the svc_serv lock.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:19 -07:00