Commit graph

842 commits

Author SHA1 Message Date
Andi Kleen
43c85c9c5d [PATCH] Remove need for early lockdep init
I think it was only needed for the printks and we can do them later.

I put in a single early_printk so that we know the kernel is alive
(early_printk doesn't need any locks)

This makes some things easier for initialization of unwind for
lockdep, which is needed by later patches.

cc: mingo@elte.hu

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen
2c8c0e6b8d [PATCH] Convert x86-64 to early param
Instead of hackish manual parsing

Requires earlier i386 patchkit, but also fixes i386 early_printk again.

I removed some obsolete really early parameters which didn't do anything useful.
Also made a few parameters that needed it early (mostly oops printing setup)

Also removed one panic check that wasn't visible without
early console anyways (the early console is now initialized after that
panic)

This cleans up a lot of code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Andi Kleen
9ca33eb698 [PATCH] Use early CPU identify before early command line parsing
This makes it possible to modify CPU flags in command line
options without hacks.

And remove another copy in head64.c

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Chuck Ebbert
d4d35854a1 [PATCH] remove lock prefix from is_at_popf() tests
The lock prefix will cause an exception when used with the
popf instruction, so no need to continue searching after it's
found.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Muli Ben-Yehuda
145106e810 [PATCH] remove superflous BUG_ON's in nommu and gart
There's no need to check for invalid DMA data direction in nommu and
gart since we do it in dma-mapping.h anyway before calling the
individual dma-ops.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Eric W. Biederman
29a6c25bd6 [PATCH] Fix gdt table size in trampoline.S
Allows easier extension of the GDT by using the proper C symbol
for the size in the descriptor.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Chuck Ebbert
a752d7194c [PATCH] fix is_at_popf() for compat tasks
When testing for the REX instruction prefix, first check
for 32-bit mode because in compat mode the REX prefix is an
increment instruction.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Muli Ben-Yehuda
0577f148b5 [PATCH] Calgary IOMMU: save a bit of space in bus_info
Make translation_disabled a uchar rather than an int

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda
a4fc520a0f [PATCH] Calgary IOMMU: calgary_init_one_nontraslated() can return void
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda
871b17008e [PATCH] Calgary IOMMU: fix reference counting of Calgary PCI devices
The pci_get_device() API decrements the reference count on the 'from'
parameter when it continues searching. Therefore, take a ref count on
Calgary bus when we initialize them in either translated or
non-translated mode.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda
b8f4fe66a5 [PATCH] Calgary IOMMU: fix error path memleak in calgary_free_tar
We were freeing the iommu_table and leaking the bitmap pages. Also
rename it to calgary_free_bus, which is more accurate.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda
9f2dc46d5e [PATCH] Calgary IOMMU: break out of pci_find_device_reverse if dev not found
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Muli Ben-Yehuda
f38db651d5 [PATCH] Calgary IOMMU: consolidate per bus data structures
Move the tce_table_kva array, disabled bitmap and bus_to_phb array
into a new per bus 'struct calgary_bus_info'. Also slightly reorganize
build_tce_table and tce_table_setparms to avoid exporting bus_info to
tce.c.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Jan Beulich
3b94355c47 [PATCH] remove int_delivery_dest
The genapic field and the accessor macro weren't used anywhere.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Jan Beulich
caff0710eb [PATCH] initialize end of memory variables as early as possible
While an earlier patch already did a small step into that direction,
this patch moves initialization of all memory end variables to as
early as possible, so that dependent code doesn't need to check
whether these variables have already been set.

Also, remove a misleading (perhaps just outdated) comment, and make
static a variable only used in a single file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen
44cc45267b [PATCH] Remove obsolete CVS $Id$ from assembler files in arch/x86_64/kernel/*
CVS hasn't been used for a long time for them.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:31 +02:00
Andi Kleen
fe7414a288 [PATCH] Use BUILD_BUG_ON in apic.c build sanity checking
Makes code a little shorter.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
efec3b9a32 [PATCH] Fix up some non linuxy style in ACPI functions in mpparse.c
No functional changes.

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
276ec1a76a [PATCH] Remove some unneeded ACPI externs in mpparse.c
They are not used in this file so remove them. i386 didn't have them either.

Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
a01fd3baff [PATCH] Remove useless wrapper in mpparse.c code
It used to contain support code for NUMAQ, but that is long gone already
on 64bit.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
55f05ffaa7 [PATCH] Replace mp bus array with bitmap for bus not pci
Since we only support PCI and ISA legacy busses now there is no need to
have an full array with checking.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
dfa4698c50 [PATCH] Move early chipset quirks out to new file
They did not really belong into io_apic.c. Move them into a new file
and clean it up a bit.

Also remove outdated ATI quirk that was obsolete,

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
edd9652296 [PATCH] Remove MPS table APIC renumbering
The MPS table specification says that the operating system should
renumber the IO-APICs following the table as needed.  However in
ACPI this is not allowed or neeeded and all x86-64 systems are ACPI
compliant.

The code was already disabled on some systems because it caused
problems there. Remove it completely now.

CC: mdomsch@dell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
eea0e11c1f [PATCH] Factor out common io apic routing entry access
The IO APIC code had lots of duplicated code to read/write 64bit
routing entries into the IO-APIC. Factor this out int common read/write
functions

In a few cases the IO APIC lock is taken more often now, but this
isn't a problem because it's all initialization/shutdown only
slow path code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
c1a58b42b4 [PATCH] i386/x86-64: Remove obsolete sanity check in mptable parsing
It apparently has never triggered in many years.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
a8fcf1a24a [PATCH] Remove obsolete PIC mode
PIC mode is an outdated way to drive the APICs that was used on
some early MP boards. It is not supported in the ACPI model.

It is unlikely to be ever configured by any x86-64 system

Remove it thus.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:30 +02:00
Andi Kleen
e509913434 [PATCH] Remove leftover MCE/EISA support
No 64bit EISA or Microchannel systems ever. Remove the left over code
in the IO-APIC driver and the mptable parser

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
5cb6b99928 [PATCH] Remove pirq overwrite support
This was an old workaround for broken MP-BIOS. The user could
specify overwrites on the command line.

I've never seen it being used for anything on 64bit. So get
rid of it for now.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
2e91a17b35 [PATCH] Add some comments to entry.S
And remove some old obsolete ones.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
3f14c746a6 [PATCH] Remove old "focus disabled" chipset errata workaround
The new systems already use focus disabled and the comment was
completely outdated.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
6c96a29f20 [PATCH] Remove apic mismatch counter
Nobody has been setting the mismatch counter and the ifdef was never
set so remove it.
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
7f11d8a5ef [PATCH] Remove all ifdefs for local/io apic
IO-APIC or local APIC can only be disabled at runtime anyways and
Kconfig has forced these options on for a long time now.

The Kconfigs are kept only now for the benefit of the shared acpi
boot.c code.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
5ba5891d44 [PATCH] Add some comments what tce.c actually does
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
b06babac45 [PATCH] Add proper alignment to ENTRY
Previously it didn't align. Use the same one as the C compiler
in blended mode, which is good for K8 and Core2 and doesn't hurt
on P4.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:29 +02:00
Andi Kleen
31679f38d8 [PATCH] Simplify profile_pc on x86-64
Use knowledge about EFLAGS layout (bits 22:63 are always 0) to distingush
EFLAGS word and kernel address in the spin lock stack frame.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen
c16b63e09d [PATCH] i386/x86-64: Don't randomize stack top when no randomization personality is set
Based on patch from Frank van Maarseveen <frankvm@frankvm.com>, but
extended.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Adam Henley
d5d9ca6d88 [PATCH] A few trivial spelling and grammar fixes
A few trivial spelling and grammar mistakes picked up in
"arch/x86_64/aperture.c", "arch/x86_64/crash.c" and
"arch/x86_64/apic.c". I think all are correct fixes but am ever aware
of my fallibility :o) This is my first patch submission so all
feedback is appreciated, esp. WRT CCing to Linus, Andi and
trivial@kernel.org, is this correct? And which is the most appropriate
kernel version to diff against? If any.

Should apply cleanly to 2.6.18-rc1

Signed-off-by: Adam Henley <adamazing@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>

-  adam
2006-09-26 10:52:28 +02:00
Stephane Eranian
d3a4f48d48 [PATCH] x86-64 TIF flags for debug regs and io bitmap in ctxsw
Hello,

Following my discussion with Andi. Here is a patch that introduces
two new TIF flags to simplify the context switch code in __switch_to().
The idea is to minimize the number of cache lines accessed in the common
case, i.e., when neither the debug registers nor the I/O bitmap are used.

This patch covers the x86-64 modifications. A patch for i386 follows.

Changelog:
	- add TIF_DEBUG to track when debug registers are active
	- add TIF_IO_BITMAP to track when I/O bitmap is used
	- modify __switch_to() to use the new TIF flags

<signed-off-by>: eranian@hpl.hp.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andi Kleen
2f766d1606 [PATCH] Clean up asm/smp.h includes
No need to include it from entry.S
Drop all the #ifdef __ASSEMBLY__

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik
c08c820508 [PATCH] Add the vgetcpu vsyscall
This patch adds a vgetcpu vsyscall, which depending on the CPU RDTSCP
capability uses either the RDTSCP or CPUID to obtain a CPU and node
numbers and pass them to the program.

AK: Lots of changes over Vojtech's original code:
Better prototype for vgetcpu()
It's better to pass the cpu / node numbers as separate arguments
to avoid mistakes when going from SMP to NUMA.
Also add a fast time stamp based cache using a user supplied
argument to speed things more up.
Use fast method from Chuck Ebbert to retrieve node/cpu from
GDT limit instead of CPUID
Made sure RDTSCP init is always executed after node is known.
Drop printk

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Vojtech Pavlik
a670fad0ad [PATCH] Add initalization of the RDTSCP auxilliary values
This patch adds initalization of the RDTSCP auxilliary values to CPU numbers
to time.c. If RDTSCP is available, the MSRs are written with the respective
values. It can be later used to initalize per-cpu timekeeping variables.

AK: Some cleanups. Move externs into headers and fix CPU hotplug.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Venkatesh Pallipadi
248dcb2fff [PATCH] x86: i386/x86-64 Add nmi watchdog support for new Intel CPUs
AK: This redoes the changes I temporarily reverted.

Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Vivek Goyal
3f22c5789e [PATCH] kdump x86_64 nmi event notification fix
After a crash we should wait for NMI IPI event and not for external NMI or
NMI watchdog tick.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Andi Kleen
fac58550e8 [PATCH] Fix up panic messages for different NMI panics
When a unknown NMI happened the panic would claim a NMI watchdog timeout.
Also it would check the variable set by nmi_watchdog=panic and panic then.

Fix up the panic message to be generic
Unconditionally panic on unknown NMI when panic on unknown nmi is enabled.

Noticed by Jan Beulich

Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Shaohua Li
4038f901cf [PATCH] i386/x86-64: Fix NMI watchdog suspend/resume
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline
APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume
method. APs should follow CPU hotplug code path.

And:

+From: Don Zickus <dzickus@redhat.com>

Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.

AK: I merged the two patches together

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus
c41c5cd3b2 [PATCH] x86: x86 clean up nmi panic messages
Clean up some of the output messages on the nmi error paths to make more
sense when they are displayed.  This is mainly a cosmetic fix and
shouldn't impact any normal code path.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
8da5adda91 [PATCH] x86: Allow users to force a panic on NMI
To quote Alan Cox:

The default Linux behaviour on an NMI of either memory or unknown is to
continue operation. For many environments such as scientific computing
it is preferable that the box is taken out and the error dealt with than
an uncorrected parity/ECC error get propogated.

A small number of systems do generate NMI's for bizarre random reasons
such as power management so the default is unchanged. In other respects
the new proc/sys entry works like the existing panic controls already in
that directory.

This is separate to the edac support - EDAC allows supported chipsets to
handle ECC errors well, this change allows unsupported cases to at least
panic rather than cause problems further down the line.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
e33e89ab1a [PATCH] x86: Add abilty to enable/disable nmi watchdog from procfs (update)
Adds a new /proc/sys/kernel/nmi_watchdog call that will enable/disable the
nmi watchdog.

By entering a non-zero value here, a user can enable the nmi watchdog to
monitor the online cpus in the system.  By entering a zero value here, a
user can disable the nmi watchdog and free up a performance counter which
could then be utilized by the oprofile subsystem, otherwise oprofile may be
short a counter when in use.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
Don Zickus
407984f1af [PATCH] x86: Add abilty to enable/disable nmi watchdog with sysctl
Adds a new /proc/sys/kernel/nmi call that will enable/disable the nmi
watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Don Zickus
2fbe7b25c8 [PATCH] i386/x86-64: Remove un/set_nmi_callback and reserve/release_lapic_nmi functions
Removes the un/set_nmi_callback and reserve/release_lapic_nmi functions as
they are no longer needed.  The various subsystems are modified to register
with the die_notifier instead.

Also includes compile fixes by Andrew Morton.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:27 +02:00
Andi Kleen
1d001df19d [PATCH] Add TIF_RESTORE_SIGMASK
We need TIF_RESTORE_SIGMASK in order to support ppoll() and pselect()
system calls. This patch originally came from Andi, and was based
heavily on David Howells' implementation of same on i386. I fixed a typo
which was causing do_signal() to use the wrong signal mask.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
3adbbcce9a [PATCH] x86: Cleanup NMI interrupt path
This patch cleans up the NMI interrupt path.  Instead of being gated by if
the 'nmi callback' is set, the interrupt handler now calls everyone who is
registered on the die_chain and additionally checks the nmi watchdog,
reseting it if enabled.  This allows more subsystems to hook into the NMI if
they need to (without being block by set_nmi_callback).

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
f2802e7f57 [PATCH] Add SMP support on x86_64 to reservation framework
This patch includes the changes to make the nmi watchdog on x86_64 SMP
aware.  A bunch of code was moved around to make it simpler to read.  In
addition, it is now possible to determine if a particular NMI was the result
of the watchdog or not.  This feature allows the kernel to filter out
unknown NMIs easier.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Don Zickus
828f0afda1 [PATCH] x86: Add performance counter reservation framework for UP kernels
Adds basic infrastructure to allow subsystems to reserve performance
counters on the x86 chips.  Only UP kernels are supported in this patch to
make reviewing easier.  The SMP portion makes a lot more changes.

Think of this as a locking mechanism where each bit represents a different
counter.  In addition, each subsystem should also reserve an appropriate
event selection register that will correspond to the performance counter it
will be using (this is mainly neccessary for the Pentium 4 chips as they
break the 1:1 relationship to performance counters).

This will help prevent subsystems like oprofile from interfering with the
nmi watchdog.

Signed-off-by:  Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Andi Kleen
b07f8915cd [PATCH] x86: Temporarily revert parts of the Core 2 nmi nmi watchdog support
This makes merging easier.  They are readded a few patches later.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:26 +02:00
Linus Torvalds
79e453d49b Revert mmiocfg heuristics and blacklist changes
This reverts commits 11012d419c and
40dd2d20f2, which allowed us to use the
MMIO accesses for PCI config cycles even without the area being marked
reserved in the e820 memory tables.

Those changes were needed for EFI-environment Intel macs, but broke some
newer Intel 965 boards, so for now it's better to revert to our old
2.6.17 behaviour and at least avoid introducing any new breakage.

Andi Kleen has a set of patches that work with both EFI and the broken
Intel 965 boards, which will be applied once they get wider testing.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-19 08:15:22 -07:00
Al Viro
55669bfa14 [PATCH] audit: AUDIT_PERM support
add support for AUDIT_PERM predicate

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:30 -04:00
Al Viro
dc104fb323 [PATCH] audit: more syscall classes added
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-09-11 13:32:27 -04:00
Keith Owens
01ebb77b31 [PATCH] x86_64: Save original IST values for checking stack addresses
The values in init_tss.ist[] can change when an IST event occurs.  Save
the original IST values for checking stack addresses when debugging or
doing stack traces.

Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:16 -07:00
Andi Kleen
ceee882230 [PATCH] x86_64: Recover 1MB of kernel memory
Noticed by Jan Beulich.

When the kernel was moved from 1MB to 2MB in 2.6.17 the kernel reservation
code wasn't adjusted and it still reserved starting with 1MB. This means 1MB always
were lost.

This patch fixes this by reserving only starting with _text.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Jan Beulich
ea424055b7 [PATCH] x86: Make backtracer fallback logic more bullet-proof
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.

Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.

Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen
c05991ed12 [PATCH] x86_64: Add kernel thread stack frame termination for properly stopping stack unwinds.
One open question: Should these added pushes perhaps be made
conditional upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: Not needed -- these are all very slow paths]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Andi Kleen
11012d419c [PATCH] x86: Revert e820 MCFG heuristics
The check for the MCFG table being reserved in the e820 map was originally
added to detect a broken BIOS in a preproduction Intel SDV. However it also
breaks the Apple x86 Macs, which can't supply this properly, but need
a working MCFG. With this patch they wouldn't use the MCFG and not work.

After some discussion I think it's best to remove the heuristic again.
It also failed on some other boxes (although it didn't cause much
problems there because old style port access for PCI config space
still works as fallback), but the preproduction SDVs can just use
pci=nommcfg. Supporting production machines properly is more
important.

Edgar Hucek did all the debugging work.

Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-30 16:05:15 -07:00
Horms
012c437d03 [PATCH] Change panic_on_oops message to "Fatal exception"
Previously the message was "Fatal exception: panic_on_oops", as introduced
in a recent patch whith removed a somewhat dangerous call to ssleep() in
the panic_on_oops path.  However, Paul Mackerras suggested that this was
somewhat confusing, leadind people to believe that it was panic_on_oops
that was the root cause of the fatal exception.  On his suggestion, this
patch changes the message to simply "Fatal exception".  A suitable oops
message should already have been displayed.

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-08-14 12:54:29 -07:00
Alexey Dobriyan
825e037fb8 [PATCH] Fix more per-cpu typos
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-06 08:57:47 -07:00
Muli Ben-Yehuda
a166222cde [PATCH] x86_64: Fix CONFIG_IOMMU_DEBUG
If CONFIG_IOMMU_DEBUG is set force_iommu defaults to 1. In the case
where no HW IOMMU is present in the machine and we end up using nommu,
leaving force_iommu set to 1 causes dma_alloc_coherent to do the wrong
thing. Therefore, if we end up using nommu, make sure force_iommu is
0.

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Andi Kleen
2699500b31 [PATCH] x86_64: Fix backtracing for interrupt stacks
Re-add backlink for old style unwinder to stack switching.  Add proper
stack frame and CFI annotations to call_softirq

This prevents a oops when backtracing with fallback through the
interrupt stack top.

Suggested by Jan Beulich and Herbert Xu wanted it in 2.6.18.

Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-02 20:19:54 -07:00
Chandra Seetharaman
be6b5a3505 [PATCH] cpu hotplug: use hotplug version of registration in late inits
Use hotplug version of register_cpu_notifier in late init functions.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Horms
cea6a4ba8a [PATCH] panic_on_oops: remove ssleep()
This patch is part of an effort to unify the panic_on_oops behaviour across
all architectures that implement it.

It was pointed out to me by Andi Kleen that if an oops has occured in
interrupt context, then calling sleep() in the oops path will only cause a
panic, and that it would be really better for it not to be in the path at
all.

This patch removes the ssleep() call and reworks the console message
accordinly.  I have a slght concern that the resulting console message is
too long, feedback welcome.

For powerpc it also unifies the 32bit and 64bit behaviour.

Fror x86_64, this patch only updates the console message, as ssleep() is
already not present.

Signed-off-by: Horms <horms@verge.net.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:39 -07:00
Eric W. Biederman
2a8a3d5b65 [PATCH] machine_kexec.c: Fix the description of segment handling
One of my original comments in machine_kexec was unclear
and this should fix it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Horms <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:38 -07:00
Andi Kleen
65f87d8a8a [PATCH] x86_64: Fix swiotlb=force
It was broken before. But having it is important as possible hardware
bug workaround.

And previously there was no way to force swiotlb if there is another IOMMU.
Side effect is that iommu=force won't force swiotlb anymore even if there
isn't another IOMMU.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Jon Mason
d2105b10fe [PATCH] x86_64: Calgary IOMMU - Multi-Node NULL pointer dereference fix
Calgary hits a NULL pointer dereference when booting in a multi-chassis
NUMA system.  See Redhat bugzilla number 198498, found by Konrad
Rzeszutek (konradr@redhat.com).

There are many issues that had to be resolved to fix this problem.
Firstly when I originally wrote the code to handle NUMA systems, I
had a large misunderstanding that was not corrected until now.  That was
that I thought the "number of nodes online" referred to number of
physical systems connected.  So that if NUMA was disabled, there
would only be 1 node and it would only show that node's PCI bus.
In reality if NUMA is disabled, the system displays all of the
connected chassis as one node but is only ignorant of the delays
in accessing main memory.  Therefore, references to num_online_nodes()
and MAX_NUMNODES are incorrect and need to be set to the maximum
number of nodes that can be accessed (which are 8).  I created a
variable, MAX_NUM_CHASSIS, and set it to 8 to fix this.

Secondly, when walking the PCI in detect_calgary, the code only
checked the first "slot" when looking to see if a device is present.
This will work for most cases, but unfortunately it isn't always the
case.  In the NUMA MXE drawers, there are USB devices present on the
3rd slot (with slot 1 being empty).  So, to work around this, all
slots (up to 8) are scanned to see if there are any devices present.

Lastly, the bus is being enumerated on large systems in a different
way the we originally thought.  This throws the ugly logic we had
out the window.  To more elegantly handle this, I reorganized the
kva array to be sparse (which removed the need to have any bus number
to kva slot logic in tce.c) and created a secondary space array to
contain the bus number to phb mapping.

With these changes Calgary boots on an x460 with 4 nodes with and
without NUMA enabled.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Muli Ben-Yehuda
089bbbcb36 [PATCH] x86_64: Calgary IOMMU - fix off by one error
Fixed off-by-one error in detect_calgary and calgary_init which will
cause arrays to overflow.  Also, removed impossible to hit BUG_ON.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen
0e5f61b00c [PATCH] x86_64: On Intel systems when CPU has C3 don't use TSC
On Intel systems generally the TSC stops in C3 or deeper,
so don't use it there. Follows similar logic on i386.

This should fix problems on Meroms.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-29 20:59:55 -07:00
Andi Kleen
b13761ecd1 [PATCH] x86_64: Dump leftover backtrace entries when dwarf2 unwinder got stuck
The dwarf2 unwinder currently often gets stuck because a lot
of assembly code doesn't have proper dwarf2 annotiation yet.

This currently often happens with __down. Should fix this by
adding proper dwarf2 annotation to all inline assembly. However
until that's done we need a quick fix for 2.6.18 to avoid
incomplete backtraces.

So when this happens dump the rest of the stack with the old unwinder
instead of silently not dumping it. There was already a optional
"both" mode that dumped both, but that was too ugly.

I also clarified the headers for the different backtraces a bit.

Also add a clear error message for missing dwarf2
annotation that people can work on.

And I removed a dead variable left over from Ingo's changes.

Cc: mingo@elte.hu
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Andi Kleen
d5a2601734 [PATCH] i386/x86-64: Add user_mode checks to profile_pc for oprofile
Fixes a obscure user space triggerable crash during oprofiling.

Oprofile calls profile_pc from NMIs even when user_mode(regs) is not true and
the program counter is inside the kernel lock section. This opens
a race - when a user program jumps to a kernel lock address and
a NMI happens before the illegal page fault exception is raised
and the program has a unmapped esp or ebp then the kernel could
oops. NMIs have a higher priority than exceptions so that could
happen.

Add user_mode checks to i386/x86-64 profile_pc to prevent that.

Cc: John Levon <levon@movementarian.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-28 19:28:00 -07:00
Muli Ben-Yehuda
aa0a9f373e [PATCH] x86_64: Fix Calgary copyright statements per IBM guidelines
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:33 -07:00
Jacob Shin
0d2caebd56 [PATCH] x86_64: Fix hotplug problem in mce amd
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 15:12:32 -07:00
Jon Smirl
894673ee61 [PATCH] tty: Remove include of screen_info.h from tty.h
screen_info.h doesn't have anything to do with the tty layer and shouldn't be
included by tty.h.  This patches removes the include and modifies all users to
directly include screen_info.h.  struct screen_info is mainly used to
communicate with the console drivers in drivers/video/console.  Note that this
patch touches every arch and I have no way of testing it.  If there is a
mistake the worst thing that will happen is a compile error.

[akpm@osdl.org: fix arm build]
[akpm@osdl.org: fix alpha build]
Signed-off-by: Jon Smirl <jonsmir@gmail.com>
Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:16 -07:00
Arjan van de Ven
1454aed92b [PATCH] put a comment at register_die_notifier that the export is used
{un}register_die_notifier() is used by kdb... document this so that future
"remove dead export" rounds can skip this export.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:15 -07:00
Ingo Molnar
f86bf9b7bc [PATCH] lockdep: clean up completion initializer in smpboot.c
Clean up lockdep on-stack-completion initializer.  (This also removes the
dependency on waitqueue_lock_key.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:14 -07:00
Andrew Morton
1a91023a9f [PATCH] x86_64: e820.c needs pgtable.h
arch/x86_64/kernel/e820.c:42: error: 'MAXMEM' undeclared here (not in a function)

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:13 -07:00
Ingo Molnar
60be6b9a41 [PATCH] lockdep: annotate on-stack completions
lockdep needs to have the waitqueue lock initialized for on-stack waitqueues
implicitly initialized by DECLARE_COMPLETION().  Annotate on-stack completions
accordingly.

Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:09 -07:00
Ingo Molnar
366c7f554e [PATCH] lockdep: annotate enable_in_hardirq()
Make use of local_irq_enable_in_hardirq() API to annotate places that enable
hardirqs in hardirq context.

Has no effect on non-lockdep kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:09 -07:00
Ingo Molnar
2148270cd2 [PATCH] lockdep: x86_64 early init
x86_64 uses spinlocks very early - earlier than start_kernel().  So call
lockdep_init() from the arch setup code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:04 -07:00
Ingo Molnar
2601e64d26 [PATCH] lockdep: irqtrace subsystem, x86_64 support
Add irqflags-tracing support to x86_64.

[akpm@osdl.org: build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:03 -07:00
Ingo Molnar
21b32bbff9 [PATCH] lockdep: stacktrace subsystem, x86_64 support
Framework to generate and save stacktraces quickly, without printing anything
to the console.  x86_64 support.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Ingo Molnar
c9ca1ba5bd [PATCH] lockdep: x86_64 document stack frame internals
Document stack frame nesting internals some more.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Ingo Molnar
3ac94932a2 [PATCH] lockdep: beautify x86_64 stacktraces
Beautify x86_64 stacktraces to be more readable.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03 15:27:02 -07:00
Thomas Gleixner
b1e05aa230 [PATCH] irq-flags: x86_64: Use the new IRQF_ constants
Use the new IRQF_ constants and remove the SA_INTERRUPT define

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02 13:58:49 -07:00
Al Viro
b915543b46 [PATCH] audit syscall classes
Allow to tie upper bits of syscall bitmap in audit rules to kernel-defined
sets of syscalls.  Infrastructure, a couple of classes (with 32bit counterparts
for biarch targets) and actual tie-in on i386, amd64 and ia64.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-07-01 07:44:10 -04:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Adrian Bunk
5bba17127e [NET]: make skb_release_data() static
skb_release_data() no longer has any users in other files.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-29 16:58:30 -07:00
Ingo Molnar
c0ad90a32f [PATCH] genirq: add ->retrigger() irq op to consolidate hw_irq_resend()
Add ->retrigger() irq op to consolidate hw_irq_resend() implementations.
(Most architectures had it defined to NOP anyway.)

NOTE: ia64 needs testing. i386 and x86_64 tested.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:23 -07:00
Ingo Molnar
a53da52fd7 [PATCH] genirq: cleanup: merge irq_affinity[] into irq_desc[]
Consolidation: remove the irq_affinity[NR_IRQS] array and move it into the
irq_desc[NR_IRQS].affinity field.

[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:22 -07:00
Ingo Molnar
d1bef4ed5f [PATCH] genirq: rename desc->handler to desc->chip
This patch-queue improves the generic IRQ layer to be truly generic, by adding
various abstractions and features to it, without impacting existing
functionality.

While the queue can be best described as "fix and improve everything in the
generic IRQ layer that we could think of", and thus it consists of many
smaller features and lots of cleanups, the one feature that stands out most is
the new 'irq chip' abstraction.

The irq-chip abstraction is about describing and coding and IRQ controller
driver by mapping its raw hardware capabilities [and quirks, if needed] in a
straightforward way, without having to think about "IRQ flow"
(level/edge/etc.) type of details.

This stands in contrast with the current 'irq-type' model of genirq
architectures, which 'mixes' raw hardware capabilities with 'flow' details.
The patchset supports both types of irq controller designs at once, and
converts i386 and x86_64 to the new irq-chip design.

As a bonus side-effect of the irq-chip approach, chained interrupt controllers
(master/slave PIC constructs, etc.) are now supported by design as well.

The end result of this patchset intends to be simpler architecture-level code
and more consolidation between architectures.

We reused many bits of code and many concepts from Russell King's ARM IRQ
layer, the merging of which was one of the motivations for this patchset.

This patch:

rename desc->handler to desc->chip.

Originally i did not want to do this, because it's a big patch.  But having
both "desc->handler", "desc->handle_irq" and "action->handler" caused a
large degree of confusion and made the code appear alot less clean than it
truly is.

I have also attempted a dual approach as well by introducing a
desc->chip alias - but that just wasnt robust enough and broke
frequently.

So lets get over with this quickly.  The conversion was done automatically
via scripts and converts all the code in the kernel.

This renaming patch is the first one amongst the patches, so that the
remaining patches can stay flexible and can be merged and split up
without having some big monolithic patch act as a merge barrier.

[akpm@osdl.org: build fix]
[akpm@osdl.org: another build fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-29 10:26:21 -07:00
Andrew Morton
d9a5685436 [PATCH] x86_64: oprofile build fix
WARNING: "unset_nmi_callback" [arch/x86_64/oprofile/oprofile.ko] undefined!
WARNING: "set_nmi_callback" [arch/x86_64/oprofile/oprofile.ko] undefined!

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:07 -07:00
Andrew Morton
a052b68b1e [PATCH] x86: do_IRQ(): check irq number
We recently changed x86 to handle more than 256 IRQs.  Add a check in do_IRQ()
just to make sure that nothing went wrong with that implementation.

[chrisw@sous-sol.org: do x86_64 too]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@muc.de>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-28 14:59:02 -07:00
Siddha, Suresh B
5c45bf279d [PATCH] sched: mc/smt power savings sched policy
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.

Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains.  When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics...  see OLS
2005 CMP kernel scheduler paper for more details..)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Chandra Seetharaman
74b85f3790 [PATCH] cpu hotplug: make cpu_notifier related notifier blocks __cpuinit only
Make notifier_blocks associated with cpu_notifier as __cpuinitdata.

__cpuinitdata makes sure that the data is init time only unless
CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:41 -07:00
Chandra Seetharaman
9c7b216d23 [PATCH] cpu hotplug: revert init patch submitted for 2.6.17
In 2.6.17, there was a problem with cpu_notifiers and XFS.  I provided a
band-aid solution to solve that problem.  In the process, i undid all the
changes you both were making to ensure that these notifiers were available
only at init time (unless CONFIG_HOTPLUG_CPU is defined).

We deferred the real fix to 2.6.18.  Here is a set of patches that fixes the
XFS problem cleanly and makes the cpu notifiers available only at init time
(unless CONFIG_HOTPLUG_CPU is defined).

If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
time.

This patch reverts the notifier_call changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:40 -07:00
Rusty Russell
19eadf98c8 [PATCH] x86: increase interrupt vector range
Remove the limit of 256 interrupt vectors by changing the value stored in
orig_{e,r}ax to be the complemented interrupt vector.  The orig_{e,r}ax
needs to be < 0 to allow the signal code to distinguish between return from
interrupt and return from syscall.  With this change applied, NR_IRQS can
be > 256.

Xen extends the IRQ numbering space to include room for dynamically
allocated virtual interrupts (in the range 256-511), which requires a more
permissive interface to do_IRQ.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Linus Torvalds
da206c9e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
  typo fixes
  Clean up 'inline is not at beginning' warnings for usb storage
  Storage class should be first
  i386: Trivial typo fixes
  ixj: make ixj_set_tone_off() static
  spelling fixes
  fix paniced->panicked typos
  Spelling fixes for Documentation/atomic_ops.txt
  move acknowledgment for Mark Adler to CREDITS
  remove the bouncing email address of David Campbell
2006-06-26 13:33:14 -07:00
Linus Torvalds
972d19e837 Merge master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  [CRYPTO] tcrypt: Forbid tcrypt from being built-in
  [CRYPTO] aes: Add wrappers for assembly routines
  [CRYPTO] tcrypt: Speed benchmark support for digest algorithms
  [CRYPTO] tcrypt: Return -EAGAIN from module_init()
  [CRYPTO] api: Allow replacement when registering new algorithms
  [CRYPTO] api: Removed const from cra_name/cra_driver_name
  [CRYPTO] api: Added cra_init/cra_exit
  [CRYPTO] api: Fixed incorrect passing of context instead of tfm
  [CRYPTO] padlock: Rearrange context structure to reduce code size
  [CRYPTO] all: Pass tfm instead of ctx to algorithms
  [CRYPTO] digest: Remove unnecessary zeroing during init
  [CRYPTO] aes-i586: Get rid of useless function wrappers
  [CRYPTO] digest: Add alignment handling
  [CRYPTO] khazad: Use 32-bit reads on key
2006-06-26 11:03:29 -07:00
Linus Torvalds
81a07d7588 Merge branch 'x86-64'
* x86-64: (83 commits)
  [PATCH] x86_64: x86_64 stack usage debugging
  [PATCH] x86_64: (resend) x86_64 stack overflow debugging
  [PATCH] x86_64: msi_apic.c build fix
  [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
  [PATCH] x86_64: Avoid broadcasting NMI IPIs
  [PATCH] x86_64: fix apic error on bootup
  [PATCH] x86_64: enlarge window for stack growth
  [PATCH] x86_64: Minor string functions optimizations
  [PATCH] x86_64: Move export symbols to their C functions
  [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
  [PATCH] x86_64: Fix modular pc speaker
  [PATCH] x86_64: remove sys32_ni_syscall()
  [PATCH] x86_64: Do not use -ffunction-sections for modules
  [PATCH] x86_64: Add cpu_relax to apic_wait_icr_idle
  [PATCH] x86_64: adjust kstack_depth_to_print default
  [PATCH] i386/x86-64: adjust /proc/interrupts column headings
  [PATCH] x86_64: Fix race in cpu_local_* on preemptible kernels
  [PATCH] x86_64: Fix fast check in safe_smp_processor_id
  [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information
  [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
  ...

Manual resolve of trivial conflict in arch/i386/kernel/Makefile
2006-06-26 10:51:09 -07:00
Eric Sandeen
4961f10e22 [PATCH] x86_64: (resend) x86_64 stack overflow debugging
Take two, now without spurious whitespace :(  Applies to git & 2.6.17-rc6

CONFIG_DEBUG_STACKOVERFLOW existed for x86_64 in 2.4, but seems to have gone AWOL in 2.6.

I've pretty much just copied this over from the 2.4 code, with
appropriate tweaks for the 2.6 kernel, plus a bugfix.  I'd personally
rather see it printed out the way other arches do it, i.e.
bytes-remaining-until-overflow, rather than having to do the subtraction
yourself.  Also, only 128 bytes remaining seems awfully late to issue a
warning.  But I'll start here :)

Signed-off-by: Eric Sandeen <sandeen@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Venkatesh Pallipadi
0080e66755 [PATCH] x86_64: i386/x86-64 Add nmi watchdog support for new Intel CPUs
Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.

What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.

Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Keith Owens
e77deacb7b [PATCH] x86_64: Avoid broadcasting NMI IPIs
On some i386/x86_64 systems, sending an NMI IPI as a broadcast will
reset the system.  This seems to be a BIOS bug which affects machines
where one or more cpus are not under OS control.  It occurs on HT
systems with a version of the OS that is not compiled without HT
support.  It also occurs when a system is booted with max_cpus=n where
2 <= n < cpus known to the BIOS.  The fix is to always send NMI IPI as
a mask instead of as a broadcast.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Siddha, Suresh B
704fc59e1d [PATCH] x86_64: fix apic error on bootup
Appended patch fixes the "APIC error on CPUX: 00(40)" observed during bootup.

From SDM Vol-3A "Valid Interrupt Vectors" section:
	"When an illegal vector value (0-15) is written to an LVT entry
	and the delivery mode is Fixed, the APIC may signal an illegal
	vector error, with out regard to whether the mask bit is set
	or whether an interrupt is actually seen on input."

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Andi Kleen
2ee60e1789 [PATCH] x86_64: Move export symbols to their C functions
Only exports for assembler files are left in x8664_ksyms.c

Originally inspired by a patch from Al Viro
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Keith Owens
45486f81c9 [PATCH] x86_64: Standardize i386/x86_64 handling of NMI_VECTOR
x86_64 and i386 behave inconsistently when sending an IPI on vector 2
(NMI_VECTOR).  Make both behave the same, so IPI 2 is sent as NMI.

The crash code was abusing send_IPI_allbutself() by passing a code
instead of a vector, it only worked because crash knew about the
internal code of send_IPI_allbutself().  Change crash to use NMI_VECTOR
instead, and remove the comment about how crash was abusing the function.

This patch is a pre-requisite for fixing the problem where sending an
IPI as NMI would reboot some Dell Xeon systems.  I cannot fix that
problem while crash continus to abuse send_IPI_allbutself().

It also removes the inconsistency between i386 and x86_64 for
NMI_VECTOR.  That will simplify all the RAS code that needs to bring
all the cpus to a clean stop, even when one or more cpus are spinning
disabled.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:22 -07:00
Piotr Kaczuba
9c63f87387 [PATCH] x86_64: Fix modular pc speaker
It turned out that the following change is needed when the speaker is
compiled as a module.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Jan Beulich
cab093b9d4 [PATCH] x86_64: adjust kstack_depth_to_print default
Defaulting to a value not evenly divisible by four makes little sense,
as four values are displayed per line (and hence the rest of the line
would otherwise be wasted).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Jan Beulich
bdbdaa791f [PATCH] i386/x86-64: adjust /proc/interrupts column headings
With (significantly) more than 10 CPUs online, the column headings
drifted off the positions of the column contents with growing CPU
numbers.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Andi Kleen
75bd665cc9 [PATCH] x86_64: Fix fast check in safe_smp_processor_id
The APIC ID returned by hard_smp_processor_id can be beyond
NR_CPUS and then overflow the x86_cpu_to_apic[] array.

Add a check for overflow. If it happens then the slow loop below
will catch.

Bug pointed out by Doug Thompson

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Rohit Seth
e42f943737 [PATCH] x86_64: x86_64 setup.c - printing cmp related boottime information
Getting phys_proc_id and cpu_core_id information to be printed at boot
time for AMD processors.  Also matching the Node related boot time
information that gets printed for Intel and AMD processors for NUMA
configurations.

Signed-off-by: Rohit Seth <rohitseth@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Andi Kleen
495ab9c045 [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
During some profiling I noticed that default_idle causes a lot of
memory traffic. I think that is caused by the atomic operations
to clear/set the polling flag in thread_info. There is actually
no reason to make this atomic - only the idle thread does it
to itself, other CPUs only read it. So I moved it into ti->status.

Converted i386/x86-64/ia64 for now because that was the easiest
way to fix ACPI which also manipulates these flags in its idle
function.

Cc: Nick Piggin <npiggin@novell.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:21 -07:00
Andi Kleen
d9005b52de [PATCH] x86_64: Remove bogus RED-PEN comment in signal.c
No red zone possible/needed on the alternative stack.

It caused confusion.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:20 -07:00
Andrew Morton
8fa3d6fc5e [PATCH] x86_64: check_addr() cleanups
- Use DMA_32BIT_MASK

 - Use %z for size_t

 - 80-cols

Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:20 -07:00
Andi Kleen
b633237e9c [PATCH] x86_64: Mark mce_amd cpu notifier __cpuinit/__cpuinitdata
Cc: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:20 -07:00
Jacob Shin
2903ee85ce [PATCH] x86_64: mce_amd cleanup
Clean up mce_amd.c for readability and remove code no
longer needed.

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:20 -07:00
Jacob Shin
9526866439 [PATCH] x86_64: mce_amd support for family 0x10 processors
Add support for mce threshold registers found in future
AMD family 0x10 processors.  Backwards compatible with
family 0xF hardware.

AK: fixed build on !SMP

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:20 -07:00
Jacob Shin
fff2e89f11 [PATCH] x86_64: mce_amd relocate sysfs files
Get rid of /sys/devices/system/threshold directory and move
mce_amd thresholding files into the machine sysfs directory --
/sys/devices/system/machinecheck.

AK: Fixed warning

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:20 -07:00
Jacob Shin
17fc14ff1b [PATCH] x86_64: apic support for extended apic interrupt
Add support for extended APIC LVT found in future AMD processors.

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:20 -07:00
Vojtech Pavlik
2f82bde472 [PATCH] x86_64: Update copyright in time.c
Update my copyright dates in arch/x86-64/kernel/time.c

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Vojtech Pavlik
b2df3ddb68 [PATCH] x86_64: Explain why HPET T0_CMP register is written twice
After writing the CFG register, the first value written to the T0_CMP
register is the value at which next interrupt should be triggered, every
value after that sets the period of the interrupt. For that reason, the code
needs to write the value twice - to set both the phase and period.

[AK: I had already figured it out by myself, but it's still useful
to have a comment for this.]

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Vojtech Pavlik
4221133845 [PATCH] x86_64: Make use of the *PER* constants in time.c
This patch makes use of the newly added conversion constants
in time.h to x86-64 time.c. The code gets significantly easier
to understand.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Vojtech Pavlik
e30db3e699 [PATCH] x86_64: Remove hack to manually enable HPET on AMD8111 southbridges
Remove #ifdefed code to manually enable HPET on AMD8111, where the
BIOS doesn't have ACPI HPET tables and doesn't enable it for us.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Vojtech Pavlik
7b0e850125 [PATCH] x86_64: Add X86_FEATURE_RDTSCP, fix rdtscp in /proc/cpuinfo
This patch adds the X86_FEATURE_RDTSCP #define, so that kernel code can
check for the feature easily and also fixes the location of the "rdtscp"
string in the cpuinfo tables.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Vojtech Pavlik
f8bf3c65a9 [PATCH] x86_64: Rename oem_force_hpet_timer to apic_is_clustered_box
Rename oem_force_hpet_timer to apic_is_clustered_box, to give the
function a better fitting name - it really isn't at all about HPET.

Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Rohit Seth
f3fa8ebc25 [PATCH] x86_64: moving phys_proc_id and cpu_core_id to cpuinfo_x86
Most of the fields of cpuinfo are defined in cpuinfo_x86 structure.
This patch moves the phys_proc_id and cpu_core_id for each processor to
cpuinfo_x86 structure as well.

Signed-off-by: Rohit Seth <rohitseth@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Jon Mason
e465058d55 [PATCH] x86_64: Calgary IOMMU - Calgary specific bits
This patch hooks Calgary into the build, the x86-64 IOMMU
initialization paths, and introduces the Calgary specific bits.  The
implementation draws inspiration from both PPC (which has support for
the same chip but requires firmware support which we don't have on
x86-64) and gart. Calgary is different from gart in that it support a
translation table per PHB, as opposed to the single gart aperture.

Changes from previous version:
 * Addition of boot-time disablement for bus-level translation/isolation
   (e.g, enable userspace DMA for things like X)
 * Usage of newer IOMMU abstraction functions

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:19 -07:00
Jon Mason
0dc243ae10 [PATCH] x86_64: Calgary IOMMU - IOMMU abstractions
This patch creates a new interface for IOMMUs by adding a centralized
location for IOMMU allocation (for translation tables/apertures) and
IOMMU initialization.  In creating these, code was moved around for
abstraction, uniformity, and consiceness.

Take note of the move of the iommu_setup bootarg parsing code to
__setup.  This is enabled by moving back the location of the aperture
allocation/detection to mem init (which while ugly, was already the
location of the swiotlb_init).

While a slight departure from the previous patch, I belive this provides
the true intention of the previous versions of the patch which changed
this code.  It also makes the addition of the upcoming calgary code much
cleaner than previous patches.

[AK: Removed one broken change. iommu_setup still has to be called
early]

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jon Mason
8d4f6b93a4 [PATCH] x86_64: Calgary IOMMU - introduce iommu_detected
swiotlb relies on the gart specific iommu_aperture variable to know if
we discovered a hardware IOMMU before swiotlb initialization.  Introduce
iommu_detected to do the same thing, but in a HW IOMMU neutral manner,
in preparation for adding the Calgary HW IOMMU.

Signed-Off-By: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-Off-By: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Rohit Seth
2bbc419f9d [PATCH] x86_64: Change assembly to use regular cpuid_count macro
Minor cleanup patch:

Replacing the asm statement with cpuid_count macro(which already
provides the same functionality).

Signed-off-by: Rohit Seth <rohitseth@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jan Beulich
46d13a384b [PATCH] x86_64: use halt() instead of raw inline assembly
Use abstractions whenever possible.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jan Beulich
c33bd9aac0 [PATCH] i386/x86-64: fall back to old-style call trace if no unwinding
If no unwinding is possible at all for a certain exception instance,
fall back to the old style call trace instead of not showing any trace
at all.

Also, allow setting the stack trace mode at the command line.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:18 -07:00
Jan Beulich
dffead4e42 [PATCH] x86_64: reliable stack trace support (x86-64 syscall
Adjust the CFA offset for 64- and 32-bit syscall entries so that the five
slots pre-subtracted from the stack pointer do not appear to reside outside
of the current frame.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Jan Beulich
1de9c3f67e [PATCH] x86_64: reliable stack trace support (x86-64 IRQ stack
Change the switching to/from the IRQ stack so that unwind annotations can
be added for it without requiring CFA expressions.

AK: I cleaned it up a bit, making it unconditional and removing the
obsolete DEBUG_INFO full frame code.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Jan Beulich
b538ed278b [PATCH] x86_64: reliable stack trace support (x86-64)
These are the x86_64-specific pieces to enable reliable stack traces. The
only restriction with this is that it currently cannot unwind across the
interrupt->normal stack boundary, as that transition is lacking proper
annotation.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
bibo,mao
2b28592b07 [PATCH] x86_64: x86_86 msi miss one entry handler
In x86_64 architecture, if device driver with msi function
gets 0xee vector by assign_irq_vector() function, system will
crash if this interrupt happens. It is because 0xee interrupt
entry is empty. This patch modifies this. This patch is based
on 2.6.17-rc6.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Andi Kleen
a813ce432f [PATCH] x86_64: Rename IOMMU option, fix help and mark option embedded.
- Rename the GART_IOMMU option to IOMMU to make clear it's not
   just for AMD
 - Rewrite the help text to better emphatise this fact
 - Make it an embedded option because too many people get it wrong.

To my astonishment I discovered the aacraid driver tests this
symbol directly. This looks quite broken to me - it's an internal
implementation detail of the PCI DMA API. Can the maintainer
please clarify what this test was intended to do?

Cc: linux-scsi@vger.kernel.org
Cc: alan@redhat.com
Cc: markh@osdl.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Andi Kleen
4d9bc79cd2 [PATCH] x86_64: Make sure is_compat_task works early
Previously it would only work in the first 32bit system call, not during
early process setup.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Ingo Molnar
26a3c49cec [PATCH] x86_64: fix vector_lock deadlock in io_apic.c
Fix a potential deadlock scenario introduced by io_apic.c's new vector_lock
on i386 and x86_64.

Found by the locking correctness validator. The patch was boot-tested on
x86. For details of the deadlock scenario, see the validator output:

  ======================================================
  [ BUG: hard-safe -> hard-unsafe lock order detected! ]
  ------------------------------------------------------
  idle/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
   (msi_lock){....}, at: [<c04ff8d2>] startup_msi_irq_wo_maskbit+0x10/0x35

  and this task is already holding:
   (&irq_desc[i].lock){++..}, at: [<c015b924>] probe_irq_on+0x36/0x107
  which would create a new lock dependency:
   (&irq_desc[i].lock){++..} -> (msi_lock){....}

  but this new dependency connects a hard-irq-safe lock:
   (&irq_desc[i].lock){++..}
  ... which became hard-irq-safe at:
    [<c01468c4>] lockdep_acquire+0x68/0x84
    [<c10485e9>] _spin_lock+0x21/0x2f
    [<c015aff5>] __do_IRQ+0x3d/0x113
    [<c01062d3>] do_IRQ+0x8c/0xad

  to a hard-irq-unsafe lock:
   (vector_lock){--..}
  ... which became hard-irq-unsafe at:
  ...  [<c01468c4>] lockdep_acquire+0x68/0x84
    [<c10485e9>] _spin_lock+0x21/0x2f
    [<c011b5e8>] assign_irq_vector+0x34/0xc8
    [<c1aa82fa>] setup_IO_APIC+0x45a/0xcff
    [<c1aa56e3>] smp_prepare_cpus+0x5ea/0x8aa
    [<c010033f>] init+0x32/0x2cb
    [<c0102005>] kernel_thread_helper+0x5/0xb

  which could potentially lead to deadlocks!

  other info that might help us debug this:

  3 locks held by idle/1:
   #0:  (port_mutex){--..}, at: [<c067070d>] uart_add_one_port+0x61/0x289
   #1:  (&state->mutex){--..}, at: [<c067071f>] uart_add_one_port+0x73/0x289
   #2:  (&irq_desc[i].lock){++..}, at: [<c015b924>] probe_irq_on+0x36/0x107

  the hard-irq-safe lock's dependencies:
  -> (&irq_desc[i].lock){++..} ops: 9861 {
     initial-use  at:
                          [<c01468c4>] lockdep_acquire+0x68/0x84
                          [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                          [<c015b415>] setup_irq+0x9b/0x14d
                          [<c1aaa4c4>] time_init_hook+0xf/0x11
                          [<c1a9f320>] time_init+0x44/0x46
                          [<c1a9955f>] start_kernel+0x191/0x38f
                          [<c0100210>] 0xc0100210
     in-hardirq-W at:
                          [<c01468c4>] lockdep_acquire+0x68/0x84
                          [<c10485e9>] _spin_lock+0x21/0x2f
                          [<c015aff5>] __do_IRQ+0x3d/0x113
                          [<c01062d3>] do_IRQ+0x8c/0xad
     in-softirq-W at:
                          [<c01468c4>] lockdep_acquire+0x68/0x84
                          [<c10485e9>] _spin_lock+0x21/0x2f
                          [<c015aff5>] __do_IRQ+0x3d/0x113
                          [<c01062d3>] do_IRQ+0x8c/0xad
   }
   ... key      at: [<c1ea31e0>] irq_desc_lock_type+0x0/0x20
    -> (i8259A_lock){++..} ops: 5149 {
       initial-use  at:
                        [<c01468c4>] lockdep_acquire+0x68/0x84
                        [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                        [<c0108090>] init_8259A+0x11/0x8f
                        [<c1aa0d22>] init_ISA_irqs+0x12/0x4d
                        [<c1aaa4f0>] pre_intr_init_hook+0x8/0xa
                        [<c1aa0cb9>] init_IRQ+0xe/0x65
                        [<c1a99546>] start_kernel+0x178/0x38f
                        [<c0100210>] 0xc0100210
       in-hardirq-W at:
                        [<c01468c4>] lockdep_acquire+0x68/0x84
                        [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                        [<c0107fb0>] mask_and_ack_8259A+0x1b/0xcc
                        [<c015b007>] __do_IRQ+0x4f/0x113
                        [<c01062d3>] do_IRQ+0x8c/0xad
       in-softirq-W at:
                        [<c01468c4>] lockdep_acquire+0x68/0x84
                        [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                        [<c0107fb0>] mask_and_ack_8259A+0x1b/0xcc
                        [<c015b007>] __do_IRQ+0x4f/0x113
                        [<c01062d3>] do_IRQ+0x8c/0xad
     }
     ... key      at: [<c142f174>] i8259A_lock+0x14/0x40
   ... acquired at:
     [<c01468c4>] lockdep_acquire+0x68/0x84
     [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
     [<c0107eb2>] enable_8259A_irq+0x10/0x47
     [<c0107f12>] startup_8259A_irq+0x8/0xc
     [<c015b45e>] setup_irq+0xe4/0x14d
     [<c1aaa4c4>] time_init_hook+0xf/0x11
     [<c1a9f320>] time_init+0x44/0x46
     [<c1a9955f>] start_kernel+0x191/0x38f
     [<c0100210>] 0xc0100210

    -> (ioapic_lock){+...} ops: 122 {
       initial-use  at:
                        [<c01468c4>] lockdep_acquire+0x68/0x84
                        [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                        [<c1aa71db>] io_apic_get_version+0x16/0x55
                        [<c1aa5c73>] mp_register_ioapic+0xc6/0x127
                        [<c1aa382e>] acpi_parse_ioapic+0x2d/0x39
                        [<c1abe031>] acpi_table_parse_madt_family+0xb4/0x100
                        [<c1abe093>] acpi_table_parse_madt+0x16/0x18
                        [<c1aa3c8a>] acpi_boot_init+0x132/0x251
                        [<c1aa08ea>] setup_arch+0xd36/0xe37
                        [<c1a99434>] start_kernel+0x66/0x38f
                        [<c0100210>] 0xc0100210
       in-hardirq-W at:
                        [<c01468c4>] lockdep_acquire+0x68/0x84
                        [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                        [<c011bce1>] mask_IO_APIC_irq+0x11/0x31
                        [<c011c5cc>] ack_edge_ioapic_vector+0x31/0x41
                        [<c015b007>] __do_IRQ+0x4f/0x113
                        [<c01062d3>] do_IRQ+0x8c/0xad
     }
     ... key      at: [<c1432514>] ioapic_lock+0x14/0x3c
      -> (i8259A_lock){++..} ops: 5149 {
         initial-use  at:
                         [<c01468c4>] lockdep_acquire+0x68/0x84
                         [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                         [<c0108090>] init_8259A+0x11/0x8f
                         [<c1aa0d22>] init_ISA_irqs+0x12/0x4d
                         [<c1aaa4f0>] pre_intr_init_hook+0x8/0xa
                         [<c1aa0cb9>] init_IRQ+0xe/0x65
                         [<c1a99546>] start_kernel+0x178/0x38f
                         [<c0100210>] 0xc0100210
         in-hardirq-W at:
                         [<c01468c4>] lockdep_acquire+0x68/0x84
                         [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                         [<c0107fb0>] mask_and_ack_8259A+0x1b/0xcc
                         [<c015b007>] __do_IRQ+0x4f/0x113
                         [<c01062d3>] do_IRQ+0x8c/0xad
         in-softirq-W at:
                         [<c01468c4>] lockdep_acquire+0x68/0x84
                         [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
                         [<c0107fb0>] mask_and_ack_8259A+0x1b/0xcc
                         [<c015b007>] __do_IRQ+0x4f/0x113
                         [<c01062d3>] do_IRQ+0x8c/0xad
       }
       ... key      at: [<c142f174>] i8259A_lock+0x14/0x40
     ... acquired at:
     [<c01468c4>] lockdep_acquire+0x68/0x84
     [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
     [<c0107e6b>] disable_8259A_irq+0x10/0x47
     [<c011bdbd>] startup_edge_ioapic_vector+0x31/0x58
     [<c015b45e>] setup_irq+0xe4/0x14d
     [<c015b5a1>] request_irq+0xda/0xf9
     [<c1ac983a>] rtc_init+0x6a/0x1a7
     [<c0100457>] init+0x14a/0x2cb
     [<c0102005>] kernel_thread_helper+0x5/0xb

   ... acquired at:
     [<c01468c4>] lockdep_acquire+0x68/0x84
     [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
     [<c011bce1>] mask_IO_APIC_irq+0x11/0x31
     [<c011c5cc>] ack_edge_ioapic_vector+0x31/0x41
     [<c015b007>] __do_IRQ+0x4f/0x113
     [<c01062d3>] do_IRQ+0x8c/0xad

  the hard-irq-unsafe lock's dependencies:
  -> (vector_lock){--..} ops: 31 {
     initial-use  at:
                          [<c01468c4>] lockdep_acquire+0x68/0x84
                          [<c10485e9>] _spin_lock+0x21/0x2f
                          [<c011b5e8>] assign_irq_vector+0x34/0xc8
                          [<c1aa82fa>] setup_IO_APIC+0x45a/0xcff
                          [<c1aa56e3>] smp_prepare_cpus+0x5ea/0x8aa
                          [<c010033f>] init+0x32/0x2cb
                          [<c0102005>] kernel_thread_helper+0x5/0xb
     softirq-on-W at:
                          [<c01468c4>] lockdep_acquire+0x68/0x84
                          [<c10485e9>] _spin_lock+0x21/0x2f
                          [<c011b5e8>] assign_irq_vector+0x34/0xc8
                          [<c1aa82fa>] setup_IO_APIC+0x45a/0xcff
                          [<c1aa56e3>] smp_prepare_cpus+0x5ea/0x8aa
                          [<c010033f>] init+0x32/0x2cb
                          [<c0102005>] kernel_thread_helper+0x5/0xb
     hardirq-on-W at:
                          [<c01468c4>] lockdep_acquire+0x68/0x84
                          [<c10485e9>] _spin_lock+0x21/0x2f
                          [<c011b5e8>] assign_irq_vector+0x34/0xc8
                          [<c1aa82fa>] setup_IO_APIC+0x45a/0xcff
                          [<c1aa56e3>] smp_prepare_cpus+0x5ea/0x8aa
                          [<c010033f>] init+0x32/0x2cb
                          [<c0102005>] kernel_thread_helper+0x5/0xb
   }
   ... key      at: [<c1432574>] vector_lock+0x14/0x3c

  stack backtrace:
   [<c0104f36>] show_trace+0xd/0xf
   [<c010543e>] dump_stack+0x17/0x19
   [<c0144e34>] check_usage+0x1f6/0x203
   [<c0146395>] __lockdep_acquire+0x8c2/0xaa5
   [<c01468c4>] lockdep_acquire+0x68/0x84
   [<c10487f4>] _spin_lock_irqsave+0x2a/0x3a
   [<c04ff8d2>] startup_msi_irq_wo_maskbit+0x10/0x35
   [<c015b932>] probe_irq_on+0x44/0x107
   [<c0673d58>] serial8250_config_port+0x84b/0x986
   [<c06707b1>] uart_add_one_port+0x105/0x289
   [<c1ace54b>] serial8250_init+0xc3/0x10a
   [<c0100457>] init+0x14a/0x2cb
   [<c0102005>] kernel_thread_helper+0x5/0xb

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:17 -07:00
Jon Mason
357c2b9056 [PATCH] x86_64: remove unused gart header file
include/asm-x86_64/gart-mapping.h is only ever used in
arch/x86_64/kernel/setup.c and none of its contents are referenced.
Looks to be leftover cruft not removed in the dma_ops patch.

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Andi Kleen
5c0f80fab3 [PATCH] x86_64: Remove long obsolete CVS
Early development of x86-64 Linux was in CVS, but that hasn't been
the case for a long time now. Remove the obsolete $Id$s.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Don Zickus
3e4ff11574 [PATCH] x86_64: nmi watchdog header cleanup
Misc header cleanup for nmi watchdog.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Ingo Molnar
14118c3cdd [PATCH] x86_64: fix unlikely profiling & vsyscalls on x86_64
fix unlikely profiling in vsyscalls ...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Jan Beulich
4b787e0b83 [PATCH] x86_64: add END()/ENDPROC() annotations to entry.S
Since END()/ENDPROC() are now available, add respective annotations to
x86_64's entry.S. This should help debugging activities.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Andi Kleen
f201611fce [PATCH] x86_64: Use -ENODEV in IOMMU initialization
Fix

initcall at 0xffffffff806c5b89: pci_iommu_init+0x0/0x53c(): returned with error code -1

Return -ENODEV instead when the IOMMU is not used.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Jan Beulich
6ebcc00e95 [PATCH] i386/x86-64: simplify ioapic_register_intr()
Simplify (remove duplication of) code in ioapic_register_intr().

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Jan Beulich
0a1ad60d7a [PATCH] x86_64: serialize assign_irq_vector() use of static variables
Since assign_irq_vector() can be called at runtime, its access of static
variables should be protected by a lock.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Andi Kleen
a32073bffc [PATCH] x86_64: Clean and enhance up K8 northbridge access code
- Factor out the duplicated access/cache code into a single file
   * Shared between i386/x86-64.
 - Share flush code between AGP and IOMMU
   * Fix a bug: AGP didn't wait for end of flush before
 - Drop 8 northbridges limit and allocate dynamically
 - Add lock to serialize AGP and IOMMU GART flushes
 - Add PCI ID for next AMD northbridge
 - Random related cleanups

The old K8 NUMA discovery code is unchanged. New systems
should all use SRAT for this.

Cc: "Navin Boppuri" <navin.boppuri@newisys.com>
Cc: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Jon Mason
7c2d9cd218 [PATCH] x86_64: trivial gart clean-up
A trivial change to have gart_unmap_sg call gart_unmap_single directly,
instead of bouncing through the dma_unmap_single wrapper in
dma-mapping.h.

This change required moving the gart_unmap_single above gart_unmap_sg,
and under gart_map_single (which seems a more logical place that its
current location IMHO).

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Mike Waychison
f5adc9c79d [PATCH] x86_64: iommu_gart_bitmap search to cross next_bit
Allow search for a contiguous block of iommu space to cross the next_bit
marker if we have already committed ourselves to flushing the gart.

There shouldn't be any reason why we'd restrict the search.

Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:15 -07:00
Jon Mason
9f2036f3e2 [PATCH] x86_64: pci-dma.c clean-up - trivial
Replace hard coded DMA masks with #defines from
include/linux/dma-mapping.h

Signed-off-by: Jon Mason <jdmason@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Gerd Hoffmann
d167a51877 [PATCH] x86_64: x86_64 version of the smp alternative patch.
Changes are largely identical to the i386 version:

 * alternative #define are moved to the new alternative.h file.
 * one new elf section with pointers to the lock prefixes which can be
   nop'ed out for non-smp.
 * two new elf sections simliar to the "classic" alternatives to
   replace SMP code with simpler UP code.
 * fixup headers to use alternative.h instead of defining their own
   LOCK / LOCK_PREFIX macros.

The patch reuses the i386 version of the alternatives code to avoid code
duplication.  The code in alternatives.c was shuffled around a bit to
reduce the number of #ifdefs needed.  It also got some tweaks needed for
x86_64 (vsyscall page handling) and new features (noreplacement option
which was x86_64 only up to now).  Debug printk's are changed from
compile-time to runtime.

Loosely based on a early version from Bastian Blank <waldi@debian.org>

Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Andi Kleen
240cd6a806 [PATCH] i386/x86-64: Emulate CPUID4 on AMD
Intel systems report the cache level data from CPUID 4 in sysfs.
Add a CPUID 4 emulation for AMD CPUs to report the same
information for them. This allows programs to read this
information in a uniform way.

The AMD way to report this is less flexible so some assumptions
are hardcoded (e.g. no L3)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Andi Kleen
faee9a5dc9 [PATCH] i386/x86-64: Use new official CPUID to get APICID/core split on AMD platforms
Previously the apicid<->coreid split was computed based on the max
number of cores. Now use a new CPUID AMD defined for that. On most
systems right now it should be 0 and the old method will be used.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
ravikiran thirumalai
0f4fdb7fba [PATCH] x86_64: Use local APIC ID from local APIC instead of CPUID
vSMPowered systems use apic_cluster too.  Forcing apic_physflat works
on these systems too, but only if we change phys_pkg_id to use
hard_smp_prcoessor_id() instead of cpuid_ebx.  I am guessing other
multichassi cluster systems would need this too.

Signed-off-by: ravikiran thirumalai <kiran@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:14 -07:00
Antonino A. Daplas
ba70710e59 [PATCH] fbdev: Firmware EDID fixes
- make firmware edid independent from framebuffer (No need to choose
  framebuffer just to disable this option

- enable this option in X86_64

- check if VBE/DDC function is implemented before calling actual function

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:30 -07:00
Andreas Mohr
7d622d4794 [PATCH] make pmtmr_ioport __read_mostly
- written on init only, accessed for every timer read --> __read_mostly
- fix broken sentence

Signed-off-by: Andreas Mohr <andi@lisas.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 09:58:21 -07:00
Tobias Klauser
2efe55a9ce Storage class should be first
Storage class should be before const

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26 18:57:34 +02:00
Andreas Mohr
d6e05edc59 spelling fixes
acquired (aquired)
contiguous (contigious)
successful (succesful, succesfull)
surprise (suprise)
whether (weather)
some other misspellings

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26 18:35:02 +02:00
Lee Revell
f18190bd34 fix paniced->panicked typos
In a testament to the utter simplicity and logic of the English
language ;-), I found a single correct use - in kernel/panic.c - and
10-15 incorrect ones.

Signed-Off-By: Lee Revell <rlrevell@joe-job.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-26 18:30:00 +02:00
Herbert Xu
6c2bb98bc3 [CRYPTO] all: Pass tfm instead of ctx to algorithms
Up until now algorithms have been happy to get a context pointer since
they know everything that's in the tfm already (e.g., alignment, block
size).

However, once we have parameterised algorithms, such information will
be specific to each tfm.  So the algorithm API needs to be changed to
pass the tfm structure instead of the context pointer.

This patch is basically a text substitution.  The only tricky bit is
the assembly routines that need to get the context pointer offset
through asm-offsets.h.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2006-06-26 17:34:39 +10:00
Linus Torvalds
3448097fcc Revert "swsusp special saveable pages support" commits
This reverts commits

  3e3318dee0 [PATCH] swsusp: x86_64 mark special saveable/unsaveable pages
  b6370d96e0 [PATCH] swsusp: i386 mark special saveable/unsaveable pages
  ce4ab0012b [PATCH] swsusp: add architecture special saveable pages support

because not only do they apparently cause page faults on x86, the
infrastructure doesn't compile on powerpc.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 18:41:00 -07:00
Amul Shah
00212fef81 [PATCH] Fix kdump Crash Kernel boot memory reservation for NUMA machines
This patch will fix a boot memory reservation bug that trashes memory on
the ES7000 when loading the kdump crash kernel.

The code in arch/x86_64/kernel/setup.c to reserve boot memory for the crash
kernel uses the non-numa aware "reserve_bootmem" function instead of the
NUMA aware "reserve_bootmem_generic".  I checked to make sure that no other
function was using "reserve_bootmem" and found none, except the ones that
had NUMA ifdef'ed out.

I have tested this patch only on an ES7000 with NUMA on and off (numa=off)
in a single (non-NUMA) and multi-cell (NUMA) configurations.

Signed-off-by: Amul Shah <amul.shah@unisys.com>
Looks-good-to: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:01:26 -07:00
Linus Torvalds
37224470c8 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (65 commits)
  ACPI: suppress power button event on S3 resume
  ACPI: resolve merge conflict between sem2mutex and processor_perflib.c
  ACPI: use for_each_possible_cpu() instead of for_each_cpu()
  ACPI: delete newly added debugging macros in processor_perflib.c
  ACPI: UP build fix for bugzilla-5737
  Enable P-state software coordination via _PDC
  P-state software coordination for speedstep-centrino
  P-state software coordination for acpi-cpufreq
  P-state software coordination for ACPI core
  ACPI: create acpi_thermal_resume()
  ACPI: create acpi_fan_suspend()/acpi_fan_resume()
  ACPI: pass pm_message_t from acpi_device_suspend() to root_suspend()
  ACPI: create acpi_device_suspend()/acpi_device_resume()
  ACPI: replace spin_lock_irq with mutex for ec poll mode
  ACPI: Allow a WAN module enable/disable on a Thinkpad X60.
  sem2mutex: acpi, acpi_link_lock
  ACPI: delete unused acpi_bus_drivers_lock
  sem2mutex: drivers/acpi/processor_perflib.c
  ACPI add ia64 exports to build acpi_memhotplug as a module
  ACPI: asus_acpi_init(): propagate correct return value
  ...

Manual resolve of conflicts in:

	arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c
	arch/i386/kernel/cpu/cpufreq/speedstep-centrino.c
	include/acpi/processor.h
2006-06-23 07:52:36 -07:00
Shaohua Li
55b2355eef [PATCH] don't use flush_tlb_all in suspend time
flush_tlb_all uses on_each_cpu, which will disable/enable interrupt.
In suspend/resume time, this will make interrupt wrongly enabled.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:43:00 -07:00
Shaohua Li
3e3318dee0 [PATCH] swsusp: x86_64 mark special saveable/unsaveable pages
Pages (Reserved/ACPI NVS/ACPI Data) below end_pfn will be saved/restored by S4
currently.  We should mark 'Reserved' pages not saveable.

Pages (Reserved/ACPI NVS/ACPI Data) above end_pfn will not be saved/restored
by S4 currently.  We should save the 'ACPI NVS/ACPI Data' pages.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@suspend2.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:59 -07:00
Andreas Mohr
7b0c2d9218 [PATCH] x86: make i387 mxcsr_feature_mask __read_mostly
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:57 -07:00
Andreas Mohr
acae9d3243 [PATCH] x86: make using_apic_timer __read_mostly
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-23 07:42:57 -07:00
Len Brown
3e8e7c93d7 Pull bugzilla-5653 into release branch 2006-06-15 15:41:53 -04:00
Andy Currid
a2ef3a50f1 [PATCH] Fix HPET operation on 64-bit NVIDIA platforms
From: "Andy Currid" <ACurrid@nvidia.com>

This patch fixes a kernel panic during boot that occurs on NVIDIA platforms
that have HPET enabled.

When HPET is enabled, the standard timer IRQ is routed to IOAPIC pin 2 and is
advertised as such in the ACPI APIC table - but an earlier workaround in the
kernel was ignoring this override.  The fix is to honor timer IRQ overrides
from ACPI when HPET is detected on an NVIDIA platform.

Signed-off-by: Andy Currid <acurrid@nvidia.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: "Yu, Luming" <luming.yu@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-08 15:12:21 -07:00
Andi Kleen
822ff019f7 [PATCH] x86_64: Don't do syscall exit tracing twice
int_ret_from_syscall already does syscall exit tracing, so
no need to do it again in the caller.

This caused problems for UML and some other special programs doing
syscall interception.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-30 20:31:06 -07:00
Robert Hentosh
7ca97c6131 [PATCH] x86_64: Fix off by one in bad_addr checking in find_e820_area
From: Robert Hentosh <robert_hentosh@dell.com>

Actually, we just stumbled on a different bug found in find_e820_area() in
e820.c.  The following code does not handle the edge condition correctly:

   while (bad_addr(&addr, size) && addr+size < ei->addr + ei->size)
       ;
   last = addr + size;
   if ( last > ei->addr + ei->size )
       continue;

The second statement in the while loop needs to be a <= b so that it is the
logical negavite of the if (a > b) outside it. It needs to read:

   while (bad_addr(&addr, size) && addr+size <= ei->addr + ei->size)
       ;

In the case that failed bad_addr was returning an address that is exactly size
bellow the end of the e820 range.

AK: Again together with the earlier avoid edma fix this fixes
boot on a Dell PE6850/16GB

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-30 20:31:06 -07:00
Daniel Yeisley
0d01532451 [PATCH] x86_64: Handle empty node zero
From: Daniel Yeisley <dan.yeisley@unisys.com>

It is possible to boot a Unisys ES7000 with CPUs from multiple cells, and not
also include the memory from those cells.  This can create a scenario where
node 0 has cpus, but no associated memory.  The system will boot fine in a
configuration where node 0 has memory, but nodes 2 and 3 do not.

[AK: I rechecked the code and generic code seems to indeed handle that already.
Dan's original patch had a change for mm/slab.c that seems to be already in now.]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-30 20:31:06 -07:00
Jan Beulich
b2468e525f [PATCH] x86_64: fix last_tsc calculation of PM timer
From: "Jan Beulich" <jbeulich@novell.com>

The PM timer code updates vxtime.last_tsc, but this update was done
incorrectly in two ways:
- offset_delay being in microseconds requires multiplying with cpu_mhz
  rather than cpu_khz
- the multiplication of offset_delay and cpu_khz (both being 32-bit
  values) on most current CPUs would overflow (observed value of the
  delay was approximately 4000us, yielding an overflow for frequencies
  starting a little above 1GHz)

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-30 20:31:05 -07:00
Andi Kleen
dc9a719528 [PATCH] x86_64: Fix no IOMMU warning in PCI-GART driver
Complaining about the IOMMU not compiled in doesn't make sense
here because it is clearly compiled in.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-30 20:31:05 -07:00
Satoshi Oshima
dc49e3445a [PATCH] kprobes: bad manipulation of 2 byte opcode on x86_64
Problem:

If we put a probe onto a callq instruction and the probe is executed,
kernel panic of Bad RIP value occurs.

Root cause:

If resume_execution() found 0xff at first byte of p->ainsn.insn, it must
check the _second_ byte.  But current resume_execution check _first_ byte
again.

I changed it checks second byte of p->ainsn.insn.

Kprobes on i386 don't have this problem, because the implementation is a
little bit different from x86_64.

Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Satoshi Oshima <soshima@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-21 12:59:21 -07:00
Andi Kleen
40e59a6166 [PATCH] x86_64: Don't schedule on exception stack on preemptive kernels
Extends an earlier patch from John Blackwood to more exception handlers
that also run on the exception stacks.

Expand the use of preempt_conditional_{sti,cli} to all cases where
interrupts are to be re-enabled during exception handling while running
on an IST stack.

Based on original patch from Jan Beulich.

Cc: John Blackwood <john.blackwood@ccur.com>
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-16 07:59:32 -07:00
Andi Kleen
f0fdabf8bf [PATCH] x86_64: Don't warn for overflow in nommu case when dma_mask is < 32bit
This triggers for b44's 1GB DMA workaround which tries to map
first and then bounces.

The 32bit heuristic is reasonable because the IOMMU doesn't attempt
to handle < 32bit masks anyways.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-16 07:59:31 -07:00
Andi Kleen
ac71d12c99 [PATCH] x86_64: Avoid EBDA area in early boot allocator
Based on analysis&patch from Robert Hentosch

Observed on a Dell PE6850 with 16GB

The problem occurs very early on, when the kernel allocates space for the
temporary memory map called bootmap. The bootmap overlaps the EBDA region.
EBDA region is not historically reserved in the e820 mapping. When the
bootmap is freed it marks the EBDA region as usable.

If you notice in setup.c there is already code to work around the EBDA
in reserve_ebda_region(), this check however occurs after the bootmap
is allocated and doesn't prevent the bootmap from using this range.

AK: I redid the original patch. Thanks also to Jan Beulich for
spotting some mistakes.

Cc: Robert_Hentosch@dell.com
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Corey Minyard
8b1ffe9550 [PATCH] x86_64: add nmi_exit to die_nmi
Playing with NMI watchdog on x86_64, I discovered that it didn't
do what I expected.  It always panic-ed, even when it didn't
happen from interrupt context.  This patch solves that
problem for me.  Also, in this case, do_exit() will be called
with interrupts disabled, I believe.  Would it be wise to also
call local_irq_enable() after nmi_exit()?
[Yes I added it -AK]

Currently, on x86_64, any NMI watchdog timeout will cause a panic
because the irq count will always be set to be in an interrupt
when do_exit() is called from die_nmi().  If we add nmi_exit() to
the die_nmi() call (since the nmi will never exit "normally")
it seems to solve this problem.  The following small program
can be used to trigger the NMI watchdog to reproduce this:
  main ()
  {
        iopl(3);
        for (;;) asm("cli");
  }

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Corey Minyard
cdc60a4c8e [PATCH] x86_64: fix die_lock nesting
I noticed this when poking around in this area.

The oops_begin() function in x86_64 would only conditionally claim
the die_lock if the call is nested, but oops_end() would always
release the spinlock. This patch adds a nest count for the die lock
so that the release of the lock is only done on the final oops_end().

Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Andi Kleen
5192d84e4c [PATCH] x86_64: Check for too many northbridges in IOMMU code
The IOMMU code can only deal with 8 northbridges. Error out when
more are found.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Kimball Murray
e0c1e9bf81 [PATCH] x86_64: avoid IRQ0 ioapic pin collision
The patch addresses a problem with ACPI SCI interrupt entry, which gets
re-used, and the IRQ is assigned to another unrelated device.  The patch
corrects the code such that SCI IRQ is skipped and duplicate entry is
avoided.  Second issue came up with VIA chipset, the problem was caused by
original patch assigning IRQs starting 16 and up.  The VIA chipset uses
4-bit IRQ register for internal interrupt routing, and therefore cannot
handle IRQ numbers assigned to its devices.  The patch corrects this
problem by allowing PCI IRQs below 16.

Cc: len.brown@intel.com

Signed-off by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-08 09:34:56 -07:00
Linus Torvalds
532f57da40 Merge branch 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current
* 'audit.b10' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
  [PATCH] Audit Filter Performance
  [PATCH] Rework of IPC auditing
  [PATCH] More user space subject labels
  [PATCH] Reworked patch for labels on user space messages
  [PATCH] change lspp ipc auditing
  [PATCH] audit inode patch
  [PATCH] support for context based audit filtering, part 2
  [PATCH] support for context based audit filtering
  [PATCH] no need to wank with task_lock() and pinning task down in audit_syscall_exit()
  [PATCH] drop task argument of audit_syscall_{entry,exit}
  [PATCH] drop gfp_mask in audit_log_exit()
  [PATCH] move call of audit_free() into do_exit()
  [PATCH] sockaddr patch
  [PATCH] deal with deadlocks in audit_free()
2006-05-01 21:43:05 -07:00
Mikael Pettersson
160bd18e5e [PATCH] x86_64: make PC Speaker driver work
The PC Speaker driver's ->probe() routine doesn't even get called in the
64-bit kernels.  The reason for that is that the arch code apparently has
to explictly add a "pcspkr" platform device in order for the driver core to
call the ->probe() routine.  arch/i386/kernel/setup.c unconditionally adds
a "pcspkr" device, but the x86_64 kernel has no code at all related to the
PC Speaker.

The patch below copies the relevant code from i386 to x86_64, which makes
the PC Speaker work for me on x86_64.

Cc: Dmitry Torokhov <dtor_core@ameritech.net>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-05-01 18:17:47 -07:00
Al Viro
5411be59db [PATCH] drop task argument of audit_syscall_{entry,exit}
... it's always current, and that's a good thing - allows simpler locking.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:06:18 -04:00
Chandra Seetharaman
83d722f7e1 [PATCH] Remove __devinit and __cpuinit from notifier_call definitions
Few of the notifier_chain_register() callers use __init in the definition
of notifier_call.  It is incorrect as the function definition should be
available after the initializations (they do not unregister them during
initializations).

This patch fixes all such usages to _not_ have the notifier_call __init
section.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 08:30:03 -07:00
Mike Waychison
5b20192727 [PATCH] x86_64: Fix a race in the free_iommu path
We do this by removing a micro-optimization that tries to avoid grabbing
the iommu_bitmap_lock spinlock and using a bus-locked operation.

This still races with other simultaneous alloc_iommu or free_iommu(size >
1) which both use bus-unlocked operations.

The end result of this race is eventually ending up with an
iommu_gart_bitmap that has bits errornously set all over, making large
contiguous iommu space allocations fail with 'PCI-DMA: Out of IOMMU space'.

Signed-off-by: Mike Waychison <mikew@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-22 09:19:52 -07:00
Andi Kleen
18bd057b14 [PATCH] i386/x86-64: Fix x87 information leak between processes
AMD K7/K8 CPUs only save/restore the FOP/FIP/FDP x87 registers in FXSAVE
when an exception is pending.  This means the value leak through
context switches and allow processes to observe some x87 instruction
state of other processes.

This was actually documented by AMD, but nobody recognized it as
being different from Intel before.

The fix first adds an optimization: instead of unconditionally
calling FNCLEX after each FXSAVE test if ES is pending and skip
it when not needed. Then do a x87 load from a kernel variable to
clear FOP/FIP/FDP.

This means other processes always will only see a constant value
defined by the kernel in their FP state.

I took some pain to make sure to chose a variable that's already
in L1 during context switch to make the overhead of this low.

Also alternative() is used to patch away the new code on CPUs
who don't need it.

Patch for both i386/x86-64.

The problem was discovered originally by Jan Beulich. Richard
Brunner provided the basic code for the workarounds, with contribution
from Jan.

This is CVE-2006-1056

Cc: richard.brunner@amd.com
Cc: jbeulich@novell.com

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-20 07:58:11 -07:00
Prasanna S Panchamukhi
3b60211c16 [PATCH] Switch Kprobes inline functions to __kprobes for x86_64
Andrew Morton pointed out that compiler might not inline the functions
marked for inline in kprobes.  There-by allowing the insertion of probes
on these kprobes routines, which might cause recursion.

This patch removes all such inline and adds them to kprobes section
there by disallowing probes on all such routines.  Some of the routines
can even still be inlined, since these routines gets executed after the
kprobes had done necessay setup for reentrancy.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:53 -07:00
Vivek Goyal
8bcc5280e6 [PATCH] x86_64: x86_64 add crashdump trigger points
o Start booting into the capture kernel after an Oops if system is in a
  unrecoverable state. System will boot into the capture kernel, if one is
  pre-loaded by the user, and capture the kernel core dump.

o One of the following conditions should be true to trigger the booting of
  capture kernel.
        - panic_on_oops is set.
        - pid of current thread is 0
        - pid of current thread is 1
        - Oops happened inside interrupt context.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-18 10:39:19 -07:00
Bjorn Helgaas
4f705ae3e9 [PATCH] DMI: move dmi_scan.c from arch/i386 to drivers/firmware/
dmi_scan.c is arch-independent and is used by i386, x86_64, and ia64.
Currently all three arches compile it from arch/i386, which means that ia64
and x86_64 depend on things in arch/i386 that they wouldn't otherwise care
about.

This is simply "mv arch/i386/kernel/dmi_scan.c drivers/firmware/" (removing
trailing whitespace) and the associated Makefile changes.  All three
architectures already set CONFIG_DMI in their top-level Kconfig files.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Andi Kleen <ak@muc.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andrey Panin <pazke@orbita1.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-04-14 11:41:25 -07:00
Andi Kleen
97a4d00388 [PATCH] x86_64: Remove check for canonical RIP
As pointed out by Linus it is useless now because entry.S should
handle it correctly in all cases.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:38:57 -07:00
Kyle McMartin
894b5779ce [PATCH] No arch-specific strpbrk implementations
While cleaning up parisc_ksyms.c earlier, I noticed that strpbrk wasn't
being exported from lib/string.c.  Investigating further, I noticed a
changeset that removed its export and added it to _ksyms.c on a few more
architectures.  The justification was that "other arches do it."

I think this is wrong, since no architecture currently defines
__HAVE_ARCH_STRPBRK, there's no reason for any of them to be exporting it
themselves.  Therefore, consolidate the export to lib/string.c.

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:40 -07:00
John Blackwood
97c2803c9c [PATCH] x86_64: Plug GS leak in arch_prctl()
In linux-2.6.16, we have noticed a problem where the gs base value
returned from an arch_prtcl(ARCH_GET_GS, ...) call will be incorrect if:

   - the current/calling task has NOT set its own gs base yet to a
     non-zero value,

   - some other task that ran on the same processor previously set their
     own gs base to a non-zero value.

In this situation, the ARCH_GET_GS code will read and return the
MSR_KERNEL_GS_BASE msr register.

However, since the __switch_to() code does NOT load/zero the
MSR_KERNEL_GS_BASE register when the task that is switched IN has a zero
next->gs value, the caller of arch_prctl(ARCH_GET_GS, ...) will get back
the value of some previous tasks's gs base value instead of 0.

    Change the arch_prctl() ARCH_GET_GS code to only read and return
    the MSR_KERNEL_GS_BASE msr register if the 'gs' register of the calling
    task is non-zero.

    Side note: Since in addition to using arch_prctl(ARCH_SET_GS, ...),
    a task can also setup a gs base value by using modify_ldt() and write
    an index value into 'gs' from user space, the patch below reads
    'gs' instead of using thread.gs, since in the modify_ldt() case,
    the thread.gs value will be 0, and incorrect value would be returned
    (the task->thread.gs value).

    When the user has not set its own gs base value and the 'gs'
    register is zero, then the MSR_KERNEL_GS_BASE register will not be
    read and a value of zero will be returned by reading and returning
    'task->thread.gs'.

    The first patch shown below is an attempt at implementing this
    approach.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-09 11:53:53 -07:00