Commit graph

83 commits

Author SHA1 Message Date
Bernhard Kaindl
2b1f6278d7 [PATCH] x86: Save the MTRRs of the BSP before booting an AP
Applied fix by Andew Morton:
http://lkml.org/lkml/2007/4/8/88 - Fix `make headers_check'.

AMD and Intel x86 CPU manuals state that it is the responsibility of
system software to initialize and maintain MTRR consistency across
all processors in Multi-Processing Environments.

Quote from page 188 of the AMD64 System Programming manual (Volume 2):

7.6.5 MTRRs in Multi-Processing Environments

"In multi-processing environments, the MTRRs located in all processors must
characterize memory in the same way. Generally, this means that identical
values are written to the MTRRs used by the processors." (short omission here)
"Failure to do so may result in coherency violations or loss of atomicity.
Processor implementations do not check the MTRR settings in other processors
to ensure consistency. It is the responsibility of system software to
initialize and maintain MTRR consistency across all processors."

Current Linux MTRR code already implements the above in the case that the
BIOS does not properly initialize MTRRs on the secondary processors,
but the case where the fixed-range MTRRs of the boot processor are changed
after Linux started to boot, before the initialsation of a secondary
processor, is not handled yet.

In this case, secondary processors are currently initialized by Linux
with MTRRs which the boot processor had very early, when mtrr_bp_init()
did run, but not with the MTRRs which the boot processor uses at the
time when that secondary processors is actually booted,
causing differing MTRR contents on the secondary processors.

Such situation happens on Acer Ferrari 1000 and 5000 notebooks where the
BIOS enables and sets AMD-specific IORR bits in the fixed-range MTRRs
of the boot processor when it transitions the system into ACPI mode.
The SMI handler of the BIOS does this in SMM, entered while Linux ACPI
code runs acpi_enable().

Other occasions where the SMI handler of the BIOS may change bits in
the MTRRs could occur as well. To initialize newly booted secodary
processors with the fixed-range MTRRs which the boot processor uses
at that time, this patch saves the fixed-range MTRRs of the boot
processor before new secondary processors are started. When the
secondary processors run their Linux initialisation code, their
fixed-range MTRRs will be updated with the saved fixed-range MTRRs.

If CONFIG_MTRR is not set, we define mtrr_save_state
as an empty statement because there is nothing to do.

Possible TODOs:

*) CPU-hotplugging outside of SMP suspend/resume is not yet tested
   with this patch.

*) If, even in this case, an AP never runs i386/do_boot_cpu or x86_64/cpu_up,
   then the calls to mtrr_save_state() could be replaced by calls to
   mtrr_save_fixed_ranges(NULL) and  mtrr_save_state() would not be
   needed.

   That would need either verification of the CPU-hotplug code or
   at least a test on a >2 CPU machine.

*) The MTRRs of other running processors are not yet checked at this
   time but it might be interesting to syncronize the MTTRs of all
   processors before booting. That would be an incremental patch,
   but of rather low priority since there is no machine known so
   far which would require this.

AK: moved prototypes on x86-64 around to fix warnings

Signed-off-by: Bernhard Kaindl <bk@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
2007-05-02 19:27:17 +02:00
Jeremy Fitzhardinge
c5413fbe89 [PATCH] i386: Fix UP gdt bugs
Fixes two problems with the GDT when compiling for uniprocessor:
 - There's no percpu segment, so trying to load its selector into %fs fails.
   Use a null selector instead.
 - The real gdt needs to be loaded at some point.  Do it in cpu_init().

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
2007-05-02 19:27:16 +02:00
Jeremy Fitzhardinge
7c3576d261 [PATCH] i386: Convert PDA into the percpu section
Currently x86 (similar to x84-64) has a special per-cpu structure
called "i386_pda" which can be easily and efficiently referenced via
the %fs register.  An ELF section is more flexible than a structure,
allowing any piece of code to use this area.  Indeed, such a section
already exists: the per-cpu area.

So this patch:
(1) Removes the PDA and uses per-cpu variables for each current member.
(2) Replaces the __KERNEL_PDA segment with __KERNEL_PERCPU.
(3) Creates a per-cpu mirror of __per_cpu_offset called this_cpu_off, which
    can be used to calculate addresses for this CPU's variables.
(4) Simplifies startup, because %fs doesn't need to be loaded with a
    special segment at early boot; it can be deferred until the first
    percpu area is allocated (or never for UP).

The result is less code and one less x86-specific concept.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
2007-05-02 19:27:16 +02:00
Jeremy Fitzhardinge
a6c4e076ee [PATCH] i386: clean up identify_cpu
identify_cpu() is used to identify both the boot CPU and secondary
CPUs, but it performs some actions which only apply to the boot CPU.
Those functions are therefore really __init functions, but because
they're called by identify_cpu(), they must be marked __cpuinit.

This patch splits identify_cpu() into identify_boot_cpu() and
identify_secondary_cpu(), and calls the appropriate init functions
from each.  Also, identify_boot_cpu() and all the functions it
dominates are marked __init.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:12 +02:00
Jeremy Fitzhardinge
01a2f43556 [PATCH] i386: Add smp_ops interface
Add a smp_ops interface.  This abstracts the API defined by
<linux/smp.h> for use within arch/i386.  The primary intent is that it
be used by a paravirtualizing hypervisor to implement SMP, but it
could also be used by non-APIC-using sub-architectures.

This is related to CONFIG_PARAVIRT, but is implemented unconditionally
since it is simpler that way and not a highly performance-sensitive
interface.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-05-02 19:27:11 +02:00
Rusty Russell
4fbb596881 [PATCH] i386: cleanup GDT Access
Now we have an explicit per-cpu GDT variable, we don't need to keep the
descriptors around to use them to find the GDT: expose cpu_gdt directly.

We could go further and make load_gdt() pack the descriptor for us, or even
assume it means "load the current cpu's GDT" which is what it always does.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:11 +02:00
Rusty Russell
d2cbcc49e2 [PATCH] i386: clean up cpu_init()
We now have cpu_init() and secondary_cpu_init() doing nothing but calling
_cpu_init() with the same arguments.  Rename _cpu_init() to cpu_init() and use
it as a replcement for secondary_cpu_init().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:10 +02:00
Rusty Russell
bf50467204 [PATCH] i386: Use per-cpu GDT immediately upon boot
Now we are no longer dynamically allocating the GDT, we don't need the
"cpu_gdt_table" at all: we can switch straight from "boot_gdt_table" to the
per-cpu GDT.  This means initializing the cpu_gdt array in C.

The boot CPU uses the per-cpu var directly, then in smp_prepare_cpus() it
switches to the per-cpu copy just allocated.  For secondary CPUs, the
early_gdt_descr is set to point directly to their per-cpu copy.

For UP the code is very simple: it keeps using the "per-cpu" GDT as per SMP,
but we never have to move.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:10 +02:00
Rusty Russell
ae1ee11be7 [PATCH] i386: Use per-cpu variables for GDT, PDA
Allocating PDA and GDT at boot is a pain.  Using simple per-cpu variables adds
happiness (although we need the GDT page-aligned for Xen, which we do in a
followup patch).

[akpm@linux-foundation.org: build fix]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:10 +02:00
Andrew Morton
a86f34b49f [PATCH] x86: revert x86_64-mm-fix-the-irqbalance-quirk-for-e7320-e7520-e7525
Obsoleted by Ingo's genapic stuff.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-05-02 19:27:04 +02:00
Ingo Molnar
d04f41e353 [PATCH] CPU hotplug: call check_tsc_sync_source() with irqs off
check_tsc_sync_source() depends on being called with irqs disabled (it
checks whether the TSC is coherent across two specific CPUs). This is
incidentally true during bootup, but not during cpu hotplug __cpu_up().
This got found via smp_processor_id() debugging.

disable irqs explicitly and remove the unconditional enabling of
interrupts. Add touch_nmi_watchdog() to the cpu_online_map busy loop.

this bug is present both on i386 and on x86_64.

Reported-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-07 10:07:24 -08:00
Zachary Amsden
eda08b1bef [PATCH] vmi: paravirt drop udelay op
Not respecting udelay causes problems with any virtual hardware that is passed
through to real hardware.  This can be noticed by any device that interacts
with the real world in real time - like AP startup, which takes real time.  Or
keyboard LEDs, which should blink in real-time.  Or floppy drives, but only
when passed through to a real floppy controller on OSes which can't
sufficiently buffer the floppy commands to emulate a zero latency floppy.  Or
IDE drives, when connecting to a physical CDROM.

This was mostly a hack to get the kernel to boot faster, but it introduced a
number of misvirtualization bugs, and Alan and Pavel argued pretty strongly
against it.  We were the only client, and now want to clean up this cruft.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:52 -08:00
Thomas Gleixner
e9e2cdb412 [PATCH] clockevents: i386 drivers
Add clockevent drivers for i386: lapic (local) and PIT/HPET (global).  Update
the timer IRQ to call into the PIT/HPET driver's event handler and the
lapic-timer IRQ to call into the lapic clockevent driver.  The assignement of
timer functionality is delegated to the core framework code and replaces the
compile and runtime evalution in do_timer_interrupt_hook()

Use the clockevents broadcast support and implement the lapic_broadcast
function for ACPI.

No changes to existing functionality.

[ kdump fix from Vivek Goyal <vgoyal@in.ibm.com> ]
[ fixes based on review feedback from Arjan van de Ven <arjan@infradead.org> ]
Cleanups-from: Adrian Bunk <bunk@stusta.de>
Build-fixes-from: Andrew Morton <akpm@osdl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:59 -08:00
Thomas Gleixner
e05d723f98 [PATCH] i386, apic: clean up the APIC code
The apic code is quite unstructured and missing a lot of comments.

- Restructure the code into helper functions, timer, setup/shutdown,
  interrupt and power management blocks.
- Fixup comments.
- Namespace fixups
- Inline helpers for version and is_integrated
- Combine the ack_bad_irq functions

No functional changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Zachary Amsden <zach@vmware.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Rohit Seth <rohitseth@google.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:58 -08:00
Ingo Molnar
95492e4646 [PATCH] x86: rewrite SMP TSC sync code
make the TSC synchronization code more robust, and unify it between x86_64 and
i386.

The biggest change is the removal of the 'fix up TSCs' code on x86_64 and
i386, in some rare cases it was /causing/ time-warps on SMP systems.

The new code only checks for TSC asynchronity - and if it can prove a
time-warp (if it can observe the TSC going backwards when going from one CPU
to another within a critical section), then the TSC clock-source is turned
off.

The TSC synchronization-checking code also got moved into a separate file.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Rusty Russell
2a57ff1a70 [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c
When I implemented the DECLARE_PER_CPU(var) macros, I was careful that
people couldn't use "var" in a non-percpu context, by prepending
percpu__.  I never considered that this would allow them to overload
the same name for a per-cpu and a non-percpu variable.

It is only one of many horrors in the i386 boot code, but let's rename
the non-perpcu cpu_gdt_descr to early_gdt_descr (not boot_gdt_descr,
that's something else...)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>

===================================================================
2007-02-13 13:26:26 +01:00
Zachary Amsden
bbab4f3bb7 [PATCH] i386: vMI timer patches
VMI timer code.  It works by taking over the local APIC clock when APIC is
configured, which requires a couple hooks into the APIC code.  The backend
timer code could be commonized into the timer infrastructure, but there are
some pieces missing (stolen time, in particular), and the exact semantics of
when to do accounting for NO_IDLE need to be shared between different
hypervisors as well.  So for now, VMI timer is a separate module.

[Adrian Bunk: cleanups]

Subject: VMI timer patches
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden
7ce0bcfd16 [PATCH] i386: vMI backend for paravirt-ops
Fairly straightforward implementation of VMI backend for paravirt-ops.

[Adrian Bunk: some cleanups]

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
Zachary Amsden
ae5da273fe [PATCH] i386: SMP boot hook for paravirt
Add VMI SMP boot hook.  We emulate a regular boot sequence and use the same
APIC IPI initiation, we just poke magic values to load into the CPU state when
the startup IPI is received, rather than having to jump through a real mode
trampoline.

This is all that was needed to get SMP to work.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2007-02-13 13:26:21 +01:00
James Bottomley
9ee79a3d37 [PATCH] x86: fix PDA variables to work during boot
The current PDA code, which went in in post 2.6.19 has a flaw in that it
doesn't correctly cycle the GDT and %GS segment through the boot PDA,
the CPU PDA and finally the per-cpu PDA.

The bug generally doesn't show up if the boot CPU id is zero, but
everything falls apart for a non zero boot CPU id.  The basically kills
voyager which is perfectly capable of doing non zero CPU id boots, so
voyager currently won't boot without this.

The fix is to be careful and actually do the GDT setups correctly.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-01-22 19:39:36 -08:00
Vivek Goyal
4a5d107a9a [PATCH] i386: cpu hotplug/smpboot misc MODPOST warning fixes
o Misc smpboot/cpu hotplug path cleanups. I did those to supress the
  warnings generated by MODPOST. These warnings are visible only
  if CONFIG_RELOCATABLE=y.

o CONFIG_RELOCATABLE compiles the kernel with --emit-relocs option. This
  option retains relocation information in vmlinux file and MODPOST
  is quick to spit out "Section mismatch" warnings.

o This patch fixes some of those warnings. Many of the functions in
  smpboot case are __devinit type and they in turn accesses text/data which
  if of type __cpuinit. Now if CONFIG_HOTPLUG=y and CONFIG_HOTPLUG_CPU=n
  then we end up in cases where a function in .text segment is calling
  another function in .init.text segment and MODPOST emits warning.

WARNING: vmlinux - Section mismatch: reference to .init.text:identify_cpu from .text between 'smp_store_cpu_info' (at offset 0xc011020d) and 'do_boot_cpu'
WARNING: vmlinux - Section mismatch: reference to .init.text:init_gdt from .text between 'do_boot_cpu' (at offset 0xc01102ca) and '__cpu_up'
WARNING: vmlinux - Section mismatch: reference to .init.text:print_cpu_info from .text between 'do_boot_cpu' (at offset 0xc01105d0) and '__cpu_up'

o It also fixes the issues where CONFIG_HOTPLUG_CPU=y and start_secondary()
  is calling smp_callin() which in-turn calls synchronize_tsc_ap() which is
  of type __init. This should have meant broken CPU hotplug.

WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc011603f) and 'initialize_secondary'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'MP_processor_info' (at offset 0xc0116a4f) and 'mp_register_lapic'

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
2007-01-11 01:52:44 +01:00
Vivek Goyal
3771a450cf [PATCH] i386: modpost smpboot code warning fix
o Currently synchronize_tsc_ap() is of type __init. It is called by
  smp_callin() which is of type __cpuinit. So synchronize_tsc_ap()
  should be of type __cpuinit.

o Modpost generates warnings for i386 if CONFIG_RELOCATABLE=y and
  CONFIG_HOTPLUG_CPU=y

WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164dc) and 'initialize_secondary'
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'start_secondary' (at offset 0xc01164e8) and 'initialize_secondary'

o tsc is of type __initdata. It should be of type __cpuinitdata.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2007-01-05 23:55:23 -08:00
Andrew Morton
24d34dc564 [PATCH] arch/i386/kernel/smpboot.c: remove unneeded ifdef
#ifdef CONFIG_SMP in a file which isn't compiled in non-SMP kernels.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:46 -08:00
Randy Dunlap
7e74437cf6 [PATCH] i386: export smp_num_siblings for oprofile
oprofile uses smp_num_siblings without testing for CONFIG_X86_HT.
I looked at modifying oprofile, but this way is cleaner & simpler
and I didn't see a good reason not to just export it when CONFIG_SMP.

WARNING: "smp_num_siblings" [arch/i386/oprofile/oprofile.ko] undefined!

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-12-09 21:33:36 +01:00
Shaohua Li
3b1bdf4e08 [PATCH] CPU hotplug broken with 2GB VMSPLIT
In VMSPLIT mode, kernel PGD might have more entries than user space.

Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-08 08:29:09 -08:00
Linus Torvalds
4522d58275 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (156 commits)
  [PATCH] x86-64: Export smp_call_function_single
  [PATCH] i386: Clean up smp_tune_scheduling()
  [PATCH] unwinder: move .eh_frame to RODATA
  [PATCH] unwinder: fully support linker generated .eh_frame_hdr section
  [PATCH] x86-64: don't use set_irq_regs()
  [PATCH] x86-64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq
  [PATCH] x86-64: Make ix86 default to HIGHMEM4G instead of NOHIGHMEM
  [PATCH] i386: replace kmalloc+memset with kzalloc
  [PATCH] x86-64: remove remaining pc98 code
  [PATCH] x86-64: remove unused variable
  [PATCH] x86-64: Fix constraints in atomic_add_return()
  [PATCH] x86-64: fix asm constraints in i386 atomic_add_return
  [PATCH] x86-64: Correct documentation for bzImage protocol v2.05
  [PATCH] x86-64: replace kmalloc+memset with kzalloc in MTRR code
  [PATCH] x86-64: Fix numaq build error
  [PATCH] x86-64: include/asm-x86_64/cpufeature.h isn't a userspace header
  [PATCH] unwinder: Add debugging output to the Dwarf2 unwinder
  [PATCH] x86-64: Clarify error message in GART code
  [PATCH] x86-64: Fix interrupt race in idle callback (3rd try)
  [PATCH] x86-64: Remove unwind stack pointer alignment forcing again
  ...

Fixed conflict in include/linux/uaccess.h manually

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-07 08:59:11 -08:00
Adrian Bunk
d9408cefe6 [PATCH] i386: Clean up smp_tune_scheduling()
- remove the write-only local variable "bandwidth"
- don't set "max_cache_size" in the (cachesize < 0) case:
  that's already handled in kernel/sched.c:measure_migration_cost()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:19 +01:00
Siddha, Suresh B
b0d0a4ba45 [PATCH] x86: fix the irqbalance quirk for E7320/E7520/E7525
Move the irqbalance quirks for E7320/E7520/E7525(Errata 23 in
http://download.intel.com/design/chipsets/specupdt/30304203.pdf) to early
quirks.

And add a PCI quirk for these platforms to check(which happens very late
during the boot) if the APIC routing is indeed set to default flat mode.

This fixes the breakage(in x86_64) of this quirk due to cpu hotplug which
selects physical mode instead of the logical flat(as needed for this errata
workaround).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:10 +01:00
Rusty Russell
d3561b7fa0 [PATCH] paravirt: header and stubs for paravirtualisation
Create a paravirt.h header for all the critical operations which need to be
replaced with hypervisor calls, and include that instead of defining native
operations, when CONFIG_PARAVIRT.

This patch does the dumbest possible replacement of paravirtualized
instructions: calls through a "paravirt_ops" structure.  Currently these are
function implementations of native hardware: hypervisors will override the ops
structure with their own variants.

All the pv-ops functions are declared "fastcall" so that a specific
register-based ABI is used, to make inlining assember easier.

And:

+From: Andy Whitcroft <apw@shadowen.org>

The paravirt ops introduce a 'weak' attribute onto memory_setup().
Code ordering leads to the following warnings on x86:

    arch/i386/kernel/setup.c:651: warning: weak declaration of
                `memory_setup' after first use results in unspecified behavior

Move memory_setup() to avoid this.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
2006-12-07 02:14:07 +01:00
Jeremy Fitzhardinge
6211119580 [PATCH] i386: Initialize the per-CPU data area
When a CPU is brought up, a PDA and GDT are allocated for it.  The GDT's
__KERNEL_PDA entry is pointed to the allocated PDA memory, so that all
references using this segment descriptor will refer to the PDA.

This patch rearranges CPU initialization a bit, so that the GDT/PDA are set up
as early as possible in cpu_init().  Also for secondary CPUs, GDT+PDA are
preallocated and initialized so all the secondary CPU needs to do is set up
the ldt and load %gs.  This will be important once smp_processor_id() and
current use the PDA.

In all cases, the PDA is set up in head.S, before a CPU starts running C code,
so the PDA is always available.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-12-07 02:14:02 +01:00
David Howells
c4028958b6 WorkStruct: make allyesconfig
Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-11-22 14:57:56 +00:00
Keith Mannthey
78b656b8bf [PATCH] i383 numa: fix numaq/summit apicid conflict
This allows numaq to properly align cpus to their given node during
boot.  Pass logical apicid to apicid_to_node and allow the summit
sub-arch to use physical apicid (hard_smp_processor_id()).

Tested against numaq and summit based systems with no issues.

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-03 18:46:10 -07:00
Greg Banks
a406c3664e [PATCH] cpumask: export node_to_cpu_mask consistently
cpumask: ensure that node_to_cpumask() is available to modules for all
supported combinations of architecture and CONFIG_NUMA.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:17 -07:00
Peter Zijlstra
6e9a4738c9 [PATCH] completions: lockdep annotate on stack completions
All on stack DECLARE_COMPLETIONs should be replaced by:
DECLARE_COMPLETION_ONSTACK

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-01 00:39:24 -07:00
keith mannthey
3b08606dc2 [PATCH] convert i386 Summit subarch to use SRAT info for apicid_to_node calls
Convert the i386 summit subarch apicid_to_node to use node information
provided by the SRAT.  It was discussed a little on LKML a few weeks ago
and was seen as an acceptable fix.  The current way of obtaining the nodeid

 static inline int apicid_to_node(int logical_apicid)
 {
   return logical_apicid >> 5;
 }

is just not correct for all summit systems/bios.  Assuming the apicid
matches the Linux node number require a leap of faith that the bios mapped
out the apicids a set way.  Modern summit HW (IBM x460) does not layout its
bios in the manner for various reasons and is unable to boot i386 numa.

The best way to get the correct apicid to node information is from the SRAT
table during boot.  It lays out what apicid belongs to what node.  I use
this information to create a table for use at run time.

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-29 09:18:03 -07:00
Dave Jones
3ca113ea74 [PATCH] i386: don't taint UP K7's running SMP kernels.
We have a test that looks for invalid pairings of certain athlon/durons
that weren't designed for SMP, and taint accordingly (with 'S') if we find
such a configuration.  However, this test shouldn't fire if there's only
a single CPU present. It's perfectly valid for an SMP kernel to boot on UP
hardware for example.

AK: changed to num_possible_cpus()

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:34 +02:00
Rusty Russell
1a3f239ddf [PATCH] i386: Replace i386 open-coded cmdline parsing with
This patch replaces the open-coded early commandline parsing
throughout the i386 boot code with the generic mechanism (already used
by ppc, powerpc, ia64 and s390).  The code was inconsistent with
whether it deletes the option from the cmdline or not, meaning some of
these will get passed through the environment into init.

This transformation is mainly mechanical, but there are some notable
parts:

1) Grammar: s/linux never set's it up/linux never sets it up/

2) Remove hacked-in earlyprintk= option scanning.  When someone
   actually implements CONFIG_EARLY_PRINTK, then they can use
   early_param().
[AK: actually it is implemented, but I'm adding the early_param it in the next
x86-64 patch]

3) Move declaration of generic_apic_probe() from setup.c into asm/apic.h

4) Various parameters now moved into their appropriate files (thanks Andi).

5) All parse functions which examine arg need to check for NULL,
   except one where it has subtle humor value.

AK: readded acpi_sci handling which was completely dropped
AK: moved some more variables into acpi/boot.c

Cc: len.brown@intel.com

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:32 +02:00
Shaohua Li
4038f901cf [PATCH] i386/x86-64: Fix NMI watchdog suspend/resume
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline
APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume
method. APs should follow CPU hotplug code path.

And:

+From: Don Zickus <dzickus@redhat.com>

Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.

AK: I merged the two patches together

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-26 10:52:27 +02:00
keith mannthey
bfa0e9a07c [PATCH] i386: fix flat mode numa on a real numa system
If there is only 1 node in the system cpus should think they are apart of
some other node.

If cases where a real numa system boots the Flat numa option make sure the
cpus don't claim to be apart on a non-existent node.

Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-25 17:38:36 -07:00
Andrew Morton
c35a7261ea [PATCH] synchronize_tsc() fixes
- Move the tsc synchronisation variables into a struct, mark it __initdata

- local `realdelta' wants to be 64-bit

- Print the skew for negative skews, as well as for positive ones

- remove dead code

Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-31 13:28:38 -07:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Siddha, Suresh B
5c45bf279d [PATCH] sched: mc/smt power savings sched policy
sysfs entries 'sched_mc_power_savings' and 'sched_smt_power_savings' in
/sys/devices/system/cpu/ control the MC/SMT power savings policy for the
scheduler.

Based on the values (1-enable, 0-disable) for these controls, sched groups
cpu power will be determined for different domains.  When power savings
policy is enabled and under light load conditions, scheduler will minimize
the physical packages/cpu cores carrying the load and thus conserving
power(with a perf impact based on the workload characteristics...  see OLS
2005 CMP kernel scheduler paper for more details..)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Con Kolivas <kernel@kolivas.org>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:45 -07:00
Rohit Seth
4b89aff930 [PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86
Move the phys_core_id and cpu_core_id to cpuinfo_x86 structure.  Similar
patch for x86_64 is already accepted by Andi earlier this week.

[akpm@osdl.org: fix warning]
Signed-off-by: Rohit Seth <rohitseth@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Shaohua Li
bd9e0b74f5 [PATCH] x86: cpu_init(): avoid GFP_KERNEL allocation while atomic
The patch fixes two issues:

1.  cpu_init is called with interrupt disabled.  Allocating gdt table
   there isn't good at runtime.

2. gdt table page cause memory leak in CPU hotplug case.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-27 17:32:37 -07:00
Don Zickus
3e4ff11574 [PATCH] x86_64: nmi watchdog header cleanup
Misc header cleanup for nmi watchdog.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26 10:48:16 -07:00
Andreas Mohr
186989177e [PATCH] cpu_relax(): smpboot.c
Add cpu_relax() to various smpboot.c init loops.  cpu_relax() always implies a
barrier (according to Arjan), so remove those as well.

Signed-off-by: Andreas Mohr <andi@lisas.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-25 10:00:55 -07:00
Dave Jones
7f5910ecab [PATCH] Avoid printing pointless tsc skew msgs
These messages are kinda silly..

CPU#0 had 0 usecs TSC skew, fixed it up.
CPU#1 had 0 usecs TSC skew, fixed it up.

inspired from: http://bugzilla.kernel.org/attachment.cgi?id=7713&action=view

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-28 08:33:47 -07:00
Siddha, Suresh B
1e9f28fa1e [PATCH] sched: new sched domain for representing multi-core
Add a new sched domain for representing multi-core with shared caches
between cores.  Consider a dual package system, each package containing two
cores and with last level cache shared between cores with in a package.  If
there are two runnable processes, with this appended patch those two
processes will be scheduled on different packages.

On such systems, with this patch we have observed 8% perf improvement with
specJBB(2 warehouse) benchmark and 35% improvement with CFP2000 rate(with 2
users).

This new domain will come into play only on multi-core systems with shared
caches.  On other systems, this sched domain will be removed by domain
degeneration code.  This new domain can be also used for implementing power
savings policy (see OLS 2005 CMP kernel scheduler paper for more details..
I will post another patch for power savings policy soon)

Most of the arch/* file changes are for cpu_coregroup_map() implementation.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:43 -08:00
Ashok Raj
34f361ade2 [PATCH] Check if cpu can be onlined before calling smp_prepare_cpu()
- Moved check for online cpu out of smp_prepare_cpu()

- Moved default declaration of smp_prepare_cpu() to kernel/cpu.c

- Removed lock_cpu_hotplug() from smp_prepare_cpu() to around it, since
  its called from cpu_up() as well now.

- Removed clearing from cpu_present_map during cpu_offline as it breaks
  using cpu_up() directly during a subsequent online operation.

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-25 08:23:01 -08:00
Gerd Hoffmann
9a0b5817ad [PATCH] x86: SMP alternatives
Implement SMP alternatives, i.e.  switching at runtime between different
code versions for UP and SMP.  The code can patch both SMP->UP and UP->SMP.
The UP->SMP case is useful for CPU hotplug.

With CONFIG_CPU_HOTPLUG enabled the code switches to UP at boot time and
when the number of CPUs goes down to 1, and switches to SMP when the number
of CPUs goes up to 2.

Without CONFIG_CPU_HOTPLUG or on non-SMP-capable systems the code is
patched once at boot time (if needed) and the tables are released
afterwards.

The changes in detail:

  * The current alternatives bits are moved to a separate file,
    the SMP alternatives code is added there.

  * The patch adds some new elf sections to the kernel:
    .smp_altinstructions
	like .altinstructions, also contains a list
	of alt_instr structs.
    .smp_altinstr_replacement
	like .altinstr_replacement, but also has some space to
	save original instruction before replaving it.
    .smp_locks
	list of pointers to lock prefixes which can be nop'ed
	out on UP.
    The first two are used to replace more complex instruction
    sequences such as spinlocks and semaphores.  It would be possible
    to deal with the lock prefixes with that as well, but by handling
    them as special case the table sizes become much smaller.

 * The sections are page-aligned and padded up to page size, so they
   can be free if they are not needed.

 * Splitted the code to release init pages to a separate function and
   use it to release the elf sections if they are unused.

Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:04 -08:00