Commit graph

64 commits

Author SHA1 Message Date
Harvey Harrison
675a081360 x86: unify mmap_{32|64}.c
mmap_is_ia32 always true for X86_32, or while emulating IA32 on X86_64

Randomization not supported on X86_32 in legacy layout.  Both layouts allow
randomization on X86_64.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:10 +01:00
Jeremy Fitzhardinge
f3f20de87c x86: clean up mm/init_32.c
Some code reformatting in init_32.c.  No functional change.

Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:09 +01:00
Jeremy Fitzhardinge
27ec161f93 x86: kill mk_pte_huge
It only has a single use, which can be trivially replaced.

Signed-off-by: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:09 +01:00
Joerg Roedel
929fd58913 x86: some whitespace cleanups in paging code
This patch does some whitespace cleanups in the paging code to fix some
checkpatch.pl warnings of my formerly merged cleanup patches.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:08 +01:00
Andrew Morton
954683a2c1 x86: PIE executable randomization, uninlining
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:07 +01:00
Andrew Morton
bb1ad8205b x86: PIE executable randomization, checkpatch fixes
#39: FILE: arch/ia64/ia32/binfmt_elf32.c:229:
+elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type, unsigned long unused)

WARNING: no space between function name and open parenthesis '('
#39: FILE: arch/ia64/ia32/binfmt_elf32.c:229:
+elf32_map (struct file *filep, unsigned long addr, struct elf_phdr *eppnt, int prot, int type, unsigned long unused)

WARNING: line over 80 characters
#67: FILE: arch/x86/kernel/sys_x86_64.c:80:
+			new_begin = randomize_range(*begin, *begin + 0x02000000, 0);

ERROR: use tabs not spaces
#110: FILE: arch/x86/kernel/sys_x86_64.c:185:
+ ^I        mm->cached_hole_size = 0;$

ERROR: use tabs not spaces
#111: FILE: arch/x86/kernel/sys_x86_64.c:186:
+ ^I^Imm->free_area_cache = mm->mmap_base;$

ERROR: use tabs not spaces
#112: FILE: arch/x86/kernel/sys_x86_64.c:187:
+ ^I}$

ERROR: use tabs not spaces
#141: FILE: arch/x86/kernel/sys_x86_64.c:216:
+ ^I^I/* remember the largest hole we saw so far */$

ERROR: use tabs not spaces
#142: FILE: arch/x86/kernel/sys_x86_64.c:217:
+ ^I^Iif (addr + mm->cached_hole_size < vma->vm_start)$

ERROR: use tabs not spaces
#143: FILE: arch/x86/kernel/sys_x86_64.c:218:
+ ^I^I        mm->cached_hole_size = vma->vm_start - addr;$

ERROR: use tabs not spaces
#157: FILE: arch/x86/kernel/sys_x86_64.c:232:
+  ^Imm->free_area_cache = TASK_UNMAPPED_BASE;$

ERROR: need a space before the open parenthesis '('
#291: FILE: arch/x86/mm/mmap_64.c:101:
+	} else if(mmap_is_legacy()) {

WARNING: braces {} are not necessary for single statement blocks
#302: FILE: arch/x86/mm/mmap_64.c:112:
+	if (current->flags & PF_RANDOMIZE) {
+		mm->mmap_base += ((long)rnd) << PAGE_SHIFT;
+	}

WARNING: line over 80 characters
#314: FILE: fs/binfmt_elf.c:48:
+static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int, unsigned long);

WARNING: no space between function name and open parenthesis '('
#314: FILE: fs/binfmt_elf.c:48:
+static unsigned long elf_map (struct file *, unsigned long, struct elf_phdr *, int, int, unsigned long);

WARNING: line over 80 characters
#429: FILE: fs/binfmt_elf.c:438:
+					   eppnt, elf_prot, elf_type, total_size);

ERROR: need space after that ',' (ctx:VxV)
#480: FILE: fs/binfmt_elf.c:939:
+				elf_prot, elf_flags,0);
 				                   ^

total: 9 errors, 7 warnings, 461 lines checked
Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:07 +01:00
Jiri Kosina
cc503c1b43 x86: PIE executable randomization
main executable of (specially compiled/linked -pie/-fpie) ET_DYN binaries
onto a random address (in cases in which mmap() is allowed to perform a
randomization).

The code has been extraced from Ingo's exec-shield patch
http://people.redhat.com/mingo/exec-shield/

[akpm@linux-foundation.org: fix used-uninitialsied warning]
[kamezawa.hiroyu@jp.fujitsu.com: fixed ia32 ELF on x86_64 handling]

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:07 +01:00
Joerg Roedel
40842bf50e x86: use __PAGE_KERNEL* instead of _KERNPG_TABLE
This minor cleanup replaces _KERNPG_TABLE with the __PAGE_KERNEL* for 2MB PTEs
in the 64-bit memory initialization code. The __PAGE_KERNEL* defines are more
appropriate for PTEs.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:31:02 +01:00
H. Peter Anvin
65ea5b0349 x86: rename the struct pt_regs members for 32/64-bit consistency
We have a lot of code which differs only by the naming of specific
members of structures that contain registers.  In order to enable
additional unifications, this patch drops the e- or r- size prefix
from the register names in struct pt_regs, and drops the x- prefixes
for segment registers on the 32-bit side.

This patch also performs the equivalent renames in some additional
places that might be candidates for unification in the future.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:56 +01:00
Jeremy Fitzhardinge
5548fecdff x86: clean up bitops-related warnings
Add casts to appropriate places to silence spurious bitops warnings.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:55 +01:00
Andrew Morton
022eb43419 x86: kmap_atomic() debugging
[ mingo@elte.hu: cleanups and made dependent on CONFIG_DEBUG_HIGHMEM.

  this caught a handful of bugs already, so lets apply it. If it gets
  things wrong we'll disable it. ]

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Christoph Lameter
b263295dbf x86: 64-bit, make sparsemem vmemmap the only memory model
Use sparsemem as the only memory model for UP, SMP and NUMA.  Measurements
indicate that DISCONTIGMEM has a higher overhead than sparsemem.  And
FLATMEMs benefits are minimal.  So I think its best to simply standardize
on sparsemem.

Results of page allocator tests (test can be had via git from slab git
tree branch tests)

Measurements in cycle counts. 1000 allocations were performed and then the
average cycle count was calculated.

Order	FlatMem	Discontig	SparseMem
0	  639	  665		  641
1	  567	  647		  593
2	  679	  774		  692
3	  763	  967		  781
4	  961	 1501		  962
5	 1356	 2344		 1392
6	 2224	 3982		 2336
7	 4869	 7225		 5074
8	12500	14048		12732
9	27926	28223		28165
10	58578	58714		58682

(Note that FlatMem is an SMP config and the rest NUMA configurations)

Memory use:

SMP Sparsemem
-------------

Kernel size:

   text    data     bss     dec     hex filename
3849268  397739 1264856 5511863  541ab7 vmlinux

             total       used       free     shared    buffers     cached
Mem:       8242252      41164    8201088          0        352      11512
-/+ buffers/cache:      29300    8212952
Swap:      9775512          0    9775512

SMP Flatmem
-----------

Kernel size:

   text    data     bss     dec     hex filename
3844612  397739 1264536 5506887  540747 vmlinux

So 4.5k growth in text size vs. FLATMEM.

             total       used       free     shared    buffers     cached
Mem:       8244052      40544    8203508          0        352      11484
-/+ buffers/cache:      28708    8215344

2k growth in overall memory use after boot.

NUMA discontig:

   text    data     bss     dec     hex filename
3888124  470659 1276504 5635287  55fcd7 vmlinux

             total       used       free     shared    buffers     cached
Mem:       8256256      56908    8199348          0        352      11496
-/+ buffers/cache:      45060    8211196
Swap:      9775512          0    9775512

NUMA sparse:

   text    data     bss     dec     hex filename
3896428  470659 1276824 5643911  561e87 vmlinux

8k text growth. Given that we fully inline virt_to_page and friends now
that is rather good.

             total       used       free     shared    buffers     cached
Mem:       8264720      57240    8207480          0        352      11516
-/+ buffers/cache:      45372    8219348
Swap:      9775512          0    9775512

The total available memory is increased by 8k.

This patch makes sparsemem the default and removes discontig and
flatmem support from x86.

[ akpm@linux-foundation.org: allnoconfig build fix ]

Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Yinghai Lu
3f6e5a12b8 x86: use core id bits for apicid_to_node initialization
We shoud use core id bits instead of max cores, in case later with AMD
downcores Quad core Opteron.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:39 +01:00
Thomas Gleixner
7462894a7c x86: fixup numa 64 namespace
Using a variable name, which is the same as a macro name is not
really smart. Change the variable names and fixup all users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:38 +01:00
Thomas Gleixner
e3cfe529dd x86: cleanup numa_64.c
Clean it up before applying more patches.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:37 +01:00
Thomas Gleixner
0b9c99b6f2 x86: cleanup tlbflush.h variants
Bring the tlbflush.h variants into sync to prepare merging and
paravirt support.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:35 +01:00
Thomas Gleixner
3abf024d2a x86: nuke a ton of unused exports
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:28 +01:00
Thomas Gleixner
4ec08da02f x86: remove the duplicated arch/x86/ia32/mmap32.c
Use mmap_32.c in arch/x86/mm instead

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:26 +01:00
Thomas Gleixner
f8eeae6821 x86: clean up arch/x86/mm/mmap_32/64.c
White space and coding style clenaup.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:25 +01:00
Thomas Gleixner
aaa64e04f9 x86: move numa related declarations
More stuff shuffeled to the correct place

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:17 +01:00
Thomas Gleixner
718fc13b46 x86: move debug related declarations to kdebug.h
Move them and fixup some users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:17 +01:00
Thomas Gleixner
c9ff03428f x86: move k8 related declarations
Move k8 related declarations to k8.h and fix numa_64.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-01-30 13:30:16 +01:00
Ingo Molnar
23be8c7ddf x86: fix boot crash on HIGHMEM4G && SPARSEMEM
Denys Fedoryshchenko reported a bootup crash when he upgraded
his system from 3GB to 4GB RAM:

   http://lkml.org/lkml/2008/1/7/9

the bug is due to HIGHMEM4G && SPARSEMEM kernels making pfn_to_page()
to return an invalid pointer when the pfn is in a memory hole. The
256 MB PCI aperture at the end of RAM was not mapped by sparsemem,
and hence the pfn was not valid. But set_highmem_pages_init() iterated
this range without checking the pfn's validity first.

this bug was probably present in the sparsemem code ever since sparsemem
has been introduced in v2.6.13. It was masked due to HIGHMEM64G using
larger memory regions in sparsemem_32.h:

 #ifdef CONFIG_X86_PAE
 #define SECTION_SIZE_BITS       30
 #define MAX_PHYSADDR_BITS       36
 #define MAX_PHYSMEM_BITS        36
 #else
 #define SECTION_SIZE_BITS       26
 #define MAX_PHYSADDR_BITS       32
 #define MAX_PHYSMEM_BITS        32
 #endif

which creates 1GB sparsemem regions instead of 64MB sparsemem regions.
So in practice we only ever created true sparsemem holes on x86 with
HIGHMEM4G - but that was rarely used by distros.

( btw., we could probably save 2MB of mem_map[]s on X86_PAE if we reduced
  the sparsemem region size to 256 MB. )

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-15 16:44:37 +01:00
KAMEZAWA Hiroyuki
b6fd6ecb83 memory hotplug x86_64: fix section mismatch in init_memory_mapping()
Changes __meminit to __init_refok.

WARNING: vmlinux.o(.text+0x1d07c): Section mismatch: reference to
.init.text:find_e820_area (between 'init_memory_mapping' and 'arch_add_memory')

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-29 09:24:54 -08:00
Linus Torvalds
a7e1e001f4 Merge branch 'v2.6.24-rc1-lockdep' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-lockdep
* 'v2.6.24-rc1-lockdep' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-lockdep:
  lockdep: fix a typo in the __lock_acquire comment
  sched: fix unconditional irq lock
  lockdep: fixup irq tracing
2007-11-03 12:42:52 -07:00
Adrian Bunk
fb8c177fe0 x86: mm/discontig_32.c: make code static
node0_bdata and paddr_to_nid() can become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-30 00:22:22 +01:00
Linus Torvalds
6a22c57b8d Revert "x86_64: allocate sparsemem memmap above 4G"
This reverts commit 2e1c49db4c.

First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jobes.  So the
commit will likely be reverted in the 2.6.23 stable kernels.

Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Martin Ebourne <fedora@ebourne.me.uk>
Cc: Zou Nan hai <nanhai.zou@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-29 14:05:37 -07:00
Peter Zijlstra
143a5d325d lockdep: fixup irq tracing
Ensure we fixup the IRQ state before we hit any locking code.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-25 14:01:10 +02:00
Alexey Dobriyan
eec407c9ac x86: fix bogus KERN_ALERT on oops
fix this:

printing eip: f881b9f3 *pdpt = 0000000000003001 <1>*pde = 000000000480a067 *pte = 0000000000000000
						^^^

[ mingo: added KERN_CONT as suggested by Pekka Enberg ]

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-24 12:58:02 +02:00
Keshavamurthy, Anil S
a9c55b3ba8 Intel IOMMU: clflush_cache_range now takes size param
Introduce the size param for clflush_cache_range().

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Greg KH <greg@kroah.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-22 08:13:18 -07:00
Linus Torvalds
c00046c279 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits)
  fix do_sys_open() prototype
  sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake
  Documentation: Fix typo in SubmitChecklist.
  Typo: depricated -> deprecated
  Add missing profile=kvm option to Documentation/kernel-parameters.txt
  fix typo about TBI in e1000 comment
  proc.txt: Add /proc/stat field
  small documentation fixes
  Fix compiler warning in smount example program from sharedsubtree.txt
  docs/sysfs: add missing word to sysfs attribute explanation
  documentation/ext3: grammar fixes
  Documentation/java.txt: typo and grammar fixes
  Documentation/filesystems/vfs.txt: typo fix
  include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros
  trivial copy_data_pages() tidy up
  Fix typo in arch/x86/kernel/tsc_32.c
  file link fix for Pegasus USB net driver help
  remove unused return within void return function
  Typo fixes retrun -> return
  x86 hpet.h: remove broken links
  ...
2007-10-19 20:36:17 -07:00
Simon Arlott
676b1855de spelling fixes: arch/x86_64/
Spelling fixes in arch/x86_64/.

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-20 01:25:36 +02:00
Simon Arlott
27b46d7661 spelling fixes: arch/i386/
Spelling fixes in arch/i386/.

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-20 01:13:56 +02:00
Linus Torvalds
60812a4a99 Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86
* ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (33 commits)
  x86: convert cpuinfo_x86 array to a per_cpu array
  x86: introduce frame_pointer() and stack_pointer()
  x86 & generic: change to __builtin_prefetch()
  i386: do not BUG_ON() when MSR is unknown
  x86: acpi use cpu_physical_id
  x86: convert cpu_llc_id to be a per cpu variable
  x86: convert cpu_to_apicid to be a per cpu variable
  i386: introduce "used_vectors" bitmap which can be used to reserve vectors.
  x86: use raw locks during oopses
  x86: honor _PAGE_PSE bit on page walks
  i386: do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.
  x86: implement missing x86_64 function smp_call_function_mask()
  x86: use descriptor's functions instead of inline assembly
  i386: consolidate show_regs and show_registers for i386
  i386: make callgraph use dump_trace() on i386/x86_64
  x86: enable iommu_merge by default
  i386: i386 add AMD64 Barcelona PMU MSR definitions to msr.h
  x86: Unify i386 and x86-64 early quirks
  x86: enable HPET on ICH3 and ICH4
  x86: force enable HPET on VT8235/8237 chipsets
  ...

Manually fix trivial conflict with task pid container helper changes in
arch/x86/kernel/process_32.c
2007-10-19 15:06:00 -07:00
Linus Torvalds
26790656d7 Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86
* ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86:
  x86: fix global_flush_tlb() bug
2007-10-19 11:57:05 -07:00
Alexey Dobriyan
19c5870c0e Use helpers to obtain task pid in printks (arch code)
One of the easiest things to isolate is the pid printed in kernel log.
There was a patch, that made this for arch-independent code, this one makes
so for arch/xxx files.

It took some time to cross-compile it, but hopefully these are all the
printks in arch code.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:43 -07:00
Serge E. Hallyn
b460cbc581 pid namespaces: define is_global_init() and is_container_init()
is_init() is an ambiguous name for the pid==1 check.  Split it into
is_global_init() and is_container_init().

A cgroup init has it's tsk->pid == 1.

A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns.  But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.

Changelog:

	2.6.22-rc4-mm2-pidns1:
	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
	  global init (/sbin/init) process. This would improve performance
	  and remove dependence on the task_pid().

	2.6.21-mm2-pidns2:

	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
	  This way, we kill only the cgroup if the cgroup's init has a
	  bug rather than force a kernel panic.

[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Herbert Poetzel <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:37 -07:00
Mike Travis
71fff5e6ca x86: convert cpu_to_apicid to be a per cpu variable
This patch converts the x86_cpu_to_apicid array to be a per cpu
variable. This saves sizeof(apicid) * NR unused cpus.  Access is mostly
from startup and CPU HOTPLUG functions.

MP_processor_info() is one of the functions that require access to the
x86_cpu_to_apicid array before the per_cpu data area is setup.  For this
case, a pointer to the __initdata array is initialized in setup_arch()
and removed in smp_prepare_cpus() after the per_cpu data area is
initialized.

A second change is included to change the initial array value of ARCH
i386 from 0xff to BAD_APICID to be consistent with ARCH x86_64.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Jan Beulich
b1992df3f0 x86: honor _PAGE_PSE bit on page walks
[ tglx: arch/x86 adaptation ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Andi Kleen
b3f5dde15e x86: don't zero pad addresses in segfault message
don't zero pad addresses in segfault message. Matches the other trap
messages. This leaves some more space for the new file name.

[ mingo: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Andi Kleen
af93ebc0b3 x86: remove page_fault_trace
Old debugging code that is not really needed anymore. If someone
wants it it would be better replaced with a systemtap script or
kprobe.

This avoids a potential cache miss during page fault processing.

[ mingo: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Ingo Molnar
9a24d04a3c x86: fix global_flush_tlb() bug
While we were reviewing pageattr_32/64.c for unification,
Thomas Gleixner noticed the following serious SMP bug in
global_flush_tlb():

	down_read(&init_mm.mmap_sem);
	list_replace_init(&deferred_pages, &l);
	up_read(&init_mm.mmap_sem);

this is SMP-unsafe because list_replace_init() done on two CPUs in
parallel can corrupt the list.

This bug has been introduced about a year ago in the 64-bit tree:

       commit ea7322decb
       Author: Andi Kleen <ak@suse.de>
       Date:   Thu Dec 7 02:14:05 2006 +0100

       [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr

                down_read(&init_mm.mmap_sem);
        -       dpage = xchg(&deferred_pages, NULL);
        +       list_replace_init(&deferred_pages, &l);
                up_read(&init_mm.mmap_sem);

the xchg() based version was SMP-safe, but list_replace_init() is not.
So this "cleanup" introduced a nasty bug.

why this bug never become prominent is a mystery - it can probably be
explained with the (still) relative obscurity of the x86_64 architecture.

the safe fix for now is to write-lock init_mm.mmap_sem.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 12:19:26 +02:00
Linus Torvalds
d20ead9e86 Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86
* ssh://master.kernel.org/pub/scm/linux/kernel/git/tglx/linux-2.6-x86: (114 commits)
  x86: delete vsyscall files during make clean
  kbuild: fix typo SRCARCH in find_sources
  x86: fix kernel rebuild due to vsyscall fallout
  .gitignore update for x86 arch
  x86: unify include/asm/debugreg_32/64.h
  x86: unify include/asm/unwind_32/64.h
  x86: unify include/asm/types_32/64.h
  x86: unify include/asm/tlb_32/64.h
  x86: unify include/asm/siginfo_32/64.h
  x86: unify include/asm/bug_32/64.h
  x86: unify include/asm/mman_32/64.h
  x86: unify include/asm/agp_32/64.h
  x86: unify include/asm/kdebug_32/64.h
  x86: unify include/asm/ioctls_32/64.h
  x86: unify include/asm/floppy_32/64.h
  x86: apply missing DMA/OOM prevention to floppy_32.h
  x86: unify include/asm/cache_32/64.h
  x86: unify include/asm/cache_32/64.h
  x86: unify include/asm/dmi_32/64.h
  x86: unify include/asm/delay_32/64.h
  ...
2007-10-17 13:13:16 -07:00
Luiz Fernando N. Capitulino
de8aacbe6a x86: convert mm_context_t semaphore to a mutex
convert mm_context_t semaphore to a mutex.

Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:17:05 +02:00
Pavel Emelyanov
9aa8d7195a i386: clean up oops/bug reports
Typically the oops first lines look like this:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c049dfbd
*pde = 00000000
Oops: 0002 [#1]
PREEMPT SMP
...

Such output is gained with some ugly if (!nl) printk("\n"); code and
besides being a waste of lines, this is also annoying to read. The
following output looks better (and it is how it looks on x86_64):

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip: c049dfbd *pde = 00000000
Oops: 0002 [#1] PREEMPT SMP
...

[ tglx: arch/x86 adaptation ]

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:16:52 +02:00
Mike Travis
98c9e27a56 x86: fix cpu_to_node references
In x86_64 and i386 architectures most arrays that are sized using
NR_CPUS lay in local memory on node 0.  Not only will most (99%?) of the
systems not use all the slots in these arrays, particularly when NR_CPUS
is increased to accommodate future very high cpu count systems, but a
number of cache lines are passed unnecessarily on the system bus when
these arrays are referenced by cpus on other nodes.

Typically, the values in these arrays are referenced by the cpu
accessing it's own values, though when passing IPI interrupts, the cpu
does access the data relevant to the targeted cpu/node.  Of course, if
the referencing cpu is not on node 0, then the reference will still
require cross node exchanges of cache lines.  A common use of this is
for an interrupt service routine to pass the interrupt to other cpus
local to that node.

Ideally, all the elements in these arrays should be moved to the per_cpu
data area.  In some cases (such as x86_cpu_to_apicid) the array is
referenced before the per_cpu data areas are setup.  In this case, a
static array is declared in the __initdata area and initialized by the
booting cpu (BSP).  The values are then moved to the per_cpu area after
it is initialized and the original static array is freed with the rest
of the __initdata.

This patch:

Fix four instances where cpu_to_node is referenced by array instead of
via the cpu_to_node macro.  This is preparation to moving it to the
per_cpu data area.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:16:37 +02:00
H. Peter Anvin
6619a8fb59 x86: Create clflush() inline, remove hardcoded wbinvd
Create an inline function for clflush(), with the proper arguments,
and use it instead of hard-coding the instruction.

This also removes one instance of hard-coded wbinvd, based on a patch
by Bauder de Oliveira Costa.

[ tglx: arch/x86 adaptation ]

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:16:12 +02:00
Adrian Bunk
59659f14b6 i386: make some variables static
This patch makes some needlessly global variables static.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:56 +02:00
Yoann Padioleau
83e83d546c x86: 0 -> NULL, for arch/x86_64
When comparing a pointer, it's clearer to compare it to NULL than to 0.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Yoann Padioleau <padator@wanadoo.fr>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: ak@suse.de
Cc: discuss@x86-64.org
Cc: akpm@linux-foundation.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:52 +02:00
Huang, Ying
84e0fdb175 x86: NX bit handling in change_page_attr()
This patch fixes a bug of change_page_attr/change_page_attr_addr on
Intel x86_64 CPUs.  After changing page attribute to be executable with
these functions, the page remains un-executable on Intel x86_64 CPU.
Because on Intel x86_64 CPU, only if the "NX" bits of all four level
page tables are cleared, the corresponding page is executable (refer to
section 4.13.2 of Intel 64 and IA-32 Architectures Software Developer's
Manual).  So, the bug is fixed through clearing the "NX" bit of PMD when
splitting the huge PMD.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:43 +02:00