Original code doesn't write back to CCR4 register. This patch reflects a
value of a register.
Cc: Jordan Crouse <jordan.crouse@amd.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
I hope to support "classic" MediaGXm in kernel.
The DIR1 register of MediaGXm( or Geode) shows the following values for
identify CPU. For example, My MediaGXm shows 0x42.
We can read National Semiconductor's datasheet without any NDAs.
http://www.national.com/pf/GX/GXLV.html
from datasheets:
DIR1
0x30 - 0x33 GXm rev. 1.0 - 2.3
0x34 - 0x4f GXm rev. 2.4 - 3.x
0x5x GXm rev. 5.0 - 5.4
0x6x GXLV
0x7x (unknow)
0x8x Gx1
In nsc driver of X, accept 0x30 through 0x82. What will 0x7x mean?
Cc: Jordan Crouse <jordan.crouse@amd.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
All Transmeta CPUs ever produced have constant-rate TSCs.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mtrr: fix size_or_mask and size_and_mask
This fixes two bugs in /proc/mtrr interface:
o If physical address size crosses the 44 bit boundary
size_or_mask is evaluated wrong.
o size_and_mask limits width of physical base
address for an MTRR to be less than 44 bits.
TBD: later patch had one more change, but I think that was bogus.
TBD: need to double check
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Add a notifier mechanism to the low level idle loop. You can register a
callback function which gets invoked on entry and exit from the low level idle
loop. The low level idle loop is defined as the polling loop, low-power call,
or the mwait instruction. Interrupts processed by the idle thread are not
considered part of the low level loop.
The notifier can be used to measure precisely how much is spent in useless
execution (or low power mode). The perfmon subsystem uses it to turn on/off
monitoring.
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Every file should include the headers containing the prototypes for
it's global functions.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Convert the PDA code to use %fs rather than %gs as the segment for
per-processor data. This is because some processors show a small but
measurable performance gain for reloading a NULL segment selector (as %fs
generally is in user-space) versus a non-NULL one (as %gs generally is).
On modern processors the difference is very small, perhaps undetectable.
Some old AMD "K6 3D+" processors are noticably slower when %fs is used
rather than %gs; I have no idea why this might be, but I think they're
sufficiently rare that it doesn't matter much.
This patch also fixes the math emulator, which had not been adjusted to
match the changed struct pt_regs.
[frederik.deweerdt@gmail.com: fixit with gdb]
[mingo@elte.hu: Fix KVM too]
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Ian Campbell <Ian.Campbell@XenSource.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Zachary Amsden <zach@vmware.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
[akpm@osdl.org: sparc64 fix]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This change should make Longhaul more compatible with
both ver. 2 and Powersaver processors. Voltage transitions
will be done before or after frequency transition. That depends
on direction of change. I don't know how to force conservative
governor when voltage scaling is enabled, so there is only
a warning for user. Minimal voltage is calculated in different
way now because in this way more power is saved at lower
multipliers.
Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
This is driver for Enhanced Powersaver which is present in VIA C7
processors. Beta tested by Jorgen (jorgen (at) greven dot dk).
Thanks! Based on documentation provided by Dave Jones (Thanks!)
and C7 Eden datasheet available from www.via.com.tw. Looks like all
these C7 Eden CPU's don't have P-states in BIOS. I know that 2
p-states is low, but Jorgen finds it usefull anyway because board
is passive cooled.
There are 3 different types of C7 processors (called brands):
0. C7-M - these processors can set any maultiplier between min and
max, any voltage between min and max.
1. C7 - only min and max states are supported. Voltage is different
for min and max states.
2. Eden - only min and max states are supported. Looks like this
brand can only change multiplier. Voltage seems to be the same for
min and max frequency.
Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
I don't know why it is working and how, but it is working. On my
Epia transition time is by default set to 100us. I'm changing it to
200us. After that I can change frequency from min (x4.0) to max (x7.5)
without lockup. Many times.
There is a paranoid check at a beginning of a patch. Probably dead
code, but I don't have better ideas for CL10000 case at the moment.
Only way to to detect broken chip seems to be looking in log for
spurious interrupts.
Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
This is bug reported by John-Marc Chandonia:
> Detected 1002.292 MHz processor.
> longhaul: VIA C3 'Nehemiah B' [C5N] CPU detected. Powersaver supported.
> longhaul: Using throttling support.
> longhaul: Invalid (reserved) FSB!
FSB is correcly guessed for 999.554 MHz CPU.
To fix this error:
- ROUNDING should be range, not mask - at it's current value it is +7 -8,
- more precise calculations inside guess_fsb - 7.5x133MHz is 1000MHz now.
Signed-off-by: Rafal Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
Now there is no need to depend on -1 in Nehemiah tables. After
previous change code is eliminating multipliers lower then 5.0
by minmult for Nehemiah A and B.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
Looks like some time ago I introduced a bug to Longhaul.
I had report that 9x133Mhz CPU is seen as 5x133MHz. So I
changed multipliers table. That was a mistake. According to
documentation table was correct. So only way to avoid 5 or 9
dilema is not use MaxMHzBR for PowerSaver 1.0. One code that
works on all processors. To do it I need also separate flag for
Nehemiah C (min = x4.0) and Nehemiah (min = x5.0).
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
This fixes the cpuinfo_cur_freq value by using the correct
find_khz_freq_from_fiddid() when the CPU uses hardware p-states.
Signed-off-by: Joachim Deguara <joachim.deguara@amd.com>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
There is no need to have this option in Longhaul anymore.
It was for laptop with CLE266 chipset in times, when only
ACPI C3 was used to switch frequency. Now we have native
support not only for CLE266, but CN400 too. Would be good
to have support for PN266, but I can't find datasheet for it.
Looks like BIOS for CPU's faster then 1GHz don't support
ACPI C2 nor C3.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
This reverts commit e4f0ae0ea6.
It's not wrong, but it's not right either, and everybody seems to agree
that the right fix is probably to do the ccr3 write after the ccr4 one
(and that we also should clean it up a bit). And after that we need to
really validate that all the bits that we write to ccr4 actually do
work.
The old 2.6.19 code was insane, and basically didn't change ccr4 at all
(even though it certainly looks like it was the *intent* to do so). So
let's revert the change that may fix things, just because it's not what
was actually ever tested when the code was written, even if it _was_ the
intent.
There's a discussion on http://lkml.org/lkml/2007/1/9/63 that was
started by the patch that now gets reverted, and that discussion may
well contain the proper long-term fix.
Suggested-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This workaround unnecessarily cripples functionality to work
around an errata that doesn't seem possible to hit due to
us using the automatic clock throttling in the p4 mcheck code.
See http://lkml.org/lkml/2006/10/28/148 for complete reasoning
and lack of disconsent.
Signed-off-by: Dave Jones <davej@redhat.com>
The current PDA code, which went in in post 2.6.19 has a flaw in that it
doesn't correctly cycle the GDT and %GS segment through the boot PDA,
the CPU PDA and finally the per-cpu PDA.
The bug generally doesn't show up if the boot CPU id is zero, but
everything falls apart for a non zero boot CPU id. The basically kills
voyager which is perfectly capable of doing non zero CPU id boots, so
voyager currently won't boot without this.
The fix is to be careful and actually do the GDT setups correctly.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We write back the wrong register when configuring the Geode processor.
Instead of storing to CCR4, it stores to CCR3.
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
o MODPOST generates warning for i386 if kernel is compiled with
CONFIG_RELOCATABLE=y
WARNING: vmlinux - Section mismatch: reference to .init.data: from .data between 'this_cpu' (at offset 0xc05194d0) and 'cpuinfo_op'
o this_cpu pointer should be of type __cpuinitdata.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
cmd.val was used uninitialized on the line below.
Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
This is patch that solves Ebox mini PC issue and make
FSB code more specification compilant. At start guess_fsb
function is guessing 200MHz FSB too. It is better to
make it in this way because, thanks to this function, driver
will fail for bogus FSB values caused by bogus multiplier
value. For PowerSaver processors we can't depend on Max /
MinMHzFSB because these values are only used for
PowerSaver 2.0 and 3.0. Most processors on which Longhaul
is used are PowerSaver 1.0 only. I'm changing code for older
CPU's too, but not so much as previously, and this code was
already used for Ezra. Using MinMHzBR for Ezra-T is outside
spec. It is for voltage scaling purpose and don't have to
be equal to minmult (but it is). Same for Nehemiah (it
isn't for sure). Added mult - current multiplier value.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
ACPI PM2 register was fallback for "Longhaul ver. 1" CPU's.
My assumption that this register isn't present at
"PowerSaver" motherboards is so far true, but current code
will not work correctly in other case. There are three possible
supports: ACPI C3, PM2 and northbridge. That was my assumption
that ACPI C3 and northbridge is for PS and northbridge and PM2
is for V1. In current code we can only check if it is ACPI
support or not by port22_en. So remove port22_en and add
longhaul_flags. If USE_ACPI_C3 and USE_NORTHBRIDGE are both
clear then it means ACPI PM2 support. Also change order of
support probe from ACPI C3, PM2, northbridge to ACPI C3,
northbridge, ACPI PM2. Paranoid protection against port 0x22
cast as ACPI PM2 register. Bit 1 clear in such case - lockup
on AGP DMA. And obvious (now) fixup for do_powersaver. Use
cx->address only for ACPI C3 ("PowerSaver" processor using
PM2 support).
Signed-off-by: Rafa¿ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
A space and a bracket are missing (and indentation is wrong).
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Fixes the oops in cpufreq_stats with acpi_cpufreq driver. The issue was
that the frequency was reported as 0 in acpi-cpufreq.c. The bug is due to
different indicies for freq_table and ACPI perf table.
Also adds a check in cpufreq_stats to check for error return from
freq_table_get_index() and avoid using the error return value.
Patch fixes the issue reported at
http://www.ussg.iu.edu/hypermail/linux/kernel/0611.2/0629.html
and also other similar issue here
http://bugme.osdl.org/show_bug.cgi?id=7383 comment 53
Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Check the correct variable and set policy->cur upon acpi-cpufreq
initialization to allow the userspace governor to be used as default.
Signed-off-by: Mattia Dongili <malattia@linux.it>
Acked-by: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Support for CN400 northbridge when ACPI C3 isn't available.
Tested on Epia SP13000. Thanks to Robert for testing it.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
On board of Epia SP13000 is 10x133Mhz VIA Nehemiah. It is reported
as 10x200MHz. This patch is fixing this issue.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
Support for Core CPUs was broken in two ways in speedstep-lib: for x86_64,
we missed a MSR definition; for both x86_64 and i386, the FSB calculation
was wrong by four (it's a quad-pumped bus). Also increase the accuracy
of the calculation.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
Fix the bug in duplicate states elimination in acpi-cpufreq.
Bug: Due to duplicate state elimiation in the loop earlier, the number
of valid_states can be less than perf->state_count, in which case
freq_table was ending up with some garbage/uninitialized entries
in the table.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
From: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
On some systems there could be bits set in the upper half of
the control value provided by the _PSS object. These bits are
only relevant for cpufreq drivers that use IO ports which are not
currently supported by the speedstep-centrino driver. The current
MSR oriented code assumes that upper bits are not set and thus
fails to work correctly when they are. e.g. the control and status
value equality check failed on the IBM x3650 even though the ACPI
spec allows inequality.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Signed-off-by: Dave Jones <davej@redhat.com>
We don't need a temporary variable to get the PCI revision ID.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Dave Jones <davej@redhat.com>
There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn,
prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus
generating compiler warnings of unused symbols, hence forcing people to add
#ifdefs.
the compiler can skip truly unused functions just fine:
text data bss dec hex filename
1624412 728710 3674856 6027978 5bfaca vmlinux.before
1624412 728710 3674856 6027978 5bfaca vmlinux.after
[akpm@osdl.org: topology.c fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We do the exact same printk about a dozen lines above
with no intermediate printk's.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Here is a small patch for i386 which adds a cpufeature flag and
detection code for Intel's Branch Trace Store (BTS) feature. This
feature can be found on Intel P4 and Core 2 processors among others.
It can also be used by perfmon.
changelog:
- add CPU_FEATURE_BTS
- add Branch Trace Store detection
signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Make the needlessly global alloc_gdt() static.
(against) pda-percpu-init
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Until not so long ago, there were system log messages pointing to
inconsistent MTRR setup of the video frame buffer caused by the way vesafb
and X worked. While vesafb was fixed meanwhile, I believe fixing it there
only hides a shortcoming in the MTRR code itself, in that that code is not
symmetric with respect to the ordering of attempts to set up two (or more)
regions where one contains the other. In the current shape, it permits
only setting up sub-regions of pre-exisiting ones. The patch below makes
this symmetric.
While working on that I noticed a few more inconsistencies in that code,
namely
- use of 'unsigned int' for sizes in many, but not all places (the patch
is converting this to use 'unsigned long' everywhere, which specifically
might be necessary for x86-64 once a processor supporting more than 44
physical address bits would become available)
- the code to correct inconsistent settings during secondary processor
startup tried (if necessary) to correct, among other things, the value
in IA32_MTRR_DEF_TYPE, however the newly computed value would never get
used (i.e. stored in the respective MSR)
- the generic range validation code checked that the end of the
to-be-added range would be above 1MB; the value checked should have been
the start of the range
- when contained regions are detected, previously this was allowed only
when the old region was uncacheable; this can be symmetric (i.e. the new
region can also be uncacheable) and even further as per Intel's
documentation write-trough and write-back for either region is also
compatible with the respective opposite in the other
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Allow selected bug checks to be skipped by paravirt kernels. The two most
important are the F00F workaround (which is either done by the hypervisor,
or not required), and the 'hlt' instruction check, which can break under
some hypervisors.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Both lhype and Xen want to call the core of the x86 cpu detect code before
calling start_kernel.
(extracted from larger patch)
AK: folded in start_kernel header patch
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Use the pcurrent field in the PDA to implement the "current" macro. This ends
up compiling down to a single instruction to get the current task.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Use the cpu_number in the PDA to implement raw_smp_processor_id. This is a
little simpler than using thread_info, though the cpu field in thread_info
cannot be removed since it is used for things other than getting the current
CPU in common code.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
This patch is the meat of the PDA change. This patch makes several related
changes:
1: Most significantly, %gs is now used in the kernel. This means that on
entry, the old value of %gs is saved away, and it is reloaded with
__KERNEL_PDA.
2: entry.S constructs the stack in the shape of struct pt_regs, and this
is passed around the kernel so that the process's saved register
state can be accessed.
Unfortunately struct pt_regs doesn't currently have space for %gs
(or %fs). This patch extends pt_regs to add space for gs (no space
is allocated for %fs, since it won't be used, and it would just
complicate the code in entry.S to work around the space).
3: Because %gs is now saved on the stack like %ds, %es and the integer
registers, there are a number of places where it no longer needs to
be handled specially; namely context switch, and saving/restoring the
register state in a signal context.
4: And since kernel threads run in kernel space and call normal kernel
code, they need to be created with their %gs == __KERNEL_PDA.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
When a CPU is brought up, a PDA and GDT are allocated for it. The GDT's
__KERNEL_PDA entry is pointed to the allocated PDA memory, so that all
references using this segment descriptor will refer to the PDA.
This patch rearranges CPU initialization a bit, so that the GDT/PDA are set up
as early as possible in cpu_init(). Also for secondary CPUs, GDT+PDA are
preallocated and initialized so all the secondary CPU needs to do is set up
the ldt and load %gs. This will be important once smp_processor_id() and
current use the PDA.
In all cases, the PDA is set up in head.S, before a CPU starts running C code,
so the PDA is always available.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Chuck Ebbert <76306.1226@compuserve.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: James Bottomley <James.Bottomley@SteelEye.com>
Cc: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Clean up the espfix code:
- Introduced PER_CPU() macro to be used from asm
- Introduced GET_DESC_BASE() macro to be used from asm
- Rewrote the fixup code in asm, as calling a C code with the altered %ss
appeared to be unsafe
- No longer altering the stack from a .fixup section
- 16bit per-cpu stack is no longer used, instead the stack segment base
is patched the way so that the high word of the kernel and user %esp
are the same.
- Added the limit-patching for the espfix segment. (Chuck Ebbert)
[jeremy@goop.org: use the x86 scaling addressing mode rather than shifting]
Signed-off-by: Stas Sergeev <stsp@aknet.ru>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Acked-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Here is a patch (used by perfmon2) to detect the presence of the Precise Event
Based Sampling (PEBS) feature for i386. The patch also adds the cpu_has_pebs
macro.
- adds X86_FEATURE_PEBS
- adds cpu_has_pebs to test for X86_FEATURE_PEBS
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Fix checks that failed to realize that values are 4-kB-unit-sized (note the
format strings in this same diff context which *do* realize the unit size,
via appended "000"!). Also fix an incorrect below-1MB area check (as
gathered from Jan Beulich's unapplied patch at
http://www.ussg.iu.edu/hypermail/linux/kernel/0411.1/1378.html ) Update
mtrr_add_page() docu to make 4-kB-sized calculation more obvious.
Given several further items mentioned in Jan's patch mail, all in all MTRR
code seems surprisingly buggy, for a surprisingly long period of time (many
years). Further work/investigation would be useful.
TBD Note that my patch is pretty much UNTESTED, since I can only verify that it
TBD successfully boots my machine, but I cannot test against actual buggy
TBD hardware which would require these (formerly broken) checks. Long -mm
TBD simmering would make sense, especially since these now-working checks might
TBD turn out to have adverse effects on unaffected hardware.
Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Conflicts:
drivers/ata/libata-scsi.c
include/linux/libata.h
Futher merge of Linus's head and compilation fixups.
Signed-Off-By: David Howells <dhowells@redhat.com>
arch/x86_64/kernel/cpufreq/../../../i386/kernel/cpu/cpufreq/speedstep-lib.c:131: error: 'MSR_FSB_FREQ' undeclared (first use in this function)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
On some systems such as the IBM x3650 there are bits set in the
upper half of the control values provided by the _PSS object.
These bits are only relevant for cpufreq drivers that use IO ports
which are not currently supported by the speedstep-centrino driver.
The current MSR oriented code assumes that upper bits are not set
and thus fails to work correctly when they are. e.g. the control
and status value equality check fails even though the ACPI spec
allows the inequality.
Signed-off-by: Gary Hade <garyh@us.ibm.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Several more Intel CPUs are now capable using the p4-clockmod cpufreq
driver. As it is of limited use most of the time, print a big bold warning
if a better cpufreq driver might be available.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Dave Jones <davej@redhat.com>
acpi-cpufreq needs the same patch as the previous speedstep-centrino change.
Additionally, the centrino driver can have its ifdef moved out a little
further to eliminate some more code/variables.
Signed-off-by: Dave Jones <davej@redhat.com>
arch/i386/kernel/cpu/cpufreq/speedstep-centrino.c:396: warning: 'sw_any_bug_dmi_table' defined but not used
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
If someone inserts speedstep-smi on a mobile P4, it prevents other cpufreq
modules from loading until it is unloaded.
Signed-off-by: Hiroshi Miura <miura@da-cha.org>
Signed-off-by: Dave Jones <davej@redhat.com>
ioremap must be balanced by an iounmap and failing to do so can result
in a memory leak.
Tested (compilation only):
- using allmodconfig
- making sure the files are compiling without any warning/error due to
new changes
Signed-off-by: Amol Lad <amol@verismonetworks.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Enable ondemand governor and acpi-cpufreq to use IA32_APERF and IA32_MPERF MSR
to get active frequency feedback for the last sampling interval. This will
make ondemand take right frequency decisions when hardware coordination of
frequency is going on.
Without APERF/MPERF, ondemand can take wrong decision at times due
to underlying hardware coordination or TM2.
Example:
* CPU 0 and CPU 1 are hardware cooridnated.
* CPU 1 running at highest frequency.
* CPU 0 was running at highest freq. Now ondemand reduces it to
some intermediate frequency based on utilization.
* Due to underlying hardware coordination with other CPU 1, CPU 0 continues to
run at highest frequency (as long as other CPU is at highest).
* When ondemand samples CPU 0 again next time, without actual frequency
feedback from APERF/MPERF, it will think that previous frequency change
was successful and can go to wrong target frequency. This is because it
thinks that utilization it has got this sampling interval is when running at
intermediate frequency, rather than actual highest frequency.
More information about IA32_APERF IA32_MPERF MSR:
Refer to IA-32 Intel® Architecture Software Developer's Manual at
http://developer.intel.com
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Recent speedstep-centrino unification onto acpi-cpufreq patchset broke
cpuinfo_cur_freq interface in /sys/../cpuinfo/, when MSR was used for
transitions. Attached patch fixes that breakage.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Only change the frequency if the state previously set is different
from what we are trying to set. We don't really have to get the current
frequency at this point.
Signed-off-by: Denis Sadykov <denis.m.sadykov@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Mark ACPI hooks in speedstep-centrino as deprecated. Change the order in which
speedstep-centrino and acpi-cpufreq (when both are in kernel) will be
added. First driver to be tried is now acpi-cpufreq, followed by
speedstep-centrino.
Add a note in feature-removal-schedule to mark this deprecation.
Signed-off-by: Denis Sadykov <denis.m.sadykov@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Add in the support for Intel Enhanced Speedstep - MSR based transitions.
With this change, the ACPI based support in speedstep-centrino can be
deprecated and duplicate code in that driver can be marked for removal.
Much easier to maintain and support this way. This also reduces the
user misconfigurations and questions on which driver is to be used
under which CPUs to support Enhanced Speedstep.
Signed-off-by: Denis Sadykov <denis.m.sadykov@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Some clean up and redsign of the driver. Mainly making it easier to add
support for multiple sub-mechanisms of changing frequency. Currently this
driver supports only ACPI SYSTEM_IO address space. With the changes
below it is easier to add support for other address spaces like Intel
Enhanced Speedstep which uses MSR (ACPI FIXED_FEATURE_HARDWARE) to do the
transitions.
Signed-off-by: Denis Sadykov <denis.m.sadykov@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
This patchset has refresh/rebase of a bunch of patches/bugfixes related to
acpi-cpufreq that were sent earlier on this list.
patch 1/8
Patch that fixes a bug in swcoordination code in acpi-cpufreq
patch 2/8 through patch 7/8
Grand unification of ACPI based speedstep-centrino and acpi-cpufreq drivers.
ACPI allows P-state transitions in multiple ways. Like using IO ports or using
processor native method (MSR). Without this patch, IO port based P-state
transitions are handled in acpi-cpufreq driver and MSR based transitions on
Intel CPUs are handled in speedstep-centrino driver. Even though most of the
code in these two drivers should be similar, except for final changing/checking
of frequency (one driver does it using IO port and other does it through
MSR), we have duplicated code in these two drivers. There are also issues
around BIOSes supporting both MSR and IO port and which driver should be
loaded first in standard installations.
The patchset combines functionality of these two driver into acpi-cpufreq
driver. ACPI based functionality in speedstep-centrino is marked deprecated
and will be removed in future. speedstep-centrino will continue to work
on systems that depend on older non-ACPI table based P-state chanes.
* 2/8 - Patch that reorganizes the code in acpi-cpufreq, cleaning it up
a little and making it easier to add MSR support later.
* 3/8 - Pull in the MSR based transition support into acpi-cpufreq.
* 4/8 - Mark speedstep-centrino deprecated. Change the order in Makefile to
load acpi-cpufreq first and speedstep-centrino later, in cases where both
are configured in.
* 5/8 - lindent acpi-cpufreq.c
* 6/8 - Minor change to eliminate the check of current frequency on
notifications. We can use last set frequency instead.
* 7/8 - Make cpufreq->get of acpi_cpufreq work correctly again.
There will be a patch in future that removes ACPI based support in
speedstep-centrino in coming months.
patch 8/8
Add support for IA32_APERF and IA32_MPERF MSR and get the actual frequency
from these MSRs and use it to determine the next frequency target in ondemand
governor
This patch:
There is a bug in software coordination patch in acpi-cpufreq, due to which
frequency will only be set on first CPU of any coordinated group.
Bug identified by Denis, was not recognised earlier as there are no platforms
yet that use software coordination with acpi-cpufreq driver.
Signed-off-by: Denis Sadykov <denis.m.sadykov@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Get rid of warning in the thermal throttling code about not checking
sysfs return values.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This changes a couple of if() BUG(); constructs to
BUG_ON(); so it can be safely optimized away.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jesper Juhl reported that testing the software math-emulation by forcing
"no387" doesn't work on modern CPU's.
The reason was two-fold:
- you also need to pass in "nofxsr" to make sure that we not only don't
touch the old i387 legacy hardware, it also needs to disable the
modern XMM/FXSR sequences
- "nofxsr" didn't actually clear the capability bits immediately,
leaving the early boot sequence still using FXSR until we got to
the identify_cpu() stage.
This fixes the "nofxsr" flag to take effect immediately on the boot CPU.
Debugging by Randy Dunlap
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesper Juhl <jesper.juhl@gmail.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Please ignore previous message.
This patch is adding support for CPU connected to CLE266
chipset. For older CPU this is only way. For "Powersaver"
processor this way will be used if ACPI C3 isn't supported.
I have tested it. It seems to work exacly like ACPI.
But it is less safe. On CLE266 chipset port 0x22 is
blocking processor access to PCI bus too.
Signed-off-by: Rafa³ Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
Make the sections proper and get rid of section mismatch warnings.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
[PATCH] Don't set calgary iommu as default y
[PATCH] i386/x86-64: New Intel feature flags
[PATCH] x86: Add a cumulative thermal throttle event counter.
[PATCH] i386: Make the jiffies compares use the 64bit safe macros.
[PATCH] x86: Refactor thermal throttle processing
[PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
[PATCH] Fix unwinder warning in traps.c
[PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
[PATCH] x86: Move direct PCI scanning functions out of line
[PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
[PATCH] Don't leak NT bit into next task
[PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
[PATCH] Fix some broken white space in ia32_signal.c
[PATCH] Initialize argument registers for 32bit signal handlers.
[PATCH] Remove all traces of signal number conversion
[PATCH] Don't synchronize time reading on single core AMD systems
[PATCH] Remove outdated comment in x86-64 mmconfig code
[PATCH] Use string instructions for Core2 copy/clear
[PATCH] x86: - restore i8259A eoi status on resume
[PATCH] i386: Split multi-line printk in oops output.
...
The functions prepare_set and post_set in kernel/cpu/mtrr/generic.c wrap
the spinlock set_atomicity_lock: prepare_set returns with the lock held,
and post_set releases the lock without acquiring it. Add lock annotations
to these two functions so that sparse can check callers for lock pairing,
and so that sparse will not complain about these functions since they
intentionally use locks in this manner.
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add supplemental SSE3 instructions flag, and Direct Cache Access flag.
As described in "Intel Processor idenfication and the CPUID instruction
AP485 Sept 2006"
AK: also added for x86-64
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
The counter is exported to /sys that keeps track of the
number of thermal events, such that the user knows how bad the
thermal problem might be (since the logging to syslog and mcelog
is rate limited).
AK: Fixed cpu hotplug locking
Signed-off-by: Dmitriy Zavin <dmitriyz@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Refactor the event processing (syslog messaging and rate limiting)
into separate file therm_throt.c. This allows consistent reporting
of CPU thermal throttle events.
After ACK'ing the interrupt, if the event is current, the user
(p4.c/mce_intel.c) calls therm_throt_process to log (and rate limit)
the event. If that function returns 1, the user has the option to log
things further (such as to mce_log in x86_64).
AK: minor cleanup
Signed-off-by: Dmitriy Zavin <dmitriyz@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Mark i386-specific cpu cache functions as __cpuinit. They are all
only called from arch/i386/common.c:display_cache_info() that already is
marked as __cpuinit.
Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
Mark i386-specific cpu identification functions as __cpuinit. They are all
only called from arch/i386/common.c:identify_cpu() that already is marked as
__cpuinit.
Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
Mark i386-specific cpu init functions as __cpuinit. They are all
only called from arch/i386/common.c:identify_cpu() that already is marked as
__cpuinit. This patch also removes the empty function init_umc().
Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>