commit f3ca416452 upstream.
acpi_processor_set_throttling() uses set_cpus_allowed_ptr() to make
sure that the (struct acpi_processor)->acpi_processor_set_throttling()
callback will run on the right CPU. However, the function may be
called from a worker thread already bound to a different CPU in which
case that won't work.
Make acpi_processor_set_throttling() use work_on_cpu() as appropriate
instead of abusing set_cpus_allowed_ptr().
Reported-and-tested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using a u64 here creates an endian bug. We store a u32 number in the
top byte which is a larger number than intended on big endian systems.
There is no reason to use a 64 bit data type here, I guess it was just
an oversight.
I removed the initialization to zero as well. It's needed with a u64
but with a u32, the variable gets initialized properly inside the call
to acpi_os_read_port().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Merge reason: Pick up the following two fix commits.
2be19102b7: x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo()
765af22da8: x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
Scheduled NUMA init 32/64bit unification changes depend on these.
Signed-off-by: Tejun Heo <tj@kernel.org>
With the this_cpu_xx we no longer need to pass an acpi
structure to the msr management code. Simplifies code and improves
performance.
NOTE: This code is x86 specific (see #ifdef CONFIG_X86) but not under
arch/x86.
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
After one CPU is offlined, it is unnecessary to switch T-state for it.
So it will be better that the throttling is disabled after the cpu
is offline.
At the same time after one cpu is online, we should check whether
the T-state is supported and then set the corresponding T-state
flag.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Now before it executes the T-state operation on one CPU, it will try to
migrate to the target CPU. Especially this is required on the system that
uses the MSR_IA32_THERMAL_CONTROL register to switch T-state.
But unfortunately it doesn't check whether the migration is successful or not.
In such case we will get/set the incorrect T-state on the offline CPU as
it fails in the migration to the offline CPU.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Remove deprecated ACPI process procfs I/F for throttling control.
This is because the t-state control should only be done in kernel,
when system is in a overheating state.
Now users can only change the processor t-state indirectly,
by poking the cooling device sysfs I/F of the processor.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
As a feature that would only be used when system is overheating,
the processor t-state control should not be exported to user space.
Make /proc/acpi/processor/*/throttle depends on CONFIG_ACPI_PROCFS,
which is cleared by default.
And we will remove this I/F in 2.6.38.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Remove deprecated ACPI processor procfs I/F, including:
/proc/acpi/processor/CPUX/power
/proc/acpi/processor/CPUX/limit
/proc/acpi/processor/CPUX/info
/proc/acpi/processor/CPUX/throttling still exists,
as we don't have sysfs I/F available for now.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Dan's list contains:
drivers/acpi/processor_throttling.c +1139 acpi_processor_get_throttling_info(11) warning: variable derefenced before check 'pr'
acpi_processor_get_throttling_info() is never called with pr == NULL.
[ bart: the potential NULL pointer dereference was finally fixed in
(much later than mine) commit 5cfa245 but my patch is still valid ]
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_integer is now obsolete and removed from the ACPICA code base,
replaced by u64.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
If the NULL test on pr is needed, then the dereference should be after the
NULL test.
A simplified version of the semantic match that detects this problem is as
follows (http://coccinelle.lip6.fr/):
// <smpl>
@match exists@
expression x, E;
identifier fld;
@@
* x->fld
... when != \(x = E\|&x\)
* x == NULL
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Remove open-coded zalloc_cpumask_var() and zalloc_cpumask_var_node().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Linux/ACPI core files using internal.h all PREFIX "ACPI: ",
however, not all ACPI drivers use/want it -- and they
should not have to #undef PREFIX to define their own.
Add GPL commment to internal.h while we are there.
This does not change any actual console output,
asside from a whitespace fix.
Signed-off-by: Len Brown <len.brown@intel.com>
This failure is very common on many platforms. Handling it in the ACPI
processor driver is enough, and we don't need a warning message unless
CONFIG_ACPI_DEBUG is set.
Based on a patch from Zhang Rui.
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389
Signed-off-by: Frans Pop <elendil@planet.nl>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the BIOS reports an invalid throttling state (which seems to be
fairly common after system boot), a reset is done to state T0.
Because of a check in acpi_processor_get_throttling_ptc(), the reset
never actually gets executed, which results in the error reoccurring
on every access of for example /proc/acpi/processor/CPU0/throttling.
Add a 'force' option to acpi_processor_set_throttling() to ensure
the reset really takes effect.
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389
This patch, together with the next one, fixes a regression introduced in
2.6.30, listed on the regression list. They have been available for 2.5
months now in bugzilla, but have not been picked up, despite various
reminders and without any reason given.
Google shows that numerous people are hitting this issue. The issue is in
itself relatively minor, but the bug in the code is clear.
The patches have been in all my kernels and today testing has shown that
throttling works correctly with the patches applied when the system
overheats (http://bugzilla.kernel.org/show_bug.cgi?id=13918#c14).
Signed-off-by: Frans Pop <elendil@planet.nl>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now whether the ACPI processor proc I/F is registered depends on the
CONFIG_PROC. It had better depend on the CONFIG_ACPI_PROCFS.
When the CONFIG_ACPI_PROCFS is unset in kernel configuration, the
ACPI processor proc I/F won't be registered.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Commit 4973b22a ("ACPI processor: reset the throttling state once it's
invalid") introduced a new warning which prints a spurious newline.
The ACPI_WARNING macro that is used already takes care of adding a
newline, after adding ACPI_CA_VERSION to the message. Remove the newline
to avoid the message getting split into two lines.
Signed-off-by: Frans Pop <elendil@planet.nl>
Signed-off-by: Len Brown <len.brown@intel.com>
If the BIOS hands us an invalid throttling state,
write a valid state.
http://bugzilla.kernel.org/show_bug.cgi?id=13259
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: James Ettle <theholyettlz@googlemail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Introduce module parameter processor.ignore_tpc.
Some laptops are shipped with buggy _TPC,
this module parameter is used to to disable the buggy support.
http://bugzilla.kernel.org/show_bug.cgi?id=13259
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: James Ettle <theholyettlz@googlemail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Impact: Reduce memory usage, use new API.
This is part of an effort to reduce structure sizes for machines
configured with large NR_CPUS. cpumask_t gets replaced by
cpumask_var_t, which is either struct cpumask[1] (small NR_CPUS) or
struct cpumask * (large NR_CPUS).
(Changes to powernow-k* by <travis>.)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move all the component definitions for drivers to a single shared place,
include/acpi/acpi_drivers.h.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
ACPI_DB_ERROR and ACPI_DB_WARN were removed from ACPICA core.
So replace ACPI_DEBUG_PRINT((ACPI_DB_ERROR, ...) with printk(KERN_ERR PREFIX ...)
and ACPI_DEBUG_PRINT((ACPI_DB_WARN, ...) with printk(KERN_WARNING PREFIX ...)
We do not use ACPI_ERROR/ACPI_WARNING since they're not exported, see
-------------------------------------------------------------
commit 6468463abd
Author: Len Brown <len.brown@intel.com>
Date: Mon Jun 26 23:41:38 2006 -0400
ACPI: un-export ACPI_ERROR() -- use printk(KERN_ERR...)
Signed-off-by: Len Brown <len.brown@intel.com>
-------------------------------------------------------------
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
As of version 2.0, ACPI can return 64-bit integers. The current
acpi_evaluate_integer only supports 64-bit integers on 64-bit platforms.
Change the argument to take a pointer to an acpi_integer so we support
64-bit integers on all platforms.
lenb: replaced use of "acpi_integer" with "unsigned long long"
lenb: fixed bug in acpi_thermal_trips_update()
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* Replace previous instances of the cpumask_of_cpu_ptr* macros
with a the new (lvalue capable) generic cpumask_of_cpu().
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* This patch replaces the dangerous lvalue version of cpumask_of_cpu
with new cpumask_of_cpu_ptr macros. These are patterned after the
node_to_cpumask_ptr macros.
In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
the cpumask_of_cpu_map[cpu] entry is used. The cpumask_of_cpu_map
is provided when there is a large NR_CPUS count, reducing
greatly the amount of code generated and stack space used for
cpumask_of_cpu(). The pointer to the cpumask_t value is needed for
calling set_cpus_allowed_ptr() to reduce the amount of stack space
needed to pass the cpumask_t value.
If there isn't a cpumask_of_cpu_map[], then a temporary variable is
declared and filled in with value from cpumask_of_cpu(cpu) as well as
a pointer variable pointing to this temporary variable. Afterwards,
the pointer is used to reference the cpumask value. The compiler
will optimize out the extra dereference through the pointer as well
as the stack space used for the pointer, resulting in identical code.
A good example of the orthogonal usages is in net/sunrpc/svc.c:
case SVC_POOL_PERCPU:
{
unsigned int cpu = m->pool_to[pidx];
cpumask_of_cpu_ptr(cpumask, cpu);
*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, cpumask);
return 1;
}
case SVC_POOL_PERNODE:
{
unsigned int node = m->pool_to[pidx];
node_to_cpumask_ptr(nodecpumask, node);
*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, nodecpumask);
return 1;
}
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix printk format warning:
linux-next-20080617/drivers/acpi/processor_throttling.c:1258: warning: format '%d' expects type 'int', but argument 4 has type 'size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Change processors from an array sized by NR_CPUS to a per_cpu variable.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Change references from for_each_cpu_mask to for_each_cpu_mask_nr
where appropriate
Reviewed-by: Paul Jackson <pj@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.
Add correct ->owner to proc_fops to fix reading/module unloading race.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Use new set_cpus_allowed_ptr() function added by previous patch,
which instead of passing the "newly allowed cpus" cpumask_t arg
by value, pass it by pointer:
-int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)
* Modify CPU_MASK_ALL
Depends on:
[sched-devel]: sched: add new set_cpus_allowed_ptr function
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
According to ACPI spec, the _TSD object provides T-state control cross
logical processor dependency information to OSPM. So the t-state
coordination should be considered when T-state for one cpu is changed.
According to ACPI spec, three types of coordination are defined.
SW_ALL, SW_ANY and HW_ALL.
SW_ALL: it means that OSPM needs to initiate T-state transition on
all processors in the domain. It is necessary to call throttling set function
for all affected cpus.
SW_ANY: it means that OSPM may initiate T-state transition on any processor in
the domain.
HW_ALL: Spec only says that hardware will perform the coordination and doesn't
recommend how OSPM coordinate T-state among the affected cpus. So it is treated
as the type of SW_ALL. It means that OSPM needs to initiate t-state transition
on all the processors in the domain.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The t-state coordination should be considered when T-state for one cpu
is changed.It means that OSPM should select one proper target T-state for
the all affected cpus before updating T-state.
So the function of acpi_processor_throttling_notifier is added.
Before updating T-state it can be called for all the affected cpus to get
the proper target T-state, which can meet the requirement of thermal, user and
_TPC. After updating T-state, it can be called to update T-state flag.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Accordint to ACPI spec, the _TSD object provides T-state control cross
logical processor dependency information to OSPM.
After the _TSD data for all cpus are obtained, OSPM will set up
the T-state coordination between CPUs.
Of course if the _TSD doesn't exist or _TSD data is incorrect , it is
assumed that there is no T-state coordination and T-state is changed
independently.
Now there is no proper solution to update T-state coordination after
one cpu is hotplugged. So this patch won't support hotplugged cpu very well.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
It is necessary to check the parameter when calling the function of
acpi_processor_get/set_throttling function so as to avoid the NULL
pointer reference in pr or throttling.
http://bugzilla.kernel.org/show_bug.cgi?id=9747
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The IRQ operation(enable/disable) should be avoided when throttling is
controlled via PTC method. It is replaced by the migration of task.
This fixes an oops on T61 -- a regression due to
f79f06ab9f b/c FixedHW support tried to read remote MSR with interrupts disabled.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Add throttling control via MSR when T-states uses
the FixHW Control Status registers.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>