android_kernel_samsung_msm8976/lib
Peter Zijlstra 5881a5ab50 lib/int_sqrt: optimize small argument
commit 3f3295709edea6268ff1609855f498035286af73 upstream.

The current int_sqrt() computation is sub-optimal for the case of small
@x, which is the interesting case when we compute cumulative
distribution functions on idle times.  We assume idle time to be a
random variable, where the target residency of the deepest idle state
gives an upper bound on the variable (5e6ns on recent Intel chips).

In the case of small @x, the compute loop:

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

can be reduced to:

	while (m > x)
		m >>= 2;

Because y==0 on entry, b==m, and y remains 0 until x>=m, so the leading
iterations do nothing but shift m.

And while this is computationally equivalent, it runs much faster
because there is less code and, in particular, fewer branches.
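
As a rough illustration of the equivalence claimed above, here is a
userspace sketch, not the kernel source itself.  The names
SKETCH_BITS_PER_LONG, sqrt_loop(), isqrt_full(), isqrt_skip() and
check_isqrt() are hypothetical and chosen only for this example;
SKETCH_BITS_PER_LONG stands in for the kernel's BITS_PER_LONG.  It
assumes the skip loop is simply placed in front of the unchanged
compute loop.

```c
#include <assert.h>
#include <limits.h>

/* Bits in unsigned long; a stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* The compute loop from the text, started at bit position m. */
static unsigned long sqrt_loop(unsigned long x, unsigned long m)
{
	unsigned long b, y = 0;

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}
	return y;
}

/* Without the skip: always start at the highest even bit. */
unsigned long isqrt_full(unsigned long x)
{
	if (x <= 1)
		return x;
	return sqrt_loop(x, 1UL << (SKETCH_BITS_PER_LONG - 2));
}

/* With the skip: first shift m down until it no longer exceeds x. */
unsigned long isqrt_skip(unsigned long x)
{
	unsigned long m;

	if (x <= 1)
		return x;

	m = 1UL << (SKETCH_BITS_PER_LONG - 2);
	while (m > x)
		m >>= 2;

	return sqrt_loop(x, m);
}

/* Check that both variants agree and return floor(sqrt(x)) for x < limit. */
void check_isqrt(unsigned long limit)
{
	unsigned long x, y;

	for (x = 0; x < limit; x++) {
		y = isqrt_skip(x);
		assert(y == isqrt_full(x));
		assert(y * y <= x && x < (y + 1) * (y + 1));
	}
}
```

Calling check_isqrt(1UL << 17) walks the same range of inputs the
benchmark below covers (all values <128k) and aborts if the two
variants ever disagree.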

      cycles:                 branches:              branch-misses:

OLD:

hot:   45.109444 +- 0.044117  44.333392 +- 0.002254  0.018723 +- 0.000593
cold: 187.737379 +- 0.156678  44.333407 +- 0.002254  6.272844 +- 0.004305

PRE:

hot:   67.937492 +- 0.064124  66.999535 +- 0.000488  0.066720 +- 0.001113
cold: 232.004379 +- 0.332811  66.999527 +- 0.000488  6.914634 +- 0.006568

POST:

hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219

Averages computed over all values <128k using an LFSR to generate order.
Cold numbers have an LFSR-based branch trace buffer 'confuser' run between
each int_sqrt() invocation.
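
The benchmark harness itself is not part of this message.  As a guess at
what "using an LFSR to generate order" could look like, the sketch below
steps a 17-bit Galois LFSR, which visits every non-zero value below 128k
exactly once per period in a scrambled order.  The polynomial
x^17 + x^14 + 1 and the names LFSR17_TAPS, lfsr17_next() and
lfsr17_period() are assumptions made for this example, not taken from
the original test code.

```c
#include <assert.h>
#include <stdint.h>

/* Toggle mask for x^17 + x^14 + 1 (assumed polynomial, bits 16 and 13). */
#define LFSR17_TAPS 0x12000u

/* Advance a 17-bit Galois LFSR by one step; the state must be non-zero. */
uint32_t lfsr17_next(uint32_t s)
{
	uint32_t lsb = s & 1u;

	assert(s != 0 && s < (1u << 17));

	s >>= 1;
	if (lsb)
		s ^= LFSR17_TAPS;
	return s;
}

/* Count the steps until the state returns to the seed. */
uint32_t lfsr17_period(uint32_t seed)
{
	uint32_t s = seed, n = 0;

	do {
		s = lfsr17_next(s);
		n++;
	} while (s != seed);

	return n;
}
```

A harness along these lines would feed each state to int_sqrt() in turn;
zero is never produced by the LFSR and would have to be handled
separately.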

Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
Fixes: 30493cc9dd ("lib/int_sqrt.c: optimize square root algorithm")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Anshul Garg <aksgarg1989@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
2019-07-27 21:46:05 +02:00
lz4 lz4: fix another possible overrun 2016-05-18 14:34:38 +05:30
lzo lzo: check for length overrun in variable length encoding. 2014-10-30 09:35:11 -07:00
mpi Import latest Samsung release 2017-04-18 03:43:52 +02:00
raid6 lib/raid6: build proper files on corresponding arch 2012-12-13 19:51:04 +11:00
reed_solomon
xz decompressors: fix typo "POWERPC" 2013-03-13 15:21:48 -07:00
zlib_deflate
zlib_inflate
.gitignore X.509: Implement simple static OID registry 2012-10-08 13:50:18 +10:30
argv_split.c argv_split(): teach it to handle mutable strings 2013-04-29 18:28:19 -07:00
asn1_decoder.c KEYS: fix NULL pointer dereference during ASN.1 parsing [ver #2] 2019-07-27 21:45:51 +02:00
atomic64.c lib: atomic64: Initialize locks statically to fix early users 2012-12-20 13:50:16 -08:00
atomic64_test.c atomic64_test: simplify the #ifdef for atomic64_dec_if_positive() test 2012-07-30 17:25:16 -07:00
audit.c
average.c
bcd.c usb/core: use bin2bcd() for bcdDevice in RH 2012-09-10 11:13:16 -07:00
bch.c
bitmap.c Merge remote-tracking branch 'f2fs/linux-3.10.y' into HEAD 2017-04-18 17:02:28 +02:00
bitrev.c
bsearch.c
btree.c lib/btree.c: fix leak of whole btree nodes 2014-08-07 14:30:27 -07:00
bug.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
build_OID_registry X.509: Implement simple static OID registry 2012-10-08 13:50:18 +10:30
bust_spinlocks.c printk: Provide a wake_up_klogd() off-case 2013-03-22 16:41:20 -07:00
check_signature.c
checksum.c lib/checksum.c: fix build for generic csum_tcpudp_nofold 2015-02-11 14:48:17 +08:00
clz_tab.c
cmdline.c lib/cmdline.c: fix get_options() overflow while parsing ranges 2019-07-27 21:44:24 +02:00
cordic.c
cpu-notifier-error-inject.c cpu: rewrite cpu-notifier-error-inject module 2012-07-30 17:25:22 -07:00
cpu_rmap.c irq: Allow multiple clients to register for irq affinity notification 2014-11-09 15:17:27 -08:00
cpumask.c sched/fair, cpumask: Export for_each_cpu_wrap() 2019-07-27 21:44:52 +02:00
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
crc7.c
crc8.c
crc16.c
crc32.c sections: fix const sections for crc32 table 2012-10-06 03:04:46 +09:00
crc32defs.h
ctype.c
debug_locks.c
debugobjects.c debugobjects: use kmemleak_not_leak for obj_cache 2015-05-29 19:35:14 +05:30
dec_and_lock.c
decompress.c lib/decompress.c: fix initconst 2013-04-30 17:04:09 -07:00
decompress_bunzip2.c decompress_bunzip2: off by one in get_next_block() 2015-01-27 07:52:33 -08:00
decompress_inflate.c lib/decompressors: fix "no limit" output buffer length 2014-02-06 11:08:12 -08:00
decompress_unlzma.c
decompress_unlzo.c lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c 2013-02-20 19:36:00 +01:00
decompress_unxz.c
devres.c This is the 3.10.99 stable release 2017-04-18 17:17:46 +02:00
digsig.c lib/digsig: fix dereference of NULL user_key_payload 2019-07-27 21:44:22 +02:00
div64.c UPSTREAM: math64: New separate div64_u64_rem helper 2016-05-18 14:36:10 +05:30
dma-debug.c dma-debug: switch check from _text to _stext 2016-02-25 11:57:49 -08:00
dump_stack.c dump_stack: consolidate dump_stack() implementations and unify their behaviors 2013-04-30 17:04:02 -07:00
dynamic_debug.c dynamic_debug: Handle kstrdup failure in dynamic_debug_init 2015-06-20 18:25:48 -07:00
dynamic_queue_limits.c bql: Avoid possible inconsistent calculation. 2012-05-31 18:18:17 -04:00
earlycpio.c lib: Add early cpio decoder 2012-09-30 18:02:20 -07:00
extable.c
fault-inject.c debugfs: add get/set for atomic types 2013-10-18 18:13:21 -07:00
fdt.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_ro.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_rw.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_strerror.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_sw.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_wip.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
find_last_bit.c
find_next_bit.c
flex_array.c
flex_proportions.c lib/flex_proportions.c: fix corruption of denominator in flexible proportions 2012-09-25 08:59:21 -07:00
gcd.c lib/gcd.c: prevent possible div by 0 2012-10-06 03:04:57 +09:00
gen_crc32table.c sections: fix const sections for crc32 table 2012-10-06 03:04:46 +09:00
genalloc.c Merge upstream linux-stable v3.10.28 into msm-3.10 2014-03-24 14:28:34 -07:00
halfmd4.c
hexdump.c dynamic_debug: dynamic hex dump 2013-01-17 12:19:09 -08:00
hweight.c
idr.c idr: fix overflow bug during maximum ID calculation at maximum height 2014-06-30 20:09:42 -07:00
inflate.c
int_sqrt.c lib/int_sqrt: optimize small argument 2019-07-27 21:46:05 +02:00
interval_tree.c mm: interval tree updates 2012-10-09 16:22:40 +09:00
interval_tree_test_main.c random32: rename random32 to prandom 2012-12-17 17:15:26 -08:00
iomap.c lib: iomap: Add MSM RTB support 2014-09-04 19:40:43 -07:00
iomap_copy.c
iommu-helper.c
ioremap.c
iovec.c Hoist memcpy_fromiovec/memcpy_toiovec into lib/ 2013-05-20 10:24:22 +09:30
irq_regs.c
is_single_threaded.c
jedec_ddr_data.c ddr: add LPDDR2 data from JESD209-2 2012-05-02 00:04:06 -07:00
kasprintf.c lib/kasprintf.c: use kmalloc_track_caller() to get accurate traces for kvasprintf 2012-10-11 08:50:15 +09:00
Kconfig lib: add lz4 compressor module 2015-09-16 18:20:12 +05:30
Kconfig.debug time: Remove CONFIG_TIMER_STATS 2017-04-22 23:02:59 +02:00
Kconfig.kasan kasan: enable instrumentation of global variables 2015-05-04 14:03:57 -07:00
Kconfig.kgdb KGDB/KDB fixes and cleanups 2013-03-02 08:31:39 -08:00
Kconfig.kmemcheck
kfifo.c kfifo: fix kfifo_alloc() and kfifo_init() 2013-02-27 19:10:23 -08:00
klist.c klist: fix starting point removed bug in klist iterators 2016-02-25 11:57:47 -08:00
kobject.c kref: minor cleanup 2013-05-07 16:09:00 -07:00
kobject_uevent.c netlink: hide struct module parameter in netlink_kernel_create 2012-09-08 18:46:30 -04:00
kstrtox.c kstrto*: add documentation 2012-12-17 17:15:22 -08:00
kstrtox.h
lcm.c
libcrc32c.c
list_debug.c kernel/lib: add additional debug capabilites for data corruption 2013-08-22 18:08:50 -07:00
list_sort.c lib/: rename random32() to prandom_u32() 2013-04-29 18:28:42 -07:00
llist.c
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c lockdep: Selftest: convert spinlock to raw spinlock 2013-02-19 08:43:35 +01:00
lru_cache.c lru_cache: introduce lc_get_cumulative() 2013-03-22 22:17:36 -06:00
Makefile lib: add lz4 compressor module 2015-09-16 18:20:12 +05:30
md5.c
memory-notifier-error-inject.c memory: memory notifier error injection module 2012-07-30 17:25:22 -07:00
memweight.c string: introduce memweight() 2012-07-30 17:25:16 -07:00
nlattr.c netlink: rate-limit leftover bytes warning and print process name 2014-06-26 15:12:37 -04:00
notifier-error-inject.c mode_t, whack-a-mole at 11... 2013-04-09 14:13:05 -04:00
notifier-error-inject.h fault-injection: notifier error injection 2012-07-30 17:25:22 -07:00
of-reconfig-notifier-error-inject.c powerpc+of: Rename and fix OF reconfig notifier error inject module 2012-12-14 10:32:52 +11:00
oid_registry.c Give the OID registry file module info to avoid kernel tainting 2013-05-05 14:38:00 -07:00
parser.c lib/parser.c: fix up comments for valid return values from match_number 2013-02-21 17:22:25 -08:00
pci_iomap.c
percpu_counter.c switch the protection of percpu_counter list to spinlock 2012-07-31 09:28:31 +04:00
plist.c lib/plist.c: make plist test announcements KERN_DEBUG 2012-10-06 03:04:58 +09:00
pm-notifier-error-inject.c PM: PM notifier error injection module 2012-07-30 17:25:22 -07:00
prio_heap.c
proportions.c
qmi_encdec.c This is the 3.10.84 stable release 2015-09-30 13:25:40 +05:30
qmi_encdec_priv.h lib: qmi: Introduce QMI Encode/Decode library 2013-09-04 15:34:45 -07:00
radix-tree.c radix-tree: fix race in gang lookup 2016-02-25 11:57:49 -08:00
random32.c random32: include missing header file 2017-09-08 18:50:21 +00:00
ratelimit.c Import latest Samsung release 2017-04-18 03:43:52 +02:00
rational.c lib: Change mail address of Oskar Schirmer 2012-05-17 15:18:37 +02:00
rbtree.c rbtree: add postorder iteration functions 2015-09-16 18:20:19 +05:30
rbtree_test.c rbtree_test: add __init/__exit annotations 2013-04-30 17:04:07 -07:00
reciprocal_div.c
scatterlist.c Merge upstream linux-stable v3.10.28 into msm-3.10 2014-03-24 14:28:34 -07:00
sha1.c
show_mem.c mm, show_mem: suppress page counts in non-blockable contexts 2013-04-29 15:54:28 -07:00
smp_processor_id.c
sort.c
stmp_device.c lib: add support for stmp-style devices 2012-04-20 23:27:08 +02:00
string.c UPSTREAM: lib/string.c: introduce strreplace() 2016-05-18 14:36:10 +05:30
string_helpers.c lib/string_helpers: introduce generic string_unescape 2013-04-30 17:04:03 -07:00
strncpy_from_user.c word-at-a-time: make the interfaces truly generic 2012-05-26 11:33:40 -07:00
strnlen_user.c lib: Fix strnlen_user() to not touch memory after specified maximum 2015-06-05 23:19:54 -07:00
swiotlb.c swiotlb: Setting default IO TBL value to 1MB 2014-06-02 08:46:43 -07:00
syscall.c
test-kstrtox.c lib/test-kstrtox.c: mark const init data with __initconst instead of __initdata 2012-05-29 16:22:32 -07:00
test-string_helpers.c lib/string_helpers: introduce generic string_unescape 2013-04-30 17:04:03 -07:00
textsearch.c
timerqueue.c
ts_bm.c
ts_fsm.c
ts_kmp.c
ucs2_string.c lib/ucs2_string: Correct ucs2 -> utf8 conversion 2016-03-16 08:41:37 -07:00
usercopy.c Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS 2013-04-30 17:04:09 -07:00
uuid.c uuid: use prandom_bytes() 2013-04-29 18:28:42 -07:00
vsprintf.c vsprintf: ignore %n again 2014-05-30 10:23:23 -07:00