When resetting the device after a hang, the pending transactions in
the VBIF should be cleared since the GPU is hung and unable to accept
any transactions. These pending transactions can cause the VBIF pipe
to block the IOMMU, so clear them.
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Change-Id: I6e0171a6e61c0dd831ce7afdc177775b2ae3f07f
Reserve CMA memory for the kgsl driver early during bootup and then
use dma_alloc_coherent() to allocate physically contiguous memory
instead of using the MMU.
Change-Id: Ica9b244fe9b9d8a902d670293a0bec2edf81bd5d
Signed-off-by: Shrenuj Bansal <shrenujb@codeaurora.org>
Put CP_STATE_DEBUG_INDEX and CP_STATE_DEBUG_DATA under protection
to keep it from being written from an IB1. Doing so however opens
up a subtle "feature" in the microcode: memory read opcodes turn off
protected mode in the microcode to do the read and then turn it
back on regardless of the initial state. This is a problem if the
memory read happens while protected mode is turned off and then we
try to access a protected register which then complains and goes boom.
To account for this irregularity, explicitly turn protected mode back
off in all the places where we know this will be a problem.
Change-Id: Ic0dedbad1397ca9b80132241ac006560a615e042
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
When we get a protected mode error print out the register information
that caused the exception.
Change-Id: Ic0dedbad4f586c5715669226619b51665ef9681f
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Put the SMMU register range in protected mode to shield them from
IB1/IB2 writes from userspace.
CRs-Fixed: 599971
Change-Id: Ic0dedbad8c03fc1c54ff73221231e2440d3c34dd
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Turn on protected register mode for the A3XX GPU family and add 0x63
(RBBM_INT_0_MASK) to the list of protected registers.
Change-Id: Ic0dedbad10ebfa6eb6d3d815b5aa9b6b6f0e8e35
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Over time cherry-picks for KGSL have been skipped or
have been resolved differently between branches. As
a result, this branch of KGSL has become increasingly
difficult to maintain due to merge conflicts. With a
few exceptions KGSL should match the msm-3.4 mainline
exactly. To rectify the situation, this change brings
KGSL up-to-date with the msm-3.4 mainline as a bulk
change because cherry-picks are not practical.
Change-Id: I53f9f7fbf4942e147dea486ff5dbf179af75ea8c
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Previously, kgsl_perfcounter_get only returned the offset
of the LO perf counter register. This assumed that the HI
register would always be adjacent to the LO register. With
VBIF 2.0, this assumption has now been broken, so both the
HI and LO register offsets must be returned.
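
A minimal sketch of why both offsets are needed, assuming a simulated
register file. The struct and function names are hypothetical and are
not the driver's actual API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup result: both register offsets are reported. */
struct perfcounter_offsets {
	uint32_t offset_lo;	/* dword offset of the LO register */
	uint32_t offset_hi;	/* dword offset of the HI register */
};

/*
 * Build the 64-bit counter value from a simulated register file.
 * The HI offset is taken from the lookup result and is never derived
 * as offset_lo + 1.
 */
uint64_t perfcounter_read64(const uint32_t *regs, size_t nregs,
			    struct perfcounter_offsets off)
{
	assert(off.offset_lo < nregs && off.offset_hi < nregs);
	return ((uint64_t)regs[off.offset_hi] << 32) | regs[off.offset_lo];
}
```

A caller that assumed adjacency would read the wrong HI register
whenever the two offsets are not consecutive.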
CRs-fixed: 578771
Change-Id: Ie74da5d797e58a143b89a61aba7ebaf1ed42ed5e
Signed-off-by: Kevin Matlage <kmatlage@codeaurora.org>
Add commands to flush the GPU UCHE when a new context submits
commands to GPU. This ensures that the new context does not use
stale data present in UCHE.
Change-Id: I123a323be5f3fb9d1f9f96fed5bb68b8d0d27d76
CRs-Fixed: 607976
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Move logic for handling preamble based context switch
to the core adreno code. This makes it less burdensome
to implement support for newer GPU families that won't
ever support legacy context switching.
Change-Id: Id9ad5936ff91dcdbc9de869baf0d0b9fcf1b5170
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
When reading some counters, they were not frozen correctly. Reset the
correct bit to freeze the counters.
Change-Id: I1065adec44fc9c702f9c720f96f444d6476bea7e
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Add a return value to the adreno_perfcounter_enable function because it
can fail. Also, some perfcounter enable functions were receiving an
incorrect parameter; fix this by passing the right parameter to these
enable functions.
Change-Id: Ia47e9eab7833c44fcc0dc389ac8afce425f0a28e
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
irq_last is a flag to indicate kgsl should busy wait in an
attempt to NAP because the GPU is likely done with its last
batch of processing.
The setting and clearing of the irq_last variable was removed
with the dispatcher. Although there are no visible instances
of trouble getting to NAP in current builds, re-add the functionality
until it can be fully replaced.
Change-Id: I8b4d2b72aec8948919fbeed9939684a09cb4f7f9
Signed-off-by: Lucille Sylvester <lsylvest@codeaurora.org>
VBIF, VBIF power and power performance counters are special case
counters that are stopped by using a different set of registers.
Handle the case to read these counters in separate functions.
Change-Id: Ia89ba44d1031496ac7742b435c28ea2f3b2177fa
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Restructure the performance counter mechanism by combining the two
structures declared for it. The 2 structures can only refer to a
single counter hence it is easier to manage by having them
under a single structure.
Change-Id: I19d13cf5aa619b85a332b383b464c2af65ad38c9
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Rather than just storing if the performance counter was referenced by the
kernel in a flag, keep track of user space and kernel space references
separately. This allows finer grained manipulations on the performance
counters (i.e. when allowed to be released). This is needed to be able to
turn off certain performance counters when fault tolerance is disabled
at runtime.
Change-Id: I7b5593459e64557dabd594aeb6532a0c9af6a9c5
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Add VBIF register programming for a420 core during start up of
this device.
Change-Id: I1bd79f72dabdc8c4e9403bb2788b3ea53d8ba04c
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Add support for the A4XX GPU family. A4xx shares a lot of code with
A3xx, reuse the common functions whenever possible.
Change-Id: If10eac6ad71c92bf699a8874c1f189afc74db914
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
On some targets (A330v2) the hardware is in a wonky state following
a power collapse. Send a special command buffer before the first
submission to put everything back in place.
Change-Id: Ic0dedbadb8e676677b9db95defd53f7bd3fba338
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Store the process private pointer in context. Earlier the process
private pointer was referenced through the dev_priv pointer in the
context, but the dev_priv pointer can be destroyed before the context
process private so store this pointer locally.
Change-Id: Ic07680b79db55d6306306bd61bda5a1288813914
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Implement the KGSL fault tolerance policy for faults in the dispatcher.
Replay (or skip) the inflight command batches as dictated by the policy,
iterating progressively through the various behaviors.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Soft reset the GPU when the GPU power rail is not turned off.
A hard reset reloads the full microcode. In a soft reset the
memory power rail stays on, so only the jump table part of the
microcode needs to be loaded. This reduces GPU reset time from
10ms to 200us.
Change-Id: Ibbc36e97cc95425e13856fd5d847eed742743723
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Set up the protection registers for a3xx towards the end of its
start function instead of doing it in the generic ringbuffer start.
Change-Id: I66df496afa5d1fdf7dea790306f5358c2098674d
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Having a separate allocated struct for the device specific context
makes ownership unclear, which could lead to reference counting
problems or invalid pointers. Also, duplicate members were
starting to appear in adreno_context because there wasn't a safe
way to reach the kgsl_context from some parts of the adreno code.
This can now be done via container_of().
This change alters the lifecycle of the context->id, which is
now freed when the context reference count hits zero rather
than in kgsl_context_detach().
It also changes the context creation and destruction sequence.
The device specific code must allocate a structure containing
a struct kgsl_context and pass a pointer to it to kgsl_init_context()
before doing any device specific initialization. There is also a
separate drawctxt_detach() callback for doing device specific
cleanup. This is separate from freeing memory, which is done
by the drawctxt_destroy() callback.
Change-Id: I7d238476a3bfec98fd8dbc28971cf3187a81dac2
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
There were dual functions for reading and writing registers for
adreno devices. Stop the use of one of these dual functions as
it makes the code more uniform.
Change-Id: I703d27d1674a85a6c2d7a9fe6dc49f13005a3410
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Different adreno cores have different offsets for the same register.
These registers are referenced in code areas which are common to
all adreno cores. Hence, they should be referenced with a variable
instead of using a constant to make things more generic. This makes
the code more suitable for accommodating future cores.
Change-Id: Ie3d387d7cf767d46eea90e0fecdbba88dad97860
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Special case performance counters do not have select registers.
Enabling performance counters first checks that the counters we want
are valid and have select registers, then it would enable the counters.
The special case performance registers were not being enabled in this
case since they fail the initial test. Move these special cases to
before the select register error checking and perform their own sanity
checks before enabling the performance counters.
Change-Id: I716103fb6bfb97ba3e198503531af139fb1725f8
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Make CFF capture a device specific property. This allows the control
of CFF for a particular device without CFF interference from another
device. This will be useful when we have a virtual device and need to
only capture CFF for the virtual device. CFF capture can only be
turned on for one device at a time.
Change-Id: I14c5a4442ad05327de1413d98bf795dbd196119d
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
SP performance counter 4 is broken on A33x targets so do not assign
this counter. The counter does not reliably return correct values
which can make results misleading.
Change-Id: I87c36e021c547b630e8dfd89abbdb5c65d4b3c46
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Adding support for Coresight debug bus to work with the GPU,
including registering graphics core with Coresight and a coresight
interface to GPU through sysfs.
Change-Id: I9508659ca7d7d67e8a8becba41d06be76360c570
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Add new GPU ID, macros and VBIF settings for new GPU revision A305C.
Change-Id: Idcea9ac902a605bc1fc4a38f7ad491b98e39a387
Signed-off-by: Lokesh Batra <lbatra@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Certain a3xx registers which can only be read through the HLSQ
block with debug read path were defined in the generic a3xx
register list. This list is read during snapshot and was
causing the system to crash. Remove these registers from the
list to avoid system crash.
Change-Id: Id9e9bdd0fe57fe9282a08913dfc899e6ebabbc11
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Enable VBIF dynamic clock gating for 8974v2 by leaving
VBIF_CLKON at its POR value.
Change-Id: Iea465c992448188bb6a248fa625e37a37b578f7c
Signed-off-by: Pu Chen <puchen@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Supply more VBIF settings for A305B GPU.
Program optimal values to VBIF registers which have
non-optimal power-on-reset values.
Change-Id: I8ca2fafe91d360bbfcafcf6fc86e3f052ad85e27
Signed-off-by: liu zhong <zhongl@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Add VBIF table for A305B in MSM8226.
Change-Id: I2b6b43a7658eada81bedcc693b81222b050d229e
Signed-off-by: liu zhong <zhongl@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Add new GPU ID and macros for the updated A305 GPU.
Change-Id: I76072e010352221790d7e01f2ced0d884fa42366
Signed-off-by: liu zhong <zhongl@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
msm: kgsl: Add device init function
Some device specific parameters need to be setup only once during
device initialization. Create an init function for this purpose
rather than re-doing this init every time the device is started.
Change-Id: I45c7fcda8d61fd2b212044c9167b64f793eedcda
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 2nd commit message:
msm: kgsl: improve active_cnt and ACTIVE state management
Require any code path which intends to touch the hardware
to take a reference on active_cnt with kgsl_active_count_get()
and release it with kgsl_active_count_put() when finished.
These functions now do the wake / sleep steps that were
previously handled by kgsl_check_suspended() and
kgsl_check_idle().
Additionally, kgsl_pre_hwaccess() will no longer turn on
the clocks, it just enforces via BUG_ON that the clocks
are enabled before a register is touched.
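
The get/put discipline described above can be sketched as follows.
This is a simplified toy model and not the driver code: the toy_*
names are hypothetical, the clocks are a plain flag, and the last
put sleeps immediately, which the real driver may not do.

```c
#include <assert.h>
#include <stdint.h>

/* Toy device: a reference count plus a flag standing in for the clocks. */
struct toy_device {
	unsigned int active_cnt;
	int clocks_on;
	uint32_t regs[4];
};

/* Take a reference; the first reference wakes the device. */
void toy_active_count_get(struct toy_device *dev)
{
	if (dev->active_cnt++ == 0)
		dev->clocks_on = 1;
}

/* Drop a reference; the last one lets the device sleep again. */
void toy_active_count_put(struct toy_device *dev)
{
	assert(dev->active_cnt > 0);
	if (--dev->active_cnt == 0)
		dev->clocks_on = 0;
}

/* Register access only checks the clocks, it never enables them. */
uint32_t toy_regread(const struct toy_device *dev, unsigned int index)
{
	assert(dev->clocks_on);	/* stands in for the BUG_ON check */
	assert(index < 4);
	return dev->regs[index];
}
```

Every hardware access is bracketed by a get and a put, so a missing
reference shows up as an assertion failure instead of a silent clock
enable.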
Change-Id: I31b0d067e6d600f0228450dbd73f69caa919ce13
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 3rd commit message:
msm: kgsl: Sync memory with CFF from places where it was missing
Before submitting any indirect buffer to GPU via the ringbuffer,
the indirect buffer memory should be synced with CFF so that the
CFF capture will be complete. Add the syncing of memory with CFF
in places where this was missing.
Change-Id: I18f506dd1ab7bdfb1a68181016e6f661a36ed5a2
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 4th commit message:
msm: kgsl: Export some kgsl-core functions to EXPORT_SYMBOLS
Export some functions in the KGSL core driver so they can
be seen by the leaf drivers.
Change-Id: Ic0dedbad5dbe562c2e674f8e885a3525b6feac7b
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 5th commit message:
msm: kgsl: Send the right IB size to adreno_find_ctxtmem
adreno_find_ctxtmem expects byte lengths and we were sending it
dword lengths which was about as effective as you would expect.
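
For illustration, the unit conversion amounts to the following. The
helper name is hypothetical and is not taken from the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert an IB length in dwords to bytes; one dword is four bytes. */
size_t ib_dwords_to_bytes(size_t dwords)
{
	assert(dwords <= SIZE_MAX / sizeof(uint32_t));	/* no overflow */
	return dwords * sizeof(uint32_t);
}
```

Passing the dword count directly would make the lookup cover only a
quarter of the buffer.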
Change-Id: Ic0dedbad536ed377f6253c3a5e75e5d6cb838acf
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 6th commit message:
msm: kgsl: Add 8974 default GPR0 & clk gating values
Add correct clock gating values for A330, A305 and A320.
Add generic function to return the correct default clock
gating values for the respective gpu. Add default GPR0
value for A330.
Change-Id: I039e8e3622cbda04924b0510e410a9dc95bec598
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 7th commit message:
msm: kgsl: Move A3XX VBIF settings decision to a table
The vbif selection code is turning into a long series of if/else
clauses. Move the decision to a look up table that will be easier
to update and maintain when we have eleventy A3XX GPUs.
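
A rough sketch of the table-driven approach, assuming a simulated
register file. All names are hypothetical and the register offsets
and values are placeholders, not real VBIF settings:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vbif_reg {
	uint32_t offset;
	uint32_t value;
};

enum toy_gpu { TOY_A305, TOY_A320, TOY_A330 };

/* Placeholder settings; each list ends with a zero entry. */
static const struct vbif_reg toy_a305_vbif[] = { { 0x10, 0x1 }, { 0, 0 } };
static const struct vbif_reg toy_a320_vbif[] = { { 0x10, 0x2 }, { 0, 0 } };
static const struct vbif_reg toy_a330_vbif[] = {
	{ 0x10, 0x3 }, { 0x14, 0x7 }, { 0, 0 }
};

/* Adding a new GPU means adding one row here, not another else-if. */
static const struct {
	enum toy_gpu gpu;
	const struct vbif_reg *regs;
} toy_vbif_table[] = {
	{ TOY_A305, toy_a305_vbif },
	{ TOY_A320, toy_a320_vbif },
	{ TOY_A330, toy_a330_vbif },
};

/* Return the settings list for a GPU, or NULL if it is unknown. */
const struct vbif_reg *toy_vbif_lookup(enum toy_gpu gpu)
{
	size_t i;

	for (i = 0; i < sizeof(toy_vbif_table) / sizeof(toy_vbif_table[0]); i++)
		if (toy_vbif_table[i].gpu == gpu)
			return toy_vbif_table[i].regs;
	return NULL;
}

/* Write a zero-terminated list into a simulated register file. */
void toy_vbif_apply(uint32_t *regs, size_t nregs, const struct vbif_reg *list)
{
	for (; list->offset != 0; list++) {
		assert(list->offset < nregs);
		regs[list->offset] = list->value;
	}
}
```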
Change-Id: Ic0dedbadd6b16734c91060d7e5fa50dcc9b8774d
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 8th commit message:
msm: kgsl: Update settings for the A330v2 GPU in 8974v2
The new GPU spin in 8974v2 has some slightly different settings
than the 8974v1: add support for identifying a v2 spin, add a new
table of VBIF register settings and update the clock gating
registers.
Change-Id: Ic0dedbad22bd3ed391b02f6327267cf32f17af3d
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 9th commit message:
msm: kgsl: Fix compilation errors when CFF is turned on
Fix the compilation errors when the MSM_KGSL_CFF_DUMP option
is turned on.
Change-Id: I59b0a7314ba77e2c2fef03338e061cd503e88714
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 10th commit message:
msm: kgsl: Convert the Adreno GPU cycle counters to run free
In anticipation of allowing multiple entities to share access to the
performance counters, make the few performance counters that KGSL
uses run free.
Change-Id: Ic0dedbadbefb400b04e4f3552eed395770ddbb7b
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 11th commit message:
msm: kgsl: Handle a possible ringbuffer allocspace error
In the GPU specific start functions, account for the possibility
that ringbuffer allocation routine might return NULL.
Change-Id: Ic0dedbadf6199fee78b6a8c8210a1e76961873a0
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 12th commit message:
msm: kgsl: Add a new API to allow sharing of GPU performance counters
Adreno uses programmable performance counters, meaning that while there
are a limited number of physical counters each counter can be programmed
to count a vast number of different measurements (we refer to these as
countables). This could cause problems if multiple apps want to use
the performance counters, so this API and infrastructure allows the
counters to be safely shared.
The kernel tracks which countable is selected for each of the physical
counters for each counter group (where groups closely match hardware
blocks). If the desired countable is already in use, or there is an
open physical counter, then the process is allowed to use the counter.
The get ioctl reserves the counter and returns the dword offset of the
register associated with that physical counter. The put ioctl
releases the physical counter. The query ioctl gets the countables
used for all of the counters in the block - up to 8 values can be
returned. The read ioctl gets the current hardware value in the counter.
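
The sharing rule for one counter group can be sketched like this. It
is a simplified model under stated assumptions: the toy_* names and
the group size of 4 are hypothetical, and the sketch returns a
counter index where the real get ioctl returns a register offset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_GROUP_COUNTERS 4

struct toy_counter {
	uint32_t countable;	/* what the counter currently measures */
	unsigned int refs;	/* 0 means the physical counter is free */
};

struct toy_group {
	struct toy_counter counters[TOY_GROUP_COUNTERS];
};

/* Reserve a counter for a countable; returns its index or -EBUSY. */
int toy_perfcounter_get(struct toy_group *grp, uint32_t countable)
{
	int i, empty = -1;

	assert(grp != NULL);
	for (i = 0; i < TOY_GROUP_COUNTERS; i++) {
		struct toy_counter *c = &grp->counters[i];

		if (c->refs > 0 && c->countable == countable) {
			c->refs++;	/* share the existing counter */
			return i;
		}
		if (c->refs == 0 && empty < 0)
			empty = i;	/* remember the first free slot */
	}
	if (empty < 0)
		return -EBUSY;
	grp->counters[empty].countable = countable;
	grp->counters[empty].refs = 1;
	return empty;
}

/* Drop one reference; returns 0, or -EINVAL if nothing was reserved. */
int toy_perfcounter_put(struct toy_group *grp, uint32_t countable)
{
	int i;

	assert(grp != NULL);
	for (i = 0; i < TOY_GROUP_COUNTERS; i++) {
		struct toy_counter *c = &grp->counters[i];

		if (c->refs > 0 && c->countable == countable) {
			c->refs--;
			return 0;
		}
	}
	return -EINVAL;
}
```

Two processes asking for the same countable end up on the same
physical counter, and a counter becomes free again only after its
last reference is dropped.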
Change-Id: Ic0dedbadae1dedadba60f8a3e685e2ce7d84fb33
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
# This is the 13th commit message:
msm: kgsl: Print the nearest active GPU buffers to a faulting address
Print the two active GPU memory entries that bracket a faulting GPU
address. This will help diagnose premature frees and buffer overruns.
Check if the faulting GPU address was freed by the same process.
Change-Id: Ic0dedbadebf57be9abe925a45611de8e597447ea
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Vladimir Razgulin <vrazguli@codeaurora.org>
# This is the 14th commit message:
msm: kgsl: Remove an unneeded register write for A3XX GPUs
A3XX doesn't have the MH block and so the register at 0x40 points
somewhere else. Luckily the write was harmless but remove it anyway.
Change-Id: Ic0dedbadd1e043cd38bbaec8fcf0c490dcdedc8c
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 15th commit message:
msm: kgsl: clean up iommu/gpummu protflag handling
Make kgsl_memdesc_protflags() return the correct type of flags
for the type of mmu being used. Query the memdesc with this
function in kgsl_mmu_map(), rather than passing in the
protflags. This prevents translation at multiple layers of
the code and makes it easier to enforce that the mapping matches
the allocation flags.
Change-Id: I2a2f4a43026ae903dd134be00e646d258a83f79f
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 16th commit message:
msm: kgsl: remove kgsl_mem_entry.flags
The two flags fields in kgsl_memdesc should be enough for
anyone. Move the only flag using kgsl_mem_entry, the
FROZEN flag for snapshot processing, to use kgsl_memdesc.priv.
Change-Id: Ia12b9a6e6c1f5b5e57fa461b04ecc3d1705f2eaf
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 17th commit message:
msm: kgsl: map the guard page readonly on the iommu
The guard page needs to be readable by the GPU, due to
a prefetch range issue, but it should never be writable.
Change the page fault message to indicate if nearby
buffers have a guard page.
Change-Id: I3955de1409cbf4ccdde92def894945267efa044d
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 18th commit message:
msm: kgsl: Add support for VBIF and VBIF_PWR performance counters
These 2 counter groups are also "special cases" that require
different programming sequences.
Change-Id: I73e3e76b340e6c5867c0909b3e0edc78aa62b9ee
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 19th commit message:
msm: kgsl: Only allow two counters for VBIF performance counters
There are only two VBIF counter groups so validate that the user
doesn't pass in > 1 and clean up the if/else clause.
Change-Id: Ic0dedbad3d5a54e4ceb1a7302762d6bf13b25da1
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 20th commit message:
msm: kgsl: Avoid an array overrun in the perfcounter API
Make sure the passed group is less than the size of the list of
performance counters.
Change-Id: Ic0dedbadf77edf35db78939d1b55a05830979f85
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 21st commit message:
msm: kgsl: Don't go to slumber if active_count is non zero
If active_cnt happens to be set when we go into
kgsl_early_suspend_driver() then don't go to SLUMBER. This
avoids trouble if we come back and try to access the
hardware while it is off.
Change-Id: Ic0dedbadb13514a052af6199c8ad1982d7483b3f
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 22nd commit message:
msm: kgsl: Enable HLSQ registers in snapshot when available
Reading the HLSQ registers during a GPU hang recovery might cause
the device to hang depending on the state of the HLSQ block.
Enable the HLSQ register reads when we know that they will
succeed.
Change-Id: I69f498e6f67a15328d1d41cc64c43d6c44c54bad
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 23rd commit message:
msm: kgsl: snapshot: Don't keep parsing indirect buffers on failure
Stop parsing an indirect buffer if an error is encountered (such as
a missing buffer). This is a pretty good indication that the buffers
are not reliable and the further the parser goes with an unreliable
buffer the more likely it is to get confused.
Change-Id: Ic0dedbadf28ef374c9afe70613048d3c31078ec6
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 24th commit message:
msm: kgsl: snapshot: Only push the last IB1 and IB2 in the static space
Some IB1 buffers have hundreds of little IB2 buffers and only one of them
will actually be interesting enough to push into the static space. Only
push the last executed IB1 and IB2 into the static space.
Change-Id: Ic0dedbad26fb30fb5bf90c37c29061fd962dd746
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 25th commit message:
msm: kgsl: Save the last active context in snapshot
Save the last active context that was executing when the hang happened
in snapshot.
Change-Id: I2d32de6873154ec6c200268844fee7f3947b7395
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 26th commit message:
msm: kgsl: In snapshot track a larger object size if address is same
If the object being tracked has the same address as a previously
tracked object then only track a single object with larger size
as the smaller object will be a part of the larger one anyway.
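
A minimal sketch of the merge rule, assuming a small fixed-size
object list. The names and the list size are hypothetical and do not
reflect the snapshot code's actual data structures:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_OBJS 16

struct toy_snap_obj {
	uint64_t gpuaddr;
	size_t size;
};

struct toy_snap_list {
	struct toy_snap_obj objs[TOY_MAX_OBJS];
	size_t count;
};

/* Track an object; a repeated address keeps only the larger size. */
int toy_snap_track(struct toy_snap_list *list, uint64_t gpuaddr, size_t size)
{
	size_t i;

	assert(list != NULL);
	for (i = 0; i < list->count; i++) {
		if (list->objs[i].gpuaddr == gpuaddr) {
			if (size > list->objs[i].size)
				list->objs[i].size = size;
			return 0;
		}
	}
	if (list->count == TOY_MAX_OBJS)
		return -ENOSPC;
	list->objs[list->count].gpuaddr = gpuaddr;
	list->objs[list->count].size = size;
	list->count++;
	return 0;
}
```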
Change-Id: I0e33bbaf267bc0ec580865b133917b3253f9e504
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 27th commit message:
msm: kgsl: Track memory address from 2 additional registers
Add tracking of memory referenced by VS_OBJ_START_REG and FS_OBJ_START_REG
registers in snapshot. This makes snapshot more complete in terms of
tracking data that is used by the GPU at the time of hang.
Change-Id: I7e5f3c94f0d6744cd6f2c6413bf7b7fac4a5a069
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 28th commit message:
msm: kgsl: Loop till correct index on type0 packets
When searching for memory addresses in type0 packet we were looping
from start of the type0 packet till its end, but the first DWORD
is a header so we only need to loop till packet_size - 1. Fix this.
Change-Id: I278446c6ab380cf8ebb18d5f3ae192d3d7e7db62
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 29th commit message:
msm: kgsl: Add global timestamp information to snapshot
Make sure that we always add global timestamp information to
snapshot. This is needed in playbacks for searching whereabouts
of last executed IB.
Change-Id: Ica5b3b2ddff6fd45dbc5a911f42271ad5855a86a
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 30th commit message:
msm: kgsl: Skip cff dump for certain functions when it is disabled
Certain functions were generating CFF when CFF was disabled. Make
sure these functions do not dump CFF when it is disabled.
Change-Id: Ib5485b03b8a4d12f190f188b80c11ec6f552731d
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 31st commit message:
msm: kgsl: Fix searching of memory object
Make sure that at least a size of 1 byte is searched when locating
the memory entry of a region. If size is 0 then a memory region
whose last address is equal to the start address of the memory being
searched will be returned which is wrong.
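
The off-by-one can be illustrated with a containment check. This is a
sketch under the assumption that the lookup tests whether the searched
range lies inside a region; the function name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Does [addr, addr + len) lie inside the region [base, base + size)? */
int toy_range_in_region(uint64_t base, uint64_t size,
			uint64_t addr, uint64_t len)
{
	assert(size <= UINT64_MAX - base);	/* region must not wrap */

	if (len == 0)
		len = 1;	/* always search at least one byte */
	if (addr < base || len > size)
		return 0;
	return addr - base <= size - len;
}
```

Without the clamp to one byte, an address equal to base + size would
pass the final comparison even though it is the first byte past the
region.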
Change-Id: I643185d1fdd17296bd70fea483aa3c365e691bc5
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 32nd commit message:
msm: kgsl: If adreno start fails then restore state of device
Restore the state of the device back to what it was at the
start of the adreno_start function if this function fails to
execute successfully.
Change-Id: I5b279e5186b164d3361fba7c8f8d864395b794c8
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 33rd commit message:
msm: kgsl: Fix early exit condition in ringbuffer drain
The ringbuffer drain function can be called when the ringbuffer
start flag is not set. This happens on startup. Hence,
exiting the function early based on start flag is incorrect.
Simply execute this function regardless of the start flag.
Change-Id: Ibf2075847f8bb1a760bc1550309efb3c7aa1ca49
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 34th commit message:
msm: kgsl: Do not return an error on NULL gpu address
If a NULL gpu address is passed to snapshot object tracking
function then do not treat this as an error and return 0. NULL
objects may be present in an IB so just skip over these objects
instead of exiting due to an error.
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Change-Id: Ic253722c58b41f41d03f83c77017e58365da01a7
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 35th commit message:
msm: kgsl: Don't hold process list global mutex in process private create
Don't hold process list global mutex for long. Instead make
use of process specific spin_lock() to serialize access
to process private structure while creating it. Holding
process list global mutex could lead to deadlocks as other
functions depend on it.
CRs-fixed: 480732
Change-Id: Id54316770f911d0e23384f54ba5c14a1c9113680
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 36th commit message:
msm: kgsl: Use CPU path to program pagetable when active count is 0
When active count is 0 then we should use the CPU path to program
pagetables because the GPU path requires event registration. Events
can only be queued when active count is valid. Hence, if the active
count is 0 then use the CPU path.
Change-Id: I70f5894d20796bdc0f592db7dc2731195c0f7a82
CRs-fixed: 481887
Signed-off-by: Shubhrapralash Das <sadas@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 37th commit message:
iommu: msm: prevent partial mappings on error
If msm_iommu_map_range() fails mid way through the va
range with an error, clean up the PTEs that have already
been created so they are not leaked.
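
The cleanup can be sketched with a toy flat page table. This is an
illustration only: the toy_* names, the table size and the PTE layout
are assumptions and do not match msm_iommu_map_range() internals.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NPTES	16
#define TOY_PAGE_SIZE	4096u
#define TOY_PTE_VALID	1u	/* low bit marks an entry as in use */

struct toy_pagetable {
	uint64_t pte[TOY_NPTES];
};

/* Map count pages starting at entry first; all or nothing. */
int toy_map_range(struct toy_pagetable *pt, size_t first, size_t count,
		  uint64_t pa)
{
	size_t i;
	int ret = 0;

	assert(pt != NULL);
	for (i = 0; i < count; i++) {
		size_t idx = first + i;

		if (idx >= TOY_NPTES) {
			ret = -EINVAL;
			break;
		}
		if (pt->pte[idx] & TOY_PTE_VALID) {
			ret = -EBUSY;
			break;
		}
		pt->pte[idx] = (pa + (uint64_t)i * TOY_PAGE_SIZE) |
			       TOY_PTE_VALID;
	}

	/* On failure, clear only the entries this call created. */
	if (ret)
		while (i-- > 0)
			pt->pte[first + i] = 0;

	return ret;
}
```

Validity is tracked with a separate bit, so a mapping of physical
address 0 is still distinguishable from an empty entry.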
Change-Id: Ie929343cd6e36cade7b2cc9b4b4408c3453e6b5f
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 38th commit message:
msm: kgsl: better handling of virtual address fragmentation
When KGSL_MEMFLAGS_USE_CPU_MAP is enabled, the mmap address
must try to match the GPU alignment requirements of the buffer,
as well as include space in the mapping for the guard page.
This can cause -ENOMEM to be returned from get_unmapped_area()
when there are a large number of mappings. When this happens,
fall back to page alignment and retry to avoid failure.
Change-Id: I2176fe57afc96d8cf1fe1c694836305ddc3c3420
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 39th commit message:
iommu: msm: Don't treat address 0 as an error case
Currently, the iommu page table code treats a scattergather
list with physical address 0 as an error. This may not be
correct in all cases. Physical address 0 is a valid part
of the system and may be used for valid page allocations.
Nothing else in the system checks for physical address 0
for error so don't treat it as an error.
Change-Id: Ie9f0dae9dace4fff3b1c3449bc89c3afdd2e63a0
CRs-Fixed: 478304
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 40th commit message:
msm: kgsl: prevent race between mmap() and free on timestamp
When KGSL_MEMFLAGS_USE_CPU_MAP is set, we must check that the
address from get_unmapped_area() is not used as part of a
mapping that is present only in the GPU pagetable and not the
CPU pagetable. These mappings can occur because when a buffer
is freed on timestamp, the CPU mapping is destroyed immediately
but the GPU mapping is not destroyed until the GPU timestamp
has passed.
Because kgsl_mem_entry_detach_process() removed the rbtree
entry before removing the iommu mapping, there was a window
of time where kgsl thought the address was available even
though it was still present in the iommu pagetable. This
could cause the address to get assigned to a new buffer,
which would cause iommu_map_range() to fail since the old
mapping was still in the pagetable. Prevent this race by
removing the iommu mapping before removing the rbtree entry
tracking the address.
Change-Id: I8f42d6d97833293b55fcbc272d180564862cef8a
CRs-Fixed: 480222
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 41st commit message:
msm: kgsl: add guard page support for imported memory
Imported memory buffers sometimes do not have enough
padding to prevent page faults due to overzealous
GPU prefetch. Attach guard pages to their mappings
to prevent these faults.
Because we don't create the scatterlist for some
types of imported memory, such as ion, the guard
page is no longer included as the last entry in
the scatterlist. Instead, it is handled by
size adjustments and a separate iommu_map() call
in the kgsl_mmu_map() and kgsl_mmu_unmap() paths.
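A rough sketch of the size adjustment, assuming a single guard page of
4096 bytes placed directly after the buffer. The names below are
hypothetical and only model the arithmetic, not the actual map path.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Hypothetical model of an imported buffer's GPU mapping. */
struct import_model {
	size_t size;    /* bytes covered by the imported scatterlist */
	int has_guard;  /* nonzero if a guard page follows the buffer */
};

/* GPU virtual address space to reserve: the buffer plus the guard. */
static size_t gpu_va_span(const struct import_model *m)
{
	return m->size + (m->has_guard ? SKETCH_PAGE_SIZE : 0);
}

/* Offset at which the separate guard page mapping would be placed. */
static size_t guard_offset(const struct import_model *m)
{
	assert(m->has_guard);
	return m->size;
}
```

The unmap path would have to use the same span so that the guard page
mapping is removed together with the buffer.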
Change-Id: I3af3c29c3983f8cacdc366a2423f90c8ecdc3059
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 42nd commit message:
msm: kgsl: fix kgsl_mem_entry refcounting
Make kgsl_sharedmem_find* return a reference to the
entry that was found. This makes using an entry
without the mem_lock held less race prone.
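A minimal sketch of the find-and-get pattern, using a flat table in
place of the rbtree. The struct and function names are hypothetical,
and the locking around the lookup is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory entry. */
struct entry_model {
	unsigned long gpuaddr;
	int refcount;
};

/*
 * Look up an entry and take a reference before returning it. In a real
 * driver the lookup and the increment would both happen under the lock,
 * so the entry cannot be freed in between.
 */
static struct entry_model *find_and_get(struct entry_model *tbl, size_t n,
					unsigned long gpuaddr)
{
	for (size_t i = 0; i < n; i++) {
		if (tbl[i].gpuaddr == gpuaddr) {
			tbl[i].refcount++;
			return &tbl[i];
		}
	}
	return NULL;
}

/* Every successful find_and_get() must be paired with a put. */
static void entry_put(struct entry_model *e)
{
	assert(e->refcount > 0);
	e->refcount--;
}
```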
Change-Id: If6eb6470ecfea1332d3130d877922c70ca037467
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 43rd commit message:
msm: kgsl: add ftrace for cache operations
Add the event kgsl_mem_sync_cache. This event is
emitted only when a cache operation is actually
performed. Attempts to flush uncached memory,
which do nothing, do not cause this event.
Change-Id: Id4a940a6b50e08b54fbef0025c4b8aaa71641462
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 44th commit message:
msm: kgsl: Add support for bulk cache operations
Add a new ioctl, IOCTL_KGSL_GPUMEM_SYNC_CACHE_BULK, which can be used
to sync a number of memory ids at once. This gives the driver an
opportunity to optimize the cache operations based on the total
working set of memory that needs to be managed.
Change-Id: I9693c54cb6f12468b7d9abb0afaef348e631a114
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 45th commit message:
msm: kgsl: flush the entire cache when the bulk batch is large
On 8064 and 8974, flushing more than 16 MB of virtual address
space is slower than flushing the entire cache. So flush
the entire cache when the working set is larger than this.
The threshold for full cache flush can be tuned at runtime via
the full_cache_threshold sysfs file.
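The decision can be sketched as follows. The function and enum names
are hypothetical; only the 16 MB figure and the "larger than" rule come
from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MB (1024u * 1024u)

enum flush_kind { FLUSH_RANGES, FLUSH_ALL };

/*
 * Sum the sizes in a bulk request and pick a strategy. Stopping as soon
 * as the threshold is crossed also keeps the running total from growing
 * without bound.
 */
static enum flush_kind choose_flush(const size_t *sizes, size_t count,
				    size_t threshold)
{
	size_t total = 0;

	assert(sizes != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		total += sizes[i];
		if (total > threshold)
			return FLUSH_ALL;
	}
	return FLUSH_RANGES;
}
```

A working set of exactly the threshold still takes the per-range path
in this sketch, since only larger sets trigger the full flush.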
Change-Id: If525e4c44eb043d0afc3fe42d7ef2c7de0ba2106
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 46th commit message:
msm: kgsl: Use a read/write lock for the context idr
Everybody loves RCU, but in this case we are dangerously mixing RCU and
atomic operations. Add a read/write lock to explicitly protect the idr.
Also fix a few spots where the idr was used without protection.
Change-Id: Ic0dedbad517a9f89134cbcf7af29c8bf0f034708
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 47th commit message:
msm: kgsl: embed kgsl_context struct in adreno_context struct
Having a separate allocated struct for the device specific context
makes ownership unclear, which could lead to reference counting
problems or invalid pointers. Also, duplicate members were
starting to appear in adreno_context because there wasn't a safe
way to reach the kgsl_context from some parts of the adreno code.
This can now be done via container_of().
This change alters the lifecycle of the context->id, which is
now freed when the context reference count hits zero rather
than in kgsl_context_detach().
It also changes the context creation and destruction sequence.
The device specific code must allocate a structure containing
a struct kgsl_context and pass a pointer to it to kgsl_init_context()
before doing any device specific initialization. There is also a
separate drawctxt_detach() callback for doing device specific
cleanup. This is separate from freeing memory, which is done
by the drawctxt_destroy() callback.
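The embedding can be sketched in plain C as below. The *_model structs
and the container_of_sketch() macro are simplified, hypothetical
versions written for userspace; the kernel provides its own
container_of().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace approximation of the kernel's container_of() macro. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic context. */
struct kgsl_context_model {
	unsigned int id;
	int refcount;
};

/* Hypothetical device-specific context with the generic one embedded. */
struct adreno_context_model {
	unsigned int flags;
	struct kgsl_context_model base;  /* deliberately not the first member */
};

/* Recover the device-specific struct from a pointer to the embedded one. */
static struct adreno_context_model *
to_adreno_context(struct kgsl_context_model *ctx)
{
	assert(ctx != NULL);
	return container_of_sketch(ctx, struct adreno_context_model, base);
}
```

Because both structs share one allocation, a single reference count on
the embedded context covers the lifetime of the device-specific data.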
Change-Id: I7d238476a3bfec98fd8dbc28971cf3187a81dac2
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 48th commit message:
msm: kgsl: Take a reference count on the active adreno draw context
Take a reference count on the currently active draw context to keep
it from going away while we are maintaining a pointer to it in the
adreno device.
Change-Id: Ic0dedbade8c09ecacf822e9a3c5fbaf6e017ec0c
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 49th commit message:
msm: kgsl: Add a command dispatcher to manage the ringbuffer
Implements a centralized dispatcher for sending user commands
to the ringbuffer. Incoming commands are queued by context and
sent to the hardware on a round robin basis ensuring each context
a small burst of commands at a time. Each command is tracked
throughout the pipeline giving the dispatcher better knowledge
of how the hardware is being used. This will be the basis for
future per-context and cross context enhancements such as priority
queuing and server-side synchronization.
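A toy model of the round robin policy, assuming a fixed burst size per
context per pass. The names and the burst parameter are hypothetical
and say nothing about how the dispatcher itself is structured.

```c
#include <assert.h>

/* Hypothetical per-context queue: only the pending count is modeled. */
struct ctx_queue {
	int pending;
};

/*
 * Visit the contexts in order and let each one submit up to 'burst'
 * commands per pass. The index of the submitting context is recorded in
 * 'order'. Returns the number of commands submitted.
 */
static int dispatch_round_robin(struct ctx_queue *q, int nctx, int burst,
				int *order, int max)
{
	int n = 0;
	int progressed = 1;

	assert(burst > 0);
	while (progressed) {
		progressed = 0;
		for (int i = 0; i < nctx; i++) {
			for (int b = 0; b < burst && q[i].pending > 0 &&
			     n < max; b++) {
				q[i].pending--;
				order[n++] = i;
				progressed = 1;
			}
		}
	}
	return n;
}
```

With three contexts holding 3, 1 and 2 commands and a burst of 2, the
submission order in this model is 0, 0, 1, 2, 2, 0.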
Change-Id: Ic0dedbad49a43e8e6096d1362829c800266c2de3
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 50th commit message:
msm: kgsl: Only turn on the idle timer when active_cnt is 0
Only turn on the idle timer when the GPU is expected to be quiet.
Change-Id: Ic0dedbad57846f1e7bf7820ec3152cd20598b448
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 51st commit message:
msm: kgsl: Add a ftrace event for active_cnt
Add a new ftrace event for watching the rise and fall of active_cnt:
echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_active_count/enable
This will give you the current active count and the caller of the function:
kgsl_active_count: d_name=kgsl-3d0 active_cnt=8e9 func=kgsl_ioctl
Change-Id: Ic0dedbadc80019e96ce759d9d4e0ad43bbcfedd2
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 52nd commit message:
msm: kgsl: Implement KGSL fault tolerance policy in the dispatcher
Implement the KGSL fault tolerance policy for faults in the dispatcher.
Replay (or skip) the inflight command batches as dictated by the policy,
iterating progressively through the various behaviors.
Change-Id: Ic0dedbade98cc3aa35b26813caf4265c74ccab56
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 53rd commit message:
msm: kgsl: Don't process events if the timestamp hasn't changed
Keep track of the global timestamp every time the event code runs.
If the timestamp hasn't changed then we are caught up and we can
politely bow out. This avoids the situation where multiple
interrupts queue the work queue multiple times:
  IRQ
    -> process events
  IRQ
  IRQ
    -> process events
The actual retired timestamp in the first work item might be well
ahead of the delivered interrupts. The event loop will end up
processing every event that has been retired by the hardware
at that point. If the work item gets re-queued by a subsequent
interrupt then we might have already addressed all the pending
timestamps.
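The early exit can be sketched like this. struct event_state and
process_events() are hypothetical names; the real event code does much
more than bump a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the event work item. */
struct event_state {
	unsigned int last_processed;  /* global timestamp seen on last run */
	int runs;                     /* how many times events were walked */
};

/*
 * Called from the work item with the currently retired timestamp.
 * Returns 1 if the event list was walked, 0 if there was nothing new.
 */
static int process_events(struct event_state *s, unsigned int retired)
{
	assert(s != NULL);
	if (retired == s->last_processed)
		return 0;  /* already caught up, bow out */

	s->last_processed = retired;
	s->runs++;  /* stands in for walking the event list */
	return 1;
}
```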
Change-Id: Ic0dedbad79722654cb17e82b7149e93d3c3f86a0
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 54th commit message:
msm: kgsl: Make active_cnt an atomic variable
In kgsl_active_cnt_light() the mutex was needed just to check and
increment the active_cnt value. Move active_cnt to an atomic to
begin the task of freeing ourselves from the grip of the device
mutex if we can avoid it.
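A userspace sketch of a lock-free check and increment using C11
atomics. It assumes the check is "only increment if the count is
already nonzero"; that rule and the function names are assumptions for
illustration, and kernel code would use the kernel's atomic_t helpers
rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Increment the count only if it is already nonzero, without a mutex.
 * Returns 0 on success and -1 if the count was zero.
 */
static int active_count_get_light(atomic_int *cnt)
{
	int old = atomic_load(cnt);

	assert(old >= 0);
	while (old > 0) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(cnt, &old, old + 1))
			return 0;
	}
	return -1;
}

/* Drop one reference taken by a successful get. */
static void active_count_put(atomic_int *cnt)
{
	int prev = atomic_fetch_sub(cnt, 1);

	assert(prev > 0);
	(void)prev;
}
```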
Change-Id: Ic0dedbad78e086e3aa3559fab8ecebc43539f769
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 55th commit message:
msm: kgsl: Add a new command submission API
Add a new ioctl entry point for submitting commands to the GPU
called IOCTL_KGSL_SUBMIT_COMMANDS.
As with IOCTL_KGSL_RINGBUFFER_ISSUEIBCMDS the user passes a list of
indirect buffers, flags and optionally a user specified timestamp. The
old way of passing a list of indirect buffers is no longer supported.
IOCTL_KGSL_SUBMIT_COMMANDS also allows the user to define a
list of sync points for the command. Sync points are dependencies
on events that need to be satisfied before the command will be issued
to the hardware. Events are designed to be flexible. To start with
the only events that are supported are GPU events for a given context/
timestamp pair.
Pending events are stored in a list in the command batch. As each event
expires it is deleted from the list. The adreno dispatcher won't send the
command until the list is empty. Sync points are not supported for Z180.
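The gating rule can be modeled as below. All names are hypothetical,
the list is a small fixed array, and timestamp wraparound is ignored
for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SYNC 4

/* Hypothetical sync point: wait for 'timestamp' on 'context_id'. */
struct syncpoint_model {
	unsigned int context_id;
	unsigned int timestamp;
	bool pending;
};

struct cmdbatch_model {
	struct syncpoint_model sync[MAX_SYNC];
	size_t nsync;
};

static void add_syncpoint(struct cmdbatch_model *b, unsigned int ctx,
			  unsigned int ts)
{
	assert(b->nsync < MAX_SYNC);
	b->sync[b->nsync++] = (struct syncpoint_model){ ctx, ts, true };
}

/* A context retired 'ts': expire every sync point that it satisfies. */
static void timestamp_retired(struct cmdbatch_model *b, unsigned int ctx,
			      unsigned int ts)
{
	for (size_t i = 0; i < b->nsync; i++) {
		struct syncpoint_model *s = &b->sync[i];

		if (s->pending && s->context_id == ctx && ts >= s->timestamp)
			s->pending = false;
	}
}

/* The batch may go to the hardware only when nothing is pending. */
static bool ready_to_submit(const struct cmdbatch_model *b)
{
	for (size_t i = 0; i < b->nsync; i++)
		if (b->sync[i].pending)
			return false;
	return true;
}
```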
CRs-Fixed: 468770
Change-Id: Ic0dedbad5a5935f486acaeb033ae9a6010f82346
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 56th commit message:
msm: kgsl: add kgsl_sync_fence_waiter for server side sync
For server side sync the KGSL kernel module needs to perform
an asynchronous wait for a fence object prior to issuing
subsequent commands.
Change-Id: I1ee614aa3af84afc4813f1e47007f741beb3bc92
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 57th commit message:
msm: kgsl: Add support for KGSL_CMD_SYNCPOINT_TYPE_FENCE
Allow command batches to wait for external fence sync events.
Change-Id: Ic0dedbad3a211019e1cd3a3d62ab6a3e4d4eeb05
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 58th commit message:
msm: kgsl: fix potential double free of the kwaiter
Change-Id: Ic0dedbad66a0af6eaef52b2ad53c067110bdc6e4
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
# This is the 59th commit message:
msm: kgsl: free an event only after canceling successfully
Change-Id: Ic0dedbade256443d090dd11df452dc9cdf65530b
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
In the GPU interrupt handler we attempt to clear the ts_cmp_enable
for the active context so that future interrupts are skipped until
someone needs one again. If for some reason the interrupt handler
is delayed then there is a possibility that the "current" context in
the GPU isn't the one that fired the interrupt. In that case we
could be accidentally clearing a ts_cmp_enable for a context that
needs it. Instead of clearing in the interrupt handler clear it
from the GPU so we can be sure we got the right context.
As a bonus pushing this logic to the GPU side lets us get rid of
some extra register reads/writes in the interrupt handlers.
Change-Id: Ic0dedbadbf350f7c4866092fa0686f9b42f3cd33
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Sometimes the core will go idle before the interrupt can be handled on
the GPU. If that happens then we could go to a lower power state before
cleaning up the pending interrupt and various entities that might be
waiting for it. Consider the current interrupt status when checking
for idle.
CRs-Fixed: 449813
Change-Id: Ic0dedbadfd2d40e4411cf3b05e1eb4c4eecf7841
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Remove ts_notifier_list from the driver since it is not being
used and is causing extra work to be done in the interrupt
handlers for A2XX, A3XX and Z180.
Change-Id: I5512e36f1e807f3a3e62aeac54cfd3075d4cf7a4
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Renaming recovery to fault tolerance and modifying
the functions and log messages accordingly.
Change-Id: I5f249806026ac514c4aff7da45c3a4e8cc2f8c34
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Fast hang detection algorithm is improved to use additional
performance counters to monitor shader processor activity.
Shader processor's active alu cycles, icl0 misses
and fs cflow instructions are added to list of activities
monitored for fast hang detection.
Change-Id: Ie74b2ca2d8eb587dbdae40f8fafd901e71f50ddb
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Use BIT to define all context related flags. This ensures
that these flags are unique and is easier to maintain. Also,
fix spelling of CTXT_FLAGS_BEING_DESTOYED to
CTXT_FLAGS_BEING_DESTROYED
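A small sketch of the BIT-based style. Only CTXT_FLAGS_BEING_DESTROYED
is named above; the other two flags and all bit positions are made up
for illustration, and BIT() is redefined here for userspace.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n) (1UL << (n))

/* Bit positions are assumptions; the two SKETCH flags are hypothetical. */
#define CTXT_FLAGS_SKETCH_A        BIT(0)
#define CTXT_FLAGS_SKETCH_B        BIT(1)
#define CTXT_FLAGS_BEING_DESTROYED BIT(2)

/* One bit per flag means no two flags can ever alias each other. */
static_assert((CTXT_FLAGS_SKETCH_A & CTXT_FLAGS_SKETCH_B) == 0,
	      "context flags must not overlap");
static_assert((CTXT_FLAGS_SKETCH_B & CTXT_FLAGS_BEING_DESTROYED) == 0,
	      "context flags must not overlap");

static int ctxt_has_flag(unsigned long flags, unsigned long flag)
{
	return (flags & flag) != 0;
}
```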
Change-Id: I866c67b9c5d59d6117e31714756af3106018f9cb
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Reading the A3XX HLSQ registers during a GPU hang recovery might cause
the device to hang. Disable the HLSQ register reads that would
cause recovery to fail until the failures are better understood.
Change-Id: I1553025fbd824bfacf91f062372d5731cd905cc4
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Update the VBIF register settings for A330 for better performance and
stability per the latest testing and analysis.
CRs-Fixed: 416680
Change-Id: Ic0dedbad71bfd589b322bed503052315d0bd1940
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Always reset the ts_cmp_enable when an interrupt is received
from the GPU. This keeps legacy code that is not using
per context timestamps correctly updated. No effect is
seen with mainline code using per context timestamps.
CRs-Fixed: 418172
Change-Id: I7f29086d4885571bdb165c0e759dc6ffc40b554f
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Some intensive shader operations can go for the full timeout
in the SP block without changes in the RBBM and CP registers
that we monitor for hang detection. Add the performance counter
SP_FS_FULL_ALU_INSTRUCTIONS to see if any full precision
instructions have been executed during the hang detection interval.
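The idea can be sketched as a comparison of two register snapshots
taken one detection interval apart. The function name and the array
layout are hypothetical; the last slot stands in for the added SP
performance counter.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare the previous and current snapshot of the monitored values.
 * If any of them moved, the GPU made progress during the interval.
 * Returns 1 if nothing changed (possible hang), 0 otherwise.
 */
static int looks_hung(const unsigned int *prev, const unsigned int *cur,
		      size_t n)
{
	assert(prev != NULL && cur != NULL);
	for (size_t i = 0; i < n; i++)
		if (prev[i] != cur[i])
			return 0;
	return 1;
}
```

A long shader that leaves the RBBM and CP values untouched is no longer
reported as hung in this model, as long as the counter in the last slot
keeps advancing.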
CRs-Fixed: 392730
Change-Id: Ic0dedbadd6e5bcd0b46aab4209430de2f74711f7
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
VBIF registers are set dependent on what A3XX GPU core is present.
Set the registers from a table that is explicitly tied to each of
the A3XX GPU cores. This will prevent side effects across cores
when changing a specific core's VBIF data.
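A table-driven layout could look roughly like this. The struct names
are hypothetical, and the register offsets and values are placeholders,
NOT real VBIF settings.

```c
#include <assert.h>
#include <stddef.h>

struct vbif_reg {
	unsigned int offset;
	unsigned int value;
};

enum core_id { CORE_A305, CORE_A320, CORE_A330, CORE_UNKNOWN };

/* Placeholder data only: one independent list per core. */
static const struct vbif_reg a305_vbif[] = { { 0x10, 0x1 }, { 0x14, 0x2 } };
static const struct vbif_reg a320_vbif[] = { { 0x10, 0x3 } };
static const struct vbif_reg a330_vbif[] = {
	{ 0x10, 0x4 }, { 0x14, 0x5 }, { 0x18, 0x6 }
};

struct vbif_platform {
	enum core_id core;
	const struct vbif_reg *regs;
	size_t count;
};

#define NREGS(a) (sizeof(a) / sizeof((a)[0]))

static const struct vbif_platform vbif_platforms[] = {
	{ CORE_A305, a305_vbif, NREGS(a305_vbif) },
	{ CORE_A320, a320_vbif, NREGS(a320_vbif) },
	{ CORE_A330, a330_vbif, NREGS(a330_vbif) },
};

static_assert(NREGS(vbif_platforms) == 3, "one table per known core");

/* Return the register list tied to 'core', or NULL if there is none. */
static const struct vbif_platform *vbif_lookup(enum core_id core)
{
	for (size_t i = 0; i < NREGS(vbif_platforms); i++)
		if (vbif_platforms[i].core == core)
			return &vbif_platforms[i];
	return NULL;
}
```

Editing one core's list cannot change what another core programs,
since each lookup returns a separate array.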
Change-Id: I4c20cd891a940abd85459ce5bf548cf91d06004a
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
If a hang is detected when allocating space in ringbuffer and
if the context for which the space is being allocated is hung
then do not allocate space at all.
Change-Id: Ia5ade2341fe5016119d8c140413860420c5c3a3d
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
The A330 GPU defines a few new registers that don't exist on
A305/A320. Define a new subset for A330 and dump it in the
postmortem and binary snapshot.
Change-Id: Ic0dedbadd0c44ee8872b99fd6b0b3dc8eb972eea
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>