Bug 110315

Summary: [i915/hsw] Tropico 6 causes GPU hang
Product: Mesa
Reporter: Manuel Lauss <manuel.lauss>
Component: Drivers/DRI/i965
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: RESOLVED MOVED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal
Priority: medium
CC: danylo.piliaiev, denys.kostin, intel-gfx-bugs
Version: git
Hardware: x86-64 (AMD64)
OS: All
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=111631
Whiteboard:
i915 platform:
i915 features:
Attachments: GPU crash dump file
             bad_screenshot_from_HSW
             GPU crash dump #2

Description Manuel Lauss 2019-04-03 13:20:43 UTC
Created attachment 143851
GPU crash dump file

The game Tropico 6 (UE4 4.20-based) causes a GPU hang:


[237980.961658] [drm] GPU HANG: ecode 7:0:0x87d5aef8, in Tropico6-Linux- [105317], reason: hang on rcs0, action: reset
[237980.961660] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[237980.961661] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[237980.961661] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[237980.961661] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[237980.961662] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[237980.961698] i915 0000:00:02.0: Resetting chip for hang on rcs0
[237988.909135] i915 0000:00:02.0: Resetting chip for hang on rcs0
[238002.775684] i915 0000:00:02.0: Resetting chip for hang on rcs0
[238012.802223] i915 0000:00:02.0: Resetting chip for hang on rcs0
[238016.855426] Asynchronous wait on fence i915:rcs0:ede19 timed out (hint:intel_atomic_commit_ready+0x0/0x50)
[238020.908796] i915 0000:00:02.0: Resetting chip for hang on rcs0
[238030.721946] i915 0000:00:02.0: Resetting chip for hang on rcs0
[238038.828601] i915 0000:00:02.0: Resetting chip for hang on rcs0
[238046.725141] i915 0000:00:02.0: Resetting chip for hang on rcs0
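
When comparing hang reports like the one above, it can help to pull the error code, process name, and PID out of the "GPU HANG" line and to count the resets that follow. The following is only a minimal sketch, assuming the kernel log format shown in this report; the function name `summarize_hangs` and the `SAMPLE` constant are made up for illustration and are not part of any i915 or Mesa tooling.

```python
import re

# Matches the "GPU HANG" line as it appears in the log above.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] \[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>.+?) \[(?P<pid>\d+)\], reason: (?P<reason>.+?), "
    r"action: (?P<action>\w+)"
)
# Matches the "Resetting chip" lines that follow a hang.
RESET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] i915 \S+ Resetting chip for (?P<why>.+)"
)

# Three lines copied from the log in this report, used as sample input.
SAMPLE = """\
[237980.961658] [drm] GPU HANG: ecode 7:0:0x87d5aef8, in Tropico6-Linux- [105317], reason: hang on rcs0, action: reset
[237980.961698] i915 0000:00:02.0: Resetting chip for hang on rcs0
[237988.909135] i915 0000:00:02.0: Resetting chip for hang on rcs0
"""


def summarize_hangs(log_text):
    """Return (hang records, reset timestamps) found in kernel log text."""
    hangs, resets = [], []
    for line in log_text.splitlines():
        m = HANG_RE.search(line)
        if m:
            hangs.append({
                "timestamp": float(m["ts"]),
                "ecode": m["ecode"],
                "process": m["proc"],
                "pid": int(m["pid"]),
                "reason": m["reason"],
            })
            continue
        m = RESET_RE.search(line)
        if m:
            resets.append(float(m["ts"]))
    return hangs, resets
```

Calling `summarize_hangs(SAMPLE)` returns one hang record and two reset timestamps; the same function can be fed the saved output of `dmesg`.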



The game mostly works, but it has a lot of triangles blinking.

Linux-5.0.5, mesa git @ 43db0632e7dea4339bbfc05caf9f5165ee8329a2

Thanks!
Comment 1 Denis 2019-04-16 15:46:11 UTC
Created attachment 143992
bad_screenshot_from_HSW

Hi, thank you for the report. I also checked this game on HSW and could not reproduce the hang (kernel 5.0.7, Fedora 29).
The Mesa version was taken from today's git master.

Could you please clarify what steps you took to reproduce the hang?

>The game mostly works, but it has a lot of triangles blinking.
Regarding this: in my case I would say the game is completely unplayable, for two reasons. First, performance: it is too slow and laggy. Second, there are a lot of large graphical artifacts.

Interestingly, not all of these artifacts can be captured in an apitrace. Here are a few screenshots, recorded on HSW and replayed on KBL:
https://drive.google.com/open?id=1yD1auM-WA8pLY3qKINKcOF7yzAWo4jIA
Comment 2 Manuel Lauss 2019-04-16 18:20:05 UTC
Created attachment 143997
GPU crash dump #2

I tried again with a fresh mesa HEAD and Linux-5.0.7. The GPU hangs appear as soon as the "postcard" loading screen is shown. It gets to the main menu eventually, but as you already found out, it's very slow and has tons of graphics artifacts.
(Unsurprisingly, it's playable with the NVIDIA card.)

[  498.568025] [drm] GPU HANG: ecode 7:0:0x87d53d10, in Tropico6-Linux- [15839], reason: hang on rcs0, action: reset
[  498.568028] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[  498.568028] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[  498.568029] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[  498.568029] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[  498.568030] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[  498.568067] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  502.580665] Asynchronous wait on fence i915:rcs0:2abd timed out (hint:intel_atomic_commit_ready+0x0/0x50)
[  506.420698] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  514.527364] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  516.660602] Asynchronous wait on fence i915:rcs0:2ad3 timed out (hint:intel_atomic_commit_ready+0x0/0x50)
[  522.420684] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  530.527273] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  533.513855] Asynchronous wait on fence i915:rcs0:2aea timed out (hint:intel_atomic_commit_ready+0x0/0x50)
[  538.420647] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  546.527074] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  549.513633] Asynchronous wait on fence i915:rcs0:2b04 timed out (hint:intel_atomic_commit_ready+0x0/0x50)
[  554.420338] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  562.526852] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  570.420119] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  578.526676] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  586.419943] i915 0000:00:02.0: Resetting chip for hang on rcs0
[  589.619838] Asynchronous wait on fence i915:rcs0:2b3a timed out (hint:intel_atomic_commit_ready+0x0/0x50)
[  594.526510] i915 0000:00:02.0: Resetting chip for hang on rcs0
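
The log above says the crash dump is saved to /sys/class/drm/card0/error. A minimal sketch for copying it to a regular file before attaching it to a report could look like the following. The helper name `save_error_state` and its parameters are made up for illustration, and reading the sysfs file usually requires root privileges.

```python
from pathlib import Path

# Path printed by the kernel in the log above.
DEFAULT_ERROR_STATE = Path("/sys/class/drm/card0/error")


def save_error_state(dest, source=DEFAULT_ERROR_STATE, chunk_size=1 << 16):
    """Copy the i915 error state to `dest`; return the number of bytes written."""
    total = 0
    # Read in chunks until EOF instead of trusting the reported file size,
    # since sysfs files do not necessarily report their real length.
    with open(source, "rb") as src, open(dest, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            total += len(chunk)
    return total
```

For example, `save_error_state("gpu-crash-dump.txt")` would write the current error state to a file in the working directory; the output file name here is arbitrary.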
Comment 3 Denis 2019-04-17 12:30:21 UTC
Thank you for the clarification. I finally reproduced the hang, but in my case the hangs occurred between the splash screens and the main game menu. I am investigating and trying to reproduce them reliably (unfortunately they were not stable).
Comment 4 Danylo 2019-05-08 11:37:55 UTC
I have tested Tropico 6 on Kaby Lake. While it does not show the issues seen on Haswell, it does have geometry flickering caused by a compute workload; see https://gitlab.freedesktop.org/mesa/mesa/merge_requests/621
Comment 5 Danylo 2019-05-08 14:03:23 UTC
Regarding the graphical artifacts on HSW: the water artifacts are caused by a TCS shader and most probably have the same cause as the artifacts in Downward (https://bugs.freedesktop.org/show_bug.cgi?id=104297). They can be reproduced by launching the game with INTEL_SCALAR_TCS=0 on Skylake or Kaby Lake.
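
To try the INTEL_SCALAR_TCS=0 reproduction mentioned here, the variable has to be set in the environment of the game process. A minimal sketch, assuming the game is started from a script or binary whose path you supply yourself; the helper names below are made up for illustration.

```python
import os
import subprocess


def scalar_tcs_disabled_env(base_env=None):
    """Return a copy of the environment with INTEL_SCALAR_TCS=0 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["INTEL_SCALAR_TCS"] = "0"
    return env


def launch(game_cmd):
    """Start `game_cmd` (a list of argv strings) with INTEL_SCALAR_TCS=0."""
    return subprocess.Popen(game_cmd, env=scalar_tcs_disabled_env())
```

For example, `launch(["./Tropico6.sh"])`, where the script name is only a placeholder for however the game is started on your system. The environment is copied rather than modified in place, so the variable affects only the launched process.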

The causes of the hang and the flickering shadows are still unknown to me.
Comment 6 Danylo 2019-05-13 12:37:58 UTC
The shadow flickering on HSW is caused by the same issue as in SuperTuxKart (https://bugs.freedesktop.org/show_bug.cgi?id=110395). There is a fix/workaround: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/660
Comment 7 GitLab Migration User 2019-09-25 20:32:52 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1803.
