Bug 101865

Summary: [HSW/IVB] Blank screen after resume with HSW
Product: DRI
Reporter: nutrinfnon
Component: DRM/Intel
Assignee: nutrinfnon
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major
Priority: high
CC: intel-gfx-bugs, jeffreylong+freedesktop, marco, somini29, yuraofo
Version: XOrg git
Keywords: bisect_pending, regression
Hardware: x86-64 (AMD64)
OS: All
Whiteboard: ReadyForDev
i915 platform: HSW, IVB
i915 features: display/atomic
Attachments:
Description | Flags
dmesg output | none
dmesg log with drm.debug | none
dmesg output of 4.11.9-1 with hibernation working | none
dmesg output of 4.12.0-1 with black screen after resume | none
dmesg output | none
test 4.14.0-rc6 | none
dmesg output | none

Description nutrinfnon 2017-07-21 07:23:01 UTC
Resume from hibernation fails with linux kernel 4.13.0-rc1


[  384.831395] atomic remove_fb failed with -22
[  384.831432] ------------[ cut here ]------------
[  384.831449] WARNING: CPU: 0 PID: 3 at drivers/gpu/drm/drm_framebuffer.c:912 drm_framebuffer_remove+0x230/0x2fb [drm]
[  384.831449] Modules linked in: ...
[  384.831514] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.13.0-rc1 #1
[  384.831515] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./H61M-GS, BIOS P2.00 03/08/2013
[  384.831524] Workqueue: events drm_mode_rmfb_work_fn [drm]
[  384.831526] task: ffff880198578cc0 task.stack: ffffc90000c74000
[  384.831534] RIP: 0010:drm_framebuffer_remove+0x230/0x2fb [drm]
[  384.831535] RSP: 0018:ffffc90000c77e00 EFLAGS: 00010286
[  384.831536] RAX: 0000000000000020 RBX: ffff88019197cd00 RCX: 0000000000000000
[  384.831538] RDX: ffff88019f214a01 RSI: ffff88019f20cdf8 RDI: ffff88019f20cdf8
[  384.831539] RBP: ffff88019295f800 R08: 000000ab71c23788 R09: ffffffff81a11210
[  384.831540] R10: 0000000000000006 R11: 0000000000000040 R12: ffff880196fd8000
[  384.831541] R13: 0000000000000001 R14: 0000000000000005 R15: ffff880196fd8320
[  384.831542] FS:  0000000000000000(0000) GS:ffff88019f200000(0000) knlGS:0000000000000000
[  384.831543] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  384.831544] CR2: 00000010485f5cc0 CR3: 0000000195b47000 CR4: 00000000000406f0
[  384.831545] Call Trace:
[  384.831556]  ? drm_mode_rmfb_work_fn+0x37/0x3c [drm]
[  384.831560]  ? process_one_work+0x149/0x255
[  384.831563]  ? worker_thread+0x189/0x233
[  384.831565]  ? rescuer_thread+0x262/0x262
[  384.831566]  ? kthread+0xea/0xef
[  384.831568]  ? init_completion+0x1d/0x1d
[  384.831569]  ? init_completion+0x1d/0x1d
[  384.831571]  ? ret_from_fork+0x22/0x30
[  384.831572] Code: 89 04 24 e8 79 cf ff ff 48 8d 7c 24 10 e8 4e cf ff ff 8b 04 24 85 c0 0f 84 b5 00 00 00 89 c6 48 c7 c7 86 36 37 a0 e8 f8 99 d2 e0 <0f> ff e9 a0 00 00 00 e8 26 d2 ff ff 49 8b 84 24 40 03 00 00 4d
[  384.831611] ---[ end trace 07ca2c6d3bb05fc0 ]---
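
The first line of the trace above, "atomic remove_fb failed with -22", carries a negated kernel error number. As a rough aid for reading such logs, the sketch below pulls that number out of dmesg text and looks up its symbolic name. It assumes the errno table of the host running Python matches the kernel's, which holds for common values such as 22; the function name `decode_remove_fb_errors` is purely illustrative and not part of any kernel or DRM tooling.

```python
import errno
import re

# Matches the kernel message quoted in this report, e.g.
# "[  384.831395] atomic remove_fb failed with -22"
REMOVE_FB_RE = re.compile(r"atomic remove_fb failed with -(\d+)")


def decode_remove_fb_errors(dmesg_text):
    """Return (number, symbolic name) for every remove_fb failure in the text."""
    results = []
    for match in REMOVE_FB_RE.finditer(dmesg_text):
        code = int(match.group(1))
        # errno.errorcode is the host's table; assumed to agree with the kernel.
        name = errno.errorcode.get(code, "UNKNOWN")
        results.append((code, name))
    return results


sample = "[  384.831395] atomic remove_fb failed with -22"
print(decode_remove_fb_errors(sample))
```

For the line from this report the lookup yields EINVAL (invalid argument), meaning the request to remove the framebuffer was rejected as invalid; the log line alone does not say which check failed.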
Comment 1 Elizabeth 2017-07-21 15:07:30 UTC
(In reply to nutrinfnon from comment #0)
> Resume from hibernation fails with linux kernel 4.13.0-rc1
>
Hello, 
Is this a regression? Could you please attach dmesg with drm.debug=0xe parameter on grub?
Thanks.
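
For readers unfamiliar with the drm.debug=0xe parameter requested above: the value is a bitmask of DRM debug categories. The sketch below decodes such a mask. The bit table reflects the DRM_UT_* categories as I understand them for kernels of this era and should be treated as an assumption to check against the kernel tree in use; `decode_drm_debug` is a hypothetical helper name.

```python
# Assumed mapping of drm.debug bits to DRM_UT_* category names.
# Verify against the DRM headers of the kernel being debugged.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """List the category names whose bits are set in the mask, lowest bit first."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


print(decode_drm_debug(0xE))
```

Under that table, 0xe enables driver, KMS, and PRIME messages while leaving the noisier core messages off.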
Comment 2 nutrinfnon 2017-07-22 05:46:14 UTC
Created attachment 132825 [details]
dmesg output
Comment 3 nutrinfnon 2017-07-22 06:19:42 UTC
I think it is a regression: hibernation always worked for me with 4.11.0. The resume problem started with 4.12.x.

I ran two tests with drm.debug=0xe:
 the first time, I executed "systemctl hibernate" and resume failed at kernel time [53.314063]; the system was working but the screen was off;
 the second time, I executed "systemctl hibernate" and resume failed again at kernel time [431.539799]. This time the system was still working, but the screen was on and showed a garbled image, and switching with ctrl-alt-f1 was not possible.

The dmesg output is attached.





Thanks

P.S. I changed hardware in order to test better.
Comment 4 Yuriy 2017-08-18 11:55:49 UTC
Created attachment 133602 [details]
dmesg log with drm.debug
Comment 5 Yuriy 2017-08-18 12:00:29 UTC
I also experience this issue on my Thinkpad x230.
The screen stays black (off) after resume if it was off when hibernation started.

To reproduce:
xset dpms force off; sleep 2; systemctl hibernate

I have attached dmesg log with drm.debug=0xe

To make the screen work again without a reboot: 1. Suspend to RAM 2. Resume 3. Restart the X server.
Comment 6 Elizabeth 2017-08-18 19:53:05 UTC
(In reply to nutrinfnon from comment #2)
> Created attachment 132825 [details]
> dmesg output
From dmesg:
[    7.073381] [drm:intel_device_info_dump [i915]] i915 device info: platform=HASWELL gen=7 pciid=0x0402 rev=0x06
Changing platform.
Comment 7 Yuriy 2017-08-19 09:03:01 UTC
Created attachment 133626 [details]
dmesg output of 4.11.9-1 with hibernation working
Comment 8 Yuriy 2017-08-19 09:04:04 UTC
Created attachment 133627 [details]
dmesg output of 4.12.0-1 with black screen after resume
Comment 9 Yuriy 2017-08-19 09:06:27 UTC
I can confirm that it is a regression.
4.11.9-1-ARCH - hibernation works fine
4.12.0-1-ARCH - screen is off after resume

I've attached both dmesg outputs.
Comment 10 nutrinfnon 2017-08-21 10:56:52 UTC
Created attachment 133644 [details]
dmesg output

I have isolated the cause of the problem on my side:

Hibernation followed by resume fails when XOrg is in "dpms off".
At the console, or in XOrg without "xscreensaver Power Management" active, resume works fine.

I tested 4.13.0-rc6:
 I executed "xset dpms force off";
 then I executed "systemctl hibernate" over the network.

On resume the screen remains off; see the attached dmesg output.

xrandr does not solve the situation:

xrandr --output VGA-1 --off 
xrandr: Configure crtc 0 failed

xrandr --output VGA-1 --mode 1024x768 
xrandr: Configure crtc 0 failed

Thanks
Comment 11 Elizabeth 2017-10-25 17:22:37 UTC
Hello, latest tip has some relevant S3/S4 patches merged, could you re-test with it? Thank you. https://cgit.freedesktop.org/drm-tip
Comment 12 nutrinfnon 2017-10-26 23:13:37 UTC
Created attachment 135095 [details]
test 4.14.0-rc6

Hello,

among the latest S3/S4 patches are:

2017-10-04	drm/i915/cnl: Reprogram DMC firmware after S3/S4 resume
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -2809,6 +2809,9 @@ static void cnl_display_core_init(struct drm_i915_private *dev_priv, bool resume
 
 	/* 6. Enable DBUF */
 	gen9_dbuf_enable(dev_priv);
+
+	if (resume && dev_priv->csr.dmc_payload)
+		intel_csr_load_program(dev_priv);
 }
2017-10-04	drm/i915/cnl: Reprogram DMC firmware after S3/S4 resume
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 7933d1b..3791c3f 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -2809,6 +2809,9 @@ static void cnl_display_core_init(struct drm_i915_private *dev_priv, bool resume
 
 	/* 6. Enable DBUF */
 	gen9_dbuf_enable(dev_priv);
+
+	if (resume && dev_priv->csr.dmc_payload)
+		intel_csr_load_program(dev_priv);
 }


These patches are already merged in linux-4.14.0-rc6, so I tried it.

My test with 4.14.0-rc6 still shows the presence of the problem, blank screen after resume.

The dmesg output is attached.

Thanks.
Comment 13 Elizabeth 2017-10-27 17:22:16 UTC
(In reply to nutrinfnon from comment #12)
> Created attachment 135095 [details]
> test 4.14.0-rc6
>...
> My test with 4.14.0-rc6 still shows the presence of the problem, blank
> screen after resume.
> 
> In attach dmesg output.
> 
> Thanks.
The attachment is empty, could you please re-attach the dmesg? Thank you.
Comment 14 nutrinfnon 2017-10-27 21:53:11 UTC
Created attachment 135136 [details]
dmesg output
Comment 15 Jani Nikula 2017-10-30 16:02:37 UTC
(In reply to nutrinfnon from comment #12)
> Created attachment 135095 [details]
> test 4.14.0-rc6
> 
> Hello,
> 
> among the latest S3/S4 patches are:
> 
> 2017-10-04	drm/i915/cnl: Reprogram DMC firmware after S3/S4 resume
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -2809,6 +2809,9 @@ static void cnl_display_core_init(struct
> drm_i915_private *dev_priv, bool resume
>  
>  	/* 6. Enable DBUF */
>  	gen9_dbuf_enable(dev_priv);
> +
> +	if (resume && dev_priv->csr.dmc_payload)
> +		intel_csr_load_program(dev_priv);
>  }
> 2017-10-04	drm/i915/cnl: Reprogram DMC firmware after S3/S4 resume
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c
> b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 7933d1b..3791c3f 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -2809,6 +2809,9 @@ static void cnl_display_core_init(struct
> drm_i915_private *dev_priv, bool resume
>  
>  	/* 6. Enable DBUF */
>  	gen9_dbuf_enable(dev_priv);
> +
> +	if (resume && dev_priv->csr.dmc_payload)
> +		intel_csr_load_program(dev_priv);
>  }
> 
> 
> These patches are already merged in linux-4.14.0-rc6, so I tried it.
> 
> My test with 4.14.0-rc6 still shows the presence of the problem, blank
> screen after resume.

No surprise there, given that the above changes are only relevant for Cannonlake.
Comment 16 Ville Syrjala 2017-10-30 16:24:52 UTC
I suspect this might be fixed by the following two commits:

commit b6b178a77210055b153dbc175e4468bd3c7122df
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Thu Oct 19 17:13:41 2017 +0200

    drm/i915: Calculate ironlake intermediate watermarks correctly, v2.

commit 28283f4f359cd7cfa9e65457bb98c507a2cd0cd0
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date:   Thu Oct 19 17:13:40 2017 +0200

    drm/i915: Do not rely on wm preservation for ILK watermarks
Comment 17 nutrinfnon 2017-10-31 17:23:09 UTC
I tried to apply the mentioned patches, but the first one refers to a function that is missing in my tree, intel_atomic_get_old_crtc_state.

Searching for it did not help.

I don't know how to use drm-tip, so I plan to wait until next week for the next release.

Thanks.
Comment 18 Elizabeth 2017-11-06 23:43:23 UTC
Could you try to do bisecting? That could help.
Comment 19 nutrinfnon 2017-11-13 05:16:18 UTC
I'm trying (I don't have much time for it):

bisect start (good=v4.11, bad=v4.12)

test of v4.11-7904-g2bd804017435 is good (roughly 13 steps);
test of v4.11-11905-g85d604902eb2 is bad
test of v4.11-10417-gc6a677c6f37b is bad
test of v4.11-11129-gbf5f89463f5b is bad
test of v4.11-9227-ge87d51ac61f8 is bad
test of v4.11-8548-ge579dde654fc is bad
test of v4.11-rc5-271-gc034a43e72dd is good (roughly 8 steps)

I will continue as I have time.
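
The version strings in the bisect log above are in git-describe form: nearest tag, number of commits since that tag, and an abbreviated hash after the "g". A small sketch for splitting them apart and collecting the hashes per verdict follows. The names `parse_describe` and `abbreviated_hashes` are illustrative, and the regular expression only covers the tag shapes that appear in this log. Note that the commit count is not a linear position in history, so it cannot be used to order good and bad results.

```python
import re

# tag, commits since tag, abbreviated hash; covers tags such as v4.11 and v4.11-rc5
DESCRIBE_RE = re.compile(
    r"^(?P<tag>v[\w.]+(?:-rc\d+)?)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$"
)


def parse_describe(name):
    """Split a git-describe style name into tag, commit count and short hash."""
    match = DESCRIBE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised describe string: {name!r}")
    return {
        "tag": match.group("tag"),
        "commits_since_tag": int(match.group("count")),
        "sha": match.group("sha"),
    }


# Results copied from the bisect log in this comment.
BISECT_LOG = [
    ("v4.11-7904-g2bd804017435", "good"),
    ("v4.11-11905-g85d604902eb2", "bad"),
    ("v4.11-10417-gc6a677c6f37b", "bad"),
    ("v4.11-11129-gbf5f89463f5b", "bad"),
    ("v4.11-9227-ge87d51ac61f8", "bad"),
    ("v4.11-8548-ge579dde654fc", "bad"),
    ("v4.11-rc5-271-gc034a43e72dd", "good"),
]


def abbreviated_hashes(log, verdict):
    """Return the short hashes of all entries carrying the given verdict."""
    return [parse_describe(name)["sha"] for name, v in log if v == verdict]


print(abbreviated_hashes(BISECT_LOG, "good"))
```

The extracted hashes can be passed to git to inspect the corresponding commits.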
Comment 20 nutrinfnon 2017-11-26 06:55:00 UTC
Hi,

the tests show that the problem was introduced by commit 2f34c1231bfc9f2550f934acb268ac7315fb3837 (v4.11-6543-g2f34c1231bfc).

commit a3719f34fdb664ffcfaec2160ef20fca7becf2ee (v4.11-4715-ga3719f34fdb6) is good.

Thanks
Comment 21 nutrinfnon 2017-11-27 09:07:40 UTC
Linux 4.15.0-rc1 has fixed this problem!

Thanks.
