Bug 70254

Summary: [snb dp hotplug] Pipe B, PCH transcoder B FIFO underrun
Product: DRI
Reporter: Robert N <crshman>
Component: DRM/Intel
Assignee: Ville Syrjala <ville.syrjala>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs, rsalveti, rubin, ville.syrjala
Version: XOrg git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:    
Bug Blocks: 73943    
Attachments:
Description | Flags
kernel error output | none
glxinfo output | none
01-10-2014 drm-intel-nightly | none
kernel output | none
reg_read_2014-01-14T09:33:48-0800.txt | none
reg_read_2014-01-14T09:30:53-0800.txt | none
reg_read_2014-01-14T09:30:42-0800.txt | none
reg_read_2014-01-14T09:30:01-0800.txt | none
reg_read_2014-01-14T09:29:03-0800.txt | none
Patch to allow changing watermark latency values | none
drm/i915: Increase WM memory latency values on SNB with high pixel clock | none

Description Robert N 2013-10-07 20:55:49 UTC
Created attachment 87256 [details]
kernel error output

I'm not sure if this is the right place, but based on the error log it seemed right.

Anyway, I plugged my two Dell monitors into my Lenovo X220 via DisplayPort, and it generated these errors in the system log.

I'm not sure if it's related, but the primary reason I opened this bug was because the monitors will blank after a few minutes of activity. The timeout period seems random and I can't really correlate it with any one activity.

I'm running the latest version of the intel graphics drivers, 2013Q3.

## lspci output
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)

## uname -a output
Linux rnavarro-thinkpad 3.8.0-31-generic #46-Ubuntu SMP Tue Sep 10 20:03:44 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

## lsb_release -a output
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 13.04
Release:	13.04
Codename:	raring

## aptitude show libdrm-intel1 output
Package: libdrm-intel1                   
State: installed
Automatically installed: no
Multi-Arch: same
Version: 2.4.45-0ubuntu1
Priority: optional
Section: libs
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: amd64
Uncompressed Size: 189 k
Depends: libc6 (>= 2.17), libdrm2 (>= 2.4.38), libpciaccess0
PreDepends: multiarch-support
Breaks: libdrm-intel1 (!= 2.4.45-0ubuntu1)
Replaces: libdrm-intel1 (< 2.4.45-0ubuntu1)
Description: Userspace interface to intel-specific kernel DRM services -- runtime
 This library implements the userspace interface to the intel-specific kernel DRM services.  DRM stands for "Direct Rendering
 Manager", which is the kernelspace portion of the "Direct Rendering Infrastructure" (DRI). The DRI is currently used on Linux to
 provide hardware-accelerated OpenGL drivers.

## aptitude show libdrm2 output
Package: libdrm2                         
State: installed
Automatically installed: no
Multi-Arch: same
Version: 2.4.45-0ubuntu1
Priority: optional
Section: libs
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: amd64
Uncompressed Size: 103 k
Depends: libc6 (>= 2.17)
PreDepends: multiarch-support
Breaks: libdrm2 (!= 2.4.45-0ubuntu1)
Replaces: libdrm2 (< 2.4.45-0ubuntu1)
Description: Userspace interface to kernel DRM services -- runtime
 This library implements the userspace interface to the kernel DRM services.  DRM stands for "Direct Rendering Manager", which is
 the kernelspace portion of the "Direct Rendering Infrastructure" (DRI). The DRI is currently used on Linux to provide
 hardware-accelerated OpenGL drivers. 
 
 This package provides the runtime environment for libdrm.
Comment 1 Robert N 2013-10-07 20:56:23 UTC
Created attachment 87257 [details]
glxinfo output
Comment 2 Robert N 2013-10-07 20:57:10 UTC
Forgot to mention, let me know if there is any other information that you need or if you'd like me to change any settings to up the debug level anywhere (along with location)
Comment 3 Jani Nikula 2013-10-08 07:13:57 UTC
You've come to the right place - if you're prepared to build your own kernels. Please try the drm-intel-nightly branch of [1]. First, your issues are related to hotplugging and display port link training/maintenance, both of which have been updated and fixed considerably in the latest kernels. It could be something we've taken care of already. Second, I can't find a kernel version where your log would match the source code; I can only presume it's a distro kernel with some changes on top. So I don't know what exactly you're running and what version I should be looking at.

Please report back; if the problem persists, attach the dmesg covering everything from early boot up to the problem, booted with the drm.debug=0xe module parameter.

[1] git://people.freedesktop.org/~danvet/drm-intel
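
As a rough illustration of what drm.debug=0xe asks for: the value is a bitmask. The sketch below assumes the DRM core's debug category bits of that era (0x1 core, 0x2 driver, 0x4 KMS, 0x8 PRIME); the helper name is made up for this example and is not part of any tool.

```shell
#!/bin/bash
# Hypothetical helper: list the DRM debug categories enabled by a mask.
# The bit meanings are an assumption (drm core DRM_UT_* style flags).
drm_debug_categories() {
    local mask=$(( $1 ))
    local out=""
    (( mask & 0x1 )) && out="$out core"
    (( mask & 0x2 )) && out="$out driver"
    (( mask & 0x4 )) && out="$out kms"
    (( mask & 0x8 )) && out="$out prime"
    # Strip the leading space before printing.
    echo "${out# }"
}

drm_debug_categories 0xe   # prints: driver kms prime
```

Under that assumption, 0xe enables everything except the (very chatty) core messages. The parameter normally goes on the kernel command line so that early boot is covered.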
Comment 4 Robert N 2013-10-12 18:39:45 UTC
Hello Jani,

So I've gone ahead and updated my os install to the latest ubuntu, bringing me up to the 3.11.0-12 kernel.

It looks like the hot plug issues have gone away, which is great....however once I turned on debugging like you mentioned I figured out what was actually going on.

Right before either of my monitors goes blank I get a message emitted like this:

Oct 12 11:22:02 rnavarro-thinkpad kernel: [  673.201706] [drm:ironlake_irq_handler], Pipe B FIFO underrun
Oct 12 11:22:02 rnavarro-thinkpad kernel: [  673.201719] [drm:cpt_serr_int_handler], PCH transcoder B FIFO underrun

Shortly after my second monitor goes blank, emitting a similar message:
Oct 12 11:22:31 rnavarro-thinkpad kernel: [  702.411812] [drm:ironlake_irq_handler], Pipe A FIFO underrun

Is there more detailed debug information that I can output to help identify what is causing this?

I'm driving both monitors with:
Oct 12 11:22:56 rnavarro-thinkpad kernel: [  727.212145] [drm:drm_mode_debug_printmodeline], Modeline 32:"2560x1440" 60 241500 2560 2608 2640 2720 1440 1443 1448 1481 0x48 0x9
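
For reference, the refresh rate implied by that modeline can be worked out from its pixel clock (241500 kHz) and its total horizontal and vertical sizes (2720 and 1481). This is only a small arithmetic sketch; the function name is invented for illustration.

```shell
#!/bin/bash
# Refresh rate in Hz from a pixel clock in kHz and the total
# horizontal/vertical sizes (active + blanking) of a modeline.
refresh_hz() {
    awk -v clk="$1" -v ht="$2" -v vt="$3" \
        'BEGIN { printf "%.2f\n", clk * 1000 / (ht * vt) }'
}

refresh_hz 241500 2720 1481   # prints: 59.95
```

So each of the two pipes is scanning out at a 241.5 MHz pixel clock, which is the load under which the underruns show up.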
Comment 5 Jani Nikula 2013-10-14 09:54:08 UTC
Updating subject accordingly. We probably have fixes in this area too, so a test spin on drm-intel-nightly would be appreciated. Thanks.
Comment 6 Robert N 2013-10-15 20:30:26 UTC
Hey Jani,

So I grabbed the latest intel drm kernel from here:

http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-nightly/current/

In particular, linux-image-3.12.0-994-generic_3.12.0-994.201310150447_amd64

I tried it again...this time I booted with my two DP monitors plugged in. Started up fine, logged in...still good.

I switched the active workspace a few times left/right and then the first monitor blanked.

I did it a few more times and then the second monitor blanked.

I then physically undocked the laptop, detaching both screens and everything else, to send the captured system reports.

Here is a copy of the apport log dump from right after the error occurred:

http://www.crshman.com/debug/apport.xserver-xorg-video-intel.4x6KCY.apport_unpack/

Also, if you look at the CurrentDmesg file, around [21.961459] is where I pause for a second, then start switching the active workspace back and forth.

A few seconds later at [72.034577] is when the first monitor blanks off

What other debug information can I grab to help out with this?
Comment 7 Robert N 2013-10-21 17:34:58 UTC
Hello,

Is there anything else I can test/add to this report to help figure out what's going on?
Comment 8 Robert N 2013-11-10 20:53:35 UTC
Hello,

Is there anything I can test/add to this report to help figure out what's going on?
Comment 9 Mika Kuoppala 2014-01-03 11:11:36 UTC
(In reply to comment #8)
> Hello,
> 
> Is there anything I can test/add to this report to help figure out what's
> going on?

It would be interesting to see if the underruns persist with lower resolution modes.
Comment 10 Robert N 2014-01-03 15:10:16 UTC
(In reply to comment #9)
> Would be interesting to see if underruns persist with lower resolution modes.

Here are the different modes my monitors can do:

DP2 connected 1440x2560+1440+0 left (normal left inverted right x axis y axis) 597mm x 336mm
   2560x1440      60.0*+
   1920x1200      59.9  
   1920x1080      60.0     60.0     50.0     59.9     24.0     24.0  
   1920x1080i     60.1     50.0     60.0  
   1600x1200      60.0  
   1680x1050      60.0  
   1280x1024      75.0     60.0  
   1280x800       59.8  
   1152x864       75.0  
   1280x720       60.0     50.0     59.9  
   1024x768       75.1     60.0  
   800x600        75.0     60.3  
   720x576        50.0  
   720x480        60.0     59.9  
   640x480        75.0     60.0     59.9  
   720x400        70.1  

Any particular one you think would work best?
Comment 11 Daniel Vetter 2014-01-08 17:15:16 UTC
We've had piles and piles of watermark fixes, which should help in rectifying pipe underruns. Can you please retest with a recent drm-intel-nightly build?
Comment 12 Robert N 2014-01-08 17:43:35 UTC
Hey Daniel,

Sounds good, I'll update to the latest nightly and see how things go.
Comment 13 Robert N 2014-01-10 16:53:31 UTC
Created attachment 91827 [details]
01-10-2014 drm-intel-nightly
Comment 14 Robert N 2014-01-10 16:54:31 UTC
I've gone ahead and updated to the latest nightly:

Linux rnavarro-thinkpad 3.13.0-994-generic #201401100405 SMP Fri Jan 10 09:05:49 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

The release done on 1/10/14

I'm still getting the underruns. Right after I logged in I got one on my second monitor, and then shortly after another on my first.

The latest log can be found here (01-10-2014 drm-intel-nightly):

https://bugs.freedesktop.org/attachment.cgi?id=91827
Comment 15 Daniel Vetter 2014-01-14 13:51:03 UTC
Ville, any ideas?
Comment 16 Ville Syrjala 2014-01-14 16:08:40 UTC
Assuming the watermarks get computed correctly, the only issue I can think of is that the WM latency values provided by the BIOS might be too optimistic. My SNB machine has the same SSKPD though (not that I've ever tried dual 25x14 displays on it).

Does the problem occur with just one of the displays plugged in?

Or with two displays and lower resolution on both. 1920x1080@60 should be OK.

Can you take some register dumps (for the original dual 2560x1440 case)? I'd like to check if things look OK. As root run this:

intel_reg_read 0xc6014 0xc6040 0xc6044 0xc6018 0xc6048 0xc604c 0x45100 0x45104 \
    0x45108 0x4510c 0x45110 0x45120 0x145d10 0x42000 0x42004 0x42020 0x70180 0x71180

intel_reg_read is part of intel-gpu-tools.
Comment 17 Robert N 2014-01-14 16:13:47 UTC
Replies inline:

> Does the problem occur with just one of the displays plugged in?
I've never had this happen with only a single display.

> Or with two displays and lower resolution on both. 1920x1080@60 should be OK.
With both monitors 1920x1080@60 it doesn't happen

> Can you take some register dumps (for the original dual 2560x1440 case)? I'd
> like to check if things look OK. As root run this:
> 
> intel_reg_read 0xc6014 0xc6040 0xc6044 0xc6018 0xc6048 0xc604c 0x45100
> 0x45104 
> 0x45108 0x4510c 0x45110 0x45120 0x145d10 0x42000 0x42004 0x42020 0x70180
> 0x71180
> 
> intel_reg_read is part of intel-gpu-tools.
Do I run the intel_reg_read right after the problem has occurred, or when? (Never used that tool)

Thanks for looking into this guys, I'll do my best to get you any and all information you need to debug this!
Comment 18 Ville Syrjala 2014-01-14 16:18:33 UTC
(In reply to comment #17)
> Replies inline:
> 
> > Does the problem occur with just one of the displays plugged in?
> I've never had this happen with only a single display.
> 
> > Or with two displays and lower resolution on both. 1920x1080@60 should be OK.
> With both monitors 1920x1080@60 it doesn't happen
> 
> > Can you take some register dumps (for the original dual 2560x1440 case)? I'd
> > like to check if things look OK. As root run this:
> > 
> > intel_reg_read 0xc6014 0xc6040 0xc6044 0xc6018 0xc6048 0xc604c 0x45100
> > 0x45104 
> > 0x45108 0x4510c 0x45110 0x45120 0x145d10 0x42000 0x42004 0x42020 0x70180
> > 0x71180
> > 
> > intel_reg_read is part of intel-gpu-tools.
> Do I run the intel_reg_read right after the problem has occurred, or when?
> (Never used that tool)

You can run it as soon as the displays are lit up, and it's probably best to run it again after the problem has occurred (just to make sure the registers haven't changed magically in between).
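
One way to take the before/after dumps so they are easy to compare is sketched below. The helper name and the REGS variable are made up for this example; the register list is the one requested above, and intel_reg_read is assumed to be on the PATH (it is not invoked unless you run the commented usage lines as root).

```shell
#!/bin/bash
# Registers requested earlier in this bug.
REGS="0xc6014 0xc6040 0xc6044 0xc6018 0xc6048 0xc604c 0x45100 0x45104 \
0x45108 0x4510c 0x45110 0x45120 0x145d10 0x42000 0x42004 0x42020 0x70180 0x71180"

# Hypothetical helper: run the given command, save its output to a
# timestamped file in the current directory, and print the file name.
save_dump() {
    local out
    out="reg_read_$(date +%Y-%m-%dT%H:%M:%S%z).txt"
    "$@" > "$out"
    echo "$out"
}

# Usage (as root): once when the displays light up, once after an underrun.
#   before=$(save_dump intel_reg_read $REGS)
#   after=$(save_dump intel_reg_read $REGS)
#   diff "$before" "$after" && echo "registers unchanged"
```

Because the file name only has one-second resolution, two dumps taken within the same second would overwrite each other; in practice the dumps here are minutes apart.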
Comment 19 Robert N 2014-01-14 17:35:39 UTC
Created attachment 92058 [details]
kernel output
Comment 20 Robert N 2014-01-14 17:36:18 UTC
Created attachment 92059 [details]
reg_read_2014-01-14T09:33:48-0800.txt
Comment 21 Robert N 2014-01-14 17:36:32 UTC
Created attachment 92060 [details]
reg_read_2014-01-14T09:30:53-0800.txt
Comment 22 Robert N 2014-01-14 17:36:49 UTC
Created attachment 92061 [details]
reg_read_2014-01-14T09:30:42-0800.txt
Comment 23 Robert N 2014-01-14 17:37:16 UTC
Created attachment 92062 [details]
reg_read_2014-01-14T09:30:01-0800.txt
Comment 24 Robert N 2014-01-14 17:37:30 UTC
Created attachment 92063 [details]
reg_read_2014-01-14T09:29:03-0800.txt
Comment 25 Robert N 2014-01-14 17:40:00 UTC
Ok, I've gone ahead and bumped my kernel to this version:

Linux rnavarro-thinkpad 3.13.0-994-generic #201401140526 SMP Tue Jan 14 10:27:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

I went ahead and added a dump of the intel_reg_read output to rc.local to get an early dump.

I then manually ran a dump right when I logged in, and again a few more times after it happened again.

Here is the kernel log:
https://bugs.freedesktop.org/attachment.cgi?id=92058

P.S. This was a particularly good one, my main monitor didn't even turn back on this time after the underrun.
Comment 26 Robert N 2014-01-14 17:48:40 UTC
Also, to save you guys some time, I didn't notice any differences between any of the dumps from intel_reg_read
Comment 27 Ville Syrjala 2014-01-20 17:30:19 UTC
Hmm. The register dumps look perfectly fine. So the BIOS provided memory latency values being too low to keep the system happy remains my only theory.

Any chance there might be a BIOS update available for the machine? That might be worth a shot, although I can't guarantee that any update would affect the latency values.

In any case, I'll need to cook up some patches to allow run-time modification of the latency values, so that we can try and see if increasing them would actually help...
Comment 28 Robert N 2014-01-20 17:42:07 UTC
(In reply to comment #27)

> Any chance there might be a BIOS update available for the machine? That
> might be worth a shot, although I can't guarantee that any update would
> affect the latency values.
I took a look at the lenovo website and I'm currently running the latest BIOS revision, 1.39.
Comment 29 Ville Syrjala 2014-01-21 19:45:15 UTC
Created attachment 92540 [details] [review]
Patch to allow changing watermark latency values

This patch allows changing the latency values we use for computing the watermarks.

It adds three new debugfs files. "i915_pri_wm_latency" being the one we're interested in here.

Reading the file should give similar output as the kernel log had. So in this case it should look like this:

# cat i915_pri_wm_latency
Primary WM0 latency 7 (0.7 usec)
Primary WM1 latency 3 (1.5 usec)
Primary WM2 latency 4 (2.0 usec)
Primary WM3 latency 22 (11.0 usec)

What you could then do is write new latency values to the file. Let's say we try to double the latency values:
# echo '14 6 8 44' > i915_pri_wm_latency

Now reading the file again should show the new values. To actually make the system use them you'd need to force a modeset on all the displays.
"xset dpms force off; xset dpms force on" should be enough for that. After this is done you should see some change in the 0x45100 and 0x45104 registers.

And then it should just be a matter of trying to cause another underrun, and increasing the latency values until they no longer occur.
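
The scaling step can be sketched as a small shell helper, so that trying 2x, 3x and so on does not involve hand arithmetic. The function name is invented for this example; only the debugfs file name comes from the patch description above.

```shell
#!/bin/bash
# Hypothetical helper: multiply each raw latency value by an integer factor.
# Usage: scale_latencies FACTOR WM0 WM1 WM2 WM3
scale_latencies() {
    local factor=$1
    shift
    local out="" v
    for v in "$@"; do
        out="$out $(( v * factor ))"
    done
    # Strip the leading space before printing.
    echo "${out# }"
}

scale_latencies 2 7 3 4 22   # prints: 14 6 8 44

# Applying the result (as root, from the directory holding the file):
#   scale_latencies 2 7 3 4 22 > i915_pri_wm_latency
#   xset dpms force off; xset dpms force on
```

The starting values 7 3 4 22 are the ones expected on this machine; on other hardware, read the file first and scale whatever it reports.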
Comment 30 Robert N 2014-01-21 19:53:00 UTC
Hello,

I actually just updated my kernel version a few minutes ago to check to see how things were going.

I'm currently running:

Linux rnavarro-thinkpad 3.13.0-994-generic #201401210405 SMP Tue Jan 21 09:05:52 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Did this patch make it into the 01/21/14 nightly, or should I wait for the 01/22/14 nightly?

Additionally, where is the 'i915_pri_wm_latency' located?

I searched in /sys/kernel/debug and it didn't show up (which would make sense if the patch hadn't hit the nightly yet)
Comment 31 Ville Syrjala 2014-01-22 08:42:24 UTC
(In reply to comment #30)
> Hello,
> 
> I actually just updated my kernel version a few minutes ago to check to see
> how things were going.
> 
> I'm currently running:
> 
> Linux rnavarro-thinkpad 3.13.0-994-generic #201401210405 SMP Tue Jan 21
> 09:05:52 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> 
> Did this patch make it into the 01/21/14 nightly, or should I wait for the
> 01/22/14 nightly?

I didn't even post it to the mailing list yet. I can do that if it's easier for you to test it through that prebuilt kernel.

> Additionally, where is the 'i915_pri_wm_latency' located?

It will show up in /sys/kernel/debug/dri/0/
Comment 32 Robert N 2014-01-22 15:12:26 UTC
(In reply to comment #31)
> I didn't even post it to the mailing list yet. I can do that if it's easier
> for you to test it through that prebuilt kernel.

Oops jumping the gun a bit, yes it would be much easier for me to test in a pre-built kernel.

Thanks for your efforts in helping resolve this!
Comment 33 Ville Syrjala 2014-01-28 20:02:15 UTC
(In reply to comment #32)
> (In reply to comment #31)
> > I didn't even post it to the mailing list yet. I can do that if it's easier
> > for you to test it through that prebuilt kernel.
> 
> Oops jumping the gun a bit, yes it would be much easier for me to test in a
> pre-built kernel.
> 
> Thanks for your efforts in helping resolve this!

Daniel picked up the patch for -nightly, so hopefully it'll appear in your prebuilt kernels soonish.
Comment 34 Robert N 2014-01-29 16:52:33 UTC
Ok, so it looks like I have this now.

root@rnavarro-thinkpad:/sys/kernel/debug/dri/0# cat i915_pri_wm_latency
WM0 7 (0.7 usec)
WM1 3 (1.5 usec)
WM2 4 (2.0 usec)
WM3 22 (11.0 usec)

root@rnavarro-thinkpad:/sys/kernel/debug/dri/0# echo '14 6 8 44' > i915_pri_wm_latency

root@rnavarro-thinkpad:/sys/kernel/debug/dri/0# cat i915_pri_wm_latency
WM0 14 (1.4 usec)
WM1 6 (3.0 usec)
WM2 8 (4.0 usec)
WM3 44 (22.0 usec)

Ran the commands just as described, would it make sense to figure out what the minimums are?

Does it matter which WMx I'm changing?

Should I change them all at the same time as described, or one by one?
Comment 35 Ville Syrjala 2014-01-29 17:05:57 UTC
(In reply to comment #34)
> Ok, so it looks like I have this now.
> 
> root@rnavarro-thinkpad:/sys/kernel/debug/dri/0# cat i915_pri_wm_latency
> WM0 7 (0.7 usec)
> WM1 3 (1.5 usec)
> WM2 4 (2.0 usec)
> WM3 22 (11.0 usec)
> 
> root@rnavarro-thinkpad:/sys/kernel/debug/dri/0# echo '14 6 8 44' >
> i915_pri_wm_latency
> 
> root@rnavarro-thinkpad:/sys/kernel/debug/dri/0# cat i915_pri_wm_latency
> WM0 14 (1.4 usec)
> WM1 6 (3.0 usec)
> WM2 8 (4.0 usec)
> WM3 44 (22.0 usec)
> 
> Ran the commands just as described, would it make sense to figure out what
> the minimums are?

I guess we can try to narrow it down as much as possible. If the doubled values work, then we could bisect it further to find the smallest acceptable value. If the doubled values didn't work, might want to try 3x,4x,5x...

> 
> Does it matter which WMx I'm changing?

With two displays only WM0 will be used. The others only kick in to provide more power savings in single display use cases.

> 
> Should I change them all at the same time as described, or one by one?

Probably best to keep changing all in sync for now. I think we at least need to maintain the relationship WM0<=WM1<=WM2<=WM3 (for the usec values).

Also probably a good idea to check at each step that the change resulted in a corresponding change to the 0x45100 and 0x45104 register values. In your two display case, those two registers should always have an identical value to each other.
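
A quick way to check the WM0<=WM1<=WM2<=WM3 relationship before writing new values is sketched below. The units are inferred from the debugfs output quoted in this bug (WM0 7 shows as 0.7 usec, WM1 3 shows as 1.5 usec), so the 0.1 usec and 0.5 usec steps are an assumption, and the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical check: do four raw latency values keep WM0<=WM1<=WM2<=WM3
# once converted to time? Assumed units: WM0 counts 0.1 usec steps,
# WM1..WM3 count 0.5 usec steps.
wm_order_ok() {
    local prev=$(( $1 ))        # WM0, already in tenths of a usec
    shift
    local raw cur
    for raw in "$@"; do
        cur=$(( raw * 5 ))      # WM1..WM3 converted to tenths of a usec
        [ "$cur" -ge "$prev" ] || return 1
        prev=$cur
    done
    return 0
}

wm_order_ok 14 6 8 44 && echo "ordering ok"
wm_order_ok 20 3 4 22 || echo "WM0 exceeds WM1"
```

With the default WM1 of 3 (1.5 usec), this means WM0 can be raised on its own up to 15 before the other levels have to move as well.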
Comment 36 Robert N 2014-01-29 18:05:30 UTC
I'm not noticing a change in the register values when I change the latencies:

root@rnavarro-thinkpad:~# cat /sys/kernel/debug/dri/0/i915_pri_wm_latency
WM0 7 (0.7 usec)
WM1 3 (1.5 usec)
WM2 4 (2.0 usec)
WM3 22 (11.0 usec)

root@rnavarro-thinkpad:~# intel_reg_read 0x45100 0x45104
0x45100 : 0xD0006
0x45104 : 0xD0006

root@rnavarro-thinkpad:~# echo '14 6 8 44' > /sys/kernel/debug/dri/0/i915_pri_wm_latency

root@rnavarro-thinkpad:~# cat /sys/kernel/debug/dri/0/i915_pri_wm_latency
WM0 14 (1.4 usec)
WM1 6 (3.0 usec)
WM2 8 (4.0 usec)
WM3 44 (22.0 usec)

root@rnavarro-thinkpad:~# intel_reg_read 0x45100 0x45104
0x45100 : 0xD0006
0x45104 : 0xD0006

Is that unexpected?
Comment 37 Ville Syrjala 2014-01-29 18:32:27 UTC
(In reply to comment #36)
> I'm not noticing a change in the register values when I change the latencies:
> 
> root@rnavarro-thinkpad:~# cat /sys/kernel/debug/dri/0/i915_pri_wm_latency
> WM0 7 (0.7 usec)
> WM1 3 (1.5 usec)
> WM2 4 (2.0 usec)
> WM3 22 (11.0 usec)
> 
> root@rnavarro-thinkpad:~# intel_reg_read 0x45100 0x45104
> 0x45100 : 0xD0006
> 0x45104 : 0xD0006
> 
> root@rnavarro-thinkpad:~# echo '14 6 8 44' >
> /sys/kernel/debug/dri/0/i915_pri_wm_latency
> 
> root@rnavarro-thinkpad:~# cat /sys/kernel/debug/dri/0/i915_pri_wm_latency
> WM0 14 (1.4 usec)
> WM1 6 (3.0 usec)
> WM2 8 (4.0 usec)
> WM3 44 (22.0 usec)
> 
> root@rnavarro-thinkpad:~# intel_reg_read 0x45100 0x45104
> 0x45100 : 0xD0006
> 0x45104 : 0xD0006
> 
> Is that unexpected?

Did you do the "xset dpms force off; xset dpms force on" commands in between?
Comment 38 Robert N 2014-01-30 16:43:02 UTC
Ah yes, forgot to run that command. Once I do that the register values are changed.

Doubling all of these numbers seems to work great. I'm trying to track down what the minimums are, I reset everything to the defaults and I'm slowly bumping WM0 (while keeping with WM0<=WM1<=WM2<=WM3) to see where things stop breaking.

So far so good with this:

root@rnavarro-thinkpad:~# cat /sys/kernel/debug/dri/0/i915_pri_wm_latency; intel_reg_read 0x45100 0x45104
WM0 10 (1.0 usec)
WM1 3 (1.5 usec)
WM2 4 (2.0 usec)
WM3 22 (11.0 usec)
0x45100 : 0x120006
0x45104 : 0x120006

WM0 7 (0.7 usec) --> WM0 10 (1.0 usec)

However, I'm going to keep testing to make sure that the WM0 1.0usec is solid.

Thanks for the assistance thus far, we're getting close to pinning this down!
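
To make the register change easier to read, the two values seen so far can be split into fields. The layout used here (primary plane watermark in the bits from 16 upward, sprite in bits 8-15, cursor in bits 0-7) is an assumption about the WM0 pipe registers rather than something stated in this bug, and the function name is made up.

```shell
#!/bin/bash
# Hypothetical decoder for a WM0 pipe register value.
# Assumed layout: primary = bits 16 and up, sprite = bits 8-15, cursor = bits 0-7.
decode_wm0() {
    local reg=$(( $1 ))
    echo "primary=$(( reg >> 16 )) sprite=$(( (reg >> 8) & 0xff )) cursor=$(( reg & 0xff ))"
}

decode_wm0 0xD0006    # prints: primary=13 sprite=0 cursor=6
decode_wm0 0x120006   # prints: primary=18 sprite=0 cursor=6
```

Read that way, raising WM0 from 7 to 10 moved the primary plane watermark from 13 to 18 while leaving the cursor field alone.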
Comment 39 Robert N 2014-02-02 00:36:05 UTC
Hey Guys,

So after a few days of testing I've seen zero flickers with this config:

root@rnavarro-thinkpad:~# cat /sys/kernel/debug/dri/0/i915_pri_wm_latency
WM0 12 (1.2 usec)
WM1 3 (1.5 usec)
WM2 4 (2.0 usec)
WM3 22 (11.0 usec)

So I went from:
WM0 7 (0.7 usec) --> WM0 12 (1.2 usec)
Comment 40 Ville Syrjala 2014-02-26 13:16:26 UTC
Created attachment 94766 [details] [review]
drm/i915: Increase WM memory latency values on SNB with high pixel clock

This patch should make the driver automagically increase the latency values when encountering a high resolution display. Please test and report back whether it works as intended.
Comment 41 Robert N 2014-03-02 06:43:07 UTC
Sounds good, I'll keep an eye out for it on my prebuilt kernels and report back when it's merged in.
Comment 42 Daniel Vetter 2014-03-03 07:32:36 UTC
(In reply to comment #41)
> Sounds good, I'll keep an eye out for it on my prebuilt kernels and report
> back when it's merged in.

Nope, we won't merge this without positive testing feedback from you. Which means you need to apply this patch and build kernels yourself - we can't test every possible crazy hw combination out there ourselves and applying random patches to the main tree is a no-go (besides that usually it takes a bit of time for patches to land in pre-built kernels that way anyway).

If you can't test patches we need to close this as unresolved unfortunately.
Comment 43 Robert N 2014-03-03 15:08:58 UTC
Ok, I'll have to figure out how to compile the kernel for my OS. It may take some time, but I'll figure it out.
Comment 44 rubin110 2014-03-11 20:00:36 UTC
TL;DR: The patch seems to correct my issue. I was directed to this bug after complaining about it on the Intel-gfx list, although my issue isn't exactly identical. I'm also providing compile instructions so Robert N can verify the fix.

Some months ago I bought a cheap 27" S-IPS display off of ebay. The panel supports DVI, Displayport, and some others, and its native resolution is QHD 2560x1440. The display shipped from Korea, I plugged it into my Thinkpad X220 running Debian Sid and had a slew of issues. The seller went back and forth with me on trying to fix the issues, and provided replacement boards for the inside of the display, but ultimately the seller stalled and the one month period to request a refund flew by.

There are two issues I encounter...

Through a direct Displayport connection from my X220 to the monitor at full resolution, if there's a lot of motion on the screen (a full screen video or scrolling a web page back and forth) for about 60 seconds, the screen will blank out and return a few times until it acts as though the Displayport cable has been disconnected and there's no signal. Eventually the display will give up and go to sleep.

Through Displayport to Dual Link DVI via one of those adapters that requires power over USB, the display was more usable. During a lot of motion on the screen it wouldn't blank out, however after about 5 minutes of that all the pixels on the screen would vibrate together back and forth about 200px horizontally for half a second. This will repeat anywhere between every 30 seconds to 5 minutes. Again never blanking out.

If I drive the display at a smaller resolution like 1080p, I have no issues. The same goes for pushing over HDMI, but the max resolution there is 1080p anyhow. Additionally, I haven't noticed this issue on most actual name brand displays, namely the higher priced Dell displays.

Recently I was getting fed up with this issue and started looking for a replacement monitor. After realizing that blowing another $500 sort of sucks, I decided to do a little more testing. Using a spare Mac Mini I keep around for testing, I tested out Displayport to Displayport, playing a 1080p video on the display at full resolution. I encountered no issues (other than the Mac Mini simply dropping frames from the large video). Plugging a spare drive with a bootable copy of Windows 7 into my Thinkpad X220, I was again able to drive the display playing full screen video with zero issues over Displayport.

Throughout all my issues, I was never once able to find any sort of error or debug output in any logs. This includes kern.log, syslog and Xorg. Because of this I'm not sure whether my issue is the same as Robert N's, and therefore I would like him to verify the fix too, unless the devs here can safely say the issue I described is the same.

So at this point I started to poke people on lists and bug some of my smarter kernel hacker friends, which has brought me to this bug.

After grabbing a copy of the drm-intel nightly source, applying the patch, compiling and giving it a spin, I was able to play a full screened video at full display resolution without issue over Displayport. I left the video playing for 2 hours, no blank outs, no shaking. I have not tested Dual Link DVI yet.


My kernel compiling steps, for Robert N:

# Some packages you might need, I'm most likely missing things here that I already have, you can figure it out
sudo apt-get install libncurses5-dev kernel-package

# Let's make a directory to get messy in
mkdir ~/temp-kernel
cd ~/temp-kernel

# Grab a copy of the patch
wget -O bug70254.patch 'https://bugs.freedesktop.org/attachment.cgi?id=94766'

# Grab a copy of the nightly source, this will take a little while
git clone --depth=1 -b drm-intel-nightly git://people.freedesktop.org/~danvet/drm-intel

# Apply the patch
cd drm-intel
patch -p1 < ../bug70254.patch

# Copy your current kernel's config, this'll be different for you
cp /boot/config-3.12-1-amd64 .config

# Get the config ready for the new kernel, once this opens simply select save, save it at .config, and exit
make menuconfig

# We're now going to compile using fakeroot make-kpkg. This is a more sensible Debian way of putting together and installing kernels. In the end you should have two deb packages which you can later on apt-get remove easily if you want to. This (should) also take care of updating grub.

# Provide the number of cores you want to dedicate to compiling, if you don't know just select 1
export CONCURRENCY_LEVEL=3

# Start compiling and build a pair of deb packages for you, this will take a long while
fakeroot make-kpkg --initrd --append-to-version=-custom-drm-intel-nightly-bug70254 kernel-image kernel-headers modules_image

# You should now have two deb packages, linux-image and linux-headers, named along with the version number
cd ..
ls -la

# Install the new packages, which should be the only two debs in the current working directory
sudo dpkg -i linux-*.deb

# Reboot , there is a chance you need to select the kernel by hand when grub pops up during boot. I'm not sure how that all works in the Ubuntu world but I'm sure you can figure it out.

# When you've booted up, verify you're running the new patched kernel
uname -a

# I see: Linux lines 3.13.0-custom01+ #1 SMP Tue Mar 11 10:51:56 PDT 2014 x86_64 GNU/Linux
# Test away!


With all that being said if this patch gets accepted, approximately how soon till it might end up in a stable release of mainline kernel? Though honestly this current nightly kernel seems to be running a-ok thus far.

Additionally if I got a new fangled Lenovo dock with two Displayports, will I be able to drive two of the same displays at full resolution with this patch? I do understand I'll have to disable the laptop display to make the second external work.

Thanks!
Comment 45 Robert N 2014-03-12 20:50:32 UTC
Hey rubin110!

Thanks for the incredibly detailed instructions! (Stashing those away for the future!)

I'm compiling the new kernel as I write this, once it's done I'll reboot and start testing.

> Additionally if I got a new fangled Lenovo dock with two Displayports, will I be able to drive two of the same displays at full resolution with this patch? I do understand I'll have to disable the laptop display to make the second external work.

The answer is YES! I actually drive my dual screens (3x Dell U2713HM) using this docking station:

http://www.amazon.com/gp/product/B0085MQLGC

Using both of the DP connectors.
Comment 46 Robert N 2014-03-13 14:32:33 UTC
I got this compiled and running yesterday, worked for the rest of the evening without issues, I'll keep poking at it today to see how it goes. But things are definitely looking promising!
Comment 47 Robert N 2014-03-14 14:40:00 UTC
So far so good with this patch, no flickers, no blanking and I didn't even have to touch any of the wm_latency params.

I've tested this on both 3.13 and 3.14rc6 (currently running on 3.14) and things look great.

Thanks for all the hard work guys!
Comment 48 Robert N 2014-03-14 23:22:49 UTC
Hey Guys,

So about an hour ago I rebooted and forgot to select the custom kernel on boot up. Within 5 minutes screens were blanking and flickering like crazy....then I realized I was on the stock kernel.

I just wanted to stop and say thanks again for all the hard work, this has changed my computing experience greatly!

So far I've spent two days on the newly patched kernel with zero issues at all.
Comment 49 Ricardo Salveti de Araujo 2014-03-16 19:52:07 UTC
(In reply to comment #40)
> Created attachment 94766 [details] [review]
> drm/i915: Increase WM memory latency values on SNB with high pixel clock
> 
> This patch should make the driver automagically increase the latency values
> when encountering a high resolution display. Please test and report back
> whether it works as intended.

Backported this patch on top of latest 3.13 based Ubuntu kernel tree, and indeed fixed the issue described by this bug (you can find more details at https://bugs.launchpad.net/xserver-xorg-video-intel/+bug/1239186).

Let me know if you need any further testing before sending the patch upstream.
Comment 50 Jani Nikula 2014-03-17 07:16:50 UTC
(In reply to comment #40)
> Created attachment 94766 [details] [review]
> drm/i915: Increase WM memory latency values on SNB with high pixel clock
> 
> This patch should make the driver automagically increase the latency values
> when encountering a high resolution display. Please test and report back
> whether it works as intended.

Ville, has this been posted on the ml?
Comment 51 rubin110 2014-03-20 19:03:24 UTC
Is there anything else we testers need to do to get this bug into a fixed/verified state? Thanks.
Comment 52 Jani Nikula 2014-03-21 12:49:41 UTC
I posted Ville's patch for review [1]. Some further work is needed.

[1] http://mid.gmane.org/1395392448-6337-1-git-send-email-jani.nikula@intel.com
Comment 53 Robert N 2014-03-28 16:19:56 UTC
Were there any changes required for the posted patch?

Any indication on when it'll get pushed out so I can start running pre-compiled kernels again?

I know how to compile my own now, thanks rubin110, but it's far more convenient to not have to waste my time building a kernel.
Comment 54 Robert N 2014-04-11 15:46:12 UTC
Hey Guys,

I've asked around and the consensus is that the patch "still needs work" but I'm not sure what that work might be.

What other things are needed for this to get included?
Comment 55 Vitaly Minko 2014-05-13 17:24:41 UTC
I had the same issue. Ville's patch solved the problem. Thanks a lot guys. I wish you all the best!
Comment 56 Robert N 2014-05-14 01:05:09 UTC
Just as a heads up to all following this bug. Ville posted a cleaner patch here for testing:

http://patchwork.freedesktop.org/patch/25568/
Comment 57 Jani Nikula 2014-05-15 11:06:06 UTC
Fix pushed to drm-intel-fixes as

commit 94b93bc0093a37230ea7a0e91f04bfce677c430f
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Thu May 8 15:09:19 2014 +0300

    drm/i915: Increase WM memory latency values on SNB

Thanks for the report.
Comment 58 Jani Nikula 2014-05-15 11:13:19 UTC
(In reply to comment #57)
> Fix pushed to drm-intel-fixes as

commit e95a2f7509f5219177d6821a0a8754f93892ca56

> Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Date:   Thu May 8 15:09:19 2014 +0300
> 
>     drm/i915: Increase WM memory latency values on SNB
> 
> Thanks for the report.
Comment 59 Jari Tahvanainen 2016-10-06 07:36:10 UTC
Closing due to "patch solved the problem"
