Summary: | intel_dp WARN_ON(!msg->buffer != !msg->size)
---|---
Product: | DRI
Reporter: | mwa <matthew.auld>
Component: | DRM/Intel
Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: | CLOSED FIXED
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal
Priority: | highest
CC: | aleksey, andrey.vihrov, dex+fdobugzilla, intel-gfx-bugs, leho, mihai.dontu, peter.ujfalusi, reddy.harshak, yex.tian, ziegler
Version: | DRI git
Hardware: | x86-64 (AMD64)
OS: | Linux (All)
Whiteboard: |
i915 platform: | BDW
i915 features: | display/DP
Description
mwa
2016-08-14 13:23:24 UTC
So this will bisect to

commit dd788090822300a66ff469ae9e50f6d28d124eb8
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Thu Jul 28 17:55:04 2016 +0300

    drm/i915: Warn about aux msg buffer vs. size mismatch

but that commit just uncovers a pre-existing bug elsewhere. The bug is lane count being 0 when we end up in link training. How that happens, I don't know.

Matthew can you attach kernel log and update i915 platform field?

Fix (from Matthew) landing here: https://patchwork.freedesktop.org/series/11667/

*** Bug 98304 has been marked as a duplicate of this bug. ***

*** Bug 98288 has been marked as a duplicate of this bug. ***

Bug https://bugs.freedesktop.org/show_bug.cgi?id=98287 "gpu hangs after hibernation" which hit me in 4.9-rc1 is still there in 4.9.0-rc5

Created attachment 128018 [details]
dmesg.txt

Jumping here from https://bugzilla.kernel.org/show_bug.cgi?id=187571

I'm on HSW, and 4.9-rc5 is flooding dmesg with the subject matter. This did not occur on 4.8-rc4 that I somehow ended up running without a reboot for 70 days straight (it set a new `uptimed` record).

4.9 also surprised me by not recognizing HDMI connector unplugging anymore. This hasn't occurred for a long time and I've been on bleeding edge kernels here since mid-3.x. When I unplug the monitor, no display re-configurations happen. Suspend-resume cycle helps restore connector state sanity, after wakeup the extra display is gone (verified in Gnome Display Settings).

...
nov 14 00:52:46 papaya kernel: [drm] Memory usable by graphics device = 2048M
nov 14 00:52:46 papaya kernel: [drm] VT-d active for gfx access
nov 14 00:52:46 papaya kernel: [drm] Replacing VGA console driver
nov 14 00:52:46 papaya kernel: [drm] DMAR active, disabling use of stolen memory
nov 14 00:52:46 papaya kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
nov 14 00:52:46 papaya kernel: [drm] Driver supports precise vblank timestamp query.
nov 14 00:52:46 papaya kernel: vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
nov 14 00:52:46 papaya kernel: ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no)
nov 14 00:52:46 papaya kernel: input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input6
nov 14 00:52:46 papaya kernel: [drm] Initialized i915 1.6.0 20160919 for 0000:00:02.0 on minor 0
nov 14 00:52:46 papaya kernel: ------------[ cut here ]------------
nov 14 00:52:46 papaya kernel: WARNING: CPU: 0 PID: 38 at drivers/gpu/drm/i915/intel_dp.c:1062 intel_dp_aux_transfer+0x1dc/0x220 [i915]
nov 14 00:52:46 papaya kernel: WARN_ON(!msg->buffer != !msg->size)
nov 14 00:52:46 papaya kernel: Modules linked in:
nov 14 00:52:46 papaya kernel: crc32_pclmul i915 fbcon bitblit softcursor font intel_gtt sdhci_pci sdhci mmc_core drm_kms_helper cfbfillrect syscopyarea cfbimgblt ehci_pci sysfillrect sysimgblt ehci_hcd fb_sys_
nov 14 00:52:46 papaya kernel: CPU: 0 PID: 38 Comm: kworker/0:1 Tainted: G U 4.9.0-rc5-bfq-gentoo+ #20
nov 14 00:52:46 papaya kernel: Hardware name: Dell Inc. Latitude E7440/0PC4X0, BIOS A18 04/28/2016
nov 14 00:52:46 papaya kernel: Workqueue: events i915_hotplug_work_func [i915]
nov 14 00:52:46 papaya kernel: ffffc90000167b98 ffffffff8137adfb ffffc90000167be8 0000000000000000
nov 14 00:52:46 papaya kernel: ffffc90000167bd8 ffffffff810523cc 00000426607500e1 ffffc90000167ca8
nov 14 00:52:46 papaya kernel: ffff880407a800e0 0000000000000003 0000000000000000 ffff880407a80158
nov 14 00:52:46 papaya kernel: Call Trace:
nov 14 00:52:46 papaya kernel: [<ffffffff8137adfb>] dump_stack+0x4d/0x72
nov 14 00:52:46 papaya kernel: [<ffffffff810523cc>] __warn+0xcc/0xf0
nov 14 00:52:46 papaya kernel: [<ffffffff8105243a>] warn_slowpath_fmt+0x4a/0x50
nov 14 00:52:46 papaya kernel: [<ffffffffa021a31c>] ? intel_dp_aux_transfer+0xcc/0x220 [i915]
nov 14 00:52:46 papaya kernel: [<ffffffffa021a42c>] intel_dp_aux_transfer+0x1dc/0x220 [i915]
nov 14 00:52:46 papaya kernel: [<ffffffffa00f0bc8>] drm_dp_dpcd_access+0x58/0xf0 [drm_kms_helper]
nov 14 00:52:46 papaya kernel: [<ffffffffa00f0c76>] drm_dp_dpcd_write+0x16/0x20 [drm_kms_helper]
nov 14 00:52:46 papaya kernel: [<ffffffffa0215cc8>] intel_dp_start_link_train+0x2a8/0x4c0 [i915]
nov 14 00:52:46 papaya kernel: [<ffffffffa0217106>] intel_dp_check_link_status+0xb6/0xf0 [i915]
nov 14 00:52:46 papaya kernel: [<ffffffffa021ba0b>] intel_dp_detect+0x72b/0xbb0 [i915]
nov 14 00:52:46 papaya kernel: [<ffffffffa02049ff>] i915_hotplug_work_func+0x1df/0x2b0 [i915]
nov 14 00:52:46 papaya kernel: [<ffffffff8106a3a0>] process_one_work+0x140/0x3e0
nov 14 00:52:46 papaya kernel: [<ffffffff8106a689>] worker_thread+0x49/0x480
nov 14 00:52:46 papaya kernel: [<ffffffff8106a640>] ? process_one_work+0x3e0/0x3e0
nov 14 00:52:46 papaya kernel: [<ffffffff8106a640>] ? process_one_work+0x3e0/0x3e0
nov 14 00:52:46 papaya kernel: [<ffffffff8106f9a5>] kthread+0xc5/0xe0
nov 14 00:52:46 papaya kernel: [<ffffffff8106f8e0>] ? kthread_park+0x60/0x60
nov 14 00:52:46 papaya kernel: [<ffffffff81632fd2>] ret_from_fork+0x22/0x30
nov 14 00:52:46 papaya kernel: ---[ end trace 60f064180c1be639 ]---
nov 14 00:52:46 papaya kernel: ------------[ cut here ]------------

I can still see the warning with 4.9-rc7, but other than that no other visible issues (suspend/resume works OK on my HSW).

With 4.9-rc7 I still get around 49 warnings at boot and a crash of the X-server after hibernation:

Nov 27 23:20:04 kernel: [drm] GPU HANG: ecode 8:0:0x5d1a7470, in Xorg [2162], reason: Hang on render ring, action: reset
Nov 27 23:20:04 kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Nov 27 23:20:04 kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Nov 27 23:20:04 kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Nov 27 23:20:04 kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Nov 27 23:20:04 kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Nov 27 23:20:04 kernel: drm/i915: Resetting chip after gpu hang

I think this is a symptom of something worse, and our CI should catch this. The GPU hang seen in comment #9 is unrelated, though.

Created attachment 128581 [details]
dmesg with drm.debug=0x1e log_buf_len=1M
It is still happening with 4.9 kernel: flood of WARN_ON(!msg->buffer != !msg->size) on Toshiba Satellite Z30-A during boot.
Note that I need i915.enable_psr=0 in order to boot since 4.6.
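
For readers unsure what the drm.debug=0x1e value used for the attached dmesg selects, here is a minimal sketch that decodes such a mask into category names. The bit-to-name table is an assumption recalled from DRM headers of roughly this kernel generation (CORE 0x01, DRIVER 0x02, KMS 0x04, PRIME 0x08, ATOMIC 0x10, VBL 0x20) and should be checked against the kernel actually in use; `describe_drm_debug` is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed drm.debug category bits (recalled from 4.9-era DRM headers;
 * verify against your kernel before relying on them). */
struct debug_bit {
    unsigned int bit;
    const char *name;
};

static const struct debug_bit drm_debug_bits[] = {
    { 0x01, "CORE" },
    { 0x02, "DRIVER" },
    { 0x04, "KMS" },
    { 0x08, "PRIME" },
    { 0x10, "ATOMIC" },
    { 0x20, "VBL" },
};

/* Hypothetical helper: write the space-separated names of the
 * categories set in mask into out (at most len bytes, NUL included). */
static void describe_drm_debug(unsigned int mask, char *out, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len == 0)
        return;
    out[0] = '\0';

    for (i = 0; i < sizeof(drm_debug_bits) / sizeof(drm_debug_bits[0]); i++) {
        int n;

        if (!(mask & drm_debug_bits[i].bit))
            continue;
        n = snprintf(out + used, len - used, "%s%s",
                     used ? " " : "", drm_debug_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output would not fit; stop at what we have */
        used += (size_t)n;
    }
}
```

Under that assumed table, calling `describe_drm_debug(0x1e, buf, sizeof(buf))` selects every category except CORE and VBL.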
Created attachment 128582 [details]
dmesg with drm.debug=0x1e log_buf_len=1M
Different laptop with 4.9 kernel: flood of WARN_ON(!msg->buffer != !msg->size) on Dell Latitude E7440 during boot.
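
To illustrate why a lane count of 0 would trip this warning, below is a simplified userspace model of the check. It is a sketch only: `struct aux_msg_model` and `model_lane_write` are hypothetical stand-ins, not the kernel's structures or functions, and the idea that the link training write passes a valid buffer with the lane count as its size is an assumption drawn from the call trace (drm_dp_dpcd_write called from intel_dp_start_link_train) and the analysis earlier in this report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an AUX message; NOT the kernel's own struct. */
struct aux_msg_model {
    void *buffer;
    size_t size;
};

/* Same expression as in the warning: true when exactly one of
 * "buffer is NULL" and "size is zero" holds, i.e. the two disagree. */
static bool aux_msg_mismatch(const struct aux_msg_model *msg)
{
    return !msg->buffer != !msg->size;
}

/* Hypothetical model of a per-lane write during link training:
 * the buffer is the training-set array, the size is the lane count
 * (an assumption, see the lead-in above). */
static struct aux_msg_model model_lane_write(uint8_t *train_set,
                                             size_t lane_count)
{
    struct aux_msg_model msg = { .buffer = train_set, .size = lane_count };

    return msg;
}
```

In this model a non-NULL buffer with lane_count == 0 is a mismatch, while a NULL buffer with size 0 and a valid buffer with a non-zero lane count both pass the check.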
The patch

http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-3-chris@chris-wilson.co.uk

applied to 4.9 solved my problem.

(In reply to Martin Ziegler from comment #13)
> The patch
>
> http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-3-chris@chris-wilson.co.uk
>
> applied to 4.9 solved my problem.

That should be totally unrelated.

If I disable: CONFIG_DRM_FBDEV_EMULATION, CONFIG_FB and CONFIG_FRAMEBUFFER_CONSOLE the WARN_ON flood is gone, but obviously I no longer see the boot messages.

With stable kernel 4.9.3 I can confirm that the warning is gone and the "failed to update link training" error is gone too on Intel HD Graphics 5500 (no external monitor).

4.9.3 includes this commit: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=d4cb3fd9b548b8bfe2a712ec920b9ebabd3547ab

I can confirm 4.9.3 fixes the issue for me too.

It is gone for me also with 4.9.3.

Regards,
Péter

So based on latest comments, it looks like this is fixed in upstream: resolving as fixed. mwa, please confirm that we can close it.