Summary: | [945 pf interrupt] Freezes when compiz enabled | |
---|---|---|---
Product: | xorg | Reporter: | Geir Ove Myhr <gomyhr>
Component: | Driver/intel | Assignee: | Carl Worth <cworth>
Status: | RESOLVED DUPLICATE | QA Contact: | Xorg Project Team <xorg-team>
Severity: | major | |
Priority: | medium | CC: | bryce, cfeck, tt.hogehoge
Version: | unspecified | |
Hardware: | x86 (IA32) | |
OS: | Linux (All) | |
Whiteboard: | | |
i915 platform: | | i915 features: |
Description
Geir Ove Myhr
2010-01-12 14:29:54 UTC
Created attachment 32598 [details]
Batchbuffer dump. i915_regs excluded, since reading it causes hangs on this machine
Created attachment 32599 [details]
BootDmesg.txt
Created attachment 32600 [details]
CurrentDmesg.txt
Created attachment 32601 [details]
PciDisplay.txt
Created attachment 32602 [details]
XorgLog.txt
Created attachment 33630 [details]
dri_debug tarball

From downstream: Takashi has tested with a drm-intel-next kernel that should detect a GPU hang and add information to i915_error_state. Possibly, this is because he uses Ubuntu 9.10 and not 10.04 now. Not sure which packages are relevant.

-- from downstream --
I installed the latest drm-intel-next kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-next/2010-02-24/ and got a batchbuffer dump.

    sudo service apport start force_start=1
    mkdir dri_debug-$datestr
    sudo cp -r /sys/kernel/debug/dri/0/i915* dri_debug-$datestr
    sudo intel_gpu_dump > dri_debug-$datestr/intel_gpu_dump.txt
    dmesg > dri_debug-$datestr/dmesg.txt
    cp /var/log/Xorg.0.log dri_debug-$datestr/
    sudo cp /var/log/gdm/\:0.log dri_debug-$datestr/gdm.log
    sudo tar czf dri_debug-$datestr.tgz dri_debug-$datestr/

The batchbuffer dump is attached. i915_error_state shows "no error state collected".

*** Bug 26898 has been marked as a duplicate of this bug. ***

Bumping the priority down on this bug, only because we don't expect to have this fixed in time for the release that's coming together right now.

-Carl

A page-flipping bug. The Q2 release should have most of these fixed, at least the known ones...

The kernel patches are upstream as part of 2.6.35-rc4:

commit 1afe3e9d4335bf3bc5615e37243dc8fef65dac8f
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Fri Mar 26 10:35:20 2010 -0700

    drm/i915: gen3 page flipping fixes

    Gen3 chips have slightly different flip commands, and also contain a bit that indicates whether a "flip pending" interrupt means the flip has been queued or has been completed.

    So implement support for the gen3 flip command, and make sure we use the flip pending interrupt correctly depending on the value of ECOSKPD bit 0.

    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>

commit 83f7fd055eb3f1e843803cd906179d309553967b
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Mon Apr 5 14:03:51 2010 -0700

    drm/i915: don't queue flips during a flip pending event

    Hardware will set the flip pending ISR bit as soon as it receives the flip instruction, and (supposedly) clear it once the flip completes (e.g. at the next vblank). If we try to send down a flip instruction while the ISR bit is set, the hardware can become very confused, and we may never receive the corresponding flip pending interrupt, effectively hanging the chip.

    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>

I believe that this fixes the page-flipping issues on i945.

(In reply to comment #10)
> The kernel patches are upstream as part of 2.6.35-rc4:
[...]
> I believe that this fixes the page-flipping issues on i945.

Seems it didn't: the current Ubuntu 10.10 (Maverick) development kernel is based on 2.6.35-rc5 and it still happens. The original reporter says downstream: It still freezes with Maverick (13th July), same as Tomas.

$ cat /sys/kernel/debug/dri/0/i915_error_state
no error state collected

dmesg is attached here.
dmesg shows that the freeze is caused by a mutex lock.

It may be caused by the sequence below.
1. mutex locked and not unlocked.
2. DRM_IOCTL_I915_GEM_PREAD waits for the lock.
3. dmesg shows error (120s after freeze).

Created attachment 37104 [details]
dmesg output with Ubuntu kernel 2.6.35-8.13 (based on -rc5)
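
The three-step sequence proposed above can be illustrated with a small user-space toy model. This is only a sketch in Python, not kernel code: `simulate_pread`, `holder_releases` and `warn_after` are hypothetical names, an ordinary `threading.Lock` stands in for the DRM mutex, and a short timeout stands in for the 120 s after which dmesg reports the error (presumably the kernel's hung-task warning, which is an assumption here since the attachment contents are not quoted).

```python
import threading


def simulate_pread(holder_releases: bool, warn_after: float = 0.2) -> str:
    """Toy model: does a waiter get the mutex before the warning timeout?"""
    mutex = threading.Lock()
    locked = threading.Event()   # set once the holder owns the mutex
    release = threading.Event()  # tells the holder to unlock

    def holder() -> None:
        with mutex:              # step 1: mutex locked
            locked.set()
            release.wait()       # held until told to unlock

    thread = threading.Thread(target=holder)
    thread.start()
    locked.wait()
    if holder_releases:
        release.set()            # healthy case: the holder unlocks promptly

    # Step 2: the pread-like waiter blocks on the mutex.
    acquired = mutex.acquire(timeout=warn_after)
    if acquired:
        mutex.release()

    release.set()                # always let the holder thread exit
    thread.join()

    # Step 3: the timeout stands in for the error that appears after 120 s.
    return "pread completed" if acquired else "pread blocked past timeout"


if __name__ == "__main__":
    print(simulate_pread(holder_releases=True))
    print(simulate_pread(holder_releases=False))
```

The model only shows why a lock that is never released would produce the reported symptom; it says nothing about why the real driver would fail to release it.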
(In reply to comment #11)
> dmesg is attached here.
> dmesg shows that the freeze is caused by a mutex lock.
>
> It may be caused by the sequence below.
> 1. mutex locked and not unlocked.
> 2. DRM_IOCTL_I915_GEM_PREAD waits for the lock.
> 3. dmesg shows error (120s after freeze).

Or it's a missed interrupt ;-) Want to place a bet?

(In reply to comment #13)
There were 2 problems, I think.
(1) screen lock at any time.
(2) screen lock when 3D is in use (blender, compiz, ...).
Both of them show a similar dmesg error (lock time over 120s).
But (1) is unlocked by a mouse move or keyboard interrupt, and (1) might be fixed by 2.6.35-rc4.

> Or it's a missed interrupt ;-)
I'll bet on the sequence below :-)
1. mutex locked
2. missed interrupt -> not unlocked.
3. DRM_IOCTL_I915_GEM_PREAD waits for the lock.
4. dmesg shows error (120s after freeze).

Beyond the usual fixes in 2.6.35, 2.6.36-rc2 contains a patch to fix up missed interrupts, http://cgit.freedesktop.org/~ickle/drm-intel drm-intel-fixes contains a patch for one observed source of missed interrupts and http://cgit.freedesktop.org/~ickle/drm-intel drm-intel-next contains an enhanced hangcheck.

I tested 2.6.36-rc3. The freeze may be fixed, but I got the dmesg output below.
[ 114.364014] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[ 352.800012] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[ 355.488019] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[ 366.724024] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[ 114.364014] is caused by blender. Others are caused by blender or compiz.

Still need to solve why the interrupt stops firing, but these two bugs have now been reduced to the same problem.

*** This bug has been marked as a duplicate of bug 25345 ***
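
For anyone comparing logs across kernels, the hangcheck messages quoted above can be tallied with a short script. The following is a minimal sketch in Python that assumes dmesg lines have exactly the shape shown in this report; `missed_irq_times` and `SAMPLE` are hypothetical names introduced for illustration, and the pattern may need adjusting for other kernel versions.

```python
import re
from typing import List

# Matches the i915 hangcheck line as quoted in this report (assumed format).
HANGCHECK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:i915_hangcheck_elapsed\]\s+\*ERROR\*\s+"
    r"Hangcheck timer elapsed\.\.\.\s+(?P<detail>.*)"
)


def missed_irq_times(dmesg_text: str) -> List[float]:
    """Return the timestamps (seconds since boot) of 'missed IRQ' hangchecks."""
    times = []
    for line in dmesg_text.splitlines():
        match = HANGCHECK_RE.search(line)
        if match and "missed IRQ" in match.group("detail"):
            times.append(float(match.group("ts")))
    return times


# The four lines reported with 2.6.36-rc3 in this bug.
SAMPLE = """\
[ 114.364014] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[ 352.800012] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[ 355.488019] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[ 366.724024] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
"""

if __name__ == "__main__":
    for ts in missed_irq_times(SAMPLE):
        print(f"missed IRQ reported at {ts:.6f}s")
```

Feeding it the text of a saved dmesg file instead of `SAMPLE` gives the times at which the hangcheck recovered from a missed interrupt, which can then be matched against when blender or compiz was running.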