Summary: | [radeon-rewrite] crashes server through radeonRefillCurrentDmaRegion | ||
---|---|---|---|
Product: | Mesa | Reporter: | Tormod Volden <bugzi11.fdo.tormod> |
Component: | Drivers/DRI/r300 | Assignee: | Default DRI bug account <dri-devel> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | fatih, lowell87, pavel, pedretti.fabio |
Version: | unspecified | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Xorg log with backtrace
gdm log with mismatch and assertion
full backtrace of Xorg
Additional backtrace
Torcs crashing when trying to start the race |
Description
Tormod Volden
2009-05-05 15:22:31 UTC
Created attachment 25524 [details]
Xorg log with backtrace
Same trigger, same backtrace, same stack, except for the kernel being 2.6.28 (Jaunty default).
*** Bug 21618 has been marked as a duplicate of this bug. ***
The problem is in radeonRefillCurrentDmaRegion: we call radeon_revalidate_bos, which calls radeonFlush, which frees rmesa->dma.current (only if some conditions are met), so we end up dereferencing a null pointer in radeon_bo_map. There are two solutions:
- remove radeon_revalidate_bos from radeonRefillCurrentDmaRegion,
- check if rmesa->dma.current is null after calling radeon_revalidate_bos and create a new bo if necessary.
Both solutions proved to work; unfortunately I don't know which one is the correct one. If Jerome Glisse doesn't know either, we probably have to wait for Dave Airlie to decide.
I naively commented out the radeon_revalidate_bos line, but then Xorg crashes at startup. Is the workaround more complicated? This bug keeps me from testing radeon-rewrite much, so a temporary workaround would be most welcome if the real fix has to wait.
(In reply to comment #5)
> I naively commented out the radeon_revalidate_bos line, but then Xorg crashes
> at startup. Is the workaround more complicated? This bug keeps me from testing
> radeon-rewrite much, so a temporary workaround would be most welcome if the
> real fix has to wait.
>
Can you post a backtrace of where it crashes with the radeon_revalidate_bos call commented out?
It did not crash now. Last time, I had tested it against 76a64958a4ca38ec27b63a909979c493c507b952, so it was probably compiz and bug 21776 that kicked in and got me confused.
I am not 100% sure this is relevant, but when I do the window cycling with alt-tab in compiz, there is now some lag and it sometimes hangs for up to a few seconds on my M26 card. On my RV515 there is no lag.
Commit a13e96359baaa0331561f86ef6487feba6540464 should bring a definitive fix for this issue; please reopen if it's not the case.
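For readers following the analysis of radeonRefillCurrentDmaRegion earlier in this thread, here is a minimal sketch of the second proposed option (re-check the pointer after revalidation and allocate a new buffer if it was freed). It is a toy model, not the Mesa code: every `toy_*` name, the struct layouts, and the `must_flush` parameter are hypothetical stand-ins, and it is not necessarily what the commits mentioned here actually do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a buffer object and a driver context. */
struct toy_bo {
    void *ptr;
    size_t size;
};

struct toy_ctx {
    struct toy_bo *dma_current; /* plays the role of rmesa->dma.current */
    int flushes;                /* counts flushes, for illustration only */
};

static struct toy_bo *toy_bo_open(size_t size)
{
    struct toy_bo *bo = malloc(sizeof *bo);
    if (bo == NULL)
        return NULL;
    bo->ptr = malloc(size);
    if (bo->ptr == NULL) {
        free(bo);
        return NULL;
    }
    bo->size = size;
    return bo;
}

static void toy_bo_unref(struct toy_bo *bo)
{
    if (bo != NULL) {
        free(bo->ptr);
        free(bo);
    }
}

/* Models a flush that releases the current DMA region as a side effect. */
static void toy_flush(struct toy_ctx *ctx)
{
    toy_bo_unref(ctx->dma_current);
    ctx->dma_current = NULL;
    ctx->flushes++;
}

/* Models revalidation that only sometimes has to flush. */
static void toy_revalidate(struct toy_ctx *ctx, int must_flush)
{
    if (must_flush)
        toy_flush(ctx);
}

/* Refill the DMA region; returns the usable memory or NULL on failure. */
static void *toy_refill(struct toy_ctx *ctx, size_t size, int must_flush)
{
    toy_bo_unref(ctx->dma_current);
    ctx->dma_current = toy_bo_open(size);

    toy_revalidate(ctx, must_flush);

    /* The region may have been freed behind our back: re-check it. */
    if (ctx->dma_current == NULL) {
        ctx->dma_current = toy_bo_open(size);
        if (ctx->dma_current == NULL)
            return NULL;
    }
    return ctx->dma_current->ptr; /* stands in for mapping the bo */
}
```

Calling `toy_refill(&ctx, 64, 1)` on a zero-initialised context exercises the path where the flush frees the region. Without the re-check, the final dereference would hit a null pointer, which is the crash pattern described in the analysis.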
I am afraid I still see the original problem even after updating to 7dd184dc4da37233471875df6f40cce0560cb7bc.
This time 9b1efcb87c794ded9306f01336d48a80aaad3261 (the commit just after the one you tested last) fixes the issue :). Once again, if it's not the case, reopen.
Created attachment 26207 [details]
gdm log with mismatch and assertion
With 9dee2f20... I don't get the same backtrace, but X dies and I can only find some errors and a failed assertion in the gdm log:
CS section size missmatch start at (r300_cmdbuf.c,emit_tex_offsets,182) 4 vs 2
CS section end at (r300_cmdbuf.c,emit_tex_offsets,202)
X: radeon_common.c:1008: radeon_validate_bo: Assertion `radeon->state.validated_bo_count < 24' failed.
BTW, there was also a "failed to revalidate buffers" in between the mismatch errors in the log.
Created attachment 26254 [details]
full backtrace of Xorg
The mismatch messages come all the time, they are not causing the crash.
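To illustrate what the `validated_bo_count < 24` assertion quoted above guards, here is a minimal sketch of a fixed-capacity list of validated buffers. It is a toy model under stated assumptions: the `toy_*` names and `TOY_MAX_BOS` are hypothetical, the capacity of 24 is taken only from the assertion message, and flushing and retrying when the list is full is just one possible way to avoid an abort, not a description of the fix that went into Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BOS 24 /* mirrors the limit shown in the assertion message */

struct toy_state {
    const void *validated[TOY_MAX_BOS];
    int validated_count;
};

/* Track a buffer; report failure instead of asserting when the list is full. */
static bool toy_validate_bo(struct toy_state *st, const void *bo)
{
    for (int i = 0; i < st->validated_count; i++) {
        if (st->validated[i] == bo)
            return true; /* already tracked, nothing to add */
    }
    if (st->validated_count >= TOY_MAX_BOS)
        return false; /* the real driver aborts on an assertion here */
    st->validated[st->validated_count++] = bo;
    return true;
}

/* Assumption: a flush submits pending work, so the list can be emptied. */
static void toy_flush_validated(struct toy_state *st)
{
    st->validated_count = 0;
}

/* Validate a buffer, flushing once and retrying if the list was full. */
static bool toy_validate_or_flush(struct toy_state *st, const void *bo)
{
    if (toy_validate_bo(st, bo))
        return true;
    toy_flush_validated(st);
    return toy_validate_bo(st, bo);
}
```

In this model the 25th distinct buffer makes `toy_validate_bo` return false, which is where an assertion-based version would kill the X server. `toy_validate_or_flush` recovers by emptying the list first.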
Created attachment 26265 [details]
Additional backtrace
Not sure if additional backtraces will be helpful on this or not... mine looks slightly different.
Is there a way to track down what X command is being sent to cause this bug? I have a specific button that causes this crash every time I click it... Can I trace the X protocol between client/server to get additional info to help track this down?
This is a different bug. Tormod, are you using KMS? Does gdm crash with compiz enabled? Disabled? I can't reproduce the bug here, with KMS or without. Is it with RV515? Others?
I pushed a change to how we emit texture offsets in r300; please test with 2f9189d538ac56bd241ccc8f8f82bc4fdd779aa6 and report whether it helps with this new issue.
> This is different bug, Tormod are you using KMS ? Does gdm crash with compiz
> enabled ? disabled ? I can't reproduce the bug here with kms or not. Is it with
> rv515 ? Others ?
It appears to be the same as initially reported; the backtrace has only changed to an assertion failure instead of a SEGV:
"As soon as I press Alt-Tab to cycle windows (running compiz), X crashes.
This is with latest radeon-rewrite and -ati driver on Ubuntu 9.04 with
2.6.30-rc based kernel."
So I am not using KMS, and gdm does not crash. It happens on M26 and has never happened on RV515.
I have to correct what I said before on lag: I sometimes do see this lag on RV515 also. (Similar to the lag I saw on M26 when I worked around the crash by commenting out radeon_revalidate_bos.) So the lag is likely unrelated. Just that every time it lags, my heart jumps and I think it is the crash kicking in :)
I will try 2f9189d538ac56bd241ccc8f8f82bc4fdd779aa6 later today. Thanks!
I could test 2f9189d5 on RV515 now, and alt-tab (with compiz) makes it crash:
X: radeon_common.c:1008: radeon_validate_bo: Assertion `radeon->state.validated_bo_count < 24' failed.
On the good side of things, the mismatch messages are gone now.
Tested 5dcbcbfca4f3c00de1fdab28d1cc8d691f67edce on both RV515 and M26 and got the same assertion failure.
Did you restart Xorg after installing the latest rewrite lib? Also, does Xorg load the new driver? I have no luck reproducing your bug. How many windows do you have open? Which software? I test with compiz + firefox + midori + several terminals all running at the same time, and cycling through windows with alt-tab works properly, no crash.
Yes, I always restart X after installing the new mesa. I make distribution packages so I am sure the new mesa overwrites the old and there is only one version installed on my machine at any time. To reproduce my setup you can boot a Ubuntu 9.04 live CD and then install these packages on top of it: https://launchpad.net/~xorg-edgers/+archive/radeon
After logging in to the default Gnome session, I open two gnome-terminal windows and press alt-tab. I have noticed that I sometimes can swap windows without crashing if I press alt-tab for only a very short moment. But if I keep it down to get the window selector displayed, it crashes. On what cards have you tried?
I have an M10 and I'm also using Tormod's radeon-rewrite packages. I'm able to reproduce the issue consistently using Amarok (KDE music player) package version 2.0.2mysql5.1.30-0ubuntu3. I can cause the crash by opening up Amarok and bringing up the "Collection" panel. As soon as I click the "Advanced" button (part of the search interface) at the top of that panel, Xorg crashes with the backtrace I previously provided.
01:00.0 VGA compatible controller [0300]: ATI Technologies Inc RV350 [Mobility Radeon 9600 M10] [1002:4e50]
Subsystem: IBM Device [1014:0550]
I don't remember if I was running the stock 2.28.12-generic kernel, or the 2.6.29-02062902-generic kernel at the time I got this backtrace. If there is some way to get some additional state information, or somehow trace the X client/server communications to determine what command/request causes this, let me know.
I could not reproduce this bug, unfortunately.
System:
- Ubuntu 9.04 (Jaunty)
- Standard kernel
- Packages from:
  * deb http://ppa.launchpad.net/xorg-edgers/radeon/ubuntu jaunty main
  * deb http://ppa.launchpad.net/tormodvolden/ppa/ubuntu jaunty main
Compiz is running, glxinfo confirms that the system is running radeon-rewrite in DRI1/non-KMS mode.
Neither Alt+Tab nor the Amarok steps mentioned in comment #23 crash anything.
Graphics card is a Radeon X1650 Pro, connected via AGP (PCI ID 1002:71c1, should be an RV530/RV535 if I recall correctly)
Following up on #24: Replaced the graphics card with a Radeon 9700 Pro (R300), still no crashes with the same system setup.
I wouldn't be surprised if these issues are very card specific, since I originally did not see crashes on RV515.
I might add that I can not reproduce when using KMS and DRI2. The mesa version is the same, but the DDX is then glisse's latest and libdrm has a patch from zhasha for libdrm-radeon.
Just confirming that this bug now is in latest git master. Any ideas how I can debug this or provide useful information?
X: radeon_common.c:1008: radeon_validate_bo: Assertion `radeon->state.validated_bo_count < 32' failed.
Also the game sauerbraten has this problem, when using the aqueducts map. It crashes with:
sauer_client: radeon_common.c:1008: radeon_validate_bo: Assertion `radeon->state.validated_bo_count < 32' failed.
Aborted
As suggested in IRC I tried changing RADEON_MAX_BOS to some bigger value (in radeon_common_context.h) but I get the assertion also with 64.
I'm also seeing the assertion failure, I have to use a pre-radeon-rewrite master snapshot for compiz... Do those who can't reproduce it build the driver with --enable-debug?
> Do those who can't reproduce it build the driver with --enable-debug?
Nicolai can maybe comment on this himself, but in comment 24 he tested the same binaries as I did.
This probably only happens on the Mxx series. I experience this bug and have an M56GL, Tormod has an M26 and lowell an M10.
I could also reproduce it on my RV515.
FYI, the compiz effect on alt-tab which crashes is the "Static Application Switcher". It has a "mipmap" option (in compizconfig-settings-manager) but disabling it does not help. OTOH the "Application Switcher" does not crash. Sometimes I can switch windows successfully although with some lag. There is a "WARNING! Falling back to software for invalid buffers" message which may be correlated with this, but I am not sure.
Created attachment 26956 [details]
Torcs crashing when trying to start the race
Torcs is causing the same assertion failure, but is it the same bug?
I can reproduce this every time using DRI2 and git master of mesa. (r280 hw)
(In reply to comment #35)
> Torcs is causing same assertion failure but is it same bug?
Apparently not, compiz works for me with Dave's latest fix from master. Let's track the torcs problem or any other remaining issues in separate reports.
I can confirm that everything now works perfectly with latest git. Thanks a lot!
I am still having the "Assertion `radeon->state.validated_bo_count < 32' failed." problem with sauerbraten (which appears to be the same bug as the torcs one reported by Pauli). Bug filed at https://bugs.freedesktop.org/show_bug.cgi?id=22438 . |