Bug 7274

Summary: Radeon AccelDFS causes screen corruption
Product: xorg
Component: Driver/Radeon
Reporter: Marcin Kurek <morgoth6>
Assignee: Michel Dänzer <michel>
Status: RESOLVED FIXED
Severity: normal
Priority: high
Keywords: patch
Version: git
Hardware: All
OS: All

Attachments:
Description                      Flags
Example                          none
xorg.conf                        none
Call the DRM to wait for idle    none

Description Marcin Kurek 2006-06-19 14:57:17 UTC
When I enable AccelDFS in recent Radeon drivers (Git) I can observe slight
corruption on the screen, mainly at the top of small objects like icons and
sometimes on the desktop background.

I am using a Pegasos II machine with a HIS Radeon 9000 card.
Comment 1 Marcin Kurek 2006-06-19 14:59:33 UTC
Created attachment 5984
Example

Take a look at the top of the icon next to Ogame.pl, or near the DRM text
(Google search).
Comment 2 Marcin Kurek 2006-06-19 15:00:20 UTC
Created attachment 5985
xorg.conf
Comment 3 Michel Dänzer 2006-06-19 23:28:20 UTC
This is more likely a driver issue, and I read the xorg-team list.

Out of curiosity, what kind of x11perf -getimage numbers do you get with AccelDFS?

BTW, I recommend Option "MigrationHeuristic" "Always" with this and exa-damagetrack.
Comment 4 Marcin Kurek 2006-06-20 15:19:45 UTC
Hmm, if you say so. Anyway, I can reproduce it only when I enable AccelDFS.

As for the results, I am quite surprised, because it seems the accelerated
function is slower than the non-accelerated one here. Take a look at the results:

------- AccelDFS OFF --------

Sync time adjustment is 0.0543 msecs.

  80000 reps @   0.0735 msec ( 13600.0/sec): GetImage 10x10 square
  80000 reps @   0.0734 msec ( 13600.0/sec): GetImage 10x10 square
  80000 reps @   0.0735 msec ( 13600.0/sec): GetImage 10x10 square
  80000 reps @   0.0734 msec ( 13600.0/sec): GetImage 10x10 square
  80000 reps @   0.0734 msec ( 13600.0/sec): GetImage 10x10 square
 400000 trep @   0.0734 msec ( 13600.0/sec): GetImage 10x10 square

   8000 reps @   1.2195 msec (   820.0/sec): GetImage 100x100 square
   8000 reps @   1.2158 msec (   822.0/sec): GetImage 100x100 square
   8000 reps @   1.2161 msec (   822.0/sec): GetImage 100x100 square
   8000 reps @   1.2172 msec (   822.0/sec): GetImage 100x100 square
   8000 reps @   1.2151 msec (   823.0/sec): GetImage 100x100 square
  40000 trep @   1.2167 msec (   822.0/sec): GetImage 100x100 square

    120 reps @  44.0548 msec (    22.7/sec): GetImage 500x500 square
    120 reps @  43.9318 msec (    22.8/sec): GetImage 500x500 square
    120 reps @  44.0359 msec (    22.7/sec): GetImage 500x500 square
    120 reps @  44.0629 msec (    22.7/sec): GetImage 500x500 square
    120 reps @  43.9520 msec (    22.8/sec): GetImage 500x500 square
    600 trep @  44.0075 msec (    22.7/sec): GetImage 500x500 square

------- AccelDFS ON --------

  80000 reps @   0.0832 msec ( 12000.0/sec): GetImage 10x10 square
  80000 reps @   0.0822 msec ( 12200.0/sec): GetImage 10x10 square
  80000 reps @   0.0833 msec ( 12000.0/sec): GetImage 10x10 square
  80000 reps @   0.0822 msec ( 12200.0/sec): GetImage 10x10 square
  80000 reps @   0.0832 msec ( 12000.0/sec): GetImage 10x10 square
 400000 trep @   0.0828 msec ( 12100.0/sec): GetImage 10x10 square

   8000 reps @   0.7950 msec (  1260.0/sec): GetImage 100x100 square
   8000 reps @   0.7954 msec (  1260.0/sec): GetImage 100x100 square
   8000 reps @   0.7950 msec (  1260.0/sec): GetImage 100x100 square
   8000 reps @   0.7953 msec (  1260.0/sec): GetImage 100x100 square
   8000 reps @   0.7952 msec (  1260.0/sec): GetImage 100x100 square
  40000 trep @   0.7952 msec (  1260.0/sec): GetImage 100x100 square

    200 reps @  41.6941 msec (    24.0/sec): GetImage 500x500 square
    200 reps @  45.9822 msec (    21.7/sec): GetImage 500x500 square
    200 reps @  45.0916 msec (    22.2/sec): GetImage 500x500 square
    200 reps @  44.2984 msec (    22.6/sec): GetImage 500x500 square
    200 reps @  46.4023 msec (    21.6/sec): GetImage 500x500 square
   1000 trep @  44.6937 msec (    22.4/sec): GetImage 500x500 square
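
To put the two runs side by side, here is a minimal sketch that compares the "trep" totals above. The per-operation times are copied from those lines; the dictionary layout and the function name `on_off_ratio` are only illustrative choices.

```python
# Per-operation times in msec, copied from the "trep" total lines above.
TREP_MSEC = {
    "10x10":   {"off": 0.0734,  "on": 0.0828},
    "100x100": {"off": 1.2167,  "on": 0.7952},
    "500x500": {"off": 44.0075, "on": 44.6937},
}


def on_off_ratio(times):
    """Return time(AccelDFS on) / time(AccelDFS off); below 1.0 means faster."""
    return {size: t["on"] / t["off"] for size, t in times.items()}


if __name__ == "__main__":
    for size, ratio in on_off_ratio(TREP_MSEC).items():
        change = (ratio - 1.0) * 100.0
        print(f"GetImage {size}: on/off = {ratio:.3f} ({change:+.1f}% time per op)")
```

With these figures, AccelDFS on takes about 13% more time per operation in the 10x10 case, about 35% less in the 100x100 case, and is roughly unchanged (under 2% more) in the 500x500 case.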

What do you think? I must say that when I first heard about EXA I was a bit
excited, but it's still hard to use. An EXA-enabled X server feels faster when
there is nothing on the screen, but opening kterm + firefox with a
graphics-heavy page is enough to make the system almost freeze for a couple of
seconds.

XAA works much better here. I think this may be CPU related, because with EXA
the Xorg server sometimes consumes CPU cycles like hell (80% - 100%).

What setup is recommended for using EXA?
Comment 5 Marcin Kurek 2006-06-20 15:22:03 UTC
Not 'is slower' but 'are slower', and of course I mean only the 10x10 case.
Also, could my corruption problems be related to that? I can observe them only
on small objects like icons, etc.
Comment 6 Michel Dänzer 2006-06-21 02:52:09 UTC
(In reply to comment #4)
> 
> EXA enabled xserver feels faster when there is nothing on the screen, but
> there is enough to open kterm + firefox with page with many graphics to make
> system almost freeze for couple of seconds.

Please try and profile this. I noticed that KDE seems much better at triggering
pathological behaviour with exa-damagetrack than GNOME (which I use normally),
but it's quite snappy for me as well in quick testing.

> XAA works much better here. 

With compositing? ;) Seriously though, this is all about choice, pick whichever
suits you best.

> I think this can be CPU related because with EXA xorg server can consume CPU
> cycles like hell (80% - 100%) sometimes.

I have the same CPU (and generally pretty similar hardware) and am not seeing
this. It could be the pathological behaviour mentioned above. Also, are you
still running conky or something like that?

> What setup is recommended to use EXA ?

I'm currently using AccelDFS, exa-damagetrack and MigrationHeuristic "Always".


(In reply to comment #5)
> Not 'is slower' but 'are slower', and of course I mean only the 10x10 case.

That's not too surprising, as accelerated DFS (just like any accelerated op)
involves some overhead. The less data there is to transfer, the bigger the
relative overhead, as the additional cost is more or less constant. I'm getting
even slightly lower
numbers, still I feel it's a net win for me. If you feel differently, just
disable it. It'll have to be disabled by default anyway unless we can solve the
corruption issues. BTW, the patch from bug 6772 seems to have an impact on the
corruption for me, can you confirm that?
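
The constant-overhead argument above can be illustrated with a toy cost model. This is only a sketch: the setup cost, the transfer rate and the 32 bits per pixel are hypothetical values chosen for illustration, not measurements from this bug, and `relative_overhead` is an invented name.

```python
def relative_overhead(payload_bytes, setup_usec=20.0, bytes_per_usec=100.0):
    """Fraction of the total transfer time spent on the fixed setup cost.

    setup_usec and bytes_per_usec are hypothetical values for illustration.
    """
    transfer_usec = payload_bytes / bytes_per_usec
    return setup_usec / (setup_usec + transfer_usec)


if __name__ == "__main__":
    for side in (10, 100, 500):
        payload = side * side * 4  # assuming 32 bits per pixel
        share = relative_overhead(payload)
        print(f"{side}x{side}: {share:.1%} of the time is fixed setup cost")
```

Whatever constants are plugged in, the fixed share shrinks as the payload grows, which is why the smallest GetImage case is the one most likely to come out slower with acceleration.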

I was hoping you'd get higher numbers for -getimage500 though, as in contrast to
AGP, PCI is cache coherent, just like PCIe, and a PCIe X550 in an AMD64 machine
yields several hundred per second (but less than 10/sec with AccelDFS off!).

> Also, could my corruption problems be related to that? I can observe them only
> on small objects like icons, etc.

Probably the smaller the transfer, the more likely the failure condition behind
the corruption gets triggered.
Comment 7 Marcin Kurek 2006-06-21 05:44:37 UTC
> Please try and profile this. I noticed that KDE seems much better at triggering
> pathological behaviour with exa-damagetrack than GNOME (which I use normally),
> but it's quite snappy for me as well in quick testing.

I can try again. Previously performance was eaten by fbBlt calls, but after a
system refresh I can't see fbBlt in the oprofile log. We will see ...

> With compositing? ;) Seriously though, this is all about choice, pick whichever
> suits you best.

Without, of course :) I don't need eye candy, but I still like EXA, for example
because of Wesnoth. I like to play from time to time, and with EXA scrolling the
map in this game is *MUCH* faster and smoother.

> BTW, the patch from bug 6772 seems to have an impact on the corruption for me,
> can you confirm that?

I have been using this one for quite a long time now without any problems. I
have now reverted it, but the corruption is still there with AccelDFS enabled.

> I was hoping you'd get higher numbers for -getimage500 though, as in contrast to
> AGP, PCI is cache coherent, just like PCIe, and a PCIe X550 in an AMD64 machine
> yields several 100/sec (but less than 10/sec with AccelDFS off!).

As far as I know, GFX card DMA transfers are really slow on the Pegasos machine;
maybe this is the reason. Or there is another problem here.
Comment 8 Marcin Kurek 2006-06-21 05:48:55 UTC
Ahhh, I forgot. I am now working with EXA enabled and it seems AccelDFS feels
faster, but the benchmarks don't show that.
Comment 9 Michel Dänzer 2006-06-21 06:22:43 UTC
(In reply to comment #7)
> 
> Previously performance was eaten by fbBlt calls, but after system refresh I
> can't see fbBlt in oprofile log. We will see ...

Thanks. May I ask again whether you're still running a 'performance eater' such
as conky?

> Without, of course :) I don't need eye candy, [...]

The thing is, EXA is kind of geared towards compositing. Have you tried enabling
KDE's compositing manager recently?

> but I still like EXA for example because of Wesnoth. I like to play from time
> to time and on EXA scrolling the map in this game is *MUCH* faster and
> smoother.

See, it's not all bad. ;)


> > BTW, the patch from bug 6772 seems to have an impact on the corruption for me,
> > can you confirm that?
> 
> I have been using this one for quite a long time now without any problems. I
> have now reverted it, but the corruption is still there with AccelDFS enabled.

Then it was probably one of my radeon driver hacks. :) I'll keep playing with
those to see which one made a difference, maybe that'll give me ideas what the
problem could be.


(In reply to comment #8)
> Ahhh, I forgot. I am now working with EXA enabled and it seems AccelDFS feels
> faster, but the benchmarks don't show that.

Glad to hear it. There's a saying that goes something like "There's lies, damned
lies, and benchmarks". :)
Comment 10 Marcin Kurek 2006-06-21 09:26:08 UTC
> Thanks. May I ask again whether you're still running a 'performance eater' such
> as conky?

I have stopped using conky and gone back to root-tail to display system logs on
the desktop. Anyway, 'performance eater' sounds really funny when we are talking
about a simple application that displays some text on the desktop. And remember,
the problems occur only when conky uses double-buffered output; the
single-buffered blinking mode doesn't show any problems with XAA or EXA.

> The thing is, EXA is kind of geared towards compositing. Have you tried enabling
> KDE's compositing manager recently?

I tried xcompmgr -a some time ago and it seemed to make things much faster, but
it also made things MUCH worse when a lot of windows were open. I tried again
now and it is definitely better than before, but ...

> See, it's not all bad. ;)

I know it can be worse than now :) Then I guess it's not.

> Then it was probably one of my radeon driver hacks. :) I'll keep playing with
> those to see which one made a difference, maybe that'll give me ideas what the
> problem could be.

Hope to hear something about that soon.

> Glad to hear it. There's a saying that goes something like "There's lies,
> damned lies, and benchmarks". :)

HeHe, I definitely need to remember that ;)
Comment 11 Marcin Kurek 2006-06-21 13:01:18 UTC
Hmmm, maybe this is also a good place to ask about two things.

The first is backing store. Should it be enabled or disabled? I have read about
this in many places; sometimes it is said to increase performance and sometimes
to decrease it. Which is true?

The second is ColorTiling on big-endian machines. I read that it incurs some
overhead on BE and can slow things down. Is that true?

I think it would be a good idea to add information about this to the radeon man
page, for example. What do you think?
Comment 12 Michel Dänzer 2006-06-23 01:46:31 UTC
Created attachment 6027
Call the DRM to wait for idle

Please try this patch, it seems to fix this here. No time to answer your
questions right now, I'll try to get back to them when I find the time.
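
The patch itself is not reproduced in this report. As a toy model only, and assuming the corruption comes from the CPU reading a destination buffer before an asynchronous copy into it has finished, the following sketch shows why waiting for idle before reading matters. `ToyEngine`, `queue_copy`, `wait_for_idle` and `download` are hypothetical names for illustration, not driver or DRM functions.

```python
class ToyEngine:
    """Hypothetical stand-in for an engine that executes copies asynchronously."""

    def __init__(self):
        self.pending = []

    def queue_copy(self, src, dst):
        # The copy is only queued here; nothing has been written to dst yet.
        self.pending.append((src, dst))

    def wait_for_idle(self):
        # Carry out every queued copy before returning to the caller.
        for src, dst in self.pending:
            dst[:] = src
        self.pending.clear()


def download(engine, framebuffer, wait):
    """Copy framebuffer into a scratch buffer and read the scratch buffer back."""
    scratch = bytearray(len(framebuffer))  # starts out with stale (zero) contents
    engine.queue_copy(framebuffer, scratch)
    if wait:
        engine.wait_for_idle()
    return bytes(scratch)


if __name__ == "__main__":
    pixels = b"\x01\x02\x03"
    print("without wait:", download(ToyEngine(), pixels, wait=False))
    print("with wait:   ", download(ToyEngine(), pixels, wait=True))
```

In this model the read without the wait returns the stale scratch contents, while the read after `wait_for_idle` returns the copied pixels.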
Comment 13 Marcin Kurek 2006-06-23 07:59:04 UTC
Yes, no more corruption after applying this one.
Comment 14 Michel Dänzer 2006-06-24 07:06:15 UTC
Fixed in git. Thanks for testing.

(In reply to comment #11)
> Hmmm, also maybe this is a good place to ask about two things.

Not really, a mailing list would have been better.

> First is backingstore. Should be enabled or disabled ?

Disabled. The current implementation is known to perform badly and be buggy.
Compositing is currently the better solution for most if not all intents and
purposes of backing store, in fact if the current backing store issues are ever
to be addressed, it's most likely going to be by re-writing it to use Composite.

> Second about ColorTiling on big-endian machines, I read it required some
> overhead on BE and can slow things down is that true ?

It incurs overhead when the X server needs to access an offscreen pixmap with
the CPU, i.e. during a software fallback. This overhead can be avoided with
Option "MigrationHeuristic" "Always".

OTOH, ColorTiling is usually a clear win for 3D rendering.

> I think it would be a good idea to add information about this to the radeon
> man page, for example. What do you think?

Backing store is driver independent, and the ColorTiling overhead could be
mostly eliminated by improving the driver, so I'm not sure that's a good solution.
