Bug 55723

Summary: Fallback problem in exaGlyphs
Product: xorg Reporter: Simon Hall <simonjhall>
Component: Server/Acceleration/EXAAssignee: Xorg Project Team <xorg-team>
Status: RESOLVED DUPLICATE QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: simonjhall
Version: 7.7 (2012.06)   
Hardware: ARM   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
Text corruption on the desktop and web browser
none
Menu corruption
none
Stack trace to show where the accelerated add is being called from
none
Track pixmaps more explicitly none

Description Simon Hall 2012-10-07 11:33:27 UTC
Created attachment 68199 [details]
Text corruption on the desktop and web browser

Hello,

I appear to have a similar (or the same) problem as bug 47266. As I can actually see what's going wrong in gdb I figured I'd best stick it as a separate tracker.

I am writing an driver for the Raspberry Pi computer, and am doing my acceleration via the EXA framework. I'm nearly finished, just ironing out some kinks.

For composite, I have provided the ability to accelerate most 24/32-bit RGBA in, over and add operations. I also support 8-bit alpha operations too. No support for 15- or 16-bit. At the moment I have also not provided the ability to accept solid sources and so just reject them (ie where PrepareComposite is called with either the source or destination PixmapPtr being null).

Anyway whilst running the server in 565 colour mode I see corruption just like in 47266. Examples attached.
This corruption occurs whenever a line of text happens to request an add operation with a A8 source, no mask and an A8 destination. The rest of the image is composed with pixel formats I don't support. This add operation is the call to exaGlyphsToMask at line 822 ish.

I accelerate this call and do the work *synchronously* within DoneComposite (stack trace attached). The finished mask lives in offscreen memory.

The problem then comes with the subsequent call at the end of the function to CompositePicture->damageComposite->exaComposite. The operation that it's trying to do, I don't support (wrong pixel format). I accept it CheckComposite, (called via exa_render.c:1006, exaTryDriverComposite), the mask and destination get migrated into offscreen memory (where they both already were...so no work is actually done) and then PrepareComposite gets called, where I reject it based on the pixel format.

exaComposite then tries again with exaTryMagicTwoPassCompositeHelper which of course fails too. The fallback then kicks in and ExaCheckComposite is called. This calls ExaPrepareCompositeReg that falls out with a FALSE via the if (!ret) route. Nothing is migrated back to system memory. ps->Composite then calls fbComposite with the *system memory* version of the mask which contains just junk.

I can demonstrate there's no migration back to system memory by going back to the original call to the exaGlyphsToMask function in exaGlyphs.

If I change it to,
    if (buffer.count) {
        if (maskFormat)
        {
            exaGlyphsToMask(pMask, &buffer);
            PixmapPtr p = exaGetDrawablePixmap(pMask->pDrawable);
            if (p)
            	exaMoveOutPixmap(p);
        }
        else
            exaGlyphsToDst(pSrc, pDst, &buffer);
    }
everything looks perfect. Of course it's not the correct fix, though ;)

Any help?

NB: it's Xorg 7.7+1 and xorg-server 1.12.3.902 with whatever patches Debian Wheezy adds.

Cheers
Simon
Comment 1 Simon Hall 2012-10-07 11:34:10 UTC
Created attachment 68200 [details]
Menu corruption
Comment 2 Simon Hall 2012-10-07 11:34:50 UTC
Created attachment 68201 [details]
Stack trace to show where the accelerated add is being called from
Comment 3 Michel Dänzer 2012-10-08 15:13:10 UTC
(In reply to comment #3)
> [...] ExaCheckComposite is called. This calls ExaPrepareCompositeReg that falls
> out with a FALSE via the if (!ret) route.

Sure about that? It would mean the composite region is empty according to miComputeCompositeRegion(), and the actual compositing to the destination drawable is skipped. I'd expect just nothing to be visible in that case instead of garbage.

> Nothing is migrated back to system memory. ps->Composite then calls fbComposite
> with the *system memory* version of the mask which contains just junk.

From the rest of your description, it sounds like maybe the damage region for the intermediate mask is miscalculated / -translated somehow between exaCompositeRects and ExaPrepareCompositeReg.
Comment 4 Simon Hall 2012-10-08 20:55:32 UTC
> Sure about that? It would mean the composite region is empty according to
> miComputeCompositeRegion(), and the actual compositing to the destination
> drawable is skipped. I'd expect just nothing to be visible in that case
> instead of garbage.

Oops, my bad. You're right it returns true from miComputeCompositeRegion (pRegion= {extents = {x1 = 86, y1 = 311, x2 = 160, y2 = 325}, data = 0x0}).

Execution then continues to exaCopyDirty. Is the damage wrong?

(gdb) print *pExaPixmap
$8 = {area = 0x2a3521a0, score = 1001, use_gpu_copy = 1,
  sys_ptr = 0x4296b080 "<snip>", <incomplete sequence \367>..., sys_pitch = 76,
  fb_ptr = 0x41af71e8 "", fb_pitch = 74, fb_size = 1036, accel_blocked = 0, pDamage = 0x2a351f28, validSys = {extents = {
      x1 = 0, y1 = 0, x2 = 0, y2 = 0}, data = 0x2a1b38b0}, validFB = {extents = {x1 = 0, y1 = 0, x2 = 74, y2 = 14},
    data = 0x0}, driverPriv = 0x0}

(gdb) print *pExaPixmap->pDamage
$9 = {pNext = 0x0, pNextWin = 0x0, damage = {extents = {x1 = 0, y1 = 0, x2 = 0, y2 = 0}, data = 0x2a1b38b0},
  damageLevel = DamageReportNone, isInternal = 1, closure = 0x4296b000, isWindow = 0, pDrawable = 0x4296b000,
  damageReport = 0, damageReportPostRendering = 0, damageDestroy = 0, damageMarker = 0, reportAfter = 1, pendingDamage = {
    extents = {x1 = 0, y1 = 0, x2 = 0, y2 = 0}, data = 0x2a1b38b0}, backupDamage = {extents = {x1 = 0, y1 = 0, x2 = 0,
      y2 = 0}, data = 0x0}, pScreen = 0x2a1db550, devPrivates = 0x0}


(gdb) print *pExaPixmap->area
$10 = {base_offset = 18813416, offset = 18813416, size = 1036, last_use = 1596, privData = 0x4296b000,
  save = 0x4064052c <exaPixmapSave>, state = ExaOffscreenRemovable, next = 0x2a341d90, eviction_cost = 0,
  prev = 0x2a33d908, align = 1}


(gdb) print *migrate
$12 = {as_dst = 0, as_src = 1, pPix = 0x4296b000, pReg = 0x2a1dc5f0}
(gdb) print *migrate->pReg
$13 = {extents = {x1 = 0, y1 = 0, x2 = 0, y2 = 0}, data = 0x2a1b38b0}
(gdb) print *migrate->pPix
$14 = {drawable = {type = 1 '\001', class = 0 '\000', depth = 8 '\b', bitsPerPixel = 8 '\b', id = 0, x = 0, y = 0,
    width = 74, height = 14, pScreen = 0x2a1db550, serialNumber = 3013}, devPrivates = 0x4296b030, refcnt = 2,
  devKind = 74, devPrivate = {ptr = 0x0, val = 0, uval = 0, fptr = 0}, screen_x = 0, screen_y = 0, usage_hint = 1}

nbox is then computed as zero as CopyReg is full of zeroes.
Why then does my workaround work then? That calls the exaDoMoveOutPixmap...?
Comment 5 Simon Hall 2012-10-08 21:24:26 UTC
If I dump out the pixmap info directly after the call to exaGlyphsToMask returns I get,
(gdb) print *pExaPixmap
$6 = {area = 0x2a352148, score = 1001, use_gpu_copy = 1, sys_ptr = 0x42beb080 "", sys_pitch = 76, fb_ptr = 0x41b14104 "",
  fb_pitch = 74, fb_size = 1036, accel_blocked = 0, pDamage = 0x2a351de8, validSys = {extents = {x1 = 0, y1 = 0, x2 = 0,
      y2 = 0}, data = 0x2a1b38b0}, validFB = {extents = {x1 = 0, y1 = 0, x2 = 74, y2 = 14}, data = 0x0}, driverPriv = 0x0}

and here's the damage!

$7 = {pNext = 0x0, pNextWin = 0x0, damage = {extents = {x1 = 0, y1 = 0, x2 = 74, y2 = 14}, data = 0x0},
  damageLevel = DamageReportNone, isInternal = 1, closure = 0x42beb000, isWindow = 0, pDrawable = 0x42beb000,
  damageReport = 0, damageReportPostRendering = 0, damageDestroy = 0, damageMarker = 0, reportAfter = 1, pendingDamage = {
    extents = {x1 = 0, y1 = 0, x2 = 0, y2 = 0}, data = 0x2a1b38b0}, backupDamage = {extents = {x1 = 0, y1 = 0, x2 = 0,
      y2 = 0}, data = 0x0}, pScreen = 0x2a1db550, devPrivates = 0x0}
Comment 6 Simon Hall 2012-10-08 21:48:22 UTC
Final post for today - yes, the damage looks valid at that point. This now makes a bit more sense in the context of my original report,
- exaGlyphsToMask renders the mask to offscreen memory, sets the damage
- exaComposite is called
-- my CheckComposite is called, and passes
-- exaDoMigration is called (exa.c:1145), which calls exaDoMoveInPixmap, which calls exaCopyDirtyToFb (and calls exaCopyDirty)
-- in exaCopyDirty the use_gpu_copy is set, the damage looks valid, as does validFB and validSys is zero. migrate->pReg is null.
-- the damage is then cleared after doing the RegionUnion+RegionSubract.
-- CopyReg contains nothing, and nbox is zero - no copying is done
-- my PrepareComposite is called, and fails (as I don't support the pixel format)
- exaComposite then calls the fallback, ExaCheckComposite
-- the fallback calls prepare on pMask, to migrate back to system memory
-- pMask has zero damage, nothing is migrated
-- the fallback then calls pixman with a garbage mask.
Comment 7 Michel Dänzer 2012-10-09 14:08:47 UTC
(In reply to comment #6)
> -- exaDoMigration is called (exa.c:1145), which calls exaDoMoveInPixmap,
> which calls exaCopyDirtyToFb (and calls exaCopyDirty)
> -- in exaCopyDirty the use_gpu_copy is set, the damage looks valid, as does
> validFB and validSys is zero. migrate->pReg is null.
> -- the damage is then cleared after doing the RegionUnion+RegionSubract.
> -- CopyReg contains nothing, and nbox is zero - no copying is done
> -- my PrepareComposite is called, and fails (as I don't support the pixel
> format)
> - exaComposite then calls the fallback, ExaCheckComposite
> -- the fallback calls prepare on pMask, to migrate back to system memory
> -- pMask has zero damage, nothing is migrated

No damage alone can't explain why nothing is migrated: CopyReg is the destination valid region subtracted from the source valid region. After the above scenario, the destination valid region (validSys) should be empty, so CopyReg should equal validFB, which should cover the whole pixmap.

Maybe maskReg is an empty region (or one that doesn't overlap the pixmap extents) at ExaPrepareCompositeReg:564? That would clear CopyReg at exaCopyDirty:201.
Comment 8 Simon Hall 2012-10-15 20:04:31 UTC
Hello Michel, cheers again for the help.
(sorry for the delay)

You're right - maskReg is empty (0, 0, 0, 0). However I can't see how it can't not be empty! Since I'm sure I've got it wrong...please correct me and I'll debug further.

in ExaPrepareCompositeReg, maskReg is set using this:
if (pMask && pMask->pDrawable) {
        pMaskPix = exaGetDrawablePixmap(pMask->pDrawable);
        RegionNull(&pExaScr->maskReg);
        maskReg = &pExaScr->maskReg;
        if (pMask != pDst && pMask != pSrc)
            RegionTranslate(pMask->pCompositeClip,
                            -pMask->pDrawable->x, -pMask->pDrawable->y);
    }

The region gets cleared with RegionNull, and then maskReg points to that. RegionTranslate does not touch maskReg (right?).

AFAIK the next time maskReg is used is in ExaPrepareCompositeReg a little further down (our line numbers appear to differ).

if (pMaskPix)
        pExaScr->prepare_access_reg(pMaskPix, EXA_PREPARE_MASK, maskReg);

maskReg is still (0, 0, 0, 0) at this point.
When this then makes it through to exaCopyDirty, 

if (migrate->pReg)
            RegionIntersect(&CopyReg, &CopyReg, migrate->pReg);

effectively clears CopyReg. Before that call it was  CopyReg = {extents = {x1 = 0, y1 = 0, x2 = 74, y2 = 14}, data = 0x0}.

Where should the original maskReg/pMask->maskReg be set to something 'interesting'?
Comment 9 Michel Dänzer 2012-10-16 10:42:34 UTC
(In reply to comment #8)
> Where should the original maskReg/pMask->maskReg be set to something
> 'interesting'?

In ExaSrcValidate(), called from miComputeCompositeRegion().
Comment 10 Simon Hall 2012-10-16 19:17:17 UTC
Ok, in ExaSrcValidate it the RegionRec which gets set is pExaSrc->srcReg, not maskReg. This is because,

dst = (pExaScr->srcPix == pPix) ? &pExaScr->srcReg : &pExaScr->maskReg;

the == is true. Really it should evaluate to false, as ->maskReg is the thing that gets set in ExaPrepareCompositeReg!

If you scroll down a bit to ExaPrepareCompositeReg, srcPix is set inside the if(pSrc->pDrawable) block. However, since in this case the drawable is null as the source is a solid. The block is not entered and the if (pMask && pMask->pDrawable) is correctly entered.

As this appears to be the only place srcPix is set, would this mean a previous composite operation (where the same pixmap was actually the source) has set the value and left it as-is? Unfortunately the previous operation's pExaScr->srcReg value does not appear to be correct (only pExaScr->srcPix is correct).
Comment 11 Michel Dänzer 2012-10-17 15:46:47 UTC
Created attachment 68718 [details] [review]
Track pixmaps more explicitly

Does this patch help?
Comment 12 Simon Hall 2012-10-18 19:51:11 UTC
Looking good! Thanks Michel.
Out of interest, what happens - either with the original code or the new code - if both pSrc->pDrawable and pMask->pDrawable are valid? ExaSrcValidate can only choose one of them?
Comment 13 Michel Dänzer 2012-10-19 11:37:28 UTC
(In reply to comment #12)
> Out of interest, what happens - either with the original code or the new
> code - if both pSrc->pDrawable and pMask->pDrawable are valid?
> ExaSrcValidate can only choose one of them?

No, it's called for both of them, and both versions of the code should handle this correctly.
Comment 14 Simon Hall 2012-10-19 21:28:01 UTC
Cool, well I've had literally no corruption since I've been using this patch. Also a corruption elsewhere I've been seeing for six months has gone away!

Happy to close?
Comment 15 Michel Dänzer 2012-10-22 17:09:28 UTC
(In reply to comment #14)
I'm glad to hear the patch fixes your corruption.

Once the fix lands in Git, this bug can be resolved as a duplicate of bug 47266.

Thanks a lot for your help getting to the bottom of this!
Comment 16 Michel Dänzer 2012-10-29 16:14:53 UTC

*** This bug has been marked as a duplicate of bug 47266 ***

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.