| Summary: | OpenCL/Clover: AMD Turks: corrupt output buffer (depending on dimension order?) | | |
|---|---|---|---|
| Product: | Mesa | Reporter: | Dave Gilbert <freedesktop> |
| Component: | Drivers/Gallium/r600 | Assignee: | mesa-dev |
| Status: | RESOLVED MOVED | QA Contact: | mesa-dev |
| Severity: | normal | | |
| Priority: | medium | CC: | vedran |
| Version: | 17.2 | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| i915 platform: | | i915 features: | |
| Bug Depends on: | | | |
| Bug Blocks: | 99553 | | |
| Attachments: | foo.ll from debug run, foo.link-0.asm, foo.link-0.ll, annotated asm dump | | |
Description
Dave Gilbert
2017-11-06 00:20:53 UTC

Comment 1 (Jan Vesely)
Can you run using CLOVER_DEBUG=llvm,native CLOVER_DEBUG_FILE=foo and attach both the llvm and isa dumps?

Comment 2 (Dave Gilbert)
Created attachment 135311 [details]
foo.ll from debug run

Comment 3 (Dave Gilbert)
Created attachment 135312 [details]
foo.link-0.asm

Comment 4 (Dave Gilbert)
Created attachment 135313 [details]
foo.link-0.ll

That's all 3 of the debug files it produced.
(I wasn't sure which were the llvm and which the isa dumps; I guess the asm is the isa, and the .ll files are both llvm dumps?)
Comment 5 (Jan Vesely)
(In reply to Dave Gilbert from comment #4)
Yes, the first .ll is from the compilation step, the other one is from the linking step.

The .ll dump looks correct. The .asm also looks correct.

You can try producing multiple asm dumps for working and non-working runs, but I don't think that the llvm is the culprit here.

Can you try waiting for the kernel execution to complete explicitly before mapping the buffer? Ideally, call clFinish() on line 63.

Comment 6 (Dave Gilbert)
(In reply to Jan Vesely from comment #5)
Since I'm on the C++ binding (probably a mistake) I used:

    queue.finish();

and it seems to be working.

(This possibly also corresponds to what I'm seeing with a more complex kernel, where a whole pile of data in the last few Z slices is bogus, suggesting it hasn't finished.)

Dave

Comment 7 (Dave Gilbert)
Created attachment 135318 [details]
annotated asm dump
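Editor's note: for readers reproducing the workaround from comment 6, here is a minimal sketch of the enqueue/finish/map ordering, assuming the OpenCL C++ bindings (cl.hpp). The kernel, buffer and size names are placeholders; the reporter's actual ocl.cpp is not attached to this bug. The only point being illustrated is the position of queue.finish() before enqueueMapBuffer().

```cpp
// Sketch only: hypothetical names; the relevant detail is draining the queue
// before mapping the output buffer for reading.
#include <CL/cl.hpp>

cl_uint *run_and_map(cl::CommandQueue &queue, cl::Kernel &kernel,
                     cl::Buffer &output, size_t n_elems) {
    cl::Event event;
    queue.enqueueNDRangeKernel(kernel, cl::NullRange,
                               cl::NDRange(n_elems), cl::NullRange,
                               NULL, &event);

    queue.finish();  // the workaround: wait for the kernel before mapping

    return static_cast<cl_uint *>(
        queue.enqueueMapBuffer(output, CL_TRUE /* blocking */, CL_MAP_READ,
                               0 /* offset */, n_elems * sizeof(cl_uint)));
}
```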
Comment 8 (Jan Vesely)
(In reply to Dave Gilbert from comment #6)
Thanks for testing. I see you are using mesa 17.2.

There were a few changes to blocking-call synchronization that went into mesa 17.3:

02f8ac6b70033a1b240d497c4664c359d2398cc3 (clover: Wrap event::wait_count in a method taking care of the required locking.)
bc4000ee40c78efe1e5e8a6244d4bb55389d8418 (clover: Run the associated action before an event is signalled.)
3a5b69c09ba355c616c274b0c7f5aba3bd21fd54 (clover: Wait for requested operation if blocking flag is set)

which might help address the issue. Can you test mesa 17.3?
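Editor's note: as an aside, the check Dave does with clinfo in comment 10 can also be done from inside the application by querying the platform version string through the C++ bindings. A minimal sketch, assuming the standard cl.hpp header; nothing here is specific to this bug.

```cpp
// Print each platform's name and version string, e.g.
// "Clover: OpenCL 1.1 Mesa 17.4.0-devel (git-...)", to confirm which
// Mesa build the application is actually loading at runtime.
#include <CL/cl.hpp>
#include <iostream>
#include <vector>

int main() {
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);
    for (const auto &p : platforms)
        std::cout << p.getInfo<CL_PLATFORM_NAME>() << ": "
                  << p.getInfo<CL_PLATFORM_VERSION>() << std::endl;
    return 0;
}
```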
Comment 9 (Dave Gilbert)
Yeh, I'll figure out how to get 17.3 built on this box.

Comment 10 (Dave Gilbert)
I believe I'm still seeing this:

    dg@hath:~/ocl2$ clinfo
    Number of platforms        1
      Platform Name            Clover
      Platform Vendor          Mesa
      Platform Version         OpenCL 1.1 Mesa 17.4.0-devel (git-a16dc04ad5)
    ....
    dg@hath:~/ocl2$ echo $LD_LIBRARY_PATH
    /home/dg/mesa/try/lib:

so I *think* it's using my build, and I believe I'm still seeing it. Is my test valid or do I really need that finish?

Comment 11 (Jan Vesely)
(In reply to Dave Gilbert from comment #10)
Yes, that looks OK.

It should be OK. Can you replace the clFinish with clWaitForEvents (or the respective C++ method) to wait for kernel execution? It looks to me like clover creates the new map without waiting for all the dependency events.

Comment 12 (Dave Gilbert)
(In reply to Jan Vesely from comment #11)
It doesn't seem to help. If I add:

    --- a/ocl.cpp
    +++ b/ocl.cpp
    @@ -74,6 +74,7 @@ static int got_dev(cl::Platform &plat, std::vector<cl::Device> &devices, cl::Dev
         cl::Event eventBarrier2;
         queue.enqueueBarrierWithWaitList(NULL,&eventBarrier2);
         std::cerr << __func__ << "enqueueMapBuffer gave: " << err << std::endl;
    +    event.wait();
         eventMap.wait();
         eventBarrier2.wait();

it still doesn't seem to help, and I think event is the event triggered by the kernel.

Comment 13 (Vedran)
(In reply to Dave Gilbert from comment #6)
This reminds me of a certain issue I experienced with OpenMM.

Is it limited to Turks, or does it happen on SI+ cards too?
Comment 14 (Jan Vesely)
(In reply to Dave Gilbert from comment #12)
Can you move it a few lines up (before the call to mapBuffer)?

Comment 15 (Dave Gilbert)
Hi Jan,
Yes, doing:

    --- a/ocl.cpp
    +++ b/ocl.cpp
    @@ -65,6 +65,7 @@ static int got_dev(cl::Platform &plat, std::vector<cl::Device> &devices, cl::Dev
         events.push_back(event);
         cl::Event eventMap;
         queue.enqueueBarrierWithWaitList(&events);
    +    event.wait();
         mapped = (cl_uint*)queue.enqueueMapBuffer(output, CL_TRUE /* blocking */, CL_MAP_READ,
                                                   0 /* offset */,
                                                   SIZE * SIZE * SIZE * sizeof(cl_uint) /* size */,

does seem to work.

Vedran: I've only got a Turks to play with; feel free to try my test on something else.

Comment 16 (Jan Vesely)
(In reply to Dave Gilbert from comment #15)
Thanks, that means the kernel work event works correctly. I'll need to double-check the specs w.r.t. synchronization points; we either miss a wait, or fail to update mapped buffers after the kernel finishes execution.
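Editor's note: to sum up where the thread stands, an explicit wait on the kernel's own event, placed before enqueueMapBuffer(), produces correct output, which points at Clover's mapping path rather than at the generated kernel code. Below is a condensed sketch of that working ordering, with the same placeholder caveats as above; the surrounding ocl.cpp variables and the 3D work size are assumed, not reproduced from the actual test case.

```cpp
// Condensed sketch of the working ordering from comment 15; names and the
// 3D work size (matching the SIZE*SIZE*SIZE cl_uint output buffer) are assumed.
#include <CL/cl.hpp>
#include <vector>

static const size_t SIZE = 64;  // placeholder dimension

cl_uint *run_and_map_3d(cl::CommandQueue &queue, cl::Kernel &kernel,
                        cl::Buffer &output) {
    cl::Event event;
    queue.enqueueNDRangeKernel(kernel, cl::NullRange,
                               cl::NDRange(SIZE, SIZE, SIZE), cl::NullRange,
                               NULL, &event);

    std::vector<cl::Event> events;
    events.push_back(event);
    queue.enqueueBarrierWithWaitList(&events);

    event.wait();  // explicit wait on the kernel event; with it the mapped data is correct

    return static_cast<cl_uint *>(
        queue.enqueueMapBuffer(output, CL_TRUE /* blocking */, CL_MAP_READ,
                               0 /* offset */,
                               SIZE * SIZE * SIZE * sizeof(cl_uint) /* size */));
}
```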
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/615.