Bug 94273

Summary: Clover on RadeonSI OpenCL segfault during testing of clBLAS
Product: Mesa
Reporter: joshua.r.marshall.1991
Component: Gallium/StateTracker/Clover
Assignee: mesa-dev
Status: RESOLVED MOVED
QA Contact: mesa-dev
Severity: normal
Priority: medium
CC: pbrobinson
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 99553, 99765, 100105

Description joshua.r.marshall.1991 2016-02-24 01:05:09 UTC
When running the tests included with clBLAS, there is a segfault originating in libMesaOpenCL.so.1, called from the clBLAS test suite. I cannot localize the bug to Mesa or clBLAS.

The clBLAS bug report is here: https://github.com/clMathLibraries/clBLAS/issues/229

Printed output when run in gdb: https://gist.github.com/anadon/82d7cdffb1275d71708f

Stack trace: https://gist.github.com/anadon/586f54b5f62a22e339af

Core dump at 1131MB: http://cs.mtu.edu/~jrmarsha/core.27756
Comment 1 joshua.r.marshall.1991 2016-02-24 07:10:27 UTC
Part of the issue, which EdB from IRC found, is the use of a function called "cl_amd_print", which appears to be built into the AMD implementation of OpenCL but does not appear in any revision of the specification. While this shouldn't necessarily build, I think it is worth asking whether it can be made more apparent that this is the issue. I'm not experienced enough to know whether this discussion has already been had, whether I missed something, or whether this should have been handled differently in Mesa.
Comment 2 joshua.r.marshall.1991 2016-02-24 22:51:20 UTC
Improved stacktrace with debug information: https://gist.github.com/anadon/4bc558761e1192e26d0d
Comment 3 joshua.r.marshall.1991 2016-02-24 23:12:39 UTC
https://gist.github.com/anadon/c1ea234ade1e9d076970
Comment 4 Serge Martin 2016-02-27 18:49:08 UTC
What version of LLVM do you use? These sources compile fine here when Clover is built with LLVM 3.9svn.
Comment 5 Matt Arsenault 2016-02-27 19:43:55 UTC
Can you post the output of CLOVER_DEBUG=llvm

The backtrace seems to indicate that you somehow have ended up with debug info included which isn't fully implemented yet
Comment 6 Matt Arsenault 2016-02-27 19:46:53 UTC
(In reply to Matt Arsenault from comment #5)
> Can you post the output of CLOVER_DEBUG=llvm
> 
> The backtrace seems to indicate that you somehow have ended up with debug
> info included which isn't fully implemented yet

I think I might have fixed this crash or a similar one a few months ago
Comment 7 joshua.r.marshall.1991 2016-02-29 00:21:31 UTC
It's running LLVM 3.7.  Getting it to work off of SVN is a pain so I've been sticking with Arch's package.
Comment 8 Vedran Miletić 2017-03-22 14:08:27 UTC
I have made some progress with clBLAS by implementing clEnqueueFillBuffer() from OpenCL 1.2 [1] (I still have to clean up and post the patch, which I will do soon). The test makes it a bit further now:

$ test-short 
Initialize default OpenCL and clblas...
SetUp: about to create command queues

Test environment:

Device name: AMD FIJI (DRM 3.8.0 / 4.9.14-200.fc25.x86_64, LLVM 5.0.0)
Device vendor: AMD
Platform (bit): Linux
clblas version: 2.12.0
Driver version: 17.1.0-devel
Device version: OpenCL 1.1 Mesa 17.1.0-devel (git-0c3fbf8)
Global mem size: 7984 MB
---------------------------------------------------------

[==========] Running 10096 tests from 125 test cases.
[----------] Global test environment set-up.
[----------] 4 tests from TRSM_extratest
[ RUN      ] TRSM_extratest.strsm
[       OK ] TRSM_extratest.strsm (813 ms)

clBLAS is required (at least) for Octopus and Theano. I expect to get it working over the coming months, hopefully along with improving clBLAS, fixing stuff like [2, 3].

[1] https://www.khronos.org/registry/OpenCL/sdk/1.2/docs/man/xhtml/clEnqueueFillBuffer.html
[2] https://github.com/clMathLibraries/clBLAS/issues/307
[3] https://github.com/clMathLibraries/clBLAS/issues/308
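
For readers unfamiliar with clEnqueueFillBuffer() [1], the following is a minimal host-side sketch of the fill semantics described in the OpenCL 1.2 specification: a pattern of 1 to 128 bytes (power of two) is replicated across a region whose offset and size are multiples of the pattern size. It is not the Clover patch mentioned above; the function name fill_buffer and the use of ValueError in place of CL_INVALID_VALUE are assumptions made purely for illustration.

```python
# Hypothetical host-side model of the clEnqueueFillBuffer() fill semantics.
# This is NOT Mesa/Clover code; names and error handling are illustrative.

# Pattern sizes permitted by the OpenCL 1.2 specification.
VALID_PATTERN_SIZES = {1, 2, 4, 8, 16, 32, 64, 128}


def fill_buffer(buffer: bytearray, pattern: bytes, offset: int, size: int) -> None:
    """Replicate `pattern` over buffer[offset:offset + size] in place."""
    pattern_size = len(pattern)

    # The real API reports CL_INVALID_VALUE; ValueError stands in for it here.
    if pattern_size not in VALID_PATTERN_SIZES:
        raise ValueError("pattern size must be 1, 2, 4, 8, 16, 32, 64 or 128 bytes")
    if offset < 0 or size < 0 or offset + size > len(buffer):
        raise ValueError("region lies outside the buffer")
    if offset % pattern_size != 0 or size % pattern_size != 0:
        raise ValueError("offset and size must be multiples of the pattern size")

    # Repeat the pattern enough times to cover the region exactly.
    buffer[offset:offset + size] = pattern * (size // pattern_size)


if __name__ == "__main__":
    buf = bytearray(16)
    fill_buffer(buf, b"\x01\x02\x03\x04", offset=4, size=8)
    print(buf.hex())
```

On a device, the same replication would be done by the driver (for example with a small kernel or a hardware fill), which is why a missing entry point prevents applications such as clBLAS from proceeding.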
Comment 9 GitLab Migration User 2019-09-18 17:55:50 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/132.
