Bug 70199

Summary: clang+llvm from svn crashes when generating opencl code for 64 bit types
Product: Mesa Reporter: klondike <klondike>
Component: Drivers/Gallium/r600 Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium CC: jv356, peter, vedran
Version: 9.2   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 99553    
Attachments: 64 bit unsigned integer divide it causes the first issue
64 bit unsigned add, causes the second
64 bit unsigned multiply, also generates the second issue.
Expand most of the 64 bit operations into 32 bit ones
piglit test showing importance of parameter order

Description klondike 2013-10-06 16:40:20 UTC
Created attachment 87199 [details]
64 bit unsigned integer divide it causes the first issue

When generating code from OpenCL files that use 64 bit integer types, the compiler crashes with messages similar to this:
0x67be970: i32 = ExternalSymbol'__udivdi3'
Undefined function
UNREACHABLE executed at /home/klondike/myllvm/llvm/lib/Target/R600/AMDGPUISelLowering.h:76!
0  clang           0x0000000001d2f795 llvm::sys::PrintStackTrace(_IO_FILE*) + 37
1  clang           0x0000000001d2fbe3
2  libpthread.so.0 0x00000337e68ffbf0
3  libc.so.6       0x00000337e54e5b05 gsignal + 53
4  libc.so.6       0x00000337e54e6f7b abort + 379
5  clang           0x0000000001d1f088 llvm::llvm_unreachable_internal(char const*, char const*, unsigned int) + 440
6  clang           0x0000000001597712
7  clang           0x00000000016696cf llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&) const + 2511
8  clang           0x000000000168e1b6 llvm::TargetLowering::makeLibCall(llvm::SelectionDAG&, llvm::RTLIB::Libcall, llvm::EVT, llvm::SDValue const*, unsigned int, bool, llvm::SDLoc, bool, bool) const + 806
9  clang           0x0000000001721c5e
10 clang           0x000000000171d390
11 clang           0x00000000016c1823
12 clang           0x00000000016c6964 llvm::SelectionDAG::LegalizeTypes() + 36
13 clang           0x000000000167d6fd llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 1389
14 clang           0x000000000167c909 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) + 6249
15 clang           0x000000000167a3a7 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 1319
16 clang           0x00000000017c876c llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 124
17 clang           0x0000000001c6f3d3 llvm::FPPassManager::runOnFunction(llvm::Function&) + 355
18 clang           0x0000000001c6f64b llvm::FPPassManager::runOnModule(llvm::Module&) + 43
19 clang           0x0000000001c6f994 llvm::MPPassManager::runOnModule(llvm::Module&) + 420
20 clang           0x0000000001c7003b llvm::PassManagerImpl::run(llvm::Module&) + 539
21 clang           0x0000000001c701aa llvm::PassManager::run(llvm::Module&) + 10
22 clang           0x0000000000808137 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::Module*, clang::BackendAction, llvm::raw_ostream*) + 6167
23 clang           0x0000000000805ab3
24 clang           0x000000000096db63 clang::ParseAST(clang::Sema&, bool, bool) + 515
25 clang           0x0000000000804f12 clang::CodeGenAction::ExecuteAction() + 514
26 clang           0x0000000000682461 clang::FrontendAction::Execute() + 113
27 clang           0x00000000006607bd clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 909
28 clang           0x00000000006474f5 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 3077
29 clang           0x000000000063efb4 cc1_main(char const**, char const**, char const*, void*) + 628
30 clang           0x0000000000645454 main + 8500
31 libc.so.6       0x00000337e54d25dd __libc_start_main + 237
32 clang           0x000000000063ec5d
Stack dump:
0.      Program arguments: /home/klondike/myllvm/build/Release+Asserts/bin/clang -cc1 -triple r600-- -S -disable-free -main-file-name ldiv.cl -mrelocation-model static -mdisable-fp-elim -fmath-errno -mconstructor-aliases -target-cpu redwood -target-linker-version 2.23.1 -coverage-file /home/klondike/opencl-example/- -resource-dir /home/klondike/myllvm/build/Release+Asserts/bin/../lib/clang/3.4 -include clc/clc.h -D cl_clang_storage_class_specifiers -D cl_khr_fp64 -std=cl -fno-dwarf-directory-asm -fdebug-compilation-dir /home/klondike/opencl-example -ferror-limit 19 -fmessage-length 192 -mstackrealign -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -vectorize-slp -o - -x cl ldiv.cl 
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'ldiv.cl'.
4.      Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@ldiv'
clang: error: unable to execute command: Aborted
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 3.4 (trunk 192013)
Target: r600--
Thread model: posix
clang: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/ldiv-5f4dde.cl
clang: note: diagnostic msg: /tmp/ldiv-5f4dde.sh
clang: note: diagnostic msg: 

********************

And this:
fatal error: error in backend: Cannot select: 0x5a141e0: ch = store 0x59d1bf0, 0x5a147e0, 0x5a127c0, 0x5a11bc0<ST4[%out+4]> [ORD=2] [ID=26]
  0x5a147e0: i32 = add 0x5a146e0, 0x5a145e0 [ORD=1] [ID=23]
    0x5a146e0: i32 = add 0x5a143e0, 0x5a144e0 [ORD=1] [ID=20]
      0x5a143e0: i32 = mulhu 0x5a124c0, 0x5a120c0 [ORD=1] [ID=16]
        0x5a124c0: i32 = CONST_ADDRESS 0x5a123c0 [ID=10]
          0x5a123c0: i32 = Constant<8240> [ID=5]
        0x5a120c0: i32 = CONST_ADDRESS 0x5a11fc0 [ID=12]
          0x5a11fc0: i32 = Constant<8232> [ID=7]
      0x5a144e0: i32 = mul 0x5a124c0, 0x5a122c0 [ORD=1] [ID=14]
        0x5a124c0: i32 = CONST_ADDRESS 0x5a123c0 [ID=10]
          0x5a123c0: i32 = Constant<8240> [ID=5]
        0x5a122c0: i32 = CONST_ADDRESS 0x5a121c0 [ID=11]
          0x5a121c0: i32 = Constant<8236> [ID=6]
    0x5a145e0: i32 = mul 0x5a151f0, 0x5a120c0 [ORD=1] [ID=15]
      0x5a151f0: i32 = CONST_ADDRESS 0x5a14fe0 [ID=9]
        0x5a14fe0: i32 = Constant<8244> [ID=4]
      0x5a120c0: i32 = CONST_ADDRESS 0x5a11fc0 [ID=12]
        0x5a11fc0: i32 = Constant<8232> [ID=7]
  0x5a127c0: i32 = DWORDADDR 0x5a125c0 [ORD=2] [ID=25]
    0x5a125c0: i32 = srl 0x5a14ae0, 0x5a126c0 [ORD=2] [ID=22]
      0x5a14ae0: i32 = add 0x5a11ec0, 0x5a149e0 [ORD=2] [ID=19]
        0x5a11ec0: i32 = CONST_ADDRESS 0x5a11dc0 [ID=13]
          0x5a11dc0: i32 = Constant<8228> [ID=8]
        0x5a149e0: i32 = Constant<4> [ID=2]
      0x5a126c0: i32 = Constant<2> [ID=3]
  0x5a11bc0: i32 = undef [ID=1]
In function: lmul
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 3.4 (trunk 192013)
Target: r600--
Thread model: posix
clang: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/lmul-213f81.cl
clang: note: diagnostic msg: /tmp/lmul-213f81.sh
clang: note: diagnostic msg: 

********************


The suggestion on the chat was to use TargetLowering to handle this issue (see http://llvm.org/docs/doxygen/html/structllvm_1_1TargetLowering_1_1TargetLoweringOpt.html for details).

Below are some examples of LLVM IR that crash the backend even though clang generated them from valid OpenCL programs.
Comment 1 klondike 2013-10-06 16:40:56 UTC
Created attachment 87200 [details]
64 bit unsigned add, causes the second
Comment 2 klondike 2013-10-06 16:41:50 UTC
Created attachment 87201 [details]
64 bit unsigned multiply, also generates the second issue.
Comment 3 Peter Wu 2013-10-12 21:57:13 UTC
Confirmed. Both piglit tests, scalar-arithmetic-long.cl and scalar-arithmetic-ulong.cl, show the following error (with __udivdi3 instead for ulong):

0x1cb47c0: i32 = ExternalSymbol'__divdi3'
Undefined function
UNREACHABLE executed at /src/llvm/lib/Target/R600/AMDGPUISelLowering.h:77!
Stack dump:
0.      Running pass 'Function Pass Manager' on module 'radeon'.
1.      Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@div'
Comment 4 Peter Wu 2013-10-12 22:03:12 UTC
I forgot to mention my environment:

- HD6790
- Linux 3.11.0
- Mesa from git master, commit d7d539a1cb8dcf50cb7cd534e6ae7df3f42914c8
- LLVM from SVN trunk, rev 192532 (via git-svn)
Comment 5 klondike 2013-10-13 01:52:06 UTC
Created attachment 87543 [details] [review]
Expand most of the 64 bit operations into 32 bit ones

Well, I have been messing a little bit with the TargetLowering. I have managed to get some operations to work (ANDs, ORs, XORs, UMULs, ADDs and SUBs, amongst others) by forcing LLVM to expand them.

DIVs are a completely different world, since they require either more advanced algorithms which I'm not familiar with, or support for calls (plus porting gcc's functions). The first I don't have time to do (for now); the second is waaaaay out of my league.

The attached patch expands some of the 64 bit integer operations, but it is still a WIP: most if not all of them (loads and stores being the exceptions) likely need to be expanded, and I haven't covered them all.
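
For readers unfamiliar with what "expanding" means here, the following is a minimal sketch of how a 64 bit add and multiply can be built from 32 bit pieces. It is written in Python purely for illustration and is not taken from the attached patch; the helper names (split64, join64, mulhu32, mul32, add64, mul64) are made up for this example. The shape of mul64's high word (one mulhu plus two mul, added together) mirrors the add/mulhu/mul nodes visible in the "Cannot select" dump in the description.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32 bit word


def split64(x):
    """Split a 64 bit unsigned value into (low, high) 32 bit words."""
    return x & MASK32, (x >> 32) & MASK32


def join64(lo, hi):
    """Rebuild a 64 bit unsigned value from its two 32 bit words."""
    return (hi << 32) | lo


def mul32(a, b):
    """Low 32 bits of the product of two 32 bit values (a plain 'mul')."""
    return (a * b) & MASK32


def mulhu32(a, b):
    """High 32 bits of the unsigned product of two 32 bit values ('mulhu')."""
    return ((a * b) >> 32) & MASK32


def add64(a, b):
    """64 bit add from two 32 bit adds plus a carry out of the low word."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0  # the low add wrapped around
    hi = (a_hi + b_hi + carry) & MASK32
    return join64(lo, hi)


def mul64(a, b):
    """64 bit multiply (truncated to 64 bits) from 32 bit mul and mulhu."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    lo = mul32(a_lo, b_lo)
    # High word: carry-out of the low product plus the two cross products.
    hi = (mulhu32(a_lo, b_lo) + mul32(a_lo, b_hi) + mul32(a_hi, b_lo)) & MASK32
    return join64(lo, hi)
```

For example, add64(0xFFFFFFFF, 1) carries into the high word and gives 0x100000000. The a_hi * b_hi term is dropped in mul64 because it only contributes to bits 64 and above.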
Comment 6 klondike 2013-10-13 03:29:45 UTC
Just a small note: the functions also seem to be defined in compiler-rt, so we could use clang to compile them and preload them as part of the runtime.
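
To make clear what a helper like __udivdi3 has to compute, here is a minimal sketch of unsigned 64 bit division using the textbook shift-and-subtract (restoring) method. This is an assumption-laden illustration in Python, not the compiler-rt or gcc implementation, and the name udiv64 is hypothetical.

```python
def udiv64(n, d):
    """Return (quotient, remainder) of n / d for 64 bit unsigned n and d.

    Textbook restoring division: walk the dividend from its top bit down,
    shifting each bit into a running remainder and subtracting the divisor
    whenever it fits.
    """
    if d == 0:
        raise ZeroDivisionError("division by zero")
    q = 0
    r = 0
    for i in range(63, -1, -1):
        # Bring down bit i of the dividend. Python integers do not overflow;
        # a fixed-width version must handle r briefly needing a 65th bit.
        r = (r << 1) | ((n >> i) & 1)
        if r >= d:
            r -= d
            q |= 1 << i  # the divisor fits, so quotient bit i is 1
    return q, r
```

For example, udiv64(100, 7) returns (14, 2). The loop always runs 64 iterations, which hints at why division is so much more work to lower than add or multiply.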
Comment 7 Peter Wu 2013-10-13 09:15:22 UTC
I tried the patch from comment 5, but the piglit tests have not improved.
Comment 8 Peter Wu 2013-10-13 12:41:50 UTC
Created attachment 87554 [details]
piglit test showing importance of parameter order

The order of parameters for long seems to matter. Output for the attached piglit test (failing tests on top):

> Running kernel test: Arg0 = a + b
Using kernel add0
Setting kernel arguments...
Running the kernel...
Validating results...
Expecting 3 (0x3) with tolerance 0, but got 12884901888 (0x300000000)
Error at long[0]
 Argument 0: FAIL
PIGLIT:subtest {'Arg0 = a + b' : 'fail'}
> Running kernel test: Arg1 = a + b
Using kernel add1
Setting kernel arguments...
Running the kernel...
Validating results...
Expecting 3 (0x3) with tolerance 0, but got 8589934593 (0x200000001)
Error at long[0]
 Argument 1: FAIL
PIGLIT:subtest {'Arg1 = a + b' : 'fail'}
> Running kernel test: set arg0 to arg1
Using kernel set0
Setting kernel arguments...
Running the kernel...
Validating results...
Expecting 4 (0x4) with tolerance 0, but got 17179869184 (0x400000000)
Error at long[0]
 Argument 0: FAIL
PIGLIT:subtest {'set arg0 to arg1' : 'fail'}

(Passing tests below:)

> Running kernel test: set arg1 to arg0
Using kernel set1
Setting kernel arguments...
Running the kernel...
Validating results...
 Argument 1: PASS
PIGLIT:subtest {'set arg1 to arg0' : 'pass'}
> Running kernel test: Arg2 = a + b
Using kernel add2
Setting kernel arguments...
Running the kernel...
Validating results...
 Argument 2: PASS
PIGLIT:subtest {'Arg2 = a + b' : 'pass'}
> Running kernel test: set arg0 to arg1 (indirected)
Using kernel setp0
Setting kernel arguments...
Running the kernel...
Validating results...
 Argument 0: PASS
PIGLIT:subtest {'set arg0 to arg1 (indirected)' : 'pass'}
> Running kernel test: set arg1 to arg0 (indirected)
Using kernel setp1
Setting kernel arguments...
Running the kernel...
Validating results...
 Argument 1: PASS
PIGLIT:subtest {'set arg1 to arg0 (indirected)' : 'pass'}
> Running kernel test: set arg0 to arg2 (with dummy pointer)
Using kernel set0_2
Setting kernel arguments...
Running the kernel...
Validating results...
 Argument 0: PASS
PIGLIT:subtest {'set arg0 to arg2 (with dummy pointer)' : 'pass'}
> Running kernel test: set arg2 to arg0 (with dummy pointer)
Using kernel set2_0
Setting kernel arguments...
Running the kernel...
Validating results...
 Argument 2: PASS
PIGLIT:subtest {'set arg2 to arg0 (with dummy pointer)' : 'pass'}
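
The failing values above are easier to read once split into 32 bit halves. The short sketch below does only that arithmetic on the three numbers from the log; the helper name words is made up, and this restates the output rather than diagnosing the cause.

```python
def words(value):
    """Split a 64 bit value into (low, high) 32 bit words."""
    return value & 0xFFFFFFFF, (value >> 32) & 0xFFFFFFFF


# "got" values copied from the failing subtests in the log above.
observed = {
    "add0": 12884901888,  # expected 3
    "add1": 8589934593,   # expected 3
    "set0": 17179869184,  # expected 4
}

for kernel, got in observed.items():
    lo, hi = words(got)
    print(f"{kernel}: lo=0x{lo:x} hi=0x{hi:x}")
```

For add0 and set0 the expected value shows up in the high word with a zero low word; for add1 the halves are 1 (low) and 2 (high), which sum to the expected 3.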
Comment 9 Vedran Miletić 2015-12-21 12:53:21 UTC
Does this still occur?
Comment 10 klondike 2015-12-21 16:06:08 UTC
(In reply to Vedran Miletić from comment #9)
> Does this still occur?

I no longer have access to such machines, as the one I used has died. Sorry :(
Comment 11 Vedran Miletić 2017-03-22 15:35:39 UTC
I am unable to recompile the LLVM IR files due to the following error:

llc: kk.ll:9:59: error: use of undefined metadata '!1'
  store i64 %div, i64 addrspace(1)* %out, align 8, !tbaa !1
                                                          ^

However, compiling the piglit test with

$ clang -x cl -target r600-- -mcpu=cayman -Dcl_clang_storage_class_specifiers=1 -Xclang -mlink-bitcode-file -Xclang /usr/local/lib64/clc/cayman-r600--.bc -I/usr/local/include/clc/ -include /usr/local/include/clc/clc.h kernel.cl

produces no crashes. If anyone can confirm this is still an issue, please reopen.
Comment 12 Jan Vesely 2017-03-22 20:34:19 UTC
64 bit integer division was implemented back in 2014 (r207508, r207589, fixed in r222073). It should be in LLVM releases as old as 3.6.
