Bug 111522

Summary: [bisected] Supraland no longer starts
Product: Mesa
Component: Drivers/Vulkan/Common
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: RESOLVED FIXED
Severity: normal
Priority: not set
Reporter: MWATTT <megwattt>
Assignee: mesa-dev
QA Contact:
CC: airlied, chadversary, jason
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=110765
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 111444
Attachments: Full log from UE4.22 Scifi Hallway demo

Description MWATTT 2019-08-29 15:05:56 UTC
Commit 9653d80de187fe9d9e5211b475065e7e09598f19 breaks Supraland (a UE 4.21 Vulkan game). The game only shows a black window, which closes after roughly 1 minute.
Tested on a RX 570. A demo of the game is freely available if needed.

Culprit commit: 9653d80de187fe9d9e5211b475065e7e09598f19
Author: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Date:   Mon May 20 22:58:32 2019 +0200

    vulkan/wsi/x11: Increase the effective min. images for mailbox.
    
    We need 5 images:
    1) CPU work
    2) GPU work
    3) idle
    4) queued for flip
    5) presenting
    
    Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Comment 1 Eero Tamminen 2019-08-30 07:44:40 UTC
Related to bug 110765 ?
Comment 2 Eric Engestrom 2019-08-30 11:52:33 UTC
Could you try if this fixes your issue?
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1819
Comment 3 Eero Tamminen 2019-08-30 12:22:44 UTC
(In reply to Eric Engestrom from comment #2)
> Could you try if this fixes your issue?
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1819

Not alone, as that MR enables the workaround only for gfxbench (testfw_app binary name).

What binary name does Supraland use?
Comment 4 MWATTT 2019-08-30 12:54:22 UTC
The binary is "Supraland-Linux-Shipping"
It may also affect other UE4 games; I haven't tested yet.
Comment 5 Eric Engestrom 2019-08-30 13:09:08 UTC
Alright, you'll need to add this to your ~/.drirc on top of MR !1819:

<driconf>
    <device>
        <application name="Supraland" executable="Supraland-Linux-Shipping">
            <option name="override_min_image_count" value="2" />
        </application>
    </device>
</driconf>

If you could test other UE4 games too that would be great :)
Comment 6 MWATTT 2019-08-30 16:34:54 UTC
Applying the patch from MR 1819 plus the .drirc file has no effect.
Comment 7 Lionel Landwerlin 2019-09-05 15:44:30 UTC
Just to confirm, is this the title causing problems? https://store.steampowered.com/app/813630/Supraland/
Comment 8 Mark Janes 2019-09-05 16:15:29 UTC
Eric, can you make a similar fix to what you did for minimagecount dri config?
Comment 9 MWATTT 2019-09-05 17:23:26 UTC
Yes Lionel
Comment 10 Lionel Landwerlin 2019-09-05 21:06:29 UTC
I've put up another MR : https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1883

Here is the ~/.drirc I was using :

<driconf>
   <device>
      <application name="Supraland" executable="Supraland-Linux-Shipping">
         <option name="vk_x11_override_min_image_count" value="2" />
         <option name="vk_x11_strict_image_count" value="true" />
      </application>
   </device>
</driconf>

If you could test this that would be great.


Note that for me with this fix, the game crashes at start with the following backtrace :

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f051ec464e5 in ralloc_parent (ptr=0x7f04fd83d760) at ../src/util/ralloc.c:356
356        return info->parent ? PTR_FROM_HEADER(info->parent) : NULL;
[Current thread is 1 (Thread 0x7f0505afb700 (LWP 15827))]
(gdb) bt
#0  0x00007f051ec464e5 in ralloc_parent (ptr=0x7f04fd83d760) at ../src/util/ralloc.c:356
#1  0x00007f051ec464b8 in ralloc_parent (ptr=0x7f04fc891a00) at ../src/util/ralloc.c:353
#2  0x00007f051ec464b8 in ralloc_parent (ptr=0x7f04fce65fe0) at ../src/util/ralloc.c:353
#3  0x00007f051ec463d9 in ralloc_adopt (new_ctx=0x7f051ec463d9 <ralloc_adopt+48>, old_ctx=0x7f0505aebde0) at ../src/util/ralloc.c:326
#4  0x00007f051e8ff6b3 in anv_pipeline_compile_graphics (pipeline=0x7f04fd443410, cache=0x8d20800, info=0x7f0505af9c68) at ../src/intel/vulkan/anv_pipeline.c:1448
#5  0x00007f051e900b5f in anv_pipeline_init (pipeline=0x7f04fd443410, device=0x8d208f0, cache=0x8d20800, pCreateInfo=0x7f0505af9c68, alloc=0x8d208f8) at ../src/intel/vulkan/anv_pipeline.c:1930
#6  0x00007f051e9f4793 in gen9_graphics_pipeline_create (_device=0x8d208f0, cache=0x8d20800, pCreateInfo=0x7f0505af9c68, pAllocator=0x0, pPipeline=0x7f04a4b01c10)
    at ../src/intel/vulkan/genX_pipeline.c:2135
#7  0x00007f051e9f5200 in VALGRIND_PRINTF (format=0x7f051e9e82aa <_anv_combine_address+105> "H\215\065\357\t?") at /usr/include/valgrind/valgrind.h:6248
#8  0x00007f051e27d20c in vkCreateGraphicsPipelines (device=0x8d208f0, pipelineCache=0x8d20800, createInfoCount=1, pCreateInfos=0x7f0505af9c68, pAllocator=0x0, pPipelines=0x7f04a4b01c10)
    at layersvt/api_dump.cpp:8318
#9  0x00007f051dcd7078 in ?? () from /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so
#10 0x00007f0528057c94 in vkCreateGraphicsPipelines (device=0x8d208f0, pipelineCache=0x8d20800, createInfoCount=1, pCreateInfos=0x7f0505af9c68, pAllocator=0x0, pPipelines=0x7f04a4b01c10)
    at ../loader/trampoline.c:1275
#11 0x000000000467195d in FVulkanPipelineStateCacheManager::CreateGfxPipelineFromEntry(FVulkanPipelineStateCacheManager::FGfxPipelineEntry*, FVulkanShader**, FVulkanGfxPipeline*) ()
#12 0x0000000004670ee1 in FVulkanPipelineStateCacheManager::CreateAndAdd(FGraphicsPipelineStateInitializer const&, FGfxPSIKey, TSharedPtr<FVulkanPipelineStateCacheManager::FGfxPipelineEntry, (ESPMode)1>, FGfxEntryKey) ()
#13 0x0000000004674dcb in FVulkanDynamicRHI::RHICreateGraphicsPipelineState(FGraphicsPipelineStateInitializer const&) ()
#14 0x000000000472a1b6 in PipelineStateCache::GetAndOrCreateGraphicsPipelineState(FRHICommandList&, FGraphicsPipelineStateInitializer const&, EApplyRendertargetOption) ()
#15 0x0000000004729e3f in SetGraphicsPipelineState(FRHICommandList&, FGraphicsPipelineStateInitializer const&, EApplyRendertargetOption) ()
#16 0x00000000042c3f86 in FRCPassPostProcessCombineLUTs::Process(FRenderingCompositePassContext&) ()
#17 0x00000000043820d8 in FRenderingCompositionGraph::RecursivelyProcess(FRenderingCompositeOutputRef const&, FRenderingCompositePassContext&) const ()
#18 0x0000000004381d94 in FRenderingCompositePassContext::Process(TArray<FRenderingCompositePass*, FDefaultAllocator> const&, char16_t const*) ()
#19 0x00000000042d43f1 in FPostProcessing::Process(FRHICommandListImmediate&, FViewInfo const&, TRefCountPtr<IPooledRenderTarget>&) ()
#20 0x000000000415007b in FDeferredShadingSceneRenderer::Render(FRHICommandListImmediate&) ()
#21 0x000000000446c7c6 in ?? ()
#22 0x00000000044772ba in ?? ()
#23 0x000000000391c18f in FNamedTaskThread::ProcessTasksNamedThread(int, bool) ()
#24 0x000000000391bdf3 in FNamedTaskThread::ProcessTasksUntilQuit(int) ()
#25 0x000000000475e0b2 in FRenderingThread::Run() ()
#26 0x0000000003954b03 in FRunnableThreadPThread::Run() ()
#27 0x0000000003946aad in FRunnableThreadPThread::_ThreadProc(void*) ()
#28 0x00007f054e77d182 in start_thread (arg=<optimised out>) at pthread_create.c:486
#29 0x00007f054dd88b1f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95


However, running the game under valgrind shows invalid free() calls from the application, so I believe the backtrace above is the result of earlier memory corruption.

Thanks!
Comment 11 MWATTT 2019-09-05 22:13:51 UTC
I can confirm that the last MR + .drirc solves this issue, at least on radv. Anv may have additional issues.

May be unrelated but I have a lot of
"SPIR-V WARNING:
    In file ../src/compiler/spirv/spirv_to_nir.c:826
    Decoration not allowed on struct members: SpvDecorationInvariant
" 
in the console
Comment 12 MWATTT 2019-09-06 17:06:22 UTC
I just compiled the Scifi Hallway demo with Unreal Engine 4.22. I have the same issue, so it probably affects all UE4 applications using Vulkan.

This time, I have some info from the log:
"Assertion failed: Images.Num() == NUM_BUFFERS [File:D:\Build\++UE4\Sync\Engine\Source\Runtime\VulkanRHI\Private\VulkanViewport.cpp] [Line: 610] 
Actual Num: 5"

I'll attach the full log.

With UE4.22+, using "-vulkanpresentmode=0" also solves the crash. (Does not work with Supraland, as it's a UE4.21 game)
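
The assertion above suggests the engine expects exactly the number of images it asked for. The Vulkan specification allows an implementation to create more images than `minImageCount`, so the usual pattern is to query the count and size the array from the answer. Below is a minimal sketch of that two-call pattern. `fake_get_swapchain_images` is a hypothetical stand-in for `vkGetSwapchainImagesKHR` so that the example builds without the Vulkan headers, and the value 5 is simply the count reported in the log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Image count reported in the log above; an assumption for this sketch. */
#define DRIVER_IMAGE_COUNT 5u

/* Hypothetical stand-in for vkGetSwapchainImagesKHR: when `images` is NULL it
 * reports how many images exist, otherwise it fills at most *count handles. */
static int fake_get_swapchain_images(uint32_t *count, uint64_t *images)
{
    assert(count != NULL);
    if (images == NULL) {
        *count = DRIVER_IMAGE_COUNT;
        return 0;
    }
    if (*count > DRIVER_IMAGE_COUNT)
        *count = DRIVER_IMAGE_COUNT;
    for (uint32_t i = 0; i < *count; i++)
        images[i] = 0x1000u + i; /* dummy handles */
    return 0;
}

/* Query the real image count instead of assuming it equals the requested
 * minimum. Returns a malloc'ed array (caller frees) or NULL on failure. */
static uint64_t *acquire_images(uint32_t requested_min, uint32_t *out_count)
{
    uint32_t count = 0;
    if (fake_get_swapchain_images(&count, NULL) != 0 || count < requested_min)
        return NULL;

    uint64_t *images = malloc(count * sizeof(*images));
    if (images == NULL)
        return NULL;

    if (fake_get_swapchain_images(&count, images) != 0) {
        free(images);
        return NULL;
    }
    *out_count = count;
    return images;
}
```

With this pattern, a request for 3 images that yields 5 is handled by allocating for 5 rather than asserting.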
Comment 13 MWATTT 2019-09-06 17:08:13 UTC
Created attachment 145284 [details]
Full log from UE4.22 Scifi Hallway demo
Comment 14 Jason Ekstrand 2019-09-06 18:20:27 UTC
I've got an e-mail thread going with some people at Epic.  They're going to be looking into fixing the issue in UE4.  Until then, driver workarounds will be needed. :-(
Comment 15 MWATTT 2019-09-07 02:33:33 UTC
Can these workarounds be applied to all applications using this engine with information in VkApplicationInfo->pEngineName?
Comment 16 Lionel Landwerlin 2019-09-07 21:30:11 UTC
(In reply to MWATTT from comment #15)
> Can these workarounds be applied to all applications using this engine with
> information in VkApplicationInfo->pEngineName?

Yes, will look into that.
Comment 17 Lionel Landwerlin 2019-09-09 06:03:22 UTC
(In reply to Lionel Landwerlin from comment #16)
> (In reply to MWATTT from comment #15)
> > Can these workarounds be applied to all applications using this engine with
> > information in VkApplicationInfo->pEngineName?
> 
> Yes, will look into that.

Done, updated the MR.
Comment 18 MWATTT 2019-09-09 16:18:28 UTC
Supraland does start with the updated MR. (I suppose .drirc is only updated when using ninja install, so I manually copied the content of 00-mesa-defaults.conf into .drirc.)

Note that this will only affect UE4.21 applications. For instance, my previous UE4.22 sample has the same issue unless I modify .drirc with "UnrealEngine4.22".

A workaround is to set
<engine engine_name="UnrealEngine4" engine_version="0:23">
in .drirc and use strstr instead of strcmp in xmlconfig.c at line 793, if that's acceptable.
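
To illustrate why the exact comparison misses "UnrealEngine4.22", here is a small sketch contrasting `strcmp` and `strstr`. The helper names are hypothetical and this is not the code in xmlconfig.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Exact comparison: matches only when the two names are identical. */
static bool engine_name_equals(const char *reported, const char *wanted)
{
    assert(reported != NULL && wanted != NULL);
    return strcmp(reported, wanted) == 0;
}

/* Substring comparison: matches when `wanted` occurs anywhere in `reported`. */
static bool engine_name_contains(const char *reported, const char *wanted)
{
    assert(reported != NULL && wanted != NULL);
    return strstr(reported, wanted) != NULL;
}
```

The substring test accepts "UnrealEngine4.21" and "UnrealEngine4.22" for the key "UnrealEngine4", but it would also accept any unrelated engine name that happens to contain that string.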
Comment 19 Lionel Landwerlin 2019-09-09 17:35:06 UTC
(In reply to MWATTT from comment #18)
> Supraland does start with the updated MR. (I suppose .drirc is only updated
> when using ninja install, so I manually copied the content of
> 00-mesa-defaults.conf into .drirc.)
> 
> Note that this will only affect UE4.21 applications. For instance, my
> previous UE4.22 sample has the same issue unless I modify .drirc with
> "UnrealEngine4.22".
> 
> A workaround is to set
> <engine engine_name="UnrealEngine4" engine_version="0:23">
> in .drirc and use strstr instead of strcmp in xmlconfig.c at line 793, if
> that's acceptable.

I've been wondering about regexps but that's another thing to add to mesa.
It's also unfortunate that Unreal puts its engine version in the engine name.
Comment 20 Timothy Arceri 2019-09-10 00:44:29 UTC
(In reply to Lionel Landwerlin from comment #19)
> (In reply to MWATTT from comment #18)
> > Supraland does start with the updated MR. (I suppose .drirc is only updated
> > when using ninja install, so I manually copied the content of
> > 00-mesa-defaults.conf into .drirc.)
> > 
> > Note that this will only affect UE4.21 applications. For instance, my
> > previous UE4.22 sample has the same issue unless I modify .drirc with
> > "UnrealEngine4.22".
> > 
> > A workaround is to set
> > <engine engine_name="UnrealEngine4" engine_version="0:23">
> > in .drirc and use strstr instead of strcmp in xmlconfig.c at line 793, if
> > that's acceptable.
> 
> I've been wondering about regexps but that's another thing to add to mesa.
> It's also unfortunate that Unreal puts its engine version in the engine name.

The filename matching code is already full of platform specific paths so adding regexps here would be fairly simple as we can just use the posix functions.
Comment 21 Eero Tamminen 2019-09-10 11:41:57 UTC
(In reply to Timothy Arceri from comment #20)
> (In reply to Lionel Landwerlin from comment #19)
> > I've been wondering about regexps but that's another thing to add to mesa.
> > It's also unfortunate that Unreal puts its engine version in the engine name.
> 
> The filename matching code is already full of platform specific paths so
> adding regexps here would be fairly simple as we can just use the posix
> functions.

Wouldn't globbing (fnmatch()) be sufficient and easier to end users to understand than regexps?
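
For comparison, here is a sketch of both approaches using the POSIX `fnmatch()` and `regcomp()`/`regexec()` functions. The wrapper names are hypothetical and this does not reflect how Mesa ended up implementing the matching.

```c
#include <assert.h>
#include <fnmatch.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Glob-style match: "*" matches any run of characters, "." is literal,
 * and the pattern must cover the whole name. */
static bool name_matches_glob(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);
    return fnmatch(pattern, name, 0) == 0;
}

/* POSIX extended regular expression match: "." matches any character and
 * the pattern may match anywhere in the name unless anchored with ^ and $. */
static bool name_matches_regex(const char *pattern, const char *name)
{
    regex_t re;
    assert(pattern != NULL && name != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false; /* treat an invalid pattern as "no match" */
    bool matched = regexec(&re, name, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

The same pattern string means different things in the two syntaxes: as a glob, "UnrealEngine4.*" requires a literal dot after the 4, while as a regex it also matches plain "UnrealEngine4".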
Comment 22 MWATTT 2019-09-14 18:58:54 UTC
Your last MR works fine for me. 
Note that the latest Unreal Engine version is 4.23 and it is still affected by this bug, so it would be better to use engine_version="0:23" in .drirc.
Comment 23 Lionel Landwerlin 2019-09-15 13:04:02 UTC
Should be fixed on master with:

commit 0616b7ac90cf4f86bb409d34101e3a3cceac8cbe
Author: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Date:   Thu Sep 5 23:54:53 2019 +0300

    vulkan: add vk_x11_strict_image_count option
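
As a rough model of what the option changes (a simplified sketch based on this thread, not the Mesa source; the function name and parameters are hypothetical): without the option the driver may raise the image count to the five images listed in the culprit commit when mailbox mode is used, and with it the application's requested count is used as-is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Five images for mailbox, as listed in the culprit commit message. */
#define MAILBOX_MIN_IMAGES 5u

/* Simplified model: how many images the swapchain ends up with. */
static uint32_t swapchain_image_count(uint32_t app_min_image_count,
                                      bool mailbox, bool strict_image_count)
{
    assert(app_min_image_count > 0);
    if (strict_image_count)
        return app_min_image_count;   /* give the app exactly what it asked for */
    if (mailbox && app_min_image_count < MAILBOX_MIN_IMAGES)
        return MAILBOX_MIN_IMAGES;    /* bump to the effective minimum */
    return app_min_image_count;
}
```

Under this model, an application that requests 3 images in mailbox mode gets 5 by default and 3 when the strict option is enabled, which matches the "Actual Num: 5" assertion seen earlier in the thread.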
Comment 24 Eric Engestrom 2019-09-15 13:22:25 UTC
(In reply to MWATTT from comment #22)
> Note that the latest Unreal Engine version is 4.23 and it is still affected
> by this bug, so it would be better to use engine_version="0:23" in .drirc.

Lionel, you put the workaround up to version 4.21 in 0616b7ac90cf4f86bb409d34101e3a3cceac8cbe; did you mean to put 4.23?

If so, you can push this with my r-b:
---8<---
--- a/src/util/00-mesa-defaults.conf
+++ b/src/util/00-mesa-defaults.conf
@@ -475,7 +475,7 @@ TODO: document the other workarounds.
         <!-- Works around the game not starting (does not deal with
              the implementation returning more images than the minimum
              specified by the application. -->
-        <engine engine_name_match="UnrealEngine4.*" engine_versions="0:21">
+        <engine engine_name_match="UnrealEngine4.*" engine_versions="0:23">
             <option name="vk_x11_strict_image_count" value="true" />
         </engine>
     </device>
--->8---
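
A sketch of how a "low:high" range such as "0:23" could be checked against a reported engine version. This assumes the range is inclusive on both ends and that the engine reports its minor version (21, 22, 23) as the version number, which is what the ranges in this thread imply. `version_in_range` is a hypothetical helper, not the parser in xmlconfig.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Returns true when `version` lies within the inclusive "low:high" range.
 * Malformed ranges are treated as matching nothing. */
static bool version_in_range(const char *range, uint32_t version)
{
    unsigned int low = 0, high = 0;
    assert(range != NULL);
    if (sscanf(range, "%u:%u", &low, &high) != 2 || low > high)
        return false;
    return version >= low && version <= high;
}
```

With the original range "0:21", a 4.23 application reporting version 23 falls outside the range and does not get the workaround; widening the range to "0:23" covers it.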
Comment 25 Lionel Landwerlin 2019-09-15 19:05:44 UTC
(In reply to Eric Engestrom from comment #24)
> (In reply to MWATTT from comment #22)
> > Note that the latest Unreal Engine version is 4.23 and it is still affected
> > by this bug, so it would be better to use engine_version="0:23" in .drirc.
> 
> Lionel, you put the workaround up to version 4.21 in
> 0616b7ac90cf4f86bb409d34101e3a3cceac8cbe; did you mean to put 4.23?

Yes, because of the comment in https://bugs.freedesktop.org/show_bug.cgi?id=111522#c22

> 
> If so, you can push this with my r-b:
> ---8<---
> --- a/src/util/00-mesa-defaults.conf
> +++ b/src/util/00-mesa-defaults.conf
> @@ -475,7 +475,7 @@ TODO: document the other workarounds.
>          <!-- Works around the game not starting (does not deal with
>               the implementation returning more images than the minimum
>               specified by the application. -->
> -        <engine engine_name_match="UnrealEngine4.*" engine_versions="0:21">
> +        <engine engine_name_match="UnrealEngine4.*" engine_versions="0:23">
>              <option name="vk_x11_strict_image_count" value="true" />
>          </engine>
>      </device>
> --->8---
