When a VFX issue reveals a compiler bug
Fri, Apr 21, 2023
The first symptoms
It was a quiet, simple Tuesday in March when one of our Sound Designers shared his latest sound work.
Nothing out of the ordinary here, just some nice sci-fi pews and beams.
But among the people giving feedback on the sound, a keen-eyed programmer noticed that the beam VFX of our weapons were coming from a weird origin,
not from the weapon's muzzle as one would expect.
My first suspect was the Particle Effect Component. I noticed that many components were attached to the weapon
using the FAttachmentTransformRules::KeepRelativeTransform flag,
so I wondered if something had caused them to shift and pick up an offset.
And would you look at that, switching it to FAttachmentTransformRules::SnapToTargetNotIncludingScale fixed it.
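For anyone unfamiliar with these two flags, here is a rough sketch of the difference between them. It is not Unreal code: Vec3, ModelComponent, EAttachRule and AttachAndGetWorldLocation are made-up stand-ins, and the model only covers location, ignoring rotation and scale.

```cpp
#include <cassert>

// Simplified stand-in for a vector type; not Unreal's FVector.
struct Vec3 { float X, Y, Z; };

inline Vec3 Add(Vec3 A, Vec3 B) { return {A.X + B.X, A.Y + B.Y, A.Z + B.Z}; }

// Hypothetical two-value enum modelling only the location part of the attachment rules.
enum class EAttachRule { KeepRelative, SnapToTarget };

// Hypothetical component that only tracks its offset from its parent.
struct ModelComponent
{
    Vec3 RelativeLocation{0.f, 0.f, 0.f};
};

// Attaches Child to a socket and returns the resulting world location.
inline Vec3 AttachAndGetWorldLocation(ModelComponent& Child, Vec3 SocketWorldLocation, EAttachRule Rule)
{
    if (Rule == EAttachRule::SnapToTarget)
    {
        // Snapping discards whatever relative offset the component carried.
        Child.RelativeLocation = {0.f, 0.f, 0.f};
    }
    // With KeepRelative the existing offset is kept, so a stale offset survives the attach.
    return Add(SocketWorldLocation, Child.RelativeLocation);
}
```

In this model, a component that carries a stray relative offset of 25 on Y ends up 25 units off the socket with KeepRelative and exactly on the socket with SnapToTarget, which is why the flag looked like a plausible culprit.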
So that would be it, right? Issue fixed, and I moved on. Until a week later, when I got another report saying it was happening again.
I launched the game, hopped into a debug level, and shot the beam weapon. And look at that, the beam VFX worked just fine for me.
So that's very weird, right? It works on my machine but not on theirs.
Background
A little background on our workflow: since we have a lot of people working on our game who aren't programmers, we make sure to push our DLLs to git.
This way all the art, design, narrative, animation, etc. people have the updated version of our project.
What this also means is that if one programmer compiles without all the code and pushes the result, all those people will be missing changes even though they pulled them from git.
We work around this by making sure our programmers pull changes and recompile the DLLs before they commit and push.
Altogether, we rarely have issues with this workflow.
"It works on my machine"
So now, with an "it works on my machine" case, my suspicions shifted to my colleague who was the last to commit and push the DLLs.
So I checked with him to see if he had forgotten to pull the changes before building.
We quickly ruled this out: it had been an entire week since my commit, so my changes were definitely included.
And to solidify this, when he compiled the game from a completely up-to-date checkout he would still get the issue.
So now what? If I compile, it works; if he compiles, it doesn't.
This is around the time I remembered an issue we had years ago, an issue where FVector::GetSafeNormal would return faulty data.
This was something that we had only figured out because other people had reported it on the Unreal Engine forums.
So I wondered...
What if?
I started by checking my compiler version, "Visual Studio 2019 14.29.30145 toolchain", and then his compiler version, "Visual Studio 2022 14.35.32215 toolchain".
In this situation I was very glad I had not enforced any specific compiler version among the team members.
So I asked everyone on the team to report their compiler version and whether or not the beam VFX were coming from the right origin for them.
With this done, I narrowed the issue down to anyone using the 14.34 or 14.35 toolchain versions.
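If you want the binaries themselves to report which toolchain built them, instead of asking everyone by hand, something like the sketch below could work. The function names are made up for illustration, and the mapping assumes the usual correspondence between MSVC's predefined _MSC_VER macro and the toolset version (19xx maps to 14.xx), which matches the versions in this post.

```cpp
#include <cassert>
#include <string>

// Maps an _MSC_VER value (e.g. 1935) to a toolset "major.minor" string (e.g. "14.35").
// Assumption: the 19xx -> 14.xx correspondence between compiler and toolset versions.
inline std::string ToolsetFromMscVer(int MscVer)
{
    const int Minor = MscVer - 1900;
    return "14." + std::to_string(Minor);
}

// Returns the toolset this translation unit was compiled with, or "non-MSVC"
// when the compiler does not define _MSC_VER.
inline std::string CurrentToolset()
{
#if defined(_MSC_VER)
    return ToolsetFromMscVer(_MSC_VER);
#else
    return "non-MSVC";
#endif
}
```

Here ToolsetFromMscVer(1929) gives "14.29" and ToolsetFromMscVer(1935) gives "14.35". Logging CurrentToolset() once at startup would be enough to tell which compiler produced the DLLs someone is running.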
Damage control engaged!
Since our build pipeline solely uses 14.29.30145 as well, it was an easy decision for me to require everyone to install Visual Studio 2019.
I had noticed that Unreal Engine prefers the 2019 compiler when it is available, even if Visual Studio 2022 is installed as well.
Running from desk to desk and contacting remote colleagues, we got everyone on 2019 within an hour.
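Asking everyone to install a specific compiler works, but a compile-time guard could catch the next person who builds with an affected toolchain. The sketch below is one possible way to do that. IsKnownAffectedMsvc is a hypothetical helper, and the only versions it flags are the two we saw the problem on; other versions are simply untested here.

```cpp
#include <cassert>

// True for the _MSC_VER values that match the toolchains we saw the bug on
// (14.34 -> 1934, 14.35 -> 1935). Other versions were not tested.
constexpr bool IsKnownAffectedMsvc(int MscVer)
{
    return MscVer == 1934 || MscVer == 1935;
}

#if defined(_MSC_VER)
// Fails the build early instead of shipping DLLs with a misplaced beam.
static_assert(!IsKnownAffectedMsvc(_MSC_VER),
              "MSVC 14.34/14.35 miscompiled our beam code; build with 14.29 instead.");
#endif
```

Placed in a header that every module includes, this would turn a subtle runtime bug into an immediate and readable build error.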
A summary
The report that the issue was still happening came in at 9:40 AM; by 3:16 PM everyone was on the 2019 compiler and the issue was officially resolved.
I am really glad we had experience with a compiler bug before. If not for that, I don't think I would've jumped to this possibility so quickly,
and we would likely have had a much longer resolution time.
Time to analyze!
As the dust settled, I finally had time to sit down and analyze what was actually going wrong.
In the midst of the chaos some theories were flung around.
"The beam is coming from 0,0,0" is not a bad first guess, but when you actually test it yourself you quickly realize it's not a fixed position.
Playing around with it myself, I noticed it felt like it was somehow offset relative to the gun barrel.
Turns out I wasn't too far off. (Spoilers)
But let's take a look at the code at hand!
for (int i = 0; i < Beams.Num(); i++)
{
    if (Beams[i].IsValid())
    {
        FVector Start;
        if (i == 0)
        {
            Start = Weapon->GetMuzzleSocketLocation();
        }
        else
        {
            Start = Beams[i].GetBeamOrigin();
        }
        FVector End;
        if (Beams[i].HasValidHitResult())
        {
            End = Beams[i].GetHitResult().ImpactPoint;
        }
        else
        {
            End = Beams[i].GetBeamOrigin() + (Beams[i].GetBeamDirection() * Beams[i].GetRange());
        }
        if (Beams[i].GetParticleSystem() == nullptr)
        {
            FTransform SpawnTransform;
            SpawnTransform.SetLocation(Start);
            Beams[i].SetParticleSystem(UFXHelper::SpawnVFXWithTintColor(Weapon, BulletFX.BeamParticleEffect, SpawnTransform, BulletFX.TintColor));
        }
        EmittersCount = Beams[i].GetParticleSystem()->Template->Emitters.Num();
        for (int EmitterIndex = 0; EmitterIndex < EmittersCount; EmitterIndex++)
        {
            Beams[i].GetParticleSystem()->SetWorldLocation(Start);
            Beams[i].GetParticleSystem()->SetBeamSourcePoint(EmitterIndex, Start, 0);
            Beams[i].GetParticleSystem()->SetBeamEndPoint(EmitterIndex, End);
        }
    }
}
So nothing too crazy here, just some code to set the Start and End points of our beam Particle Effects,
along with a fallback: if a beam doesn't have a particle system yet, we spawn one.
Testing setup
Before I could dig further into the issue, I obviously had to reproduce it.
As I previously mentioned, the compiler version I had was seemingly working just fine.
I did have a faulty compiler installed, but Unreal would still use the 2019 version.
Luckily it is pretty easy to force Unreal to use the faulty compiler version instead.
I just had to open %appdata%/Unreal Engine/UnrealBuildTool/BuildConfiguration.xml
and add the following XML:
<?xml version="1.0" encoding="utf-8" ?>
<Configuration xmlns="https://www.unrealengine.com/BuildConfiguration">
    <WindowsPlatform>
        <Compiler>VisualStudio2022</Compiler>
        <CompilerVersion>14.35.32215</CompilerVersion>
    </WindowsPlatform>
</Configuration>
Stepping through
Now that I had it using the problem compiler, I could start debugging the issue further!
Stepping through the code revealed something very peculiar.
After executing "Start = Weapon->GetMuzzleSocketLocation();" the Y value of Start was never set.
That's definitely not supposed to happen. And while it was incredibly unlikely that my weapon was exactly at Y 0, I did double-check this and rule it out.
No matter what happened, the Y value was always 0, even though the function definitely returned a vector with a non-zero Y value.
So it seems it was a bit of a combination of our previous guesses: the X and Z values corresponded to the barrel location,
but the Y value was always 0.
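To see why this looked like an "offset" rather than a fixed position, here is a tiny model of the symptom. Vec3, ObservedStart and BeamOffset are illustrative names, not part of our code base, and the model simply hard-codes what the debugger showed: X and Z arrive, Y stays at 0.

```cpp
#include <cassert>

// Simplified stand-in for a vector type; not Unreal's FVector.
struct Vec3 { float X, Y, Z; };

// Models the observed symptom: X and Z match the muzzle, Y is stuck at 0.
inline Vec3 ObservedStart(Vec3 Muzzle)
{
    return {Muzzle.X, 0.f, Muzzle.Z};
}

// Difference between where the beam should start and where it was drawn.
inline Vec3 BeamOffset(Vec3 Muzzle)
{
    const Vec3 Seen = ObservedStart(Muzzle);
    return {Muzzle.X - Seen.X, Muzzle.Y - Seen.Y, Muzzle.Z - Seen.Z};
}
```

The offset is always (0, Y, 0), so the beam follows the weapon on two axes and drifts further from the barrel the further the weapon is from Y 0. That matches both guesses halfway: it is neither the world origin nor a constant offset.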
Taking it a step further
So now we roughly know what is happening. And I could've left it at that.
But I wanted to dig deeper into the issue and look at the assembly generated by the compiler.
Maybe there are some glaring differences?
"Correct" (14.29.30145)
;Start = Weapon->GetMuzzleSocketLocation();
A0007FF86E3942F2 mov r8,qword ptr [r13+0D0h]
A0007FF86E3942F9 mov rax,qword ptr [r8+150h]
A0007FF86E394300 test rax,rax
A0007FF86E394303 je UWeaponFiringBeamComponent::DrawBeam+39Dh (07FF86E39430Dh)
A0007FF86E394305 add rax,1F0h
A0007FF86E39430B jmp UWeaponFiringBeamComponent::DrawBeam+3A4h (07FF86E394314h)
A0007FF86E39430D mov rax,qword ptr [__imp_FTransform::Identity (07FF86E654460h)]
A0007FF86E394314 movups xmm0,xmmword ptr [rax]
A0007FF86E394317 mov rcx,qword ptr [r8+368h]
A0007FF86E39431E movups xmm1,xmmword ptr [rax+10h]
A0007FF86E394322 movaps xmmword ptr [rbp+220h],xmm0
A0007FF86E394329 movups xmm0,xmmword ptr [rax+20h]
A0007FF86E39432D movaps xmmword ptr [rbp+240h],xmm0
A0007FF86E394334 test rcx,rcx
A0007FF86E394337 je UWeaponFiringBeamComponent::DrawBeam+3E1h (07FF86E394351h)
A0007FF86E394339 mov r8,qword ptr [r8+3E0h]
A0007FF86E394340 lea rdx,[rbp+2B0h]
A0007FF86E394347 call qword ptr [__imp_USkeletalMeshSocket::GetSocketTransform (07FF86E6578D0h)]
;Problem Area
A0007FF86E39434D movups xmm1,xmmword ptr [rax+10h] ;XMM1 = 0000000042F32217-44EE58E2C4F55432
A0007FF86E394351 movaps xmm2,xmm1 ;XMM2 = 0000000042F32217-44EE58E2C4F55432
A0007FF86E394354 movaps xmm0,xmm1 ;XMM0 = 0000000042F32217-44EE58E2C4F55432
A0007FF86E394357 shufps xmm0,xmm1,0AAh ;XMM0 = 42F3221742F32217-42F3221742F32217
A0007FF86E39435B shufps xmm2,xmm1,55h ;XMM2 = 44EE58E244EE58E2-44EE58E244EE58E2
A0007FF86E39435F unpcklps xmm1,xmm2 ;XMM1 = 44EE58E244EE58E2-44EE58E2C4F55432
A0007FF86E394362 movss dword ptr [rbp+48h],xmm0 ;0x00000066455797A8 = 42F32217
A0007FF86E394367 mov eax,dword ptr [rbp+48h] ;RAX = 0000000042F32217
A0007FF86E39436A movsd mmword ptr [rbp+40h],xmm1 ;0x0000006645579690 = 00007FF86E6F7688
A0007FF86E39436F movsd mmword ptr [Start],xmm1 ;Sets X and Y
;}
A0007FF86E394375 jmp UWeaponFiringBeamComponent::DrawBeam+44Dh (07FF86E3943BDh)
;else
;{
;Start = Beams[i].GetBeamOrigin();
A0007FF86E394377 cmp edi,dword ptr [rbx+8]
A0007FF86E39437A mov eax,r12d
A0007FF86E39437D mov dword ptr [rsp+78h],edi
A0007FF86E394381 cmovl eax,r14d
A0007FF86E394385 test eax,eax
A0007FF86E394387 jne UWeaponFiringBeamComponent::DrawBeam+43Dh (07FF86E3943ADh)
A0007FF86E394389 lea rax,[rsp+78h]
A0007FF86E39438E mov qword ptr [rbp+0E8h],rbx
A0007FF86E394395 lea rcx,[rbp+0E0h]
A0007FF86E39439C mov qword ptr [rbp+0E0h],rax
A0007FF86E3943A3 call DispatchCheckVerify<void,<lambda_514a2dba2fe57b19bcfe7814a18ccbb3> > (07FF86E61D840h)
A0007FF86E3943A8 nop
A0007FF86E3943A9 int 3
A0007FF86E3943AA mov rcx,qword ptr [rbx]
A0007FF86E3943AD movsd xmm0,mmword ptr [rsi+rcx+4]
A0007FF86E3943B3 mov eax,dword ptr [rsi+rcx+0Ch]
A0007FF86E3943B7 movsd mmword ptr [Start],xmm0
;}
;FVector End;
;if (Beams[i].HasValidHitResult())
A0007FF86E3943BD cmp edi,dword ptr [rbx+8]
A0007FF86E3943C0 mov dword ptr [rsp+38h],eax ;Sets Z
"Wrong" (14.35.32215)
;Start = Weapon->GetMuzzleSocketLocation();
A0007FF86DD7075F mov r8,qword ptr [r13+0D0h]
A0007FF86DD70766 mov rax,qword ptr [r8+150h]
A0007FF86DD7076D test rax,rax
A0007FF86DD70770 je UWeaponFiringBeamComponent::DrawBeam+37Ah (07FF86DD7077Ah)
A0007FF86DD70772 add rax,1F0h
A0007FF86DD70778 jmp UWeaponFiringBeamComponent::DrawBeam+381h (07FF86DD70781h)
A0007FF86DD7077A mov rax,qword ptr [__imp_FTransform::Identity (07FF86E02E468h)]
A0007FF86DD70781 movups xmm0,xmmword ptr [rax]
A0007FF86DD70784 mov rcx,qword ptr [r8+368h]
A0007FF86DD7078B movups xmm6,xmmword ptr [rax+10h]
A0007FF86DD7078F movaps xmmword ptr [rbp+1F0h],xmm0
A0007FF86DD70796 movups xmm0,xmmword ptr [rax+20h]
A0007FF86DD7079A movaps xmmword ptr [rbp+210h],xmm0
A0007FF86DD707A1 test rcx,rcx
A0007FF86DD707A4 je UWeaponFiringBeamComponent::DrawBeam+3BEh (07FF86DD707BEh)
A0007FF86DD707A6 mov r8,qword ptr [r8+3E0h]
A0007FF86DD707AD lea rdx,[rbp+280h]
A0007FF86DD707B4 call qword ptr [__imp_USkeletalMeshSocket::GetSocketTransform (07FF86E0318D0h)]
;Problem Area
A0007FF86DD707BA movups xmm6,xmmword ptr [rax+10h] ;XMM6 = 00000000430214D2-44EE05BBC4F4FD36
A0007FF86DD707BE movaps xmm8,xmm6 ;XMM8 = 00000000430214D2-44EE05BBC4F4FD36
;}
A0007FF86DD707C2 movaps xmm0,xmm6 ;XMM0 = 00000000430214D2-44EE05BBC4F4FD36
A0007FF86DD707C5 shufps xmm8,xmm6,55h ;XMM8 = 44EE05BB44EE05BB-44EE05BB44EE05BB
A0007FF86DD707CA movaps xmm7,xmm6 ;XMM7 = 00000000430214D2-44EE05BBC4F4FD36
A0007FF86DD707CD unpcklps xmm0,xmm8 ;XMM0 = 44EE05BB44EE05BB-44EE05BBC4F4FD36
A0007FF86DD707D1 movss dword ptr [Start],xmm0 ;Sets X
A0007FF86DD707D7 shufps xmm7,xmm6,0AAh ;XMM7 = 430214D2430214D2-430214D2430214D2
A0007FF86DD707DB jmp UWeaponFiringBeamComponent::DrawBeam+435h (07FF86DD70835h)
;else
;{
;Start = Beams[i].GetBeamOrigin();
A0007FF86DD707DD cmp esi,dword ptr [rdi+8]
A0007FF86DD707E0 mov eax,r12d
A0007FF86DD707E3 mov dword ptr [rsp+78h],esi
A0007FF86DD707E7 cmovl eax,r15d
A0007FF86DD707EB test eax,eax
A0007FF86DD707ED jne UWeaponFiringBeamComponent::DrawBeam+410h (07FF86DD70810h)
A0007FF86DD707EF lea rax,[rsp+78h]
A0007FF86DD707F4 mov qword ptr [rbp+0C8h],rdi
A0007FF86DD707FB lea rcx,[rbp+0C0h]
A0007FF86DD70802 mov qword ptr [rbp+0C0h],rax
A0007FF86DD70809 call DispatchCheckVerify<void,<lambda_514a2dba2fe57b19bcfe7814a18ccbb3> > (07FF86DFF7850h)
A0007FF86DD7080E nop
A0007FF86DD7080F int 3
A0007FF86DD70810 mov rax,qword ptr [rdi]
A0007FF86DD70813 movsd xmm1,mmword ptr [rax+r14+4]
A0007FF86DD7081A movd xmm7,dword ptr [rax+r14+0Ch]
A0007FF86DD70821 movaps xmm0,xmm1
A0007FF86DD70824 shufps xmm0,xmm0,55h
A0007FF86DD70828 movaps xmm6,xmm1
A0007FF86DD7082B movaps xmm8,xmm0
A0007FF86DD7082F movsd mmword ptr [Start],xmm1
;}
;FVector End;
;if (Beams[i].HasValidHitResult())
A0007FF86DD70835 cmp esi,dword ptr [rdi+8]
A0007FF86DD70838 mov eax,r12d
A0007FF86DD7083B movss dword ptr [rsp+38h],xmm7 ;Sets Z
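To make the register comments in the listings a bit easier to follow, here is a small model of the two shuffle instructions in the "Problem Area". It treats an XMM register as four 32-bit lanes, with lane 0 being the rightmost eight hex digits of the dump. Lanes, Shufps and Unpcklps are names made up for this sketch; the lane selection follows the documented behaviour of the real instructions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four 32-bit lanes of an XMM register, lane 0 first.
// A dump like 0000000042F32217-44EE58E2C4F55432 reads as lanes {C4F55432, 44EE58E2, 42F32217, 0}.
using Lanes = std::array<std::uint32_t, 4>;

// shufps Dst, Src, Imm: the two low result lanes are picked from Dst,
// the two high result lanes from Src, each selected by two bits of Imm.
inline Lanes Shufps(const Lanes& Dst, const Lanes& Src, unsigned Imm)
{
    return { Dst[Imm & 3], Dst[(Imm >> 2) & 3], Src[(Imm >> 4) & 3], Src[(Imm >> 6) & 3] };
}

// unpcklps Dst, Src: interleaves the two low lanes of both operands.
inline Lanes Unpcklps(const Lanes& Dst, const Lanes& Src)
{
    return { Dst[0], Src[0], Dst[1], Src[1] };
}
```

Feeding in the lanes from the "Correct" listing, Shufps with 0AAh broadcasts lane 2 (Z), Shufps with 55h broadcasts lane 1 (Y), and Unpcklps of the original register with the Y broadcast gives {X, Y, Y, Y}. So after the unpack, the low 64 bits of the register hold X and Y side by side, and Z lives in a separate register.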
Now maybe many of you are like me and don't really understand what's going on here at first glance.
And honestly, even after researching it for a while, I still don't have a very thorough understanding.
But at least I think I get the gist of the issue.
In the good version the compiler writes Start with a 64-bit store (movsd), and in the bad one it uses a 32-bit store (movss).
So I am assuming that the 64-bit store covers the 32-bit X and the 32-bit Y, which sit next to each other in memory, so both get set.
The 32-bit store only writes the 32-bit X, so the Y never gets touched and always stays 0.
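The effect of the store width can be imitated in plain C++ without any assembly. The sketch below assumes a vector laid out as three consecutive 32-bit floats, as FVector is in UE4, and uses memcpy to stand in for the 8-byte and 4-byte stores. Vec3 and StoreLow are illustrative names, and this only mimics the symptom; it does not reproduce the compiler bug itself.

```cpp
#include <cassert>
#include <cstring>

// Assumed layout: three consecutive 32-bit floats, like a UE4 FVector.
struct Vec3 { float X, Y, Z; };
static_assert(sizeof(Vec3) == 12, "expects three packed 32-bit floats");

// Copies the first StoreBytes bytes of a packed {X, Y} pair into a zeroed vector.
// StoreBytes == 8 mimics the 64-bit store (movsd), StoreBytes == 4 the 32-bit one (movss).
inline Vec3 StoreLow(float X, float Y, std::size_t StoreBytes)
{
    const float Packed[2] = {X, Y};
    Vec3 Start{0.f, 0.f, 0.f};
    std::memcpy(&Start, Packed, StoreBytes);
    return Start;
}
```

With StoreBytes set to 8 both X and Y land in the vector; with 4 only X is written and Y keeps its old value of 0, which is exactly what the debugger showed. Z is written separately in both listings, which is why it was never affected.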
What's next?
I am not sure what's next. I have been trying to recreate the issue in a clean Unreal Engine 4.27 project and so far have been unsuccessful.
Which means there are aspects of this issue I haven't figured out yet. Maybe part of the issue lies in the changes we've made to our version of UE 4.27.
Or maybe there is another part of the code somewhere that is somehow affecting it that I haven't noticed yet.
Regardless, this doesn't seem to be a very widespread or easy-to-reproduce issue.
But maybe if someone else runs into this they will now know what is happening and how to resolve it!
I will probably keep investigating this when I have time and will keep you updated if I find anything new!