It is All about Traffic, and Programming
Search
Archives
0 visitors online now
0 guests, 0 members

Debugging

The Move Constructor: A Linking Pitfall of Qt Library with Different Compilers

Move constructor is a C++11 feature from ISO/IEC14882:2011.  It enhanced the old-school C++ copy constructor by moving the resource of an object, instead of member-wise copying.  It is faster than a copy constructor.  Thus a C++11 compiler will generate six special member functions for a C++ object, including  move constructor and move assignment operator:

  • Default Constructor
  • Copy Constructor
  • Copy Assignment Operator
  • Destructor
  • Move Constructor
  • Move Assignment Operator

We are not going to discuss C++11 Move Constructor in details here –  better details and better explanations are there, and there.

The motivation of  this article,  is not to discuss the semantics or usage of Move Constructor,  but an interesting debugging experience, and the lessons learned which involves C++11 move constructor.

Long story short – I have been working on developing an advanced Qt-based C++ plugin DLL for a host software developed in Qt 4.8.7.   I was using Visual C++ 2010 compiler, and the Qt for VS2010 official release for commercial applications.

The plugin DLL is supposed to be dynamically loaded by the host via Qt’s QLibrary API (see my previous article on this subject), which internally will call Microsoft’s Win32 API LoadLibrary.  Somehow, I got the following error, when trying to load the DLL into the host software:

34f8:3518 @ 01570140 – LdrpGetProcedureAddress – INFO: Locating procedure “AcdssPluginFactory” by name
34f8:3518 @ 01570140 – LdrpNameToOrdinal – WARNING: Procedure “AcdssPluginFactory” could not be located in DLL at base 0x0B630000
34f8:3518 @ 01570140 – LdrpReportError – WARNING: Locating export “AcdssPluginFactory” for DLL “Unknown” failed with status: 0xc0000139
34f8:1168 @ 01570265 – LdrpReportError – ERROR: Locating export “??4QString@@QAEAAV0@$$QAV0@@Z” for DLL “C:\Program Files (x86)\…\plugins\acdssplugin.DLL” failed with status: 0xc0000139

The above debug messages were generated by GFlags Utility, which comes with Debugging Tools for Windows. You can find the most advanced debugging techniques including how to use GFlags utility from the excellent book Inside Windows Debugging.  In our case, we just need to set “Show Loader Snaps” flag, as shown in the figure.  This is a typical way of debugging LoadLibrary failures.

2016-10-12_21-41-26

Looking at the debug messages, it is simply pointing out that LoadLibrary fails because an exported function is missing from the image space – its C++ mangled name as  “??4QString@@QAEAAV0@$$QAV0@@Z”,  which can be interpreted as:


    ? = C++ mangled symbol
    ?4 = "operator="
    QString@ = the enclosing class, hence method name: QString::operator=
    @ = end of the scope
    Q = public
    A = no const and volatile qualifier
    E = __thiscall
    AAV0 = const ref to the first named class (QString) - the return value (const QString &)
    @ = end return type
    $$ = move constructor &&
    QAV0 = const pointer to first named class - QString * const
    @ = end parameter type
    @ = end parameters
    Z = function

So clearly, the host software was complaining that the Move Constructor of QString can not be located in any of the image space.  Upon further inspection of the host software, it is discovered it was compiled using VS2005.  QtCore4.dll does not have the Move Constructor generated with VC++ 2005 compiler.  

The solution is simple – just rebuilt the Qt 4.8.7 using VC++ 2005, and rebuilt the plugin DLL.  Above all, the lessons learned are:

Do NOT mix Qt library compiled with different compilers, EVEN with the same Qt Version.  Period.

Dissecting Vissim COM Internal from Inside Out (3)

In order to study the life cycle of CCOMVissim,  we have to track the calling stack of comsvr.dll.  This DLL is the host of CCOMVissim and other helper classes of Vissim COM functionalities. Please note it is not possible and not an option to perform static code analysis using a typical debugger. This is because CodeMeter hardware chip employs sophisticated anti-debugging mechanism, black-listing all known static/dynamic analysis debuggers such as OllyDBG or IDAPro .  We’d have to return to comsrv.DLL  – and recall in this post – the 58 exported functions, something like below:

vissim-com-export-index-1-30

As an effective approach (for interoperability among Vissim’s own various licensed modules),  let’s start creating a DLL project, using the same name “comsrv.DLL”,  as the one in the Exe folder of Vissim installation.

As a good background reading on exporting  C++ classes in a DLL,  here is an detailed post CodeProject: Export C++ classes from a DLL

Following this new Visual Studio C++ DLL project,  create a new class called CCOMVissim  – pay attention to the class definition, and the empty methods, and the declaration “__declspec(dllexport)“.  By default, the calling convention is “thiscall“.  Because we know Vissim is compiled in Visual C++ compiler apriori, thus name mangling wouldn’t be an issue (because of the same C++ compiler).

2015-06-05_14-59-13

Now, compile this project, and a dll called comsrv.dll is generated.  After this new DLL is generated,  de-compile it to verify the exported functions (see the figure below):

2015-06-05_15-02-18

We just created a “proxy” DLL with the same name as Vissim’s original comsrv.dll.  We emulated some of the original exported functions and classes.  The point is,  as long as the same list of  functions  are exported by this new dll, it is “loadable” by Vissim.  Therefore, we are  enableed to append additional code to track, manage, and manipulate the calling stack of those class methods or functions relevant to the life-cycle of IVissimPtr. This will help accomplish the objective of enabling interoperability of Vissim COM functionalities within Vissim’s various DLL-based modules,  including Signal Control API DLL, External Driver DLL and such.

If you could understand up to this point,  there should be no problem for you to move on to a final working solution to the question asked in the beginning.  The efforts are yours, and the omissions are mine.  One last hint:

Rename the original “comsrv.dll” to a different name, e.g., “comsrv_org.dll”.  Redirect calls from inside the proxy comsrv.dll to the original dll.

Disable 8.3 Filename

What is 8.3 file name? See wikipedia at
http://en.wikipedia.org/wiki/8.3_filename

How to disable it?

Windows Server 2003 and Windows XP
To disable the 8.3 name creation on all NTFS partitions, type fsutil.exe behavior set disable8dot3 1 at a command prompt, and then press ENTER.

Windows 2000 and Windows NT

To disable the 8.3 name creation on all NTFS partitions, use the following steps: Important This section, method, or task contains steps that tell you how to modify the registry. However, serious problems might occur if you modify the registry incorrectly. Therefore, make sure that you follow these steps carefully. For added protection, back up the registry before you modify it. Then, you can restore the registry if a problem occurs. For more information about how to back up and restore the registry, click the following article number to view the article in the Microsoft Knowledge Base:
322756 (http://support.microsoft.com/kb/322756/ ) How to back up and restore the registry in Windows

1. Start Regedt32.exe and locate the following registry key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
2. Select the NtfsDisable8dot3NameCreation entry.

NOTE: By default, the value for this entry is set to 0.
3. On the Edit menu, click DWORD. Type a value of 1 in the Data field.
4. Click OK and then quit Regedt32.
5. Quit Windows NT and turn off your computer.
6. Restart your computer and Windows NT.

NOTE: The change to the NtfsDisable8dot3NameCreation registry entry affects only files, folders, and profiles that are created after the change. Files that already exist are not affected.

OpenMP Runtime Version Conflicts

I was working on a software project that I have been working on in the past two years…

The strangest thing happened when I tried to launch the main GUI of the system – I double clicked it, it flashed in the screen then disappeared immediately.

This doesn’t make sense at all – it was working just a few days ago and I didn’t recall I did anything…

I launched the debugger from the IDE, but the same thing happened while none of the source code break points get hit!

My first reaction is, something must go wrong in the compiler generated start-up code….

So, Plan B.

This time I launched the OllyDBG debuggger from which I started my executable.

….. oh….. modules loaded one by one…. then…… boop!

I got this error message from a console window, and the main program just halted:

OMP abort: Initializing lidguide40.dll, but found libiomp5md.dll already initialized.

This may cause preformance KMP_DUPLICATE_LIB _OK=TRUE to ignore this problem and force the program to continue anyway. Please note that the use of KMP_DUPLICAT_LIB_OK is Unsupported and using it may cause undefined behavioer. For more infomation,please contact Intel(R) Premier Support. ”

So what is going on?

Here is an explanation from Intel website:

Both libiomp5md.dll and libguide40.dll are Intel OpenMP Runtime library. The libiomp5md.dll is new Intel OpenMP* Compatibility library while the libguide40.dll is legacy OpenMP library. The Intel® IPP 6.x and Intel® Compiler 11.x threading libraries are switching from the legacy OpenMP* run-time library (libguide*) to the compatibility library (libiomp*).

The dynamic library, libguide40.dll, is still included in the current releases for backwards compatibility and will be removed in a future release.

The error is caused by multiple OpenMP libraries were linked in same application. For example, if a application link the libiomp5md.dll from IPP 6.x, at the same time, it also link other software, e.g third-party library, which link libguide40.dll, then the error arises because of the duplicate initialization of OpenMP Runtime library

So it turns out that two third party libraries I was using are dependent on TWO different versions of OpenMP libraries.

In all, it wasn’t a challenging problem that requires RE technique to find the cause, however it is an elusive problem as the Console warning window never show up, or it did but in a very very short time to prevent naked human eyes to notice.

Solution? For the time being I just add the environmental variable as suggested but later I am thinking of removing one of the third party libraries.

Inside Aimsun ANGApp.close

Aimsun ANGApp is a class available to python scripting. It is a very interesting class that TSS provides to its general users, and can be very powerful if the user really appreciates how it works.

In the past a few days, I have been cleaning up my previous codes, while working on a project proposal that may require Aimsun micro + meso. And for some reason, one code snippet that has worked previously crashes on ANGApp.close … …

This time I am pretty hesitant to write to TSS for help – though I am sure they are going to help, as always TSS is very kind and prompt in supporting their users – sometimes a little delay if they are busy. And for these couple of days, I know they are being working their tails off on the new release of Aimsun 6.1.0 – couldn’t be more busier.

So I decided not to bother them and find out myself.

All rightee ……the question is, what is really going on inside ANGAPP.close()? And why there is a crash? Let’s take a partial look at the code segment:

OK, save the current state of ESI by pushing it to stack. This is a no brainer.
.text:00B620E0 push esi

Note, ECX always holds the “this” pointer for C++ class. So, after mov ESI, ECX, ESI stores the starting address of the current ANGApp instance
.text:00B620E1 mov esi, ecx

Now, since ESI+0 is the starting address of the subject ANGApp instance, ESI + 8 must be the address for some private member-variable, and by looking at the instruction at 00B620EA , we can immediately get the hint that it must be the address for a GGui* pointer . Now, what about ESI + 4? Note the fact that ANGApp inherits from QObject, thus ESI+4 actually is the address for a protected pointer of type QObjectData. What about ESI+0xC? Hey, are you as curious as I am? That is a good question! That is actually the address for a pointer of type GKModel *.

Doh~!
.text:00B620E3 mov ecx, [esi+8]

IS the GGui* NULL?
.text:00B620E6 test ecx, ecx

.text:00B620E8 jz short loc_B620FC

If the GGui* is not NULL then call GGui.getActiveModel
.text:00B620EA call ds:__imp_?getActiveModel@GGui@@QBEPAVGKModel@@XZ ;

.text:00B620F0 mov ecx, [esi+8]

Now after invoking GGui.getActiveModel, EAX holds the returned pointer for GKModel
.text:00B620F3 push eax

Then GGui.closeDocument is called with the returned GKModel pointer from above instruction
.text:00B620F4 call ds:__imp_?closeDocument@GGui@@QAE_NPAVGKModel@@@Z

Okaydokay. So far so good! Now restore ESI.
.text:00B620FA pop esi
.text:00B620FB retn
.text:00B620FC ; —————————————————————————
.text:00B620FC

.text:00B620FC loc_B620FC:
[ESI + 0xC] is the address that holds the pointer of type GKModel*
.text:00B620FC mov ecx, [esi+0Ch]

Now ECX holds the pointer to GKModel
.text:00B620FF test ecx, ecx

.text:00B62101 jz short loc_B62112

Here is the tricky part. [ECX +0] is the address of the v-table of GKModel class
.text:00B62103 mov eax, [ecx]

Now EDX stores the address of the v-table of GKModel class
.text:00B62105 mov edx, [eax]

Index 1 means the first virtual function in the v-table. Note, GKModel inherits from GKObject. Therefore, the first virtual function in the v-table is the destructor ~GKObject()
.text:00B62107push 1

Now call the destructor
.text:00B62109 call edx

Reset the GKModel* pointer to zero.
.text:00B6210B mov dword ptr [esi+0Ch], 0

This is perfect code logically showing no problem at all! And it turns out that the crash was caused by a corrupted memoy I artificially manipulated which screw up ANGApp’s initernal data!