
Stackwalker does not return

Jun 30, 2010 at 7:30 AM

We are trying to use StackWalker to create stack traces on exceptions under Windows (under Linux we use backtrace; we had not seen the StackTrace project before...).

Some colleagues with newer, and therefore faster, hardware (we still use Windows XP as the operating system for development) have problems when they run our internal regression tests (using cppunit).

They have no problem in "debug" mode, but with all optimizations (including Whole Program Optimization) turned on (we call it "production" mode, since we have a third mode between "debug" and "production") the regression tests sometimes get stuck in StackWalker's stack trace creation function.

Since we cannot debug it, because the problem does not occur in "debug" mode, we need your help.

Are there any compiler/linker settings which we should not use?

Which code in StackWalker could be problematic and might produce that behaviour?

Any help or advice is welcome.

Regards,

Armin

Jun 30, 2010 at 7:35 AM
I do not recommend using StackWalker for logging exceptions.
You should consider using MiniDumpWriteDump instead. It is much better than walking the call stack, and it also works for release builds! On your development machine you can then display the complete call stack of all threads with source line info!

It's way better than the output of the stack walking...

Jun 30, 2010 at 7:48 AM

Sorry, we do not log the exceptions with Stackwalker.

We create (new) exceptions containing the stack trace created via StackWalker or backtrace. Logging is done via the output operator of our exception classes, using log4cxx.

We also catch SEH exceptions, and there the stack trace is essential.

Jun 30, 2010 at 7:51 AM
Do you have a small repro project which will show the problem?
Jun 30, 2010 at 7:56 AM

Well, that is the main problem: it is not deterministic and, as I mentioned above, it only occurs on specific machines.

So I hoped you would have some hints, perhaps also on how to reproduce it deterministically...

Jun 30, 2010 at 8:00 AM
It might depend on the exact version of dbghelp.dll!
Can you check the version of dbghelp.dll on the problematic computers? Which version is loaded into the process?
Jun 30, 2010 at 8:11 AM

That is a good hint.

Depending on the PATH settings, the loaded DLL might not always be the one in C:\WINDOWS\system32.

We will check that, thanks a lot.

Jun 30, 2010 at 8:45 AM

Well, now the problem has also occurred on a "slow" machine.

dbghelp.dll: 5.1.2600.5512

which is the stock Windows XP library in the system32 folder.

Additional info: we use VS 2005 (VC8)

Jun 30, 2010 at 9:30 AM

Now I have hit the same problem on my machine for the first time. I have added some logging to our StackWalkerWin32 class (which is derived from StackWalker).

We have overridden StackWalker::OnOutput and keep only the messages that contain source information:

void StackWalkerWin32::OnOutput(LPCSTR buffer)
{
    LDEBUG(logger, __func__);
    std::string message(buffer);
    LDEBUG(logger, __func__ << " " << message);
    // Keep only lines that carry source information: they contain "):"
    // and none of the "... not available" placeholders.
    if (boost::algorithm::contains(message, "):") && !boost::algorithm::contains(message, "not available"))
    {
        boost::algorithm::trim(message);
        trace->push_back(message);
        LDEBUG(logger, __func__ << " add to trace: " << message);
    }
}
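To make the filter condition easier to check in isolation, here is a minimal sketch of the same test written with std::string::find instead of boost. The helper name hasSourceInfo is hypothetical and not part of StackWalker or of our class; it only mirrors the condition shown above.

```cpp
#include <string>

// Hypothetical helper (not part of StackWalker): mirrors the condition used in
// StackWalkerWin32::OnOutput, but with std::string::find instead of boost.
bool hasSourceInfo(const std::string& message)
{
    // A line with source information contains "):" ...
    const bool hasLineMarker = message.find("):") != std::string::npos;
    // ... and none of the "... not available" placeholders.
    const bool hasPlaceholder = message.find("not available") != std::string::npos;
    return hasLineMarker && !hasPlaceholder;
}
```

With this condition, the "not available" lines and the "ERROR: ..." lines from the log below are both rejected, while a line of the assumed form "c:\src\main.cpp (42): main" would be kept.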

Apparently StackWalker is stuck in an endless loop. The log keeps repeating these messages for the same two addresses:

...

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput 0282F430 ((module-name not available)): (filename not available): (function-name not available)

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput ERROR: SymGetSymFromAddr64, GetLastError: 126 (Address: 0282DE40)

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput ERROR: SymGetLineFromAddr64, GetLastError: 126 (Address: 0282DE40)

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput ERROR: SymGetModuleInfo64, GetLastError: 1114 (Address: 0282DE40)

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput 0282DE40 ((module-name not available)): (filename not available): (function-name not available)

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput ERROR: SymGetSymFromAddr64, GetLastError: 126 (Address: 0282F430)

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput ERROR: SymGetLineFromAddr64, GetLastError: 126 (Address: 0282F430)

2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput
2010-06-30/10:27:14.525 [0x00001218] DEBUG common.stackwalker StackWalkerWin32::OnOutput - Mordor::StackWalkerWin32::OnOutput ERROR: SymGetModuleInfo64, GetLastError: 1114 (Address: 0282F430)

...

Jun 30, 2010 at 11:10 AM
Edited Jun 30, 2010 at 1:23 PM

I have added a std::set named visitedOffsets to ShowCallStack, into which I insert the current s.AddrPC.Offset.

I changed

if (s.AddrPC.Offset == s.AddrReturn.Offset)
to
if (s.AddrPC.Offset == s.AddrReturn.Offset  || visitedOffsets.count(s.AddrPC.Offset))

This workaround works for us.

Maybe you have a better solution...
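The idea behind the workaround can be sketched outside of StackWalker roughly as follows. This is only an illustration under stated assumptions: Frame and walk are hypothetical stand-ins, the real loop in ShowCallStack gets its frames from StackWalk64, and the frames here are replayed cyclically to imitate the two alternating addresses from the log.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for the two STACKFRAME64 fields the check reads.
struct Frame {
    std::uint64_t pc;   // plays the role of s.AddrPC.Offset
    std::uint64_t ret;  // plays the role of s.AddrReturn.Offset
};

// Replays `frames` cyclically to imitate an unwinder that never terminates,
// and returns how many frames are reported before the walk stops.
std::size_t walk(const std::vector<Frame>& frames, std::size_t hardLimit)
{
    std::set<std::uint64_t> visitedOffsets;
    std::size_t reported = 0;
    for (std::size_t i = 0; i < hardLimit && !frames.empty(); ++i) {
        const Frame& s = frames[i % frames.size()];
        // Original check plus the workaround: stop on a frame that returns
        // to itself or on a program counter that was already seen.
        if (s.pc == s.ret || visitedOffsets.count(s.pc)) {
            break;
        }
        visitedOffsets.insert(s.pc);  // remember the offset only after the check
        ++reported;
    }
    return reported;
}
```

Feeding it the two frames 0x0282F430 and 0x0282DE40, each returning to the other, stops the walk after two reported frames instead of running up to the limit. One caveat of this approach: a legitimately recursive function shows the same program counter in several frames, so such a call stack would be cut off at the first repetition.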

Marked as answer by jkalmbach on 9/15/2014 at 2:05 AM