DoStackSnapshot cuts off early?

I'm seeing an odd bug with asynchronous invocations to DoStackSnapshot as described in the MSDN magazine article. Sometimes, the stackwalk starts by giving me a frame (occasionally two or three but usually just the one) -- but then it cuts out early and gives me functionID = 0. After the 0, it won't call the callback anymore, even though I know that there's a lot of frames above the one it gave me. In fact, there's no reason for an unmanaged run of frames to even BE there, as far as I know. If I call SymFromAddr on the frames it gives me, I get results like CreateApplicationContext, which if I understand correctly is waaay up the callstack, in the guts of CLR.

I'm very confused as to why this happens. It starts out giving me valid frames, and then just cuts out and gives me these bogus results instead. The documentation states quite clearly that I'm free to ignore funcID = 0 if I don't want unmanaged frames, but then I get useless stack fragments. I'm always returning S_OK from the callback and DoStackSnapshot returns S_OK as well, so there's no errors anywhere on the chain...
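For readers following along, here is a minimal sketch of the callback logic described above: collect managed frames and skip the FunctionID 0 entries that stand for runs of unmanaged frames. It is a simplified model, not the real callback. The names `FunctionId`, `HResult`, `kOk`, `ManagedStack`, and `onFrame` are hypothetical stand-ins so that it compiles without the CLR profiling headers, and the real StackSnapshotCallback takes more parameters than shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins so the sketch builds without corprof.h.
// The real FunctionID is a pointer-sized integer and the real
// callback returns an HRESULT.
using FunctionId = std::uintptr_t;
using HResult = long;
constexpr HResult kOk = 0;  // mirrors S_OK

// Collector passed to the callback through its clientData pointer.
struct ManagedStack {
    std::vector<FunctionId> frames;  // managed frames, leaf first
    unsigned unmanagedRuns = 0;      // callbacks that reported FunctionID 0
};

// Simplified callback body: a zero id marks a run of unmanaged frames,
// which is counted and skipped; everything else is recorded.
HResult onFrame(FunctionId funcId, void* clientData) {
    assert(clientData != nullptr);
    auto* stack = static_cast<ManagedStack*>(clientData);
    if (funcId == 0) {
        ++stack->unmanagedRuns;  // ignore, but let the walk continue
        return kOk;
    }
    stack->frames.push_back(funcId);
    return kOk;
}
```

Counting the zero entries instead of silently dropping them makes it easy to spot the symptom described here, where a walk ends right after an unmanaged run.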
  • Edited by Promit (MVP), Sunday, August 09, 2009 1:53 AM

Answers

  • Wednesday, August 12, 2009 9:48 PM, Shane Yuan (MSFT)
     Answer
    Hi Promit,

    How did you get your seed for DoStackSnapshot? Did your context point to the top-most managed frame (System.Windows.Forms.ni.dll!7acf080a())?

    Your seed is only useful if the top-most unmanaged frame is helper code in the CLR. The top-most frame in the call stack you posted, mscorwks.dll!__alldiv(), is helper code (unmanaged code) in the CLR. In this case, if DoStackSnapshot is not seeded or is seeded with the wrong context, DoStackSnapshot will not work as expected.

    David Broman wrote a comprehensive introduction to using DoStackSnapshot at http://blogs.msdn.com/davbr/archive/2005/10/06/profiler-stack-walking-basics-and-beyond.aspx that every writer of a CLR sampling profiler needs to read and follow. Here's his advice for a profiler that is doing an asynchronous, cross-thread, seeded stack walk while filling in the unmanaged holes:

    1. You suspend the target thread (target thread’s suspend count is now 1)
    2. You get the target thread’s current register context
    3. You determine if the register context points to unmanaged code (e.g., call ICorProfilerInfo2::GetFunctionFromIP(), and see if you get back a 0 FunctionID)
    4. In this case the register context does point to unmanaged code, so you perform an unmanaged stack walk until you find the top-most managed frame (D)
    5. You call DoStackSnapshot with your seed context. CLR suspends target thread again: its suspend count is now 2.
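The five steps above can be sketched as control flow. This is only an illustration under stated assumptions, not profiler code: `RegisterContext`, `WalkHooks`, and `seededWalk` are hypothetical names, and each hook stands in for the real call a profiler would make (SuspendThread/ResumeThread, GetThreadContext, ICorProfilerInfo2::GetFunctionFromIP, an unmanaged walker such as StackWalk64, and DoStackSnapshot).

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for the Win32 CONTEXT structure; only the
// instruction pointer is inspected in this sketch.
struct RegisterContext {
    std::uint64_t ip = 0;
    std::uint64_t sp = 0;
};

// Hypothetical hooks wrapping the real OS and profiling API calls.
struct WalkHooks {
    std::function<void()> suspend;                    // e.g. SuspendThread
    std::function<void()> resume;                     // e.g. ResumeThread
    std::function<RegisterContext()> getContext;      // e.g. GetThreadContext
    std::function<bool(std::uint64_t)> isManagedIp;   // e.g. GetFunctionFromIP != 0
    // Unwinds one unmanaged frame in place; false when the stack is exhausted.
    std::function<bool(RegisterContext&)> unwindOneFrame;
    std::function<void(const RegisterContext&)> doStackSnapshot;
};

// Returns true if a managed frame was found and a seeded snapshot was taken.
bool seededWalk(const WalkHooks& h) {
    assert(h.suspend && h.resume && h.getContext && h.isManagedIp &&
           h.unwindOneFrame && h.doStackSnapshot);

    h.suspend();                                // step 1: suspend the target
    RegisterContext ctx = h.getContext();       // step 2: current registers
    bool managed = h.isManagedIp(ctx.ip);       // step 3: managed or not?
    while (!managed && h.unwindOneFrame(ctx)) { // step 4: walk unmanaged frames
        managed = h.isManagedIp(ctx.ip);
    }
    if (managed) {
        h.doStackSnapshot(ctx);                 // step 5: seed with the context
    }
    h.resume();
    return managed;
}
```

Note that the sketch hands the context to the snapshot hook exactly as the unwinder left it, rather than rebuilding it from individual registers.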


    Thanks,
    Shane
    • Marked As Answer by Promit (MVP), Thursday, August 13, 2009 6:24 PM

All Replies

  • Monday, August 10, 2009 10:45 PM, Shane Yuan (MSFT)
     
    Hi Promit,

    Can you compare the result from DoStackSnapshot with the result from the SOS command !clrstack? If an asynchronous stack walk is not seeded, DoStackSnapshot is likely to skip the beginning part of the stack. If DoStackSnapshot skips too many frames, it may appear that DoStackSnapshot cuts off early. Let me know what you find out with !clrstack.


    Thanks,
    Shane
Promit
Hi Shane, I caught it in VS and here's what I've got. This is a seeded stack walk. According to my stack walk, there is a single stack frame:
Microsoft.Xna.Framework.GameClock.CounterToTimeSpan

According to !clrstack, there's nothing decipherable at all:
!clrstack
OS Thread Id: 0x1040 (4160)
ESP EIP
0028f4dc 6df224ad [GCFrame: 0028f4dc]

According to VS, I'm somewhere inside CLR:
> mscorwks.dll!__alldiv() + 0x4d bytes
01f8fe1f()
System.Windows.Forms.ni.dll!7acf080a()
[Frames below may be incorrect and/or missing, no symbols loaded for System.Windows.Forms.ni.dll]
System.Windows.Forms.ni.dll!7aced275()
System.Windows.Forms.ni.dll!7acecc26()
System.Windows.Forms.ni.dll!7ac96776()
mscorwks.dll!_CallDescrWorker@20() + 0x33 bytes
mscorwks.dll!_CallDescrWorkerWithHandler@24() + 0x9f bytes
mscorwks.dll!MethodDesc::CallDescr() + 0x15a bytes
mscorwks.dll!MethodDesc::CallTargetWorker() + 0x1f bytes
mscorwks.dll!MethodDescCallSite::CallWithValueTypes() + 0x1a bytes
mscorwks.dll!ClassLoader::RunMain() - 0x39028 bytes
mscorwks.dll!Assembly::ExecuteMainMethod() + 0xa4 bytes
mscorwks.dll!SystemDomain::ExecuteMainMethod() + 0x416 bytes
mscorwks.dll!ExecuteEXE() + 0x49 bytes
mscorwks.dll!__CorExeMain@0() + 0x98 bytes
mscoree.dll!__CorExeMain@0() + 0x34 bytes
kernel32.dll!@BaseThreadInitThunk@12() + 0x12 bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes

!clrstack does return a perfectly legitimate stack for the other thread in the application. I've ONLY seen this problem happen with seeded walks. What I do is to attempt an unseeded walk first, and then seed if it fails.
Promit
  • Wednesday, August 12, 2009 11:28 PM, Promit (MVP)
     
    I'm following David Broman's algorithm pretty much verbatim, although it's likely that I have a bug in there somewhere. My code is here:
    http://code.google.com/p/slimtune/source/browse/trunk/SlimTuneCLR/Profiler.cpp

    The seeded stack walk begins around line 1027. Note that I am ignoring unmanaged holes for the time being; I just want clean managed stacks right now. (It gets a little hacky towards the end where it's trying to work around the incorrect stacks.) Let me know if anything seems off.

    I'm mainly confused because I do frequently get one or two valid stack frames before it cuts out. It seems like CLR should have a pretty solid idea of the stack if it gets that far...

    Oh, and my seeded stack walks also crash on x64, inside DoStackSnapshot. I'm still debugging that, but if you see anything obvious I'm missing, that would be greatly appreciated as well.
    • Edited by Promit (MVP), Wednesday, August 12, 2009 11:33 PM
    • Edited by Promit (MVP), Wednesday, August 12, 2009 11:32 PM
  • Thursday, August 13, 2009 12:21 AM, Shane Yuan (MSFT)
     
    Hi Promit,

    Due to legal concerns (given how many versions of different license agreements are out there), it's better for you to post the related code snippet here. Before doing so, can you first verify in the debugger that your context pointed to the top-most managed frame, System.Windows.Forms.ni.dll!7acf080a()?


    Thanks,
    Shane
  • Shane Yuan
    Hi Shane, looks like I've solved it. When stack walking, you initialize a STACKFRAME64 from the fields of CONTEXT. My mistake was that, once finished stack walking, I would clear out CONTEXT and copy just the same fields back over from the SF64. This is incorrect, as StackWalk64 updates the CONTEXT to be correct, and clearing it out basically corrupts necessary pieces of it. Once I passed the context directly to DoStackSnapshot without blowing away CONTEXT, the bad stacks stopped.

    I'm guessing that CLR needs the rest of the CONTEXT (the fields other than *SP, *BP, and *IP) only in certain situations, which is why it mostly worked. But it does need the entire context. I'm still looking into my x64 crash, but I'll post separately for that if I need to.
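To make the mistake concrete, here is a small portable model of it. Everything in it is hypothetical and simplified: `FakeContext` stands in for the Win32 CONTEXT, `FakeStackFrame` for the three addresses carried by a STACKFRAME64, and the extra registers are just examples of "the rest of the CONTEXT". It does not call StackWalk64 or DoStackSnapshot; it only shows what is lost when the seed is rebuilt from three fields.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the Win32 CONTEXT structure.
struct FakeContext {
    std::uint32_t contextFlags = 0;
    std::uint64_t ip = 0, sp = 0, bp = 0;
    std::uint64_t rbx = 0, rsi = 0, rdi = 0;  // examples of other registers
};

// Hypothetical stand-in for the addresses a STACKFRAME64 carries.
struct FakeStackFrame {
    std::uint64_t pc = 0, stack = 0, frame = 0;
};

// The mistake: start from a zeroed context and copy back only the
// instruction, stack, and frame pointers from the stack frame.
FakeContext buildSeedBuggy(const FakeStackFrame& sf) {
    FakeContext seed{};  // every other field is wiped here
    seed.ip = sf.pc;
    seed.sp = sf.stack;
    seed.bp = sf.frame;
    return seed;
}

// The fix: use the context exactly as the unmanaged walker left it.
FakeContext buildSeedFixed(const FakeContext& walked) {
    return walked;
}
```

Both seeds agree on the three pointers, which may be why the buggy version appears to work much of the time; they differ only in the fields the buggy version zeroes.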
    Promit

    Hi Promit,


    After the target thread is suspended, can you set ContextFlags to CONTEXT_FULL and CONTEXT_EXCEPTION_REQUEST before calling GetThreadContext? Let me know if it fixes your x64 crash.

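    // CONTEXT_EXCEPTION_REQUEST is defined in WinNT.h; the guard avoids a redefinition.
    #ifndef CONTEXT_EXCEPTION_REQUEST
    #define CONTEXT_EXCEPTION_REQUEST 0x40000000
    #endif

    CONTEXT context;
    memset(&context, 0, sizeof(CONTEXT));
    context.ContextFlags = (CONTEXT_FULL | CONTEXT_EXCEPTION_REQUEST);
    // thread is the handle of the already-suspended target thread.
    if (!GetThreadContext(thread, &context)) {
        // The call failed; GetLastError() gives the reason.
    }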


    Thanks,
    Shane

    Shane Yuan
    I actually switched to using StackWalk64 the whole way up on x64, as it seemed simpler and less error prone. It works quite well now. How badly did you want me to test this?
    Promit
    Hi Promit,

    It's totally up to you. If you're happy using StackWalk64, I'm going to close this question as answered if you don't mind.


    Thanks,
    Shane
    Shane Yuan
