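Recently I ran into a very strange crash. The process is 32-bit and runs on a 64-bit Windows Server 2008 machine; after running for a while it crashes. The post-mortem debugger is set to ADPlus. The dump it captured is a 1st chance mini dump, and the ADPlus log file records the following: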
--- 1st chance Process_Shut_Down exception ----
---------------------------------------------------------------
This process is shutting down!
This can happen for the following reasons
1) Someone killed the process with Task Manager or the kill command
2.) If this process is an MTS or COM+ server package, it could be
* exiting because an MTS/COM+ server package idle limit was reached.
3.) If this process is an MTS or COM+ server package,
* someone may have shutdown the package via the MTS Explorer or
* Component Services MMC snap-in.
4.) If this process is an MTS or COM+ server package,
* MTS or COM+ could be shutting down the process because an internal
* error was detected in the process (MTS/COM+ fail fast condition).
---------------------------------------------------------------
Occurrence happened at:
Debug session time: Wed Jan 4 15:22:46.633 2012 (GMT+8)
System Uptime: 8 days 0:59:56.555
Process Uptime: 0 days 0:09:04.503
Kernel time: 0 days 0:01:18.453
User time: 0 days 0:02:29.875
All thread stacks below ---
.101 Id: 1d98.ce0 Suspend: 0 Teb: 7ee89000 Unfrozen
 # ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
00 1835ff88 75a5339a 0060bb58 1835ffd4 77079ed2 ntdll!ZwWaitForWorkViaWorkerFactory+0x12
01 1835ff94 77079ed2 0060bb58 4640ad42 00000000 kernel32!BaseThreadInitThunk+0x12
02 1835ffd4 77079ea5 77086679 0060bb58 ffffffff ntdll!RtlInitializeExceptionChain+0x63
03 1835ffec 00000000 77086679 0060bb58 00000000 ntdll!RtlInitializeExceptionChain+0x36
Creating c:\\Crash_Mode__Date_01-04-2012__Time_15-15-1717\PID-7576__SERVMYHOST.EXE__1st_chance_Process_Shut_Down__full_214c_2012-01-04_15-22-46-637_1d98.dmp - mini user dump
Dump successfully written
=========================
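As a quick sanity check that the log and the dump file refer to the same process: the `Id: 1d98.ce0` field in the thread header is `pid.tid` in hexadecimal, while the dump file name carries the PID in decimal (`PID-7576`). A minimal sketch of that conversion follows; the helper name `parse_thread_header` is made up for illustration and is not part of ADPlus or WinDbg.

```python
import re

def parse_thread_header(line):
    """Return (pid, tid) as integers from a WinDbg-style thread header line.

    The header carries the ids as hexadecimal 'Id: <pid>.<tid>'.
    """
    m = re.search(r"Id:\s*([0-9a-fA-F]+)\.([0-9a-fA-F]+)", line)
    if m is None:
        raise ValueError("no 'Id: pid.tid' field found")
    return int(m.group(1), 16), int(m.group(2), 16)

# Thread header copied from the ADPlus log above.
header = ".101 Id: 1d98.ce0 Suspend: 0 Teb: 7ee89000 Unfrozen"
pid, tid = parse_thread_header(header)
print(pid, tid)  # 7576 3296
```

Hex `1d98` is decimal 7576, which matches the `PID-7576` in the dump file name, and `ce0` is the same thread that `!analyze -v` reports as `FAULTING_THREAD: 00000ce0`.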
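After opening the dump file in WinDbg, I found that, just as the log shows, there is only one thread, and its frames all look like perfectly normal system calls. The output of !analyze -v is as follows: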
FAULTING_IP:
+1902faf00a4df5c
00000000 ?? ???
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 00000000
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 0
FAULTING_THREAD: 00000ce0
DEFAULT_BUCKET_ID: STATUS_BREAKPOINT
PROCESS_NAME: servMyHost.exe
ERROR_CODE: (NTSTATUS) 0x80000003 - {EXCEPTION} Breakpoint A breakpoint has been reached.
EXCEPTION_CODE: (HRESULT) 0x80000003 (2147483651) - One or more arguments are invalid
MOD_LIST: <ANALYSIS/>
NTGLOBALFLAG: 400
APPLICATION_VERIFIER_FLAGS: 0
PRIMARY_PROBLEM_CLASS: STATUS_BREAKPOINT
BUGCHECK_STR: APPLICATION_FAULT_STATUS_BREAKPOINT
LAST_CONTROL_TRANSFER: from 7708471e to 77061f36
STACK_TEXT:
1835fe28 7708471e 000001c0 1835fedc 4640ad1e ntdll!ZwWaitForWorkViaWorkerFactory+0x12
1835ff88 75a5339a 0060bb58 1835ffd4 77079ed2 ntdll!TppWorkerThread+0x216
1835ff94 77079ed2 0060bb58 4640ad42 00000000 kernel32!BaseThreadInitThunk+0xe
1835ffd4 77079ea5 77086679 0060bb58 ffffffff ntdll!__RtlUserThreadStart+0x70
1835ffec 00000000 77086679 0060bb58 00000000 ntdll!_RtlUserThreadStart+0x1b
STACK_COMMAND: ~0s; .ecxr ; kb
FOLLOWUP_IP:
ntdll!ZwWaitForWorkViaWorkerFactory+12
77061f36 83c404 add esp,4
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: ntdll!ZwWaitForWorkViaWorkerFactory+12
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: ntdll
IMAGE_NAME: ntdll.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 4ce7ba58
FAILURE_BUCKET_ID: STATUS_BREAKPOINT_80000003_ntdll.dll!ZwWaitForWorkViaWorkerFactory
BUCKET_ID: APPLICATION_FAULT_STATUS_BREAKPOINT_ntdll!ZwWaitForWorkViaWorkerFactory+12
WATSON_STAGEONE_URL: http://watson.microsoft.com/StageOne/servIMSHost_exe/1_51_0_1026/4efda2a8/unknown/0_0_0_0/bbbbbbb4/80000003/00000000.htm?Retriage=1
Followup: MachineOwner
---------
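One detail in the output above is that the same value, 0x80000003, gets two different descriptions: `ERROR_CODE` reads it as an NTSTATUS ("Breakpoint"), while `EXCEPTION_CODE` reads it as an HRESULT ("One or more arguments are invalid"). The sketch below splits the value into the bit fields of each layout as I understand them (NTSTATUS: 2-bit severity, customer bit, 12-bit facility, 16-bit code; HRESULT: failure bit, 11-bit facility, 16-bit code). The function names are my own, and the field layout should be checked against the official documentation before relying on it.

```python
def decode_ntstatus(value):
    """Split a 32-bit value into NTSTATUS fields (layout assumed as described above)."""
    return {
        "severity": (value >> 30) & 0x3,   # 0 success, 1 informational, 2 warning, 3 error
        "customer": (value >> 29) & 0x1,
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
    }

def decode_hresult(value):
    """Split the same 32-bit value into HRESULT fields (layout assumed as described above)."""
    return {
        "failure": (value >> 31) & 0x1,
        "facility": (value >> 16) & 0x7FF,
        "code": value & 0xFFFF,
    }

code = 0x80000003
print(code)                   # 2147483651, the decimal shown next to EXCEPTION_CODE
print(decode_ntstatus(code))  # severity 2 (warning), facility 0, code 3
print(decode_hresult(code))   # failure bit set, facility 0, code 3
```

Read as an NTSTATUS, the value has warning severity rather than error severity, which fits a breakpoint. The "invalid arguments" text appears to come only from looking the same number up as an HRESULT, so it probably says nothing about what the process was doing.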
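The strangest part is that the exception is of type Breakpoint. As far as I know, a breakpoint (int 3) exception only occurs on an assert or when a debugger breaks in, but asserts have no effect in a release build, and no debugger was in use...
The same problem has occurred many times. Each dump contains only one thread; the call stacks differ from dump to dump, but they all look normal. Could an expert explain what is going on?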