System.AccessViolationException

Feb 10 at 11:13 AM
I have a new project running with WSPEventRouter, this time using version 3.0. Most of the time it works without any problem, but occasionally we get the following error in the Event Viewer:
Application: Tracevia.XVIAE.AVDService.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.AccessViolationException
Stack:
   at Microsoft.WebSolutionsPlatform.Common.NativeMethods+x86.PutBuffer(Byte[], UInt32, UInt32, IntPtr ByRef)
   at Microsoft.WebSolutionsPlatform.Common.SharedQueue.Enqueue(Byte[], UInt32)
   at Microsoft.WebSolutionsPlatform.PubSubManager.PublishManager.PublishNew(Byte[])
   at Microsoft.WebSolutionsPlatform.PubSubManager.PublishManager.Publish(System.Guid, System.Collections.Generic.Dictionary`2<Byte,System.String>, Byte[])
   at Microsoft.WebSolutionsPlatform.PubSubManager.PublishManager.Publish(System.Guid, Byte[])
   at Tracevia.XVIAE.Common.PublishSubscriber.Publisher.PublishMessage[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]](System.Guid, System.__Canon)
   at Tracevia.XVIAE.AVDService.BL.Service.SendDataToConsole()
   at System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.ThreadHelper.ThreadStart()
Do you have any idea what could be causing this? The service can run for weeks without any problem, but yesterday we kept getting this error for 6 hours.

Thank you for your time ^^
Coordinator
Feb 10 at 12:41 PM
I don't recall ever seeing this kind of issue. If the AV (access violation) occurs in the PutBuffer function itself, then it would appear that either the communication buffer which the thread passes between the managed and native layers has been corrupted, or the common portion of shared memory used by all the processes and threads using Wsp has been corrupted. Unfortunately, the stack trace you show doesn't provide enough info, and the AV is probably a result of the corruption rather than the root cause.

If the AV is actually coming from a function which PutBuffer calls, then the corruption could be in the buffer pointer itself.

If you get into this state again, you need to stop the Wsp router process and all apps using Wsp so the shared memory is released. Then you can safely restart the Wsp router and the apps.

Sorry I'm not much help.

Keith
Feb 10 at 1:16 PM
Thank you, Keith, for the quick response.

Is there a way to activate some sort of log so I can get more information on this error?

Also, WSPEventRouter is running on Windows Server 2012. Are there any known issues or configuration requirements for running WSPEventRouter on Server 2012?

Thank you,
Wheels
Coordinator
Feb 10 at 4:27 PM
I ran Wsp for more than a year on WS 2012 and don't ever recall hitting this issue. If you want logging at this level, you'll need to instrument the code yourself.
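
As a rough sketch of what such instrumentation could look like on the managed side: the wrapper below logs context around each publish call. `PublishDiagnostics` and `LoggedPublish` are hypothetical names, not part of Wsp, and the `publish` delegate stands in for whatever call the application already makes into Wsp. On .NET Framework 4, an AccessViolationException is treated as a corrupted state exception and skips ordinary catch blocks unless the method carries `[HandleProcessCorruptedStateExceptions]`, so the sketch uses that attribute only to record context before rethrowing. It assumes the TRACE symbol is defined and a trace listener is configured.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.ExceptionServices;
using System.Security;
using System.Threading;

// Hypothetical helper; not part of the Wsp libraries.
public static class PublishDiagnostics
{
    // Without this attribute, .NET Framework 4 does not deliver
    // AccessViolationException to the catch block below.
    [HandleProcessCorruptedStateExceptions]
    [SecurityCritical]
    public static void LoggedPublish(Action<Guid, byte[]> publish, Guid eventType, byte[] body)
    {
        int length = body == null ? 0 : body.Length;
        int threadId = Thread.CurrentThread.ManagedThreadId;

        // Record what is about to be published and from which thread.
        Trace.TraceInformation(
            "Publish start: event={0} bytes={1} thread={2} x64={3}",
            eventType, length, threadId, Environment.Is64BitProcess);

        try
        {
            // Assumption: this delegate wraps the application's existing Wsp publish call.
            publish(eventType, body);
        }
        catch (AccessViolationException ex)
        {
            Trace.TraceError(
                "Publish AV: event={0} bytes={1} thread={2}: {3}",
                eventType, length, threadId, ex);
            Trace.Flush();
            // Memory may already be corrupted, so let the process terminate.
            throw;
        }
    }
}
```

A call site would pass its existing publish call as the delegate, for example `PublishDiagnostics.LoggedPublish((id, data) => publisher.Publish(id, data), eventId, payload)`, where `publisher`, `eventId`, and `payload` are placeholders for the application's own objects. This only captures managed-side context; finding the corruption itself would still require instrumenting the native code.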

In the stack above, it shows x86 for the PutBuffer call. Are you running x86?
Feb 11 at 8:58 AM
Yes. Do you think that may cause the problem?
Feb 11 at 10:17 AM
The reason we are using the x86 version on the x64 machine is that we have some legacy DLLs that are compiled for x86, so our service also needs to be x86.
Coordinator
Feb 11 at 1:39 PM
I would run the Wsp router service as x64, and the app can use the x86 libraries. It's not a problem for them to coexist. Since the issue isn't consistent, it could be a bug in the .Net library, especially since I haven't seen the issue and I only use the x64 version.
Feb 17 at 3:59 PM
Edited Feb 17 at 4:51 PM
Never mind... I forgot to switch some DLLs around, since I was using a modified version of 3.0.
Coordinator
Feb 19 at 1:46 PM

I haven’t had this issue before and the config settings look OK. Role should be case insensitive but you could change it to “hub” and see if that changes things.

Coordinator
Feb 28 at 11:45 PM
I think I found the bug. It was in GetBuffer in SharedMemoryMgr.c.
Coordinator
Mar 3 at 5:38 PM
Did you get my fix from last Friday? The bug had been there for many years, and it would explain why you could run for a month and then get an exception. Since the event buffer would have had corrupted data, the error could have shown up in many places.