novice user

Dec 26, 2007 at 7:18 AM
Hello folks,
I have installed this system on my machine and am interested in using some features of this code. In my application I want a publisher to publish a message while multiple subscribers wait to listen for it. I feel this system suits my application, but I don't understand its flow. I would be grateful if anyone could help me understand the flow of the system and tell me how to run it and use the feature I mentioned above. I went through the code and I suspect the InitMemoryMgr method of the unmanaged code creates the queue in memory; please tell me if I'm wrong...
Coordinator
Dec 26, 2007 at 7:52 AM
It's much simpler.

1. Create an event class which inherits from the base event. You can find an example in the program.cs file in the EventPingPong sample.

2. In the publishing application, instantiate your event class, set its properties, and then publish it. You can find the lines of code to publish in the program.cs file in the WspEventServiceTest sample. The two relevant lines of code to publish are:

private PublishManager pubMgr = new PublishManager();
pubMgr.Publish(yourEvent.Serialize());

3. In the subscribing application, you can instantiate an empty instance of your event class so you can subscribe to it. Look at the WspEventListenTest sample. You will see where it instantiates a callback object, instantiates a SubscriptionManager, and passes it the callback object. Then you would subscribe to your event type. At that point your callback method will be invoked for each event. Since events are coming in on many threads, your code needs to be thread safe.
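
The thread-safety point in step 3 can be illustrated with a minimal sketch. It is written in Python rather than C# and uses no WSP classes; the names EventCounter, on_event, and deliver_concurrently are hypothetical. It only shows a callback that guards its shared state with a lock while being invoked from several threads at once.

```python
import threading


class EventCounter:
    """Hypothetical subscriber callback holder (not a WSP class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.count = 0
        self.last_payload = None

    def on_event(self, payload):
        # The lock keeps the update of shared state atomic across threads.
        with self._lock:
            self.count += 1
            self.last_payload = payload


def deliver_concurrently(callback, payloads, workers=8):
    """Simulate an event system invoking the callback on several threads."""

    def run(chunk):
        for payload in chunk:
            callback(payload)

    # Split the payloads round-robin so each thread gets its own share.
    threads = [
        threading.Thread(target=run, args=(payloads[i::workers],))
        for i in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Calling deliver_concurrently(counter.on_event, payloads) with a fresh EventCounter leaves counter.count equal to the number of payloads, whatever the thread interleaving.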

Let me know if you still have questions.

Keith





Dec 26, 2007 at 10:04 AM
Does the solution have a runnable project which implements the whole publisher/subscriber system? When I run the EventPingPong project I get the error "An attempt was made to load a program with an incorrect format", and when I run WspEventServiceTest I get the error "Queue Name does not exist" in the console. Both errors occur during a call to the native method JoinMemoryMgr. How should I configure the system to get rid of these errors?

Can you elaborate on the queue part: how it is created, where it is created, and which methods perform these operations? I saw the code which uses native methods to do the queuing operations. All I'm concerned about is the delivery of the message to the subscribers.


Does the message get deleted from the queue after delivery, and if it does, is the delivery of the message to all the subscribers reliable?

Thanks.
Coordinator
Dec 26, 2007 at 4:42 PM
It sounds like you are trying to run the 64-bit version on a 32-bit machine, or vice versa. Installing the version that matches your machine should solve the first error you mention.

You need to install the correct setup for your machine, and the WspEventRouter service needs to be running. The service is what creates the shared memory, so starting it should solve your second error.

Keith




Dec 27, 2007 at 8:28 AM
Edited Dec 27, 2007 at 12:23 PM
Thanks Keith, the errors have vanished and the application is running fine.

But I am still not clear on the queue part: does the message get deleted from the queue after delivery, and if it does, is the delivery of the message to all the subscribers reliable?

Thanks.
Coordinator
Dec 27, 2007 at 4:31 PM
The shared memory is a ring buffer and this is the queue. The event system does not provide guaranteed delivery of events. This is one of the differences between it and a messaging system. Guaranteed delivery would require persistence to disk and transactions, all of which is very heavyweight and would drastically impact performance. The application requirements should define what level of reliability is required for the application.

That said, I've never hit a condition where the event system did not deliver an event to a subscriber on the same machine. So although it's theoretically possible to lose an event to a subscriber, the OS would have to be in such a bad state that nothing on the box would be working. With the event system, the router is the only process on the box which is "guaranteed" to get the event. After the router receives the event, it marks the event as processed which allows it to be overwritten. The router can persist the events to disk and this is the most "guaranteed" way of capturing events.

For all other subscribers, they are independently listening to the shared memory buffer for new events. When a new event arrives, they copy it out to their process and hand it off to the system to deliver to the callback method on a thread. In doing this, I've never been able to make the system lose an event even though I've queued up 200,000+ events. For one of the listening processes to lose an event means the ring buffer wrapped around before it could copy the events to its process.
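
The wrap-around condition described above can be sketched in a few lines. This is a simplified Python model, not the event system's shared-memory code: RingBuffer, Reader, and the sequence-number bookkeeping are assumptions made for illustration, and the router's "processed" marking is left out. It shows that a listener only loses events when the writer laps it before it has copied them out.

```python
class RingBuffer:
    """Fixed-size ring buffer where the writer never waits for readers."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.next_seq = 0  # sequence number of the next event to be written

    def publish(self, event):
        # Overwrite the oldest slot once the buffer has wrapped around.
        self.slots[self.next_seq % self.capacity] = event
        self.next_seq += 1


class Reader:
    """Independent listener that copies events out at its own pace."""

    def __init__(self, ring):
        self.ring = ring
        self.cursor = ring.next_seq  # only events published from now on
        self.lost = 0

    def drain(self):
        oldest = max(0, self.ring.next_seq - self.ring.capacity)
        if self.cursor < oldest:
            # The writer wrapped around and overwrote events never copied.
            self.lost += oldest - self.cursor
            self.cursor = oldest
        events = []
        while self.cursor < self.ring.next_seq:
            events.append(self.ring.slots[self.cursor % self.ring.capacity])
            self.cursor += 1
        return events
```

With a capacity of 4 and 9 published events, a reader that drains after every publish sees all 9, while a reader that drains only once at the end sees the last 4 and records 5 as lost.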

Hopefully this answers your questions.

Keith




Dec 28, 2007 at 7:06 AM
Edited Dec 28, 2007 at 7:35 AM
Thanks a ton, Keith.
Jan 3, 2008 at 6:00 AM
Hi,

Can you tell me how to configure the application when the publisher, the subscriber, and the service each run on a different machine?

I tried adding the IP address of the service machine to the nic attribute in the router's app.config file for the publisher and the subscriber to read, but it doesn't help. I still get the exception "Queue Name does not exist". Here is that node:

<thisRouter nic="192.168.21.46" port="1300" bufferSize="1024000" timeout="30000" />

So, can you help me configure the application so that the publisher and subscriber access the service running on a different machine?

Thanks.
Coordinator
Jan 3, 2008 at 7:16 AM
It's probably easiest if you uninstall the service and then reinstall it. The service needs to be running on each machine. Make the Subscriber machine the parent machine. When you install the service on the Subscriber machine, just take all the default entries. When you install the service on the Publisher machine, put the name of the Subscriber machine as its parent. You should only need to specify the IP address instead of the machine name if you're not using a DNS or if the machine has multiple NICs.

Keith



Jan 3, 2008 at 2:35 PM
This is fine for one-way communication, but how do we configure bi-directional communication?

Say my subscriber wants to publish a message to the publisher; how do we install the service to achieve this kind of bi-directional communication?

Thanks.
Coordinator
Jan 3, 2008 at 5:29 PM
The event system service is a router in the same way a Cisco router is on a physical network. With the event system, the machines must be configured in a hierarchical topology. So if you had 100 machines all connected in a hierarchical topology, then it doesn't matter what machine the publishers or subscribers run on. For instance, you could have the same publishing application running on 5 different machines and subscribing applications running on 50 other machines. Some of the subscribing applications could just as well be on the same machine as the publisher. The event system will route all the events published to all the subscribers.

So in the previous example where you would have two machines, publishers and subscribers can be on both machines. The event system router is inherently bi-directional just as a Cisco router is.
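
The bi-directional behaviour of a parent/child hierarchy can be modelled with a small tree. This is a conceptual Python sketch, not the router's implementation: the Router class is hypothetical, and it forwards every event to every neighbour, whereas the real service presumably routes according to subscriptions. It shows that an event published on either machine reaches subscribers on both.

```python
class Router:
    """Hypothetical router node in a parent/child tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.subscribers = []  # local callbacks taking (router_name, event)
        if parent is not None:
            parent.children.append(self)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        self._route(event, came_from=None)

    def _route(self, event, came_from):
        # Deliver locally first, then forward up and down the tree.
        for callback in self.subscribers:
            callback(self.name, event)
        neighbours = self.children + ([self.parent] if self.parent else [])
        for node in neighbours:
            if node is not came_from:  # never send an event back to its source
                node._route(event, came_from=self)
```

With one parent and one child, an event published on the child is delivered on the child and then on the parent, and an event published on the parent is delivered on the parent and then on the child.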

Make sense?

Keith



Jan 8, 2008 at 8:25 AM
Edited Jan 8, 2008 at 12:04 PM
How do you identify from the network whether a particular machine is a publisher or a subscriber?

I mean, is there any way I could tell my subscriber that a given machine is the publisher and that it has to listen to that machine?

As you suggested, I gave the subscriber's IP address during installation of the publisher and kept the default settings during installation of the subscriber. But my subscriber does not recognize the publisher, so can you tell me how to overcome this problem? And what IP address do I give as the parent router during installation?

Mine is a web application, where the publisher port runs on one machine and the subscriber port runs on the other machine, having a common IIS server, and not on individual machines. Does this have an impact on your eventing system?

Thanks.




Coordinator
Jan 8, 2008 at 6:31 PM
Machines are not "publisher" or "subscriber" machines. You can have applications publishing and subscribing from any machine.

It sounds like your parent machine might have multiple NICs. In this case, you need to configure the parent machine to listen on a specific IP address that is visible to the other machine. You will find the "thisRouter" entry in the config file. Set its "nic" value to the IP address which is visible to the other machine. You'll then have to restart the service. Using the netstat command, you should then see it listening on port 1300 for that IP address.

So from what you said previously, if the parent machine's IP addresses are 192.168.21.46 and 10.10.211.2 and the child machine's IP address is 192.168.21.40 then the parent machine's relevant config file entry would be:

<thisRouter nic="192.168.21.46" port="1300" bufferSize="1024000" timeout="30000" />

and the child machine's relevant config file entry would be:

<parentRouter name="192.168.21.46" port="1300" bufferSize="1024000" timeout="20000"/>

If the scenario is more complex and you will be going through a firewall, then you either have to open port 1300 so the machines can communicate, or you extend the configuration above so that the parent machine spans a DMZ and a private network. It would then connect to another machine in the private network as its child.

I don't know your environment to be able to provide more specific help on what your topology should look like.

Keith
Jan 9, 2008 at 7:33 AM
No, I don't have multiple NICs. Let me explain the scenario...

As you said, I kept the subscriber as the parent.
On one machine (say 192.16.2.77) I have an application running which subscribes to the message. On this machine I installed the service with the default settings, i.e. I left the listen IP address and the parent router machine blank.

On another machine (say 192.16.2.64) I have an application running which publishes the message. On this machine I installed the service giving, as both the listen IP address and the parent router machine, the IP address of the machine where my subscriber runs (192.16.2.77).

Both have TCP port 1300 open and both are listening on port 1300. I don't know if this is right.

After setting up this configuration, when I run the application which publishes a message from 192.16.2.64, the subscriber application on 192.16.2.77 does not receive the published message.

Currently my topology has 2 machines, and the messaging between them would be bi-directional.
I hope what I've written is clear. I suspect the problem is with the installation of the service.

Now can you help me out?

Thanks.
Coordinator
Jan 9, 2008 at 8:25 AM
The config file for the parent machine should have the parentRouter line commented out and should contain this line:
<thisRouter nic=" " port="1300" bufferSize="1024000" timeout="30000" />

The config file for the child machine should contain both these lines:
<thisRouter nic=" " port="1300" bufferSize="1024000" timeout="30000" />
<parentRouter name="192.16.2.77" port="1300" bufferSize="1024000" timeout="20000"/>
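
A quick way to compare the two config files is to parse them and check the entries against each other. The following is a rough Python sketch using only the standard library; the function names router_settings and check_pair are hypothetical, and the element and attribute names are taken from the config lines quoted in this thread. Since nic may be an alias and parentRouter name may be a machine name, the last check is only a hint, not proof of a misconfiguration.

```python
import xml.etree.ElementTree as ET


def router_settings(config_xml):
    """Return (thisRouter attributes, parentRouter attributes or None)."""
    root = ET.fromstring(config_xml)
    settings = root.find("eventRouterSettings")
    this_router = settings.find("thisRouter")
    # A commented-out parentRouter is skipped by the parser, so find() gives None.
    parent_router = settings.find("parentRouter")
    return (
        dict(this_router.attrib),
        dict(parent_router.attrib) if parent_router is not None else None,
    )


def check_pair(parent_xml, child_xml):
    """List mismatches between a parent config and a child config."""
    parent_this, parent_parent = router_settings(parent_xml)
    _, child_parent = router_settings(child_xml)
    problems = []
    if parent_parent is not None:
        problems.append("parent config still has an active parentRouter entry")
    if child_parent is None:
        problems.append("child config has no parentRouter entry")
        return problems
    if child_parent["port"] != parent_this["port"]:
        problems.append("child parentRouter port differs from parent thisRouter port")
    nic = parent_this.get("nic", "").strip()
    # Only meaningful when both values are written as IP addresses.
    if nic and nic != child_parent["name"]:
        problems.append("child parentRouter name differs from parent thisRouter nic")
    return problems
```

An empty list from check_pair means the sketch found nothing to flag; it does not test whether the machines can actually reach each other.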

If I understand what you're saying above, it sounds like the issue is with the subscriber. The subscriber process must be running and you would have to instantiate the SubscriptionMgr object and then execute the Listen method on it to begin receiving events.

Open up the Performance monitor, select all the perf counters for WspEventRouter, and change the UI to text rather than graph mode. If events are flowing, you'll see it in the perf counters.

Keith
Jan 9, 2008 at 8:44 AM
Edited Jan 9, 2008 at 11:36 AM
It works fine when both my applications are running on the same machine in different AppDomains and the service is installed on that machine. The problem occurs only when they are installed on different machines. There must be some network configuration issue, I think.

The socket is not getting connected.

Here is the error I get in the event log:

"System.Net.Sockets.SocketException: A request to send or receive data was disallowed because the socket is not connected and (when sending on a datagram socket using a sendto call) no address was supplied
at System.Net.Sockets.Socket.Shutdown(SocketShutdown how)
at Microsoft.WebSolutionsPlatform.Event.Router.ReceiveServer.Start()"

Coordinator
Jan 9, 2008 at 3:37 PM
Are you sure this exception is not from a previous run? If you use the Network monitor, you can look at the packets to see what is happening on the network. It would seem the child machine is not successfully creating a connection to the parent. Does the user account that the service is running as have permission to open network connections? You could probably change the service to run as your user account to see if it's a permission issue.

When things are working correctly, you should see the parent machine listening on port 1300 with 2 open connections to the child machine on port 1300. On the child machine you should see it listening on port 1300 with 2 open connections to the parent machine on port 1300. Use netstat to see the ports. Make sure the parent machine is listening on the IP address you configured on the child machine and that there isn't an IPv4/IPv6 issue. You could change the line in the config file of the parent to the following to make sure it's listening correctly:

<thisRouter nic="192.16.2.77" port="1300" bufferSize="1024000" timeout="30000" />


Send me the config files from both machines as well.

Keith
Jan 10, 2008 at 8:03 AM
Edited Jan 10, 2008 at 10:24 AM
The socket exception no longer occurs.

I see 2 entries for port 1300 on the parent, one in the listening state and one in the established state.

On the child I see 1 entry for the port, in the listening state.

But the parent is still not able to receive what the child publishes.

Here is the config of the child:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="eventRouterSettings" type="foo"/>
    <section name="eventPersistSettings" type="foo2"/>
  </configSections>

  <eventRouterSettings>
    <!-- refreshIncrement should be about 1/3 of what the expirationIncrement is -->
    <!-- This setting needs to be consistent across all the machines in the eventing network -->
    <subscriptionManagement refreshIncrement="3" expirationIncrement="10"/>

    <localPublish eventQueueName="WspEventQueue" eventQueueSize="102400000" averageEventSize="10240"/>

    <!-- nic can be an alias which specifies a specific IP address or an IP address -->
    <!-- port can be 0 if you don't want to have the router open a listening port to be a parent to other routers -->
    <thisRouter nic="" port="1300" bufferSize="1024000" timeout="30000" />

    <parentRouter name="172.16.2.64" port="1300" bufferSize="1024000" timeout="30000" />

  </eventRouterSettings>

  <eventPersistSettings>
    <!-- <event type="*" localOnly="false" fieldTerminator="," rowTerminator="\n" tempFileDirectory="c:\temp\AllEvents\" copyToFileDirectory="c:\temp\AllEvents\log\" /> -->
    <!-- <event type="78422526-7B21-4559-8B9A-BC551B46AE34" localOnly="false" fieldTerminator="," rowTerminator="\n" tempFileDirectory="c:\temp\WebEvents\" copyToFileDirectory="c:\temp\WebEvents\log\" /> -->
  </eventPersistSettings>
</configuration>


And here is the config of the parent:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="eventRouterSettings" type="foo"/>
    <section name="eventPersistSettings" type="foo2"/>
  </configSections>

  <eventRouterSettings>
    <!-- refreshIncrement should be about 1/3 of what the expirationIncrement is -->
    <!-- This setting needs to be consistent across all the machines in the eventing network -->
    <subscriptionManagement refreshIncrement="3" expirationIncrement="10"/>

    <localPublish eventQueueName="WspEventQueue" eventQueueSize="102400000" averageEventSize="10240"/>

    <!-- nic can be an alias which specifies a specific IP address or an IP address -->
    <!-- port can be 0 if you don't want to have the router open a listening port to be a parent to other routers -->
    <thisRouter nic="172.16.2.64" port="1300" bufferSize="1024000" timeout="30000" />

    <!-- <parentRouter name="ParentMachineName" port="1300" bufferSize="1024000" timeout="30000" /> -->

  </eventRouterSettings>

  <eventPersistSettings>
    <!-- <event type="*" localOnly="false" fieldTerminator="," rowTerminator="\n" tempFileDirectory="c:\temp\AllEvents\" copyToFileDirectory="c:\temp\AllEvents\log\" /> -->
    <!-- <event type="78422526-7B21-4559-8B9A-BC551B46AE34" localOnly="false" fieldTerminator="," rowTerminator="\n" tempFileDirectory="c:\temp\WebEvents\" copyToFileDirectory="c:\temp\WebEvents\log\" /> -->
  </eventPersistSettings>
</configuration>
Coordinator
Jan 10, 2008 at 1:45 PM
We should see on the parent one port 1300 listening and two port 1300 connections bound to the child's IP address. On the child we should see one port 1300 listening and two port 1300 connections bound to the parent's IP address. I'm assuming you're using the latest drops of the code. I fixed a problem similar to this a couple of months ago but haven't seen it since. If you capture the network packets with netmon, you might see what network errors are occurring.

I'm surprised you're not getting any events in the event log. The only other thing I can think of is for you to build a debug version and step through the code. The code is in the communicator.cs file in the \Event\Router directory.

Keith
Coordinator
Jan 10, 2008 at 10:32 PM
I was thinking more about your problem and then realized we never had you check your firewall settings. Make sure port 1300 is open through your firewall.
Jan 11, 2008 at 4:34 AM

The firewall settings are done, and the ports are behaving as you described.

But the listener is still not able to receive events from the publisher.




Coordinator
Jan 11, 2008 at 5:24 AM
When you run:

netstat -a | findstr 1300

You should have 3 lines for each machine. One line is listening and the other two should say connected. These lines should stay constant over time. If the port number values are changing then it means connections are being dropped and recreated.
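
The three-line check can also be automated on captured output. This Python sketch is only an illustration: summarize_port and looks_healthy are hypothetical helpers, and they assume numeric addresses (as printed by netstat -an) in the usual column order of protocol, local address, foreign address, and state, with connected sockets shown as ESTABLISHED. A machine that also listens on IPv6 would show a second listening line and fail this strict check.

```python
def summarize_port(netstat_output, port=1300):
    """Count listening and established TCP lines that involve the port."""
    suffix = ":" + str(port)
    listening = established = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: protocol, local address, foreign address, state.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, foreign, state = fields[1], fields[2], fields[3].upper()
        if not (local.endswith(suffix) or foreign.endswith(suffix)):
            continue
        if state == "LISTENING":
            listening += 1
        elif state == "ESTABLISHED":
            established += 1
    return listening, established


def looks_healthy(netstat_output, port=1300):
    """True for exactly one listening line and two established lines."""
    return summarize_port(netstat_output, port) == (1, 2)
```

Running the check a few times in a row and comparing the port numbers in the established lines would cover the "stay constant over time" part, which this sketch does not attempt.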

I did find a bug today which shows up on a box with 2 NICs: when the primary NIC is disconnected and the other NIC is wireless, it keeps dropping and recreating the connections. I am debugging this but I don't think it is the issue you're having.

In your subscriber, for your AddSubscriber method call, did you say true or false for being local? You need to say false. You could run the WspEventListenTest sample on the subscriber box in a cmd window. If you see events arriving there, it means the events are flowing between the boxes.
Jun 11, 2008 at 3:36 PM
Is there anything special you have to do when creating an Event? I used the WebPageEvent as an example and added everything that wasn't data to my new class. I was under the impression that when this constructor runs: public Firefly_SuMMIT_1553_Data(byte[] serializationData) : base(serializationData) (where base is Event), all of my data would be filled in. Is this not the case? I did implement the GetObjectData function.
Thanks,
Owens
Coordinator
Jun 11, 2008 at 5:25 PM

That should be all you have to do. Probably something simple. Send me your C# file with the class definition.

Keith  (keithh@microsoft.com)

