JVMKiller
In our latest release of SMB, we spent a fair amount of time testing and minimizing the windows in which data corruption can occur in our associative memory. The testing involved killing various processes during periods of activity. But rather than take a hit-or-miss approach of randomly killing SMB processes and seeing whether any sort of recovery was needed, we decided to build a JVMKiller that gives us control over when and how the JVM goes down. We are now able to target specific “high-risk” areas (write operations). Like everything else in our system, the JVMKiller is configured as a JavaBean. The bean can execute any number of kill events and, given a condition, can target specific logical and physical components of the system. For example, we could target a certain event, like the start of a phase-2 save:
<bean name="JVMKiller" singleton="true" className="com.saffrontech.saffronone.core.JVMKiller">
  <properties>
    <property name="properties" normalize="false">
      <![CDATA[
trigger.1=p2Start
halt.1=true
time.1=10
id.1=1001
      ]]>
    </property>
  </properties>
</bean>
With the definition above, we would halt the JVM at the first phase-2 start encountered after 10 minutes of ingestion. The component responsible for triggering the kill event is id 1001, that is, server 1000, instance 1. To define a second or third rule, we add further numbered properties, e.g. trigger.2, halt.2, time.2, id.2 and trigger.3, halt.3, time.3, id.3, and so on. We then set up a simple TransactionStatusListener interface that we can place anywhere in the code. We can also imagine adding more options, such as a delayed halt (to vary the exact place where the JVM goes down), but in any case this control has enabled us to flush out a number of data corruption issues that would otherwise have been difficult to troubleshoot, especially in a cluster environment.
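To make the rule format a little more concrete, here is a minimal sketch of how numbered rules like the ones above could be parsed and checked from a listener callback. It is not the actual SMB code: the class `JvmKillerSketch`, the `Rule` and `Killer` types, the `onStatus(event, componentId)` signature, and the injectable clock and halt action are all assumptions made for illustration. The post does not say what `halt=false` selects, so the sketch simply skips such rules.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.function.LongSupplier;

/** Hypothetical sketch of rule-driven JVM killing; not the real SMB classes. */
class JvmKillerSketch {

    /** One numbered rule: trigger.N, halt.N, time.N (minutes), id.N. */
    static final class Rule {
        final String trigger;
        final boolean halt;
        final long minutes;
        final int id;

        Rule(String trigger, boolean halt, long minutes, int id) {
            this.trigger = trigger;
            this.halt = halt;
            this.minutes = minutes;
            this.id = id;
        }
    }

    /** Assumed callback shape; the real TransactionStatusListener is not shown in the post. */
    interface TransactionStatusListener {
        void onStatus(String event, int componentId);
    }

    /** Reads rules 1, 2, 3, ... until the first missing trigger.N key. */
    static List<Rule> parseRules(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalArgumentException("unreadable rule text", e);
        }
        List<Rule> rules = new ArrayList<>();
        for (int n = 1; p.containsKey("trigger." + n); n++) {
            rules.add(new Rule(
                    p.getProperty("trigger." + n).trim(),
                    Boolean.parseBoolean(p.getProperty("halt." + n, "false").trim()),
                    Long.parseLong(p.getProperty("time." + n, "0").trim()),
                    Integer.parseInt(p.getProperty("id." + n, "0").trim())));
        }
        return rules;
    }

    /** Runs the halt action when an event matches a rule after the rule's delay. */
    static final class Killer implements TransactionStatusListener {
        private final List<Rule> rules;
        private final LongSupplier clockMillis;
        private final Runnable haltAction;
        private final long startMillis;

        Killer(List<Rule> rules, LongSupplier clockMillis, Runnable haltAction) {
            this.rules = rules;
            this.clockMillis = clockMillis;
            this.haltAction = haltAction;
            this.startMillis = clockMillis.getAsLong();
        }

        @Override
        public void onStatus(String event, int componentId) {
            long elapsedMinutes = (clockMillis.getAsLong() - startMillis) / 60_000L;
            for (Rule r : rules) {
                // Assumption: rules with halt=false are ignored in this sketch.
                if (r.halt && r.trigger.equals(event) && r.id == componentId
                        && elapsedMinutes >= r.minutes) {
                    haltAction.run();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) {
        List<Rule> rules = parseRules("trigger.1=p2Start\nhalt.1=true\ntime.1=10\nid.1=1001\n");
        long[] now = {0L}; // fake clock so the demo does not wait 10 minutes
        Killer killer = new Killer(rules, () -> now[0],
                () -> System.out.println("would halt the JVM here"));
        killer.onStatus("p2Start", 1001); // too early: nothing happens
        now[0] = 10 * 60_000L;
        killer.onStatus("p2Start", 1001); // matches: the fake halt action runs
    }
}
```

In a real test run, the halt action could be something like `() -> Runtime.getRuntime().halt(1)`, which terminates the JVM immediately without running shutdown hooks, and the clock could be `System::currentTimeMillis`. Injecting both keeps the matching logic testable without actually taking the JVM down.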
Tags: development, smb, testing
Posted on Tuesday, June 22nd, 2010 at 1:58 pm. Filed under SaffronMemoryBase.

