Friday, April 27, 2012

Performance tip for ILOG when using J2EE MDB and XU

Here is something really simple that ended up being absolutely crucial to the performance of our ILOG execution units.  We use the J2EE Message Driven Bean (MDB) for WebSphere 7 that comes packaged with the product.  Once we ramped up the application to a certain level of throughput, it started consuming an enormous amount of CPU (more than 2.5 CPUs on a hefty 3-CPU AIX pSeries).

We looked into many different areas, including how our rules were organized, the size of our decision tables, and the amount of data we were passing in our input messages.  Nothing helped until we finally got some of IBM's top ILOG support specialists on the phone.  After about an hour of deliberating, one of them asked, "What is the size of your XU connection pool?"  To which I responded, "The XU has a connection pool?"

Up until this point, I knew there was a resource adapter called the XU, and that the Rule Execution Server used it to notify the MDBs of ruleset updates.  What I didn't know was that it has a connection pool, and that the size of that pool is extremely important.  Here's why:

The XU is not only a connector - it is also the provider of the actual rule execution engine - the muscle that does the work of executing your rules.  Each active thread of your application that executes rules (i.e. the active threads in our MDB's thread pool) needs its own connection from the XU pool.  Then, if you have multiple rulesets, you need to have a unique connection for each possible concurrent execution of each ruleset.  Here's an example:

Let's say we have 3 rulesets (A, B and C) and our MDB has 50 threads in its thread pool.  At busy times, about 30 of those threads are in use.  All 3 rulesets are used with about the same frequency, i.e. of all executions, 1/3 are for A, 1/3 for B, and 1/3 for C.  Let's say we have our XU connection pool set at a max of 10 (which happens to be the default in WebSphere).  When the application is busy and 30 threads are in use, all 10 XU connections are in use and the 30 threads are constantly fighting over them.

So your first reaction would be to increase the XU pool to 50, to match the thread pool, right?  This makes sense, but it's not quite adequate, because each XU connection is also associated with a specific ruleset.  If a thread that is trying to execute ruleset A has an XU connection, but the connection is for ruleset B, the thread must request a connection for ruleset A.  If there are no available connections associated with A, one of the others must be re-associated with A.  At this point, the ruleset is fetched and parsed.  This is where our performance problem was rooted.  Without enough XU connections to handle the total possible concurrent requests for each ruleset, the XU was constantly resetting connections and parsing rulesets.
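To make the re-association cost concrete, here is a rough toy simulation in Python. It is only a sketch of the behavior described above, not of the XU's real internals: the function name, the "waves" of waiting threads, and the policy of preferring an unbound connection before evicting the first free one are all assumptions made for illustration.

```python
import random


def count_reparses(pool_size, busy_threads, rulesets, rounds, seed=42):
    """Count how often a connection must be (re)bound to a ruleset.

    Hypothetical model: each round, every busy thread asks for one ruleset
    picked at random. Threads are served in waves of at most `pool_size`;
    all connections are returned to the pool between waves.
    """
    rng = random.Random(seed)
    bound_to = [None] * pool_size  # ruleset each connection currently holds
    reparses = 0
    for _ in range(rounds):
        pending = [rng.choice(rulesets) for _ in range(busy_threads)]
        while pending:
            wave, pending = pending[:pool_size], pending[pool_size:]
            free = list(range(pool_size))
            for wanted in wave:
                # Best case: a free connection already bound to this ruleset.
                conn = next((i for i in free if bound_to[i] == wanted), None)
                if conn is None:
                    # Otherwise take an unbound connection if one exists,
                    # else re-associate the first free one. Either way the
                    # ruleset has to be fetched and parsed.
                    conn = next((i for i in free if bound_to[i] is None), free[0])
                    bound_to[conn] = wanted
                    reparses += 1
                free.remove(conn)
    return reparses


if __name__ == "__main__":
    rulesets = ["A", "B", "C"]
    print("pool of 10: ", count_reparses(10, 30, rulesets, rounds=1000))
    print("pool of 150:", count_reparses(150, 30, rulesets, rounds=1000))
```

In this model the large pool only parses during warm-up, because there is always a spare connection to bind, while the small pool keeps re-binding connections for as long as the load lasts.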

Going back to our example, a safe value for the XU connection pool would be 150, since that is the most connections that could ever be needed at once (50 threads * 3 rulesets).  But this may not be totally necessary - for larger applications, a very large connection pool can possibly cause memory issues.  Some parsing is fine - you just don't want the XU doing it constantly.  In our case, we have about 40 rulesets and typically around 30 threads active when the system is busy.  We've settled in at 650 XU connections and the application is running great - using about 20% of one CPU on the 3-CPU box (compared to about 250% before).
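The sizing arithmetic is simple enough to write down. The helper below is a hypothetical convenience, not anything shipped with ILOG or WebSphere; it just computes the worst-case bound (threads * rulesets) for the example and for our own numbers.

```python
def xu_pool_upper_bound(threads, ruleset_count):
    """Worst case: every thread could need its own connection per ruleset."""
    return threads * ruleset_count


if __name__ == "__main__":
    # The example above: 50 MDB threads, 3 rulesets.
    print(xu_pool_upper_bound(50, 3))   # 150

    # Our application: about 30 busy threads, about 40 rulesets.
    bound = xu_pool_upper_bound(30, 40)
    print(bound)                        # 1200
    # Fraction of the worst case that our chosen pool size of 650 covers.
    print(round(650 / bound, 2))        # 0.54
```

So our 650 is a little over half of the worst case for our typical busy load, which fits the point that you can trade a bit of occasional parsing for a smaller memory footprint.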

Here is how you change the value of your XU connection pool in the WebSphere console:

If you deployed the XU resource adapter by itself:

  1. Go under Resources / Resource Adapters / J2C connection factories.
  2. If you followed the ILOG documentation, the factory should be called "xu_cf" or something similar.
  3. On the right, click Connection pool properties.
  4. In there, you can set the minimum and maximum connections and click OK.  We found that the change took effect immediately without restarting the application.

If you deployed the XU embedded in your application:

  1. Go under Applications / Application Types / WebSphere enterprise applications and click on your application name.
  2. Click on Manage Modules.
  3. You should have a module called XU.  Click into that.
  4. Click Resource Adapter.
  5. Click J2C connection factories.
  6. There should be one factory with a JNDI name of eis/XUConnectionFactory.  Click into that.
  7. Click Connection pool properties, and in there you can set your minimum and maximum pool sizes.

Hopefully this can help somebody get through the troubleshooting nightmare we went through to get our application performing well.  This really ought to be something ILOG features prominently in their documentation, and it should be one of the first things their consultants mention.  As I've been told, however, it really depends on which consultant you get.  Hopefully they will improve this going forward.

2 comments:

  1. I am trying to implement the MDB functionality. Our rules have been developed in such a way that the XOM accesses the database via OpenJPA, and I am trying to figure out how to make that work with the MDB. I tried bundling the XOM with the MDB jar and deploying it on WebSphere, but it gives an error about not finding a class reference. Please suggest something if you have come across anything like this before.

    Replies
    1. We do something similar. We created a directory on the server where we keep all the jars related to the XOM, and point the WAS server's classpath to it.

      You can do this by adding a custom property under the Java settings (where you set max heap etc) called "ws.ext.dirs" whose value is the directory where those jars reside.

      This has two benefits: 1) It separates the XOM jars from the core ILOG product so they are easy to update; and 2) using ws.ext.dirs makes it so you don't have to list each individual jar in the classpath - you just specify the directory.

      Here's a link to an IBM article about class loading, which mentions ws.ext.dirs:

      http://pic.dhe.ibm.com/infocenter/wasinfo/v7r0/index.jsp?topic=%2Fcom.ibm.websphere.base.doc%2Finfo%2Faes%2Fae%2Fcrun_classload.html
