Coffee-Driven Java: 08/01/2006

Thursday, August 31, 2006

Unit Tests Specify Post-Conditions, Not Code Paths!

I realized something about the unit tests I had been writing today: I couldn't change even a line of the code they were testing without changing the tests themselves as well. Now, I understand that unit tests always carry some maintenance overhead, and will need to be updated from time to time as a system's design evolves. But unit tests are also supposed to enable refactoring by ensuring that there are no unintended consequences to a code modification. If I need to change my test every time I change the code it's testing, then I must be doing something wrong. But I had to think for a minute about what unit testing sin I was committing.

The code being developed did lots of JDBC stuff, so I was using EasyMock to mock the JDBC API, preventing a dependency on a physical database.

Here is a portion from an offending unit test:

  @Override
  protected void setUp() throws Exception {
    ds = createStrictMock(DataSource.class);
    conn = createStrictMock(Connection.class);
    ps = createStrictMock(PreparedStatement.class);
    md = createStrictMock(DatabaseMetaData.class);
    rs = createStrictMock(ResultSet.class);
    xaDsFactory = createMock(XaDataSourceFactory.class);
    defFactory = createMock(DefinitionFactory.class);
    typeMap = createMock(TypeMap.class);
    expect(defFactory.getTypeMap()).andStubReturn(typeMap);
    expect(typeMap.integer()).andStubReturn("INTEGER");
    db = new DatabaseHelperImpl(ds, xaDsFactory, "sqlDialect", defFactory);
    replay(defFactory, typeMap);
  }
  
  public void testExecuteStatement() throws Exception {
    expect(ds.getConnection()).andReturn(conn);
    expect(conn.prepareStatement("SQL Statement")).andReturn(ps);
    expect(ps.execute()).andReturn(false);
    conn.close();
    replay(ds, conn, ps);
    
    db.executeSql("SQL Statement");
    verify(ds, conn, ps);
  }
  
  public void testExecuteSqlThatThrowsException() throws Exception {
    expect(ds.getConnection()).andReturn(conn);
    expect(conn.prepareStatement("Failing SQL Statement")).andReturn(ps);
    SQLException exToThrow = new SQLException();
    expect(ps.execute()).andThrow(exToThrow);
    conn.close();
    replay(ds, conn, ps);
    
    try {
      db.executeSql("Failing SQL Statement");
      fail("Should throw exception");
    } catch (SQLException ex) {
      assertEquals("Wrong exception", exToThrow, ex);
    }
    verify(ds, conn, ps);
  }

And here is the Code Under Test:

  public void executeSql(String sql) throws SQLException {
    Connection conn = null;
    try {
      conn = dataSource.getConnection();
      conn.prepareStatement(sql).execute();
    } finally {
      if (conn != null) conn.close();
    }
  }

Oy. That's pretty ugly-looking to me. Notice the large setUp() method. Notice that, worse than being large, the method does not actually finish setting up, since each test requires a very particular series of EasyMock calls. To add new tests, you would have to pay very careful attention to setUp() unless you were already familiar with the code.

Taking a minute to stare at these tests, the problem becomes obvious: my tests are essentially a line-by-line walk-through of the desired execution path through the code being tested. This is conceptually only a couple steps away from executing two copies of the same code and verifying that they did the same thing. Not only does this couple my tests to the tested code as tightly as you can imagine, it makes for a pretty good chance that I'll make the same mistaken assumptions in my test that I would make in my code, defeating the purpose of unit testing altogether.

So let me re-consider what a unit test should be. A unit test and the code it tests are just two ways of saying the same thing; two ways of specifying the same desired behavior. Production code specifies a unit of behavior in terms of an arrangement of more finely-grained behaviors. A unit test specifies a unit of behavior by describing the desired post-conditions (outputs and side effects) for a given set of pre-conditions. The tests above are bad tests because they are specifying a unit of behavior in terms of more finely-grained behaviors, just like the code. I am essentially missing the forest (the desired results) for the trees (the steps which will obtain the desired result).

Coming back to the specific problem at hand, I think the reason I was suckered into my predicament has to do with how heavily the code under test relies on the rather large JDBC API to bring about its results. Almost every non-conditional statement in the code under test interacts with the API, and it could be said that the API holds quite a bit of 'state' (the state of the entire database being accessed) which affects its outputs. For these reasons, mocking it in the traditional fashion just doesn't work well.

On my first attempt at addressing the problem, I tried writing the tests against an actual database and verifying post-conditions with JDBC. I suppose this could be made to work, but JDBC does not lend itself well to clearly readable specifications of that kind. And on top of that, the tests were pretty slow.

After pondering a bit, I got to thinking how it would be nice if I had some magical object which could tell me about the things which had been done through the JDBC API after-the-fact, and also allow me to specify how a particular set of JDBC instances will respond, at a high level. Over the course of a few hours I started fleshing out just such an object, relying upon proxies. It took some experimentation (I've never written dynamic proxying code before), but here's the interface I've established for my magical 'DbFixture' object (the code turned out to be rather longer than I can post in here):


interface DbFixture {
  public DataSource getDataSource();
  public int numOpenConnections();
  public boolean wasExecuted(String sql);
  public boolean wasExecuted(String sql, Object[] args, int[] types);
  public void throwOnNextStatementExecution(Throwable ex);
  public int getTxNumForExecutedStatement(String sql);
}

The behavior of getTxNumForExecutedStatement() requires elaboration: it returns -1 if the statement has not been executed, 0 if it was not executed in a transaction, and a number greater than or equal to one if it was executed in a transaction. Transactions are numbered according to the order in which they are ended.
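Since the real DbFixture is too long to post, here is a minimal sketch of the proxy idea behind it. This is not the original implementation: the class name RecordingDbFixture and the demo helper are made up for illustration, it covers only three of the interface's methods, and every JDBC call it does not recognize simply throws UnsupportedOperationException. It uses nothing beyond java.lang.reflect.Proxy and the JDBC interfaces that ship with the JDK.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

/** Hypothetical, stripped-down stand-in for DbFixture; not the original implementation. */
class RecordingDbFixture {
  private final List<String> executed = new ArrayList<>();
  private int openConnections;
  private Throwable nextFailure;

  @SuppressWarnings("unchecked")
  private static <T> T proxy(Class<T> type, InvocationHandler handler) {
    return (T) Proxy.newProxyInstance(
        RecordingDbFixture.class.getClassLoader(), new Class<?>[] {type}, handler);
  }

  public DataSource getDataSource() {
    return proxy(DataSource.class, (self, method, args) -> {
      if (method.getName().equals("getConnection")) {
        openConnections++; // remember that a connection was handed out
        return newConnection();
      }
      throw new UnsupportedOperationException(method.getName());
    });
  }

  private Connection newConnection() {
    boolean[] closed = {false};
    return proxy(Connection.class, (self, method, args) -> {
      switch (method.getName()) {
        case "prepareStatement":
          return newStatement((String) args[0]);
        case "close":
          if (!closed[0]) { // count each connection's close only once
            closed[0] = true;
            openConnections--;
          }
          return null;
        default:
          throw new UnsupportedOperationException(method.getName());
      }
    });
  }

  private PreparedStatement newStatement(String sql) {
    return proxy(PreparedStatement.class, (self, method, args) -> {
      if (!method.getName().equals("execute")) {
        throw new UnsupportedOperationException(method.getName());
      }
      if (nextFailure != null) { // a failure was scripted for this execution
        Throwable failure = nextFailure;
        nextFailure = null;
        throw failure;
      }
      executed.add(sql); // record the statement for later inspection
      return false;
    });
  }

  public int numOpenConnections() { return openConnections; }
  public boolean wasExecuted(String sql) { return executed.contains(sql); }
  public void throwOnNextStatementExecution(Throwable ex) { nextFailure = ex; }
}

class RecordingDbFixtureDemo {
  /** Same shape as the executeSql() method under test shown earlier. */
  static void executeSql(DataSource ds, String sql) throws SQLException {
    Connection conn = null;
    try {
      conn = ds.getConnection();
      conn.prepareStatement(sql).execute();
    } finally {
      if (conn != null) conn.close();
    }
  }

  /** Returns true if executeSql() propagated an SQLException. */
  static boolean failsWithSqlException(DataSource ds, String sql) {
    try {
      executeSql(ds, sql);
      return false;
    } catch (SQLException expected) {
      return true;
    }
  }

  public static void main(String[] args) throws SQLException {
    RecordingDbFixture fixture = new RecordingDbFixture();
    executeSql(fixture.getDataSource(), "SOME SQL");
    System.out.println("executed: " + fixture.wasExecuted("SOME SQL"));
    fixture.throwOnNextStatementExecution(new SQLException("boom"));
    System.out.println("failed: " + failsWithSqlException(fixture.getDataSource(), "THROWS EXCEPTION"));
    System.out.println("open connections: " + fixture.numOpenConnections());
  }
}
```

The point of the design is that the proxies only record what happened and play back scripted failures. The test never states the order of JDBC calls, so the code under test can be restructured freely as long as the recorded post-conditions still hold.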

Armed with DbFixture, I rewrote the above tests:


  protected void setUp() throws Exception {
    fixture = new DbFixture();
    xaFactory = createMock(XaDataSourceFactory.class);
    defFactory = createMock(DefinitionFactory.class);
    db = new DatabaseHelperImpl(fixture.getDataSource(), xaFactory, "SQL DIALECT", defFactory);
  }
  
  public void testExecutingSql() throws Exception {
    db.executeSql("SOME SQL");
    assertTrue(fixture.wasExecuted("SOME SQL"));
  }
  
  public void testThatConnectionIsClosedOnException() throws Exception {
    fixture.throwOnNextStatementExecution(new SQLException());
    try {
      db.executeSql("THROWS EXCEPTION");
      fail("Should have thrown exception");
    } catch (SQLException ex) {}
    assertEquals(0, fixture.numOpenConnections());
  }

Much better. Moral of the story: Unit tests specify post-conditions, not code paths!

Friday, August 25, 2006

Distributed (XA) Transactions w/ Hibernate, Spring, JTA, JOTM

I recently struggled with getting XA transactions to work across Oracle 9i and SQL Server 2000 databases, using a combination of Hibernate, Spring, JOTM, and the SQL Server 2005 JDBC driver. I ran into some stickiness, and I still think I might be missing something - but XA transactions work. Recovery probably doesn't, but for our organization that's OK right now. On the off chance that my probably-misguided attempts could prove useful to someone else in my situation, here is what I learned:

XAPool Is Cheating
Most places you look, XAPool is listed as a great companion to JOTM for supplying connection pooling. It is - if you need to make data sources which don't actually support XA transactions pretend to participate in XA transactions. But if you actually want two-phase commits as specified in the XA standard, you'd better not use XAPool, because it just wraps non-XA data sources. Using JOTM with XAPool gives you 'simulated' XA transactions - on the one hand, you can easily start and commit/roll back transactions on multiple data sources, but when it comes to commit time, if you commit your first data source and the second data source fails its commit, there's no way to roll back the first one. Which is why we have the 2PC concept in XA to begin with. Which makes XAPool a rather confusingly-named library. Definitely not what I was looking for.
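To make the failure mode concrete, here is a toy sketch of committing two resources one after the other with no prepare phase. Nothing here is XAPool or JOTM code: ToyResource and NaiveCoordinator are hypothetical names, and a "resource" is just a value that can be staged and committed.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical resource: holds one committed value plus one staged (pending) value. */
class ToyResource {
  final String name;
  final boolean failOnCommit;
  String committed = "old";
  private String pending;

  ToyResource(String name, boolean failOnCommit) {
    this.name = name;
    this.failOnCommit = failOnCommit;
  }

  void write(String value) { pending = value; }

  void commit() {
    if (failOnCommit) throw new IllegalStateException(name + ": commit failed");
    committed = pending; // from here on the change is durable
    pending = null;
  }

  /** Discards staged work; it cannot undo a commit that already happened. */
  void rollback() { pending = null; }
}

class NaiveCoordinator {
  /** Commits each resource in turn, without asking first whether all of them can commit. */
  static boolean commitAll(List<ToyResource> resources) {
    for (ToyResource r : resources) {
      try {
        r.commit();
      } catch (IllegalStateException ex) {
        for (ToyResource other : resources) other.rollback(); // too late for earlier commits
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    ToyResource first = new ToyResource("first", false);
    ToyResource second = new ToyResource("second", true);
    first.write("new");
    second.write("new");
    boolean ok = commitAll(Arrays.asList(first, second));
    System.out.println("committed everywhere: " + ok);
    System.out.println("first=" + first.committed + ", second=" + second.committed);
  }
}
```

After the failed run the first resource holds the new value and the second still holds the old one, which is exactly the inconsistency a prepare phase is meant to rule out.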

For real XA with JOTM, ditch XAPool and use the XADataSource implementations supplied by your vendor's JDBC driver. There might be an easier way, but I wrote a simple helper class to manage connection pooling and transaction enlistment using commons-pool and commons-dbcp. More on that in a bit.

Fixing JOTM
Just when I thought I had things set up correctly, I started getting intermittent and difficult-to-reproduce failures from my resource managers. Given my unfamiliarity with the terrain, it took me a while to realize that the source of the problem was JOTM, not my code/configuration. I am using JOTM 2.0.10, which as far as I know is the most recent official release. Unfortunately it appears to have a rather show-stopping bug - it gets its branch transaction ids mixed up.

When you start an XA transaction, your transaction manager (JOTM) makes up an Xid to identify the global transaction. For each resource manager which is enlisted in the transaction, the TM creates a branch id, which is typically the global id with an appended bit to identify the resource manager. That branch id is used to identify the transaction on each resource manager.
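The relationship between the global id and the branch ids can be sketched with the javax.transaction.xa.Xid interface that ships with the JDK. SimpleXid is a hypothetical class written for this post; the format id and the way the branch qualifier is encoded are arbitrary choices, and JOTM's own Xid implementation may look quite different.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import javax.transaction.xa.Xid;

/** Hypothetical Xid implementation for illustration; real transaction managers differ. */
final class SimpleXid implements Xid {
  private static final int FORMAT_ID = 0x1234; // arbitrary format id chosen for this sketch
  private final byte[] globalId;
  private final byte[] branchQualifier;

  private SimpleXid(byte[] globalId, byte[] branchQualifier) {
    this.globalId = globalId.clone();
    this.branchQualifier = branchQualifier.clone();
  }

  /** Creates the id of a global transaction; it has an empty branch qualifier. */
  static SimpleXid newGlobal(long txNumber) {
    return new SimpleXid(ByteBuffer.allocate(Long.BYTES).putLong(txNumber).array(), new byte[0]);
  }

  /** Derives a branch id: same global id, plus a qualifier naming the resource manager. */
  SimpleXid branchFor(int resourceManagerIndex) {
    byte[] qualifier = ByteBuffer.allocate(Integer.BYTES).putInt(resourceManagerIndex).array();
    return new SimpleXid(globalId, qualifier);
  }

  @Override public int getFormatId() { return FORMAT_ID; }
  @Override public byte[] getGlobalTransactionId() { return globalId.clone(); }
  @Override public byte[] getBranchQualifier() { return branchQualifier.clone(); }

  @Override public String toString() {
    return "gtrid=" + Arrays.toString(globalId) + " bqual=" + Arrays.toString(branchQualifier);
  }
}

class SimpleXidDemo {
  public static void main(String[] args) {
    SimpleXid global = SimpleXid.newGlobal(42L);
    System.out.println(global.branchFor(0)); // branch for the first resource manager
    System.out.println(global.branchFor(1)); // branch for the second resource manager
  }
}
```

Each resource manager only ever sees its own branch id, which is why handing it a different branch's id is rejected as an invalid transaction id.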

I had created a test application using two data sources, and had successfully committed/rolled back some transactions across both. When I configured my production application for XA, I kept getting intermittent errors from one or the other resource manager that a transaction id was invalid. I couldn't reproduce the problem in my test app. What was going on?

A little background for this bug: for a two-phase commit (2PC), the TM first sends a message to all of the resource managers involved in the transaction that it wants to commit, and so they should indicate whether they can do so, and also 'freeze' the transaction in question. Each resource manager then sends back a 'vote.' This vote can basically be 'commit,' 'rollback,' or 'I was enlisted but no changes were made to me, so I don't have to do anything.'
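The voting round can be sketched as follows. This is a toy model, not JOTM's code: Vote, Participant, RecordingParticipant and TwoPhaseCoordinator are hypothetical names. In the real javax.transaction.xa.XAResource API, as far as I understand it, prepare() returns XA_OK or XA_RDONLY and a rollback vote is signalled by throwing an XAException; the enum below flattens that into three values to match the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** The three possible answers to "can you commit?" in this toy model. */
enum Vote { COMMIT, ROLLBACK, READ_ONLY }

/** Hypothetical stand-in for a resource manager taking part in a transaction. */
interface Participant {
  Vote prepare();
  void commit();
  void rollback();
}

/** Participant that always gives the same vote and records every call it receives. */
class RecordingParticipant implements Participant {
  final List<String> calls = new ArrayList<>();
  private final Vote vote;

  RecordingParticipant(Vote vote) { this.vote = vote; }

  public Vote prepare() { calls.add("prepare"); return vote; }
  public void commit() { calls.add("commit"); }
  public void rollback() { calls.add("rollback"); }
}

class TwoPhaseCoordinator {
  /** Runs both phases; returns true if the transaction committed. */
  static boolean complete(List<? extends Participant> participants) {
    List<Participant> votedCommit = new ArrayList<>();
    boolean abort = false;

    // Phase one: collect a vote from every participant.
    for (Participant p : participants) {
      Vote vote = p.prepare();
      if (vote == Vote.COMMIT) {
        votedCommit.add(p);
      } else if (vote == Vote.ROLLBACK) {
        abort = true;
      }
      // READ_ONLY: nothing changed there, so it takes no part in phase two.
    }

    // Phase two: only participants that voted COMMIT hear from the coordinator again.
    for (Participant p : votedCommit) {
      if (abort) p.rollback(); else p.commit();
    }
    return !abort;
  }

  public static void main(String[] args) {
    RecordingParticipant a = new RecordingParticipant(Vote.COMMIT);
    RecordingParticipant b = new RecordingParticipant(Vote.READ_ONLY);
    System.out.println("committed: " + complete(Arrays.asList(a, b)));
    System.out.println("a saw " + a.calls + ", b saw " + b.calls);
  }
}
```

Note that phase two works on a shorter list than phase one whenever someone votes read-only. Any bookkeeping that is still indexed by the original enlistment order has to account for that.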

Turns out JOTM 2.0.10 doesn't handle that third vote so well. When a resource manager indicates that it doesn't need to commit, JOTM can get its Xids for the various resource managers confused and end up sending the wrong branch Xid to a resource manager that does need to commit. Which is what was happening to me. This is a big deal, and happened to me almost immediately - the only reason I can think of that it wasn't identified and fixed was that XAPool is so often used, and with XAPool none of this 2PC stuff is actually happening, so XAPool probably always votes 'commit.' I haven't checked.
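Reduced to its essentials, the mix-up is an index into a filtered list being reused against an unfiltered one. The sketch below is a simplified model of that, not JOTM's actual code: XidLookupSketch and its methods are hypothetical, and resources and Xids are plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the Xid mix-up; names and data shapes are illustrative only. */
class XidLookupSketch {

  /** Keeps only the resources that still need a phase-two commit. */
  private static List<String> needingCommit(List<String> resources, List<Boolean> readOnly) {
    List<String> logged = new ArrayList<>();
    for (int i = 0; i < resources.size(); i++) {
      if (!readOnly.get(i)) logged.add(resources.get(i));
    }
    return logged;
  }

  /** Buggy pairing of resources with branch ids. */
  static List<String> buggyPairs(List<String> resources, List<String> xids, List<Boolean> readOnly) {
    List<String> logged = needingCommit(resources, readOnly);
    List<String> pairs = new ArrayList<>();
    for (int i = 0; i < logged.size(); i++) {
      // Bug: i indexes the filtered list, but xids is parallel to the unfiltered one.
      pairs.add(logged.get(i) + " -> " + xids.get(i));
    }
    return pairs;
  }

  /** Fixed pairing: the branch id is captured when the resource is logged for commit. */
  static List<String> fixedPairs(List<String> resources, List<String> xids, List<Boolean> readOnly) {
    List<String> pairs = new ArrayList<>();
    for (int i = 0; i < resources.size(); i++) {
      if (!readOnly.get(i)) {
        pairs.add(resources.get(i) + " -> " + xids.get(i));
      }
    }
    return pairs;
  }

  public static void main(String[] args) {
    List<String> resources = Arrays.asList("oracle", "sqlserver");
    List<String> xids = Arrays.asList("branch-0", "branch-1");
    List<Boolean> readOnly = Arrays.asList(true, false); // the first resource votes read-only
    System.out.println("buggy: " + buggyPairs(resources, xids, readOnly));
    System.out.println("fixed: " + fixedPairs(resources, xids, readOnly));
  }
}
```

With the first resource voting read-only, the buggy version pairs the second resource with the first resource's branch id, which the resource manager then rejects as invalid.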

So I built JOTM from source with my fix. The offending class is org.objectweb.jotm.SubCoordinator. I was going to post a little examination of the bug, but honestly it isn't worth the effort. It's really glaring, and there's nothing particularly subtle or interesting about it. Here's the patch:


77a78,79
>     
>     private Vector loggedJavaxXids = new Vector();
83c85
<     public void  addResource( XAResource res, Xid xid )
---
>     public void  addResource( XAResource res, Xid xid, javax.transaction.xa.Xid javaXid )
90a93
>         loggedJavaxXids.addElement(javaXid);
105a109,112
>     
>     public javax.transaction.xa.Xid getLoggedJavaxXid(int index) {
>      return (javax.transaction.xa.Xid) loggedJavaxXids.get(index);
>     }
825c832
<                             log.addResource(res,xid);
---
>                             log.addResource(res,xid,myjavaxxid);
1021c1028,1034
<             javax.transaction.xa.Xid myjavaxxid = (javax.transaction.xa.Xid) javaxxidList.elementAt(i);
---
>             /* whoops - this line is wrong. For example if the resource manager corresponding to
>              * javaxxidList.elementAt(0) votes VOTE_READONLY and the next on the list votes
>              * VOTE_COMMIT, the wrong xid will get matched with this resource and the commit
>              * will fail.
>              */
> //            javax.transaction.xa.Xid myjavaxxid = (javax.transaction.xa.Xid) javaxxidList.elementAt(i);
>             javax.transaction.xa.Xid myjavaxxid = (javax.transaction.xa.Xid) log.getLoggedJavaxXid(i);

Wiring the Pieces (Spring, Hibernate, and some Pooling)
Now that JOTM will work correctly, let's take a look at the Spring configuration necessary to get ourselves off the ground.


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd">

<beans>
<bean id="sqlServerDataSourceTarget" class="com.microsoft.sqlserver.jdbc.SQLServerXADataSource">
 <property name="URL" value="jdbc:sqlserver://server:port;databaseName=jtatest;user=testing_user;password=testing_user;"/>
</bean>

<bean id="oracleDataSourceTarget" class="oracle.jdbc.xa.client.OracleXADataSource">
 <property name="URL" value="jdbc:oracle:thin:user/password@server:port:PPRD"/>
</bean>

<bean id="jotm" class="org.springframework.transaction.jta.JotmFactoryBean"/>

<bean id="txManager" class="org.springframework.transaction.jta.JtaTransactionManager">
 <property name="userTransaction" ref="jotm"/>
</bean>

<bean id="sqlServerDataSource" class="com.mattmcgill.PoolingXADataSource">
 <property name="xaDataSource" ref="sqlServerDataSourceTarget"/>
 <property name="transactionManager" ref="jotm"/>
</bean>

<bean id="oracleDataSource" class="com.mattmcgill.PoolingXADataSource">
 <property name="xaDataSource" ref="oracleDataSourceTarget"/>
 <property name="transactionManager" ref="jotm"/>
</bean>

<bean id="sqlServerSessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
 <property name="dataSource" ref="sqlServerDataSource"/>
 <property name="jtaTransactionManager" ref="jotm"/>
 <property name="useTransactionAwareDataSource" value="true"/>
 <property name="hibernateProperties">
   <props>
     <prop key="hibernate.dialect">org.hibernate.dialect.SQLServerDialect</prop>
   </props>
 </property>
 <property name="mappingResources">
   <list>
     <value>com/mattmcgill/jtatest/some-entity.hbm.xml</value>
   </list>
 </property>
</bean>

<bean id="oracleSessionFactory" class="org.springframework.orm.hibernate3.LocalSessionFactoryBean">
 <property name="dataSource" ref="oracleDataSource"/>
 <property name="jtaTransactionManager" ref="jotm"/>
 <property name="useTransactionAwareDataSource" value="true"/>
 <property name="hibernateProperties">
   <props>
     <prop key="hibernate.dialect">org.hibernate.dialect.Oracle9Dialect</prop>
   </props>
 </property>
 <property name="mappingResources">
   <list>
     <value>edu/taylor/jta/fee.hbm.xml</value>
   </list>
 </property>
</bean>
</beans>

In this example, XADataSources are configured for Oracle and SQL Server. Then a JOTM bean is configured using the helpful org.springframework.transaction.jta.JotmFactoryBean class. Next, we wrap our XADataSources in instances of PoolingXADataSource, a little helper class I wrote which makes sure that when a connection is obtained from the pool, it gets enlisted in any current thread-bound transaction that happens to be taking place. Here is the code for that class:

PoolingXADataSource.java


package com.mattmcgill;

import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

import javax.sql.DataSource;
import javax.sql.XAConnection;
import javax.sql.XADataSource;
import javax.transaction.Status;
import javax.transaction.Transaction;
import javax.transaction.TransactionManager;

import org.apache.commons.dbcp.PoolableConnection;
import org.apache.commons.dbcp.PoolingDataSource;
import org.apache.commons.pool.PoolableObjectFactory;
import org.apache.commons.pool.impl.GenericObjectPool;
import org.apache.log4j.Logger;
import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.InitializingBean;

public class PoolingXADataSource implements DataSource, PoolableObjectFactory, InitializingBean, DisposableBean {
 private static final Logger logger = Logger.getLogger(PoolingXADataSource.class);

 private Map<Connection, XAConnection> xaConnectionLookup = new HashMap<Connection, XAConnection>();
 private XADataSource xaDataSource;
 private TransactionManager transactionManager;
 private GenericObjectPool connectionPool;
 private PoolingDataSource poolingDataSource;

 public XADataSource getXaDataSource() {
    return xaDataSource;
 }

 public void setXaDataSource(XADataSource xaDataSource) {
    this.xaDataSource = xaDataSource;
 }

 public TransactionManager getTransactionManager() {
    return transactionManager;
 }

 public void setTransactionManager(TransactionManager transactionManager) {
    this.transactionManager = transactionManager;
 }

 public void afterPropertiesSet() throws Exception {
    if (xaDataSource == null)
       throw new RuntimeException("Must set xaDataSource");
    if (transactionManager == null)
       throw new RuntimeException("Must set transactionManager");
    connectionPool = new GenericObjectPool(this);
    connectionPool.setWhenExhaustedAction(GenericObjectPool.WHEN_EXHAUSTED_GROW);
    connectionPool.setMaxActive(30);
    connectionPool.setMinIdle(5);
    for (int i=0;i<5;i++) connectionPool.addObject();
    poolingDataSource = new PoolingDataSource(connectionPool);
 }

 public void destroy() throws Exception {
    connectionPool.close();
 }

 public void activateObject(Object conn) throws Exception {
    logger.debug("activateObject(" + conn + ") [" + xaDataSource + "]");
    logger.debug("  active: " + connectionPool.getNumActive());
    logger.debug("  idle: " + connectionPool.getNumIdle());
    XAConnection xaConn = xaConnectionLookup.get(conn);
    if (transactionManager.getStatus() != Status.STATUS_NO_TRANSACTION) {
       Date start = new Date();
       logger.info("Found transaction, associating connection");
       Transaction tx = transactionManager.getTransaction();
       tx.enlistResource(xaConn.getXAResource());
       Date end = new Date();
       logger.debug("  enlisting took " + (end.getTime() - start.getTime()) + " millis");
    }
 }

 public void destroyObject(Object conn) throws Exception {
    logger.debug("destroyObject(" + conn + ") [" + xaDataSource + "]");
    logger.debug("  active: " + connectionPool.getNumActive());
    logger.debug("  idle: " + connectionPool.getNumIdle());
    logger.info("Destroying connection [" + xaDataSource + "]");
    xaConnectionLookup.remove(conn);
 }

 public Object makeObject() throws Exception {
    logger.debug("makeObject() [" + xaDataSource + "]");
    logger.info("Opening connection [" + xaDataSource + "]");
    XAConnection xaConn = xaDataSource.getXAConnection();
    Connection conn = new PoolableConnection(xaConn.getConnection(), connectionPool);
    xaConnectionLookup.put(conn, xaConn);
    logger.debug("returned [" + conn + "]");
    return conn;
 }

 public void passivateObject(Object conn) throws Exception {
    logger.debug("passivateObject(" + conn + ") [" + xaDataSource + "]");
    logger.debug("  active: " + connectionPool.getNumActive());
    logger.debug("  idle: " + connectionPool.getNumIdle());
 }

 public boolean validateObject(Object conn) {
    logger.debug("validateObject(" + conn + ") [" + xaDataSource + "]");
    return true;
 }

 public Connection getConnection() throws SQLException {
    logger.debug("getConnection() [" + xaDataSource + "]");
    try {
       return poolingDataSource.getConnection();
    } catch (Exception ex) {
       throw new RuntimeException("Couldn't borrow object from pool", ex);
    }
 }

 public Connection getConnection(String arg0, String arg1)
       throws SQLException {
    throw new UnsupportedOperationException();
 }

 public PrintWriter getLogWriter() throws SQLException {
    return xaDataSource.getLogWriter();
 }

 public void setLogWriter(PrintWriter writer) throws SQLException {
    xaDataSource.setLogWriter(writer);
 }

 public void setLoginTimeout(int timeout) throws SQLException {
    xaDataSource.setLoginTimeout(timeout);
 }

 public int getLoginTimeout() throws SQLException {
    return xaDataSource.getLoginTimeout();
 }
}

Don't Forget These!
After that, all that's left is to configure our SessionFactorys (which I do by way of Spring's LocalSessionFactoryBean), and we're all set. Except that nothing will work unless you remember to set the following properties on LocalSessionFactoryBean:


 <property name="jtaTransactionManager" ref="jotm"/>
 <property name="useTransactionAwareDataSource" value="true"/>

The Spring documentation is very clear on how to use this session factory definition in conjunction with a TransactionProxyFactoryBean to configure declarative transactions. I'll leave that part up to you =)

If anyone can tell me a better way to make this work with connection pooling which removes the need for the PoolingXADataSource class, please let me know! I feel like it should be simple and I must be missing something.

Tuesday, August 15, 2006

Maven 2 site plugin woes

We're using Maven 2 here to manage our build process, and I'd like to make the Maven-generated site our main source for developer documentation since it already hosts reports, JavaDocs, etc. Unfortunately, the site plug-in (currently at version 2.0-beta-5 I believe) is not exactly being cooperative. I have some menus defined in site.xml for the top-level project which I do not want inherited by sub-projects. According to the Maven documentation, the default behavior is that menus are not inherited. And yet... they are. So I included the inherit="none" attribute on the <menu> tag in question, built and deployed the top-level site, built and deployed a child module's site, and the menus were still inherited. And the links were wrong. Grr.

While adding breadcrumbs, I discovered my mistake: in order for the inheritance settings to work correctly, the child modules have to have their own site.xml files. If a child module doesn't have its own site.xml, then Maven appears to use the parent site.xml for the child module directly, instead of using default-site.xml for the child and inheriting from the parent site.xml as I expected.

I've resisted for so long...

Blogs often contain information of no relevance to anyone but the blogger/blogger's immediate circle, and I've held out for fear of adding to the noise. Until now, that is. Java- and developer-related blogs have recently pulled me out of a couple jams, so to return the favour I'll document my difficulties here on the web for the benefit of any fellow developers struggling with Java, Hibernate, Spring, JTA, web frameworks, or what-have-you. At least that's the plan...
