MongoDB Error-handling Notes

Russell Bateman
10 June 2013
last update:

Table of Contents

Write concern
Java driver
Setting up to use MongoDB
Java
Morphia
Preformed static write concerns for Java (and Morphia)
Exceptions and errors to handle
Beginning of error-handling
Final DAO pseudo code

Most (if not all) of the code examples here really use Morphia, but that's not much more than a decorator atop the MongoDB Java driver. Consequently, exceptions and errors propagate in from the driver no differently than if we were coding directly to the driver.

I started out using MongoDB's 2.11.1 Java driver, but ended with 2.4.x later by the time I outlined the exceptions and errors to handle table.

Write concern

Originally, MongoDB did not encourage much in the way of write concern and the default was to send off the operation and assume it would be successful.

Today, 10gen encourages choosing one of a range of write concerns, each of which also promises the detection of specific errors:

  1. Errors ignored. Uh, really?
  2. Unacknowledged. What used to be the default write concern—no longer considered a responsible solution in normal operation.
  3. Acknowledged. Confirmation of write operation. Exceptions caught include network, duplicate key and write-concern error.
  4. Journaled. Confirmation only after mongod has written it to the journal. The operation should survive shutdown.
  5. Replica-acknowledged. Confirmation only after mongod has reached one or more replica nodes. An explicit number of nodes or "majority" may be specified as a requirement.

Java driver

It's sometimes challenging to draw lines from general MongoDB documentation to the Java driver (not to mention all the way to Morphia as well).

Setting up to use MongoDB

These are the opening statements for the source of variable mongo in the rest of the code examples. Also, some other definitions.

public MongoClient setup( String hostname, int port ) throws UnknownHostException
{
    ServerAddress address = new ServerAddress( hostname, port );

    // (old way)
    // Mongo mongo = new Mongo( address );

    // (modern way)
    MongoClient mongo = new MongoClient( address );

    return mongo;
}

MongoClient mongo = setup( "localhost", 27017 );

String databaseName   = "accountdb";
String collectionName = "accounts";

Java

Here's a list of Java statements for setting write concern. The write concern can be created and passed to each individual operation as a subsequent, optional argument to the relevant method. Here, we decide we want to hear back from the primary and at least one of the secondaries. We decide to time-out after 250 milliseconds.

public Account create( Account account )
{
    DBCollection collection = mongo.getDB( databaseName ).getCollection( collectionName );
    WriteConcern concern    = new WriteConcern( 2, 250 );

    collection.insert( account.getBsonFromPojo(), concern );
    return account;
}

Another option is to establish a default write concern when setting up the MongoDB connection. Use MongoOptions (or, with MongoClient, MongoClientOptions) to set a default write concern that waits until the primary and at least one secondary acknowledge the write, with a time-out of 250 milliseconds. This obviates creating an instance of WriteConcern and passing it explicitly as the second argument to collection.insert() above.

public MongoClient setup( String hostname, int port ) throws UnknownHostException
{
    ServerAddress address = new ServerAddress( hostname, port );

    // (old way)
    // MongoOptions options = new MongoOptions();
    // options.setW( 2 );
    // options.setWtimeout( 250 );
    //
    // Mongo mongo = new Mongo( address, options );

    // (modern way)
    Builder builder = new MongoClientOptions.Builder();
    builder.writeConcern( new WriteConcern( 2, 250 ) );

    MongoClientOptions options = builder.build();
    MongoClient        mongo   = new MongoClient( address, options );

    return mongo;
}

Morphia

(This is crude and no way to write Morphia code, but it suffices as a complete example on which to base real code.)

Morphia is built atop the Java driver. In Morphia, collections are identified by association with the POJO class rather than explicitly as done when working with the Java driver.

Here, we insist on the primary and one secondary acknowledging receipt of our data before assuming we're out of harm's way. Note that REPLICA_ACKNOWLEDGED is not identical to the new WriteConcern( 2, 250 ) we used in the pure Java examples: it carries no time-out, so it will block until both the primary and at least one secondary report back in.

public void update( Account account )
{
    Morphia   morphia   = new Morphia();
    Datastore datastore = morphia.createDatastore( mongo, databaseName );
    datastore.merge( account, WriteConcern.REPLICA_ACKNOWLEDGED );  // see the note below
}

Please note, however, that if the original MongoDB connection was set up with options that include a default write concern, establishing such a default in Morphia is (or should be) unnecessary.

Just as for Java without Morphia, there is a way to default the write concern and avoid having to pass it explicitly. This matters because some operations (UpdateOperations, for example) do not take a WriteConcern argument anyway. Again, if the write concern was set up when the MongoDB connection was made, setting a default on the datastore is unnecessary.

private Datastore getDatastore( String databaseName )
{
    Morphia   morphia   = new Morphia();
    Datastore datastore = morphia.createDatastore( mongo, databaseName );
    datastore.setDefaultWriteConcern( WriteConcern.REPLICA_ACKNOWLEDGED );
    return datastore;
}

public void update( Account account )
{
    // getDatastore() takes the database name
    getDatastore( databaseName ).merge( account );
}

Preformed static write concerns for Java (and Morphia)

These avoid having to compose one's own. There are others, older and on the way to deprecation. These come with defaults like 0 for the time-out. Perusing the MongoDB documentation will reveal equivalent, "by-hand" ways of creating a write concern. At least one is given here, however, for clarity.

These are in order of ascending time-to-complete.

WriteConcern.ERRORS_IGNORED No checking whatsoever. There is probably never a good reason to do this.
WriteConcern.UNACKNOWLEDGED Returns as soon as the message is written to the socket. Network but no server exceptions raised. This is the old default, now considered inadequate.
WriteConcern.ACKNOWLEDGED Waits for primary server to acknowledge receipt of write.
WriteConcern.FSYNCED Waits for server to flush to disk. Network and server exceptions raised.
WriteConcern.JOURNALED Waits for the server to commit the journal file to disk. Network and server exceptions raised.
WriteConcern.REPLICA_ACKNOWLEDGED Waits for at least 2 servers (primary and one secondary). Network and server exceptions raised. This is equivalent to:
    new WriteConcern( 2 );
WriteConcern.MAJORITY Waits for a majority of the servers in the replica set (that is, more than half of them) to report in. Network and server exceptions raised.
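
To make the "by-hand" equivalents a little more concrete, here is a small sketch that does not use the driver at all. The enum WriteConcernLevel and its methods are made up for illustration. The w values reflect my understanding of what the 2.x driver puts behind each constant (only the 2 for REPLICA_ACKNOWLEDGED is confirmed above), so treat the others as assumptions to check against the driver source.

```java
// Hypothetical stand-in for the driver's preformed constants: each entry
// records the "w" value (and journal flag) that a by-hand write concern
// would carry. The values other than REPLICA_ACKNOWLEDGED are assumptions.
enum WriteConcernLevel
{
    ERRORS_IGNORED( -1, false ),
    UNACKNOWLEDGED( 0, false ),
    ACKNOWLEDGED( 1, false ),
    JOURNALED( 1, true ),
    REPLICA_ACKNOWLEDGED( 2, false );

    final int     w;        // number of servers that must acknowledge
    final boolean journal;  // wait for the journal commit as well?

    WriteConcernLevel( int w, boolean journal )
    {
        this.w       = w;
        this.journal = journal;
    }

    /** True when the caller hears back from at least one server. */
    boolean isAcknowledged()
    {
        return w > 0;
    }

    /** Smallest number of servers that is more than half the replica set. */
    static int majority( int replicaSetSize )
    {
        if( replicaSetSize < 1 )
            throw new IllegalArgumentException( "replica set size must be at least 1" );

        return replicaSetSize / 2 + 1;
    }
}
```

For a three-node replica set, WriteConcernLevel.majority( 3 ) gives 2; for a four-node set, it gives 3.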

Exceptions and errors to handle

Here are the exceptions, errors, write and command results that MongoDB reveals under various circumstances. It's somewhat challenging to detect them and more challenging to figure out what to do with them in any given case.

MongoException.DuplicateKey This is an attempt to insert a new document whose unique key already exists. It's useless to retry.
MongoException.Network Network-related exceptions. Determine a) whether to retry, b) how many times to retry and c) how long to wait before retrying.
MongoException.CursorNotFound Cursor not found (or timed-out). The default time-out is 10 minutes. In general, making the cursor default "no time-out" is very bad. Do not retry; the remedy lies elsewhere.
WriteConcernException Exception representing an error reported because of write failure. The CommandResult it carries is what getLastError() returned. There are a number of interfaces that will return a WriteResult containing information such as WriteResult.getN(), the number of documents updated. If the write was supposed to reach a given number of replica nodes, its failure to do so is probably recorded here, though I don't yet know how it is reported.
MongoException General exceptions raised in Mongo and parent class to all above. This is anything not caught already, which shouldn't be much and probably not worth retrying.
IOException Hmmmm...
Exception Hmmmm...
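
The three retry questions for network failures (whether, how many times, how long to wait) can be sketched without the driver. This is only a sketch under stated assumptions: java.io.IOException stands in for MongoException.Network, and the names RetryingCaller and callWithRetry are invented for the example. Doubling the wait on each attempt is one arbitrary answer to "how long"; nothing in the driver prescribes it.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper: retries an operation on (simulated) network failure.
final class RetryingCaller
{
    private RetryingCaller() { }

    /**
     * Runs the operation up to maxTries times. After each IOException
     * (our stand-in for MongoException.Network) it sleeps, then doubles
     * the wait. Any other exception propagates at once, since retrying
     * something like a duplicate key is useless.
     */
    static <T> T callWithRetry( Callable<T> operation, int maxTries, long initialWaitMillis )
        throws Exception
    {
        if( maxTries < 1 )
            throw new IllegalArgumentException( "maxTries must be at least 1" );

        long wait = initialWaitMillis;

        for( int attempt = 1; ; attempt++ )
        {
            try
            {
                return operation.call();
            }
            catch( IOException e )
            {
                // out of tries: let the caller see the failure
                if( attempt >= maxTries )
                    throw e;

                Thread.sleep( wait );
                wait *= 2;
            }
        }
    }
}
```

A caller would wrap the write in a Callable and pass it in along with, say, 3 tries and an initial wait of 100 milliseconds.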

Now let's tackle which exceptions should be handled for which CRUD operation and whether to retry the operation or consider the failure to be hopeless. This is our best stab at what's relevant to what.

Create

  1. MongoException.DuplicateKey
  2. MongoException.Network
  3. WriteConcernException

Read

  1. MongoException.CursorNotFound
  2. MongoException.Network
  3. MongoException

Update

  1. MongoException.Network
  2. WriteConcernException

Delete

  1. MongoException.Network
  2. WriteConcernException
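
The lists above amount to a small lookup table, which can be written down as data. The following is a sketch only: the class CrudExceptionTable and its methods are invented for illustration, the exceptions are recorded by name so that the example runs without the driver, and treating only network failures as worth retrying is my reading of the table of exceptions above (it says nothing either way about WriteConcernException, so that part is an assumption).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical lookup table: which exceptions matter for which operation.
final class CrudExceptionTable
{
    enum Operation { CREATE, READ, UPDATE, DELETE }

    private static final String DUPLICATE_KEY = "MongoException.DuplicateKey";
    private static final String NETWORK       = "MongoException.Network";
    private static final String NO_CURSOR     = "MongoException.CursorNotFound";
    private static final String WRITE_CONCERN = "WriteConcernException";
    private static final String GENERAL       = "MongoException";

    private static final Map<Operation, List<String>> HANDLED =
        new EnumMap<Operation, List<String>>( Operation.class );

    static
    {
        HANDLED.put( Operation.CREATE, Arrays.asList( DUPLICATE_KEY, NETWORK, WRITE_CONCERN ) );
        HANDLED.put( Operation.READ,   Arrays.asList( NO_CURSOR, NETWORK, GENERAL ) );
        HANDLED.put( Operation.UPDATE, Arrays.asList( NETWORK, WRITE_CONCERN ) );
        HANDLED.put( Operation.DELETE, Arrays.asList( NETWORK, WRITE_CONCERN ) );
    }

    private CrudExceptionTable() { }

    /** The exceptions to catch for an operation, in the order listed above. */
    static List<String> handledBy( Operation operation )
    {
        return Collections.unmodifiableList( HANDLED.get( operation ) );
    }

    /** Assumption: only network failures are candidates for a retry. */
    static boolean worthRetrying( String exceptionName )
    {
        return NETWORK.equals( exceptionName );
    }
}
```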

Beginning of error-handling

Now that we've decided which exceptions and errors are to be detected, and under which situations, let's examine what we can do with them. This is the really challenging part: just having detected an exception, ask...

Here's some skeletal DAO code performing an unspecified operation (one of create, read, update or delete) with ad absurdum try-catch statements. Note the places where CommandResult may be obtained.

public void method( Account account )
{
    Datastore     datastore = getDatastore( databaseName );
    CommandResult result    = null;

    try
    {
        datastore.{save|find|get|save|update|etc.}( ... );

        result = mongo.getDB( databaseName ).getLastError();
    }
    catch( MongoException.DuplicateKey e )
    {
        /* This shouldn't happen in our case since we verify beforehand that
         * the key doesn't exist, however, if the same key is being created
         * by another process and it beat us, then this would happen.
         */
        result = e.getCommandResult();
    }
    catch( MongoException.Network e )
    {
        /* If we thought it worth our while, and maybe it is, we could
         * increase the time-out here each time we retry.
         */
    }
    catch( MongoException.CursorNotFound e )
    {
        /* (Not yet considered as this document is mostly about handling
         * write concern errors.)
         */
    }
    catch( WriteConcernException e )
    {
        /* Cannot fulfill the write concern, in our case, this would mean
         * that it didn't reach both the primary and one secondary before
         * timing out or failing, but it may have reached the primary.
         */
        result = e.getCommandResult();
    }
    catch( UpdateException e )
    {
        /* This is a Morphia-only exception about update failing. It
         * appears always to mean "this document not found", perhaps
         * it's been deleted, which should mean this exception only
         * occurs in our JMeter and other testing where accounts,
         * addresses, payments, etc. are deleted for clean-up.
         */
    }
    catch( MongoException e )
    {
        /* What MongoDB error cases are left over to be caught here?
         */
    }
    catch( Exception e )
    {
        /* The cave walls are falling in on top of us!
         */
    }
}

Final DAO pseudo code

As already noted, this is still a little bogus in terms of what DAO code would really look like, but again, it's got to corral all the elements that might be used together (if across several classes) for a real solution. To abbreviate the code and reduce (at least for now) maintenance of the set-up and try-catch code, (most) all calls funnel down into and awkwardly call common code.

AccountDao.java:
package com.acme.user.dao;

import java.net.UnknownHostException;
import java.util.Formatter;
import java.lang.StringBuilder;

import org.apache.log4j.Logger;
import org.bson.types.ObjectId;

import com.google.code.morphia.Datastore;
import com.google.code.morphia.Morphia;
import com.google.code.morphia.query.UpdateException;
import com.acme.user.entity.Account;
import com.mongodb.CommandResult;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.MongoClientOptions.Builder;
import com.mongodb.MongoException;
import com.mongodb.ServerAddress;
import com.mongodb.WriteConcern;
import com.mongodb.WriteConcernException;
import com.mongodb.WriteResult;

import com.acme.mongodb.MongoCommandResult;

public class AccountDao
{
    private static Logger log = Logger.getLogger( AccountDao.class );

    private static final int    MAX_TRIES       = 3;
    private static final String DATABASE_NAME   = "accountdb";
    private static final String COLLECTION_NAME = "accounts";

    private MongoClient  mongo      = null;
    private Morphia      morphia    = new Morphia();
    private DBCollection collection = null;
    private Datastore    datastore  = null;

    public AccountDao( String hostname, int port ) throws UnknownHostException
    {
        // this really belongs in MongoDB start-up code somewhere else...
        ServerAddress address = new ServerAddress( hostname, port );

        Builder builder = new MongoClientOptions.Builder();
        builder.writeConcern( new WriteConcern( 2, 250 ) );

        MongoClientOptions options = builder.build();

        mongo      = new MongoClient( address, options );
        collection = mongo.getDB( DATABASE_NAME ).getCollection( COLLECTION_NAME );
        datastore  = morphia.createDatastore( mongo, DATABASE_NAME );
    }

    // canonicalize our error/log messages if possible...
    private static final String DUPLICATE_KEY = "Key %s already exists--cannot create new one";
    private static final String NETWORK       = "Network error: Failed to %s %s, tries left: %d";
    private static final String WRITECONCERN  = "%s has been written to primary, but not to any secondary by end of specified time-out";
    private static final String UNKNOWN_MONGO = "Uncategorized MongoDB %s failure for %s (see stack trace)";
    private static final String UPDATE        = "%s not updated (probably because it's been deleted)";
    private static final String UNKNOWN_ERROR = "Failed utterly to %s %s: unknown error (see stack trace)";

    /**
     * This is ugly, but it lets us keep the try-catch code in one place
     * for the purposes of this example. Then again, maybe we'd want to
     * do it this way or put it into an abstract class, etc. It's an
     * exercise left to the reader.
     *
     * Be sure to see the comments on individual catches in the previous
     * section of this article.
     */
    private void handleOperation( Account account, String operation ) throws Exception
    {
        int    tries    = MAX_TRIES;
        String identity = account.getIdentity();

        while( tries-- > 0 )
        {
            String        message;
            CommandResult result;
            StringBuilder sb  = new StringBuilder();
            Formatter     fmt = new Formatter( sb );

            try
            {
                if( operation.equals( "create" ) )
                {
                    WriteResult writeResult = collection.insert( account.getBsonFromPojo() );

                    // result is already declared above; assign rather than redeclare
                    result = writeResult.getLastError();

                    if( result.ok() )
                        log.info( "Added " + identity + "..." );
                }
                else if( operation.equals( "update" ) )
                {
                    datastore.merge( account );
                }
                else if( operation.equals( "delete" ) )
                {
                    datastore.delete( account );
                }

                break;
            }
            catch( MongoException.DuplicateKey e )        // (CREATE only)
            {
                result = e.getCommandResult();

                fmt.format( DUPLICATE_KEY, identity );
                message = sb.toString();
                log.error( message );
                throw new Exception( message );
            }
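            catch( MongoException.Network e )
            {
                fmt.format( NETWORK, operation, identity, tries );
                message = sb.toString();
                log.debug( message );

                // out of tries: report the failure instead of returning silently
                if( tries == 0 )
                    throw new Exception( message );

                continue;
            }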
            catch( WriteConcernException e )
            {
                result = e.getCommandResult();

                MongoCommandResult r = new MongoCommandResult( result );

                // on localhost, where there is no replica set, just bail...
                if( MongoCommandResult.NO_REPL_ENABLED.equals( r.wnote ) )   // null-safe: wnote may be absent
                    return;

                fmt.format( WRITECONCERN, identity );
                message = sb.toString();
                log.debug( message );
                break;
            }
            catch( UpdateException e )
            {
                fmt.format( UPDATE, identity );
                message = sb.toString();
                log.error( e.getMessage() + " " + message );
                throw new Exception( message );
            }
            catch( MongoException e )
            {
                fmt.format( UNKNOWN_MONGO, operation, identity );
                message = sb.toString();
                log.error( e.getMessage() + " " + message );
                e.printStackTrace();
                throw new Exception( message );
            }
            catch( Throwable e )
            {
                fmt.format( UNKNOWN_ERROR, operation, identity );
                message = sb.toString();
                log.error( e.getMessage() + " " + message );
                e.printStackTrace();
                throw new Exception( message );
            }
        }
    }

    public Account create( Account account ) throws Exception
    {
        handleOperation( account, "create" );
        return account;
    }

    public Account readByOid( ObjectId accountoid )
    {
        return datastore.find( Account.class, "_id", accountoid ).get();
    }

    public void update( Account account ) throws Exception
    {
        handleOperation( account, "update" );
    }

    public void delete( Account account ) throws Exception
    {
        handleOperation( account, "delete" );
    }
}

Here's something that will help, at least, at first. It sorts out the CommandResult coming back and allows one to make sense in code of what's going on in the case of WriteConcernException.

Utility class for parsing a MongoDB CommandResult for when that can be useful to us. That's probably only going to be in early days as we're sorting out MongoDB exception- and error handling.

There's no need to stand on ceremony here: everything in this class is wide-open.

Notes

- When running on the local development host where there is no replica set, WriteConcernException is always thrown.

- When running locally or against a single-node MongoDB (no replica set):

    wnote = "no replication has been enabled, so w=2 won't work"

- n is the number of documents updated; strangely, this can be 0 even though a document was actually updated in a single-node set-up.

- Delightfully, serverUsed seems to be just what you'd expect.

package com.acme.mongodb;

import com.mongodb.CommandResult;

public class MongoCommandResult
{
	public String serverUsed;
	public String wnote;
	public String err;
	public int    n;
	public int    connectionId;
	public Double ok;

	public MongoCommandResult( CommandResult result )
	{
		this.serverUsed   = result.getString( "serverUsed" );
		this.wnote        = result.getString( "wnote" );
		this.err          = result.getString( "err" );
		// use defaults so that an absent field doesn't throw
		this.n            = result.getInt( "n", 0 );
		this.connectionId = result.getInt( "connectionId", 0 );
		this.ok           = result.getDouble( "ok", 0.0 );
	}

	@Override
	public String toString()
	{
		StringBuilder sb = new StringBuilder();

		sb.append( "{" )
		  .append( "\n    serverUsed = " + serverUsed )
		  .append( "\n         wnote = " + wnote )
		  .append( "\n           err = " + err )
		  .append( "\n             n = " + n )
		  .append( "\n  connectionId = " + connectionId )
		  .append( "\n            ok = " + ok )
		  .append( "\n}" );

		return sb.toString();
	}

	public static final String NO_REPL_ENABLED = "no replication has been enabled, so w=2 won't work";
}
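
Since wnote may be absent from a result altogether, the comparison against NO_REPL_ENABLED is worth making null-safe. What follows is only a sketch: it uses a plain Map as a stand-in for CommandResult so that it runs without the driver, and the names WnoteCheck and replicationDisabled are made up for illustration.

```java
import java.util.Map;

// Hypothetical helper: detects the single-node "no replication" note.
final class WnoteCheck
{
    static final String NO_REPL_ENABLED = "no replication has been enabled, so w=2 won't work";

    private WnoteCheck() { }

    /**
     * Null-safe: a missing result, or a result with no wnote field,
     * is simply not a match.
     */
    static boolean replicationDisabled( Map<String, Object> result )
    {
        Object wnote = ( result == null ) ? null : result.get( "wnote" );

        // calling equals() on the constant avoids a NullPointerException
        return NO_REPL_ENABLED.equals( wnote );
    }
}
```

Calling equals() on the constant rather than on the field is the whole trick; the same ordering works directly on the wnote field of MongoCommandResult.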