
How to Build a Batch Enabled Cloud Connector


When we announced the December 2013 release, an exciting new feature also saw daylight: the Batch Module. If you haven’t read the post describing the feature’s highlights, you should, but today I’d like to focus on how the <batch:commit> block interacts with Anypoint™ Connectors and, more specifically, how you can leverage your own connectors to take advantage of the feature.

<batch:commit> overview

In a nutshell, you can use a Batch Commit block to collect a subset of records for bulk upsert to an external source or service. For example, rather than upserting each individual contact (i.e. record) to Google Contacts, you can configure a Batch Commit to collect, let’s say, 100 records, and then upsert all of them to Google Contacts in one chunk. Within a batch step – the only place you can apply it – you can use a Batch Commit to wrap an outbound message processor. See the example below:
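In XML, such a flow could look something like the following sketch. The batch job and step names are illustrative, and the Google Contacts operation shown is a placeholder – the real connector’s operation names and attributes may differ:

```xml
<batch:job name="syncContactsBatch">
    <batch:process-records>
        <batch:step name="upsertStep">
            <!-- Collect 100 records, then dispatch them in one chunk -->
            <batch:commit size="100">
                <!-- Hypothetical bulk upsert operation -->
                <google-contacts:batch-contacts config-ref="GoogleContacts"/>
            </batch:commit>
        </batch:step>
    </batch:process-records>
</batch:job>
```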



Mixing in

This is all great, but what do connectors have to do with this? Well, the only reason the example above makes any sense at all is that the Google Contacts connector is capable of doing bulk operations. If the connector only supported updating records one at a time, then there would be no reason for <batch:commit> to exist.

But wait! The batch module was only released two months ago, yet connectors like Google Contacts, Salesforce, NetSuite, etc., have had bulk operations for years! True that. But what we didn’t have until Batch came along was a construct allowing us to do record-level error handling.

Suppose that you’re upserting 200 records to Salesforce. In the past, if 100 of them failed and the other 100 were successful, it was up to you to parse the connector response, separate the failed records from the successful ones, and take appropriate action. If you wanted to do the same with Google Contacts, you found yourself doing all of that work again, with the extra complexity that you couldn’t reuse your code, because the Google and Salesforce APIs use completely different representations to notify the operation’s result.

Our goal with the batch module is clear: make this stuff simple. We no longer want you struggling to figure out each API’s representation of a bulk result and handling each failed record independently – from now on, you can rely on <batch:commit> to do that for you automatically.
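This record-level bookkeeping is what lets a later batch step act only on the records that failed in earlier steps, via the step’s accept policy. A sketch (the step name and logger expression are illustrative):

```xml
<!-- Only records that failed in previous steps reach this step -->
<batch:step name="reportFailures" accept-policy="ONLY_FAILURES">
    <logger level="ERROR" message="Failed record: #[payload]"/>
</batch:step>
```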

It’s not magic

“A Kind of Magic” is one of my favorite songs by Queen, especially the live performance at Wembley Stadium in ’86. Although the magic described in that song doesn’t apply to batch and connector mechanisms, there’s one phrase in it which accurately describes the problem here: “There can be only one”.

If we want the Batch module to understand all types of bulk operation results, we need to start by defining a canonical way of representing them. We did so in a class called BulkOperationResult, which defines the following contract:

/**
 * This class is used to provide item level information about a bulk operation. This
 * master entity represents the bulk operation as a whole, while the detail entity
 * {@link BulkItem} represents the operation status for each individual data piece.
 * The {@link #items} list defines a contract in which the ordering of those items
 * needs to match the ordering of the original objects. For example, if the bulk
 * operation consisted of 10 person objects in which number X corresponded to the
 * person 'John Doe', then the Xth item in the {@link #items} list must reference
 * the result of processing that same 'John Doe'
 */
public final class BulkOperationResult<T> implements Serializable
{
    /**
     * The operation id
     */
    public Serializable getId();
    
    /**
     * Whether or not the operation was successful. Should be true if
     * and only if all the child {@link BulkItem} entities were also successful
     */
    public boolean isSuccessful();
    
    /**
     * An ordered list of {@link BulkItem}, one per each item in the original
     * operation, no matter if the record was successful or not
     */
    public List<BulkItem<T>> getItems();
    
    /**
     * A custom property stored under the given key
     * 
     * @param key the key of the custom property
     * @return a {@link Serializable} value
     */
    public Serializable getCustomProperty(String key);
}

Basically, the above class is a Master-Detail relationship in which:

  • BulkOperationResult represents the operation as a whole, playing the role of the master
  • BulkItem represents the result for each individual record, playing the role of the detail
  • Both classes are immutable
  • There’s an ordering relationship between the master and the detail: the first item in the BulkItem list corresponds to the first record in the original bulk, the second to the second one, and so forth.
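To make that ordering contract concrete, here’s a minimal, self-contained sketch of how a consumer can pair each result item back with its original record by index. Note that Item is a simplified stand-in for Mule’s BulkItem, not the real class:

```java
import java.util.ArrayList;
import java.util.List;

public class OrderingContractSketch {

    // Simplified stand-in for BulkItem<T>; the real Mule class has more fields.
    static class Item {
        final boolean successful;
        final String message;

        Item(boolean successful, String message) {
            this.successful = successful;
            this.message = message;
        }
    }

    // The ordering contract: items.get(i) describes originals.get(i),
    // so failures can be mapped back to the records that caused them.
    static List<String> failedRecords(List<String> originals, List<Item> items) {
        List<String> failed = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            if (!items.get(i).successful) {
                failed.add(originals.get(i) + ": " + items.get(i).message);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> records = List.of("John Doe", "Jane Roe", "Max Power");
        List<Item> items = List.of(
                new Item(true, "created"),
                new Item(false, "duplicate email"),
                new Item(true, "created"));
        // prints [Jane Roe: duplicate email]
        System.out.println(failedRecords(records, items));
    }
}
```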

In case you’re curious, this is what BulkItem’s contract looks like:

/**
 * This class represents an individual data piece in the context of a bulk operation
 */
public final class BulkItem<T> implements Serializable
{
    /**
     * The item id
     */
    public Serializable getId();

    /**
     * Whether or not it was successful. Notice that this should be false
     * if {@link #exception} is not null; however, there might not be an
     * exception and the item could still be unsuccessful for other reasons.
     */
    public boolean isSuccessful();

    /**
     * Message to add context on this item. Could be an error description, a warning
     * or simply some info related to the operation
     */
    public String getMessage();

    /**
     * An optional status code
     */
    public String getStatusCode();
    
    /**
     * An exception, if the item failed
     */
    public Exception getException();

    /**
     * The actual data this entity represents
     */
    public T getPayload();

    /**
     * A custom property stored under the given key
     * 
     * @param key the key of the custom property
     * @return a {@link Serializable} value
     */
    public Serializable getCustomProperty(String key);
}

So, that’s it? We just modify all connectors to return a BulkOperationResult object on all bulk operations and we’re done? Not quite. That would be the recommended practice for new connectors moving forward, but for existing connectors it would break backwards compatibility with any application written before the release of the Batch module that manually handles the output of bulk operations.

What we did in those cases is have each such connector register a Transformer. Since it’s each connector’s responsibility to understand its API’s domain, it also makes sense to ask each connector to translate its own bulk operation representation into a BulkOperationResult object.

Let’s see an example. This is the signature for an operation in the Google Contacts connector which performs a bulk operation:

public List<BatchResult> batchContacts(String batchId, List<NestedProcessor> operations) throws Exception;

Let’s forget about the implementation of the method for now. The takeaway from the above snippet is that the operation returns a List of BatchResult objects. Let’s see how to register a transformer that goes from that to a BulkOperationResult:

@Start
public void init() {
	this.muleContext.getRegistry().registerTransformer(new BatchResultToBulkOperationTransformer());
}

And for the big finale, the code of the transformer itself:

public class BatchResultToBulkOperationTransformer extends AbstractDiscoverableTransformer {

	public BatchResultToBulkOperationTransformer() {
		this.registerSourceType(DataTypeFactory.create(List.class, BatchResult.class, null));
		this.setReturnDataType(DataTypeFactory.create(BulkOperationResult.class));
	}
	
	@Override
	protected Object doTransform(Object src, String enc) throws TransformerException {
		List<BatchResult> results = (List<BatchResult>) src;
		
		BulkOperationResultBuilder<BaseEntry<?>> builder = BulkOperationResult.<BaseEntry<?>>builder();
		
		if (results != null) {
			for (BatchResult result : results) {
				BatchStatus status = result.getStatus();
				int code = status.getCode();
				
				builder.addItem(BulkItem.<BaseEntry<?>>builder()
						.setRecordId(result.getId())
						.setPayload(result.getEntry())
						.setMessage(status.getContent())
						.setStatusCode(String.format("%d - %s", code, status.getReason()))
						.setSuccessful(code == 200 || code == 201 || code == 204)
					);
			}
		}
		
		return builder.build();
	}

}

Important things to notice about the above transformer:

  • It extends AbstractDiscoverableTransformer. This is so that the batch module can dynamically find it at runtime.
  • It defines the source and target data types in its constructor
  • The doTransform() method does “the magic”
  • Notice how BulkOperationResult and BulkItem classes provide convenient Builder objects to decouple their inner representations from your connector’s code

And that’s pretty much it! One last consideration: what happens if I use a bulk operation in a <batch:commit> with a connector that doesn’t support reporting a BulkOperationResult? Well, in that case you have two options:

  • Write the transformer and register it yourself at an application level
  • Just let it be; in case of an exception, batch will fail all the records in the commit block alike
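If you go with the first option, registering the transformer at the application level can be as simple as declaring it in your app’s XML configuration (the class name here is hypothetical):

```xml
<custom-transformer name="bulkResultTransformer"
    class="com.example.MyBulkOperationResultTransformer"/>
```

As with the connector-registered version shown earlier, the class should extend AbstractDiscoverableTransformer so the batch module can find it at runtime.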

Wrapping it up

In this article we discussed why it’s important for connectors to support bulk operations whenever possible (some APIs just can’t do it; that’s not your fault). For new connectors, we advise always returning instances of the canonical BulkOperationResult class. If you want to add batch support to an existing connector without breaking backwards compatibility, we covered how to register discoverable transformers to do the trick.



Published at DZone with permission of Mariano Gonzalez, DZone MVB. See the original article here.
