DevOps and CI/CD Resources

The Latest DevOps and CI/CD Topics

Diagnosing SST Errors with Percona XtraDB Cluster for MySQL

[This article was written by Stephane Combaudon] State Snapshot Transfer (SST) is used in Percona XtraDB Cluster (PXC) when a new node joins the cluster or to resync a failed node if Incremental State Transfer (IST) is no longer available. SST is triggered automatically but there is no magic: If it is not configured properly, it will not work and new nodes will never be able to join the cluster. Let’s have a look at a few classic issues. Port for SST is not open The donor and the joiner communicate on port 4444, and if the port is closed on one side, SST will always fail. You will see in the error log of the donor that SST is started: [...] 141223 16:08:48 [Note] WSREP: Node 2 (node1) requested state transfer from '*any*'. Selected 0 (node3)(SYNCED) as donor. 141223 16:08:48 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 6) 141223 16:08:48 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 141223 16:08:48 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.101:4444/xtrabackup_sst' --auth 'sstuser:s3cret' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '04c085a1-89ca-11e4-b1b6-6b692803109b:6'' [...] But then nothing happens, and some time later you will see a bunch of errors: [...] 2014/12/23 16:09:52 socat[2965] E connect(3, AF=2 192.168.234.101:4444, 16): Connection timed out WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 1 (20141223 16:09:52.057) WSREP_SST: [ERROR] Cleanup after exit with status:32 (20141223 16:09:52.064) WSREP_SST: [INFO] Cleaning up temporary directories (20141223 16:09:52.068) 141223 16:09:52 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.101:4444/xtrabackup_sst' --auth 'sstuser:s3cret' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '04c085a1-89ca-11e4-b1b6-6b692803109b:6' [...] On the joiner side, you will see a similar sequence: SST is started, then hangs and is finally aborted: [...] 141223 16:08:48 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 6) 141223 16:08:48 [Note] WSREP: Requesting state transfer: success, donor: 0 141223 16:08:49 [Note] WSREP: (f9560d0d, 'tcp://0.0.0.0:4567') turning message relay requesting off 141223 16:09:52 [Warning] WSREP: 0 (node3): State transfer to 2 (node1) failed: -32 (Broken pipe) 141223 16:09:52 [ERROR] WSREP: gcs/src/gcs_group.cpp:long int gcs_group_handle_join_msg(gcs_group_t*, const gcs_recv_msg_t*)():717: Will never receive state. Need to abort. The solution is of course to make sure that the ports are open on both sides. SST is not correctly configured Sometimes you will see an error like this on the donor: 141223 21:03:15 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.102:4444/xtrabackup_sst' --auth 'sstuser:s3cretzzz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid 'e63f38f2-8ae6-11e4-a383-46557c71f368:0'' [...] WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20141223 21:03:26.973) And if you look at innobackup.backup.log: 41223 21:03:26 innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_file=/etc/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'sstuser' (using password: YES). innobackupex: got a fatal error with the following stacktrace: at /usr//bin/innobackupex line 2995 main::mysql_connect('abort_on_error', 1) called at /usr//bin/innobackupex line 1530 innobackupex: Error: Failed to connect to MySQL server: DBI connect(';mysql_read_default_file=/etc/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock','sstuser',...) failed: Access denied for user 'sstuser'@'localhost' (using password: YES) at /usr//bin/innobackupex line 2979 What happened? The default SST method is xtrabackup-v2 and for it to work, you need to specify a username/password in the my.cnf file: [mysqld] wsrep_sst_auth=sstuser:s3cret And you also need to create the corresponding MySQL user: mysql> GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost' IDENTIFIED BY 's3cret'; So you should check that the user has been correctly created in MySQL and that wsrep_sst_auth is correctly set. Galera versions do not match Here is another set of errors you may see in the error log of the donor: 141223 21:14:27 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101 141223 21:14:30 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101 141223 21:14:33 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101 Here the issue is that you try to connect a node using Galera 2.x and a node running Galera 3.x. This can happen if you try to use a PXC 5.5 node and a PXC 5.6 node. The right solution is probably to understand why you ended up with such inconsistent versions and make sure all nodes are using the same Percona XtraDB Cluster version and Galera version. But if you know what you are doing, you can also instruct the node using Galera 3.x that it will communicate with Galera 2.x nodes by specifying in the my.cnf file: [mysqld] wsrep_provider_options="socket.checksum=1" Conclusion SST errors can have multiple reasons for occurring, and the best way to diagnose the issue is to have a look at the error log of the donor and the joiner. Galera is in general quite verbose so you can follow the progress of SST on both nodes and see where it fails. Then it is mostly about being able to interpret the error messages.

April 27, 2015

by Peter Zaitsev

· 11,885 Views

On Neo4j Indexes, Match and Merge

Neo4j uses schema indexes, constraints, merges, matches, and a variety of other indexing features to help with your NoSQL needs.

April 20, 2015

by Michael Hunger

· 11,877 Views

Using Apache Kafka for Integration and Data Processing Pipelines with Spring

written by josh long on the spring blog applications generated more and more data than ever before and a huge part of the challenge - before it can even be analyzed - is accommodating the load in the first place. apache’s kafka meets this challenge. it was originally designed by linkedin and subsequently open-sourced in 2011. the project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. the design is heavily influenced by transaction logs. it is a messaging system, similar to traditional messaging systems like rabbitmq, activemq, mqseries, but it’s ideal for log aggregation, persistent messaging, fast (_hundreds_ of megabytes per second!) reads and writes, and can accommodate numerous clients. naturally, this makes it perfect for cloud-scale architectures! kafka powers many large production systems . linkedin uses it for activity data and operational metrics to power the linkedin news feed, and linkedin today, as well as offline analytics going into hadoop. twitter uses it as part of their stream-processing infrastructure. kafka powers online-to-online and online-to-offline messaging at foursquare. it is used to integrate foursquare monitoring and production systems with hadoop-based offline infrastructures. square uses kafka as a bus to move all system events through square’s various data centers. this includes metrics, logs, custom events, and so on. on the consumer side, it outputs into splunk, graphite, or esper-like real-time alerting. netflix uses it for 300-600bn messages per day. it’s also used by airbnb, mozilla, goldman sachs, tumblr, yahoo, paypal, coursera, urban airship, hotels.com, and a seemingly endless list of other big-web stars. clearly, it’s earning its keep in some powerful systems! installing apache kafka there are many different ways to get apache kafka installed. if you’re on osx, and you’re using homebrew, it can be as simple as brew install kafka . you can also download the latest distribution from apache . i downloaded kafka_2.10-0.8.2.1.tgz , unzipped it, and then within you’ll find there’s a distribution of apache zookeeper as well as kafka, so nothing else is required. i installed apache kafka in my $home directory, under another directory, bin , then i created an environment variable, kafka_home , that points to $home/bin/kafka . start apache zookeeper first, specifying where the configuration properties file it requires is: $kafka_home/bin/zookeeper-server-start.sh $kafka_home/config/zookeeper.properties the apache kafka distribution comes with default configuration files for both zookeeper and kafka, which makes getting started easy. you will in more advanced use cases need to customize these files. then start apache kafka. it too requires a configuration file, like this: $kafka_home/bin/kafka-server-start.sh $kafka_home/config/server.properties the server.properties file contains, among other things, default values for where to connect to apache zookeeper ( zookeeper.connect ), how much data should be sent across sockets, how many partitions there are by default, and the broker id ( broker.id - which must be unique across a cluster). there are other scripts in the same directory that can be used to send and receive dummy data, very handy in establishing that everything’s up and running! now that apache kafka is up and running, let’s look at working with apache kafka from our application. some high level concepts.. a kafka broker cluster consists of one or more servers where each may have one or more broker processes running. apache kafka is designed to be highly available; there are no master nodes. all nodes are interchangeable. data is replicated from one node to another to ensure that it is still available in the event of a failure. in kafka, a topic is a category, similar to a jms destination or both an amqp exchange and queue. topics are partitioned, and the choice of which of a topic’s partition a message should be sent to is made by the message producer. each message in the partition is assigned a unique sequenced id, its offset . more partitions allow greater parallelism for consumption, but this will also result in more files across the brokers. producers send messages to apache kafka broker topics and specify the partition to use for every message they produce. message production may be synchronous or asynchronous. producers also specify what sort of replication guarantees they want. consumers listen for messages on topics and process the feed of published messages. as you’d expect if you’ve used other messaging systems, this is usually (and usefully!) asynchronous. like spring xd and numerous other distributed system, apache kafka uses apache zookeeper to coordinate cluster information. apache zookeeper provides a shared hierarchical namespace (called znodes ) that nodes can share to understand cluster topology and availability (yet another reason that spring cloud has forthcoming support for it..). zookeeper is very present in your interactions with apache kafka. apache kafka has, for example, two different apis for acting as a consumer. the higher level api is simpler to get started with and it handles all the nuances of handling partitioning and so on. it will need a reference to a zookeeper instance to keep the coordination state. let’s turn now turn to using apache kafka with spring. using apache kafka with spring integration the recently released apache kafka 1.1 spring integration adapter is very powerful, and provides inbound adapters for working with both the lower level apache kafka api as well as the higher level api. the adapter, currently, is xml-configuration first, though work is already underway on a spring integration java configuration dsl for the adapter and milestones are available. we’ll look at both here, now. to make all these examples work, i added the libs-milestone-local maven repository and used the following dependencies: org.apache.kafka:kafka_2.10:0.8.1.1 org.springframework.boot:spring-boot-starter-integration:1.2.3.release org.springframework.boot:spring-boot-starter:1.2.3.release org.springframework.integration:spring-integration-kafka:1.1.1.release org.springframework.integration:spring-integration-java-dsl:1.1.0.m1 using the spring integration apache kafka with the spring integration xml dsl first, let’s look at how to use the spring integration outbound adapter to send message instances from a spring integration flow to an external apache kafka instance. the example is fairly straightforward: a spring integration channel named inputtokafka acts as a conduit that forwards message messages to the outbound adapter, kafkaoutboundchanneladapter . the adapter itself can take its configuration from the defaults specified in the kafka:producer-context element or it from the adapter-local configuration overrides. there may be one or many configurations in a given kafka:producer-context element. here’s the java code from a spring boot application to trigger message sends using the outbound adapter by sending messages into the incoming inputtokafka messagechannel . package xml; import org.apache.commons.logging.log; import org.apache.commons.logging.logfactory; import org.springframework.beans.factory.annotation.qualifier; import org.springframework.boot.commandlinerunner; import org.springframework.boot.springapplication; import org.springframework.boot.autoconfigure.springbootapplication; import org.springframework.context.annotation.bean; import org.springframework.context.annotation.dependson; import org.springframework.context.annotation.importresource; import org.springframework.integration.config.enableintegration; import org.springframework.messaging.messagechannel; import org.springframework.messaging.support.genericmessage; @springbootapplication @enableintegration @importresource("/xml/outbound-kafka-integration.xml") public class demoapplication { private log log = logfactory.getlog(getclass()); @bean @dependson("kafkaoutboundchanneladapter") commandlinerunner kickoff(@qualifier("inputtokafka") messagechannel in) { return args -> { for (int i = 0; i < 1000; i++) { in.send(new genericmessage<>("#" + i)); log.info("sending message #" + i); } }; } public static void main(string args[]) { springapplication.run(demoapplication.class, args); } } using the new apache kafka spring integration java configuration dsl shortly after the spring integration 1.1 release, spring integration rockstar artem bilan got to work on adding a spring integration java configuration dsl analog and the result is a thing of beauty! it’s not yet ga (you need to add the libs-milestone repository for now), but i encourage you to try it out and kick the tires. it’s working well for me and the spring integration team are always keen on getting early feedback whenever possible! here’s an example that demonstrates both sending messages and consuming them from two different integrationflow s. the producer is similar to the example xml above. new in this example is the polling consumer. it is batch-centric, and will pull down all the messages it sees at a fixed interval. in our code, the message received will be a map that contains as its keys the topic and as its value another map with the partition id and the batch (in this case, of 10 records), of records read. there is a messagelistenercontainer -based alternative that processes messages as they come. package jc; import org.apache.commons.logging.log; import org.apache.commons.logging.logfactory; import org.springframework.beans.factory.annotation.autowired; import org.springframework.beans.factory.annotation.qualifier; import org.springframework.beans.factory.annotation.value; import org.springframework.boot.commandlinerunner; import org.springframework.boot.springapplication; import org.springframework.boot.autoconfigure.springbootapplication; import org.springframework.context.annotation.bean; import org.springframework.context.annotation.configuration; import org.springframework.context.annotation.dependson; import org.springframework.integration.integrationmessageheaderaccessor; import org.springframework.integration.config.enableintegration; import org.springframework.integration.dsl.integrationflow; import org.springframework.integration.dsl.integrationflows; import org.springframework.integration.dsl.sourcepollingchanneladapterspec; import org.springframework.integration.dsl.kafka.kafka; import org.springframework.integration.dsl.kafka.kafkahighlevelconsumermessagesourcespec; import org.springframework.integration.dsl.kafka.kafkaproducermessagehandlerspec; import org.springframework.integration.dsl.support.consumer; import org.springframework.integration.kafka.support.zookeeperconnect; import org.springframework.messaging.messagechannel; import org.springframework.messaging.support.genericmessage; import org.springframework.stereotype.component; import java.util.list; import java.util.map; /** * demonstrates using the spring integration apache kafka java configuration dsl. * thanks to spring integration ninja artem bilan * for getting the java configuration dsl working so quickly! * * @author josh long */ @enableintegration @springbootapplication public class demoapplication { public static final string test_topic_id = "event-stream"; @component public static class kafkaconfig { @value("${kafka.topic:" + test_topic_id + "}") private string topic; @value("${kafka.address:localhost:9092}") private string brokeraddress; @value("${zookeeper.address:localhost:2181}") private string zookeeperaddress; kafkaconfig() { } public kafkaconfig(string t, string b, string zk) { this.topic = t; this.brokeraddress = b; this.zookeeperaddress = zk; } public string gettopic() { return topic; } public string getbrokeraddress() { return brokeraddress; } public string getzookeeperaddress() { return zookeeperaddress; } } @configuration public static class producerconfiguration { @autowired private kafkaconfig kafkaconfig; private static final string outbound_id = "outbound"; private log log = logfactory.getlog(getclass()); @bean @dependson(outbound_id) commandlinerunner kickoff( @qualifier(outbound_id + ".input") messagechannel in) { return args -> { for (int i = 0; i < 1000; i++) { in.send(new genericmessage<>("#" + i)); log.info("sending message #" + i); } }; } @bean(name = outbound_id) integrationflow producer() { log.info("starting producer flow.."); return flowdefinition -> { consumer spec = (kafkaproducermessagehandlerspec.producermetadataspec metadata)-> metadata.async(true) .batchnummessages(10) .valueclasstype(string.class) .valueencoder(string::getbytes); kafkaproducermessagehandlerspec messagehandlerspec = kafka.outboundchanneladapter( props -> props.put("queue.buffering.max.ms", "15000")) .messagekey(m -> m.getheaders().get(integrationmessageheaderaccessor.sequence_number)) .addproducer(this.kafkaconfig.gettopic(), this.kafkaconfig.getbrokeraddress(), spec); flowdefinition .handle(messagehandlerspec); }; } } @configuration public static class consumerconfiguration { @autowired private kafkaconfig kafkaconfig; private log log = logfactory.getlog(getclass()); @bean integrationflow consumer() { log.info("starting consumer.."); kafkahighlevelconsumermessagesourcespec messagesourcespec = kafka.inboundchanneladapter( new zookeeperconnect(this.kafkaconfig.getzookeeperaddress())) .consumerproperties(props -> props.put("auto.offset.reset", "smallest") .put("auto.commit.interval.ms", "100")) .addconsumer("mygroup", metadata -> metadata.consumertimeout(100) .topicstreammap(m -> m.put(this.kafkaconfig.gettopic(), 1)) .maxmessages(10) .valuedecoder(string::new)); consumer endpointconfigurer = e -> e.poller(p -> p.fixeddelay(100)); return integrationflows .from(messagesourcespec, endpointconfigurer) .>>handle((payload, headers) -> { payload.entryset().foreach(e -> log.info(e.getkey() + '=' + e.getvalue())); return null; }) .get(); } } public static void main(string[] args) { springapplication.run(demoapplication.class, args); } } the example makes heavy use of java 8 lambdas. the producer spends a bit of time establishing how many messages will be sent in a single send operation, how keys and values are encoded (kafka only knows about byte[] arrays, after all) and whether messages should be sent synchronously or asynchronously. in the next line, we configure the outbound adapter itself and then define an integrationflow such that all messages get sent out via the kafka outbound adapter. the consumer spends a bit of time establishing which zookeeper instance to connect to, how many messages to receive (10) in a batch, etc. once the message batches are recieved, they’re handed to the handle method where i’ve passed in a lambda that’ll enumerate the payload’s body and print it out. nothing fancy. using apache kafka with spring xd apache kafka is a message bus and it can be very powerful when used as an integration bus. however, it really comes into its own because it’s fast enough and scalable enough that it can be used to route big-data through processing pipelines. and if you’re doing data processing, you really want spring xd ! spring xd makes it dead simple to use apache kafka (as the support is built on the apache kafka spring integration adapter!) in complex stream-processing pipelines. apache kafka is exposed as a spring xd source - where data comes from - and a sink - where data goes to. spring xd exposes a super convenient dsl for creating bash -like pipes-and-filter flows. spring xd is a centralized runtime that manages, scales, and monitors data processing jobs. it builds on top of spring integration, spring batch, spring data and spring for hadoop to be a one-stop data-processing shop. spring xd jobs read data from sources , run them through processing components that may count, filter, enrich or transform the data, and then write them to sinks. spring integration and spring xd ninja marius bogoevici , who did a lot of the recent work in the spring integration and spring xd implementation of apache kafka, put together a really nice example demonstrating how to get a full working spring xd and kafka flow working . the readme walks you through getting apache kafka, spring xd and the requisite topics all setup. the essence, however, is when you use the spring xd shell and the shell dsl to compose a stream. spring xd components are named components that are pre-configured but have lots of parameters that you can override with --.. arguments via the xd shell and dsl. (that dsl, by the way, is written by the amazing andy clement of spring expression language fame!) here’s an example that configures a stream to read data from an apache kafka source and then write the message a component called log , which is a sink. log , in this case, could be syslogd, splunk, hdfs, etc. xd> stream create kafka-source-test --definition "kafka --zkconnect=localhost:2181 --topic=event-stream | log"--deploy and that’s it! naturally, this is just a tase of spring xd, but hopefully you’ll agree the possibilities are tantalizing. deploying a kafka server with lattice and docker it’s easy to get an example kafka installation all setup using lattice , a distributed runtime that supports, among other container formats, the very popular docker image format. there’s a docker image provided by spotify that sets up a collocated zookeeper and kafka image . you can easily deploy this to a lattice cluster, as follows: ltc create --run-as-root m-kafka spotify/kafka from there, you can easily scale the apache kafka instances and even more easily still consume apache kafka from your cloud-based services. next steps you can find the code for this blog on my github account . we’ve only scratched the surface! if you want to learn more (and why wouldn’t you?), then be sure to check out marius bogoevici and dr. mark pollack’s upcoming webinar on reactive data-pipelines using spring xd and apache kafka where they’ll demonstrate how easy it can be to use rxjava, spring xd and apache kafka!

April 18, 2015

by Pieter Humphrey

· 29,159 Views

A cluster management framework, Apache Helix

What is Helix? It is used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix automates reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration. Modeling a distributed system as a state machine with constraints on states and transitions. Terminologies Node : A single machine Cluster: Set of Nodes Resource : A logical entry (e.g. database, index, task) Partition: Subset of the resource (Each subtask is referred to as a partition) Replica: Copy of a Partition State (e.g Master, Slave). It increase the availability of the system State: Describes the role of a replica (Each node in the cluster has its own Current State) State Machine and Transitions: An action that allows a replica to move from one state to another, thus changing its role. ( e.g Slave --> Master ) spectators: the external clients. Helix provides an External View that is an aggregated view of the current state across all nodes. Current State: represents resource's actual state at a participating node. - INSTANCE_NAME: Unique name representing the process - SESSION_ID: ID that is automatically assigned every time a process joins the cluster Rebalancer: The core component of Helix is the Controller which runs the Rebalance algorithm on every cluster event. Dynamic Ideal State: Helix powerful is that Ideal State can be changed dynamically. It is adjusting the ideal state. Whenever a cluster event occurs, Helix can operate in one of three modes FULL_AUTO SEMI_AUTO CUSTOMIZED Cluster events can be one of the following: Nodes start and/or stop Nodes experience soft and/or hard failures New nodes are added/removed [1] http://helix.apache.org/Concepts.html

April 13, 2015

by Madhuka Udantha

· 7,959 Views

Adopting Microservices at Netflix: Lessons for Team and Process Design

[this article was written by tony mauro .] in a previous blog post , we shared best practices for designing a microservices architecture, based on adrian cockcroft’s presentation at nginx.conf2014 about his experience as director of web engineering and then cloud architect at netflix . in this follow-up post, we’ll review his recommendations for retooling your development team and processes for a smooth transition to microservices. optimize for speed, not efficiency source: [email protected] the top lesson that cockcroft learned at netflix is that speed wins in the marketplace. if you ask any developer whether a slower development process is better, no one ever says yes. nor do management or customers ever complain that your development cycle is too fast for them. the need for speed doesn’t just apply to tech companies, either: as software becomes increasingly ubiquitous on the internet of things – in cars, appliances, and sensors as well as mobile devices – companies that didn’t used to do software development at all now find that their success depends on being good at it. netflix made an early decision to optimize for speed. this refers specifically to tooling your software development process so that you can react quickly to what your customers want, or even better, can create innovative web experiences that attract customers. speed means learning about your customers and giving them what they want at a faster pace than your competitors. by the time competitors are ready to challenge you in a specific way, you’ve moved on to the next set of improvements. this approach turns the usual paradigm of optimizing for efficiency on its head. efficiency generally means trying to control the overall flow of the development process to eliminate duplication of effort and avoid mistakes, with an eye to keeping costs down. the common result is that you end up focusing on savings instead of looking for opportunities that increase revenue. in cockcroft’s experience, if you say “i’m doing this because it’s more efficient,” the unintended result is that you’re slowing someone else down. this is not an encouragement to be wasteful, but you should optimize for speed first. efficiency becomes secondary as you satisfy the constraint that you’re not slowing things down. the way you grow the business to be more efficient is to go faster. make sure your assumptions are still true many large companies that have enjoyed success in their market (we can call them incumbents ) are finding themselves overtaken by nimbler, usually smaller, organizations ( disruptors ) that react much more quickly to changing consumer behavior. their large size isn’t necessarily the root of the problem – netflix is no longer a small company, for example. as cockcroft sees it, the main cause of difficulty for industry incumbents is that they’re operating under business assumptions that are no longer true. or, as will rogers put it, it’s not what we don’t know that hurts. it’s what we know that ain’t so.” of course, you have to make assumptions as you formulate a business model, and then it makes sense to optimize your business practices around them. the danger comes from sticking with assumptions after they’re no longer true, which means you’re optimizing on the wrong thing. that’s when you become vulnerable to industry disruptors who are making the right assumptions and optimizations for the current business climate. as examples, consider the following assumptions that hold sway at many incumbents. we’ll examine them further in the indicated sections and describe the approach netflix adopted. computing power is expensive. this was true when increasing your computing capacity required capital expenditure on computer hardware. see put your infrastructure in the cloud . process prevents problems. at many companies, the standard response to something going wrong is to add a preventative step to the relevant procedure. see create a high freedom, high responsibility culture with less process . here are some ways to avoid holding onto assumptions that have passed their expiration date: as obvious as it might seem, you need to make your assumptions explicit, then periodically review them to make sure they still hold true. keep aware of technological trends. as an example, the cost of solid state storage drive (ssds) storage continues to go down. it’s still more expensive than regular disks, but the cost difference is becoming small enough that many companies are deciding the superior performance is worth paying a bit more for. [ed: in this entertaining video , fastly founder and ceo artur bergman explains why he believes ssds are always the right choice.] talk to people who aren’t your customers. this is especially necessary for incumbents, who need to make sure that potential new customers are interested in their product. otherwise, they don’t hear about the fact that they’re not being used. as an example, some vendors in the storage space are building hyper-converged systems even as more and more companies are storing their data in the cloud and using open source storage management software. netflix, for example, stores data on amazon web services (aws) servers with ssds and manages it with apache cassandra . a single specialist in java distributed systems is managing the entire configuration without any commercial storage tools or help from engineers specializing in storage, san, or backup. don’t base your future strategy on current it spending, but instead on level of adoption by developers. suppose that your company accounts for nearly all spending in the market for proprietary virtualization software, but then a competitor starts offering an open source-based product at only 1% the cost of yours. if people start choosing it instead of your product, than at the point that your share of total spending is still 90%, your market share has declined to only 10%. if you’re only attending to your revenue, it seems like you’re still in good shape, but 10% of market share can collapse really quickly. put your infrastructure in the cloud source: [email protected] in make sure your assumptions are still true , we mentioned that in the past it was valid to base your business plan on the assumption that computing power was expensive, because it was: the only way to increase your computing capacity was to buy computer hardware. you could then make money by using this expensive resource in the right way to solve customer problems. the advent of cloud computing has pretty much completely invalidated this assumption. it is now possible to buy the amount of capacity you need when you need it, and to pay for only the time you actually use it. the new assumption you need to make is that (virtual) machines are ephemeral. you can create and destroy them at the touch of a button or a call to an api, without any need to negotiate with other departments in your company. one way to think of this change is that the self-service cloud makes formerly impossible things instantaneous. all of netflix’s engineers are in california, but they manage a worldwide infrastructure. the cloud enables them to experiment and determine whether (for example) adding servers in particular location improves performance. suppose they notice problems with video delivery in brazil. they can easily set up 100 cloud server instances in são paulo within a couple hours. if after a week they determine that the difference in delivery speed and reliability isn’t large enought to justify the cost of the additional server instances, they can shut them down just as quickly and easily as they created them. this kind of experiment would be so expensive with a traditional infrastructure that you would never attempt it. you would have to hire an agent in são paulo to coordinate the project, find a data center, satisfy brazilian government regulations, ship machines to brazil, and so on. it would be six months before you could even run the test and find out that increased local capacity didn’t improve your delivery speed. create a high freedom, high responsibility culture with less process in make sure your assumptions are still true , we observed that many companies create rules and processes to prevent problems. when someone makes a mistake, they add a rule to the hr manual that says “well, don’t do that again.” if you read some hr manuals from this perspective, you can extract a historical record of everything that went wrong at the company. when something goes wrong in the development process, the corresponding reaction is to add a new step to the procedure. the major problem with creating process to prevent problems is that over time you build up complex “scar tissue” processes that slow you down. netflix doesn’t have an hr manual. there is a single guideline: “act in netflix’s best interest.” the idea is that if an employee can’t figure out how to interpret the guideline in a given situation, he or she doesn’t have enough judgment to work there. if you don’t trust the judgment of the people on your team, you have to ask why you’re employing them. it’s true that you’ll have to fire people occasionally for violating the guideline. overall, the high level of mutual trust among members of a team, and across the company as a whole, becomes a strong binding force. the following books outline new ways of thinking about process if you’re looking to transform your organization: the goal: a process of ongoing improvement by eliyahu m. goldratt and jeff cox. this book has become a standard management text at business schools since its original publication in 1984. written as a novel about a manager who has only 90 days to improve performance at his factory or have it closed down, it embodies goldratt’s theory of constraints in the context of process control and automation. the phoenix project: a novel about it, devops, and helping your business win by gene kim and kevin behr. as the title indicates, it’s also a novel, about an it manager who has 90 days to save a project that’s late and over budget, or his entire department will be outsourced. he discovers devops as the solution to his problem. replace silos with microservice teams most software development groups are separated into silos, with no overlap of personnel between them. the standard process for a software development project starts with the product manager meeting with the user experience and development groups to discuss ideas for new features. after the idea is implemented in code, the code is passed to the quality assurance (qa) and database administration teams and discussed in more meetings. communication with the system, network, and san administrators is often via tickets. the whole process tends to be slow and loaded with overhead. source: adrian cockcroft some companies try to speed up by creating small “start-up”-style teams that handle the development process from end to end, or sometimes such teams are the result of acquisitions where the acquired company continues to run independently as a separate division. but if the small teams are still doing monolithic delivery, there are usually still handoffs between individuals or groups with responsibility for different functions. the process suffers from the same problems as monolithic delivery in larger companies – it’s simply not very efficient or agile. source: adrian cockcroft conway’s law says that the interface structure of a software system will reflect the social structure of the organization that produced it. so if you want to switch to a microservices architecture, you need to organize your staff into product teams and use devops methodology. there are no longer distinct product managers, ux managers, development managers, and so on, managing downward in their silos. there is a manager for each product feature (implemented as a microservice), who supervises a team that handles all aspects of software development for the microservice, from conception through deployment. the platform team provides infrastructure support that the product teams access via apis. at netflix, the platform team was mostly aws in seattle, with some netflix-managed infrastructure layers built on top. but it doesn’t matter whether your cloud platform is in-house or public; the important thing is that it’s api-driven, self-service, and automatable. source: adrian cockcroft adopt continuous delivery, guided by the ooda loop a siloed team organization is usually paired with monolithic delivery model, in which an integrated, multi-function application is released as a unit (often version-numbered) on a regular schedule. most software development teams use this model initially because it is relatively simple and works well enough with a small number of developers (say, 50 or fewer). however, as the team grows it becomes a real issue when you discover a bug in one developer’s code during qa or production testing and the work of 99 other developers is blocked from release until the bug is fixed. in 2009 netflix adopted a continuous delivery model, which meshes perfectly with a microservices architecture. each microservice represents a single product feature that can be updated independently of the other microservices and on its own schedule. discovering a bug in a microservice has no effect on the release schedule of any other microservice. continuous delivery relies on packaging microservices in standard containers. netflix initially used aws machine images (amis) and it was possible to deploy an update into a test or production environment in about 10 minutes. with docker, that time is reduced even further, to mere seconds in some cases. at netflix, the conceptual framework for continuous development and delivery is an observe-orient-decide-act (ooda) loop . source: adrian cockcroft (http://www.slideshare.net/adrianco) observe refers to examining your current status to look for places where you can innovate. you want your company culture to implicitly authorize anyone who notices an opportunity to start a project to exploit it. for example, you might notice what the diagram calls a “customer pain point”: a lot of people abandoning the registration process on your website when they reach a certain step. you can undertake a project to investigate why and fix the problem. orient refers to analyzing metrics to understand the reasons for the phenomena you’ve observed at the observe point. often this involves analyzing large amounts of unstructured data, such as log files; this is often referred to as big data analysis. the answers you’re looking for are not already in your business intelligence database. you’re examining data that no one has previously looked at and asking questions that haven’t been asked before. decide refers to developing and executing a project plan. company culture is a big factor at this point. as previously discussed, in a high-freedom, high-responsibility culture you don’t need to get management approval before starting to make changes. you share your plan, but you don’t have to ask for permission. act refers to testing your solution and putting it into production. you deploy a microservice that includes your incremental feature to a cloud environment, where it’s automatically put into an ab test to compare it to the previous solution, side by side, for as long as it takes to collect the data that shows whether your approach is better. cooperating microservices aren’t disrupted, and customers don’t see your changes unless they’re selected for the test. if your solution is better, you deploy it into production. it doesn’t have to be a big improvement, either. if the number of clients for your microservice is large enough, then even a fraction of a percent improvement (in response time, say) can be shown to be statistically valid, and the cumulative effect over time of many small changes can be significant. now you’re back at the observe point. you don’t always have to perform all the steps or do them in strict order, either. the important characteristic of the process is that it enables you quickly to determine what your customers want and to create it for them. cockcroft says “it’s hard not to win” if you’re basing your moves on enough data points and your competitors are making guesses that take months to be proven or disproven. the state of art is to circle the loop every one to two weeks, but every microservice team can do it independently. with microservices you can go much faster because you’re not trying to get entire company going around the loop in lockstep. how nginx plus can help at nginx we believe it’s crucial to your future success that you adopt a 4-tier application architecture in which applications are developed and deployed as sets of microservices . we hope the information we’ve shared in this post and its predecessor, adopting microservices at netflix: lessons for architectural design , are helpful as you plan your transition to today’s state-of-the-art architecture for application development. when it’s time to deliver your apps, nginx plus offers an application delivery platform that provides the superior performance, reliability, and scalability your users expect. fully adopting a microservices-based architecture is easier and more likely to succeed when you move to a single software tool for web serving, load balancing, and content caching. nginx plus combines those functions and more in one easy to deploy and manage package. our approach empowers developers to define and control the flawless delivery of their microservices, while respecting the standards and best practices put into place by a platform team. click here to learn more about how nginx plus can help your applications succeed. video recordings fast delivery nginx.conf2014, october 2014 migrating to microservices, part 1 silicon valley microservices meetup, august 2014 migrating to microservices, part 2 silicon valley microservices meetup, august 2014

April 13, 2015

by Patrick Nommensen

· 9,896 Views

Patterns of API Virtualization

[This article was written by Matthew Heusser.] When Christopher Alexander wrote A Pattern Language in 1977, he was looking for a more powerful way to describe how towns and buildings were laid out. These patterns would allow architects, builders and planners to work together, to use the same words, mean the same thing, and create systems that were beautiful and worked, instead of more urban sprawl. Twenty years later, Gamma, Helms, Johnson and Vlissdes took the pattern idea and applied it to object-oriented software, which at the time was struggling to figure out how to create windows-based applications. Today the struggle is figuring out how to break software into small components that can be tested independently, and then having those components interact, typically over internet protocols. Raw SQL commands are giving way to service oriented systems that interact through APIs, sometimes all within one company, sometimes outside with Microsoft, Google, Amazon, or other APIs like a manufacturing company or supplier. While I do not claim to be Christopher Alexander or the Gang of Four, I am seeing some patterns emerge – a set of solutions to a defined problem – and would like to share a few of those today. What do you mean API? Alistair Cockburn’s Hexagonal Architecture (below) presents a way to think about APIs. The application we want to develop is in the middle and has a set of adapters to the external world. Those adapters might be an API we expose, like a ‘search’ interface to an online catalog, or the API’s we call, including the database, an email gateway, or the ‘permissions’ service, to see what types of search results we should show to this user. Cockburn’s Hexagonal Architecture gives us two ways to think about APIs: Our own, and the services we call. (Source: http://alistair.cockburn.us/Hexagonal+architecture) That’s a lot of APIs. Let’s explore about some ways to virtualize these services – and why. Automated Build and Continuous Integration Say, for example, you are working on a piece of software to analyze trending terms on social media – such as a customer complaint that is being liked and tweeted. You want companies to find these problems when they start to trend up, then reach out to the customer and solve it, or, perhaps, reach out to say “thank you” and amplify it. Modern build systems, like Jenkins, TFS, and TeamCity can compile, deploy, and even run the system to check for known scenarios. The trouble is those pesky adapters to external systems, like Twitter and Facebook. The software could do its job, but there is no way to know if the application is correct in its guesses about trends and importance. Getting the data from the providers can turn a quick build into a slow process that uses a lot of network traffic. By recording and storing known answers to predictable requests, then simulating the service and playing back known (“canned”) data, API Virtualization allows build systems to do more, with faster, more predictable results. This does not remove the need for end-to-end testing, but it does allow the team to have more confidence with each build. Performance Testing Your Application Like build/deploy systems, performance testing the application (the inside of the hexagon) with live, external services can cause problems. All that extra traffic can cause problems with the actual company network infrastructure; it could cause bandwidth problems at the point of the ISP. Some 3rd Party APIs charge a micro-fee per transaction, or limit bandwidth. Many of them lack a ‘test’ sandbox to develop in, so performance testing could interact with real, production work. Standing up a virtual server to return pre-planned data means you can performance test your application – not the third party – prevent bandwidth throttles, not step on production data, and avoid paying fees intended for real (production) use that is actually being used to test our environment. Avoid Integration Environment Inconsistency A few years ago I worked at a large organization that was wrapping old code in proxy services, so they could be consumed by other teams. Login, add-to-cart, search catalog, create custom catalog, permissions, all of it was possible to access through API calls, most of it as simple as a web URL that returned some text. The problem was the “System Integration Test” environment, or SIT. Every team tested its services in SIT, which meant about a third of the time, something was broken. After finding a bug in the current build, we would track it back to the catalog service, walk over to that team, bring up the issue, and they would say “thanks, we are testing a new build of catalog.” We expected catalog to work in SIT. Anything else meant a waste of someone’s time. Automated tools reporting false errors were even worse. When teams performance tested their services, everything calling the service got slow, if it worked at all. By virtualizing services we could test our application end-to-end against known data, without the troubles of SIT, or having to build additional expensive test-lab-like copies of production. Best of all, creating the virtual services is a snap – just record the live service with a tool and instruct it to play back similar requests. Flip Integration Tests from Virtual To Real for Final Checking All this API virtualization creates a risk that the team will move from test to production and something will be different between the Virtual API and the live one. If the Virtual API server is just returning the same thing product did when we recorded it and we have automated checks in place, we can change our test server to point to the real service and re-run all the automated checks. As long as the source data hasn’t changed and we are reading, not writing, from production, the checks should all pass. If the production API has changed, we will get failures, and they will be easy enough to fix and retest. Simulate Slow or Unresponsive Service In The Middle Of A Long Running Transaction Sometimes you want to test if a server is overloaded or down. Calling Facebook and asking them to turn off their servers is unlikely to work; even just coordinating with the team down the hall could create a lot of overhead. You also might want to test this often – every day or every hour – and manually pulling a plug or coordinating with the Login team every hour might not be realistic. The trick is to bring the service down once and record the exact behavior of the system, then use a virtual server to simulate that behavior, over and over again, every day. That means you’ll get the exact behavior, not a guess, and know exactly how the application under test can deal with it. Early Development of System against an Undeployed API Sometimes the API you are testing against does not exist, even in test. It’s still possible to create a Virt (virtual API) which returns some roughly equivalent data, and makes it possible to move forward on the core application without introducing new risks. Avoid Configuration and Copying Hassles Many companies use a test system that is a copy of production, and then refresh the system periodically. Sometimes, you want test scenarios that do not exist in production, so you have to create them … and lose them during a refresh. The same problem happens with 3rd party APIs, when, for example, a part is discontinued, and you are testing ordering that part, or the sample person you check for insurance coverage leaves the company. If the request for the part of the coverage goes through an API, you can record known good results that don’t change, even after a database refresh – then leave the real, end-to-end testing for an exploratory step that will be lighter, quicker, more accurate, and have more confidence. A Fistful of Techniques Today we discussed a half-dozen common patterns to API virtualization, mostly around testing systems in isolation that consume data through an API, like a 3rd party or an internal service. These ideas are new, and evolving. What are a few of your favorites?

April 9, 2015

by Denis Goodwin

· 4,174 Views

Adopting Microservices at Netflix: Lessons for Architectural Design

[This article was written by Tony Mauro.] In some recent blog posts, we’ve explained why we believe it’s crucial to adopt a four-tier application architecture in which applications are developed and deployed as sets of microservices. It’s becoming increasingly clear that if you keep using development processes and application architectures that worked just fine ten years ago, you simply can’t move fast enough to capture and hold the interest of mobile users who can choose from an ever-growing number of apps. Switching to a microservices architecture creates exciting opportunities in the marketplace for companies. For system architects and developers, it promises an unprecedented level of control and speed as they deliver innovative new web experiences to customers. But at such a breathless pace, it can feel like there’s not a lot of room for error. In the real world, you can’t stop developing and deploying your apps as you retool the processes for doing so. You know that your future success depends on transitioning to a microservices architecture, but how do you actually do it? Fortunately for us, several early adopters of microservices are now generously sharing their expertise in the spirit of open source, not only in the form of published code but in conference presentations and blog posts. Netflix is a leading example. As the Director of Web Engineering and then Cloud Architect, Adrian Cockcroft oversaw the company’s transition from a traditional development model with 100 engineers producing a monolithic DVD-rental application to a microservices architecture with many small teams responsible for the end-to-end development of hundreds of microservices that work together to stream digital entertainment to millions of Netflix customers every day. Now a Technology Fellow at Battery Ventures, Cockcroft is a prominent evangelist for microservices and cloud-native architectures, and serves on the NGINX Technical Advisory Board. In a two-part series of blog posts, we’ll present top takeaways from two talks that Cockcroft delivered last year, at the first annual NGINX conference in October and at a Silicon Valley Microservices Meetup a couple months earlier. (The complete video recordings are also well worth watching.) This post defines microservices architecture and outlines some best practices for designing one. Adopting Microservices at Netflix: Lessons for Team and Process Design discusses why and how to adopt a new mindset for software development and reorganize your teams around it. What is a Microservices Architecture? Cockcroft defines a microservices architecture as a service-oriented architecture composed of loosely coupled elements that have bounded contexts. Loosely coupled means that you can update the services independently; updating one service doesn’t require changing any other services. If you have a bunch of small, specialized services but still have to update them together, they’re not microservices because they’re not loosely coupled. One kind of coupling that people tend to overlook as they transition to a microservices architecture is database coupling, where all services talk to the same database and updating a service means changing the schema. You need to split the database up and denormalize it. The concept of bounded contexts comes from the book Domain Driven Design by Eric Evans. A microservice with correctly bounded context is self-contained for the purposes of software development. You can understand and update the microservice’s code without knowing anything about the internals of its peers, because the microservices and its peers interact strictly through APIs and so don’t share data structures, database schemata, or other internal representations of objects. If you’ve developed applications for the Internet, you’re already familiar with these concepts, in practice if not by name. Most mobile apps talk to quite a few back-end services, to enable its users to do things like share on Facebook, get directions from Google Maps, and find restaurants on Foursquare, all within the context of the app. If your mobile app were tightly coupled with those services, then before you could release an update you would have to talk to all of their development teams to make sure that your changes aren’t going to break anything. When working with a microservices architecture, you think of other internal development teams like those Internet back ends: as external services that your microservice interacts with through APIs. The commonly understood “contract” between microservices is that their APIs are stable and forward compatible. Just as it’s unacceptable for the Google Maps API to change without warning and in such a way that it breaks its users, your API can evolve but must remain compatible with previous versions. Best Practices for Designing a Microservices Architecture Cockcroft describes his role as Cloud Architect at Netflix not in terms of controlling the architecture, but as discovering and formalizing the architecture that emerged as the Netflix engineers built it. The Netflix development team established several best practices for designing and implementing a microservices architecture. Create a Separate Data Store for Each Microservice Do not use the the same back-end data store across microservices. You want the team for each microservice to choose the database that best suits the service. Moreover, with a single data store it’s too easy for microservices written by different teams to share database structures, perhaps in the name of reducing duplication of work. You end up with the situation where if one team updates a database structure, other services that also use that structure have to be changed too. Breaking apart the data can make data management more complicated, because the separate storage systems can more easily get out sync or become inconsistent, and foreign keys can change unexpectedly. You need to add a tool that performs master data management (MDM) by operating in the background to find and fix inconsistencies. For example, it might examine every database that stores subscriber IDs, to verify that the same IDs exist in all of them (there aren’t missing or extra IDs in any one database). You can write your own tool or buy one. Many commercial relational database management systems (RDBMSs) do these kinds of checks, but they usually impose too many requirements for coupling, and so don’t scale. Keep Code at a Similar Level of Maturity Keep all code in a microservice at a similar level of maturity and stability. In other words, if you need to add or rewrite some of the code in a deployed microservice that’s working well, the best approach is usually to create a new microservice for the new or changed code, leaving the existing microservice in place. [Editor’s note: This is sometimes referred to as the immutable infrastructure principle.] This way you can iteratively deploy and test the new code until it is bug free and maximally efficient, without risking failure or performance degradation in the existing microservice. Once the new microservice is as stable as the original, you can merge them back together if they really perform a single function together, or there are other efficiencies from combining them. However, in Cockcroft’s experience it is much more common to realize you should split up a microservice because it’s gotten too big. Do a Separate Build for Each Microservice Do a separate build for each microservice, so that it can pull in component files from the repository at the revision levels appropriate to it. This sometimes leads to the situation where various microservices pull in a similar set of files, but at different revision levels. That can make it more difficult to clean up your codebase by decommissioning old file versions (because you have to verify more carefully that a revision is no longer being used), but that’s an acceptable trade-off for how easy it is to add new files as you build new microservices. The asymmetry is intentional: you want introducing a new microservice, file, or function easy, not dangerous. Deploy in Containers Deploying microservices in containers is important because it means you just need just one tool to deploy everything. As long as the microservice is in a container, the tool knows how to deploy it. It doesn’t matter what the container is. That said, Docker seems very quickly to have become the de facto standard for containers. Treat Servers as Stateless Treat servers, particularly those that run customer-facing code, as interchangeable members of a group. They all perform the same functions, so you don’t need to be concerned about them individually. Your only concern is that there are enough of them to produce the amount of work you need, and you can use auto scaling to adjust the numbers up and down. If one stops working, it’s automatically replaced by another one. Avoid “snowflake” systems in which you depend on individual servers to perform specialized functions. Cockcroft’s analogy is that you want to think of servers like cattle, not pets. If you have a machine in production that performs a specialized function, and you know it by name, and everyone gets sad when it goes down, it’s a pet. Instead you should think of your servers like a herd of cows. What you care about is how many gallons of milk you get. If one day you notice you’re getting less milk than usual, you find out which cows aren’t producing well and replace them. Netflix Delivery Architecture is Built on nginx Netflix is a longtime nginx user and became the first customer of NGINX, Inc. after it incorporated in 2011. Indeed, Netflix chose nginx as the heart of their delivery infrastructure, the Netflix Open Connect Content Delivery Network (CDN), one of the largest CDNs in the world. With the ability to serve thousands, and sometimes millions, of requests per second, nginx is an optimal solution for high-performance HTTP delivery and enables companies like Netflix to offer high-quality digital experiences to millions of customers every day. Video Recordings Fast Delivery nginx.conf2014, October 2014 Migrating to Microservices, Part 1 Silicon Valley Microservices Meetup, August 2014 Migrating to Microservices, Part 2 Silicon Valley Microservices Meetup, August 2014

April 7, 2015

by Patrick Nommensen

· 33,872 Views · 1 Like

Introduction to Apache Cassandra's Architecture

Some key concepts for Apache's popular Cassandra Architecture include partitioning, replication, consistency, bootstrapping, and write paths.

April 6, 2015

by Akhil Mehra

· 118,311 Views · 38 Likes

How to Configure a Simple JBoss Cluster in Domain Mode

Clustering is a very important thing to master for any serious user of an application server. Clustering allows for high availability by making your application available on secondary servers when the primary instance is down or it lets you scale up or out by increasing the server density on the host, or by adding servers on other hosts. It can even help to increase performance with effective load balancing between servers based on their respective hardware. Andy Overton has already covered how to set up a cluster of servers in standalone mode fronted by mod_cluster for load balancing, so in this post I'll cover clustering in domain mode. I won't rehash mod_cluster settings, so this will just cover the set up of a doman controller on one host, and the host controller and server instances on another host. To follow along with this blog, you'll need to download either JBoss EAP 6.x or WildFly. I'll be using WildFly 8.2 on Xubuntu 14.04. I'll be using $WF_HOME to refer to your WildFly home directory. Configuring the Domain Controller The domain controller needs both the domain.xml and host.xml configured. In the $WF_HOME/domain/configuration directory, you'll see that those two files are joined by a host-master.xml and a host-slave.xml. These are preconfigured host.xml files which you can use to give you a head start in making a host.xml for the domain controller (master) and host controller (slave) to use. You can either change the name of the file to be host.xml, so it will get picked up and used by default, or you can specify the host configuration you want to use on the command line by adding the --host-config argument: domain.sh --host-config=host-master.xml Whether you choose to modify the host.xml or the host-master.xml, you need to make sure that the empty element has been added to the section. This is so that when WildFly looks to see which server is the domain controller, it knows to become the domain controller itself. The other change is optional, but recommended. We need to tell the domain controller to bind its management interface to the correct IP address because, by default, it will bind to localhost, so the management communication it needs to do with the remote hosts won't be able to reach the domain controller at all! We can set this address permanently in the host.xml by making sure the inet-address value is set to the right IP, by changing the 127.0.0.1 in the example below to the correct IP: The result of that is that the default bind IP of the management interface is no longer localhost, although you can still override this value by starting JBoss with the variable left of the colon as a -D argument: domain.sh -Djboss.bind.address.management=10.0.0.1 Next, we need to modify the domain.xml file, where we need to define our server groups; essentially just defining the cluster. Each server group is named, so we can reference it later, and references a particular profile which needs to be one of the profiles named and defined in the same XML file. As I mentioned in my previous blog, domain mode has several profiles in the same file (domain.xml) rather than multiple files for each, like standalone mode (standalone.xml, standalone-ha.xml etc.). In the screenshot, there are two server groups defined - "main-server-group" which references the "full" profile, and "other-server-group" which references the "full-ha" profile. These are just the defaults which come with WildFly, so you're free to use them and modify the settings or create your own from scratch. Whichever you choose, it's a good idea to rename your server group to something meaningful, like a description of the workload, or the application name. Configuring the Host Controllers Every host server which you want to be part of the cluster must have the host.xml file configured. We've already configured the host.xml on the domain controller, so now we'll focus on the host controller. Remember, this process can be repeated on any number of hosts, depending on how many servers you want in your server group and their topology. First, we need to make sure that the domain controller and the host controller can communicate, and to do that we need a valid management user. On the domain controller, run the add-user.sh or add-user.bat script. You will need to make sure to: Choose a management user Make sure the user is different than the one you would use to log in to the web console Confirm that the new user will connect one AS process to another AS process Make a note of the secret value (this is very important!) You will find that you get prompts similar to the following: mike@mike-C2B2:~$ /opt/wildfly/wildfly-8.2.0.Final/bin/add-user.sh What type of user do you wish to add? a) Management User (mgmt-users.properties) b) Application User (application-users.properties) (a): a Enter the details of the new user to add. Using realm 'ManagementRealm' as discovered from the existing property files. Username : mgmt Password recommendations are listed below. To modify these restrictions edit the add-user.properties configuration file. - The password should not be one of the following restricted values {root, admin, administrator} - The password should contain at least 8 characters, 1 alphabetic character(s), 1 digit(s), 1 non-alphanumeric symbol(s) - The password should be different from the username Password : Re-enter Password : What groups do you want this user to belong to? (Please enter a comma separated list, or leave blank for none)[ ]: About to add user 'mgmt' for realm 'ManagementRealm' Is this correct yes/no? yes Added user 'mgmt' to file '/opt/wildfly/wildfly-8.2.0.Final/standalone/configuration/mgmt-users.properties' Added user 'mgmt' to file '/opt/wildfly/wildfly-8.2.0.Final/domain/configuration/mgmt-users.properties' Added user 'mgmt' with groups to file '/opt/wildfly/wildfly-8.2.0.Final/standalone/configuration/mgmt-groups.properties' Added user 'mgmt' with groups to file '/opt/wildfly/wildfly-8.2.0.Final/domain/configuration/mgmt-groups.properties' Is this new user going to be used for one AS process to connect to another AS process? e.g. for a slave host controller connecting to the master or for a Remoting connection for server to server EJB calls. yes/no? yes To represent the user add the following to the server-identities definition Once we have the secret value for our management user, we can add it to the host.xml file. I'm choosing to modify the host-slave.xml file, since much of the configuration I need is done for me: Next, we need to tell the host controller where to look for the domain controller. We set this to for the domain controller's host.xml file, but in the host-slave.xml we have an example tag filled out for us. All we need to do is add the domain controller's IP or hostname exactly as we did for the management bind address earlier. So our host-slave.xml should go from this: to this: This way, like with the management interface on the domain controller, the default address will be 10.0.0.1, but it can also be overridden on the command line if needed. Once we've sorted the communication out, we need to tell the host controller to actually start some server instances! At the bottom of the host-slave.xml file, there are two predefined servers to use: These are already configured to become members of the two server groups configured in the domain.xml. Note that the second server has to have a port offset. Despite it being in a different server group, it's still going to run on the same host and will attempt to bind to the same ports as the first server unless we tell it not to! We would also need to do the same thing if we added other server instances. Optionally, we can make things a little easier for ourselves when managing a lot of servers on a lot of hosts. We can give each server instance its own unique name, but we can also name the host by adding a name attribute to the parent tag, changing it from: to So both in the logs and in the admin console, you should see this host controller referred to as "host1". Now, if you wanted to name your server instances the same across hosts, you'll be able to tell which is which! If all you wanted was to configure a single domain controller and a single host controller, then that's all we need to do to get them speaking to each other. You can then carry on and configure mod_cluster and Apache to forward requests on to the right server, or just deploy your applications and connect to them directly.

April 3, 2015

by Mike Croft

· 23,616 Views

Spark and ZooKeeper: Fault-Tolerant Job Manager out of the Box

Apache Spark, Solr, and Zookeeper work together to create a fault-tolerant, distributed ETL system that converts RDBMS data into Solr documents.

March 28, 2015

by Konstantin Smirnov

· 12,854 Views

Using Google Protocol Buffers with Spring MVC-based REST Services

Written by Josh Long on the Spring blog This week I’m in São Paulo, Brazil presenting at QCon SP. I had an interesting discussion with someone who loves Spring’s REST stack, but wondered if there was something more efficient than plain-ol’ JSON. Indeed, there is! I often get asked about Spring’s support for high-speed binary based encoding of messages. Spring’s long supported RPC encoding with the likes of Hessian, Burlap, etc., and Spring Framework 4.1 introduced support for Google Protocol Buffers which can be used with REST services as well. From the Google Protocol Buffer website: Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages… Google uses Protocol Buffers extensively in their own, internal, service-centric architecture. A .proto document describes the types (_messages_) to be encoded and contains a definition language that should be familiar to anyone who’s used C structs. In the document, you define types, fields in those types, and their ordering (memory offsets!) in the type relative to each other. The .proto files aren’t implementations - they’re declarative descriptions of messages that may be conveyed over the wire. They can prescribe and validate constraints - the type of a given field, or the cardinatlity of that field - on the messages that are encoded and decoded. You must use the Protobuf compiler to generate the appropriate client for your language of choice. You can use Google Protocol Buffers anyway you like, but in this post we’ll look at using it as a way to encode REST service payloads. This approach is powerful: you can use content-negotiation to serve high speed Protocol Buffer payloads to the clients (in any number of languages) that accept it, and something more conventional like JSON for those that don’t. Protocol Buffer messages offer a number of improvements over typical JSON-encoded messages, particularly in a polyglot system where microservices are implemented in various technologies but need to be able to reason about communication between services in a consistant, long-term manner. Protocol Buffers are several nice features that promote stable APIs: Protocol Buffers offer backward compatibility for free. Each field is numbered in a Protocol Buffer, so you don’t have to change the behavior of the code going forward to maintain backward compatability with older clients. Clients that don’t know about new fields won’t bother trying to parse them. Protocol Buffers provide a natural place to specify validation using the required,optional, and repeated keywords. Each client enforces these constraints in their own way. Protocol Buffers are polyglot, and work with all manner of technologies. In the example code for this blog alone there is a Ruby, Python and Java client for the Java service demonstrated. It’s just a matter of using one of the numerous supported compilers. You might think that you could just use Java’s inbuilt serialization mechanism in a homogeneous service environment but, as the Protocol Buffers team were quick to point out whent hey first introduced the technology, there are some problems even with that. Java language luminary Josh Bloch’s epic tome, Effective Java, on page 213, provides further details. Let’s first look at our .proto document: package demo; option java_package = "demo"; option java_outer_classname = "CustomerProtos"; message Customer { required int32 id = 1; required string firstName = 2; required string lastName = 3; enum EmailType { PRIVATE = 1; PROFESSIONAL = 2; } message EmailAddress { required string email = 1; optional EmailType type = 2 [default = PROFESSIONAL]; } repeated EmailAddress email = 5; } message Organization { required string name = 1; repeated Customer customer = 2; } You then pass this definition to the protoc compiler and specify the output type, like this: protoc -I=$IN_DIR --java_out=$OUT_DIR $IN_DIR/customer.proto Here’s the little Bash script I put together to code-generate my various clients: #!/usr/bin/env bash SRC_DIR=`pwd` DST_DIR=`pwd`/../src/main/ echo source: $SRC_DIR echo destination root: $DST_DIR function ensure_implementations(){ # Ruby and Go aren't natively supported it seems # Java and Python are gem list | grep ruby-protocol-buffers || sudo gem install ruby-protocol-buffers go get -u github.com/golang/protobuf/{proto,protoc-gen-go} } function gen(){ D=$1 echo $D OUT=$DST_DIR/$D mkdir -p $OUT protoc -I=$SRC_DIR --${D}_out=$OUT $SRC_DIR/customer.proto } ensure_implementations gen java gen python gen ruby This will generate the appropriate client classes in the src/main/{java,ruby,python}folders. Let’s first look at the Spring MVC REST service itself. A Spring MVC REST Service In our example, we’ll register an instance of Spring framework 4.1’s org.springframework.http.converter.protobuf.ProtobufHttpMessageConverter. This type is an HttpMessageConverter. HttpMessageConverters encode and decode the requests and responses in REST service calls. They’re usually activated after some sort of content negotiation has occurred: if the client specifies Accept: application/x-protobuf, for example, then our REST service will send back the Protocol Buffer-encoded response. package demo; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.boot.SpringApplication; import org.springframework.boot.autoconfigure.SpringBootApplication; import org.springframework.context.annotation.Bean; import org.springframework.http.converter.protobuf.ProtobufHttpMessageConverter; import org.springframework.web.bind.annotation.PathVariable; import org.springframework.web.bind.annotation.RequestMapping; import org.springframework.web.bind.annotation.RestController; import java.util.Arrays; import java.util.Collection; import java.util.Map; import java.util.concurrent.ConcurrentHashMap; import java.util.stream.Collectors; @SpringBootApplication public class DemoApplication { public static void main(String[] args) { SpringApplication.run(DemoApplication.class, args); } @Bean ProtobufHttpMessageConverter protobufHttpMessageConverter() { return new ProtobufHttpMessageConverter(); } private CustomerProtos.Customer customer(int id, String f, String l, Collection emails) { Collection emailAddresses = emails.stream().map(e -> CustomerProtos.Customer.EmailAddress.newBuilder() .setType(CustomerProtos.Customer.EmailType.PROFESSIONAL) .setEmail(e).build()) .collect(Collectors.toList()); return CustomerProtos.Customer.newBuilder() .setFirstName(f) .setLastName(l) .setId(id) .addAllEmail(emailAddresses) .build(); } @Bean CustomerRepository customerRepository() { Map customers = new ConcurrentHashMap<>(); // populate with some dummy data Arrays.asList( customer(1, "Chris", "Richardson", Arrays.asList("[email protected]")), customer(2, "Josh", "Long", Arrays.asList("[email protected]")), customer(3, "Matt", "Stine", Arrays.asList("[email protected]")), customer(4, "Russ", "Miles", Arrays.asList("[email protected]")) ).forEach(c -> customers.put(c.getId(), c)); // our lambda just gets forwarded to Map#get(Integer) return customers::get; } } interface CustomerRepository { CustomerProtos.Customer findById(int id); } @RestController class CustomerRestController { @Autowired private CustomerRepository customerRepository; @RequestMapping("/customers/{id}") CustomerProtos.Customer customer(@PathVariable Integer id) { return this.customerRepository.findById(id); } } Most of this code is pretty straightforward. It’s a Spring Boot application. Spring Boot automatically registers HttpMessageConverter beans so we need only define the ProtobufHttpMessageConverter bean and it gets configured appropriately. The @Configuration class seeds some dummy date and a mock CustomerRepository object. I won’t reproduce the Java type for our Protocol Buffer, demo/CustomerProtos.java, here as it is code-generated bit twiddling and parsing code; not all that interesting to read. One convenience is that the Java implementation automatically provides builder methods for quickly creating instances of these types in Java. The code-generated types are dumb struct like objects. They’re suitable for use as DTOs, but should not be used as the basis for your API. Do not extend them using Java inheritance to introduce new functionality; it’ll break the implementation and it’s bad OOP practice, anyway. If you want to keep things cleaner, simply wrapt and adapt them as appropriate, perhaps handling conversion from an ORM entity to the Protocol Buffer client type as appropriate in that wrapper. HttpMessageConverters may also be used with Spring’s REST client, the RestTemplate. Here’s the appropriate Java-language unit test: package demo; import org.junit.Test; import org.junit.runner.RunWith; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.boot.test.IntegrationTest; import org.springframework.boot.test.SpringApplicationConfiguration; import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Configuration; import org.springframework.http.ResponseEntity; import org.springframework.http.converter.protobuf.ProtobufHttpMessageConverter; import org.springframework.test.context.junit4.SpringJUnit4ClassRunner; import org.springframework.test.context.web.WebAppConfiguration; import org.springframework.web.client.RestTemplate; import java.util.Arrays; @RunWith(SpringJUnit4ClassRunner.class) @SpringApplicationConfiguration(classes = DemoApplication.class) @WebAppConfiguration @IntegrationTest public class DemoApplicationTests { @Configuration public static class RestClientConfiguration { @Bean RestTemplate restTemplate(ProtobufHttpMessageConverter hmc) { return new RestTemplate(Arrays.asList(hmc)); } @Bean ProtobufHttpMessageConverter protobufHttpMessageConverter() { return new ProtobufHttpMessageConverter(); } } @Autowired private RestTemplate restTemplate; private int port = 8080; @Test public void contextLoaded() { ResponseEntity customer = restTemplate.getForEntity( "http://127.0.0.1:" + port + "/customers/2", CustomerProtos.Customer.class); System.out.println("customer retrieved: " + customer.toString()); } } Things just work as you’d expect, not only in Java and Spring, but also in Ruby and Python. For completeness, here is a simple client using Ruby (client types omitted): #!/usr/bin/env ruby require './customer.pb' require 'net/http' require 'uri' uri = URI.parse('http://localhost:8080/customers/3') body = Net::HTTP.get(uri) puts Demo::Customer.parse(body) ..and here’s a client in Python (client types omitted): #!/usr/bin/env python import urllib import customer_pb2 if __name__ == '__main__': customer = customer_pb2.Customer() customers_read = urllib.urlopen('http://localhost:8080/customers/1').read() customer.ParseFromString(customers_read) print customer Where to go from Here If you want very high speed message encoding that works with multiple languages, Protocol Buffers are a compelling option. There are other encoding technologies like Avro or Thrift, but none nearly so mature and entrenched as Protocol Buffers. You don’t necessarily need to use Protocol Buffers with REST, either. You could plug it into some sort of RPC service, if that’s your style. There are almost as many client implementations as there are buildpacks for Cloud Foundry - so you could run almost anything on Cloud Foundry and enjoy the same high speed, consistent messaging across all your services! The code for this example is available online, as well, so don’t hesitate to check it out! Also.. Hi gang, in 2015, I’ve been trying to do a random tech-tip style post every week based on things that I see garnering interest in the community, either here or on the Pivotal blog. I use these weekly-_ish_ (OK! OK! - it’s not been easy doing them as regularly as This Week in Spring, but so far I haven’t missed a week! :-) ) posts as a chance to focus not on a specific new release, per se, but on the application of Spring in service to some community use case that might be cross-cutting or just might benefit from having a spotlight shined on it. So far we’ve looked at all manner of things - Vaadin, Activiti, 12-Factor App Style Configuration, Smarter Service to Service Invocations, Couchbase, and much more, etc. - and we’ve got some interesting stuff lined up, too. I wondered what else you want to see talked about, however. If you’ve got some ideas about what you’d like to see covered, or a community post of your own to contribute, reach out to me on Twitter (@starbuxman) or via email (jlong [at] pivotal [dot] io). I remain, as always, at your service.

March 27, 2015

by Pieter Humphrey

· 15,204 Views

How to Write a "Hello, World!" Microservice

What does implementing microservices mean for a software developer? Especially, for the rookies, greenhorns, and newbs out there? I’m not talking about microservice software architecture here; this is about microservices software development. And not just that, the ultimate implementation goal should be “microservices done right”. For this post, I’ll go with Java. Yes, it’s wordy. Yes, it’s resource intensive (especially when used for the sole purpose of returning a single string). However the concept of classes and objects goes well with my intention of explaining how to do microservices correctly. Plus, it makes sense to use microservices in environments that are heavily biased towards Java. Anyway, please feel free to add your own “Hello, World!” microservice in your favorite language in the comments section below. Hello, monolith! As a prerequisite, you should be familiar with the following piece of code, what it does, and why it has to look the way it does (read this tutorial if you don’t): class Starter { public static void main(String[] args) { System.out.println(“Hello, World!”); } } This is a simple console application that yields the string “Hello, World!” This is not written in the microservice way. This is an example of when not to use the microservices approach: if all you need on your console is a single string, this is all you need. Hello, code duplication! In addition to this console application, I want this string to be available on the web by calling http://localhost:80/helloWorld.servlet from a browser. Here is the required code, implemented as plain HTTP servlet (yes, it’s wordy. Get over it.) class HelloWorldServlet extends HttpServlet { public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { response.getWriter().println(“Hello, World!”); } } The string “Hello, World!” has to be “implemented” again. Sure, this is no big deal. But this simple string could be so much more. It could be the result of a complex calculation or it could be the result of a time consuming search query. So, just imagine that the string “Hello, World!” is the result of a week’s worth of hard work (If you’re new to programming, it may very well be...). How should you go about making it available to apps and services that you create? Step 1: HelloWorldService.java To save yourself from duplicating a week’s worth of coding, allow me to introduce the HelloWorldService class: class HelloWorldService { public String greet() { return “Hello, World!”; } } You can re-use this fine piece of software craftmanship in all your apps and classes without re-implementing or duplicating code. Here’s our console application again: class Starter { HelloWorldService helloWorldService = new HelloWorldService(); public static void main(String[] args) { String message = helloWorldService.greet(); System.out.println(message); } } The same goes for servlets: class HelloWorldServlet extends HttpServlet { HelloWorldService helloWorldService = new HelloWorldService(); public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { String message = helloWorldService.greet(); response.getWriter().println(message); } } It also works great for Spring MVC controllers: @Controller class HelloWorldController { HelloWorldService helloWorldService = new HelloWorldService(); @RequestMapping("/helloWorld") public String greet() { String message = helloWorldService.greet(); return message; } } I could go on and show more examples, but I think you get the point (spoiler: it’s the bold lines that matter). Those of you who are familiar with microservices could point out that this may be fine for getting rid of code duplication, but this is no microservice. You’re right, but to get to “microservices done right,” you have to be able to separate you app’s concerns, which is what I did here in the most possible basic way: I separated the app’s frontend concerns from its backend concerns. The frontend is either a console app or a servlet, the backend is HelloWorldService. Serviceward, ho! To go down microservice lane from here, all we have to do is wrap HelloWorldService into some kind of web component that makes it accessible via HTTP, right? Let’s see… First, we could just use our servlet code from above, as it conveniently returns the string as a response to any HTTP request. But we won’t. Why? Because there’s something missing: fault tolerance. What could possibly fail when returning a simple string? That’s not the point. What matters is that the client side (the code that calls HelloWorldService) should be given enough information to effectively react to failures. We face two possible problems: The service as a whole may be unavailable The service may be unable to return a proper response The service is unavailable If a service is unavailable, it’s the client that is responsible for dealing with the situation. Frameworks like unirest.io save you the effort of writing many lines of code when dealing with HTTP requests. Future> future = Unirest.post("HTTP://helloworld.myservices.local/greet") .header("accept", "application/json") .asJsonAsync(new Callback() { public void failed(UnirestException e) { //tell them UI folks that the request went south } public void completed(HttpResponse response) { //extract data from response and fulfill it’s destiny } public void cancelled() { //shot a note to UI dept that the request got cancelled } } ); With this code, the client now knows when the service is not available or has timed out following no response. Wee can easily have an error message displayed in place of the string we expected to receive. Try/catch is probably the right solution here. Invalid responses however pose more of a challenge. The service fails If the service fails, we can just return a string with an appropriate error message. But how can you know if a message is an error message or a correct response? Yes, you can start every error message with [ERROR] or invent another “smart” (read: not-so-smart) workaround, but this won’t be a solution you’ll be proud of. And, there’s always the possibility that even valid responses may begin with ERROR because it’s simply part of the message. I’d go with JSON or XML for wrapping the answer. I prefer JSON because it’s a little less wordy than XML. And I really like using the JSON-HTML tool over at json.bloople.net for visualizing results during development. Of course, you might go for any of the numerous alternatives, like protobuf or a proprietary solution of your own. The main point is that you need to be able to apply structure to responses: { “status”:”ok”, ”message”:”Hello, World!” } By checking the status attribute, you can easily decide whether to handle an error or to display an appropriate message. { “status”:”error”, ”message”:”Invalid input parameter” } The possibilities are endless here. You can add an error code or additional properties. This all boils down to a single important point: apply structure to your responses. Structure, why? Because structure not only helps you keep your code maintainable, it also serves as the foundation of the API of your service. An API definition consists of more than a URL like this: GET HTTP://helloworld.myservices.local/greet API definitions also consist of the response structures that can be expected as a response (you know this already from a few lines back): { “status”:”ok”, ”message”:”Hello, World!” } Most important takeaway Keeping the API specifications of a service’s request and response stable is a key requirement for succeeding with microservices. Conclusion Are you (and your project) ready for microservices? If you read this and kept asking yourself, what good is all the overhead of microservices, then either your project won’t benefit from microservices or you’re just not there yet (for mindset perspective see my previous post about the value of microservices ). If you can’t stop thinking about microservices, then you probably are ready.

March 13, 2015

by Martin Goodwell

· 31,321 Views · 7 Likes

Using Jenkins as a Reverse Proxy for IIS

Jenkins is one of the most popular build servers and it runs on a wide variety of platforms (Windows, Linux, Mac OS X) and can build software for most programming languages (Java, C#, C++, …). And best of all, it is fully open source and free to use. By default Jenkins runs on the port 8080, which can be troublesome as this not the standard port 80 used by most web applications. But running on port 80 is in most cases not possible as the webserver is already using this port. Luckily IIS has a neat feature that allows it to act as a reverse proxy. The reverse proxy mode allows to forward traffic from IIS to another web server (Jenkins in this example) and send the responses back through IIS. This allows us to assign a regular DNS address to Jenkins and use the standard HTTP port 80. In this guide, I will explain you how you can set this up. What is required? You need an installation of IIS 7 or higher and you need to install the additional modules “URL Rewrite and “Application Request Routing”. The easiest way to install these modules is through the Microsoft Web Platform Installer. Configuring IIS Once the two necessary modules are installed, you have to create a new website in IIS. In my example I bind this website to the DNS alias “Jenkins.test.intranet”. You can bind this of course to the DNS of your choice (or to no specific DNS entry). Next you must copy the following web.config to the root of newly created website. This rule forwards all the traffic to http://localhost:8080/, the address on which Jenkins is running. It is also possible to configure this through the GUI with the URL Rewrite dialog boxes. I you are not forwarding to a localhost address, you need to go into the dialogs of Application Requet Routing and check the “Enable proxy” property.

March 9, 2015

by Pieter De Rycke

· 11,123 Views

How to Support Multi-Speed IT with DevOps and Agile

These days a lot of organizations talk about Multi-Speed IT, so I thought I’d share my thoughts on this. I think the concept has been around for a while but now there is a nice label to associate this idea with. Let’s start by looking at why Multi-Speed IT is important. The idea is best illustrated by a picture of two interlocking gears of different sizes and by using a simple example to explain the concept. Different Speeds for Different Needs One easy way to recall what multi-speed IT looks like is to remember that there are multiple speeds for multiple needs. This is to say that there are different IT programs that may be most useful at various speeds. Some departments and applications need to move very rapidly, but others can move at a slower pace that works best for them. Regardless of which department you are focused on at the moment, it is important to know that it will have specialized needs that you need to look after, and that is why so many people are now looking at multi-speed IT as the best way to accomplish what they set out to accomplish. The smaller gear moves much faster than the larger one, but where the two gears interlock they remain aligned to not stop the motion. But what does this mean in reality? Think about a banking app on your mobile. Your bank might update the app on a weekly basis with new functionality like reporting and/or an improved user interface. That is a reasonable fast release cycle. The mainframe system that sits in the background and provides the mobile app with your account balance and transaction details does not have to change at the same speed. In fact, it might only have to provide a new service for the mobile app once every quarter. Nonetheless, the changes between those two systems need to align when new functionality is rolled out. However, it doesn’t mean both systems need to release at the same speed. In general, the customer-facing systems are the fast applications (Systems of Engagement, Digital) and the slower ones are the Systems of Record or backend systems. The release cycles should take this into consideration. So how do you get ready for the Multi-Speed IT Delivery Model? Release Strategy (Agile) – Identify functionality that requires changes in multiple systems and ones that can be done in isolation. If you follow an Agile approach, you can align every n-th release for releasing functionality that is aligned while the releases in between can deliver isolated changes for the fast-moving applications. Application Architecture – Use versioned interface agreements so that you can decouple the gears (read applications) temporarily. This means you can release a new version of a backend system or a front-end system without impacting the current functionality of the other. Once the other system catches up, new functionality becomes available across the system. This allows you to keep to your individual release schedule, which in turn means delivery is a lot less complex and interdependent. In the picture I used above, think of this as the clutch that temporarily disengages the gears. Technical Practices and Tools (DevOps) – If the application architecture decoupling is the clutch, then the technical practices and tools are the grease. This is where DevOps comes into the picture. The whole idea of Multi-Speed IT is to make the delivery of functionality less interdependent. On the flip side, you need to spend more effort on getting the right practices and tools in place to support this. For example, you want to make sure that you can quickly test the different interface versions with automated testing, you need to have good version control to make sure you have in place the right components for each application, and you also want to make sure you can manage your code line very well through abstractions and branching where required. And the basics of configuration management, packaging, and deployment will become even more important as you want to reduce the number of variables you have to deal with in your environments. You better remove those variables introduced through manual steps by having these processes completely automated. Testing strategies – Given that you are now dealing with multiple versions of components being in the environment at the same time, you have to rethink your testing strategies. The rules of combinatorics make it very clear that it only takes a few different variables before it becomes unmanageable to test all permutations. So we need to think about different testing strategies that focus on valid permutations and risk profiles. After all, functionality that is not yet live requires less testing than the ones that will go live next. The above points cover the technical aspects but to get there you will also have to solve some of the organizational challenges. Let me just highlight 3 of them here: Partnership with delivery partners – It will be important to choose your partners wisely. Perhaps it helps to think of your partner ecosystem in three categories: Innovators (the ones who work with you in innovative spaces and with new technologies), Workhorses(the ones who support your core business applications that continue to change) and Commodities (the ones who run legacy applications that don’t require much new functionality and attention). It should be clear that you need to treat them differently in regards to contracts and incentives. I will blog later about the best way to incentivize your workhorses, the area that I see most challenges in. Application Portfolio Management - Of course, to find the right partner you first need to understand what your needs are. Look across your application portfolio and determine where your applications sit across the following dimensions: Importance to business, exposure to customers, frequency of change, and volume of change. Based on this you can find the right partner to optimize the outcome for each application. Governance – Last but not least, governance is very important. In a multi-speed IT world you will need flexible governance. One size fits all will not be good enough. You will need lightweight system-driven governance for your high-speed applications and you can probably afford a more PowerPoint/Excel-driven manual governance for your slower-changing applications. If you can run status reports of live systems (like Jira, RTC, or TFS) for your fast applications you are another step closer to mastering the multi-speed IT world.

March 2, 2015

by Mirco Hering

CORE

· 8,438 Views

How to Speed Up Your Gradle Build From 90 to 8 Minutes

Even though I was supposed to write a series of blog posts about micro-infra-spring here at Too Much Coding blog, today I'll write about how we've managed to decrease our biggest project's build time from 90 to 8 minutes! In one of the companies I've been working for we've faced a big problem related to pull request build times. We have one monolithic application that we are in progress of slicing into microservices but still until this process is finished we have to build that big app for each PR. We needed to change things to have really fast feedback from our build so that pull request builds don't get queued up endlessly in our CI. You can only imagine the frustration of developers who can't have their branches merged to master because of the waiting time. Structure In that project we have over 200 Gradle modules and over a dozen big projects (countries) from which we can build some (really fat) fat-jars. We have also a core module that if we change then we would have to rebuild all the big projects to check if they weren't affected by the modifications. There are a few old countries that are using GWT compilers and we have some JS tasks executed too. Initial stats Before we started to work on optimization of the process the whole application (all the countries) was built in about 1h 30 minutes. Current build time: ~90 minutes. Profile your build First thing that we've done was to run the build with the --profile switch. ./gradlew clean buildAll --profile That way Gradle created awesome stats for our build. If you are doing any sort of optimization then it's crucial to gather measurements and statistics. Check out this Gradle page about profiling your build for more info on that switch and features. Exclude long running tasks in dev mode It turned out that we are spending a lot of time on JS minification and on GWT compilation. That's why we have added a custom property -PdevMode to disable some long running tasks in dev mode build. Those tasks were: excluded JS minification benefit: 13 countries * ~60 secs * at least 2 modules where minification occurred ~ 26 minutes optimized GWT compilation: have permutations done for only 1 browser (by default it's done for multiple browsers) disable optimization of the compilation (-optimize 0) add the -draftCompile switch to to compile quickly with minimal optimizations benefit: about 2 minutes less on GWT compilation * sth like 5 projects with GWT ~ 10 minutes Overall gain: ~ 40 minutes. Current build time: ~50 minutes. Check out your tests Together with the one and only Adam Chudzik we have started to write our own Gradle Test Profiler (it's a super beta version ;) ) that created a single CSV with sorted tests by their execution time. We needed quick and easy gains without endless test refactoring and it turned out that it's really simple. One of our tests took 50 seconds to execute and it was testing a feature that has and will never be turned on on production. Of course there were plenty of other tests that we should take a look into (we'd have to look for test duplication, check out the test setup etc.) but it would involve more time, help of a QA and we needed quick gains. Benefit: By simple disabling this test we gained about 1 minute. Overall gain: ~ 41 minutes. Current build time: ~49 minutes. Turn on the --parallel Gradle flag at least for the compilation Even though at this point our gains were more or less 40 minutes it was still unacceptable for us to wait 40 minutes for the pull request to be built. That's why we decided to go parallel! Let's build the projects (over 200) in parallel and we'll gain a lot of time on that. When you execute the Gradle build with the --parallel flag Gradle calculates how many threads can be used to concurrently build the modules. For more info go to the Gradle's documentation on parallel project execution. It's an incubating feature so wen we started to get BindExceptions on port allocation we initially thought that most likely it's Gradle's fault. Then we had a chat with Szczepan Faberwho worked for Gradleware and it turns out that the feature is actually really mature (thx Szczepan for the help BTW :) ). We needed quick gains so instead of fixing the port binding stuff we decided only to compile everything in parallel and then run tests sequentially. ./gradlew clean buildAll -PdevMode -x test --parallel && ./gradlew buildAll-PdevMode Benefit: By doing this lame looking hack we gained ~4 mintues (on my 8 core laptop). Overall gain: ~ 45 minutes. Current build time: ~45 minutes. Don't be a jerk - just prepare your tests for parallelization This command seemed so lame that we couldn't even look at it. That's why we said - let's not be jerks and just fix the port issues. So we went through the code, randomized all the fixed ports, patched micro-infra-spring so it does the same upon Wiremock and Zookeeper instantiation and just ran the building of the project like this: ./gradlew clean buildAll-PdevMode --parallel We were sure that this is the killer feature that we were lacking and we're going to win the lottery. Much to our surprise the result was really disappointing. Benefit: Concurrent project build decreased the time by ~5 minutes. Overall gain: ~ 50 minutes. Current build time: ~40 minutes. Check out your project structure You can only imagine the number of WTFs that were there in our office. How on earth is that possible? We've opened up htop, iotop and all the possible tools including vmstat to see what the hell was going on. It turned out that context switching is at an acceptable level whereas at some point of the build only part of the cores are used as if sth was executed sequentially! The answer to that mystery was pretty simple. We had a wrong project structure. We had a module that ended up as a test-jar in testCompile dependency of other projects. That means that the vast majority of modules where waiting for this project to be built. Built means compiled and tested. It turned out that this test-jar module had also plenty of slow integration tests in it so only after those tests were executed could other modules be actually built! Simple source moving can drastically increase your speed By simply moving those slow tests to a separate module we've unblocked the build of all modules that were previously waiting. Now we could do further optimization - we've split the slow integration tests into two modules to make all the modules in the whole project be built in more or less equal time (around 3,5 minutes). . Benefit: Fixing the project structure decreased the time by ~10 minutes Overall gain: ~ 60 minutes. Current build time: ~30 minutes. Don't save on machine power We've invested in some big AWS instance with 32 cores and 60 gb of RAM to really profit from the parallel build's possibilities. We're paying about 1.68$ per one hour of such machine's (c3.8xlarge) working time. If someone form the management tells you that that machine costs a lot of money and the company can't afford it you can actually do a fast calculation. You can ask this manager what is more expensive - paying for the machine or paying the developer for 77 minutes * number of builds of waiting? Benefit: Paying for a really good machine on AWS decreased the build time by ~22 minutes Overall gain: ~ 82 minutes. Current build time: ~8 minutes. What else can we do? Is that it? Can we decrease the time further on? Sure we can! Possible solutions are: Go through all of the tests and check why some of them take so long to run Go through the integration tests and check if don't duplicate the logic - we will remove them We're using Liquibase for schema versioning and we haven't merged the changests for some time thus sth like 100 changesets are executed each time we boot up Spring context (it takes more or less 30 seconds) We could limit the Spring context scope for different parts of our applications so that Spring boots up faster Buy a more powerful machine ;) There is also another, better way ;) SPLIT THE MONOLITH INTO MICROSERVICES AND GO TO PRODUCTION IN 5 MINUTES ;) Summary Hopefully I've managed to show you how you can really speed up your build process. The work to be done is difficult, sometimes really frustrating but as you can see very fruitful.

February 18, 2015

by Marcin Grzejszczak

· 60,197 Views · 3 Likes

Microservices: Five Architectural Constraints

Microservices is a new software architecture and delivery paradigm, where applications are composed of several small runtime services. The current mainstream approach for software delivery is to build, integrate, and test entire applications as a monolith. This approach requires any software change, however small, to require a full test cycle of the entire application. With Microservices a software module is delivered as an independent runtime service with a well defined API. The Microservices approach allow faster delivery of smaller incremental changes to an application. There are several tradeoffs to consider with the Microservices architecture. On one hand, the Microservices approach builds on several best practices and patterns for software design, architecture, and DevOps style organization. On the other hand, Microservices requires expertise in distributed programming and can become an operational nightmare without proper tooling in place. There are several good posts that highlight the pros-and-cons of Microservices, and I have added in the references section. In the remainder of this post, I will define five architectural constraints (principles that drive desired properties) for the Microservices architectural style. To be a Microservice, a service must be: Elastic Resilient Composable Minimal, and; Complete Microservice Constraint #1 - Elastic A microservice must be able to scale, up or down, independently of other services in the same application. This constraint implies that based on load, or other factors, you can fine tune your applications performance, availability, and resource usage. This constraint can be realized in different ways, but a popular pattern is to architect the system so that you can run multiple stateless instances of each microservice, and there is a mechanism for Service naming, registration, and discovery along with routing and load-balancing of requests. Microservice Constraint #2 - Resilient A microservice must fail without impacting other services in the same application. A failure of a single service instance should have minimal impact on the application. A failure of all instances of a microservice, should only impact a single application function and users should be able to continue using the rest of the application without impact. Adrian Cockroft describes Microservices as loosely coupled service oriented architecture with bounded contexts [3]. To be resilient a service has to be loosely coupled with other services, and a bounded context limits a service’s failure domain. Microservice Constraint #3 - Composable A microservice must offer an interface that is uniform and is designed to support service composition. Microservice APIs should be designed with a common way of identifying, representing, and manipulating resources, describing the API schema and supported API operations. The ‘Uniform Interfaces constraint of the REST architectural style describes this in detail. Service Composition is a SOA principle that has fairly obvious benefits, but few guidelines on how it can be achieved. A Microservice interface should be designed to support composition patterns like aggregation, linking, and higher-level functions such as caching, proxies and gateways. I previously discussed REST constraints and elements in as two part blog post: REST is not about APIs Microservice Constraint #4 - Minimal A microservice must only contain highly cohesive entities In software, cohesion is a measure of whether things belong together. A module is said to have high cohesion if all objects and functions in it are focused on the same tasks. Higher cohesion leads to more maintainable software. A Microservice should perform a single business function, which implies that all of its components are highly cohesive. This is also an Single Responsibility Principle (SRP) of object-oriented design [5] Microservice Constraint #5 - Complete A microservice must be functionally complete Bjarne Stroustrup, the creator of C++, stated that a good interface must be, “minimal but complete” i.e. as small as possible, and no smaller. Similarly, a Microservice must offer a complete function, with minimal dependencies (loose coupling) to other services in the application. This is important, as otherwise its becomes impossible to version and upgrade individual services. This constraint is designed to oppose the minimal constraint. Put together a microservice must be “minimal but complete.” Conclusions Designing a Microservices application requires application of several principles, patterns, and best practices of modular design and service-oriented architectures. In this post, I've outlined five architectural constraints which can help guide and retain the key benefits of a Microservices-style architecture. For example, Microservices Constraint# 1 - Elastic steers implementations towards separating the data tier from the application tier, and leads to stateless services. At Nirmata we have built our solution, that makes it easy to deploy and operate microservices applications, using these very same principles. We believe that Microservices style applications, running in containers, will power the next generation of software innovation. If you are using, or interested in using microservices, I would love to hear from you. Jim Bugwadia Founder and CEO Nirmata -- For additional content and articles follow us at @NirmataCloud. -- If you are in the San Francisco Bay Area, come join our Microservices meetup group. References [1] Microservices, Martin Fowler and James Lewis, http://martinfowler.com/articles/microservices.html [2] Microservices Are Not a free lunch!, Benjamin Wootton, http://contino.co.uk/microservices-not-a-free-lunch/ [3] State of the Art in Microservices, Adrian Cockroft, http://thenewstack.io/dockercon-europe-adrian-cockcroft-on-the-state-of-microservices/ [4] The Principles of Object-Oriented Design, Robert C. Martin, http://butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod

February 5, 2015

by Jim Bugwadia

· 13,318 Views · 7 Likes

Dropwizard vs Spring Boot—A Comparison Matrix

Of late, I have been looking into Microservice containers that are available out there to help speed up the development. Although, Microservice is a generic term however there is some consensus with respect to what it means. Hence, we may conveniently refer to the definition Microservice as an "architectural design pattern, in which complex applications are composed of small, independent processes communicating with each other using language-agnostic APIs. These services are small, highly decoupled and focus on doing a small task." There are several Microservice containers out there. However, in my experience I have found Dropwizard and Spring-boot to have had received more attention and they appear to be widely used compared to the rest. In my current role, I was asked create a comparison matrix between the two, so it's here below. Dropwizard Spring-Boot What is it? Dropwizard pulls together stable, mature libraries from the Java ecosystem into a simple, light-weight package that lets you focus on getting things done. [more...] Takes an opinionated view of building production-ready Spring applications. Spring Boot favours convention over configuration and is designed to get you up and running as quickly as possible. [more...] Overview? Dropwizard straddles the line between being a library and a framework. Provide performant, reliable implementations of everything a production-ready web application needs. [more...] Spring-boot takes an opinionated view of the Spring platform and third-party libraries so you can get started with minimum fuss. Most Spring Boot applications need very little Spring configuration. [more...] Out of the box features? Dropwizard has out-of-the-box support for sophisticated configuration, application metrics, logging, operational tools, and much more, allowing you and your team to ship a production-quality web service in the shortest time possible. [more...] Spring-boot provides a range of non-functional features that are common to large classes of projects (e.g. embedded servers, security, metrics, health checks, externalized configuration). [more...] Libraries Core: Jetty, Jersey, Jackson and Matrics Others: Guava, Liquibase and Joda Time. Spring, JUnit, Logback, Guava. There are several starter POM files covering various use cases, which can be included in the POM to get started. Dependency Injection? No built in Dependency Injection. Requires a 3rd party dependency injection framework such as Guice, CDI or Dagger. [Ref...] Built in Dependency Injection provided by Spring Dependency Injection container. [Ref...] Types of Services i.e. REST, SOAP Has some support for other types of services but primarily is designed for performant HTTP/REST LAYER. If ever need to integrate SOAP, there is a dropwizard bundle for building SOAP web services using JAX-WS API is provided here but it’s not official drop-wizard sub project. [more...] As well as supporting REST Spring-boot has support for other types of services such as JMS, Advanced Message Queuing Protocol, SOAP based Web Services to name a few. [more...] Deployment? How it creates the Executable Jar? Uses Shading to build executable fat jars, where a shaded jar spackages all classes, from all jars, into a single 'uber jar'. [Ref...] Spring-boot adopts a different approach and avoids shaded jars, as it becomes hard to see which libraries you are actually using in your application. It can also be problematic if the same filename is used in Shaded jars. Instead it uses “Nested Jar” approach where all classes from all jars do not need to be included into a single “uber jar” instead all dependent jars should be in the “lib” folder, spring loader loads them appropriately. [Ref...] Contract First Web Services? No built in support. Would have to refer to 3rd party library (CXF or any other JAX-WS implementation) if needed a solution for the Contract First SOAP based services. Contract First services support is available with the help of spring-boot-starter-ws starter application. [Ref...] Externalised Configuration for properties and YAML Supports both Properties and YAML Supports both Properties and YAML Concluding Remarks If dealing with only REST micro services, drop wizard is an excellent choice. Where Spring-boot shines is the types of services supported i.e. REST, JMS, Messaging, and Contract First Services. Not least a fully built in Dependency Injection container. Disclaimer: The matrix is purely based on my personal views and experiences, having tried both frameworks and is by no means an exhaustive guide. Readers are requested to do their own research before making a strategic decision between the two very formidable frameworks.

February 2, 2015

by Rizwan Ullah

· 74,196 Views · 9 Likes

Git Flow and Immutable Build Artifacts

We love Git Flow. It’s awesome to have a standard release process where everybody is using the same terminology and tooling. It’s also great to have out-of-the-box answers to the same questions that get asked at the start of every project. For example: “How are we going to develop features in isolation?” “How are we going to separate release candidates from ongoing development?” “How are we going to deal with hotfixes?” Now it’s enough to just say ‘We use Git Flow’, and everybody’s on the same page. Well, mostly. Whilst Git Flow is terrific for managing features and separating release candidates from ongoing development, things get a little hazier when it comes time to actually release to production. This is because of a mismatch between the way that Git Flow works and another practice that is common in large development projects: immutable build artifacts. Immutable what? On most enterprise projects I work on these days, it’s considered a pretty good idea to progress exactly the same build artifact through testing, pre-production and into production. It doesn’t matter whether it’s a JAR file, WAR file, tar.gz file, or something more exotic – the key point is that you build it once, then deploy the same thing to each downstream environment. If you find a bug during this journey, you fix the bug, build a whole new artifact, and start the process again. In the absence of any established terminology, let’s just call these things immutable build artifacts. (Note that you’ll still need a separate, external file for those things that have to be different between environments, but by keeping the amount of stuff in that file to an absolute minimum, you’ll minimise your potential exposure to variances between environments.) Without immutable build artifacts, many development managers will start to sweat nervously. This is usually because at some stage in their careers they’ve been up at 2am in the morning prising apart build artifacts and trying to understand what changed between the pre-prod build (which worked just fine) and the current prod build (which is inexplicably failing). A bunch of things can cause this sort of problem. For example, it could be that somebody managed to sneak some change in between the two builds, or that some downstream build dependency changed between the two builds, or even that somebody tweaked the build process between the two builds. ‘That’ll never happen to me’, I hear you say, ‘if everybody follows our development methodology correctly’. And therein lies the problem: it’s difficult to absolutely guarantee that a developer won’t at the last minute do something stupid, like slip something into the wrong branch, upload an incorrectly-versioned downstream dependency, or mess with your buildbox configuration. My point is this: why open yourself up to the risk at all, when having a single artifact will eliminate a whole class of potential defects? Now for the actual problem A while back, I was working on a project using both Git Flow and immutable builds. A junior developer came to me looking confused. He was trying to understand when it would be OK for him to finish the current Git Flow release. The testers had just signed off on the current build. But if he finished the release, Git Flow was going to merge the release branch into master – and strictly speaking, he should create a new build from that. But if he created a new build, the testers were going to want to test it. If they found a problem, the fix was going to have to happen in a new release branch. But then when he closed that release, he was going to have to do a new build from master, which the testers would want to test again, right? And then, if they found a bug in that… I admired his highly conscientious approach to his work, but worried that his mind was going to disintegrate as it spun around this build-management paradox. I also had to acknowledge that he had stumbled head first into something that I had been wilfully ignoring in the hope that it would resolve itself; namely, the fundamental mismatch between Git Flow and the concept of immutable builds. Put simply, Git Flow works on the premise that production builds will only be done from the master branch. In contrast, immutable builds work on the premise that production builds will be done from either a release branch or a hotfix branch. There is no perfect way to work around this mismatch. The pragmatic solution Put bluntly, if you want truly immutable builds, then you should do them from the release or hotfix branch rather than master. However, because this runs contrary to how Git Flow works, there are a couple of important consequences to keep in mind. 1. Concurrent hotfixes and releases require special handling If you start and then finish a hotfix whilst a release branch is in process, then that hotfix’s changes won’t automatically be brought into the release branch by Git Flow. This means that a subsequent build from the release branch which goes into production won’t include the hotfix changes. Here’s a diagram illustrating the problem: Note that the inverse problem would apply if you tried to start a release whilst a hotfix was in progress. Thankfully, Git Flow will not let you have two release branches in progress at the same time, so we don’t have to worry about that particular possibility. (If you’re wondering how Git Flow avoids these scenarios in its regular usage, remember that it always merges back into master when a release or hotfix is finished, and that you’re supposed to always build from master. Consequently, if you’re building from master post-release or post-hotfix, you’ll always be getting the latest changes into the build.) The problem of concurrent hotfixes and releases can be solved by merging the hotfix branch into the release branch just before the hotfix branch gets finished (or vice-versa if you started a release whilst a hotfix was in progress). Here’s what it looks like: However, you will need to remember to do this merge manually because Git Flow won’t do it for you. 2. Version tags will be incorrect by default When you finish a release or hotfix, Git Flow merges the corresponding branch back into master, and then tags the resultant merge commit on master with the version number. However, because your immutable artifact will have been built from the release/hotfix branch, the commit SHA for the version tag in Git will be different from the SHA that the artifact was actually built from. Continuing on from our previous example: You may or may not care about this, for a number of reasons. Firstly, from the perspective of Git Flow, the merge commit on master should never have any changes in it. The only scenario that might lead to it having changes is if you have run concurrent hotfix and release branches (as described in the previous section) and forgotten to merge them prior to finishing. And you’llnever forget to do that now will you? :) Secondly, in the likely event that you are using some sort of continuous integration server to produce your builds, that server can probably associate its own number with each build, and stamp the resultant artifact with that number. The CI server will probably also have recorded the SHA that each build was done from. So from the artifact you could probably work backwards to the SHA that was used to produce it. If you’re nevertheless wary of the confusion that this backtracking process might introduce when trying to debug a production issue at 2am in the morning, you might still insist on having correct version tags. In that case the best thing I can think of is to manually create the tag on the release/hotfix branch yourself before finishing it and then, when finishing the release, run git flow [release/hotfix] finish with the -n option to stop it from trying to create the tag again (which would fail because the tag now already exists). I’m on a roll with my diagrams, so what the heck, let’s do one more: Wrapping Up On balance, I think the benefits of having immutable builds outweigh the costs of deviating from the Git Flow way of doing things. However, you do have to be aware of a couple of problematic scenarios. We’ll be experimenting in coming months with using Git Flow with our own immutable builds, and will let you know if anything else weird pops up. Who knows, perhaps we’ll even end up with our own version of Git Flow. Either way, I’d be interested in hearing from anybody else who has had the same problem.

January 30, 2015

by Ben Teese

· 11,067 Views · 1 Like

Mule ESB in Docker

In this article I will attempt to run the Mule ESB community edition in Docker in order to see whether it is feasible without any greater inconvenience. My goal is to be able to use Docker both when testing as well as in a production environment in order to gain better control over the environment and to separate different types of environments. I imagine that most of the Docker-related information can be applied to other applications – I have used Mule since it is what I usually work with. The conclusion I have made after having completed my experiments is that it is possible to run Mule ESB in Docker without any inconvenience. In addition, Docker will indeed allow me to have better control over the different environments and also allow me to separate them as I find appropriate. Finally, I just want to mention that I have used Docker in an Ubuntu environment. I have not attempted any of the exercises in Docker running on Windows or Mac OS X. Docker Briefly In short, Docker allows for creating of images that serve as blueprints for containers. A Docker container is an instance of a Docker image in the same way a Java object is an instance of a Java class. FROM codingtony/java MAINTAINER tony(dot)bussieres(at)ticksmith(dot)com RUN wget https://repository.mulesoft.org/nexus/content/repositories/releases/org/mule/distributions/mule-standalone/3.5.0/mule-standalone-3.5.0.tar.gz RUN cd /opt && tar xvzf ~/mule-standalone-3.5.0.tar.gz RUN echo "4a94356f7401ac8be30a992a414ca9b9 /mule-standalone-3.5.0.tar.gz" | md5sum -c RUN rm ~/mule-standalone-3.5.0.tar.gz RUN ln -s /opt/mule-standalone-3.5.0 /opt/mule CMD [ "/opt/mule/bin/mule" ] The resource isolation features of Linux are used to create Docker containers, which are more lightweight than virtual machines and are separated from the environment in which Docker runs, the host. Using Docker an image can be created that, every time it is started has a known state. In order to remove any doubts about whether the environment has been altered in any way, the container can be stopped and a new container started. I can even run multiple Docker containers on one and the same computer to simulate a multi-server production environment. Applications can also be run in their own Docker containers, as shown in this figure. Three Docker containers, each containing a specific application, running in one host. A more detailed introduction to Docker is available here. The main entry point to the Docker documentation can be found here. Motivation Some of the motivations I have for using Docker in both testing and production environments are: The environment in which I test my application should be as similar as the final deployment environment as possible, if not identical. Making the deployment environment easy to scale up and down. If it is easy to start a new processing node when need arise and stop it if it is no longer used, I will be able to adapt to changes rather quickly and thus reduce errors caused by, for instance, load peaks. Maintain an increased number of nodes to which applications can be deployed. Instead of running one instance of some kind of application server, Mule ESB in my case, on a computer, I want multiple instances that are partitioned, for instance, according to importance. High-priority applications run on one separate instance, which have higher priority both as far as resources (CPU, memory, disk etc) are concerned but also as far as support is concerned. Applications which are less critical run on another instance. Enable quick replacement of instances in the deployment environment. Reasons for having to replace instances may be hardware failure etc. Better control over the contents of the different environments. The concept of an environment that, at any time, may be disposed (and restarted) discourages hacks in the environment, which are usually poorly documented and sometimes difficult to trace. Using Docker, I need to change the appropriate Docker image if I want to make changes to some application environment. The Docker image file, commonly known as Dockerfile, can be checked into any ordinary revision control system, such as Git, Subversion etc, making changes reversible and traceable. Automate the creation of a testing environment. An example could be a nightly job that runs on my build server which creates a test environment, deploys one or more applications to it and then performs tests, such as load-testing. Prerequisites To get the best possible experience when running Docker, I run it under Ubuntu. According to the current documentation, Docker is supported under the following versions of Ubuntu: 12.04 LTS (64-bit) 13.04 (64-bit) 13.10 (64-bit) 14.04 (64-bit) Against my usual conservative self, I chose Ubuntu 14.10, which at the time of writing this article is the latest version. While I haven’t run into any issues, I cannot promise anything regarding compatibility with Docker as far as this version of Ubuntu is concerned. Installing Docker Before we install anything, those who have the Docker version from the Ubuntu repository should remove this version before installing a newer version of Docker, since the Ubuntu repository does not contain the most recent version and the package does not have the same name as the Docker package we will install: sudo apt-get remove docker.io The simplest way to install Docker is to use an installation script made available at the Docker website: curl -sSL https://get.docker.com/ubuntu/ | sudo sh If you are not running Ubuntu or if you do not want to use the above way of installing Docker, please refer to this page containing instructions on how to install Docker on various platforms. To verify the Docker installation, open a terminal window and enter: sudo docker version Output similar to the following should appear: Client version: 1.4.1 Client API version: 1.16 Go version (client): go1.3.3 Git commit (client): 5bc2ff8 OS/Arch (client): linux/amd64 Server version: 1.4.1 Server API version: 1.16 Go version (server): go1.3.3 Git commit (server): 5bc2ff8 We are now ready to start a Mule instance in Docker. Running Mule in Docker One of the advantages with Docker is that there is a large repository of Docker images that are ready to be used, and even extended if one so wishes. ThisDocker image is the one that I will use in this article. It is well documented, there is a source repository and it contains a recent version of the Mule ESB Community Edition. Some additional details on the Docker image: Ubuntu 14.04. Oracle JavaSE 1.7.0_65. This version will change as the PPA containing the package is updated. Mule ESB CE 3.5.0 Note that the image may change at any time and the specifications above may have changed. If you intend to use Docker in your organization, I would suspect that the best alternative is to create your own Docker images that are totally under your control. The Docker image repository is an excellent source of inspiration and aid even in this case. Starting a Docker Container To start a Docker container using this image, open a terminal window and write: sudo docker run codingtony/mule The first time an image is used it needs to be downloaded and created. This usually takes quite some time, so I suggest a short break here – perhaps for a cup of coffee or tea. If you just want to download an image without starting it, exchange the Docker command “run” with “pull”. Once the container is started, you will see some output to the console. If you are familiar with Mule, you will recognize the log output: MULE_HOME is set to /opt/mule-standalone-3.5.0 Running in console (foreground) mode by default, use Ctrl-C to exit... MULE_HOME is set to /opt/mule-standalone-3.5.0 Running Mule... --> Wrapper Started as Console Launching a JVM... Starting the Mule Container... Wrapper (Version 3.2.3) http://wrapper.tanukisoftware.org Copyright 1999-2006 Tanuki Software, Inc. All Rights Reserved. INFO 2015-01-05 04:41:42,302 [WrapperListener_start_runner] org.mule.module.launcher.MuleContainer: ********************************************************************** * Mule ESB and Integration Platform * * Version: 3.5.0 Build: ff1df1f3 * * MuleSoft, Inc. * * For more information go to http://www.mulesoft.org * * * * Server started: 1/5/15 4:41 AM * * JDK: 1.7.0_65 (mixed mode) * * OS: Linux (3.16.0-28-generic, amd64) * * Host: f95698cfb796 (172.17.0.2) * ********************************************************************** Note that: In the text-box containing information about the Mule ESB and Integration Platform, there is a row which starts with “Host:”. The hexadecimal digit that follows is the Docker container id and the IP-address is the external IP-address of the Docker container in which Mule is running. Before we do anything with the Mule instance running in Docker, let’s take a look at Docker containers. Docker Containers We can verify that there is a Docker container running by opening another terminal window, or a tab in the first terminal window, and running the command: sudo docker ps As a result, you will see output similar to the following (I have edited the output in order for the columns to be aligned with the column titles): CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES f95698cfb796 codingtony/mule:latest "/opt/mule/bin/mule" 7 min ago Up 7 min jolly_hopper From this output we can see that: The ID of the container is f95698cfb796. This ID can be used when performing operations on the container, such as stopping it, restarting it etc. The name of the image used to created the container. The command that is currently executing. If we look at the Dockerfile for the image, we can see that the last line in this file is: CMD [ “/opt/mule/bin/mule” ] This is the command that is executed whenever an instance of the Docker image is launched and it matches what we see in the COMMAND column for the Docker container. The CREATED column shows how much time has passed since the container was created. The STATUS column shows the current status of the image. When you have used Docker for a while, you can view all the containers using: sudo docker ps -a This will show you containers that are not running, in addition to the running ones. Containers that are not running can be restarted. The PORTS column shows any port mappings for the container. More about port mappings later. Finally, the NAMES column contain a more human-friendly container name. This container name can be used in the same way as the container id. Docker containers will consume disk-space and if you want to determine how much disk-space each of the containers on your computer use, issue the following command: sudo docker ps -a -s An additional column, SIZE, will be shown and in this column I see that my Mule container consumes 41,76kB. Note that this is in addition to the disk-space consumed by the Docker image. This number will grow if you use the container under a longer period of time, as the container retains any files written to disk. To completely remove a stopped Docker container, find the id or name of the container and use the command: sudo docker rm [container id or name here] Before going further, let’s stop the running container and remove it: sudo docker stop [container id or name here] sudo docker rm [container id or name here] Files and Docker Containers So far we have managed to start a Mule instance running inside a Docker container, but there were no Mule applications deployed to it and the logs that were generated were only visible in the terminal window. I want to be able to deploy my applications to the Mule instance and examine the logs in a convenient way. In this section I will show how to: Share one or more directories in the host file-system with a Docker container. Access the files in a Docker container from the host. As the first step in looking at sharing directories between the host operating system and a Docker container, we are going to look at Mule logs. As part of this exercise we also set up the directories in the host operating system that are going to be shared with the Docker container. In your home directory, create a directory named “mule-root”. In the “mule-root” directory, create three directories named “apps”, “conf” and “logs”. Download the Mule CE 3.5.0 standalone distribution from this link. From the Mule CE 3.5.0 distribution, copy the files in the “apps” directory to the “mule-root/apps” directory you just created. From the Mule CE 3.5.0 distribution, copy the files in the “conf” directory to the “mule-root/conf” directory you created. The resulting file- and directory-structure should look like this (shown using the tree command): ~/mule-root/ ├── apps │ └── default │ └── mule-config.xml ├── conf │ ├── log4j.properties │ ├── tls-default.conf │ ├── tls-fips140-2.conf │ ├── wrapper-additional.conf │ └── wrapper.conf └── logs Edit the log4j.properties file in the “mule-root/conf” directory and set the log-level on the last line in the file to “DEBUG”. This modification has nothing to do with sharing directories, but is in order for us to be able to see some more output from Mule when we run it later. The last two lines should now look like this: # Mule classes log4j.logger.org.mule=DEBUG Binding Volumes We are now ready to launch a new Docker container and when we do, we will tell Docker to map three directories in the Docker container to three directories in the host operating system. Three directories in a Docker container bound to three directories in the host. Launch the Docker container with the command below. The -v option tells Docker that we want to make the contents of a directory in the host available at a certain path in the Docker container file-system. The -d option runs the container in the background and the terminal prompt will be available as soon as the id of the newly launched Docker container has been printed. sudo docker run -d -v ~/mule-root/apps:/opt/mule/apps -v ~/mule-root/conf:/opt/mule/conf -v ~/mule-root/logs:/opt/mule/logs codingtony/mule Examine the “mule-root” directory and its subdirectories in the host, which should now look like below. The files on the highlighted rows have been created by Mule. mule-root/ ├── apps │ ├── default │ │ └── mule-config.xml │ └── default-anchor.txt ├── conf │ ├── log4j.properties │ ├── tls-default.conf │ ├── tls-fips140-2.conf │ ├── wrapper-additional.conf │ └── wrapper.conf └── logs ├── mule-app-default.log ├── mule-domain-default.log └── mule.log Examine the “mule.log” file using the command “tail -f ~/mule-root/logs/mule.log”. There should be periodic output written to the log file similar to the following: DEBUG 2015-01-05 12:05:37,216 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DeploymentDirectoryWatcher: Checking for changes... DEBUG 2015-01-05 12:05:37,216 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DeploymentDirectoryWatcher: Current anchors: default-anchor.txt DEBUG 2015-01-05 12:05:37,216 [Mule.app.deployer.monitor.1.thread.1] org.mule.module.launcher.DeploymentDirectoryWatcher: Deleted anchors: Stop and remove the container: sudo docker stop [container id or name here] sudo docker rm [container id or name here] Direct Access to Docker Container Files When running Docker under the Ubuntu OS it is also possible to access the file-system of a Docker container from the host file-system. It may be possible to do this under other operating systems too, but I haven’t had the opportunity to test this. This technique may come in handy during development or testing with Docker containers for which you haven’t bound any volumes. Note! If given the choice to use either volume binding, as seen above, or direct access to container files as we will look at in this section for something more than a temporary file access, I would chose to use volume binding. Direct access to Docker container files relies on implementation details that I suspect may change in future versions of Docker if the developers find it suitable. With all that said, lets get the action started: Start a new Docker container: sudo docker run -d codingtony/mule Find the id of the newly launched Docker container: sudo docker ps Examine low-level information about the newly launched Docker container: sudo docker inspect [container id or name here] Output similar to this will be printed to the console (portions removed to conserve space): [{ "AppArmorProfile": "", "Args": [], "Config": { ... }, "Created": "2015-01-12T07:58:47.913905369Z", "Driver": "aufs", "ExecDriver": "native-0.2", "HostConfig": { ... }, "HostnamePath": "/var/lib/docker/containers/68b40def7ad6a7f819bd654d5627ad1c3a0f40c84e0fb0f875760f1bd6790eef/hostname", "HostsPath": "/var/lib/docker/containers/68b40def7ad6a7f819bd654d5627ad1c3a0f40c84e0fb0f875760f1bd6790eef/hosts", "Id": "68b40def7ad6a7f819bd654d5627ad1c3a0f40c84e0fb0f875760f1bd6790eef", "Image": "bcd0f37d48d4501ad64bae941d95446b157a6f15e31251e26918dbac542d731f", "MountLabel": "", "Name": "/thirsty_darwin", "NetworkSettings": { ... }, "Path": "/opt/mule/bin/mule", "ProcessLabel": "", "ResolvConfPath": "/var/lib/docker/containers/68b40def7ad6a7f819bd654d5627ad1c3a0f40c84e0fb0f875760f1bd6790eef/resolv.conf", "State": { ... }, "Volumes": {}, "VolumesRW": {} }] Locate the “Driver” node (highlighted in the above output) and ensure that its value is “aufs”. If it is not, you may need to modify the directory paths below replacing “aufs” with the value of this node. Personally I have only seen the “aufs” value at this node so anything else is uncharted territory to me. Copy the long hexadecimal value that can be found at the “Id” node (also highlighted in the above output). This is the long id of the Docker container. In a terminal window, issue the following command, inserting the long id of your container where noted: sudo ls -al /var/lib/docker/aufs/mnt/[long container id here] You are now looking at the root of the volume used by the Docker container you just launched. In the same terminal window, issue the following command: sudo ls -al /var/lib/docker/aufs/mnt/[long container id here]/opt The output from this command should look like this: total 12 drwxr-xr-x 4 root root 4096 jan 12 15:58 . drwxr-xr-x 75 root root 4096 jan 12 15:58 .. lrwxrwxrwx 1 root root 26 aug 10 04:19 mule -> /opt/mule-standalone-3.5.0 drwxr-xr-x 17 409 409 4096 jan 12 15:58 mule-standalone-3.5.0 Examine this line in the Dockerfile:RUN ln -s /opt/mule-standalone-3.5.0 /opt/muleWe see that a symbolic link is created and that the directory name and the name of the symbolic link matches the output we saw earlier. This matches the directory output in the previous step. To examine the Mule log file that we looked at when binding volumes earlier, use the following command: sudo cat /var/lib/docker/aufs/mnt/[long container id here]/opt/mule-standalone-3.5.0/logs/mule.log Next we create a new file in the Docker container using vi: sudo vi /var/lib/docker/aufs/mnt/[long container id here]/opt/mule-standalone-3.5.0/test.txt Enter some text into the new file by first pressing i and the type the text. When you are finished entering the text, press the Escape key and write the file to disk by typing the characters “:wq” without quotes. This writes the new contents of the file to disk and quits the editor. Leave the Docker container running after you are finished. In the next section, we are going to look at the file we just created from inside the Docker container. We have seen that we can examine the file system of a Docker container without binding volumes. It is also possible to copy or move files from the host file-system to the container’s file system using the regular commands. Root privileges are required both when examining and writing to the Docker container’s file system. Entering a Docker Container In order to verify that the file we just created in the host was indeed written to the Docker container, we are going to start a bash shell in the running Docker container and examine the location where the new file is expected to be located and the contents of the file. In the process we will see how we can execute commands in a Docker container from the host. Issue the command below in a terminal window. The exec Docker command is used to run a command, bash in this case, in a running Docker container. The -i flags tell Docker to keep the input stream open while the command is being executed. In this example, it allows us to enter commands into the bash shell running inside the Docker container. The -t flag cause Docker to allocate a text terminal to which the output from the command execution is printed. sudo docker exec -i -t [container id or name here] bash Note the prompt, which should change to [user]@[Docker container id]. In my case it looks like this: root@3ea374a280da:/# Go to the Mule installation directory using this command: cd /opt/mule-standalone-3.5.0/ Examine the contents of the directory: ls -al Among the other files, you should see the “test.txt” file: -rw-r--r-- 1 root root 53 Jan 14 03:19 test.txt Examine the contents of the “text.txt” file. The contents of the file should match what you entered earlier. cat text.txt Exit to the host OS: exit Stop and remove the container: sudo docker stop [container id or name here] sudo docker rm [container id or name here] We have seen that we can execute commands in a running Docker container. In this particular example, we used it to execute the bash shell and examine a file. I draw the conclusion that I should be able to set up a Docker image that contains a very controlled environment for some type of test and then create a container from that image and start the test from the host. Deploying a Mule Application In this section we will look at deploying a Mule application to an instance of the Mule ESB running in a Docker container. We will use volume binding, that we looked at in the section on files and Docker containers, to share directories in the host with the Docker container in order to make it easy to deploy applications, modify running applications, examine logs etc. Preparations Before deploying the application, we need to make some preparations: First of all, we restore the original log-level that we changed earlier. In this example, there will be log output when the applications we will deploy is run and we can limit the log generated by Mule. Edit the log4j.properties file in the “mule-root/conf” directory in the host and set the log-level on the last line in the file back to “INFO” and add one line, as in the listing below. The last three lines should now look like this: # Mule classes log4j.logger.org.mule=INFO log4j.logger.org.mule.tck.functional=DEBUG Next, we create the Mule application which we will deploy to the Mule ESB running in Docker: In some directory, create a file named “mule-deploy.properties” with the following contents: redeployment.enabled=true encoding=UTF-8 domain=default config.resources=HelloWorld.xml In the same directory create a file named “HelloWorld.xml”. This file contains the Mule configuration for our example application: Create a zip-archive named “mule-hello.zip” containing the two files created above: zip mule-hello.zip mule-deploy.properties HelloWorld.xml Deploy the Mule Application Before you start the Docker container in which the Mule EBS will run, make sure that you have created and prepared the directories in the host as described in the section Files and Docker Containers above. Start a new Mule Docker container using the command that we used when binding volumes: sudo docker run -d -v ~/mule-root/apps:/opt/mule/apps -v ~/mule-root/conf:/opt/mule/conf -v ~/mule-root/logs:/opt/mule/logs codingtony/mule As before, the -v option tells Docker to bind three directories in the host to three locations in the Docker container’s file system. Find the IP-address of the Docker container: sudo docker inspect [container id or name here] | grep IPAddress In my case, I see the following line which reveals the IP-address of the Docker container: “IPAddress”: “172.0.17.2”, Open a terminal window or tab and examine the Mule log. Leave this window or tab open during the exercise, in order to be able to verify the output from Mule. tail -f ~/mule-root/logs/mule.log Copy the zip-archive “mule-hello.zip” created earlier to the host directory ~/mule-root/apps/. Verify that the application has been deployed without errors in the Mule log: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ + Started app 'mule-hello' + ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Leave the Docker container running after you are finished. In the next section we will look at how to access endpoints exposed by applications running in Docker containers. By binding directories in the host thus making them available in the Docker container, it becomes very simple to deploy Mule applications to an instance of Mule ESB running in a Docker container. I am considering this setup for a production environment as well, since it will enable me to perform backups of the directories containing Mule applications and configuration without having to access the Docker container’s file system. It is also in accord with the idea that a Docker container should be able to be quickly and easily restarted, which I feel it would not be if I had to deploy a number of Mule applications to it in order to recreate its previous state. Accessing Endpoints We now know that we can run the Mule ESB in a Docker container, we can deploy applications and examine the logs quite easily but one final, very important question remains to be answered; how to access endpoints exposed by applications running in a Docker container. This section assumes that the Mule application we deployed to Mule in the previous section is still running. In the host, open a web-browser and issue a request to the Docker container’s IP-address at port 8181. In my case, the URL is http://172.17.0.2:8181 Alternatively use the curl command in a terminal window. In my case I would write: curl 172.17.0.2:8181 The result should be a greeting in the following format: Hello World! It is now: 2015-01-14T07:39:03.942Z In addition, you should be able to see that a message was received in the Mule log. Now try the URL http://localhost:8181 You will get a message saying that the connection was refused, provided that you do not already have a service listening at that port. If you have another computer available that is connected to the same network as the host computer running Ubuntu, do the following: – Find the IP-address of the Ubuntu host computer using the ifconfigcommand. – In a web-browser on the other computer, try accessing port 8181 at the IP-address of the Ubuntu host computer. Again you will get a message saying that the connection was refused. Stop and remove the container: sudo docker stop [container id or name here] sudo docker rm [container id or name here] Without any particular measures taken, we see that we can access a service exposed in a Docker container from the Docker host but we did not succeed in accessing the service from another computer. To make a service exposed in a Docker container reachable from outside of the host, we need to tell Docker to publish a port from the Docker container to a port in the host using the -p flag: Launch a new Docker container using the following command: sudo docker run -d -p 8181:8181 -v ~/mule-root/apps:/opt/mule/apps -v ~/mule-root/conf:/opt/mule/conf -v ~/mule-root/logs:/opt/mule/logs codingtony/mule The added flag -p 8181:8181 makes the service exposed at port 8181 in the Docker container available at port 8181 in the host. Try accessing the URL http://localhost:8181 from a web-browser on the host computer.The result should be a greeting of the form we have seen earlier. Try accessing port 8181 at the IP-address of the Ubuntu host computer from another computer.This should also result in a greeting message. Stop and remove the container: sudo docker stop [container id or name here] sudo docker rm [container id or name here] Using the -p flag, we have seen that we can expose a service in a Docker container so that it becomes accessible from outside of the host computer. However, we also see that this information need to be supplied at the time of launching the Docker container. The conclusions that I draw from this is that: I can test and develop against a Mule ESB instance running in a Docker container without having to publish any ports, provided that my development computer is the Docker host computer. In a production environment or any other environment that need to expose services running in a Docker container to “the outside world” and where services will be added over time, I would consider deploying an Apache HTTP Server or NGINX on the Docker host computer and use it to proxy the services that are to be exposed. This way I can avoid re-launching the Docker container each time a new service is added and I can even (temporarily) redirect the proxy to some other computer if I need to perform some maintenance. Is There More? Of course! This article should only be considered an introduction and I am just a beginner with Docker. I hope I will have the time and inspiration to write more about Docker as I learn more.

January 20, 2015

by Ivan K

· 27,795 Views · 4 Likes

How to Integrate Jersey in a Spring MVC Application

I have recently started to build a public REST API with Java for Podcastpedia.org and for the JAX-RS implementation I have chosen Jersey, as I find it “natural” and powerful – you can find out more about it by following the Tutorial – REST API design and implementation in Java with Jersey and Spring. Because Podcastpedia.org is a web application powered by Spring MVC, I wanted to integrate both frameworks in podcastpedia-web, to take advantage of the backend service functionality already present in the project. Anyway this short post will present the steps I had to take to make the integration between the two frameworks work. Framework versions Current versions used: 4.1.0.RELEASE 2.14 Project dependencies The Jersey Spring extension must be present in your project’s classpath. If you are using Maven add it to the pom.xml file of your project: org.glassfish.jersey.ext jersey-spring3 ${jersey.version} org.springframework spring-core org.springframework spring-web org.springframework spring-beans org.glassfish.jersey.media jersey-media-json-jackson ${jersey.version} com.fasterxml.jackson.jaxrs jackson-jaxrs-base com.fasterxml.jackson.core jackson-annotations com.fasterxml.jackson.jaxrs jackson-jaxrs-json-provider Note: I have explicitly excluded the Spring core and the Jackson implementation libraries as they have been already imported in the project with preferred versions. Web.xml configuration In the web.xml, in addition to the Spring MVC servlet configuration I added the jersey-servlet configuration, that will map all requests starting with/api/: Spring MVC Dispatcher Servlet org.springframework.web.servlet.DispatcherServlet contextConfigLocation classpath:spring/application-context.xml 1 Spring MVC Dispatcher Servlet / jersey-serlvet org.glassfish.jersey.servlet.ServletContainer javax.ws.rs.Application org.podcastpedia.web.api.JaxRsApplication 2 jersey-serlvet /api/* Well, that’s pretty much it… If you have any questions drop me a line or comment in the discussion below. In the coming post I will present some of the results of this integration, by showing how to call one method of the REST public API with jQuery, to dynamically load recent episodes of a podcast, so stay tuned.

January 8, 2015

by Adrian Matei

· 20,102 Views