DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

The Latest Integration Topics

article thumbnail
Algorithm of the Week: Aho-Corasick String Matching Algorithm in Haskell
let’s say you have a large piece of text and a dictionary of keywords. how do you quickly locate all the keywords? aho-corasick algorithm diagram well, there are many ways really, you could even iterate through the whole thing and compare words to keywords. but it turns out that’s going to be very slow. at least o(n_keywords * n_words) complexity. essentially you’re making as many passes over the text as your dictionary is big. in 1975 a couple of ibm researchers – alfred aho and margaret corasick – discovered an algorithm that can do this in a single pass. the aho-corasick string matching algorithm . i implemented it in haskell and it takes 0.005s to find 8 different keywords in oscar wilde’s the nightingale and the rose – a 12kb text. a quick naive keyword search implemented in python takes 0.023s . not a big difference practically speaking, but imagine a situation with megabytes of text and thousands of words in the dictionary. the authors mention printing out the result as a major bottleneck in their assessment of the algorithm. yep, printing . the aho-corasick algorithm at the core of this algorithm are three functions: the three functions of aho-corasick algorithm a parser based on a state machine, which maps (state, char) pairs to states and occasionally emits an output. this is called the goto function a failure function, which tells the goto function which state to jump into when the character it just read doesn’t match anything an output function, which maps states to outputs – potentially more than one per state the algorithm works in two stages. it will first construct the goto, failure and output functions. the complexity of this operation hinges solely on the size of our dictionary. then it iterates over the input text to produce all the matches. using state machines for parsing text is a well known trick – the real genius of this algorithm rests in that failure function if you ask me. it makes lateral transitions between states when the algorithm climbs itself into a wall. say you have she and hers in the dictionary. the goto machine eats your input string one character at the time. let’s say it’s already read s h . the next input is an e so it outputs she and reaches a final state. next it reads an r , but the state didn’t expect any more inputs, so the failure function puts us on the path towards hers . this is a bit tricky to explain in text, i suggest you look at the picture from the original article and look at what’s happening. my haskell implementation the first implementation i tried, relied on manully mapping inputs to outputs for the goto, failure and output functions by using pattern recognition. not very pretty, extremely hardcoded, but it worked and was easy to make. building the functions dynamically proved a bit trickier. type goto = map (int, char) int type failure = map int int type output = map int [string] first off, we build the goto function. -- builds the goto function build_goto::goto -> string -> (goto, string) build_goto m s = (add_one 0 m s, s) -- adds one string to goto function add_one::int -> goto -> [char] -> goto add_one _ m [] = m add_one state m (c:rest) | member key m = add_one (frommaybe 0 $ map.lookup key m) m rest | otherwise = add_one max (map.insert key max m) rest where key = (state, c) max = (size m)+1 essentially this builds a flattened prefix tree in a hashmap of (state, char) pairs mapping to the next state. it makes sure to avoid adding new edges to the three as much as possible. the reason it’s not simply a prefix tree are those lateral transitions; doing them in a tree would require backtracking and repeating of steps, so we haven’t achieved anything. once we have the goto function, building the output is trivial. -- builds the output function build_output::(?m::goto) => [string] -> output build_output [] = empty build_output (s:rest) = map.insert (fin 0 s) (list.filter (\x -> elem x dictionary) $ list.tails s) $ build_output rest -- returns the state in which an input string ends without using failures fin::(?m::goto) => int -> [char] -> int fin state [] = state fin state (c:rest) = fin next rest where next = frommaybe 0 $ map.lookup (state, c) ?m we are essentially going over the dictionary, finding the final state for each word and building a hash table mapping final states to their outputs. building the failure function was trickiest, because we need a way to iterate over the depths at which nodes are position in the goto state machine. but we threw that info away by using a hashmap. -- tells us which nodes in the goto state machine are at which traversal depth nodes_at_depths::(?m::goto) => [[int]] nodes_at_depths = list.map (\i -> list.filter (>0) $ list.map (\l -> if i < length l then l!!i else -1) paths) [0..(maximum $ list.map length paths)-1] where paths = list.map (path 0) dictionary we now have a list of lists, that tells us at which depth certain nodes are. -- builds the failure function build_fail::(?m::goto) => [[int]] -> int -> failure build_fail nodes 0 = fst $ mapaccuml (\f state -> (map.insert state 0 f, state)) empty (nodes!!0) build_fail nodes d = fst $ mapaccuml (\f state -> (map.insert state (decide_fail state lower) f, state)) lower (nodes!!d) where lower = build_fail nodes (d-1) -- inner step of building the failure function decide_fail::(?m::goto) => int -> failure -> int decide_fail state lower = findwithdefault 0 (s, c) ?m where (s', c) = key' state $ assocs ?m s = findwithdefault 0 s' lower -- gives us the key associated with a certain state (how to get there) key'::int -> [((int, char), int)] -> (int, char) key' _ [] = (-1, '_') -- this is ugly, being of maybe type would be better key' state ((k, v):rest) | state == v = k | otherwise = key' state rest here we are going over the list of nodes at depths and deciding what the failure should be for each depth based on the failures of depth-1. at depth zero, all failures go to the zeroth state. an important part of this process was inverting the goto hashmap so values point to keys, which is essentially what the key’ function does. finally, we can use the whole algorithm like this: main = do let ?m = fst $ mapaccuml build_goto empty dictionary let ?f = build_fail nodes_at_depths $ (length $ nodes_at_depths)-1 ?out = build_output dictionary print $ ahocorasick text a bit more involved than the usual example of haskell found online, it’s still pretty cool you can see the whole code on github here .
March 19, 2013
by Swizec Teller
· 21,935 Views
article thumbnail
Client For ActiveMQ
This Post explains Topics in Active MQ (Message Broker) with Subscribing and Publishing. For this we will write two java clients. As we did for wso2 Message Broker TopicSubscriber.java to Subcribe for messages TopicPublisher.java to to Publish the messages Let's Start. [1] Get Active MQ from http://activemq.apache.org/download.html [1.1] Start Active MQ from \bin\activemq.bat You can see the started server form http://localhost:8161/admin/ [2] Create Porject "Client" on IDE that you preferred [3] Add activemq-all-5.7.0.jar to lib Dir in the project (activemq-all-5.7.0.jar can be found in root folder) [4] Creat class "TopicSubscriber.java" to Subcribe for messages package simple; import java.util.Properties; import javax.jms.JMSException; import javax.jms.Message; import javax.jms.MessageListener; import javax.jms.Session; import javax.jms.TextMessage; import javax.jms.Topic; import javax.jms.TopicConnection; import javax.jms.TopicConnectionFactory; import javax.jms.TopicSession; import javax.naming.InitialContext; import javax.naming.NamingException; public class TopicSubscriber { private String topicName = "news.sport"; private String initialContextFactory = "" +"org.apache.activemq.jndi.ActiveMQInitialContextFactory"; private String connectionString = "tcp://" +"localhost:61616"; private boolean messageReceived = false; public static void main(String[] args) { TopicSubscriber subscriber = new TopicSubscriber(); subscriber.subscribeWithTopicLookup(); } public void subscribeWithTopicLookup() { Properties properties = new Properties(); TopicConnection topicConnection = null; properties.put("java.naming.factory.initial", initialContextFactory); properties.put("connectionfactory.QueueConnectionFactory", connectionString); properties.put("topic." + topicName, topicName); try { InitialContext ctx = new InitialContext(properties); TopicConnectionFactory topicConnectionFactory = (TopicConnectionFactory) ctx .lookup("QueueConnectionFactory"); topicConnection = topicConnectionFactory.createTopicConnection(); System.out .println("Create Topic Connection for Topic " + topicName); while (!messageReceived) { try { TopicSession topicSession = topicConnection .createTopicSession(false, Session.AUTO_ACKNOWLEDGE); Topic topic = (Topic) ctx.lookup(topicName); // start the connection topicConnection.start(); // create a topic subscriber javax.jms.TopicSubscriber topicSubscriber = topicSession .createSubscriber(topic); TestMessageListener messageListener = new TestMessageListener(); topicSubscriber.setMessageListener(messageListener); Thread.sleep(5000); topicSubscriber.close(); topicSession.close(); } catch (JMSException e) { e.printStackTrace(); } catch (NamingException e) { e.printStackTrace(); } catch (InterruptedException e) { e.printStackTrace(); } } } catch (NamingException e) { throw new RuntimeException("Error in initial context lookup", e); } catch (JMSException e) { throw new RuntimeException("Error in JMS operations", e); } finally { if (topicConnection != null) { try { topicConnection.close(); } catch (JMSException e) { throw new RuntimeException( "Error in closing topic connection", e); } } } } public class TestMessageListener implements MessageListener { public void onMessage(Message message) { try { System.out.println("Got the Message : " + ((TextMessage) message).getText()); messageReceived = true; } catch (JMSException e) { e.printStackTrace(); } } } } [5] Creat class "TopicPublisher.java" to to Publish the messages package simple; import javax.jms.*; import javax.naming.InitialContext; import javax.naming.NamingException; import java.util.Properties; public class TopicPublisher { private String topicName = "news.sport"; private String initialContextFactory = "org.apache.activemq" +".jndi.ActiveMQInitialContextFactory"; private String connectionString = "tcp://localhost:61616"; public static void main(String[] args) { TopicPublisher publisher = new TopicPublisher(); publisher.publishWithTopicLookup(); } public void publishWithTopicLookup() { Properties properties = new Properties(); TopicConnection topicConnection = null; properties.put("java.naming.factory.initial", initialContextFactory); properties.put("connectionfactory.QueueConnectionFactory", connectionString); properties.put("topic." + topicName, topicName); try { // initialize // the required connection factories InitialContext ctx = new InitialContext(properties); TopicConnectionFactory topicConnectionFactory = (TopicConnectionFactory) ctx .lookup("QueueConnectionFactory"); topicConnection = topicConnectionFactory.createTopicConnection(); try { TopicSession topicSession = topicConnection.createTopicSession( false, Session.AUTO_ACKNOWLEDGE); // create or use the topic System.out.println("Use the Topic " + topicName); Topic topic = (Topic) ctx.lookup(topicName); javax.jms.TopicPublisher topicPublisher = topicSession .createPublisher(topic); String msg = "Hi, I am Test Message"; TextMessage textMessage = topicSession.createTextMessage(msg); topicPublisher.publish(textMessage); System.out.println("Publishing message " +textMessage); topicPublisher.close(); topicSession.close(); Thread.sleep(20); } catch (InterruptedException e) { e.printStackTrace(); } } catch (JMSException e) { throw new RuntimeException("Error in JMS operations", e); } catch (NamingException e) { throw new RuntimeException("Error in initial context lookup", e); } } } [6] Firstly Run "TopicSubscriber.java" and then run "TopicPublisher.java" Here is out put from both TopicSubscriber:: Create Topic Connection for Topic news.sport Got the Message : Hi, I am Test Message TopicPublisher:: Use the Topic news.sport Publishing message ActiveMQTextMessage {commandId = 0, responseRequired = false, messageId = ID:Madhuka-THINK-51683-1359787878456-1:1:1:1:1, originalDestination = null, originalTransactionId = null, producerId = null, destination = topic://news.sport, transactionId = null, expiration = 0, timestamp = 1359787878729, arrival = 0, brokerInTime = 0, brokerOutTime = 0, correlationId = null, replyTo = null, persistent = true, type = null, priority = 4, groupID = null, groupSequence = 0, targetConsumerId = null, compressed = false, userID = null, content = null, marshalledProperties = null, dataStructure = null, redeliveryCounter = 0, size = 0, properties = null, readOnlyProperties = false, readOnlyBody = false, droppable = false, text = Hi, I am Test Message} [More] Here is full message that we have send to TopicSubscriber. We can get that any parameter in above. Here is sample to get TimeStamp and ID from JMS message. public class TestMessageListener implements MessageListener { public void onMessage(Message message) { try { System.out.println("Got the Message TimeStamp: " + message.getJMSTimestamp()); System.out.println("Got the Message JMS ID : " + message.getJMSMessageID()); messageReceived = true; } catch (JMSException e) { e.printStackTrace(); } } } Now go to ActiveMQ server at http://localhost:8161/admin/ See that Topic and message count for that topic. Now you time to check more in 'Active MQ'
March 16, 2013
by Madhuka Udantha
· 18,322 Views
article thumbnail
Using Facades to Decouple API Integrations
the most important part of building an integration with an api is actually writing the code that will connect with the web service and invoke its methods. i'll show you why using the façade pattern to decouple calls from your existing code is a good idea and help you identify what kind of problems you might be able to prevent. so, first things first, what is the façade pattern? a façade is an object that provides simple access to complex - or external - functionality. it might be used to group together several methods into a single one, to abstract a very complex method into several simple calls or, more generically, to decouple two pieces of code where there's a strong dependency of one over the other. what happens when you develop api calls inside your code and, suddenly, the api is upgraded and some of its methods or parameters change? you'll have to change your application code to handle those changes. also, by changing your internal application code, you might have to change the way some of your objects behave. it is easy to overlook every instance and can require you to doublecheck multiple lines of code. there's a better way to keep api calls up-to-date. by writing a façade with the single responsibility of interacting with the external web service, you can defend your code from external changes. now, whenever the api changes, all you have to do is update your façade. your internal application code will remain untouched. from a test driven development point-of-view, using a facade offers a big advantage. you're now able to write simple tests against the façade without affecting your internal code test results. by using this strategy, you'll be able to know immediately whenever an api is not working as you expected and make the necessary changes to the façade. this is the approach we follow at cloudwork when building integrations between any third-party apis. api façades act as tight compartments that protect the rest of the application from external changes and simplify the way we interact with different web services. this article is cross-posted at using facades to decouple api integrations .
March 13, 2013
by Bruno Pedro
· 12,215 Views
article thumbnail
Maven's Non-Resolvable Parent POM Problem
Need help dealing with Maven's non-resolvable parent problem? Check out this post to learn how.
March 12, 2013
by Roger Hughes
· 462,921 Views · 8 Likes
article thumbnail
Compare RESTful vs. SOAP Web Services
There are currently two schools of thought in developing Web Services – one being the standards-based traditional approach [ SOAP ] and the other, simpler school of thought [ REST ]. This article quickly compares one with the other - REST SOAP Assumes a point-to-point communication model–not usable for distributed computing environment where message may go through one or more intermediaries Designed to handle distributed computing environments Minimal tooling/middleware is necessary. Only HTTP support is required Requires significant tooling/middleware support URL typically references the resource being accessed/deleted/updated The content of the message typically decides the operation e.g. doc-literal services Not reliable – HTTP DELETE can return OK status even if a resource is not deleted Reliable Formal description standards not in widespread use. WSDL 1.2, WADL are candidates. Well defined mechanism for describing the interface e.g. WSDL+XSD, WS-Policy Better suited for point-to-point or where the intermediary does not play a significant role Well suited for intermediated services No constraints on the payload Payload must comply with the SOAP schema Only the most well established standards apply e.g. HTTP, SSL. No established standards for other aspects. DELETE and PUT methods often disabled by firewalls, leads to security complexity. A large number of supporting standards for security, reliability, transactions. Built-in error handling (faults) No error handling Tied to the HTTP transport model Both SMTP and HTTP are valid application layer protocols used as Transport for SOAP Less verbose More verbose
March 8, 2013
by Jagadeesh Motamarri
· 167,184 Views · 2 Likes
article thumbnail
8 Lessons in Deployment Tooling Lessons Learned
It didn’t take long. A few months after we released an open source continuous integration tool (Anthill) in 2001, we were asked, “It’s great that I have the build setup, now how do I deploy to the test lab?” That email started a clear transformation in our thinking. Lesson 1: Builds generally exist to be tested or released to customers.The corollary is: Continuous integration is not about build, it is about quality and checking quality generally requires a deployment or six. In 2005, we updated our AnthillPro 2.5 with a shiny new deployment capability. It worked ok, but with that generation of tool being so build oriented, it was never a clean fit. Lesson 2: Deployments are a serious challenge, and can not be bolted on to a build tool. This lesson has been reinforced over the years as we’ve watched the results of various tools tacking on deployments. In 2006, we released AnthillPro 3. Oh boy, what a change. Now deployments and other stuff you do to builds days and weeks later were their own thing. Demonstrating a stunning lack of marketing savvy, we called them “non-originating workflows” (processes that don’t make builds) and declared a new type of tool, an “Application Lifecycle Automation Server”. Today, you’d call it a Continuous Delivery tool (thank you Jez and Dave for the better name). To my knowledge, AnthillPro was the first tool to really take a build lifecycle or build pipeline seriously and had the ambition to manage from continuous integration build through to production release quickly and with a solid audit trail. That brings us to Lesson 3: When you come from the development side of the house as either an engineer or a vendor, you have a lot to learn about audit and security. Around the same time we also learned a great deal about the Dev / Ops gap. While some of our customers implemented the tool as intended, most of the time it was used either by a dev organization doing continuous integration and deployment to the early test environments or a production support / release management group that only did the “official” builds and focused on the production release. The failure of the developer initiated efforts to get to production provided a key insight. Lesson 4: When it comes to deployment automation, start with production, and work your way to the lower environments, treating them as a simpler case of the hard problem. We also learned about the limitations of a build pipeline approach in complex applications. Lesson 5: Deployments of complex systems often require a coordinated release of many builds. The need to start at production style deployments (and not care as much about the build) and the emphasis on deployment time dependencies focused our efforts on a deployment centric tool, uDeploy to compliment the pipeline approach in AnthillPro. Production and scaled usage come together to make the deployment tool a critical piece of infrastructure. If you lose a datacenter, can you recover the tool and use it to recover other applications into a backup datacenter?Lesson 6: Your release and deployment systems need to be configured with high availability and recoverability in mind. For the better part of the last decade, we’ve been trying to streamline deployments and kill off the release weekend. Many of customers have eliminated or shrunk those big events down to a more manageable size. Big releases though still can require the coordination of dozens of people across various teams. One time events like configuring the deployment tool with the password for a new database in each environment, or routine manual steps like putting eyeballs on the new production system before putting it out to the public still need to be managed. Lesson 7: Deployment is still just one part of release. With that in mind, we’ve added uRelease to the portfolio. Different people in the application delivery value chain need different tools. Developers need continuous integration and great testing. Change management needs great auditing around production deployments. Testers need their environments to be not broken, updated regularly and close enough to production to make their testing worthwhile. The patterns and needs overlap. Everyone needs deployments that work. Everyone benefits from understanding the lifecycle of a build (or other version) as well as what is in a given environment. Everyone benefits when release planning meetings are shorter. Lesson 8: No single tool is enough. An integrated toolchain that exchanges information is required. So there you go. Twelve years of tooling, eight with deployments. Summed up in six over-simplified lessons. I’m sure we’ll be updating this next year. Lesson 1: Builds generally exist to be tested or released to customers. Lesson 2: Deployments are a serious challenge, and can not be bolted on to a build tool. Lesson 3: When you come from the development side of the house as either an engineer or a vendor, you have a lot to learn about audit and security. Lesson 4: When it comes to deployment automation, start with production, and work your way to the lower environments, treating them as a simpler case of the hard problem. Lesson 5: Deployments of complex systems often require a coordinated release of many builds. Lesson 6: Release and deployment systems need to be configured with high availability and recoverability in mind. Lesson 7: Deployment is still just one part of release Lesson 8: No single tool does “DevOps”. An integrated toolchain that exchanges information is required. What are the biggest lessons you’ve learned about deployments in the last few years? Share in the comments below!
March 4, 2013
by Eric Minick
· 7,277 Views
article thumbnail
SAP Integration with Talend Components / Connectors (BAPI, RFC, IDoc, BW, SOAP)
talend has several connectors to integrate sap systems. however, this guide is no introduction to talend’s sap components. instead, this guide helps to understand different alternatives to integrate sap systems with talend set up a local sap system configure talend studio for using sap components use talend’s sap wizard run a first talend job which connects to sap all further required information and example use cases for talend’s sap components should be available in the talend component guide at www.help.talend.com . if that’s not the case, please create a jira documentation ticket ( https://jira.talendforge.org/browse/doct )! now let’s take a look at different alternatives for integration of sap systems with talend. alternatives for sap integration three protocols exist for communication between sap and external programs: dynamic information and action gateway (diag): e.g. used by sap gui remote function call (rfc): a function call with input and output parameters (like a java interface) hypertext transfer protocol (http): internet standard the following alternatives are available for integrating sap systems using some of these protocols. file sap supports the direct import of files (call-transaction-program, batch-input, direct input). files have to be in a specific format to be imported. transformation and integration can be realized with talend’s various file components such as tfileinputdelimited. rfc remote function call is the proprietary sap ag interface for communication between a sap system and other sap or third-party compatible system over tcp/ip or cpi-c connections. remote function calls may be associated with sap software and abap programming, and provide a way for an external program (written in languages such as php, asp, java, or c, c++) to use data returned from the server. data transactions are not limited to getting data from the server, but can insert data into server records as well. sap can act as the client or server in an rfc call. a remote function call (rfc) is the call or remote execution of a remote function module in an external system. in the sap system, these functions are provided by the rfc interface system. the rfc interface system enables function calls between two sap systems, or between a sap system and an external system. tsapinput and tsapoutput are talend’s components to use rfcs. business application programming interface (bapi) a bapi is an object-oriented view on most data and transactions of a sap system (called “business objects”). object types of the business objects are stored in the business object repository (bor). bapis are always implemented as rfcs and therefore can be called the same way. additionally, they have the following characteristics (compared to rfcs): stable interface no view layer no exceptions, instead export parameter: “return” most business objects offer the following standard bapis: getlist getdetail change creationfromdata tsapinput and tsapoutput are talend’s components to use bapis. application link enabling (ale) application link enabling (ale) is used for asynchronous messaging between different systems via “intermediate documents (idoc)”. idoc is a sap document format for business transaction data transfers. it is used to realize distributed business processes. idoc is similar to xml in purpose, but differs in syntax. both serve the purpose of data exchange and automation in computer systems, but the idoc technology takes a different approach. while xml allows having some metadata about the document itself, an idoc is obligated to have information at its header like its creator, creation time, etc. while xml has a tag-like tree structure containing data and meta-data, idocs use a table with the data and meta-data. idocs also have a session that explains all the processes which the document passed or will pass, allowing one to debug and trace the status of the document. an idoc consists of control record (it contains the type of idoc, port of the partner, release of sap r/3 which produced the idoc, etc.) data records of different types. the number and type of segments is mostly fixed for each idoc type, but there is some flexibility (for example an sd order can have any number of items). status records containing messages such as 'idoc created', 'the recipient exists', 'idoc was successfully passed to the port', 'could not book the invoice because...' different idoc types are available to handle different types of messages. for example, the idoc format orders01 may be used for both purchase orders and order confirmations. tsapidocinput and tsapidocoutput are talend’s components to use ale / idoc. bapis can also be called asynchronously via ale. all new idocs are even based on bapis. soap web services sap supports soap web services. not just sap as java, but also sap as abap! integration can be realized with talend’s esb / web service components such as tesbrequest, tesbresponse, or tesbconsumer. installation of sap server and client installation can take about 6 to 8 hours, but it is an “all in one installation”, i.e. you can install it overnight. steps for installation: get yourself a windows 7 64 bit laptop or vm with 8+ gb ram and 50+gb free disc space get a sap community account (for free, just register): http://scn.sap.com/welcome download sap netweaver (software downloads --> sap netweaver main releases: http://www.sdn.sap.com/irj/scn/nw-downloads download current version of sap netweaver application server abap 64-bit trial install sap server: follow installation guide – a html website included in the download in root of extracted download folder (start.htm --> there click on “installation” link) install sap gui (rich client frontend): start.htm --> there click on “install sap gui” link and follow instructions download the sap jco for the operating system on which your connector is running. the sap jco is available for download from sap's website at http://service.sap.com/connectors . you must have an sapnet account to access the sap jco (if you do not already have one, contact your local sap basis administrator). usage of sap server hint: you have to use a windows user which has a password (as you need to enter windows credentials when stopping sap). if you have a windows user without a password (for instance if you use windows within a vm on your mac), sap cannot process these credentials (i.e. it cannot process an empty password field) --> change your windows password before starting sap start the management console (windows startmenu --> programs --> sap management console) start and stop the sap server (right click on “nsp” --> start / stop) default user: sap* (sap system super user) password: the one which you entered at installation of sap netweaver, e.g. admin123 usage of sap client a sap client should be used to get information about the sap system (functions, data, etc.) similarly to using e.g. mysql workbench to get information from a mysql database. sap gui (view layer) communicates with sap as abap (business logic layer). the application server communicates with the relational database (db layer). different clients are available for sap: sap gui windows sap gui java web browser external rfc-program for local development demos, sap gui windows is probably the best alternative. start sap gui windows by: clicking shortcut “windows start menu --> sap frontend --> sap logon” entering username and password clicking logon sap transactions in sap, you call sap programs via sap transaction codes. important transactions codes are for example: bapi: bapi explorer, view all sap bapi's se16: data browser, view/add table data se38: program editor here is a list of several other important transaction codes: http://www.sapdev.co.uk/tcodes/tcodes.htm installation of demo data the sap installation includes some demo data. as most people do not want to install “real” sap modules such as sap fi, sap crm or sap bi on their local system, this demo data is perfect for demos using talend’s sap connectors. to install the flight demo on a local sap system, you just have to open the abap editor (transaction: se38) and execute the program sapbc_data_generator. this program generates example data within the flight tables and does some further initializations. here is a good tutorial with more information and how to test the flight application: http://help.sap.com/saphelp_erp60_sp/helpdata/de/db/7c623cf568896be10000000a11405a/content.htm configuration of talend studio to use sap components talend’s sap components are already included in the studio. however, two further steps are required to be able to use them: copy sapjco3.dll to the directory c:/windows/system32 sap java connector jar must be added copy sapjco3.jar to the directory “talend/studio/lib/java” (re-) start talend studio check if sap library is added successfully open view “talend modules” (eclipse --> windows --> show view --> talend --> modules) sort by column “context” look for “tsap*” contexts and check if sapjco3.jar has status “installed” usage of sap components with talend studio this section describes how to use talend’s sap components and the sap wizard in general (using one specific example for calling a bapi). detailed descriptions of all sap components (for using bapis, rfcs, idocs, bw, etc.) are available in the documentation talend_components_rg_x.y.z.pdf at www.help.talend.com . connection to a sap system a connection to a sap system can be done “built-in” or via “metadata --> sap connections” (the latter only in enterprise version). using the latter has several advantages: reuse connection configuration quick check if connection to sap works wizards for retrieving functions from sap (instead of handwriting without wizard) quick test with test parameters if function works before finishing development lifecycle for a sap job development lifecycle for sap job: create connection (if not existing yet) right click on metadata --> sap connection create sap connection follow wizard sap jco version: 3 client: “001” userid: “sap*” password: “admin123” --> as you defined it while installation language: “en” hostname: “localhost” system number: “00” retrieve function (bapi / rfc) right click on created connection click on “retrieve sap function” enter search filter (e.g. bapi_fl*) click on “search” select and double click on your function (e.g bapi_flcust_getlist) you see all input, output and table parameters for this sap function click on “test in” --> here you see parameters in more detail: you now have to define which input and output parameters you want to use --> remove all other by selecting them and clicking “remove” button hint: if you do not remove an input parameter, you usually have to enter a value for it! select the output type - can be a single (single record), a table (list of records), or a structure output hint: difference between table and structure in sap: http://www.sapfans.com/forums/viewtopic.php?f=12&t=119794 if you want to do a quick test: enter values for input parameters (if there are any for your function call), then click “launch” button in this example, there is only an optional input parameter max_rows you should see data in the output fields in this example, you see the record with custname “sap ag” and street “neurottstr. 16” click “finish” button under “metadata --> sap connections --> “your connection” --> sap functions: there you can now see your function (in this example: bapi_flcust_getlist) create sap job drag&drop the created function into a job (without the wizard, you also can enter all data by hand) tsapinput component is proposed automatically. click ok to add it to your job go to “initialize input” and add parameter values in this example, there is just the parameter “max_rows” hint: the parameter value can be changed from a hardcoded value to a variable, of course (just click control space on your keyboard to get access to all available variables via code completion in your studio) go to the tsapinput component and add the desired output mapping (i.e. which values you want to process further with other components scroll to the bottom to “outputs” add the correct table / structure name (in this example: "customer_list") click on mapping (which is empty and has to be filled) click on “mapping”, then click on “…” add the wanted output columns of your sap function add the same names at the column “schema xpathqueries” (do not forget the double quotes here!) click “ok” button connect the tsapinput component to a tlogrowcomponent and synchronize the schema hint: always try out if this works before adding further logic to your job! run and test your job (you will see five rows logged (as you have configured max_rows = 5 that's it. now enjoy talend's sap components :-) best regards, kai wähner (twitter: @kaiwaehner) content from my blog: http://www.kai-waehner.de/blog/2013/03/03/sap-integration-with-talend-components-connectors-bapi-rfc-idoc-bw-soap/
March 4, 2013
by Kai Wähner DZone Core CORE
· 32,871 Views · 1 Like
article thumbnail
Using UriTemplate to Change Path Parameters in a URI
I have been recently working on a client generator in the wadl.java.net project and I was solving a bug where I wanted to change path parameter on a URI. This turned out to be quite easy with existing classes that are part of the Jersey project. First of all you need a instance of UriTemplate, depending on the very of Jersey this is in a slightly different package - you IDE's automatic import will do the work for you. String template = "http://example.com/name/{name}/age/{age}"; UriTemplate uriTemplate = new UriTemplate(template); Then you use match(...) to extract them: String uri = "http://example.com/name/Bob/age/47"; Map parameters = new HashMap<>(); // Not this method returns false if the URI doesn't match, ignored // for the purposes of the this blog. uriTemplate.match(uri, parameters); Then you can replace the parameters at will and rebuild the URI: parameters.put("name","Arnold"); UriBuilder builder = UriBuilder.fromPath(template); URI output = builder.build(parameters);
March 1, 2013
by Gerard Davison
· 43,794 Views · 2 Likes
article thumbnail
Nested Iterator with WSO2 ESB
Please find the following sample which demonstrates the Nested iterator. The following is the source of the configuration. 15000 $1 Use the following request with soapui in order to test the above sample. SUN9 SUN10 From the second iterate mediator, messages retrieved from the first iterator are again iterated into smaller messages as below SUN10
February 23, 2013
by Achala Chathuranga Aponso
· 7,411 Views
article thumbnail
Apache Camel Meets Redis
The Lamborghini of Key-Value stores Camel is the best of bread Integration framework and in this post I'm going to show you how to make it even more powerful by leveraging another great project - Redis. Camel 2.11 is on its way to be released soon with lots of new features, bug fixes and components. Couple of these new components are authored by me, redis-component being my favourite one. Redis - a ligth key/value store is an amazing piece of Italian software designed for speed (same as Lamborghini - a two-seater Italian car designed for speed). Written in C and having an in-memory closer to the metal nature, Redis performs extremely well (Lamborgini's motto is "Closer to the Road"). Redis is often referred to as a data structure server since keys can contain strings, hashes, lists and sorted sets. A fast and light data structure server is like a super sportscars for software engineers - it just flies. If you want to find out more about Redis' and Lamborghini's unique performance characteristics google around and you will see for yourself. Getting started with Redis is easy: download, make, and start a redis-server. After these steps, you ready to use it from your Camel application. The component uses internally Spring Data which in turn uses Jedis driver, but with possibility to switch to other Redis drivers. Here are few use cases where the camel-redis component is a good fit: Idempotent Repository The term idempotent is used in mathematics to describe a function that produces the same result if it is applied to itself. In Messaging this concepts translates into the a message that has the same effect whether it is received once or multiple times. In Camel this pattern is implemented using the IdempotentConsumer class which uses an Expression to calculate a unique message ID string for a given message exchange; this ID can then be looked up in the IdempotentRepository to see if it has been seen before; if it has the message is consumed; if its not then the message is processed and the ID is added to the repository. RedisIdempotentRepository is using a set structure to store and check for existing Ids. ${in.body.id} Caching One of the main uses of Redis is as LRU cache. It can store data inmemory as Memcached or can be tuned to be durable flushing data to a log file that can be replayed if the node restarts.The various policies when maxmemory is reached allows creating caches for specific needs: volatile-lru remove a key among the ones with an expire set, trying to remove keys not recently used. volatile-ttl remove a key among the ones with an expire set, trying to remove keys with short remaining time to live. volatile-random remove a random key among the ones with an expire set. allkeys-lru like volatile-lru, but will remove every kind of key, both normal keys or keys with an expire set. allkeys-random like volatile-random, but will remove every kind of keys, both normal keys and keys with an expire set. Once your Redis server is configured with the right policies and running, the operation you need to do are SET and GET: SET keyOne valueOne Interap pub/sub with Redis Camel has various components for interacting between routes: direct: provides direct, synchronous invocation in the same camel context. seda: asynchronous behavior, where messages are exchanged on a BlockingQueue, again in the same camel context. vm: asynchronous behavior like seda, but also supports communication across CamelContext as long as they are in the same JVM. Complex applications usually consist of more than one standalone Camel instances running on separate machines. For this kind of scenarios, Camel provides jms, activemq, combination of AWS SNS with SQS, for messaging between instances. Redis has a simpler solution for the Publish/Subscribe messaging paradigm. Subscribers subscribes to one or more channels, by specifying the channel names or using pattern matching for receiving messages from multiple channels. Then the publisher publishes the messages to a channel, and Redis makes sure it reaches all the matching subscribers. PUBLISH testChannel Test Message Other usages Guaranteed Delivery: Camel supports this EIP using JMS, File, JPA and few other components. Here Redis can be used as lightweight key-value persistent store with its transaction support. The Claim Check from the EIP patterns allows you to replace message content with a claim check (a unique key), which can be used to retrieve the message content at a later time. The message content can be stored temporarily in Redis. Redis is also very popular for implementing counters, leaderboards, tagging systems and many more functionalities. Now, with two swiss army knives under your belt, the integrations to make are limited only by your imagination.
February 20, 2013
by Bilgin Ibryam
· 10,868 Views
article thumbnail
CPU Cache Flushing Fallacy
Even from highly experienced technologists I often hear talk about how certain operations cause a CPU cache to "flush". This seems to be illustrating a very common fallacy about how CPU caches work, and how the cache sub-system interacts with the execution cores. In this article I will attempt to explain the function CPU caches fulfil, and how the cores, which execute our programs of instructions, interact with them. For a concrete example I will dive into one of the latest Intel x86 server CPUs. Other CPUs use similar techniques to achieve the same ends. Most modern systems that execute our programs are shared-memory multi-processor systems in design. A shared-memory system has a single memory resource that is accessed by 2 or more independent CPU cores. Latency to main memory is highly variable from 10s to 100s of nanoseconds. Within 100ns it is possible for a 3.0GHz CPU to process up to 1200 instructions. Each Sandy Bridge core is capable of retiring up to 4 instructions-per-cycle (IPC) in parallel. CPUs employ cache sub-systems to hide this latency and allow them to exercise their huge capacity to process instructions. Some of these caches are small, very fast, and local to each core; others are slower, larger, and shared across cores. Together with registers and main-memory, these caches make up our non-persistent memory hierarchy. Next time you are developing an important algorithm, try pondering that a cache-miss is a lost opportunity to have executed ~500 CPU instructions! This is for a single-socket system, on a multi-socket system you can effectively double the lost opportunity as memory requests cross socket interconnects. Memory Hierarchy Figure 1. For the circa 2012 Sandy Bridge E class servers our memory hierarchy can be decomposed as follows: Registers: Within each core are separate register files containing 160 entries for integers and 144 floating point numbers. These registers are accessible within a single cycle and constitute the fastest memory available to our execution cores. Compilers will allocate our local variables and function arguments to these registers. When hyperthreading is enabled these registers are shared between the co-located hyperthreads. Memory Ordering Buffers (MOB): The MOB is comprised of a 64-entry load and 36-entry store buffer. These buffers are used to track in-flight operations while waiting on the cache sub-system. The store buffer is a fully associative queue that can be searched for existing store operations, which have been queued when waiting on the L1 cache. These buffers enable our fast processors to run asynchronously while data is transferred to and from the cache sub-system. When the processor issues asynchronous reads and writes then the results can come back out-of-order. The MOB is used to disambiguate the ordering for compliance to the published memory model. Level 1 Cache: The L1 is a core-local cache split into separate 32K data and 32K instruction caches. Access time is 3 cycles and can be hidden as instructions are pipelined by the core for data already in the L1 cache. Level 2 Cache: The L2 cache is a core-local cache designed to buffer access between the L1 and the shared L3 cache. The L2 cache is 256K in size and acts as an effective queue of memory accesses between the L1 and L3. L2 contains both data and instructions. L2 access latency is 12 cycles. Level 3 Cache: The L3 cache is shared across all cores within a socket. The L3 is split into 2MB segments each connected to a ring-bus network on the socket. Each core is also connected to this ring-bus. Addresses are hashed to segments for greater throughput. Latency can be up to 38 cycles depending on cache size. Cache size can be up to 20MB depending on the number of segments, with each additional hop around the ring taking an additional cycle. The L3 cache is inclusive of all data in the L1 and L2 for each core on the same socket. This inclusiveness, at the cost of space, allows the L3 cache to intercept requests thus removing the burden from private core-local L1 & L2 caches. Main Memory: DRAM channels are connected to each socket with an average latency of ~65ns for socket local access on a full cache-miss. This is however extremely variable, being much less for subsequent accesses to columns in the same row buffer, through to significantly more when queuing effects and memory refresh cycles conflict. 4 memory channels are aggregated together on each socket for throughput, and to hide latency via pipelining on the independent memory channels. NUMA: In a multi-socket server we have non-uniform memory access. It is non-uniform because the required memory maybe on a remote socket having an additional 40ns hop across the QPI bus. Sandy Bridge is a major step forward for 2-socket systems over Westmere and Nehalem. With Sandy Bridge the QPI limit has been raised from 6.4GT/s to 8.0GT/s, and two lanes can be aggregated thus eliminating the bottleneck of the previous systems. For Nehalem and Westmere the QPI link is only capable of ~40% the bandwidth that could be delivered by the memory controller for an individual socket. This limitation made accessing remote memory a choke point. In addition, the QPI link can now forward pre-fetch requests which previous generations could not. Associativity Levels Caches are effectively hardware based hash tables. The hash function is usually a simple masking of some low-order bits for cache indexing. Hash tables need some means to handle a collision for the same slot. The associativity level is the number of slots, also known as ways or sets, which can be used to hold a hashed version of an address. Having more levels of associativity is a trade off between storing more data vs. power requirements and time to search each of the ways. For Sandy Bridge the L1 and L2 are 8-way and the L3 is 12-way associative. Cache Coherence With some caches being local to cores, we need a means of keeping them coherent so all cores can have a consistent view of memory. The cache sub-system is considered the "source of truth" for mainstream systems. If memory is fetched from the cache it is never stale; the cache is the master copy when data exists in both the cache and main-memory. This style of memory management is known as write-back whereby data in the cache is only written back to main-memory when the cache-line is evicted because a new line is taking its place. An x86 cache works on blocks of data that are 64-bytes in size, known as a cache-line. Other processors can use a different size for the cache-line. A larger cache-line size reduces effective latency at the expense of increased bandwidth requirements. To keep the caches coherent the cache controller tracks the state of each cache-line as being in one of a finite number of states. The protocol Intel employs for this is MESIF, AMD employs a variant know as MOESI. Under the MESIF protocol each cache-line can be in 1 of the 5 following states: Modified: Indicates the cache-line is dirty and must be written back to memory at a later stage. When written back to main-memory the state transitions to Exclusive. Exclusive: Indicates the cache-line is held exclusively and that it matches main-memory. When written to, the state then transitions to Modified. To achieve this state a Request-For-Ownership (RFO) message is sent which involves a read plus an invalidate broadcast to all other copies. Shared: Indicates a clean copy of a cache-line that matches main-memory. Invalid: Indicates an unused cache-line. Forward: Indicates a specialised version of the shared state i.e. this is the designated cache which should respond to other caches in a NUMA system. To transition from one state to another, a series of messages are sent between the caches to effect state changes. Previous to Nehalem for Intel, and Opteron for AMD, this cache coherence traffic between sockets had to share the memory bus which greatly limited scalability. These days the memory controller traffic is on a separate bus. The Intel QPI, and AMD HyperTransport, buses are used for cache coherence between sockets. The cache controller exists as a module within each L3 cache segment that is connected to the on-socket ring-bus network. Each core, L3 cache segment, QPI controller, memory controller, and integrated graphics sub-system are connected to this ring-bus. The ring is made up of 4 independent lanes for: request, snoop, acknowledge, and 32-bytes data per cycle. The L3 cache is inclusive in that any cache-line held in the L1 or L2 caches is also held in the L3. This provides for rapid identification of the core containing a modified line when snooping for changes. The cache controller for the L3 segment keeps track of which core could have a modified version of a cache-line it owns. If a core wants to read some memory, and it does not have it in a Shared, Exclusive, or Modified state; then it must make a read on the ring bus. It will then either be read from main-memory if not in the cache sub-systems, or read from L3 if clean, or snooped from another core if Modified. In any case the read will never return a stale copy from the cache sub-system, it is guaranteed to be coherent. Concurrent Programming If our caches are always coherent then why do we worry about visibility when writing concurrent programs? This is because within our cores, in their quest for ever greater performance, data modifications can appear out-of-order to other threads. There are 2 major reasons for this. Firstly, our compilers can generate programs that store variables in registers for relatively long periods of time for performance reasons, e.g. variables used repeatedly within a loop. If we need these variables to be visible across cores then the updates must not be register allocated. This is achieved in C by qualifying a variable as "volatile". Beware that C/C++ volatile is inadequate for telling the compiler to order other instructions. For this you need fences/barriers. The second major issue with ordering we have to be aware of is a thread could write a variable and then, if it reads it shortly after, could see the value in its store buffer which may be older than the latest value in the cache sub-system. This is never an issue for algorithms following the Single Writer Principle but is an issue for the likes of the Dekker and Peterson lock algorithms. To overcome this issue, and ensure the latest value is observed, the thread must wait for the store buffer to drain on that core. This can be achieved by issuing a fence instruction. The write of a volatile variable in Java, in addition to never being register allocated, is accompanied by a full fence instruction. This fence instruction on x86 has a significant performance impact by preventing progress on the issuing thread until the store buffer is drained. Fences on other processors can have more efficient implementations that simply put a marker in the store buffer for the search boundary, e.g. the Azul Vega does this. If you want to ensure memory ordering across Java threads when following the Single Writer Principle, and avoid the store fence, it is possible by using the j.u.c.Atomic(Int|Long|Reference).lazySet() method, as opposed to setting a volatile variable. The Fallacy Returning to the fallacy of "flushing the cache" as part of a concurrent algorithm. I think we can safely say that we never "flush" the CPU cache within our user space programs. I believe the source of this fallacy is the need to flush, mark or drain to a point, the store buffer for some classes of concurrent algorithms so the latest value can be observed on a subsequent load operation. For this we require a memory ordering fence and not a cache flush. Another possible source of this fallacy is that L1 caches, or the TLB, may need to be flushed based on address indexing policy on a context switch. ARM, previous to ARMv6, did not use address space tags on TLB entries thus requiring the whole L1 cache to be flushed on a context switch. Many processors require the L1 instruction cache to be flushed for similar reasons, in many cases this is simply because instruction caches are not required to be kept coherent. The bottom line is, context switching is expensive and a bit off topic, so in addition to the cache pollution of the L2, a context switch can also cause the TLB and/or L1 caches to require a flush. Intel x86 processors require only a TLB flush on context switch.
February 15, 2013
by Martin Thompson
· 11,460 Views · 3 Likes
article thumbnail
Synchronising Multithreaded Integration Tests
Testing threads is hard, very hard and this makes writing good integration tests for multithreaded systems under test... hard. This is because in JUnit there's no built in synchronisation between the test code, the object under test and any threads. This means that problems usually arise when you have to write a test for a method that creates and runs a thread. One of the most common scenarios in this domain is in making a call to a method under test, which starts a new thread running before returning. At some point in the future when the thread's job is done you need assert that everything went well. Examples of this scenario could include asynchronously reading data from a socket or carrying out a long and complex set of operations on a database. For example, the ThreadWrapper class below contains a single public method: doWork(). Calling doWork() sets the ball rolling and at some point in the future, at the discretion of the JVM, a thread runs adding data to a database. public class ThreadWrapper { /** * Start the thread running so that it does some work. */ public void doWork() { Thread thread = new Thread() { /** * Run method adding data to a fictitious database */ @Override public void run() { System.out.println("Start of the thread"); addDataToDB(); System.out.println("End of the thread method"); } private void addDataToDB() { // Dummy Code... try { Thread.sleep(4000); } catch (InterruptedException e) { e.printStackTrace(); } } }; thread.start(); System.out.println("Off and running..."); } } A straightforward test for this code would be to call the doWork() method and then check the database for the result. The problem is that, owing to the use of a thread, there's no co-ordination between the object under test, the test and the thread. A common way of achieving some co-ordination when writing this kind of test is to put some kind of delay in between the call to the method under test and checking the results in the database as demonstrated below: public class ThreadWrapperTest { @Test public void testDoWork() throws InterruptedException { ThreadWrapper instance = new ThreadWrapper(); instance.doWork(); Thread.sleep(10000); boolean result = getResultFromDatabase(); assertTrue(result); } /** * Dummy database method - just return true */ private boolean getResultFromDatabase() { return true; } } In the code above there is a simple Thread.sleep(10000) between two method calls. This technique has the benefit of being incredabile simple; however it's also very risky. This is because it introduces a race condition between the test and the worker thread as the JVM makes no guarantees about when threads will run. Often it'll work on a developer's machine only to fail consistently on the build machine. Even if it does work on the build machine it atificially lengthens the duration of the test; remember that quick builds are important. The only sure way of getting this right is to synchronise the two different threads and one technique for doing this is to inject a simple CountDownLatch into the instance under test. In the example below I've modified the ThreadWrapper class's doWork() method adding the CountDownLatch as an argument. public class ThreadWrapper { /** * Start the thread running so that it does some work. */ public void doWork(final CountDownLatch latch) { Thread thread = new Thread() { /** * Run method adding data to a fictitious database */ @Override public void run() { System.out.println("Start of the thread"); addDataToDB(); System.out.println("End of the thread method"); countDown(); } private void addDataToDB() { try { Thread.sleep(4000); } catch (InterruptedException e) { e.printStackTrace(); } } private void countDown() { if (isNotNull(latch)) { latch.countDown(); } } private boolean isNotNull(Object obj) { return latch != null; } }; thread.start(); System.out.println("Off and running..."); } } he Javadoc API describes a count down latch as: A synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes. A CountDownLatch is initialized with a given count. The await methods block until the current count reaches zero due to invocations of the countDown() method, after which all waiting threads are released and any subsequent invocations of await return immediately. This is a one-shot phenomenon -- the count cannot be reset. If you need a version that resets the count, consider using a CyclicBarrier. A CountDownLatch is a versatile synchronization tool and can be used for a number of purposes. A CountDownLatch initialized with a count of one serves as a simple on/off latch, or gate: all threads invoking await wait at the gate until it is opened by a thread invoking countDown(). A CountDownLatchinitialized to N can be used to make one thread wait until N threads have completed some action, or some action has been completed N times. A useful property of a CountDownLatch is that it doesn't require that threads calling countDown wait for the count to reach zero before proceeding, it simply prevents any thread from proceeding past an await until all threads could pass. The idea here is that the test code will never check the database for the results until the run() method of the worker thread has called latch.countdown(). This is because the test code thread is blocking at the call to latch.await(). latch.countdown() decrements latch's count and once this is zero the blocking call the latch.await() returns and the test code continues executing, safe in the knowledge that any results which should be in the database, are in the database. The test can then retrieve these results and make a valid assertion. Obviously, the code above merely fakes the database connection and operations. The thing is you may not want to, or need to, inject a CountDownLatch directly into your code; after all it's not used in production and it doesn't look particularly clean or elegant. One quick way around this is to simply make the doWork(CountDownLatch latch) method package private and expose it through a public doWork() method. public class ThreadWrapper { /** * Start the thread running so that it does some work. */ public void doWork() { doWork(null); } @VisibleForTesting void doWork(final CountDownLatch latch) { Thread thread = new Thread() { /** * Run method adding data to a fictitious database */ @Override public void run() { System.out.println("Start of the thread"); addDataToDB(); System.out.println("End of the thread method"); countDown(); } private void addDataToDB() { try { Thread.sleep(4000); } catch (InterruptedException e) { e.printStackTrace(); } } private void countDown() { if (isNotNull(latch)) { latch.countDown(); } } private boolean isNotNull(Object obj) { return latch != null; } }; thread.start(); System.out.println("Off and running..."); } } The code above uses Google's Guava @VisibleForTesting annotation to tell us that the doWork(CountDownLatch latch) method visibility has been relaxed slightly for testing purposes. Now I realise that making a method call package private for testing purposes in highly controversial; some people hate the idea, whilst others include it everywhere. I could write a whole blog on this subject (and may do one day), but for me it should be used judiciously, when there's no other choice, for example when you're writing characterisation tests for legacy code. If possible it should be avoided, but never ruled out. After all tested code is better than untested code. With this in mind the next iteration of ThreadWrapper designs out the need for a method marked as @VisibleForTesting together with the need to inject a CountDownLatch into your production code. The idea here is to use the Strategy Pattern and separate the Runnable implementation from the Thread. Hence, we have a very simple ThreadWrapper public class ThreadWrapper { /** * Start the thread running so that it does some work. */ public void doWork(Runnable job) { Thread thread = new Thread(job); thread.start(); System.out.println("Off and running..."); } } and a separate job: public class DatabaseJob implements Runnable { /** * Run method adding data to a fictitious database */ @Override public void run() { System.out.println("Start of the thread"); addDataToDB(); System.out.println("End of the thread method"); } private void addDataToDB() { try { Thread.sleep(4000); } catch (InterruptedException e) { e.printStackTrace(); } } } You'll notice that the DatabaseJob class doesn't use a CountDownLatch. How is it synchronised? The answer lies in the test code below... public class ThreadWrapperTest { @Test public void testDoWork() throws InterruptedException { ThreadWrapper instance = new ThreadWrapper(); CountDownLatch latch = new CountDownLatch(1); DatabaseJobTester tester = new DatabaseJobTester(latch); instance.doWork(tester); latch.await(); boolean result = getResultFromDatabase(); assertTrue(result); } /** * Dummy database method - just return true */ private boolean getResultFromDatabase() { return true; } private class DatabaseJobTester extends DatabaseJob { private final CountDownLatch latch; public DatabaseJobTester(CountDownLatch latch) { super(); this.latch = latch; } @Override public void run() { super.run(); latch.countDown(); } } } The test code above contains an inner class DatabaseJobTester, which extends DatabaseJob. In this class the run() method has been overridden to include a call to latch.countDown() after our fake database has been updated via the call to super.run(). This works because the test passes a DatabaseJobTester instance to the doWork(Runnable job) method adding in the required thread testing capability. The idea of sub-classing objects under test is something I've mentioned before in one of my blogs on testing techniques and is a really powerful technique. So, to conclude: Testing threads is hard. Testing anonymous inner classes is almost impossible. Using Thead.sleep(...) is a risky idea and should be avoided. You can refactor out these problems using the Strategy Pattern. Programming is the Art of Making the Right Decision ...and that relaxing a method's visibility for testing may or may not be a good idea, but more on that later... The code above is available on Github in the captain debug repository (git://github.com/roghughe/captaindebug.git) under the unit-testing-threads project.
February 13, 2013
by Roger Hughes
· 13,912 Views · 12 Likes
article thumbnail
Eclipse Workspace Tips
usually, one of the first things i see if i launch eclipse is this dialog: select a workspace dialog actually, that ‘workspace’ thing is one of the most important things in eclipse to understand. to mess around it can cause a lot of pain. so i have collected some ‘lessons learned’ around workspaces. the workspace .metadata folder the workspace is where eclipse stores that .metadata folder: workspace metadata folder in this folder, eclipse stores all the workspace settings or preferences i configure e.g. using the menu window > preferences . sometimes that .metadata folder is named ‘framework’ too. e.g. if i’m are asked by eclipse to store some settings in the ‘frame work’ then this means it will be stored in the .metadata. eclipse uses this folder as well to store internal files and data structures. and many plugins store their settings in here to. consider the content of this folder as a ‘black box’: so do not change it, do not touch it unless you *really* know what you are doing! do not copy or move that folder. if you want to copy/share your workspace settings, then do *not* copy the .metadata folder, as typically you cannot use this folder on another machine or for another user. if you want to transfer/copy your settings, then see this post . workspace and eclipse versions as eclipse stores information in the .metadata workspace structure, the data/format might be different from version of eclipse to another (e.g. from one version of codewarrior to another). while using the same workspace with different versions of eclipse might work, it is *not* recommended. the eclipse community tries hard to keep things compatible, but using a different workspace for different eclipse versions is what i recommend. i started to name my workspace(s) like ‘wsp_lecture_10.2′ or ‘wsp_lecture_10.3′ to show that i’m using it for a specific version of codewarrior. workspace and projects this leads to the question: “do i have to duplicate then my projects if using with different versions of eclipse in parallel?” the answer is ‘no’. because the workspace folder does *not* have to have the projects in it (as folders). they can, but it is not needed. for example i have different workspaces (“wsp_10.3″, “wsp_10.2″), but my projects are in the “projects” folder somewhere else on my disk. what i do is to import the projects into each workspace, keep the projects in their original folder location. the menu file > import > general > existing projects into workspace can be used, with ‘copy projects into workspace’ * unchecked *: importing projects into workspace an easy trick is to drag&drop the project folders into eclipse: that’s much faster and simpler in my view than using above dialog. tip: showing the current workspace in the title bar using multiple workspaces can be confusing at some time. see this post how you can show the workspace in the application title: workspace shown in title bar processor expert processor expert has a special setting in the workspace pointing to its ‘data base’. that data base is inside the installation folder, in the mcu\processorexpert folder. if a launch eclipse and use a workspace from a different installation, i get a warning dialog: processor expert workspace warning: current worksapce is configured to use data from another installation of processor expert. that path setting of processor expert (pointing to the installation folder) is in my view the biggest argument to *not* share a workspace between different versions of eclipse. pressing the ‘open preferences’ opens the settings, and with ‘restore defaults’ it will (after a restart of eclipse) use the new installation path: processor expert directory that warning might not come up if using multiple installations of codewarrior. it seems that as long there is a valid path to a processor expert data base, the warning might not show up, and it will use that data base. that can lead to weird behaviour, so better check that the path in the workspace setting is pointing to the right folder. workspace dialog on startup remember that dialog at the beginning of this post? if i want to change if (and what) is shown at eclipse startup, then this is configured with the menu window > preferences > general > startup and shutdown : startup and workspaces in this dialog i can remove items from the list (e.g. if a workspace folder does not exist any more). tabula rasa as eclipse stores a lot of information in the workspace .metadata, that folder can grow to a substantial size (several hundreds of mbytes, depending on usage and configuration). one reason is because eclipse stores local history and undo information into that .metadata folder. just have a look at .metadata\.plugins\org.eclipse.core.resources\.history so from time to time (especially if i think that eclipse is slowing down), i export my workspace settings ( file > export > general > preferences , see copy my workspace settings ) and re-import it again into the new workspace. summary eclipse stores all its workspace settings and files in the .metadata folder. never touch/copy/change/move the .metadata folder. use different workspaces for each eclipse version. store the projects outside of the workspace if you want to share them across different eclipse versions. happy workspacing
February 11, 2013
by Erich Styger
· 100,414 Views · 3 Likes
article thumbnail
Top 10 Customization of Eclipse Settings
the great thing with eclipse is that you can configure a lot. in general, i’m happy with most of the defaults in eclipse and codewarrior. here are my top 10 things i change in eclipse to make it even better: add -showlocation to the eclipse startup command line: show workspace location in the title bar disable the heuristik settings for the indexer : fixing the eclipse index disable build (if required) for the debugger: speeding up the debug launch in codewarrior using spaces and not tabs: spaces vs. tabs in eclipse highlight the selected line : color makes the difference! configuring more hovers : hovering and debugging enabling static software analysis : free static code analysis with eclipse processor expert expert settings: enabling the expert level in processor expert custom dictionary settings: eclipse spell checker show line numbers : eclipse and line numbers i don’t have to waste time to change the settings for each of my workspaces: after i have changed the settings, i can simply export and import them again into another workspace: change my preferences using the menu window > preferences use the menu file > export > general > preferences and save the settings in a file: file export switch to the new workspace use the menu file > import > general > preferences to import the settings from the file: file import happy customizing
February 10, 2013
by Erich Styger
· 40,105 Views
article thumbnail
How Do You Organise Maven Sub-Modules?
Being an itinerant programmer one of the things I've noticed over the years is that every project you come across seems to have a slightly different way of organising its Maven modules. There seems to be no conventional way of characterising the contents of a project's sub-modules and not that much discussion on it either. This is strange, as defining the responsibilities of your Maven modules seems to me to be as critical as good class design and coding technique to a project's success. So, in light of this dearth of wisdom, here's my two penneth worth... When you first come across a new project, you'll generally find a layout convention that vaguely matches that defined by the Better Builds With Maven manual. The 'clean' project directory usually contains a POM file, a src folder and several sub-modules, each in their own subdirectory, as shown in the diagram below: If we all agree that this is the standard way of approaching the top level of project layout (and I have seen it done slightly differently) then there seems to be three different approaches taken when organising the responsibilities of each of a project's sub-modules. These are: Totally haphazardly. By class type. By functional area. I'm not going to linger on those projects that are organised seemingly without any structure or order except to say that they probably started off well organised but were not designed well enough to endure the changes forced upon them. In saying that a project's sub-modules are organised 'by class type', I mean that modules are used to group together all classes that comprise, but are not limited to, a layer in the program's architecture. For example a module could contain all classes that make up the program's service or persistence layers or a module could contain all model class (i.e. beans). Conversely, in saying that a project's sub-modules are organised by functional area I'm talking about a situation where each module contains, as close as possible, a vertical slice of the application, including model beans, service layer, controllers etc. If the truth be told then there are any number of ways to organise your project's sub-modules. Most project set-ups are fairly flat in structure, which is what I've demonstrated above; however, if you take a look at Erik Putrycz's 2009 talk Maven – or how to automate java builds, tests and version management with open source tools, he demonstrates that you can have modules within modules within modules. In order to explore this a little further, I'm going to invent my usual preposterously contrived scenario and in this scenario, you've got to write a program for a Welsh dental practice owned by a man called Jones also known locally as 'Jones The Driller'. The requirements would be pretty standard, I suspect, for a dental practice and would include handling: Patients details: name, address, DOB, phone number etc. Medical records, including treatments and outcomes. Appointments. Accounting, e.g. sales, purchase, wages etc. Auditing: as in who did what to whom... As a solution to Jones The Driller's problem, you propose that you write a multi-module web application based upon Spring, MVC and tomcat that, when assembled, has a standard 'n' tier design of a mySQL database, a database layer, service layer, a set of controllers and some JSPs that comprise the view. In creating your project your idea is to organise your sub-modules 'by class type' and you come up with the following module organisation, shown below roughly in build dependency order dentists-model dentists-utils dentist-repository dentists-services dentists-controllers dentists-web ...which on your screen looks something like this: Your dentists-model module contains the project's beans that model object used from the persistence layer right up to the JSPs. dentists-repository, dentists-services and dentists-controllers reflect the various layers of your application, with dentists-web module containing all the JSPs, CSS and other view paraphernalia. As for dentists-utils, well every project has a utils module where all the really useful, but disparate classes end up. Meanwhile, in a different universe, a different version of you decides to organise your project's sub-modules by functional area and you come up with the following breakdown: dentists-utils dentists-audit dentists-user-details dentists-medical-records dentists-appointments dentists-accounts dentists-repository dentists-integration dentists-web In this scenario, the build order is somewhat different; virtually all modules will depend upon dentists-utils and, depending upon your exact audit requirements, most modules will rely upon dentists-audit. You can also see in the following images that the sub-module package structure has been arranged on layer and type boundaries in that each module has its own model, repository (which contains interface definitions only) services and controller packages and that the layout of each module is identical at the top level. Another discussion to have here is the organisation of your project's package structure, where you can ask the same kind of questions: do you organise 'by class type' or 'by functional area' as shown above? You may have noticed that the dentists-repository modules can be fairly near the end of the build cycle as it only contains the implementation of the repository classes and not their interface definitions. You may have also noticed that dentists-web is again a separate module. This is because you're a pretty savvy business guy and in keeping the JSPs etc. in their own module, you hope to re-skin your app and sell it to that other Welsh dentist down the road: Williams The Puller. From a test perspective, each module contains its own unit tests, but there's a separate integration test module that, as it'll take longer to execute can be run when required. There are generally two ways of defining integration tests: firstly by putting them in their own module, which I prefer, and secondly by using a integration test naming convention such as somefileIT.java, and running all *IT.java files separately to all *Test.java files. Your two identical selves have proposed two different solutions to the same problem, so I guess that it's now time to takes a look at the pros and cons of each. Taking the 'by class type' solution first, what can be said about it? On the plus side, it's pretty maintainable in that you always known where to find stuff. Need a service class? Then that's in the dentist-service module. Also, the build order is very straight forward. On the down side, organisation 'by class type' is prone to problems with circular dependencies creeping in and classes with totally different responsibilities are all mixed up together making it difficult to re-use functionality in other projects without unnecessarily dragging in the who shebang. So, what about the pros and cons of the 'by functional area' approach? To my way of thinking, given the package structure of each module, it's just about as easy to locate a class using this technique as it is when using 'by class type'. The real benefit of using this approach is that it's far simpler to re-use a functional area of code in other projects. For example, I've worked on many projects in different companies and have implemented auditing several times. Each time I implement it I usually do it in roughly the same way, so wouldn't it be good just to reuse the first implementation? Not withstanding code ownership issues... The same idea also applies to dentists-user-details; the requirement to manage names and addresses applies equally as well to a shoe sales web site as it does a dental practice. And the downside? One of the benefits of this approach is that the modules are highly decoupled, but from experience no matter how hard you try, you always end up with more coupling that you'd like. You may have already spotted that both of these proposals are not 100% pure; 'by class type' contains a bit of 'by functionality' and conversely 'by functional area' contains a couple of 'by class type' modules. This may be avoidable, but I'm purposely being pragmatic. As I said earlier you always see a utils module in a project. Furthermore creating a separate database module allows you to change your project's database implementation fairly easily, which may make testing easier in some circumstances and likewise, having a separate web module allows you to re-skin your code should you be lucky enough to sell the same product to multiple customers with their own branding. Finally, one of the unwritten truths in software development is that once you've organised your project into its sub-modules you'll rarely get the opportunity to reorganise and improve them: there usually isn't the time or the political will as doing so costs money; however, it should be remembered that, in Agile terms, project module composition is, like code, a form of technical debt, which if done badly also costs you a lot of cash. It therefore seems a really good idea, as a team, to plan out your project thoroughly before starting to code. So be radical, do some design or have a meeting, you know it'll be worth it in the end.
February 8, 2013
by Roger Hughes
· 28,627 Views · 1 Like
article thumbnail
Eclipse Spell Checker
one of the nice things of modern ide’s are: they offer many extras for free. many times it is related to programming and coding. but i love as well the ones which makes things easier and better which is not directly related to the executed code. one thing eclipse offers is an on-the-fly spell-checking, similar to microsoft word: spellchecked sources hovering over the text offers me to correct the flagged error: initialization vs. initialisation but wait: is that example not spelled correctly? and indeed, eclipse offers to customize the spell checking. the option page is in the windows > preferences > general > editors > text editors > spelling page: spelling preferences ‘initialization’ vs. ‘initialisation’: that’s an ‘english us’ vs. english uk’ thing, and is easily changed. and i prefer the us english: changing dictionary with this, everything is ok now: not flagged any more after changing the platform dictionary, it usually takes a few minutes until the sources are checked again. but what if eclipse does not know a word? then it offers to add it to a dictionary: adding to the dictionary if i do not have a user dictionary yet, it will prompt a dialog: missing user dictionary if pressing ‘yes’, it will prompt the settings page from above where i can specify my user dictionary file: user defined dictionary the user dictionary is a normal text file with one word on each line. that makes it easy to edit and to have it in a version control system. i have one common dictionary file for all my workspaces. but of course it is possible to have different dictionaries per workspace, as the settings are per workspace too. summary i feel having reasonable spelled comments in the sources is just something an engineer should care about. and the eclipse spelling engine does not have to be as good as the one in ms word (which is pretty good in my view). but for making sources better something like correctly spelled comments is a plus. but only if the code works like a charm :mrgreen: happy spelling
February 6, 2013
by Erich Styger
· 15,319 Views
article thumbnail
Local WebHooks with Mule Cloud Connect and LocalTunnel v2
When using an external API for WebHooks or Callbacks as discussed in Chapters 3 and 5 of Getting Started with Mule Cloud Connect; The API provider running somewhere out there on the web needs to callback your application that is happily running in isolation on your local machine. For an API provider to callback your application, the application must be accessible over the web. Sure, you could upload and test your application on a public facing server, but you may find it quicker and easier to work on your local development machine and these are typically behind firewalls, NAT, or otherwise not able to provide a public URL. You need a way to make your local application available over the web. There are a few good services and tools out there to help with this. Examples include ProxyLocal, and Forward.io. Alternatively, you can set up your own reverse SSH Tunnel if you already have a remote system to forward your requests, but this is cumbersome to say the least. I find Localtunnel to be an excellent fit for this need and localtunnel have just recently released v2 of its service with a host of new features and enhancements. More information can be found here: http://progrium.com/blog/2012/12/25/localtunnel-v2-available-in-beta/ Installing Localtunnel Those familiar with version 1 of the service will know that the v1 Localtunnel client was written in Ruby and required Rubygems to install it. The v2 client is now written in Python and can instead be installed via easy_install or pip. If instead you're interested in using Localtunnel v1, then I have wrote a previous blog post on the subject here: http://blogs.mulesoft.org/connector-callback-testing-local/ To get started, you will first need to check that you have Python installed. Localtunnel requires Python 2.6 or later. Most systems come with Python installed as standard, but if not you can check via the following command: $ python -version More info on installing Python can be found here: http://wiki.python.org/moin/BeginnersGuide/Download Once complete, you will need easy_install to install the Localtunnel client.If you don't have easy_install after you install Python, you can install it with this bootstrap script: $ curl http://peak.telecommunity.com/dist/ez_setup.py | python Once complete, you can install the Localtunnel client using the following command: $ easy_install localtunnel First run with LocalTunnel Once installed, creating a tunnel is as simple as running the following command: $ localtunnel-beta 8082 The parameter after the command: "8000" is the local port we want Localtunnel to forward to. So whatever port your app is running on should replace this value. Each time you run the command you should get output similar to the following: Port 8082 is now accessible from http://fb0322605126.v2.localtunnel.com ... Note: As v2 is still in beta; the command local-tunnel-beta will eventually be installed as just localtunnel. This lets you keep the v1 just in case anything goes wrong with v2 during the beta. Configuring the Connector Now onto Mule! To demonstrate I will use the Twilio Cloud Connector example from Chapter 5. Twilio has an awesome WebHook implementation with great debugging tools. Twilio uses callbacks to tell you about the status of your requests; When you use Twilio to a place a phone call or send an SMS the Twilio API allows you to send a URL where you'll receive information about the phone call once it ends or the status of the outbound SMS message after it's processed. This example uses the Twilio Cloud Connector to send a simple SMS message. The most important thing to note is that the "status-callback-flow-ref" attribute. All connector operations that support callback's will have an optional attribute ending in "-flow-ref". In this case : "status-callback-flow-ref". As the name suggests, this attribute should reference a flow. This value must be a valid flow id from within your configuration. It is this flow that will be used to listen for the callback. Notice that the flow has no inbound endpoint? This is where the magic happens; when Twilio process the SMS message it will send a callback automatically to that flow without you having to define an inbound endpoint. The connector automatically generates an inbound endpoint and sends the auto generated URL to Twilio for you. Customizing the Callback The URL generated for the callback URL is built using 'localhost' as the host, the 'http.port' environment variable or 'localPort' value as the port and the path of the URL is typically just a random generated string or static value. So if I run this locally it would send Twilio my non public address, something like: http://localhost:80/...vv3v3er342fvvn. Each connector that accepts HTTP callbacks will provide you with an optional http-callback-config child element to override these settings. These settings can be set at the connector's config level as follows: Here we have amended the previous example to add the additonal http-callback-config configuration. The configuration takes three additional arguments: domain, localPort and remotePort. These settings will be used to constuct the URL that is passed to the external system. The URL will be the same as the default generated URL of the HTTP inbound-endpoint except that the host is replaced by the 'domain' setting (or its default value) and the port is replaced by the 'remotePort' setting (or its default value). In this case we have used the domain from the URL that Localtunnel generated for us earlier: fb0322605126.v2.localtunnel.com and set the localPort to 8082 as we run the Localtunnel command using port 8082 and the remotePort to 80 as the localtunnel server just runs on port 80. And that's it! If you run this configuration you should start seeing your callback being printed to the console. The same goes for any OAuth connectors too. If your using any OAuth connectors built using the DevKit OAuth modules, you can configure the OAuth callback in a similar fashion. A full Mule/Twilio WebHook project can be found here: https://github.com/ryandcarter/GettingStarted-MuleCloudConnect-OReilly/tree/master/chapter05/twilio-webhooks
February 5, 2013
by Ryan Carter
· 4,920 Views
article thumbnail
Tutorial: Deploying an API on EC2 from AWS
Curator's Note: This article was co-authored by Andrzej Jarzyna. At 3scale we find Amazon to be a fantastic platform for running APIs due to the complete control you have on the application stack. For people new to AWS the learning curve is quite steep. So we put together our best practices into this short tutorial. Besides Amazon EC2 we will use the Ruby Grape gem to create the API interface and an Nginx proxy to handle access control. Best of all everything in this tutorial is completely FREE! For the purpose of this tutorial you will need a running API based on Ruby and Thin server. If you don’t have one you can simply clone an example repo as described below (in the “Deploying the Application” section). If you are interested in the background of this example (Sentiment API), you can see a couple of previous guides which 3scale has published. Here we use version_1 of the API(‘API up and running in 10 minutes‘) with some extra sentiment analysis functionality (this part is covered in the second tutorial of the Sentiment API tutorial). Now we will start the creation and configuration of the Amazon EC2 instance. If you already have an EC2 instance (micro or not), you can jump to the next step -> Preparing Instance for Deployment. Creating and configuring EC2 Instance Let’s start by signing up for the Amazon Elastic Compute Cloud (Amazon EC2). For our needs the free tier http://aws.amazon.com/free/ is enough, covering all the basic needs. Once the account is created go to the EC2 dashboard under your AWS Management Console and click on the Launch Instance button. That will transfer you to a popup window where you will continue the process: Choose the classic wizard Choose an AMI (Ubuntu Server 12.04.1 LTS 32bit, T1micro instance) leaving all the other settings for Instance Details as default Create a keypair and download it – this will be the key which you will use to make an ssh connection to the server, it’s VERY IMPORTANT! Add inbound rules for the firewall with source always 0.0.0.0/0 (HTTP, HTTPS, ALL ICMP, TCP port 3000 used by the Ruby thin server) Preparing Instance for Deployment Now, as we have the instance created and running, we can directly connect there from our console (Windows users from PuTTY). Right click on your instance, connect and choose Connect with a standalone SSH Client. Follow the steps and change the username to ubuntu (instead of root) in the given example. After executing this step you are connected to your instance. We will have to install new packages. Some of them require root credentials, so you will have to set a new root password: sudo passwd root. Then login as root: su root. Now with root credentials execute: sudo apt-get update and switch back to your normal user with exit command and install all the required packages: install some libraries which will be required by rvm, ruby and git: sudo apt-get install build-essential git zlib1g-dev libssl-dev libreadline-gplv2-dev imagemagick libxml2-dev libxslt1-dev openssl libreadline6 libreadline6-dev zlib1g libyaml-dev libxslt-dev autoconf libc6-dev ncurses-dev automake libtool bison libpq-dev libpq5 libeditline-dev install git (on Linux rather than from Source): http://www.git-scm.com/book/en/Getting-Started-Installing-Git install rvm: https://rvm.io/rvm/install/ install ruby rvm install 1.9.3 rvm use 1.9.3 --default Deploying the Application Our sample Sentiment API is located on Github. Try cloning the repository: git clone [email protected]:jerzyn/api-demo.git you can once again review the code and tutorial on creating and deploying this app here: http://www.3scale.net/2012/06/the-10-minute-api-up-running-3scale-grape-heroku-api-10-minutes/ and here http://www.3scale.net/2012/07/how-to-out-of-the-box-api-analytics/ note the changes (we are using only v1, as authentication will go through the proxy). Now you can deploy the app by issuing: bundle install. Now you can start the thin server: thin start. To access the API directly (i.e. without any security or access control) access: your-public-dns:3000/v1/words/awesome.json (you can find your-public-dns in the AWS EC2 Dashboard->Instances in the details window of your instance) For the Nginx integration you will have to create an elastic IP address. Inside the AWS EC2 dashboard create an elastic IP in the same region as your instance and associate that IP to it (you won’t have to pay anything for the elastic IP as long as it is associated with your instance in the same region). OPTIONAL: If you want to assign a custom domain to your amazon instance you will have to do one thing: add an A record to the DNS record of your domain mapping the domain to the elastic IP address you have previously created. Your domain provider should either give you some way to set the A record (the IPv4 address), or it will give you a way to edit the nameservers of your domain. If they do not allow you to set the A record directly, find a DNS management service, register your domain as a zone there and the service will give you the nameservers to enter in the admin panel of your domain provider. You can then add the A record for the domain. Some possible DNS management services include ZoneEdit (basic, free), Amazon route 53, etc. At this point you API is open to the world. This is good and bad – great that you are sharing, but bad in the sense that without rate limits a few apps could kill the resources of your server, and you have no insight into who is using your API and how it is being used. The solution is to add some management for your API… Enabling API Management with 3scale Rather than reinvent the wheel and implement rate limits, access controls and analytics from scratch we will leverage the handy 3scale API Management service. Get your free 3scale account, activate and log-in to the new instance through the provided links. The first time you log-in you can choose the option for some sample data to be created, so you will have some API keys to use later. Next you would probably like to go through the tour to get a glimpse on the system functionality (optional) and then start with the implementation. To get some instant results we will start with the sandbox proxy which can be used while in development. Then we will also configure an Nginx proxy which can scale up for full production deployments. There is some documentation on the configuration of the API proxy at 3scale: https://support.3scale.net/howtos/api-configuration/nginx-proxy and for more advanced configuration options here: https://support.3scale.net/howtos/api-configuration/nginx-proxy-advanced Once you sign into your 3scale account, Launch your API on the main Dashboard screen or Go to API->Select the service (API)->Integration in the sidebar->Proxy Set the address of of your API backend – this has to be the Elastic IP address unless the custom domain has been set, including http protocol and port 3000. Now you can save and turn on the sandbox proxy to test your API by hitting the sandbox endpoint (after creating some app credentials in 3scale): http://sandbox-endpoint/v1/words/awesome.json?app_id=APP_ID&app_key=APP_KEY where, APP_ID and APP_KEY are id and key of one of the sample applications which you created when you first logged into your 3scale account (if you missed that step just create a developer account and an application within that account). Try it without app credentials, next with incorrect credentials, and then once authenticated within and over any rate limits that you have defined. Only once it is working to your satisfaction do you need to download the config files for Nginx. Note: any time you have errors check whether you can access the API directly: your-public-dns:3000/v1/words/awesome.json. If that is not available, then you need to check if the AWS instance is running and if the Thin Server is running on the instance. Implement an Nginx Proxy for Access Control In order to streamline this step we recommend that you install the fantastic OpenResty web application that is basically a bundle of the standard Nginx core with almost all the necessary 3rd party Nginx modules built-in. Install dependencies: sudo apt-get install libreadline-dev libncurses5-dev libpcre3-dev perl Compile and install Nginx: cd ~ sudo wget http://agentzh.org/misc/nginx/ngx_openresty-1.2.3.8.tar.gz sudo tar -zxvf ngx_openresty-1.2.3.8.tar.gz cd ngx_openresty-1.2.3.8/ ./configure --prefix=/opt/openresty --with-luajit --with-http_iconv_module -j2 make sudo make install In the config file make the following changes: edit the .conf file from nginx download in line 28, which is preceded by info to change your server name put the correct domain (of your Elastic IP or custom domain name) in line 78 change the path to the .lua file, downloaded together with the .conf file. We are almost finished! Our last step is to start the NGINX proxy and put some traffic through it. If it is not running yet (remember, that thin server has to be started first), please go to your EC2 instance terminal (the one you were connecting through ssh before) and start it now: sudo /opt/openresty/nginx/sbin/nginx -p /opt/openresty/nginx/ -c /opt/openresty/nginx/conf/YOUR-CONFIG-FILE.conf The last step will be verifying that the traffic goes through with a proper authorization. To do that, access: http://your-public-dns/v1/words/awesome.json?app_id=APP_ID&app_key=APP_KEY where, APP_ID and APP_KEY are key and id of the application you want to access through the API call. Once everything is confirmed as working correctly, you will want to block public access to the API backend on port 3000, which bypasses any access controls. If encounter some problems with the Nginx configuration or need a more detailed guide, I encourage you to check the 3scale guide on configuring Nginx proxy: https://support.3scale.net/howtos/api-configuration/nginx-proxy. You can go completely wild with customization of your API gateway. If you want to dive more into the 3scale system configuration (like usage and monitoring of your API traffic) feel encouraged to browse our Quickstart guides and HowTo’s.
February 4, 2013
by Steven Willmott
· 17,798 Views
article thumbnail
Sending Keystrokes to Other Apps with Windows API and C#
Recently I had to tackle a task where I needed to send keystrokes to another application, that are initiated from a .NET Windows app.
February 1, 2013
by Denzel D.
· 48,389 Views · 1 Like
article thumbnail
JDBC Realm and Form Based Authentication with GlassFish 3.1.2.2 and Primefaces 3.4
One of the most popular posts on my blog is the short tutorial about the JDBC Security Realm and form based Authentication on GlassFish with Primefaces. After I received some comments about it that it isn't any longer working with latest GlassFish 3.1.2.2 I thought it might be time to revisit it and present an updated version. Here we go: Preparation As in the original tutorial I am going to rely on some stuff. Make sure to have a recent NetBeans 7.3 beta2 (which includes GlassFish 3.1.2.2) and the MySQL Community Server (5.5.x) installed. You should have verified that everything is up an running and that you can start GlassFish and the MySQL Server also is started. Some Basics A GlassFish authentication realm, also called a security policy domain or security domain, is a scope over which the GlassFish Server defines and enforces a common security policy. GlassFish Server is preconfigured with the file, certificate, and administration realms. In addition, you can set up LDAP, JDBC, digest, Oracle Solaris, or custom realms. An application can specify which realm to use in its deployment descriptor. If you want to store the user credentials for your application in a database your first choice is the JDBC realm. Prepare the Database Fire up NetBeans and switch to the Services tab. Right click the "Databases" node and select "Register MySQL Server". Fill in the details of your installation and click "ok". Right click the new MySQL node and select "connect". Now you see all the already available databases. Right click again and select "Create Database". Enter "jdbcrealm" as the new database name. Remark: We're not going to do all that with a separate database user. This is something that is highly recommended but I am using the root user in this examle. If you have a user you can also grant full access to it here. Click "ok". You get automatically connected to the newly created database. Expand the bold node and right click on "Tables". Select "Execute Command" or enter the table details via the wizard. CREATE TABLE USERS ( `USERID` VARCHAR(255) NOT NULL, `PASSWORD` VARCHAR(255) NOT NULL, PRIMARY KEY (`USERID`) ); CREATE TABLE USERS_GROUPS ( `GROUPID` VARCHAR(20) NOT NULL, `USERID` VARCHAR(255) NOT NULL, PRIMARY KEY (`GROUPID`) ); That is all for now with the database. Move on to the next paragraph. Let GlassFish know about MySQL First thing to do is to get the latest and greatest MySQL Connector/J from the MySQL website which is 5.1.22 at the time of writing this. Extract the mysql-connector-java-5.1.22-bin.jar file and drop it into your domain folder (e.g. glassfish\domains\domain1\lib). Done. Now it is finally time to create a project. Basic Project Setup Start a new maven based web application project. Choose "New Project" > "Maven" > Web Application and hit next. Now enter a name (e.g. secureapp) and all the needed maven cordinates and hit next. Choose your configured GlassFish 3+ Server. Select Java EE 6 Web as your EE version and hit "Finish". Now we need to add some more configuration to our GlassFish domain.Right click on the newly created project and select "New > Other > GlassFish > JDBC Connection Pool". Enter a name for the new connection pool (e.g. SecurityConnectionPool) and underneath the checkbox "Extract from Existing Connection:" select your registered MySQL connection. Click next. review the connection pool properties and click finish. The newly created Server Resources folder now shows your sun-resources.xml file. Follow the steps and create a "New > Other > GlassFish > JDBC Resource" pointing the the created SecurityConnectionPool (e.g. jdbc/securityDatasource).You will find the configured things under "Other Sources / setup" in a file called glassfish-resources.xml. It gets deployed to your server together with your application. So you don't have to care about configuring everything with the GlassFish admin console.Additionally we still need Primefaces. Right click on your project, select "Properties" change to "Frameworks" category and add "JavaServer Faces". Switch to the Components tab and select "PrimeFaces". Finish by clicking "OK". You can validate if that worked by opening the pom.xml and checking for the Primefaces dependency. 3.4 should be there. Feel free to change the version to latest 3.4.2. Final GlassFish Configuration Now it is time to fire up GlassFish and do the realm configuration. In NetBeans switch to the "Services" tab again and right click on the "GlassFish 3+" node. Select "Start" and watch the Output window for a successful start. Right click again and select "View Domain Admin Console", which should open your default browser pointing you to http://localhost:4848/. Select "Configurations > server-config > Security > Realms" and click "New..." on top of the table. Enter a name (e.g. JDBCRealm) and select the com.sun.enterprise.security.auth.realm.jdbc.JDBCRealm from the drop down. Fill in the following values into the textfields: JAAS jdbcRealm JNDI jdbc/securityDatasource User Table users User Name Column username Password Column password Group Table groups Group Name Column groupname Leave all the other defaults/blanks and select "OK" in the upper right corner. You are presented with a fancy JavaScript warning window which tells you to _not_ leave the Digest Algorithm Field empty. I field a bug about it. It defaults to SHA-256. Which is different to GlassFish versions prior to 3.1 which used MD5 here. The older version of this tutorial didn't use a digest algorithm at all ("none"). This was meant to make things easier but isn't considered good practice at all. So, let's stick to SHA-256 even for development, please. Secure your application Done with configuring your environment. Now we have to actually secure the application. First part is to think about the resources to protect. Jump to your Web Pages folder and create two more folders. One named "admin" and another called "users". The idea behind this is, to have two separate folders which could be accessed by users belonging to the appropriate groups. Now we have to create some pages. Open the Web Pages/index.xhtml and replace everything between the h:body tags with the following: Select where you want to go: Now add a new index.xhtml to both users and admin folders. Make them do something like this: Hello Admin|User On to the login.xhtml. Create it with the following content in the root of your Web Pages folder. Username: Password: As you can see, whe have the basic Primefaces p:panel component which has a simple html form which points to the predefined action j_security_check. This is, where all the magic is happening. You also have to include two input fields for username and password with the predefined names j_username and j_password. Now we are going to create the loginerror.xhtml which is displayed, if the user did not enter the right credentials. (use the same DOCTYPE and header as seen in the above example). Sorry, you made an Error. Please try again: Login The only magic here is the href link of the Login anchor. We need to get the correct request context and this could be done by accessing the faces context. If a user without the appropriate rights tries to access a folder he is presented a 403 access denied error page. If you like to customize it, you need to add it and add the following lines to your web.xml: 403 /faces/403.xhtml That snippet defines, that all requests that are not authorized should go to the 403 page. If you have the web.xml open already, let's start securing your application. We need to add a security constraint for any protected resource. Security Constraints are least understood by web developers, even though they are critical for the security of Java EE Web applications. Specifying a combination of URL patterns, HTTP methods, roles and transport constraints can be daunting to a programmer or administrator. It is important to realize that any combination that was intended to be secure but was not specified via security constraints, will mean that the web container will allow those requests. Security Constraints consist of Web Resource Collections (URL patterns, HTTP methods), Authorization Constraint (role names) and User Data Constraints (whether the web request needs to be received over a protected transport such as TLS). Admin Pages Protected Admin Area /faces/admin/* GET POST HEAD PUT OPTIONS TRACE DELETE admin NONE All Access None Protected User Area /faces/users/* GET POST HEAD PUT OPTIONS TRACE DELETE NONE If the constraints are in place you have to define, how the container should challenge the user. A web container can authenticate a web client/user using either HTTP BASIC, HTTP DIGEST, HTTPS CLIENT or FORM based authentication schemes. In this case we are using FORM based authentication and define the JDBCRealm FORM JDBCRealm /faces/login.xhtml /faces/loginerror.xhtml The realm name has to be the name that you assigned the security realm before. Close the web.xml and open the sun-web.xml to do a mapping from the application role-names to the actual groups that are in the database. This abstraction feels weird, but it has some reasons. It was introduced to have the option of mapping application roles to different group names in enterprises. I have never seen this used extensively but the feature is there and you have to configure it. Other appservers do make the assumption that if no mapping is present, role names and group names do match. GlassFish doesn't think so. Therefore you have to put the following into the glassfish-web.xml. You can create it via a right click on your project's WEB-INF folder, selecting "New > Other > GlassFish > GlassFish Descriptor" admin admin hat was it _basically_ ... everything you need is in place. The only thing that is missing are the users in the database. It is still empty ...We need to add a test user: Adding a Test-User to the Database And again we start by right clicking on the jdbcrealm database on the "Services" tab in NetBeans. Select "Execute Command" and insert the following: INSERT INTO USERS VALUES ("admin", "8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918"); INSERT INTO USERS_GROUPS VALUES ("admin", "admin"); You can login with user: admin and password: admin and access the secured area. Sample code to generate the hash could look like this: try { MessageDigest md = MessageDigest.getInstance("SHA-256"); String text = "admin"; md.update(text.getBytes("UTF-8")); // Change this to "UTF-16" if needed byte[] digest = md.digest(); BigInteger bigInt = new BigInteger(1, digest); String output = bigInt.toString(16); System.out.println(output); } catch (NoSuchAlgorithmException | UnsupportedEncodingException ex) { Logger.getLogger(PasswordTest.class.getName()).log(Level.SEVERE, null, ex); } Have fun securing your apps and keep the questions coming! In case you need it, the complete source code is on https://github.com/myfear/JDBCRealmExample
January 29, 2013
by Markus Eisele
· 39,375 Views · 1 Like
  • Previous
  • ...
  • 254
  • 255
  • 256
  • 257
  • 258
  • 259
  • 260
  • 261
  • 262
  • 263
  • ...
  • Next
  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook
×