The Latest Languages Topics

WSDLToJava Error: Rpc/Encoded WSDLs Are Not Supported with CXF
RPC/encoded is a vestige from before SOAP objects were defined with XML Schema, and it is not widely supported anymore. You will need to generate the stubs using Apache Axis 1.x, which is from the same era:

    java org.apache.axis.wsdl.WSDL2Java http://someurl?WSDL

You will need the following jars, or equivalents, on the classpath (the -cp parameter):

    axis-1.4.jar
    commons-logging-1.1.jar
    commons-discovery-0.2.jar
    jaxrpc-1.1.jar
    saaj-1.1.jar
    wsdl4j-1.4.jar
    activation-1.1.jar
    mail-1.4.jar

This will generate stubs similar to those produced by wsimport. Alternatively, if you are not using the parts of the schema that require rpc/encoded, you can download a copy of the WSDL and comment out those bits, then run wsimport against the local file. If you look at the WSDL, the following bits are using rpc/encoded:

Sources
1. http://bitkickers.blogspot.com/2008/12/rpcencoded-web-services-on-java-16.html
2. http://stackoverflow.com/questions/412772/java-rpc-encoded-wsdls-are-not-supported-in-jaxws-2-0
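The classpath and command line described above can be assembled programmatically. The following is only a sketch: the class name `Wsdl2JavaCommand` and the idea that all jars sit in a single directory are assumptions for illustration, not part of the original article.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the "java -cp ... WSDL2Java <url>" command line.
final class Wsdl2JavaCommand {
    // Jar names taken from the article's list; their location on disk is an assumption.
    static final List<String> JARS = Arrays.asList(
        "axis-1.4.jar", "commons-logging-1.1.jar", "commons-discovery-0.2.jar",
        "jaxrpc-1.1.jar", "saaj-1.1.jar", "wsdl4j-1.4.jar",
        "activation-1.1.jar", "mail-1.4.jar");

    static List<String> build(String libDir, String wsdlUrl) {
        // Join the jars with the platform's classpath separator (":" or ";").
        StringBuilder classpath = new StringBuilder();
        for (String jar : JARS) {
            if (classpath.length() > 0) {
                classpath.append(File.pathSeparator);
            }
            classpath.append(libDir).append(File.separator).append(jar);
        }
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classpath.toString());
        command.add("org.apache.axis.wsdl.WSDL2Java");
        command.add(wsdlUrl);
        return command;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(command).inheritIO().start()` if you want to run the generator from a build script.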
June 12, 2013
by Singaram Subramanian
· 40,462 Views · 9 Likes
Using SSH.NET
I’ve recently had the need to automate configuration of Nginx on an Ubuntu server. Of course, in UNIX land we like to use SSH (Secure Shell) to log into our servers and manage them remotely. Wouldn’t it be nice, I thought, if there was a managed SSH library somewhere so that I could automate logging onto my Ubuntu server, run various commands and transfer files. A short Google turned up SSH.NET by the somewhat mysterious Olegkap (at least I couldn’t find out anything else about them) which turned out to be just what I wanted. Here’s the blurb on the CodePlex site: “This project was inspired by Sharp.SSH library which was ported from java and it seems like was not supported for quite some time. This library is complete rewrite using .NET 4.0, without any third party dependencies and to utilize the parallelism as much as possible to allow best performance I can get.” It does exactly what it says on the tin. It’s on NuGet, so you can grab it with: PM> Install-Package SSH.NET Here’s how you run a remote command. 
First you need to build a ConnectionInfo object:

    public ConnectionInfo CreateConnectionInfo()
    {
        const string privateKeyFilePath = @"C:\some\private\key.pem";
        ConnectionInfo connectionInfo;
        using (var stream = new FileStream(privateKeyFilePath, FileMode.Open, FileAccess.Read))
        {
            var privateKeyFile = new PrivateKeyFile(stream);
            AuthenticationMethod authenticationMethod =
                new PrivateKeyAuthenticationMethod("ubuntu", privateKeyFile);

            connectionInfo = new ConnectionInfo(
                "my.server.com",
                "ubuntu",
                authenticationMethod);
        }
        return connectionInfo;
    }

Then you simply create an SshClient instance and run commands:

    public void Connect()
    {
        using (var ssh = new SshClient(CreateConnectionInfo()))
        {
            ssh.Connect();
            var command = ssh.CreateCommand("uptime");
            var result = command.Execute();
            Console.Out.WriteLine(result);
            ssh.Disconnect();
        }
    }

Here I’m running the ‘uptime’ command, which output this when I ran it just now:

    14:37:46 up 22 days, 3:59, 0 users, load average: 0.08, 0.03, 0.05

To transfer a file, just use the ScpClient:

    public void GetConfigurationFiles()
    {
        using (var scp = new ScpClient(CreateConnectionInfo()))
        {
            scp.Connect();
            scp.Download("/etc/nginx/", new DirectoryInfo(@"D:\Temp\ScpDownloadTest"));
            scp.Disconnect();
        }
    }

This grabs all my Nginx configuration and transfers it to a directory tree on my Windows machine. All in all, a very nice little library that’s been working well for me so far. Give it a try if you need to interact with a UNIX-like machine from .NET code.
June 9, 2013
by Mike Hadlow
· 30,976 Views
IndexedDB and Date Example
About an hour ago I gave a presentation on IndexedDB. One of the attendees asked about dates and being able to filter based on a date range. I told him that my assumption was that you would need to convert the dates into numbers and use a number-based range. Turns out I was wrong. Here is an example.

I began by creating an object store that used an index on the created field. Since our intent is to search via a date field, I decided "created" would be a good name. I also named my object store "data". Boring, but it works.

    var openRequest = indexedDB.open("idbpreso_date1", 1);

    openRequest.onupgradeneeded = function(e) {
        var thisDb = e.target.result;
        if (!thisDb.objectStoreNames.contains("data")) {
            var os = thisDb.createObjectStore("data", {autoIncrement: true});
            os.createIndex("created", "created", {unique: false});
        }
    }

Next, I built a simple way to seed data. I used a button click event to add 10 objects. Each object has one property, created, and its Date value is a random date between now and 7 days in the future.

    function doSeed() {
        var now = new Date();
        for (var i = 0; i < 10; i++) {
            var dayDiff = getRandomInt(1, 7);
            var thisDate = new Date();
            thisDate.setDate(now.getDate() + dayDiff);
            db.transaction(["data"], "readwrite").objectStore("data").add({created: thisDate});
        }
    }

    // credit: Mozilla Developer Center
    function getRandomInt(min, max) {
        return Math.floor(Math.random() * (max - min + 1)) + min;
    }

Note that since IndexedDB calls are asynchronous, my code should really update the user to let them know when the operation is done. Since this is just a quick demo though, and since that add operation will complete incredibly fast, I decided not to worry about it.

So at this point we have an application that lets us add data containing a created property with a valid JavaScript Date. Note that I didn't change it to milliseconds. I just passed it in as is.

For the final portion I added two date fields on my page. In Chrome these are rendered nicely. Based on these, I can then create an IndexedDB key range using bound, lowerBound, or upperBound, i.e., give me data either after a date, before a date, or inside a date range.

    function doSearch() {
        var fromDate = document.querySelector("#fromdate").value;
        var toDate = document.querySelector("#todate").value;
        var range;

        if (fromDate == "" && toDate == "") return;

        var transaction = db.transaction(["data"], "readonly");
        var store = transaction.objectStore("data");
        var index = store.index("created");

        // Turn the user input into real Date objects
        if (fromDate != "") fromDate = new Date(fromDate);
        if (toDate != "") toDate = new Date(toDate);

        if (fromDate != "" && toDate != "") {
            range = IDBKeyRange.bound(fromDate, toDate);
        } else if (fromDate == "") {
            range = IDBKeyRange.upperBound(toDate);
        } else {
            range = IDBKeyRange.lowerBound(fromDate);
        }

        var s = "";
        index.openCursor(range).onsuccess = function(e) {
            var cursor = e.target.result;
            if (cursor) {
                s += "key " + cursor.key + "";
                for (var field in cursor.value) {
                    s += field + "=" + cursor.value[field] + "";
                }
                s += "";
                cursor.continue();
            }
            document.querySelector("#status").innerHTML = s;
        }
    }

The only conversion required here was to take the user input and turn it into "real" Date objects. Once done, everything works great.
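The reason no numeric conversion is needed is that dates already have a natural chronological order, so any ordered index can answer range queries on them directly. As an illustration outside the browser, here is a minimal Java sketch in which a `TreeMap` stands in for the "created" index. The class and method names are hypothetical, and unlike the IndexedDB index above this stand-in allows only one record per date.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for an index keyed by date.
final class DateRangeSketch {
    private final NavigableMap<LocalDate, String> index = new TreeMap<>();

    void add(LocalDate created, String record) {
        index.put(created, record);
    }

    // Either bound may be null, mirroring lowerBound, upperBound and bound.
    List<String> search(LocalDate from, LocalDate to) {
        NavigableMap<LocalDate, String> view;
        if (from != null && to != null) {
            view = index.subMap(from, true, to, true);   // inside a date range
        } else if (from != null) {
            view = index.tailMap(from, true);            // on or after a date
        } else if (to != null) {
            view = index.headMap(to, true);              // on or before a date
        } else {
            view = index;                                // no filter at all
        }
        return new ArrayList<>(view.values());
    }
}
```

As with the key ranges in the article, both ends are inclusive here; the boolean arguments control that in the Java API.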
June 7, 2013
by Raymond Camden
· 7,309 Views
Asynchronous logging using Log4j, ActiveMQ and Spring
My team and I are creating a services platform based on a set of RESTful JSON services, where each service contributes to the platform by providing distinct features and/or data. With logs being generated all over the place, we thought it was a good idea to centralize logging and perhaps also provide a rudimentary log viewer that allows us to view, filter, sort and search our logs. We also wanted our logging to be asynchronous, as we didn’t want our services to be held up while writing logs, say, directly to a database.

The strategy for achieving this was straightforward:

1. Set up ActiveMQ.
2. Create a log4j appender that writes logs to the queue (log4j ships with one such appender, but let’s write our own).
3. Write a message listener that reads logs from a JMS queue set up on an MQ server and persists them.

Let’s take a look at these one by one.

Setup ActiveMQ

Setting up an external ActiveMQ server is simple enough. A great tutorial for setting it up on Ubuntu is available at http://servicebus.blogspot.com/2011/02/installing-apache-active-mq-on-ubuntu.html. You can also choose to embed a message broker within your application; Spring makes this easy. We will see how later.

Creating a Log4j JMS appender

First, we create a log4j JMS appender. log4j ships with one such appender, but it writes to a JMS topic instead of a queue.

    import javax.jms.DeliveryMode;
    import javax.jms.Destination;
    import javax.jms.MessageProducer;
    import javax.jms.ObjectMessage;
    import javax.jms.Session;

    import org.apache.activemq.ActiveMQConnectionFactory;
    import org.apache.log4j.Appender;
    import org.apache.log4j.AppenderSkeleton;
    import org.apache.log4j.Logger;
    import org.apache.log4j.PatternLayout;
    import org.apache.log4j.spi.LoggingEvent;

    /**
     * JMSQueue appender is a log4j appender that writes LoggingEvent to a queue.
     * @author faheem
     *
     */
    public class JMSQueueAppender extends AppenderSkeleton implements Appender {

        private static Logger logger = Logger.getLogger("JMSQueueAppender");

        private String brokerUri;
        private String queueName;

        @Override
        public void close() {
        }

        @Override
        public boolean requiresLayout() {
            return false;
        }

        @Override
        protected synchronized void append(LoggingEvent event) {
            try {
                ActiveMQConnectionFactory connectionFactory = new ActiveMQConnectionFactory(
                        this.brokerUri);

                // Create a Connection
                javax.jms.Connection connection = connectionFactory.createConnection();
                connection.start();

                // Create a Session
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

                // Create the destination (Topic or Queue)
                Destination destination = session.createQueue(this.queueName);

                // Create a MessageProducer from the Session to the Topic or Queue
                MessageProducer producer = session.createProducer(destination);
                producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);

                ObjectMessage message = session.createObjectMessage(new LoggingEventWrapper(event));

                // Tell the producer to send the message
                producer.send(message);

                // Clean up
                session.close();
                connection.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public void setBrokerUri(String brokerUri) {
            this.brokerUri = brokerUri;
        }

        public String getBrokerUri() {
            return brokerUri;
        }

        public void setQueueName(String queueName) {
            this.queueName = queueName;
        }

        public String getQueueName() {
            return queueName;
        }
    }

Let’s see what’s happening here.

Line 19: We implement the log4j Appender interface, which asks us to implement three methods: requiresLayout, close and append. We will keep things simple for the moment and implement only the append method, which gets called whenever a call to the logger is made.

Line 37: log4j calls the append method and passes a LoggingEvent object as a parameter, which represents a call to a logger. A LoggingEvent object encapsulates all the information about a log item.
Line 41 & 42: Create a new connection factory by providing it with the URI of a JMS server, in our case ActiveMQ.

Line 45, 46 and 49: We establish a connection and a session to the JMS server. A Session can be opened in several modes. An Auto_Acknowledge session is one in which the acknowledgment of a message happens automatically. Other modes include Client_Acknowledge, in which a client has to explicitly acknowledge receipt and/or processing of a message, and two other modes. For details, refer to the docs at http://download.oracle.com/javaee/1.4/api/javax/jms/Session.html

Line 52: Create a queue, passing the name of the queue to connect to as a parameter.

Line 56: We set the delivery mode to Non_Persistent. The other option is Persistent, where the message is written to a persistent store. Persistent mode slows delivery down but adds reliability to the message transfer.

Line 58: We are doing multiple things here. First of all, I am wrapping the LoggingEvent object in a LoggingEventWrapper. This is because there are some properties within the LoggingEvent object that are not serializable, and also because I want to capture some additional information such as IP address and host name. Next, using the JMS session object, I prepare an object (the wrapper) for transport.

Line 61: I send the object to the queue.

Below is the code for the wrapper.

    import java.io.Serializable;
    import java.net.InetAddress;
    import java.net.UnknownHostException;

    import org.apache.log4j.EnhancedPatternLayout;
    import org.apache.log4j.spi.LoggingEvent;

    /**
     * Logging Event Wraps a log4j LoggingEvent object. Wrapping is required because some information is lost
     * when the LoggingEvent is serialized. The idea is to extract all information required from the LoggingEvent
     * object, place it in the wrapper and then serialize the LoggingEventWrapper. This way all required data remains
     * available to us.
     * @author faheem
     *
     */
    public class LoggingEventWrapper implements Serializable {

        private static final String ENHANCED_PATTERN_LAYOUT = "%throwable";
        private static final long serialVersionUID = 3281981073249085474L;

        private LoggingEvent loggingEvent;
        private Long timeStamp;
        private String level;
        private String logger;
        private String message;
        private String detail;
        private String ipAddress;
        private String hostName;

        public LoggingEventWrapper(LoggingEvent loggingEvent) {
            this.loggingEvent = loggingEvent;

            // Format event and set detail field
            EnhancedPatternLayout layout = new EnhancedPatternLayout();
            layout.setConversionPattern(ENHANCED_PATTERN_LAYOUT);
            this.detail = layout.format(this.loggingEvent);
        }

        public Long getTimeStamp() {
            return this.loggingEvent.timeStamp;
        }

        public String getLevel() {
            return this.loggingEvent.getLevel().toString();
        }

        public String getLogger() {
            return this.loggingEvent.getLoggerName();
        }

        public String getMessage() {
            return this.loggingEvent.getRenderedMessage();
        }

        public String getDetail() {
            return this.detail;
        }

        public LoggingEvent getLoggingEvent() {
            return loggingEvent;
        }

        public String getIpAddress() {
            try {
                return InetAddress.getLocalHost().getHostAddress();
            } catch (UnknownHostException e) {
                return "Could not determine IP";
            }
        }

        public String getHostName() {
            try {
                return InetAddress.getLocalHost().getHostName();
            } catch (UnknownHostException e) {
                return "Could not determine Host Name";
            }
        }
    }

The Message Listener

The message listener “listens” to the queue (or topic). Whenever a new message is added to the queue, the onMessage method is called.
    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.ObjectMessage;

    import org.apache.log4j.Logger;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Component;

    @Component
    public class LogQueueListener implements MessageListener {

        public static Logger logger = Logger.getLogger(LogQueueListener.class);

        @Autowired
        private ILoggingService loggingService;

        public void onMessage(final Message message) {
            if (message instanceof ObjectMessage) {
                try {
                    final LoggingEventWrapper loggingEventWrapper =
                            (LoggingEventWrapper) ((ObjectMessage) message).getObject();
                    loggingService.saveLog(loggingEventWrapper);
                } catch (final JMSException e) {
                    logger.error(e.getMessage(), e);
                } catch (Exception e) {
                    logger.error(e.getMessage(), e);
                }
            }
        }
    }

Line 23: Check whether the object being picked off the queue is an instance of ObjectMessage.

Line 26: Extract the LoggingEventWrapper from the Message.

Line 27: Call a service method to persist the log.

Wiring up in Spring

Lines 5-9: Use the broker tag to set up an embedded message broker. Since I am using an external one, I don’t need it.

Line 12: Mention the name of the queue you want to connect to.

Line 14: URI of the broker server.

Line 15-19: Connection factory setup.

Line 26-28: Message listener setup, where we specify the number of concurrent threads that can consume messages off the queue.

Of course, the above example will not work out of the box. You still have to include all JMS dependencies and implement the service that persists logs. But I hope it gives you a decent idea.
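The producer/consumer split behind this design can be tried out without a broker. The sketch below is not the article's implementation: an in-memory `BlockingQueue` stands in for the JMS queue, a list stands in for the persistence service, and the class name `AsyncLogSketch` and its methods are hypothetical. It only shows why the caller is never held up by storage.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory analogue of "appender -> queue -> listener -> store".
final class AsyncLogSketch {
    private static final String STOP = "\u0000STOP"; // sentinel that ends the consumer

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> store = new CopyOnWriteArrayList<>();
    private final Thread consumer;

    AsyncLogSketch() {
        // Plays the role of the message listener: takes messages and persists them.
        consumer = new Thread(() -> {
            try {
                for (String msg = queue.take(); !msg.equals(STOP); msg = queue.take()) {
                    store.add(msg); // stand-in for a call such as saveLog(...)
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
    }

    // Plays the role of the appender: enqueues and returns immediately.
    void log(String message) {
        queue.offer(message);
    }

    // Lets the consumer drain everything queued so far, then stops it.
    List<String> shutdown() {
        queue.offer(STOP);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return store;
    }
}
```

A real broker adds what this sketch lacks: the queue survives across processes and machines, and several listeners can consume concurrently.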
June 7, 2013
by Faheem Sohail
· 16,697 Views · 2 Likes
OCEJWCD (SCWCD 6) Web Component Developer Certification Exam
Oracle offers two certifications for web component developers: one for Java EE 5 and another for Java EE 6.
June 6, 2013
by Kate Wilson
· 72,984 Views · 1 Like
Serialization and injection
Serialization is a form of persistence: serialized data survives the process and the RAM where it was created, and can be reconstituted inside different processes and machines that live in a different time or place. Sometimes serialization is in fact a poor form of persistence, one that confuses the boundary between the different schemas the data can fit in.

However, what I have found useful in the last years of development is to institute a strict separation: serialize Value Objects, Entities, and everything that represents the state of the application. Meanwhile, use Dependency Injection for services that are part of a larger object graph, and never serialize this second kind of object.

In the discussion that follows, I make the assumption that serialization and deserialization occur on the same machine (e.g. for web-oriented sessions).

The problem with serialization, which works transparently most of the time, arises when you need to serialize service objects instead of limiting the procedure to data structures. How can you store such objects?

Not options

Some options to solve this problem are really not options. Serialization by itself will fail because of the staleness of the references contained in these objects. For example, in PHP, trying to serialize a database connection composed by a Repository or DAO object will rightly fail with an exception.

Whenever an object represents a resource of the current machine, it usually cannot be serialized, except when the only resource involved is RAM. If the resource is disk space or another running process such as a database daemon, the reconstitution of the object in another place and time will fail, and it's best to stop the developer immediately during storage.

Quasi-options

Some solutions to the problem try to avoid staleness by serializing objects without their resources and making them grab a fresh version of them on deserialization.
In PHP, for example, this can be done with the __sleep() and __wakeup() magic methods, called automatically during serialization and deserialization respectively. This deserialization mechanism introduces a dependency from the serialized Entity to external services: such a dependency is already in place when building the object the first time (passing the XService in the constructor), but it is aggravated when deserializing (depending on an XServiceFactory instead of just an XService).

An improvement, from the dependency point of view, is to reattach collaborators to deserialized objects as you would for other persistence-related tasks. For example, EntityRepository can inject the missing pieces of Entity every time its find() method is called.

However, there is still another option, which is the most resilient from the modelling point of view and not only from that of dependency management: injecting non-serializable collaborators through the stack. Objects can collaborate even without keeping field references to each other, and injecting dependencies as parameters moves the starting point of the dependency from the server to the client object (which may or may not be desirable). What is most important is that Entities are relieved of having to manage external references in any context, not only that of persistence and, in particular, serialization.

The metaphor for the 3rd option

Misko Hevery likes to say: have you ever seen a credit card able to charge itself? If a CreditCard is an Entity in your domain, it would be very strange to keep a wire attached to your wallet wherever you go.

With the first option, you have the card spring a wire when it is taken out of the wallet, like in horror movies. This intelligent cable tries its best to attach to the nearest Point of Sale (a bad case of Bluetooth, I think). With Repositories in mind, you're not dealing with automated wires anymore, but you're still attaching cables between cards and fixed devices.
In reality, cards collaborate with the PoS in a fast process that does not last more than a few seconds. Actually, sometimes they don't touch it at all, as in all Internet-based purchases.

Keeping services around to deal with external dependencies does not mean the API of your Domain Model has to be biased towards service objects:

    pos.charge(creditCard);
    // can equivalently be:
    creditCard.chargeOn(pos);

This is a form of Double Dispatch, since there are two objects collaborating and you can dispatch (send messages) to both, gaining polymorphism by substituting either object. The sequence of calls is:

    client -> creditCard -> pos

The client object still looks at CreditCard as a behaviorally complete object, but it is clear which dependency is necessary to run each use case (each CreditCard method). You can persist a CreditCard easily and send it over the wire to caches or databases. When the time comes to charge, it is the client that has to bring forward a service able to connect to a bank.
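To make injection through the stack concrete, here is a minimal Java sketch. The article discusses PHP, so this is only an analogue: the `PointOfSale` interface, the amount parameter and the `copyViaSerialization` helper are assumptions added for illustration. The point is that the entity holds only state, so it serializes cleanly, while the service arrives as a method argument and is never stored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Entity: pure state, safe to serialize.
final class CreditCard implements Serializable {
    private static final long serialVersionUID = 1L;

    // Service: would wrap a live connection in a real system, so it is never a field.
    interface PointOfSale {
        boolean charge(String cardNumber, long cents);
    }

    private final String number;
    private long chargedCents;

    CreditCard(String number) {
        this.number = number;
    }

    // The collaborator is injected through the stack, for this call only.
    boolean chargeOn(PointOfSale pos, long cents) {
        boolean accepted = pos.charge(number, cents);
        if (accepted) {
            chargedCents += cents;
        }
        return accepted;
    }

    long chargedCents() {
        return chargedCents;
    }

    // Round-trips the card through Java serialization (e.g. a session store).
    CreditCard copyViaSerialization() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(this);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CreditCard) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After deserialization the card needs no rewiring: the next caller simply passes whatever `PointOfSale` it has at hand, for example `card.chargeOn((number, cents) -> true, 1200)`.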
June 5, 2013
by Giorgio Sironi
· 7,222 Views
Log Scraping
A quick Java snippet for log scraping:

    package com.agilemobiledeveloper.logcheck;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;

    import com.jcraft.jsch.Channel;
    import com.jcraft.jsch.ChannelSftp;
    import com.jcraft.jsch.JSch;
    import com.jcraft.jsch.Session;

    /**
     *
     * @author spannt
     *
     */
    public class LogScraper {

        /**
         * @param args
         */
        public static void main(String[] args) {
            String SFTPHOST = "myunixsite.com";
            int SFTPPORT = 22;
            String SFTPUSER = "myunixid";
            String SFTPPASS = "myunixpassword";
            String SFTPWORKINGDIR = "/some/unix/directory";
            String SERRORFILE = "SystemErr.log";
            String SOUTFILE = "SystemOut.log";

            Session session = null;
            Channel channel = null;
            ChannelSftp channelSftp = null;
            StringBuilder out = new StringBuilder();

            try {
                JSch jsch = new JSch();
                session = jsch.getSession(SFTPUSER, SFTPHOST, SFTPPORT);
                session.setPassword(SFTPPASS);
                java.util.Properties config = new java.util.Properties();
                config.put("StrictHostKeyChecking", "no");
                session.setConfig(config);
                session.connect();
                channel = session.openChannel("sftp");
                channel.connect();
                channelSftp = (ChannelSftp) channel;
                channelSftp.cd(SFTPWORKINGDIR);

                System.out.println("Error File");
                out.append("Error File:").append(
                        LogScraper.parseStream(channelSftp.get(SERRORFILE)));

                System.out.println("Output File");
                out.append("Output File:").append(
                        LogScraper.parseStream(channelSftp.get(SOUTFILE)));
            } catch (Exception ex) {
                ex.printStackTrace();
                out.append(ex.getLocalizedMessage());
            } finally {
                // Release the SFTP channel and the SSH session
                if (channel != null) {
                    channel.disconnect();
                }
                if (session != null) {
                    session.disconnect();
                }
            }
            System.out.println("Logs=" + out.toString());
        }

        /**
         * Collects the lines that mention OutOfMemoryError.
         * To widen the match, add: line.contains("Exception") ||
         *
         * @param inputFileStream stream of the remote log file
         * @return String of error data
         */
        public static String parseStream(InputStream inputFileStream) {
            if (null == inputFileStream) {
                return "Log Empty";
            }
            StringBuilder out = new StringBuilder();
            BufferedReader br = new BufferedReader(new InputStreamReader(
                    inputFileStream));
            String line = null;

            try {
                while ((line = br.readLine()) != null) {
                    if (line.contains("OutOfMemoryError")) {
                        out.append(line).append(System.lineSeparator());
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
                out.append(e.getLocalizedMessage());
            }
            return out.toString();
        }
    }
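The filtering step can be exercised without an SFTP server, because it only needs an `InputStream`. The following is a sketch under that assumption: `LogFilterSketch` and `matchingLines` are hypothetical names, and the search term is passed in rather than hard-coded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone version of the line filter used by the scraper.
final class LogFilterSketch {
    // Returns every line of the stream that contains the given text.
    static List<String> matchingLines(InputStream in, String needle) {
        List<String> hits = new ArrayList<>();
        // try-with-resources closes the reader (and the stream) when done
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    hits.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }
}
```

For a local check, wrap some sample text in a `ByteArrayInputStream` and pass it in place of the stream that `channelSftp.get(...)` would return.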
June 5, 2013
by Tim Spann
· 9,206 Views · 1 Like
Creating Internal DSLs in Java & Java 8
Adopting Martin Fowler's approach to domain-specific languages.
June 5, 2013
by Mohamed Sanaulla
· 34,202 Views · 7 Likes
Write CSV Data into Hive and Python
Apache Hive is a high-level, SQL-like interface to Hadoop. It lets you execute mostly unadulterated SQL, like this:

    CREATE TABLE test_table(key string, stats map<string,int>);

The map column type is the only thing that doesn’t look like vanilla SQL here. Hive can actually use different backends for a given table. Map is used to interface with column-oriented backends like HBase. Essentially, because we won’t know ahead of time all the column names that could be in the HBase table, Hive will just return them all as a key/value dictionary. There are then helpers to access individual columns by key, or even pivot the map into one key per logical row.

As part of the Hadoop family, Hive is focused on bulk loading and processing. So it’s not a surprise that Hive does not support inserting raw values, as in the following SQL:

    INSERT INTO suppliers (supplier_id, supplier_name) VALUES (24553, 'IBM');

However, for unit testing Hive scripts, it would be nice to be able to insert a few records manually. Then you could run your map reduce HQL and validate the output. Luckily, Hive can load CSV files, so it’s relatively easy to insert a handful of records that way.

    CREATE TABLE foobar(key string, stats map<string,int>)
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    COLLECTION ITEMS TERMINATED BY '|'
    MAP KEYS TERMINATED BY ':' ;

    LOAD DATA LOCAL INPATH '/tmp/foobar.csv' INTO TABLE foobar;

This will load a CSV file with the following data, where c4ca4-0000001-79879483-000000000124 is the key, and comments and likes are columns in a map.

    c4ca4-0000001-79879483-000000000124,comments:0|likes:0
    c4ca4-0000001-79879483-000000000124,comments:0|likes:0

Because I’ve been doing this quite a bit in my unit tests, I wrote a quick Python helper to dump a list of key/map tuples to a temporary CSV file and then load it into Hive. This uses hiver to talk to Hive over Thrift.

    import hiver
    from django.conf import settings
    from django.core.files.temp import NamedTemporaryFile


    def _hql(hql):
        client = hiver.connect(settings.HIVE_HOST, settings.HIVE_PORT)
        try:
            client.execute(hql)
        finally:
            client.shutdown()


    def insert(table_name, rows):
        ''' cannot insert single rows via hive, need to save to a temp file
        and bulk load that '''
        csv_file = NamedTemporaryFile(delete=True)
        for row in rows:
            # one CSV line per row: key,name1:value1|name2:value2
            map_repr = '|'.join('%s:%s' % (key, value) for key, value in row[1].items())
            csv_file.write(row[0] + "," + map_repr + "\n")
        csv_file.flush()

        try:
            _hql('DROP TABLE IF EXISTS %s' % table_name)
            _hql("""
                CREATE TABLE %s (
                    key string,
                    stats map<string,int>
                )
                ROW FORMAT DELIMITED
                FIELDS TERMINATED BY ','
                COLLECTION ITEMS TERMINATED BY '|'
                MAP KEYS TERMINATED BY ':'
                """ % (table_name))
            _hql("""
                LOAD DATA LOCAL INPATH '%s' INTO TABLE %s
                """ % (csv_file.name, table_name))
        finally:
            csv_file.close()

You can call it like this:

    insert('test_table', [
        ('c4ca4-0000001-79879483-000000000124', {'comments': 1, 'likes': 2}),
        ('c4ca4-0000001-79879483-000000000124', {'comments': 1, 'likes': 2}),
        ('c4ca4-0000001-79879496-000000000124', {'comments': 1, 'likes': 2}),
        ('b4aed-0000002-79879783-000000000768', {'comments': 1, 'likes': 2}),
        ('b4aed-0000002-79879783-000000000768', {'comments': 1, 'likes': 2}),
    ])
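The row format that the delimiters describe (a comma between fields, a pipe between map items, a colon between a map key and its value) can be reproduced in a few lines. This Java sketch only builds the CSV line; it does not talk to Hive, and `HiveCsvRow` is a hypothetical name.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical formatter for one "key,name:value|name:value" line.
final class HiveCsvRow {
    static String format(String key, Map<String, ?> stats) {
        // COLLECTION ITEMS TERMINATED BY '|'
        StringJoiner items = new StringJoiner("|");
        for (Map.Entry<String, ?> entry : stats.entrySet()) {
            // MAP KEYS TERMINATED BY ':'
            items.add(entry.getKey() + ":" + entry.getValue());
        }
        // FIELDS TERMINATED BY ','
        return key + "," + items;
    }
}
```

Pass a `LinkedHashMap` if the order of the map items in the file matters to you; note that keys or values containing one of the three delimiters would need escaping, which this sketch does not attempt.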
June 5, 2013
by Chase Seibert
· 14,722 Views
Parsing XML in Groovy using XmlSlurper
In my previous post I showed different ways in which we can parse an XML document in Java. You may have noticed how verbose that code is. Other JVM languages like Groovy and Scala provide much better support for parsing XML documents. In this post I give the code to parse the same XML document in Groovy, so you can compare how much easier it is.

I make use of the XmlSlurper API in Groovy, which loads the complete XML into a tree; this tree can then be navigated using Groovy’s version of XPath, called GPath. The XML document I am using is the same one used in the previous post, and the intent is again to parse the XML and create a list of Employee objects.

    class XmlParserDemo {

        static void main(args) {
            def empList = new ArrayList()
            def emp
            def employees = new XmlSlurper().parse(
                ClassLoader.getSystemResourceAsStream("xml/employee.xml"))

            employees.employee.each { node ->
                emp = new Employee()
                emp.firstName = node.firstName
                emp.lastName = node.lastName
                emp.id = node.@id
                emp.location = node.location
                empList.add(emp)
            }

            empList.each { empT -> println(empT) }
        }
    }

    class Employee {
        String firstName
        String lastName
        String id
        String location

        @Override
        public String toString() {
            return "${firstName} ${lastName}(${id}) in ${location}"
        }
    }

The output is:

    Rakesh Mishra(111) in Bangalore
    John Davis(112) in Chennai
    Rajesh Sharma(113) in Pune

The XML is present in the “xml” package and is available on the classpath of the application. I am using the ClassLoader to load the XML resource.
May 29, 2013
by Mohamed Sanaulla
· 32,262 Views
Parsing XML using DOM, SAX and StAX Parser in Java
I happened to read through a chapter on XML parsing and building APIs in Java, and I tried out the different parsers on a sample XML. Then I thought of sharing it on my blog, so that I have a reference to the code and so does anyone reading this. In this post I parse the same XML with different parsers to perform the same operation: populating the XML content into objects and then adding the objects to a list.

The sample XML considered in the examples is:

    <employees>
      <employee id="111">
        <firstName>Rakesh</firstName>
        <lastName>Mishra</lastName>
        <location>Bangalore</location>
      </employee>
      <employee id="112">
        <firstName>John</firstName>
        <lastName>Davis</lastName>
        <location>Chennai</location>
      </employee>
      <employee id="113">
        <firstName>Rajesh</firstName>
        <lastName>Sharma</lastName>
        <location>Pune</location>
      </employee>
    </employees>

And the object into which the XML content is to be extracted is defined as below:

    class Employee {
        String id;
        String firstName;
        String lastName;
        String location;

        @Override
        public String toString() {
            return firstName + " " + lastName + "(" + id + ")" + location;
        }
    }

There are 3 main parsers for which I have given sample code:

1. DOM Parser
2. SAX Parser
3. StAX Parser

Using DOM Parser

I am making use of the DOM parser implementation that comes with the JDK; in my example I am using JDK 7. The DOM parser loads the complete XML content into a tree structure, and we iterate through the Node and NodeList objects to get the content of the XML. The code for XML parsing using the DOM parser is given below.

    import java.util.ArrayList;
    import java.util.List;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    public class DOMParserDemo {

        public static void main(String[] args) throws Exception {
            //Get the DOM Builder Factory
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

            //Get the DOM Builder
            DocumentBuilder builder = factory.newDocumentBuilder();

            //Load and Parse the XML document
            //document contains the complete XML as a Tree.
            Document document = builder.parse(
                ClassLoader.getSystemResourceAsStream("xml/employee.xml"));

            List<Employee> empList = new ArrayList<>();

            //Iterating through the nodes and extracting the data.
            NodeList nodeList = document.getDocumentElement().getChildNodes();

            for (int i = 0; i < nodeList.getLength(); i++) {
                //We have encountered an <employee> tag.
                Node node = nodeList.item(i);
                if (node instanceof Element) {
                    Employee emp = new Employee();
                    emp.id = node.getAttributes().
                        getNamedItem("id").getNodeValue();

                    NodeList childNodes = node.getChildNodes();
                    for (int j = 0; j < childNodes.getLength(); j++) {
                        Node cNode = childNodes.item(j);

                        //Identifying the child tag of employee encountered.
                        if (cNode instanceof Element) {
                            String content = cNode.getLastChild().
                                getTextContent().trim();
                            switch (cNode.getNodeName()) {
                                case "firstName":
                                    emp.firstName = content;
                                    break;
                                case "lastName":
                                    emp.lastName = content;
                                    break;
                                case "location":
                                    emp.location = content;
                                    break;
                            }
                        }
                    }
                    empList.add(emp);
                }
            }

            //Printing the Employee list populated.
            for (Employee emp : empList) {
                System.out.println(emp);
            }
        }
    }

    class Employee {
        String id;
        String firstName;
        String lastName;
        String location;

        @Override
        public String toString() {
            return firstName + " " + lastName + "(" + id + ")" + location;
        }
    }

The output for the above will be:

    Rakesh Mishra(111)Bangalore
    John Davis(112)Chennai
    Rajesh Sharma(113)Pune

Using SAX Parser

The SAX parser differs from the DOM parser in that it doesn’t load the complete XML into memory. Instead, it reads the XML sequentially, triggering different events as and when it encounters different elements such as an opening tag, a closing tag, character data, comments and so on. This is the reason why the SAX parser is called an event-based parser.

Along with the XML source file, we also register a handler which extends the DefaultHandler class. The DefaultHandler class provides different callbacks, out of which we are interested in:

startElement(): triggered when the start of a tag is encountered.
endElement(): triggered when the end of a tag is encountered.
characters(): triggered when some text data is encountered.
The code for parsing the XML using the SAX parser is given below:

import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserDemo {
  public static void main(String[] args) throws Exception {
    SAXParserFactory parserFactory = SAXParserFactory.newInstance();
    SAXParser parser = parserFactory.newSAXParser();
    SAXHandler handler = new SAXHandler();
    parser.parse(ClassLoader.getSystemResourceAsStream("xml/employee.xml"), handler);
    //Printing the list of employees obtained from XML
    for (Employee emp : handler.empList) {
      System.out.println(emp);
    }
  }
}

/**
 * The Handler for SAX Events.
 */
class SAXHandler extends DefaultHandler {
  List<Employee> empList = new ArrayList<>();
  Employee emp = null;
  String content = null;

  @Override
  //Triggered when the start of a tag is found.
  public void startElement(String uri, String localName, String qName,
      Attributes attributes) throws SAXException {
    switch (qName) {
      //Create a new Employee object when the start tag is found
      case "employee":
        emp = new Employee();
        emp.id = attributes.getValue("id");
        break;
    }
  }

  @Override
  public void endElement(String uri, String localName, String qName)
      throws SAXException {
    switch (qName) {
      //Add the employee to the list once its end tag is found
      case "employee":
        empList.add(emp);
        break;
      //For all other end tags the employee has to be updated.
      case "firstName":
        emp.firstName = content;
        break;
      case "lastName":
        emp.lastName = content;
        break;
      case "location":
        emp.location = content;
        break;
    }
  }

  @Override
  public void characters(char[] ch, int start, int length) throws SAXException {
    //Keeps only the most recent chunk of text. A parser is allowed to deliver
    //one run of text in several calls, which this simple version does not handle.
    content = String.copyValueOf(ch, start, length).trim();
  }
}

class Employee {
  String id;
  String firstName;
  String lastName;
  String location;

  @Override
  public String toString() {
    return firstName + " " + lastName + "(" + id + ")" + location;
  }
}

The output for the above would be:

Rakesh Mishra(111)Bangalore
John Davis(112)Chennai
Rajesh Sharma(113)Pune

Using StAX Parser

StAX stands for Streaming API for XML, and the StAX parser differs from DOM in the same way the SAX parser does: it does not load the whole document into memory. The StAX parser is also subtly different from the SAX parser. The SAX parser pushes data to your handler, whereas with the StAX parser your code pulls the required data from the XML. The StAX parser maintains a cursor at the current position in the document and lets you extract the content available at the cursor, whereas the SAX parser issues events as and when certain data is encountered. XMLInputFactory and XMLStreamReader are the two types used to load an XML file. As we read through the XML file using XMLStreamReader, events are reported as integer values, which are then compared with the constants in XMLStreamConstants.
The code below shows how to parse XML using the StAX parser:

import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxParserDemo {
  public static void main(String[] args) throws XMLStreamException {
    List<Employee> empList = null;
    Employee currEmp = null;
    String tagContent = null;
    XMLInputFactory factory = XMLInputFactory.newInstance();
    XMLStreamReader reader = factory.createXMLStreamReader(
        ClassLoader.getSystemResourceAsStream("xml/employee.xml"));

    while (reader.hasNext()) {
      int event = reader.next();
      switch (event) {
        case XMLStreamConstants.START_ELEMENT:
          if ("employee".equals(reader.getLocalName())) {
            currEmp = new Employee();
            //Look the attribute up by name rather than by position.
            currEmp.id = reader.getAttributeValue(null, "id");
          }
          if ("employees".equals(reader.getLocalName())) {
            empList = new ArrayList<>();
          }
          break;
        case XMLStreamConstants.CHARACTERS:
          tagContent = reader.getText().trim();
          break;
        case XMLStreamConstants.END_ELEMENT:
          switch (reader.getLocalName()) {
            case "employee":
              empList.add(currEmp);
              break;
            case "firstName":
              currEmp.firstName = tagContent;
              break;
            case "lastName":
              currEmp.lastName = tagContent;
              break;
            case "location":
              currEmp.location = tagContent;
              break;
          }
          break;
        case XMLStreamConstants.START_DOCUMENT:
          empList = new ArrayList<>();
          break;
      }
    }

    //Print the employee list populated from XML
    for (Employee emp : empList) {
      System.out.println(emp);
    }
  }
}

class Employee {
  String id;
  String firstName;
  String lastName;
  String location;

  @Override
  public String toString() {
    return firstName + " " + lastName + "(" + id + ") " + location;
  }
}

The output for the above is:

Rakesh Mishra(111) Bangalore
John Davis(112) Chennai
Rajesh Sharma(113) Pune

With this I have covered parsing the same XML document and performing the same task of populating the list of Employee objects using all three parsers, namely:
• DOM Parser
• SAX Parser
• StAX Parser
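One detail worth illustrating for the SAX approach: a SAX parser may hand the text of a single element to characters() in more than one call, so a handler that keeps only the latest chunk can lose data on larger inputs. The sketch below is one possible way to buffer the text with a StringBuilder. The class name BufferedSaxSketch and the helper parse(String) are assumptions made for this illustration (they are not part of the code above), and the XML is read from an in-memory string so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class BufferedSaxSketch {

    // Hypothetical helper: returns one "firstName lastName(id)location" string per <employee>.
    static List<String> parse(String xml) {
        final List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String id, firstName, lastName, location;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for the element that just opened
                if ("employee".equals(qName)) {
                    id = attrs.getValue("id");
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // append: text may arrive in several chunks
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String content = text.toString().trim();
                switch (qName) {
                    case "firstName": firstName = content; break;
                    case "lastName": lastName = content; break;
                    case "location": location = content; break;
                    case "employee":
                        result.add(firstName + " " + lastName + "(" + id + ")" + location);
                        break;
                    default: break;
                }
                text.setLength(0);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<employees>"
            + "<employee id=\"111\"><firstName>Rakesh</firstName>"
            + "<lastName>Mishra</lastName><location>Bangalore</location></employee>"
            + "<employee id=\"112\"><firstName>John</firstName>"
            + "<lastName>Davis</lastName><location>Chennai</location></employee>"
            + "</employees>";
        for (String line : parse(xml)) {
            System.out.println(line);
        }
    }
}
```

The buffer is cleared at every start and end tag, which is enough for this flat employee structure; documents with mixed content would need a more careful strategy.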
May 28, 2013
by Mohamed Sanaulla
· 100,917 Views · 2 Likes
Secure Web Application in Java EE6 using LDAP
In our previous article we explained how to protect data while it is in transit using Transport Layer Security (TLS)/Secure Sockets Layer (SSL). Now let us look at how to apply a security mechanism to a JEE 6 based web application, using an LDAP server for authentication.

Objective:
• Configure an LDAP realm in the JEE application server
• Apply JEE security to a sample web application

Products used:
• IDE: Netbeans 7.2
• Java Development Kit (JDK): Version 6
• Glassfish server: 3.1
• Authentication mechanism: Form based authentication
• Authentication server: LDAP OpenDS v2.2

Apply JEE security to the sample web application:

JEE web applications can be secured either through declarative security or through programmatic security.

Declarative security can be implemented in JEE applications by using annotations or through the deployment descriptor. This type of security mechanism is used when the roles and the authentication process are simple, and when the application can make use of existing security providers (even external ones like LDAP or Kerberos).

Programmatic security provides an additional security mechanism when declarative security is not sufficient for the application in context. It is used when custom-made security is required, or when a rich set of roles and authentication rules is needed.

Configure Realm in the Glassfish Application Server

Before we configure a realm in the Glassfish application server, you will need to install and configure the LDAP server that we will be using for our project. You can get the complete instructions in the following article: “How to install and configure LDAP server”.

Once the installation is successful, start your Glassfish server and go to the admin console. Create a new LDAP realm.

[Image: Create new LDAP Realm]

Add the configuration settings to match the configuration of the LDAP server.

[Image: Glassfish Web App LDAP Realm]

• JAAS Context: identifier which will be used in the application module to connect with the LDAP server (e.g. ldapRealm)
• Directory: LDAP server URL path (e.g. ldap://localhost:389)
• Base DN: distinguished name in the LDAP directory identifying the location of the user data

Applying JEE security to the web application

Create a sample web application as per the following structure:

[Image: SampleWebApp Directory]

Form based authentication will be used for authenticating the users.

[Image: JEE Login and Authentication]

Let us explain the whole process with the help of the above diagram and the code. Set up a sample web application in the Netbeans IDE.

[Image: SampleWebApp in Netbeans IDE]
[Image: SampleWebApp Configuration]

Step 1: As shown in the above diagram, a client browser requests a protected resource from the website, http://{samplewebsite.com}/{contextroot}/index.jsp. The web server consults the web configuration file and finds that the requested resource is protected.

web.xml

<security-constraint>
  <display-name>SecurityConstraint</display-name>
  <web-resource-collection>
    <web-resource-name>Secured resources</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>GeneralUser</role-name>
    <role-name>Administrator</role-name>
  </auth-constraint>
  <user-data-constraint>
    <transport-guarantee>NONE</transport-guarantee>
  </user-data-constraint>
</security-constraint>

Step 2: The web server presents Login.jsp to the client as part of the form based authentication mechanism. These settings are read from the web configuration file.

web.xml

<login-config>
  <auth-method>FORM</auth-method>
  <realm-name>ldapRealm</realm-name>
  <form-login-config>
    <form-login-page>/Login.jsp</form-login-page>
    <form-error-page>/LoginError.jsp</form-error-page>
  </form-login-config>
</login-config>

Step 3: The client submits the login form to the web server. When the server finds that the form action is “j_security_check”, it processes the request to authenticate the client’s credentials. The JSP form must contain the login elements j_username and j_password, which allow the web server to invoke the login authentication mechanism.

Login.jsp

<form action="j_security_check" method="POST">
  username: <input type="text" name="j_username" />
  password: <input type="password" name="j_password" />
  <input type="submit" value="Login" />
</form>

While processing the request, the web server sends the authentication request to the LDAP server, since the LDAP realm is named in the login-config. The LDAP server authenticates the user based on the username and password stored in the LDAP repository.

Step 4: If the authentication is successful, the secured resource (in this case index.jsp) is returned to the client and the container uses a session id to identify a login session for the client.
The container maintains the login session with a cookie containing the session id. The server sends this cookie back to the client, and as long as the client presents this cookie on subsequent requests, the container recognizes the client and maintains the session for it.

Step 5: If the authentication is unsuccessful, the user is redirected to LoginError.jsp as per the configuration in web.xml.

<form-error-page>/LoginError.jsp</form-error-page>

This shows how to apply form based authentication to a sample web application. Now let us take a brief look at the secured resource used for this project. The secured resource is index.jsp, which accepts a username and forwards the request to LoginServlet. LoginServlet dispatches the request to Success.jsp, which then prints the username to the client.

index.jsp

<form action="LoginServlet" method="POST">
  Please type your name <input type="text" name="username" />
  <input type="submit" />
</form>

LoginServlet.java

protected void processRequest(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException {
  response.setContentType("text/html;charset=UTF-8");
  PrintWriter out = response.getWriter();
  try {
    // Hand the request over to Success.jsp, which renders the response.
    RequestDispatcher requestDispatcher =
        getServletConfig().getServletContext().getRequestDispatcher("/Success.jsp");
    requestDispatcher.forward(request, response);
  } finally {
    out.close();
  }
}

Success.jsp

You have been successfully logged in as ${param.username}

web.xml

<servlet>
  <servlet-name>LoginServlet</servlet-name>
  <servlet-class>com.login.LoginServlet</servlet-class>
</servlet>
<servlet-mapping>
  <servlet-name>LoginServlet</servlet-name>
  <url-pattern>/LoginServlet</url-pattern>
</servlet-mapping>

You can download the complete working code from the link below.

[Link: SampleWebApp-Code Download]

We hope our readers have enjoyed this article. Keep watching this space for more articles on JEE security.
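In the setup above the container and its LDAP realm perform the credential check. As a rough standalone illustration of what a bind-based check against an LDAP server can look like, here is a minimal sketch using the JDK's JNDI API. It is not GlassFish's implementation, and the class name LdapBindSketch, the helper names and the user DN layout (uid=...,ou=People,dc=example,dc=com) are assumptions made for this example; a real directory will have its own structure.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

class LdapBindSketch {

    // Builds the JNDI environment for an LDAP "simple" bind with the given credentials.
    static Hashtable<String, String> bindEnvironment(String url, String userDn, String password) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, url);
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, userDn);
        env.put(Context.SECURITY_CREDENTIALS, password);
        return env;
    }

    // Returns true only if the server accepts a bind as userDn with the given password.
    static boolean authenticate(String url, String userDn, String password) {
        // Reject empty passwords up front: a simple bind with an empty password
        // is treated as an anonymous bind and could succeed without proving identity.
        if (password == null || password.isEmpty()) {
            return false;
        }
        try {
            DirContext ctx = new InitialDirContext(bindEnvironment(url, userDn, password));
            ctx.close();
            return true;
        } catch (NamingException e) {
            // Wrong credentials, unknown user, or server not reachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical DN; adjust to the Base DN configured for the realm.
        String userDn = "uid=jdoe,ou=People,dc=example,dc=com";
        System.out.println(authenticate("ldap://localhost:389", userDn, "secret"));
    }
}
```

With declarative security none of this code lives in the application; the sketch only shows the kind of work the realm takes off your hands.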
May 24, 2013
by Mainak Goswami
· 20,351 Views · 2 Likes
How Could Scala do a Merge Sort?
Merge sort is a classical "divide and conquer" sorting algorithm. You should never have to write one, because it would be silly to do so when a standard library class will already do it for you. But it is useful for demonstrating a few characteristics of programming techniques in Scala.

Firstly, a quick recap on merge sort. A list of elements is split up into smaller and smaller lists. When a list has one element it is considered sorted. It is then merged with the list beside it. When there are no more lists to merge, the original data set is considered sorted.

Now let's take a look at how to do that using an imperative approach in Java.

public class MergeSort {

  public void sort(int[] values) {
    int[] numbers = values;
    int[] auxillaryNumbers = new int[values.length];
    mergesort(numbers, auxillaryNumbers, 0, values.length - 1);
  }

  private void mergesort(int[] numbers, int[] auxillaryNumbers, int low, int high) {
    // Check if low is smaller than high; if not, the array is sorted
    if (low < high) {
      // Get the index of the element which is in the middle
      int middle = low + (high - low) / 2;
      // Sort the left side of the array
      mergesort(numbers, auxillaryNumbers, low, middle);
      // Sort the right side of the array
      mergesort(numbers, auxillaryNumbers, middle + 1, high);
      // Combine them both
      // Alex: the first time we hit this is when there is the minimum difference between high and low.
      merge(numbers, auxillaryNumbers, low, middle, high);
    }
  }

  /**
   * Merges a[low .. middle] with a[middle+1 .. high].
   * This method assumes a[low .. middle] and a[middle+1 .. high] are sorted. It leaves
   * a[low .. high] as a sorted array.
   */
  private void merge(int[] a, int[] aux, int low, int middle, int high) {
    // Copy both parts into the aux array
    for (int k = low; k <= high; k++) {
      aux[k] = a[k];
    }
    int i = low, j = middle + 1;
    for (int k = low; k <= high; k++) {
      if (i > middle) a[k] = aux[j++];
      else if (j > high) a[k] = aux[i++];
      else if (aux[j] < aux[i]) a[k] = aux[j++];
      else a[k] = aux[i++];
    }
  }

  public static void main(String args[]) {
    ...
    ms.sort(new int[] {5, 3, 1, 17, 2, 8, 19, 11});
    ...
  }
}

Discussion...
• An auxiliary array is used to achieve the sort. Elements to be sorted are copied into it and then, once sorted, copied back.
• It is important that this array is created only once; otherwise there can be a performance hit from creating many arrays.
• The merge method does not have to create an auxiliary array. However, since it changes an object passed to it, the merge method has side effects.
• Merge sort runs in O(N log N) time.

Now let's have a go at a Scala solution.

def mergeSort(xs: List[Int]): List[Int] = {
  val n = xs.length / 2
  if (n == 0) xs
  else {
    def merge(xs: List[Int], ys: List[Int]): List[Int] =
      (xs, ys) match {
        case (Nil, ys) => ys
        case (xs, Nil) => xs
        case (x :: xs1, y :: ys1) =>
          if (x < y) x :: merge(xs1, ys)
          else y :: merge(xs, ys1)
      }
    val (left, right) = xs splitAt(n)
    merge(mergeSort(left), mergeSort(right))
  }
}

Key discussion points:
• It is the same divide and conquer idea.
• The splitAt function is used to divide the data each time into a tuple. Every recursion creates a new tuple.
• The local function merge is then used to perform the merging. Local functions are a useful feature as they help promote encapsulation and prevent code bloat.
• Neither the mergeSort() nor the merge() function has any side effects. They don't change any object. They create (and throw away) objects.
• Because the same array is not being passed across the steps of the merging, there is no need to pass beginning and ending indices, which can get very buggy.
• This merge recursion uses pattern matching to great effect. Not only is there matching on the data lists, but when a match happens the parts of the lists are bound to variables:
  • x, the top element in the left list
  • xs1, the rest of the left list
  • y, the top element in the right list
  • ys1, the rest of the right list

This makes it very easy to compare the top elements and to pass around the rest of the data to compare.

Would the recursive approach be possible in Java? Of course. But it would be much more complex. You don't have any pattern matching, and you don't get a nudge to declare objects as immutable the way Scala nudges you by making you choose between val and var. In Java, the code for this problem would always be easier to read in an imperative style, where objects are changed across iterations of a loop. But in Scala a functional recursive approach can be quite neat. So here we see an example of how Scala makes it easier to achieve good, clean, concise recursion and makes a functional approach much more feasible.
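To make the Java side of that comparison concrete, here is one possible side-effect-free merge sort in Java. It is only a sketch: the class name FunctionalMergeSortSketch is an assumption for this example, the input list is never modified, and merge uses a loop with explicit indices rather than recursion because Java has no pattern matching on lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FunctionalMergeSortSketch {

    // Returns a sorted list and leaves the input list untouched.
    static List<Integer> mergeSort(List<Integer> xs) {
        int n = xs.size() / 2;
        if (n == 0) {
            return xs; // zero or one element: already sorted
        }
        List<Integer> left = mergeSort(xs.subList(0, n));
        List<Integer> right = mergeSort(xs.subList(n, xs.size()));
        return merge(left, right);
    }

    // Merges two sorted lists into a new, unmodifiable sorted list.
    static List<Integer> merge(List<Integer> xs, List<Integer> ys) {
        List<Integer> out = new ArrayList<>(xs.size() + ys.size());
        int i = 0;
        int j = 0;
        while (i < xs.size() && j < ys.size()) {
            // Same comparison as the Scala version: take x only when x < y.
            if (xs.get(i) < ys.get(j)) {
                out.add(xs.get(i++));
            } else {
                out.add(ys.get(j++));
            }
        }
        out.addAll(xs.subList(i, xs.size())); // whatever is left of the left list
        out.addAll(ys.subList(j, ys.size())); // whatever is left of the right list
        return Collections.unmodifiableList(out);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(5, 3, 1, 17, 2, 8, 19, 11);
        System.out.println(mergeSort(input));
    }
}
```

Even in this form the index bookkeeping in merge is still there, which is the part the Scala pattern match removes.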
May 23, 2013
by Alex Staveley
· 12,014 Views
Capturing camera/picture data without PhoneGap
As people know, I'm a huge fan of PhoneGap and what it allows me to do with JavaScript, HTML, and CSS. But I think it is crucial to remember that you don't always need PhoneGap. A great example of that is camera access. Did you know that recent mobile browsers support accessing the camera directly from HTML and JavaScript? Let's look at an example.

Over a year ago I wrote a blog post where I created an application called "Color Thief." This application made use of PhoneGap's Camera API and a third party JavaScript library called Color Thief. I loved this example because it demonstrated how you could combine the extra power that PhoneGap provides along with existing JavaScript libraries.

This morning I watched an excellent Google IO presentation (https://www.youtube.com/watch?v=EPYnGFEcis4&feature=youtube_gdata_player) on Mobile HTML. It was an overview of some of the exciting stuff you can now do with mobile HTML and JavaScript. To be clear, this was all without using wrappers like PhoneGap.

In one of the examples the presenters discussed the new "capture" support for the input/file field type. This is rather simple to implement: If supported (recent Android and latest iOS), the user can then use their camera to select a picture.

I decided to rebuild my old demo to skip PhoneGap completely and just make use of this feature. Here's the code:

For the most part, this is pretty similar to the last version. I no longer wait for the deviceready event, but instead just listen for the document itself to load. Instead of listening for a button click, I've switched to an input field using type=file. I now listen for the change event, and on that, I see if I have access to a file. If I do, I can then use the URL object to create a pointer to the source and then simply add it to my DOM. After that, Color Thief takes over.

The only tricky part I ran into was that in iOS the URL object is still prefixed. You can see how I get around that in the startup code.
To be fair, this isn't 100% backwards compatible. I could add a few checks in here to ensure that things will work and to gracefully let people on older phones know they can't use this feature. But the end result is nearly the exact same functionality in a web page: no PhoneGap, no native code.
May 21, 2013
by Raymond Camden
· 17,575 Views
Resize Image Class With PHP
A common feature that you will come across on websites is the ability to resize an image to an exact size so that it displays correctly in your design. If you have a very large image and you are going to place it on your website in a space that is only 100px x 100px, then you will want to be able to resize this image to fit in the space.

One option is to just set the width and height attributes on the image tag in your HTML, which forces the image to be displayed at this size. This works and the image will fit in the 100px x 100px space, but the browser does not resize the file itself; it only displays it at the smaller size. The image still has to be downloaded at full size, and a very large image can take some time to download just to be displayed in a small space.

A better solution is to resize the image to 100px x 100px, which reduces its file size so that the browser doesn't need to download a large image just to display it in a small space.

In this tutorial we are going to create a PHP class that will allow you to resize an image to any dimension you want, optionally keeping the aspect ratio of the image. When the class has resized the image you can either save the image on the server or download the image.

Class Methods

First let's plan out the class we are going to create:
• We need to pass an existing image to the class; this is the image that we will resize.
• We need to pass in the desired image dimensions so the class can work out the new size of the image.
• Then we need to be able to save the image to a location on the server, and optionally download the image.

The Constructor

This class relies on an original image being found and set on the class; without this the class will not work correctly. Because of this we should pass the image filename into the constructor of the class.
The constructor checks whether the file exists on the server. If it does, we call the set image method, which creates an image resource from the file and stores it in a class variable.

    /**
     * Class constructor requires to send through the image filename
     *
     * @param string $filename - Filename of the image you want to resize
     */
    public function __construct( $filename )
    {
        if(file_exists($filename))
        {
            $this->setImage( $filename );
        } else {
            throw new Exception('Image ' . $filename . ' can not be found, try another image.');
        }
    }

Set The Image

The set image method creates an image resource based on the image given to the class. It uses the PHP functions imagecreatefromjpeg, imagecreatefromgif and imagecreatefrompng to create the image resource from the given image. We can then use this resource with the functions imagesx and imagesy to get the current width and height of the image. This makes it easier to resize the image later in the script.

    /**
     * Set the image variable by using image create
     *
     * @param string $filename - The image filename
     */
    private function setImage( $filename )
    {
        $size = getimagesize($filename);
        $this->ext = $size['mime'];

        switch($this->ext)
        {
            // Image is a JPG
            case 'image/jpg':
            case 'image/jpeg':
                // create a jpeg extension
                $this->image = imagecreatefromjpeg($filename);
                break;

            // Image is a GIF
            case 'image/gif':
                $this->image = @imagecreatefromgif($filename);
                break;

            // Image is a PNG
            case 'image/png':
                $this->image = @imagecreatefrompng($filename);
                break;

            // Mime type not found
            default:
                throw new Exception("File is not an image, please use another file type.", 1);
        }

        $this->origWidth = imagesx($this->image);
        $this->origHeight = imagesy($this->image);
    }

Resize The Image

The resize function is what we will use to calculate the new values for the image width and height.
It takes 3 parameters: the new width, the new height and the resize option. The option lets us resize the image to exact dimensions, use the given width and derive the height from the aspect ratio, use the given height and derive the width from the aspect ratio, or let the class decide the best way of resizing the image.

Once we have the new width and height of the image we can create a new image resource by using the PHP function imagecreatetruecolor(). Now we can create the new image from the old image, resizing it to the new dimensions, by using the imagecopyresampled() function.

    /**
     * Resize the image to these set dimensions
     *
     * @param int $width - Max width of the image
     * @param int $height - Max height of the image
     * @param string $resizeOption - Scale option for the image
     *
     * @return Save new image
     */
    public function resizeTo( $width, $height, $resizeOption = 'default' )
    {
        switch(strtolower($resizeOption))
        {
            case 'exact':
                $this->resizeWidth = $width;
                $this->resizeHeight = $height;
                break;

            case 'maxwidth':
                $this->resizeWidth = $width;
                $this->resizeHeight = $this->resizeHeightByWidth($width);
                break;

            case 'maxheight':
                $this->resizeWidth = $this->resizeWidthByHeight($height);
                $this->resizeHeight = $height;
                break;

            default:
                if($this->origWidth > $width || $this->origHeight > $height)
                {
                    if ( $this->origWidth > $this->origHeight ) {
                        $this->resizeHeight = $this->resizeHeightByWidth($width);
                        $this->resizeWidth = $width;
                    } else {
                        // Portrait or square image: scale by height.
                        // (A plain "<" check would leave square images without dimensions.)
                        $this->resizeWidth = $this->resizeWidthByHeight($height);
                        $this->resizeHeight = $height;
                    }
                } else {
                    $this->resizeWidth = $width;
                    $this->resizeHeight = $height;
                }
                break;
        }

        $this->newImage = imagecreatetruecolor($this->resizeWidth, $this->resizeHeight);
        imagecopyresampled($this->newImage, $this->image, 0, 0, 0, 0, $this->resizeWidth, $this->resizeHeight, $this->origWidth, $this->origHeight);
    }

    /**
     * Get the resized height from the width keeping the aspect ratio
     *
     * @param int $width - Max image width
     *
     * @return Height keeping
     * aspect ratio
     */
    private function resizeHeightByWidth($width)
    {
        return floor(($this->origHeight / $this->origWidth) * $width);
    }

    /**
     * Get the resized width from the height keeping the aspect ratio
     *
     * @param int $height - Max image height
     *
     * @return Width keeping aspect ratio
     */
    private function resizeWidthByHeight($height)
    {
        return floor(($this->origWidth / $this->origHeight) * $height);
    }

Save The Image

With the new image now set in a class variable we can use it to save the image on the server. This function takes 3 parameters: the save path, the image quality and whether we want to download the image.

For each mime type PHP has a function, imagejpeg(), imagegif() or imagepng(), that saves the image when given the new image resource and the path where the image is going to be saved.

Once the image is saved on the server, and if we chose to download it, we can set the headers so that the browser downloads the image to the client's machine.

    /**
     * Save the image as the image type the original image was
     *
     * @param string $savePath - The path to store the new image
     * @param int $imageQuality - The quality level of image to create
     * @param bool $download - Whether to also send the saved image as a download
     *
     * @return Saves the image
     */
    public function saveImage($savePath, $imageQuality = 100, $download = false)
    {
        switch($this->ext)
        {
            case 'image/jpg':
            case 'image/jpeg':
                // Check PHP supports this file type
                if (imagetypes() & IMG_JPG) {
                    imagejpeg($this->newImage, $savePath, $imageQuality);
                }
                break;

            case 'image/gif':
                // Check PHP supports this file type
                if (imagetypes() & IMG_GIF) {
                    imagegif($this->newImage, $savePath);
                }
                break;

            case 'image/png':
                // imagepng() expects a compression level from 0 (none) to 9, so invert the quality scale
                $invertScaleQuality = 9 - round(($imageQuality/100) * 9);
                // Check PHP supports this file type
                if (imagetypes() & IMG_PNG) {
                    imagepng($this->newImage, $savePath, $invertScaleQuality);
                }
                break;
        }

        if($download)
        {
            header('Content-Description: File Transfer');
            header("Content-type: application/octet-stream");
            // Send only the file name, not the server-side path
            header("Content-disposition: attachment; filename=".basename($savePath));
            readfile($savePath);
        }

        imagedestroy($this->newImage);
    }

Full Resize Image Class

<?php
class ResizeImage
{
    // Properties used by the methods below
    private $ext;
    private $image;
    private $newImage;
    private $origWidth;
    private $origHeight;
    private $resizeWidth;
    private $resizeHeight;

    /**
     * Class constructor requires to send through the image filename
     *
     * @param string $filename - Filename of the image you want to resize
     */
    public function __construct( $filename )
    {
        if(file_exists($filename))
        {
            $this->setImage( $filename );
        } else {
            throw new Exception('Image ' . $filename . ' can not be found, try another image.');
        }
    }

    /**
     * Set the image variable by using image create
     *
     * @param string $filename - The image filename
     */
    private function setImage( $filename )
    {
        $size = getimagesize($filename);
        $this->ext = $size['mime'];

        switch($this->ext)
        {
            // Image is a JPG
            case 'image/jpg':
            case 'image/jpeg':
                // create a jpeg extension
                $this->image = imagecreatefromjpeg($filename);
                break;

            // Image is a GIF
            case 'image/gif':
                $this->image = @imagecreatefromgif($filename);
                break;

            // Image is a PNG
            case 'image/png':
                $this->image = @imagecreatefrompng($filename);
                break;

            // Mime type not found
            default:
                throw new Exception("File is not an image, please use another file type.", 1);
        }

        $this->origWidth = imagesx($this->image);
        $this->origHeight = imagesy($this->image);
    }

    /**
     * Save the image as the image type the original image was
     *
     * @param string $savePath - The path to store the new image
     * @param int $imageQuality - The quality level of image to create
     * @param bool $download - Whether to also send the saved image as a download
     *
     * @return Saves the image
     */
    public function saveImage($savePath, $imageQuality = 100, $download = false)
    {
        switch($this->ext)
        {
            case 'image/jpg':
            case 'image/jpeg':
                // Check PHP supports this file type
                if (imagetypes() & IMG_JPG) {
                    imagejpeg($this->newImage, $savePath, $imageQuality);
                }
                break;

            case 'image/gif':
                // Check PHP supports this file type
                if (imagetypes() & IMG_GIF) {
                    imagegif($this->newImage, $savePath);
                }
                break;

            case 'image/png':
                // imagepng() expects a compression level from 0 (none) to 9, so invert the quality scale
                $invertScaleQuality = 9 - round(($imageQuality/100) * 9);
                // Check PHP supports this file type
                if (imagetypes() & IMG_PNG) {
                    imagepng($this->newImage, $savePath, $invertScaleQuality);
                }
                break;
        }

        if($download)
        {
            header('Content-Description: File Transfer');
            header("Content-type: application/octet-stream");
            // Send only the file name, not the server-side path
            header("Content-disposition: attachment; filename=".basename($savePath));
            readfile($savePath);
        }

        imagedestroy($this->newImage);
    }

    /**
     * Resize the image to these set dimensions
     *
     * @param int $width - Max width of the image
     * @param int $height - Max height of the image
     * @param string $resizeOption - Scale option for the image
     *
     * @return Save new image
     */
    public function resizeTo( $width, $height, $resizeOption = 'default' )
    {
        switch(strtolower($resizeOption))
        {
            case 'exact':
                $this->resizeWidth = $width;
                $this->resizeHeight = $height;
                break;

            case 'maxwidth':
                $this->resizeWidth = $width;
                $this->resizeHeight = $this->resizeHeightByWidth($width);
                break;

            case 'maxheight':
                $this->resizeWidth = $this->resizeWidthByHeight($height);
                $this->resizeHeight = $height;
                break;

            default:
                if($this->origWidth > $width || $this->origHeight > $height)
                {
                    if ( $this->origWidth > $this->origHeight ) {
                        $this->resizeHeight = $this->resizeHeightByWidth($width);
                        $this->resizeWidth = $width;
                    } else {
                        // Portrait or square image: scale by height.
                        // (A plain "<" check would leave square images without dimensions.)
                        $this->resizeWidth = $this->resizeWidthByHeight($height);
                        $this->resizeHeight = $height;
                    }
                } else {
                    $this->resizeWidth = $width;
                    $this->resizeHeight = $height;
                }
                break;
        }

        $this->newImage = imagecreatetruecolor($this->resizeWidth, $this->resizeHeight);
        imagecopyresampled($this->newImage, $this->image, 0, 0, 0, 0, $this->resizeWidth, $this->resizeHeight, $this->origWidth, $this->origHeight);
    }

    /**
     * Get the resized height from the width keeping the aspect ratio
     *
     * @param int $width - Max image width
     *
     * @return Height keeping aspect ratio
     */
    private function resizeHeightByWidth($width)
    {
        return floor(($this->origHeight/$this->origWidth)*$width);
    }

    /**
     * Get the resized width from the height keeping the aspect ratio
     *
     * @param int $height - Max image height
     *
     * @return Width keeping aspect ratio
     */
    private function resizeWidthByHeight($height)
    {
        return floor(($this->origWidth/$this->origHeight)*$height);
    }
}
?>

Using The Resize Image PHP Class

Because this class was created to let you resize an image in multiple ways, there are
different ways of using the class.
• Resize the image to an exact size.
• Resize the image to a max width, keeping the aspect ratio of the image.
• Resize the image to a max height, keeping the aspect ratio of the image.
• Resize the image to a given width and height and let the code work out which way of resizing is best while keeping the aspect ratio.
• You can save the resized image on the server.
• You can download the resized image from the server.

Resize Exact Size

To resize an image to an exact size you can use the following code. First pass the image we want to resize into the class constructor, then define the width and height with the scale option of exact. The class now has the dimensions it needs to create the new image. Then call the function saveImage() and pass in the file location for the new image.

$resize = new ResizeImage('images/Be-Original.jpg');
$resize->resizeTo(100, 100, 'exact');
$resize->saveImage('images/be-original-exact.jpg');

Resize Max Width Size

If you choose to set the image to an exact size then the resized image could lose its aspect ratio, which means the image could look stretched. But if you know the max width that you want the image to be, you can resize the image to a max width, which keeps the aspect ratio of the image.

$resize = new ResizeImage('images/Be-Original.jpg');
$resize->resizeTo(100, 100, 'maxWidth');
$resize->saveImage('images/be-original-maxWidth.jpg');

Resize Max Height Size

Just as you can select a max width for the image while keeping the aspect ratio, you can also select a max height while keeping the aspect ratio.
$resize = new ResizeImage('images/Be-Original.jpg');
$resize->resizeTo(100, 100, 'maxHeight');
$resize->saveImage('images/be-original-maxHeight.jpg');

Resize Auto Size From Given Width And Height

You can also let the code work out the best way to resize the image. If the image height is larger than the width, the image is resized using the height, keeping the aspect ratio. If the image width is larger than the height, the image is resized using the width, keeping the aspect ratio.

$resize = new ResizeImage('images/Be-Original.jpg');
$resize->resizeTo(100, 100);
$resize->saveImage('images/be-original-default.jpg');

Download The Resized Image

The default behaviour of this class is to save the image on the server, but you can easily change this to download by passing true as the third parameter of the saveImage method.

$resize = new ResizeImage('images/Be-Original.jpg');
$resize->resizeTo(100, 100, 'exact');
$resize->saveImage('images/be-original-exact.jpg', 100, true);
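The aspect-ratio arithmetic behind the maxWidth, maxHeight and default options is independent of PHP, so here it is isolated as a small sketch in Java. The class and method names (ResizeMathSketch, heightForWidth, widthForHeight, fit) are assumptions made for this illustration, not part of the class above, and integer arithmetic stands in for the floor() of a floating-point ratio.

```java
class ResizeMathSketch {

    // Height that keeps the aspect ratio when the image is scaled to the given width.
    static int heightForWidth(int origWidth, int origHeight, int width) {
        return (int) ((long) origHeight * width / origWidth); // integer division floors for positive values
    }

    // Width that keeps the aspect ratio when the image is scaled to the given height.
    static int widthForHeight(int origWidth, int origHeight, int height) {
        return (int) ((long) origWidth * height / origHeight);
    }

    // Landscape images are scaled by width; portrait and square images by height.
    // Returns {width, height}.
    static int[] fit(int origWidth, int origHeight, int maxWidth, int maxHeight) {
        if (origWidth > origHeight) {
            return new int[] { maxWidth, heightForWidth(origWidth, origHeight, maxWidth) };
        }
        return new int[] { widthForHeight(origWidth, origHeight, maxHeight), maxHeight };
    }

    public static void main(String[] args) {
        int[] landscape = fit(1600, 1200, 100, 100);
        int[] portrait = fit(1200, 1600, 100, 100);
        System.out.println(landscape[0] + "x" + landscape[1]); // 100x75
        System.out.println(portrait[0] + "x" + portrait[1]);   // 75x100
    }
}
```

A 1600 x 1200 image limited to a width of 100 keeps its 4:3 shape at 100 x 75, whereas the exact option would squash it to 100 x 100.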
May 21, 2013
by Paul Underwood
· 80,094 Views
Postgres Fuzzy Search Using Trigrams (+/- Django)
When building websites, you’ll often want users to be able to search for something by name. On LinerNotes, users can search for bands, albums, genres etc from a search bar that appears on the homepage and in the omnipresent nav bar. And we need a way to match those queries to entities in our Postgres database. At first, this might seem like a simple problem with a simple solution, especially if you’re using the ORM; just jam the user input into an ORM filter and retrieve every matching string. But there’s a problem: if you do Bands.objects.filter(name="beatles") You’ll probably get nothing back, because the name column in your “bands” table probably says “The Beatles” and as far as Postgres is concerned if it’s not exactly the same string, it’s not a match. Users are naturally terrible at spelling, and even if they weren’t they’d be bad at guessing exactly how the name is formatted in your database. Of course you can use the LIKE keyword in SQL (or the equivalent ‘__contains’ suffix in the ORM) to give yourself a little flexibility and make sure that “Beatles” returns “The Beatles”. But 1) the LIKE keyword requires you to evaluate a regex against every row in your table, or hope that you’ve configured your indices to support LIKE (a quick Google doesn’t tell me whether Django does that by default in the ORM) and 2) what if the user types “Beetles”? Well, then you’ve got a bit of a problem. No matter how obvious it is to human you that “beatles” is close to “beetles”[1], to the computer they’re just two non-identical byte sequences. If you want the computer to understand them as similar you’re going to have to give it a metric for similarity and a method to make the comparison. There are a few ways to do that. You can do what I did initially and whip out the power tools, i.e. a dedicated search system like Solr or ElasticSearch. These guys have notions of fuzziness built right in (Solr more automatically than ES). 
But they’re designed for full-text indexing of documents (e.g. full web pages) and they’re rather complex to set up and administer. ES has been enough of a hassle to keep running smoothly that I took the time to see if I could push the search workload to Postgres, and hence this article. Unless you need to do something real fancy, it’s probably overkill to use them for just matching names.

Instead, we’re going to follow Starr Horne’s advice and use a Postgres EXTENSION that lets us build fuzziness into our query in a fast and fairly simple way. Specifically, we’re going to use an extension called pg_trgm (i.e. “Postgres Trigram”) which gives Postgres a “similarity” function that can evaluate how many three-character subsequences (i.e. “trigrams”) two strings share. This is actually a pretty good metric for fuzzy matching short strings like names.

To use pg_trgm, you’ll need to install the “Postgres Contrib” package. On Ubuntu:

    sudo apt-get install postgresql-contrib

**WARNING: THIS WILL TRY TO RESTART YOUR DATABASE**

Then pop open psql and install pg_trgm (NB: this only works on Postgres 9.1+; Google for the instructions if you’re on a lower version):

    psql
    CREATE EXTENSION pg_trgm;
    \dx

(\dx lists the installed extensions, so you can check that pg_trgm is there.) Now you can do

    SELECT *
    FROM types_and_labels_view
    WHERE label % 'Mountain Goats'
    ORDER BY similarity(label, 'Mountain Goats') DESC
    LIMIT 100;

And out will pop up to 100 of the most similar names (the % operator keeps only rows whose similarity is above pg_trgm’s threshold, 0.3 by default). This will still take a long time if your table is large, but we can improve that with a special type of index provided by pg_trgm:

    CREATE INDEX labels_trigram_index ON types_and_labels_table USING gist (label gist_trgm_ops);

or

    CREATE INDEX labels_trigram_index ON types_and_labels_table USING gin (label gin_trgm_ops);

(GIN is slower than GiST to build, but answers queries faster.) That’ll take a while to build (possibly quite a while), but once it’s done you should be able to fuzzy search with ease and speed.
If you’re using Django, you will have to drop into writing SQL to use this (until someone, maybe you, writes a Django extension to do this in the ORM). And as a frustrating finishing note, my attempt to implement this on LinerNotes was not ultimately successful. It seems that index query performance is at least O(n), and with 50 million entities in my database, queries take at least 10 seconds. I’ve read that performance is great up to about 100k records and then drops off sharply from there. There are apparently some additional options for improving query performance, but I’ll be sticking with ElasticSearch for now.

[1] Sorry, Googlebot! Not sorry, Bingbot.
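To get a feel for what a trigram similarity score measures, here is a minimal pure-Python sketch. It only approximates the rules pg_trgm documents (lowercase, split into alphanumeric words, pad each word with two spaces in front and one behind, then divide shared trigrams by total distinct trigrams); the helper names trigrams and similarity are chosen for this illustration, and the extension itself remains the authority on exact scores.

```python
import re


def trigrams(text):
    """Set of three-character subsequences of text.

    Approximates pg_trgm's documented behaviour: lowercase the input, split it
    into alphanumeric words, and pad each word with two leading spaces and one
    trailing space before slicing. This is an illustration, not the extension.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

Under these rules "beatles" and "beetles" share 5 of 11 distinct trigrams (about 0.45), and "The Beatles" against "beatles" scores 8 of 12 (about 0.67), so both would clear a 0.3 threshold even though neither is an exact match.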
May 19, 2013
by George London
· 9,600 Views
How to Create Barcodes in Your PDFs with Python
The Reportlab library is a great way to generate PDFs in Python. Recently, I noticed that it has the ability to do barcodes. I had heard about it being able to generate QR codes, but I hadn’t really dug under the covers to see what else it could do. In this tutorial, we’ll take a look at some of the barcodes that Reportlab can generate. If you don’t already have Reportlab, go to their website and get it before jumping into the article. Reportlab’s Barcode Library Reportlab provides for several different types of bar codes: code39 (i.e. code 3 of 9), code93, code 128, EANBC, QR, and USPS. I saw one called “fourstate” as well, but I couldn’t figure out how to get it to work. Underneath some of these types, there are sub-types such as Standard, Extended or MultiWidth. I didn’t have much luck getting the MultiWidth one to work for the code128 bar code as it kept giving me an attribute error, so we’ll just ignore that one. If you know how to do it, ping me in the comments or via my contact form and let me know. I’ll update the article if anyone can show me how to add that or the fourstate barcode. Anyway, the best way to learn is to just write some code. 
Here’s a pretty straight-forward example:

    from reportlab.graphics.barcode import code39, code128, code93
    from reportlab.graphics.barcode import eanbc, qr, usps
    from reportlab.graphics.shapes import Drawing
    from reportlab.lib.pagesizes import letter
    from reportlab.lib.units import mm
    from reportlab.pdfgen import canvas
    from reportlab.graphics import renderPDF

    #----------------------------------------------------------------------
    def createBarCodes():
        """
        Create barcode examples and embed in a PDF
        """
        c = canvas.Canvas("barcodes.pdf", pagesize=letter)

        barcode_value = "1234567890"

        barcode39 = code39.Extended39(barcode_value)
        barcode39Std = code39.Standard39(barcode_value, barHeight=20, stop=1)

        # code93 also has an Extended and MultiWidth version
        barcode93 = code93.Standard93(barcode_value)

        barcode128 = code128.Code128(barcode_value)
        # the multiwidth barcode appears to be broken
        #barcode128Multi = code128.MultiWidthBarcode(barcode_value)

        barcode_usps = usps.POSTNET("50158-9999")

        codes = [barcode39, barcode39Std, barcode93, barcode128, barcode_usps]

        x = 1 * mm
        y = 285 * mm
        x1 = 6.4 * mm

        for code in codes:
            code.drawOn(c, x, y)
            y = y - 15 * mm

        # draw the eanbc8 code
        barcode_eanbc8 = eanbc.Ean8BarcodeWidget(barcode_value)
        bounds = barcode_eanbc8.getBounds()
        width = bounds[2] - bounds[0]
        height = bounds[3] - bounds[1]
        d = Drawing(50, 10)
        d.add(barcode_eanbc8)
        renderPDF.draw(d, c, 15, 555)

        # draw the eanbc13 code
        barcode_eanbc13 = eanbc.Ean13BarcodeWidget(barcode_value)
        bounds = barcode_eanbc13.getBounds()
        width = bounds[2] - bounds[0]
        height = bounds[3] - bounds[1]
        d = Drawing(50, 10)
        d.add(barcode_eanbc13)
        renderPDF.draw(d, c, 15, 465)

        # draw a QR code
        qr_code = qr.QrCodeWidget('www.mousevspython.com')
        bounds = qr_code.getBounds()
        width = bounds[2] - bounds[0]
        height = bounds[3] - bounds[1]
        d = Drawing(45, 45, transform=[45./width,0,0,45./height,0,0])
        d.add(qr_code)
        renderPDF.draw(d, c, 15, 405)

        c.save()

    if __name__ == "__main__":
        createBarCodes()

Let’s break this down a bit. The code39.Extended39 doesn’t really accept much beyond the value itself. On the other hand, code39.Standard39, code93.Standard93 and code128.Code128 all have basically the same API. You can change the barWidth, barHeight, turn on the start/stop symbols and add “quiet” zones.

The usps bar code module provides two types of bar code: FIM and POSTNET. FIM, or Facing ID Marks, only encodes one letter (A-D), which I personally didn’t find very interesting. So I just show the POSTNET version, which should be pretty familiar to people in the United States as it appears on the bottom of most envelopes. POSTNET encodes the zip code!

The next three bar codes use a different API to draw them on the PDF, which I discovered via StackOverflow. Basically you create a Drawing object of a certain size and then add the bar code to the drawing. Finally you use the renderPDF module to place the drawing on the PDF. It’s pretty convoluted, but it works pretty well. The EANBC codes are ones you’ll see on some manufactured products, such as tissue boxes.

If you’d like to see the result of the code above, you can download the PDF here.

Wrapping Up

At this point you should be able to go forth and create your own bar codes in your PDFs. Reportlab is pretty handy and I hope you’ll find this additional tool helpful in your endeavors.

Additional Reading

  • A step by step Reportlab tutorial
  • Reportlab: Mixing Fixed Content and Flowables
  • Reportlab Tables – Creating Tables in PDFs with Python
  • Creating QR Codes with Python
  • StackOverflow question on Python barcode generation
  • StackOverflow question on reportlab, QR codes and django

Get the Source!

barcodes.tar
May 17, 2013
by Mike Driscoll
· 28,921 Views · 2 Likes
Java 8: CompletableFuture in action
After thoroughly exploring the CompletableFuture API in Java 8 we are prepared to write a simplistic web crawler. We already solved a similar problem using ExecutorCompletionService, Guava ListenableFuture and Scala/Akka. I chose the same problem so that it's easy to compare approaches and implementation techniques. First we shall define a simple, blocking method to download the contents of a single URL:

    private String downloadSite(final String site) {
        try {
            log.debug("Downloading {}", site);
            final String res = IOUtils.toString(new URL("http://" + site), UTF_8);
            log.debug("Done {}", site);
            return res;
        } catch (IOException e) {
            throw Throwables.propagate(e);
        }
    }

Nothing fancy. This method will later be invoked for different sites inside a thread pool. Another method parses the String into an XML Document (let me leave out the implementation, no one wants to look at it):

    private Document parse(String xml) //...

Finally the core of our algorithm: a function computing the relevance of each website, taking a Document as input. Just as above we don't care about the implementation, only the signature is important:

    private CompletableFuture<Double> calculateRelevance(Document doc) //...

Let's put all the pieces together. Given a list of websites, our crawler shall start downloading the contents of each web site asynchronously and concurrently. Then each downloaded HTML string will be parsed to an XML Document and later its relevance will be computed. As a last step we take all computed relevance metrics and find the biggest one. This sounds pretty straightforward until you realize that both downloading content and computing relevance are asynchronous (they return CompletableFuture) and we definitely don't want to block or busy wait. Here is the first piece:

    ExecutorService executor = Executors.newFixedThreadPool(4);

    List<String> topSites = Arrays.asList(
        "www.google.com", "www.youtube.com", "www.yahoo.com", "www.msn.com"
    );

    List<CompletableFuture<Double>> relevanceFutures = topSites.stream().
        map(site -> CompletableFuture.supplyAsync(() -> downloadSite(site), executor)).
        map(contentFuture -> contentFuture.thenApply(this::parse)).
        map(docFuture -> docFuture.thenCompose(this::calculateRelevance)).
        collect(Collectors.<CompletableFuture<Double>>toList());

There is actually a lot going on here. Defining the thread pool and the sites to crawl is obvious. But there is this chained expression computing relevanceFutures. The sequence of map() calls with collect() at the end is quite descriptive. Starting from a list of web sites we transform each site (String) into CompletableFuture<String> by submitting an asynchronous task (downloadSite()) into the thread pool.

So we have a list of CompletableFuture<String>. We continue transforming it, this time applying the parse() method on each of them. Remember that thenApply() will invoke the supplied lambda when the underlying future completes, and returns CompletableFuture<Document> immediately. The third and last transformation step composes each CompletableFuture<Document> in the input list with calculateRelevance(). Note that calculateRelevance() returns CompletableFuture<Double> instead of Double, thus we use thenCompose() rather than thenApply(). After that many stages we finally collect() a list of CompletableFuture<Double>.

Now we would like to run some computations on all results. We have a list of futures and we would like to know when all of them (i.e. the last one) complete. Of course we can register a completion callback on each future and use CountDownLatch to block until all callbacks are invoked. I am too lazy for that, so let us utilize the existing CompletableFuture.allOf(). Unfortunately it has two minor drawbacks: it takes varargs instead of a Collection, and it doesn't return a future of aggregated results but Void instead. By aggregated results I mean: if we provide List<CompletableFuture<Double>>, such a method should return CompletableFuture<List<Double>>, not CompletableFuture<Void>!
Luckily it's easy to fix with a bit of glue code:

    private static <T> CompletableFuture<List<T>> sequence(List<CompletableFuture<T>> futures) {
        CompletableFuture<Void> allDoneFuture =
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[futures.size()]));
        return allDoneFuture.thenApply(v ->
            futures.stream().
                map(future -> future.join()).
                collect(Collectors.<T>toList())
        );
    }

Watch the sequence() argument and return types carefully. The implementation is surprisingly simple: the trick is to use the existing allOf(), and when allDoneFuture completes (which means all underlying futures are done), simply iterate over all futures and join() (blocking wait) on each. However this call is guaranteed not to block because by now all futures have completed! Equipped with such a utility method we can finally complete our task:

    CompletableFuture<List<Double>> allDone = sequence(relevanceFutures);
    CompletableFuture<OptionalDouble> maxRelevance = allDone.thenApply(relevances ->
        relevances.stream().
            mapToDouble(Double::valueOf).
            max()
    );

This one is easy: when allDone completes, apply our function that computes the maximal relevance in the whole set. maxRelevance is still a future. By the time your JVM reaches this line, probably none of the websites are downloaded yet. But we encapsulated business logic on top of futures, stacking them in an event-driven manner. The code remains readable (a version without lambdas and with ordinary Futures would be at least twice as long) but avoids blocking the main thread. Of course allDone can as well be an intermediate step; we can further transform it, not really having the result yet.

Shortcomings

CompletableFuture in Java 8 is a huge step forward: from a tiny, thin abstraction over an asynchronous task to a full-blown, functional, feature-rich utility. However after a few days of playing with it I found a few minor disadvantages:

CompletableFuture.allOf() returning CompletableFuture<Void>, discussed earlier.
I think it's fair to say that if I pass a collection of futures and want to wait for all of them, I would also like to extract the results easily when they arrive. It's even worse with CompletableFuture.anyOf(). If I am waiting for any of the futures to complete, I can't imagine passing futures of different types, say two CompletableFuture instances with different type parameters. If I don't care which one completes first, how am I supposed to handle the return type? Typically you will pass a collection of homogeneous futures (all with the same type parameter) and then anyOf() can simply return a future of that type (instead of CompletableFuture<Object>).

Mixing settable and listenable abstractions. In Guava there is ListenableFuture and SettableFuture extending it. ListenableFuture allows registering callbacks while SettableFuture adds the possibility to set the value of the future (resolve it) from an arbitrary thread and context. CompletableFuture is equivalent to SettableFuture but there is no limited version equivalent to ListenableFuture. Why is it a problem? If an API returns CompletableFuture and then two threads wait for it to complete (nothing wrong with that), one of these threads can resolve this future and wake up the other thread, while it's only the API implementation that should do it. And when the API tries to resolve the future later, the call to complete() is ignored. It can lead to really nasty bugs, which are avoided in Guava by separating these two responsibilities.

CompletableFuture is ignored in JDK. ExecutorService was not retrofitted to return CompletableFuture. Literally, CompletableFuture is not referenced anywhere in JDK. It's a really useful class, backward compatible with Future, but not really promoted in the standard library.

Bloated API (?) Fifty methods in total, most in three variants. Splitting settable and listenable (see above) would help. Also some methods like runAfterBoth() or runAfterEither() IMHO do not really belong to any CompletableFuture.
Is there a difference between fast.runAfterBoth(predictable, ...) and predictable.runAfterBoth(fast, ...)? No, but the API favours one or the other. Actually I believe runAfterBoth(fast, predictable, ...) expresses my intention much better.

CompletableFuture.getNow(T) should take Supplier<T> instead of a raw reference. In the example below expensiveAlternative() is always called, irrespective of whether the future has finished or not:

    future.getNow(expensiveAlternative());

However we can easily tweak this behaviour (I know, there is a small race condition here, but the original getNow() works this way as well):

    public static <T> T getNow(
            CompletableFuture<T> future,
            Supplier<T> valueIfAbsent) throws ExecutionException, InterruptedException {
        if (future.isDone()) {
            return future.get();
        } else {
            return valueIfAbsent.get();
        }
    }

With this utility method we can avoid calling expensiveAlternative() when it's not needed:

    getNow(future, () -> expensiveAlternative());
    //or:
    getNow(future, this::expensiveAlternative);

Overall, CompletableFuture is a wonderful new tool in our JDK belt. Minor API issues and sometimes too verbose syntax due to limited type inference shouldn't stop you from using it. At least it's a solid foundation for better abstractions and more robust code.
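Since the article is about comparing approaches, here is how the same shape (download, parse, compute relevance, gather everything, take the maximum) might look with Python's standard asyncio. This is a cross-language sketch only, not a port of the Java code: download_site, parse and calculate_relevance are hypothetical stand-ins that fake their work and never touch the network, and asyncio.gather() plays the role that the sequence() helper plays above.

```python
import asyncio


async def download_site(site):
    # Hypothetical stand-in: pretend to fetch the page, no network access.
    await asyncio.sleep(0)
    return "<html>" + site + "</html>"


def parse(content):
    # Hypothetical stand-in for building a Document from the page content.
    return {"length": len(content)}


async def calculate_relevance(doc):
    # Hypothetical stand-in: relevance is simply the content length.
    await asyncio.sleep(0)
    return float(doc["length"])


async def relevance_of(site):
    content = await download_site(site)    # roughly supplyAsync(downloadSite)
    doc = parse(content)                   # roughly thenApply(parse)
    return await calculate_relevance(doc)  # roughly thenCompose(calculateRelevance)


async def max_relevance(sites):
    # gather() takes many awaitables and yields one list of results,
    # which is the job sequence() does for CompletableFuture.
    relevances = await asyncio.gather(*(relevance_of(site) for site in sites))
    return max(relevances)
```

It can be driven with asyncio.run(max_relevance(["www.google.com", "www.youtube.com"])). Note one difference in behaviour: asyncio runs these coroutines on a single event-loop thread, whereas the Java version above uses a four-thread pool.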
May 17, 2013
by Tomasz Nurkiewicz
· 48,030 Views · 6 Likes
The Big List of 256 Programming Languages
Check out a list of 256 programming languages, from ABC to Z shell.
May 16, 2013
by Robert Diana
· 250,431 Views · 5 Likes
Lazy sequences implementation for Java 8
I just published the LazySeq library on GitHub - the result of my Java 8 experiments recently. I hope you will enjoy it. Even if you don't find it very useful, it's still a great lesson of functional programming in Java 8 (and in general). Also it's probably the first community library targeting Java 8! Introduction A Lazy sequence is a data structure that is computed only when its elements are actually needed. All operations on lazy sequences, like map() and filter() are lazy as well, postponing invocation up to the moment when it is really necessary. Lazy sequences are always traversed from the beginning using very cheap first/rest decomposition (head() and tail()). An important property of lazy sequences is that they can represent infinite streams of data, e.g. all natural numbers or temperature measurements over time. Lazy sequence remembers already computed values so if you access the Nth element, all elements from 1 to N-1 are computed as well and cached. Despite that LazySeq (being at the core of many functional languages and algorithms) is immutable and thread-safe. Rationale This library is heavily inspired by scala.collection.immutable.Stream and aims to provide immutable, thread-safe and easy to use lazy sequence implementation, possibly infinite. See Lazy sequences in Scala and Clojure for some use cases. Stream class name is already used in Java 8, therefore LazySeq was chosen, similar to lazy-seq in Clojure. Speaking of Stream, at first it looks like a lazy sequence implementation available out-of-the-box. However, quoting Javadoc: Streams are not data structures and: Once an operation has been performed on a stream, it is considered consumed and no longer usable for other operations. In other words java.util.stream.Stream is just a thin wrapper around existing collection, suitable for one time use. More akin to Iterator than to Stream in Scala. This library attempts to fill this niche. 
Of course implementing a lazy sequence data structure was possible prior to Java 8, but the lack of lambdas makes working with such a data structure tedious and too verbose.

Getting started

Building and working with lazy sequences in 10 minutes.

Infinite sequence of all natural numbers

In order to create a lazy sequence you use the LazySeq.cons() factory method, which accepts the first element (head) and a function that might later be used to compute the rest (tail). For example, in order to produce a lazy sequence of natural numbers with a given start element you simply say:

    private LazySeq<Integer> naturals(int from) {
        return LazySeq.cons(from, () -> naturals(from + 1));
    }

There is really no recursion here. If there was, calling naturals() would quickly result in StackOverflowError as it calls itself without a stop condition. However the () -> naturals(from + 1) expression defines a function returning LazySeq<Integer> (Supplier<LazySeq<Integer>> to be precise) that this data structure will invoke, but only if needed. Look at the code below: how many times do you think the naturals() function was called (except in the first line)?

    final LazySeq<Integer> ints = naturals(2);

    final LazySeq<String> strings = ints.
        map(n -> n + 10).
        filter(n -> n % 2 == 0).
        take(10).
        flatMap(n -> Arrays.asList(0x10000 + n, n)).
        distinct().
        map(Integer::toHexString);

The first invocation of naturals(2) returns a lazy sequence starting from 2, but the rest (3, 4, 5, ...) is not computed yet. Later we map() over this sequence, filter() it, take() the first 10 elements, remove duplicates, etc. All these operations do not evaluate the sequence and are as lazy as possible. For example take(10) doesn't evaluate the first 10 elements eagerly to return them. Instead a new lazy sequence is returned which remembers that it should truncate the original sequence at the 10th element. The same applies to distinct(). It doesn't evaluate the whole sequence to extract all unique values (otherwise the code above would explode quickly, traversing an infinite amount of natural numbers).
Instead it returns a new sequence with only the first element. If you ever ask for the second unique element, it will lazily evaluate the tail, but only as much as needed. Check out the toString() output:

    System.out.println(strings);
    //[1000c, ?]

The question mark (?) says: "there might be something more in that collection, but I don't know it yet". Do you understand where 1000c came from? Look carefully:

  • Start from an infinite stream of natural numbers starting from 2
  • Add 10 to each element (so the first element becomes 12, or C in hex)
  • filter() out odd numbers (12 is even so it stays)
  • take() the first 10 elements from the sequence so far
  • Each element is replaced by two elements: that element plus 0x10000 and the element itself (flatMap()). This does not yield a sequence of pairs, but a sequence of integers that is twice as long
  • We ensure only distinct() elements will be returned
  • In the end we turn the integers into hex strings.

As you can see, none of these operations really requires evaluating the whole stream. Only the head is being transformed, and this is what we see in the end. So when is this data structure actually evaluated? When it absolutely must be, e.g. during a side-effect traversal:

    strings.force();

    //or
    strings.forEach(System.out::println);

    //or
    final List<String> list = strings.toList();

    //or
    for (String s : strings) {
        System.out.println(s);
    }

Each of the statements above alone will force evaluation of the whole lazy sequence. Not very smart if our sequence was infinite, but strings was limited to the first 10 elements so it will not run infinitely. If you want to force only part of the sequence, simply call strings.take(5).force(). BTW have you noticed that we can iterate over LazySeq strings using the standard Java 5 for-each syntax?
That's because LazySeq implements the List interface, and thus plays nicely with the Java Collections Framework ecosystem:

    import java.util.AbstractList;

    public abstract class LazySeq<E> extends AbstractList<E>

Please keep in mind that once a lazy sequence is evaluated (computed) it will cache (memoize) its elements for later use. This makes lazy sequences great for representing infinite or very long streams of data that are expensive to compute.

iterate()

Building an infinite lazy sequence very often boils down to providing an initial element and a function that produces the next item based on the previous one. In other words the second element is a function of the first one, the third element is a function of the second one, and so on. The convenience function LazySeq.iterate() is provided for such circumstances. The ints definition can now look like this:

    final LazySeq<Integer> ints = LazySeq.iterate(2, n -> n + 1);

We start from 2 and each subsequent element is represented as the previous element + 1.

More examples: Fibonacci sequence and Collatz conjecture

No article about lazy data structures can be left without a Fibonacci numbers example:

    private static LazySeq<Integer> lastTwoFib(int first, int second) {
        return LazySeq.cons(
            first,
            () -> lastTwoFib(second, first + second)
        );
    }

The Fibonacci sequence is infinite as well, but we are free to transform it in multiple ways (below, fib is assumed to be lastTwoFib(0, 1), which matches the outputs shown):

    System.out.println(
        fib.
            drop(5).
            take(10).
            toList()
    );
    //[5, 8, 13, 21, 34, 55, 89, 144, 233, 377]

    final int firstAbove1000 = fib.
        filter(n -> (n > 1000)).
        head();

    fib.get(45);

See how easy and natural it is to work with an infinite stream of numbers? drop(5).take(10) skips the first 5 elements and displays the next 10. At this point the first 15 numbers are already computed and will never be computed again. Finding the first Fibonacci number above 1000 (happens to be 1597) is very straightforward. head() is always precomputed by filter(), so no further evaluation is needed. Last but not least we can simply ask for the 45th Fibonacci number (0-based) and get 1134903170.
If you ever try to access any Fibonacci number up to this one, they are precomputed and fast to retrieve.

Finite sequences (Collatz conjecture)

The Collatz conjecture is also quite an interesting problem. For each positive integer n we compute the next integer using the following algorithm:

  • n/2 if n is even
  • 3n + 1 if n is odd

For example, starting from 10 the series looks as follows: 10, 5, 16, 8, 4, 2, 1. The series ends when it reaches 1. Mathematicians believe that starting from any integer we will eventually reach 1, but it's not yet proven. Let us create a lazy sequence that generates the Collatz series for a given n, but only as many elements as needed. As stated above, this time our sequence will be finite:

    private LazySeq<Long> collatz(long from) {
        if (from > 1) {
            final long next = from % 2 == 0 ? from / 2 : from * 3 + 1;
            return LazySeq.cons(from, () -> collatz(next));
        } else {
            return LazySeq.of(1L);
        }
    }

This implementation is driven directly by the definition. For each number greater than 1, return that number plus the lazily evaluated (() -> collatz(next)) rest of the stream. As you can see, if 1 is given, we return a single-element lazy sequence using the special of() factory method. Let's test it with the aforementioned 10:

    final LazySeq<Long> collatz = collatz(10);

    collatz.filter(n -> (n > 10)).head();
    collatz.size();

filter() allows us to find the first number in the sequence that is greater than 10. Remember that the lazy sequence will have to traverse its contents (evaluate itself), but only to the point where it finds the first matching element. Then it stops, ensuring it computes as little as possible. However size(), in order to calculate the total number of elements, must traverse the whole sequence. Of course this can only work with finite lazy sequences; calling size() on an infinite sequence will end up poorly. If you play a bit with this sequence you will quickly realize that sequences for different numbers share the same suffix (they always end with the same sequence of numbers). This begs for some caching/structural sharing.
See CollatzConjectureTest for details.

But can it be used for something, you know... useful? Real life?

Infinite sequences of numbers are great, but not very practical in real life. Maybe some more down-to-earth examples? Imagine you have a collection and you need to pick a few items from that collection randomly. Instead of a collection I will use a function returning random Latin characters:

    private char randomChar() {
        return (char) ('A' + (int) (Math.random() * ('Z' - 'A' + 1)));
    }

But there is a twist. You need N (N < 26, the number of Latin characters) unique values. Simply calling randomChar() a few times doesn't guarantee uniqueness. There are a few approaches to this problem; with LazySeq it's pretty straightforward:

    LazySeq<Character> charStream = LazySeq.continually(this::randomChar);
    LazySeq<Character> uniqueCharStream = charStream.distinct();

continually() simply invokes the given function for each element when needed. Thus charStream will be an infinite stream of random characters. Of course they can't be unique. However uniqueCharStream guarantees that its output is unique. It does so by examining the next element of the underlying charStream and rejecting items that already appeared. We can now say uniqueCharStream.take(4) and be sure that no duplicates will appear. Once again notice that continually(this::randomChar).distinct().take(4) really calls randomChar() only once! As long as you don't consume this sequence, it remains lazy and postpones evaluation as long as possible.

Another example involves loading batches (pages) of data from a database. Using ResultSet or Iterator is cumbersome, but loading the whole data set into memory is often not feasible. An alternative involves loading the first batch of data eagerly and then providing a function to load the next batches. Data is loaded only when it's really needed and we don't suffer performance or scalability issues.
First let's define an abstract API for loading batches of data from the database:

    public List loadPage(int offset, int max) {
        //load records from offset to offset + max
    }

I abstract from the technology entirely, but you get the point. Imagine that we now define a LazySeq that starts from row 0 and loads the next pages only when needed:

    public static final int PAGE_SIZE = 5;

    private LazySeq records(int from) {
        return LazySeq.concat(
            loadPage(from, PAGE_SIZE),
            () -> records(from + PAGE_SIZE)
        );
    }

When creating a new LazySeq instance by calling records(0), the first page of 5 elements is loaded. This means that the first 5 sequence elements are already computed. If you ever try to access the 6th or above, the sequence will automatically load all missing records and cache them. In other words you never compute the same element twice.

More useful tools when working with sequences are the grouped() and sliding() methods. The first partitions the input sequence into groups of equal size. Take this as an example, also proving that these methods are lazy as always:

    final LazySeq<Character> chars = LazySeq.of('A', 'B', 'C', 'D', 'E', 'F', 'G');

    chars.grouped(3);
    //[[A, B, C], ?]

    chars.grouped(3).force();    //force evaluation
    //[[A, B, C], [D, E, F], [G]]

and similarly for sliding():

    chars.sliding(3);
    //[[A, B, C], ?]

    chars.sliding(3).force();    //force evaluation
    //[[A, B, C], [B, C, D], [C, D, E], [D, E, F], [E, F, G]]

These two methods are extremely useful. You can look at your data through a sliding window (e.g. to compute a moving average) or partition it into equal-length buckets.

The last utility method you may find useful is scan(), which iterates (lazily, of course) over the input stream and constructs every element of the output by applying a function to the previous output element and the current element of the input. A code snippet is worth a thousand words:

    LazySeq<Integer> list = LazySeq.
        numbers(1).
        scan(0, (a, x) -> a + x);

    list.take(10).force();    //[0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

LazySeq.numbers(1) is a sequence of natural numbers (1, 2, 3...).
scan() creates a new sequence that starts from 0 and, for each element of the input (natural numbers), adds it to the last element of itself. So we get: [0, 0+1, 0+1+2, 0+1+2+3, 0+1+2+3+4, 0+1+2+3+4+5...]. If you want a sequence of growing strings, just replace a few types:

    LazySeq.continually("*").
        scan("", (s, c) -> s + c).
        map(s -> "|" + s + "\\").
        take(10).
        forEach(System.out::println);

And enjoy this beautiful triangle:

    |\
    |*\
    |**\
    |***\
    |****\
    |*****\
    |******\
    |*******\
    |********\
    |*********\

Java collections framework interoperability

LazySeq implements the java.util.List interface, thus it can be used in a variety of places. Moreover, it also implements the Java 8 enhancements to collections, namely streams and collectors:

    lazySeq.
        stream().
        map(n -> n + 1).
        flatMap(n -> asList(0, n - 1).stream()).
        filter(n -> n != 0).
        substream(4, 18).
        limit(10).
        sorted().
        distinct().
        collect(Collectors.toList());

Note that substream() comes from the pre-release Java 8 builds this code was written against; in the final Java 8 API the equivalent of substream(4, 18) is skip(4).limit(14).

However, streams in Java 8 exist to provide the very feature that is the foundation of LazySeq: lazy evaluation. The example above postpones all intermediate steps until collect() is called. With LazySeq you can safely skip .stream() and work directly on the sequence:

    lazySeq.
        map(n -> n + 1).
        flatMap(n -> asList(0, n - 1)).
        filter(n -> n != 0).
        slice(4, 18).
        limit(10).
        sorted().
        distinct();

Moreover, LazySeq provides a special-purpose collector (see: LazySeq.toLazySeq()) that avoids evaluation even when used with collect(), which normally forces computation of the full collection.

Implementation details

Each lazy sequence is built around the idea of an eagerly computed head and a lazily evaluated tail represented as a function. This is very similar to the classic recursive definition of a singly linked list:

    class List<T> {
        private final T head;
        private final List<T> tail;
        //...
    }

However, in the case of a lazy sequence the tail is given as a function, not a value. Invocation of that function is postponed as long as possible:

    class Cons<E> extends LazySeq<E> {
        private final E head;
        private LazySeq<E> tailOrNull;
        private final Supplier<LazySeq<E>> tailFun;

        @Override
        public LazySeq<E> tail() {
            if (tailOrNull == null) {
                tailOrNull = tailFun.get();
            }
            return tailOrNull;
        }
        //...
    }

For the full implementation see Cons.java, and FixedCons.java which is used when the tail is known at creation time (for example LazySeq.of(1, 2) as opposed to LazySeq.cons(1, () -> someTailFun())).

Pitfalls and common dangers

Common issues and misunderstandings are described below.

Evaluating too much

One of the biggest dangers of working with infinite sequences is trying to evaluate them completely, which obviously leads to infinite computation. The idea behind an infinite sequence is not to evaluate it in its entirety but to take as much as we need without introducing artificial limits and accidental complexity (see the database loading example).

However, evaluating the whole sequence by accident is all too easy. For example, calling LazySeq.size() must evaluate the whole sequence and will run infinitely, eventually filling up the stack or heap (implementation detail). There are other methods that require full traversal in order to function properly, e.g. allMatch(), which makes sure all elements match a given predicate. Some methods are even more dangerous, because whether they finish or not depends on the data in the sequence. For example, anyMatch() may return immediately if the head matches the predicate, or never.

Sometimes we can easily avoid costly operations by using more deterministic methods. For example:

    seq.size() <= 10    //BAD

may not work or may be extremely slow if seq is infinite. However, we can achieve the same with the (more) predictable:

    seq.drop(10).isEmpty()

Remember that lazy sequences are immutable (so we don't really mutate seq); drop(n) is typically O(n) while isEmpty() is O(1).
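The head-plus-lazy-tail design and the bounded emptiness check described above can be tried out with a small stand-alone sketch. It is not the library's code: LazyCell and all of its methods are simplified, hypothetical stand-ins written for this illustration, assuming nothing beyond the JDK. It shows an eagerly stored head, a tail supplier that is invoked at most once, and why drop(10).isEmpty() terminates even on an infinite sequence.

```java
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Simplified stand-in for a lazy sequence (hypothetical, not LazySeq itself). */
final class LazyCell<E> {
    private static final LazyCell<?> EMPTY = new LazyCell<>(null, null);

    private final E head;                    // computed eagerly
    private Supplier<LazyCell<E>> tailFun;   // invoked at most once, then released
    private LazyCell<E> tailOrNull;          // memoized result of tailFun

    private LazyCell(E head, Supplier<LazyCell<E>> tailFun) {
        this.head = head;
        this.tailFun = tailFun;
    }

    @SuppressWarnings("unchecked")
    static <E> LazyCell<E> empty() {
        return (LazyCell<E>) EMPTY;
    }

    static <E> LazyCell<E> cons(E head, Supplier<LazyCell<E>> tailFun) {
        return new LazyCell<>(head, tailFun);
    }

    /** Infinite sequence from, from + 1, from + 2, ... */
    static LazyCell<Integer> numbers(int from) {
        return cons(from, () -> numbers(from + 1));
    }

    boolean isEmpty() {
        return this == EMPTY;
    }

    E head() {
        if (isEmpty()) throw new NoSuchElementException("empty sequence");
        return head;
    }

    LazyCell<E> tail() {
        if (isEmpty()) throw new NoSuchElementException("empty sequence");
        if (tailOrNull == null) {
            tailOrNull = tailFun.get();
            tailFun = null;
        }
        return tailOrNull;
    }

    /** Skips at most n cells; touches only the first n elements. */
    LazyCell<E> drop(int n) {
        LazyCell<E> cur = this;
        for (int i = 0; i < n && !cur.isEmpty(); i++) {
            cur = cur.tail();
        }
        return cur;
    }

    public static void main(String[] args) {
        // Infinite sequence: the check evaluates only ten tails and stops.
        System.out.println(numbers(1).drop(10).isEmpty()); // false

        // Three-element sequence: nothing is left after dropping ten.
        LazyCell<Integer> three =
            cons(1, () -> cons(2, () -> cons(3, () -> LazyCell.<Integer>empty())));
        System.out.println(three.drop(10).isEmpty()); // true
    }
}
```

A size() method on this class would have to follow tail() until it reached the empty cell, which never happens for numbers(1); that is the difference between the two checks.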
When in doubt, consult the source code or JavaDoc to make sure your operation won't evaluate your sequence too eagerly. Also be very cautious when using LazySeq where java.util.Collection or java.util.List is expected.

Holding unnecessary reference to head

Lazy sequences by definition remember already computed elements. You have to be aware of that, otherwise your sequence (especially an infinite one) will quickly fill up available memory. However, because LazySeq is just a fancy linked list, if you no longer keep a reference to the head (but only to some element in the middle), the beginning becomes eligible for garbage collection. For example:

    //LazySeq first = seq.take(10);
    seq = seq.drop(10);

The first ten elements are dropped and we assume nothing holds a reference to what was previously kept in seq. This makes the first ten elements eligible for garbage collection. However, if we uncomment the first line and keep a reference to the old head in first, the JVM will not release any memory.

Let's put that into perspective. The following piece of code will eventually throw OutOfMemoryError because the infinite variable keeps holding the beginning of the sequence, and therefore all the elements created so far:

    LazySeq<Big> infinite = LazySeq.continually(Big::new);
    for (Big arr : infinite) {
        //
    }

However, by inlining the call to continually() or extracting it to a method, this code works flawlessly (well, it still runs forever, but uses almost no memory):

    private LazySeq<Big> getContinually() {
        return LazySeq.continually(Big::new);
    }

    for (Big arr : getContinually()) {
        //
    }

What's the difference? The for-each loop uses iterators underneath. LazySeqIterator doesn't hold a reference to the old head() when it advances, so if nothing else references that head, it will be eligible for garbage collection. Here is what javac actually produces when for-each is used:

    for (Iterator<Big> cur = getContinually().iterator(); cur.hasNext(); ) {
        final Big arr = cur.next();
        //...
    }

TL;DR Your sequence grows while being traversed.
If you keep holding one end while the other grows, it will eventually blow up. Just like your first-level cache in Hibernate if you load too much in one transaction. Use only as much as needed.

Converting to plain Java collections

Converting is simple, but dangerous. This is a consequence of the points above. You can convert a lazy sequence to java.util.List by calling toList():

    LazySeq<Integer> even = LazySeq.numbers(0, 2);
    even.take(5).toList();    //[0, 2, 4, 6, 8]

or by using a Collector from Java 8, which has a richer API:

    even.
        stream().
        limit(5).
        collect(Collectors.toSet())    //[4, 6, 0, 2, 8]

But remember that Java collections are finite by definition, so avoid converting lazy sequences to collections explicitly. Note that LazySeq is already a List, and thus an Iterable and a Collection. It also has an efficient LazySeq.iterator(). If you can, simply pass the LazySeq instance directly and it may just work.

Performance, time and space complexity

head() of every sequence (except the empty one) is always computed eagerly, thus accessing it is a fast O(1). Computing tail() may take anything from O(1) (if it was already computed) to infinite time. As an example take this valid stream:

    import static com.blogspot.nurkiewicz.lazyseq.LazySeq.cons;
    import static com.blogspot.nurkiewicz.lazyseq.LazySeq.continually;

    LazySeq<Integer> oneAndZeros = cons(
        1,
        () -> continually(0)
    ).
    filter(x -> (x > 0));

It represents 1 followed by an infinite number of 0s. By keeping only positive numbers (x > 0) we get a sequence with the same head, but filtering of the tail is delayed (lazy). However, if we now carelessly call oneAndZeros.tail(), LazySeq will keep computing more and more of this infinite sequence, and since there is no positive element after the initial 1, this operation will run forever, eventually throwing StackOverflowError or OutOfMemoryError (this is an implementation detail).

However, if you ever reach this state, it's probably a programming bug or a misuse of the library. Typically tail() will be close to O(1).
On the other hand, if you have plenty of operations already "stacked", calling tail() will trigger them rapidly one after another, so tail() run time is heavily dependent on your data structure.

Most operations on LazySeq are O(1) since they are lazy. Some operations, like get(n) or drop(n), are O(n) (n represents the parameter, not the sequence length). In general, run time will be similar to a normal linked list.

Because LazySeq remembers all already computed values in a singly linked list, memory consumption is always O(n), where n is the number of already computed elements.

Troubleshooting

Error invalid target release: 1.8 during maven build

If you see this error message during a maven build:

    [INFO] BUILD FAILURE
    ...
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project lazyseq: Fatal error compiling: invalid target release: 1.8 -> [Help 1]

it means you are not compiling using Java 8. Download JDK 8 with lambda support and let maven use it:

    $ export JAVA_HOME=/path/to/jdk8

I get StackOverflowError or program hangs infinitely

When working with LazySeq you sometimes get StackOverflowError or OutOfMemoryError:

    java.lang.StackOverflowError
        at sun.misc.Unsafe.allocateInstance(Native Method)
        at java.lang.invoke.DirectMethodHandle.allocateInstance(DirectMethodHandle.java:426)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.iterate(LazySeq.java:118)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.lambda$0(LazySeq.java:118)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq$$Lambda$2.get(Unknown Source)
        at com.blogspot.nurkiewicz.lazyseq.Cons.tail(Cons.java:32)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.size(LazySeq.java:325)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.size(LazySeq.java:325)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.size(LazySeq.java:325)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.size(LazySeq.java:325)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.size(LazySeq.java:325)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.size(LazySeq.java:325)
        at com.blogspot.nurkiewicz.lazyseq.LazySeq.size(LazySeq.java:325)

Care must be taken when working with possibly infinite data structures. Avoid calling operations that must (size(), allMatch(), minBy(), forEach(), reduce(), ...) or can (filter(), distinct(), ...) traverse the whole sequence in order to give correct results. See Pitfalls for more examples and ways to avoid them.

Maturity

Quality

This project was started as an exercise and is not battle-proven. But a healthy suite of 300+ unit tests (3:1 test code/production code ratio) guards quality and functional correctness. I also make sure LazySeq is as lazy as possible by mocking tail functions and verifying they are called as rarely as possible.

Contributions and bug reports

If you find a bug or a missing feature, don't hesitate to open a new ticket or a pull request. I would also love to see more interesting usages of LazySeq in the wild.

Possible improvements

  • Just like FixedCons is used when the tail is known up-front, consider an IterableCons that wraps an existing Iterable in one node rather than building a FixedCons hierarchy. This can be used for all concat methods.
  • Parallel processing support (implementing spliterator?)

License

This project is released under version 2.0 of the Apache License.
May 15, 2013
by Tomasz Nurkiewicz
· 28,962 Views · 1 Like