DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

The Latest Languages Topics

article thumbnail
Immutability Through Interfaces
It is often desirable to have immutable objects, objects that cannot be modified once constructed. Typically, an immutable object has fields that are declared as final and are set in the object constructor. There are getters for the fields, but no setters since the values cannot be saved. Once created, the object state doesn’t change and the objects can be shared across different threads since the state is fixed. There are plenty of caveats to that statement, for example if you have a list, while the list field reference may be final, the list itself may be able to change and have values added and removed which would spoil the immutability. Achieving this state can often be difficult because we rely on a number of tools and frameworks that may not support immutability such as any framework that builds an object by creating it and then setting values on it. However, one way around this would be to take a mutable object and make it immutable through interfaces. We do this by creating an interface that represents all the getters for an object, but none of the setters. Given a domain entity of a person with an id, first and last name fields we can define an immutable interface : public interface ImmutablePerson { public int getId(); public String getFirstName(); public String getLastName(); } We can then implement this interface in our person class : public class PersonEntity implements ImmutablePerson { private int id; private String firstName; private String lastName; public int getId() { return id; } public void setId(int id) { this.id = id; } public String getFirstName() { return firstName; } public void setFirstName(String firstName) { this.firstName = firstName; } } Our API deals with the PersonEntity class internally, but exposes the object as the ImmutablePerson only. public ImmutablePerson loadPerson(int id) { PersonEntity result = someLoadingMechanism.getPerson(id); result.setxxxxx(somevalue); //we can change it internally return result; //once it has been returned it is only accessed as an immutable person } To return a mutable instance for modification, you can return the PersonEntity. Alternatively, you can add aMutablePerson interface with just the setters and implement that. Unless you want to keep casting between the Immutable/Mutable types, the MutablePerson interface can extend the ImmutablePerson interface so the writable version is also readable. public interface MutablePerson extends ImmutablePerson { public void setId(int id) { public void setFirstName(String firstName); public void setLastName(String lastName); } Granted, someone can easily cast the object to a PersonEntity or MutablePerson and change the values, but then, you can also change static final variables with reflection if you wanted to. The ability to cast the object to make it writable only could be an internal-use only feature. This mechanism can be used to return immutable objects from persistence layers. There are some caveats with certain ORM tools since they use lazy loading mechanisms that would break the immutability. Underneath the interface it is still a mutable object. With some additional code, you could create immutable collections by wrapping the underlying collections in read only collections and returning them to the user. Some of the JVM benefits of immutability granted upon variables marked as final would not be given to this type of solution since the underlying variables are not final or immutable.
November 15, 2014
by Andy Gibson
· 13,871 Views · 1 Like
article thumbnail
How to Deal with MySQL Deadlocks
Originally Written by Peiran Song A deadlock in MySQL happens when two or more transactions mutually hold and request for locks, creating a cycle of dependencies. In a transaction system, deadlocks are a fact of life and not completely avoidable. InnoDB automatically detects transaction deadlocks, rollbacks a transaction immediately and returns an error. It uses a metric to pick the easiest transaction to rollback. Though an occasional deadlock is not something to worry about, frequent occurrences call for attention. Before MySQL 5.6, only the latest deadlock can be reviewed using SHOW ENGINE INNODB STATUS command. But with Percona Toolkit’s pt-deadlock-logger you can have deadlock information retrieved from SHOW ENGINE INNODB STATUS at a given interval and saved to a file or table for late diagnosis. For more information on using pt-deadlock-logger, see this post. With MySQL 5.6, you can enable a new variable innodb_print_all_deadlocks to have all deadlocks in InnoDB recorded in mysqld error log. Before and above all diagnosis, it is always an important practice to have the applications catch deadlock error (MySQL error no. 1213) and handle it by retrying the transaction. How to diagnose a MySQL deadlock A MySQL deadlock could involve more than two transactions, but the LATEST DETECTED DEADLOCK section only shows the last two transactions. Also it only shows the last statement executed in the two transactions, and locks from the two transactions that created the cycle. What are missed are the earlier statements that might have really acquired the locks. I will show some tips on how to collect the missed statements. Let’s look at two examples to see what information is given. Example 1: 1 141013 6:06:22 2 *** (1) TRANSACTION: 3 TRANSACTION 876726B90, ACTIVE 7 sec setting auto-inc lock 4 mysql tables in use 1, locked 1 5 LOCK WAIT 9 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 4 6 MySQL thread id 155118366, OS thread handle 0x7f59e638a700, query id 87987781416 localhost msandbox update 7 INSERT INTO t1 (col1, col2, col3, col4) values (10, 20, 30, 'hello') 8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED: 9 TABLE LOCK table `mydb`.`t1` trx id 876726B90 lock mode AUTO-INC waiting 10 *** (2) TRANSACTION: 11 TRANSACTION 876725B2D, ACTIVE 9 sec inserting 12 mysql tables in use 1, locked 1 13 876 lock struct(s), heap size 80312, 1022 row lock(s), undo log entries 1002 14 MySQL thread id 155097580, OS thread handle 0x7f585be79700, query id 87987761732 localhost msandbox update 15 INSERT INTO t1 (col1, col2, col3, col4) values (7, 86, 62, "a lot of things"), (7, 76, 62, "many more") 16 *** (2) HOLDS THE LOCK(S): 17 TABLE LOCK table `mydb`.`t1` trx id 876725B2D lock mode AUTO-INC 18 *** (2) WAITING FOR THIS LOCK TO BE GRANTED: 19 RECORD LOCKS space id 44917 page no 529635 n bits 112 index `PRIMARY` of table `mydb`.`t2` trx id 876725B2D lock mode S locks rec but not gap waiting 20 *** WE ROLL BACK TRANSACTION (1) Line 1 gives the time when the deadlock happened. If your application code catches and logs deadlock errors,which it should, then you can match this timestamp with the timestamps of deadlock errors in application log. You would have the transaction that got rolled back. From there, retrieve all statements from that transaction. Line 3 & 11, take note of Transaction number and ACTIVE time. If you log SHOW ENGINE INNODB STATUS output periodically(which is a good practice), then you can search previous outputs with Transaction number to hopefully see more statements from the same transaction. The ACTIVE sec gives a hint on whether the transaction is a single statement or multi-statement one. Line 4 & 12, the tables in use and locked are only with respect to the current statement. So having 1 table in use does not necessarily mean that the transaction involves 1 table only. Line 5 & 13, this is worth of attention as it tells how many changes the transaction had made, which is the “undo log entries” and how many row locks it held which is “row lock(s)”. These info hints the complexity of the transaction. Line 6 & 14, take note of thread id, connecting host and connecting user. If you use different MySQL users for different application functions which is another good practice, then you can tell which application area the transaction comes from based on the connecting host and user. Line 9, for the first transaction, it only shows the lock it was waiting for, in this case the AUTO-INC lock on table t1. Other possible values are S for shared lock and X for exclusive with or without gap locks. Line 16 & 17, for the second transaction, it shows the lock(s) it held, in this case the AUTO-INC lock which was what TRANSACTION (1) was waiting for. Line 18 & 19 shows which lock TRANSACTION (2) was waiting for. In this case, it was a shared not gap record lock on another table’s primary key. There are only a few sources for a shared record lock in InnoDB: 1) use of SELECT … LOCK IN SHARE MODE 2) on foreign key referenced record(s) 3) with INSERT INTO… SELECT, shared locks on source table The current statement of trx(2) is a simple insert to table t1, so 1 and 3 are eliminated. By checking SHOW CREATE TABLE t1, you could confirm that the S lock was due to a foreign key constraint to the parent table t2. Example 2: With MySQL community version, each record lock has the record content printed: 1 2014-10-11 10:41:12 7f6f912d7700 2 *** (1) TRANSACTION: 3 TRANSACTION 2164000, ACTIVE 27 sec starting index read 4 mysql tables in use 1, locked 1 5 LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1 6 MySQL thread id 9, OS thread handle 0x7f6f91296700, query id 87 localhost ro ot updating 7 update t1 set name = 'b' where id = 3 8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED: 9 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `tes t`.`t1` trx id 2164000 lock_mode X locks rec but not gap waiting 10 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bit s 0 11 0: len 4; hex 80000003; asc ;; 12 1: len 6; hex 000000210521; asc ! !;; 13 2: len 7; hex 180000122117cb; asc ! ;; 14 3: len 4; hex 80000008; asc ;; 15 4: len 1; hex 63; asc c;; 16 17 *** (2) TRANSACTION: 18 TRANSACTION 2164001, ACTIVE 18 sec starting index read 19 mysql tables in use 1, locked 1 20 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1 21 MySQL thread id 10, OS thread handle 0x7f6f912d7700, query id 88 localhost r oot updating 22 update t1 set name = 'c' where id = 2 23 *** (2) HOLDS THE LOCK(S): 24 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `tes t`.`t1` trx id 2164001 lock_mode X locks rec but not gap 25 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bit s 0 26 0: len 4; hex 80000003; asc ;; 27 1: len 6; hex 000000210521; asc ! !;; 28 2: len 7; hex 180000122117cb; asc ! ;; 29 3: len 4; hex 80000008; asc ;; 30 4: len 1; hex 63; asc c;; 31 32 *** (2) WAITING FOR THIS LOCK TO BE GRANTED: 33 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `tes t`.`t1` trx id 2164001 lock_mode X locks rec but not gap waiting 34 Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bit s 0 35 0: len 4; hex 80000002; asc ;; 36 1: len 6; hex 000000210520; asc ! ;; 37 2: len 7; hex 17000001c510f5; asc ;; 38 3: len 4; hex 80000009; asc ;; 39 4: len 1; hex 62; asc b;; Line 9 & 10: The ‘space id’ is tablespace id, ‘page no’ gives which page the record lock is on inside the tablespace. The ‘n bits’ is not the page offset, instead the number of bits in the lock bitmap. The page offset is the ‘heap no’ on line 10, Line 11~15: It shows the record data in hex numbers. Field 0 is the cluster index(primary key). Ignore the highest bit, the value is 3. Field 1 is the transaction id of the transaction which last modified this record, decimal value is 2164001 which is TRANSACTION (2). Field 2 is the rollback pointer. Starting from field 3 is the rest of the row data. Field 3 is integer column, value 8. Field 4 is string column with character ‘c’. By reading the data, we know exactly which row is locked and what is the current value. What else can we learn from analysis? Since most MySQL deadlocks happen between two transactions, we could start the analysis based on that assumption. In Example 1, trx (2) was waiting on a shared lock, so trx (1) either held a shared or exclusive lock on that primary key record of table t2. Let’s say col2 is the foreign key column, by checking the current statement of trx(1), we know it did not require the same record lock, so it must be some previous statement in trx(1) that required S or X lock(s) on t2’s PK record(s). Trx (1) only made 4 row changes in 7 seconds. Then you learned a few characteristics of trx(1): it does a lot of processing but a few changes; changes involve table t1 and t2, a single record insertion to t2. These information combined with other data could help developers to locate the transaction. Where else can we find previous statements of the transactions? Besides application log and previous SHOW ENGINE INNODB STATUS output, you may also leverage binlog, slow log and/or general query log. With binlog, if binlog_format=statement, each binlog event would have the thread_id. Only committed transactions are logged into binlog, so we could only look for Trx(2) in binlog. In the case of Example 1, we know when the deadlock happened, and we know Trx(2) started 9 seconds ago. We can run mysqlbinlog on the right binlog file and look for statements with thread_id = 155097580. It is always good to then cross refer the statements with the application code to confirm. $ mysqlbinlog -vvv --start-datetime=“2014-10-13 6:06:12” --stop-datatime=“2014-10-13 6:06:22” mysql-bin.000010 > binlog_1013_0606.out With Percona Server 5.5 and above, you can set log_slow_verbosity to include InnoDB transaction id in slow log. Then if you have long_query_time = 0, you would be able to catch all statements including those rolled back into slow log file. With general query log, the thread id is included and could be used to look for related statements. How to avoid a MySQL deadlock There are things we could do to eliminate a deadlock after we understand it. – Make changes to the application. In some cases, you could greatly reduce the frequency of deadlocks by splitting a long transaction into smaller ones, so locks are released sooner. In other cases, the deadlock rises because two transactions touch the same sets of data, either in one or more tables, with different orders. Then change them to access data in the same order, in another word, serialize the access. That way you would have lock wait instead of deadlock when the transactions happen concurrently. – Make changes to the table schema, such as removing foreign key constraint to detach two tables, or adding indexes to minimize the rows scanned and locked. – In case of gap locking, you may change transaction isolation level to read committed for the session or transaction to avoid it. But then the binlog format for the session or transaction would have to be ROW or MIXED.
November 12, 2014
by Peter Zaitsev
· 31,574 Views
article thumbnail
Plotting Data Online via Plotly and Python
I don’t do a lot of plotting in my job, but I recently heard about a website called Plotly that provides a plotting service for anyone’s data. They even have a plotly package for Python (among others)! So in this article we will be learning how to plot with their package. Let’s have some fun making graphs! Getting Started You will need the plotly package to follow along with this article. You can use pip to get the package and install it: pip install plotly Now that you have it installed, you’ll need to go to the Plotly website and create a free account. Once that’s done, you will get an API key. To make things super simple, you can use your username and API key to create a credentials file. Here’s how to do that: import plotly.tools as tls tls.set_credentials_file( username="your_username", api_key="your_api_key") # to get your credentials credentials = tls.get_credentials_file() If you don’t want to save your credentials, then you can also sign in to their service by doing the following: import plotly.plotly as py py.sign_in('your_username','your_api_key') For the purposes of this article, I’m assuming you have created the credentials file. I found that makes interacting with their service a bit easier to use. Creating a Graph Plotly seems to default to a Scatter Plot, so we’ll start with that. I decided to grab some data from a census website. You can download any US state’s population data, along with other pieces of data. In this case, I downloaded a CSV file that contained the population of each county in the state of Iowa. Let’s take a look: import csv import plotly.plotly as py #---------------------------------------------------------------------- def plot_counties(csv_path): """ http://census.ire.org/data/bulkdata.html """ counties = {} county = [] pop = [] counter = 0 with open(csv_path) as csv_handler: reader = csv.reader(csv_handler) for row in reader: if counter == 0: counter += 1 continue county.append(row[8]) pop.append(row[9]) trace = dict(x=county, y=pop) data = [trace] py.plot(data, filename='ia_county_populations') if __name__ == '__main__': csv_path = 'ia_county_pop.csv' plot_counties(csv_path) If you run this code, you should see a graph that looks like this: <br> You can also view the graph here. Anyway, as you can see in the code above, all I did was read the CSV file and extract out the county name and the population. Then I put that data into two different Python lists. Finally I created a dictionary of those lists and then wrapped that dictionary in a list. So you end up with a list that contains a dictionary that contains two lists! To make the Scatter Plot, I passed the data to plotly’s plot method. Converting to a Bar Chart Now let’s see if we can change the ScatterPlot to a Bar Chart. First off, we’ll play around with the plot data. The following was done via the Python interpreter: >>> scatter = py.get_figure('driscollis', '0') >>> print scatter.to_string() Figure( data=Data([ Scatter( x=[u'Adair County', u'Adams County', u'Allamakee County', u'..', ], y=[u'7682', u'4029', u'14330', u'12887', u'6119', u'26076', '..' ] ) ]) ) This shows how we can grab the figure using the username and the plot’s unique number. Then we printed out the data structure. You will note that it doesn’t print out the entire data structure. Now let’s do the actual conversion to a Bar Chart: from plotly.graph_objs import Data, Figure, Layout scatter_data = scatter.get_data() trace_bar = Bar(scatter_data[0]) data = Data([trace_bar]) layout = Layout(title="IA County Populations") fig = Figure(data=data, layout=layout) py.plot(fig, filename='bar_ia_county_pop') This will create a bar chart at the following URL: https://plot.ly/~driscollis/1. Here’s the image of the graph: <br> This code is slightly different than the code we used originally. In this case, we explicitly created a Bar object and passed it the scatter plot’s data. Then we put that data into a Data object. Next we created a Layout object and gave our chart a title. Then we created a Figure object using the data and layout objects. Finally we plotted the bar chart. Saving the Graph to Disk Plotly also allows you to save your graph to your hard drive. You can save it in the following formats: png, svg, jpeg, and pdf. Assuming you still have the Figure object from the previous example handy, you can do the following: py.image.save_as(fig, filename='graph.png') If you want to save using one of the other formats, then just use that format’s extension in the filename. Wrapping Up At this point you should be able to use the plotly package pretty well. There are many other graph types available, so be sure to read Plotly’s documentation thoroughly. They also support streaming graphs. As I understand it, Plotly allows you to create 10 graphs for free. After that you would either have to delete some of your graphs or pay a monthly fee. Additional Reading Plotly Python documentation Plotly User Guide
November 10, 2014
by Mike Driscoll
· 10,804 Views
article thumbnail
Missing Stack Traces for Repeated Exceptions
A long while ago a optimisation was added to the JVM so that if the same exception is thrown again and again and again a single instance of the Exception is created without the stack trace filled in in order to increase performance. This is an excellent idea unless you are trying to diagnose a problem and you have missed the original error. If you forgot about this optimisation you send the afternoon looking at the following log output and weeping slightly. (In my defence I have a little one in the house hence the fuzzy brain and lack of blogging action this year) java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException java.lang.ArrayIndexOutOfBoundsException ... java.lang.Exception: Uncaught exceptions during test at oracle.jdevstudio-testware-tests.level0.testADFcMATS(/.../work/mw9111/jdeveloper/jdev/extensions/oracle.jdevstudio-testware-tests/abbot/common-adf/CreateNewCustomAppAndProjectWithName.xml:25) at oracle.jdevstudio-testware-tests.level0.testADFcMATS(/.../work/mw9111/jdeveloper/jdev/extensions/oracle.jdevstudio-testware-tests/abbot/level0/testADFcMATS.xml:94) at oracle.abbot.JDevScriptFixture.runTest(JDevScriptFixture.java:555) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:243) at junit.framework.TestSuite.run(TestSuite.java:238) at junit.textui.TestRunner.doRun(TestRunner.java:116) at junit.textui.TestRunner.doRun(TestRunner.java:109) at oracle.abbot.AbbotRunner.run(AbbotRunner.java:614) at oracle.abbot.AbbotAddin$IdeAbbotRunner.run(AbbotAddin.java:634) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ArrayIndexOutOfBoundsException [Crickets] It turns out that you can turn off this optimisation with a simple flag: java -XX:-OmitStackTraceInFastThrow .... In my particular case the actual exception causing this trouble was a NPE from the GlyphView code in JDK8. (One that is being caused by a glitch in hotspot it seems) But that in turn was causing the AIOOBE in some logging code clouding the issue even more. This in particular is a good flag to add by default when running your automated tests, particularly in combination with the stack trace length override I have talked about before.
November 9, 2014
by Gerard Davison
· 11,518 Views · 1 Like
article thumbnail
Java regex matching hashmap
A hashmap which maintains keys as regular expressions. Any pattern matching the expression will be able to retrieve the same value. Internally it maintains two maps, one containing the regex to value, and another containing matched pattern to regex. Whenever there is a new pattern to 'get', there will be a O(n) search through the compiled regex(s) (which have been 'put' as keys) to find a match. Existing patterns will have constant time lookup through two maps. import java.util.ArrayList; import java.util.Collection; import java.util.HashMap; import java.util.Iterator; import java.util.List; import java.util.Map; import java.util.Set; import java.util.WeakHashMap; import java.util.regex.Pattern; public class RegexHashMap implements Map { private class PatternMatcher { private final String regex; private final Pattern compiled; PatternMatcher(String name) { regex = name; compiled = Pattern.compile(regex); } boolean matched(String string) { if(compiled.matcher(string).matches()) { ref.put(string, regex); return true; } return false; } } /** * Map of input to pattern */ private final Map ref; /** * Map of pattern to value */ private final Map map; /** * Compiled patterns */ private final List matchers; @Override public String toString() { return "RegexHashMap [ref=" + ref + ", map=" + map + "]"; } /** * */ public RegexHashMap() { ref = new WeakHashMap(); map = new HashMap(); matchers = new ArrayList(); } /** * Returns the value to which the specified key pattern is mapped, or null if this map contains no mapping for the key pattern */ @Override public V get(Object weakKey) { if(!ref.containsKey(weakKey)) { for(PatternMatcher matcher : matchers) { if(matcher.matched((String) weakKey)) { break; } } } if(ref.containsKey(weakKey)) { return map.get(ref.get(weakKey)); } return null; } /** * Associates a specified regular expression to a particular value */ @Override public V put(String key, V value) { V v = map.put(key, value); if (v == null) { matchers.add(new PatternMatcher(key)); } return v; } /** * Removes the regular expression key */ @Override public V remove(Object key) { V v = map.remove(key); if(v != null) { for(Iterator iter = matchers.iterator(); iter.hasNext();) { PatternMatcher matcher = iter.next(); if(matcher.regex.equals(key)) { iter.remove(); break; } } for(Iterator> iter = ref.entrySet().iterator(); iter.hasNext();) { Entry entry = iter.next(); if(entry.getValue().equals(key)) { iter.remove(); } } } return v; } /** * Set of view on the regular expression keys */ @Override public Set> entrySet() { return map.entrySet(); } @Override public void putAll(Map m) { for(Entry entry : m.entrySet()) { put(entry.getKey(), entry.getValue()); } } @Override public int size() { return map.size(); } @Override public boolean isEmpty() { return map.isEmpty(); } /** * Returns true if this map contains a mapping for the specified regular expression key. */ @Override public boolean containsKey(Object key) { return map.containsKey(key); } /** * Returns true if this map contains a mapping for the specified regular expression matched pattern. * @param key * @return */ public boolean containsKeyPattern(Object key) { return ref.containsKey(key); } @Override public boolean containsValue(Object value) { return map.containsValue(value); } @Override public void clear() { map.clear(); matchers.clear(); ref.clear(); } /** * Returns a Set view of the regular expression keys contained in this map. */ @Override public Set keySet() { return map.keySet(); } /** * Returns a Set view of the regex matched patterns contained in this map. The set is backed by the map, so changes to the map are reflected in the set, and vice-versa. * @return */ public Set keySetPattern() { return ref.keySet(); } @Override public Collection values() { return map.values(); } /** * Produces a map of patterns to values, based on the regex put in this map * @param patterns * @return */ public Map transform(List patterns) { for(String pattern : patterns) { get(pattern); } Map transformed = new HashMap(); for(Entry entry : ref.entrySet()) { transformed.put(entry.getKey(), map.get(entry.getValue())); } return transformed; } public static void main(String...strings) { RegexHashMap rh = new RegexHashMap(); rh.put("[o|O][s|S][e|E].?[1|2]", "This is a regex match"); rh.put("account", "This is a direct match"); System.out.println(rh); System.out.println("get:ose-1 -> "+rh.get("ose-1")); System.out.println("get:OSE2 -> "+rh.get("OSE2")); System.out.println("get:OSE112 -> "+rh.get("OSE112")); System.out.println("get:ose-2 -> "+rh.get("ose-2")); System.out.println("get:account -> "+rh.get("account")); System.out.println(rh); } }
November 7, 2014
by Sutanu Dalui
· 23,810 Views
article thumbnail
Using REST with the CQRS Pattern to Blend NoSQL & SQL Data
REST Easy with SQL/NoSQL Integration and CQRS Pattern implementation New demands are being put on IT organizations everyday to deliver agile, high-performance, integrated mobile and web applications. In the meantime, the technology landscape is getting complex everyday with the advent of new technologies like REST, NoSQL, Cloud while existing technologies like SOAP and SQL still rule everyday work. Rather than taking religious side of the debate, NoSQL can successfully co-exist with SQL in this ‘polyglot’ of data storage and formats. However, this integration also adds another layer of complexity both in architecture and implementation. This document offers a guide on how some of the relatively newer technologies like REST can help bridge the gap between SQL and NoSQL with an example of a well known pattern called CQRS. This document is organized as follows: Introduction to SQL development process NoSQL Do I have to choose between SQL and NoSQL? CQRS Pattern How to implement CQRS pattern using REST services Introduction to SQL development process Developers have been using SQL Databases for decades to build and deliver enterprise business applications. The process of creating tables, attributes,and relationships is second nature for most developers. Data architects think in terms of tables and columns and navigate relationships for data. The basic concepts of delivery and transformation takes place at the web server level which means the server developer is reading and ‘binding’ to the tables and mapping attributes to a REST response. Application development lifecycle meant changes to the database schema first, followed by the bindings, then internal schema mapping, and finally the SOAP or JSON services, and eventually the client code. This all costs the project time and money. It also means that the ‘code’ (pick your language here) and the business logic would also need to be modified to handle the changes to the model. NoSQL NoSQL is gaining supporters among many SQL shops for various reasons including: Low cost Ability to handle unstructured dataa Scalability Performance The first thing database folks notice is that there is no schema. These document style storage engines can handle huge volumes of structured, semi-structured, and unstructured data. The very nature of schema-less documents allows change to a document structure without having to go through the formal change management process (or data architect). The other major difference is that NoSQL (no-schema) also means no joins or relationships. The document itself contains the embedded information by design. So an order entry would contain the customer with all the orders and line items for each order in a single document. There are many different NoSQL vendors (popular NoSQL databases include MongoDB, Casandra) that are being used for BI and Analytics (read-only) purposes. We are also seeing many customers starting to use NoSQL for auditing, logging, and archival transactions. Do I have to choose between SQL and NoSQL? The purpose of this article is to not get into the religious debate about whether to use SQL or NoSQL. Bottom line is both have their place and are suited for certain type of data – SQL for structured data and NoSQL for unstructured data. So why not have the capability to mix and match this data depending on the application. This can be done by creating a single REST API across both SQL and NoSQL databases. Why a single REST API? The answer is simple – the new agile and mobile world demands this ‘mashup’ of data into a document style JSON response. CQRS (Command Query Responsibility Segmentation) Pattern There are many design patterns for delivery of high performance RESTful services but the one that stands out was described in an article written by Martin Fowler, one of the software industry veterans. He described the pattern called CQRS that is more relevant today in a ‘polyglot’ of servers, data, services, and connections. “We may want to look at the information in a different way to the record store, perhaps collapsing multiple records into one, or forming virtual records by combining information for different places. On the update side we may find validation rules that only allow certain combinations of data to be stored, or may even infer data to be stored that’s different from that we provide.” – Martin Fowler 2011 In this design pattern, the REST API requests (GET) return documents from multiple sources (e.g. mashups). In the update process, the data is subject to business logic derivations, validations, event processing, and database transactions. This data may then be pushed back into the NoSQL using asynchronous events. With the wide-spread adoption of NoSQL databases like MongoDB and schema-less, high capacity data store; most developers are challenged with providing security, business logic, event handling, and integration to other systems. MongoDB; one the popular NoSQL databases and SQL databases share many similar concepts. However the MongoDB programming language itself is very different from the SQL we all know. How to implement CQRS pattern using a RESTFul Architecture A REST server should meet certain requirements to support the CQRS pattern. The server should run on-premise or in the cloud and appears to the mobile and web developer as an HTTP endpoint. The server architecture should implement the following: Connections and Mapping necessary for SQL and NoSQL connectivity and API services needed to create and return GET, PUT, POST, and DELETE REST responses Security Business Logic Connections and Mapping There are two main approaches to creating REST Servers and APIs for SQL and NoSQL databases: Open source frameworks like Apache Tomcat, Spring/Hibernate Commercial framework like Espresso Logic Open source Frameworks Using various open source frameworks like Tomcat, Spring/Hibernate, Node.js, JDBC and MongoDB drivers, a REST server can be created, but we would still be left with the following tasks: Creation and mapping of the necessary SQL objects Create a REST server container and configurations Create Jersey/Jackson classes and annotations Create and define REST API for tables, views, and procedures Hand write validation, event and business logic Handle persistence, optimistic locking, transaction paging Adding identity management and security by roles Now we can start down the same path to connect to MongoDB and write code to connect, select, and return data in JSON and then create the REST calls to merge these two different document styles into a single RESTful endpoint. This is a lot of work for a development team to manage and control and frankly pretty boring and repetitive and is better done by a well designed framework Commercial Frameworks Many commercial frameworks may take care of this complexity without the need to do extensive programming. Here is an example from Espresso Logic and how it handles this complexity with a point and click interface: Running REST server in the cloud or on-premise Connections to external SQL databases Object mapping to tables, views, and procedures Automatic creation of RESTful endpoints from model Reactive business rules and rich event model Integrated role-based security and authentication services. Point-and-click document API creation for SQL and MongoDB endpoints In the example below, the editor shows an SQL (customersTransactions) joined with archived details from MongoDB (archivedTransactions). The MongoDB document for each customer may include transaction details, check images, customer service notes and other relevant account information. This new mashup becomes a single REST call that can be published to mobile and web application developer. Security Security is an important part of building and delivery of RESTful services which can be broken down into two parts; authentication and access control. Authentication Before allowing anyone access to corporate data you want to use the existing corporate identity management (some call this authentication services) to capture and validate the user. This identity management service is based on using existing corporate standards such as LDAP, Windows AD, SQL Database. Role-based Access Control Each user may be assigned one or more corporate roles and these roles are then assigned specific access privileges to each resource (e.g. READ, INSERT, UPDATE, and DELETE). Role-based access should also be able to restrict permissions to specific rows and columns of the API (e.g. only sales reps can see their own orders or a manager can see and change his department salaries but cannot change his own). This restriction should be applied regardless of how or where the API is used or called. Remember, the SQL database already provides some level of security and access which must be considered when designing and delivering new front-end services to internal and external users. Business Logic for REST When data is updated to a REST Server several things need to happen. First, the authentication and access control should determine if this is a valid request and if the user has rights to the endpoint. In addition, the server may need to de-alias REST attributes back to the actual SQL column names. In a full featured business logic server, there should be a series of events and business rules to perform various calculations, validations, and fire other events on dependent tables. Finally, the entire multi-table transaction is written back to the SQL database in a single transaction. Updates are then sent asynchronously to MongoDB as part of the commit event (after the SQL transaction has completed). Conclusion In the real-world of API services, the demand for more complex document style RESTful services is a requirement. That is, the ability to create ‘mashups’ of data from multiple tables, NoSQL collections, and other external systems is a large part of this new design pattern. In addition, the ability to alias attribute names and formats from these source fields has become critical for partners and customers systems. Using REST with the CQRS pattern to blend MongoDB and SQL seamlessly to your existing data will become a major part of your future mobile strategy. To implement these REST services, one can use open source tools and spend a lot of time or select a right commercial framework. This framework should support cloud or on-premise connectivity, security, API integration, as well as business logic. This will make the design and delivery of new application services more rapid and agile in the heterogeneous world of information.
November 4, 2014
by Val Huber DZone Core CORE
· 16,211 Views
article thumbnail
BigList: a Scalable High-Performance List for Java
As memory gets cheaper and cheaper, our applications can keep more data readily available in main memory, or even all as in case of in-memory databases. To make real use of the growing heap memory, appropriate data structures must be used. Interesting enough, there seem to be no specialized implementations for lists - by far the most used collection. This article introduces BigList, a list designed for handling large collections where large means that all data still fit completely in the heap memory. The article will show the special requirements for handling large collections, how BigList is implemented and how it compares to other list implementations. 1. Requirements What are the special requirements we need to handle large collections efficiently? Memory: Sparing use of memory: The list should need little memory for its own implementation so memory can be used for storing application data. Specialized versions for primitives: It must be possible to store common primitives like ints in a memory saving way. Avoid copying large data blocks: If the list grows or shrinks, only a small part of the data must be copied around, as this operation becomes expensive and needs the same amount of memory again. Data sharing: copying collections is a frequent operation which should be efficiently possible even if the collection is large. An efficient implementation requires some sort of data sharing as copying all elements is per se a costly operation. Performance: Good performance for normal operations like reading, storing, adding or removing single elements. Great performance for bulk operations like adding or removing multiple elements. Predictable overhead of operations, so similar operations should need a similar amount of time without excessive worst case scenarios. If an implementation does not offer these features, some operations will not only be slow for really large collections, but will becomse just not feasible because memory or CPU usage will be too exhaustive. Introduction to BigList BigList is a member of the Brownies Collections library which also includes GapList, the fastest list implementation known. GapList is a drop-in replacement for ArrayList, LinkedList, or ArrayDequeue and offers fast access by index and fast insertion/removal at the beginning and at the end at the same time. GapList however has not been designed to cope with large collections, so adding or removing elements can make it necessary to copy a lot of elements around which will lead to performance problems. Also copying a large collection becomes an expensive operation, both in term of time and memory consumption. It will simply not be possible to make a copy of a large collections if not the same amount of memory is available a second time. And this is a common operation as you often want to return a copy of an internal list through your API which has no reference the original list. BigList addresses both problems. The first problem is solved by storing the collection elements in fixed size blocks. Add or remove operations are then implemented to use only data from one block. The copying problem is solved by maintaining a reference count on the fixed size blocks which allows to implement a copy-on-write approach. For efficient access to the fixed size blocks, they are maintained in a specialized tree structure. 2. BigList Details Each BigList instance stores the following information: Elements are stored in in fixed-size blocks A single block is implemented as GapList with a reference count for sharing All blocks are maintained in a tree for fast access Access information for the current block is cached for better performance The following illustration shows these details for two instances of BigList which share one block. 2.1 Use of Blocks Elements are stored in in fixed-size blocks with a default block size of 1000. Where this default may look pretty small, it is most of the time a good choice because it guarantees that write operation only need to move few elements. Read operations will profit from locality of reference by using the currently cached block to be fast. It is however possible to specify the block size for each created BigList instance. All blocks except the first are allocated with this fixed size and will not grow or shrink. The first block will grow to the specified block size to save memory for small lists. If a block has reached its maximum size and more data must be stored within, the block needs to be split up in two blocks before more elements can be stored. If elements are added to the head or tail of the list, the block will only be filled up to a threshold of 95%. This allows inserts into the block without the immediate need for split operations. To save memory, blocks are also merged. This happens automatically if two adjacent blocks are both filled less than 35% after a remove operation. 2.2 Locality of Reference For each operation on BigList, the affected block must be determined first. As predicted by locality of reference, most of the time the affected block will be the same as for the last operation. The implementation of BigList has therefore been designed to profit from locality of reference which makes common operations like iterating over a list very efficient. Instead of always traversing the block tree to determine the block needed for an operation, lower and upper index of the last used block are cached. So if the next operation happens near to the previous one, the same block can be used again without need to traverse the tree. 2.3 Reference Counting To support a copy-on-write approach, BigList stores a reference count for each fixed size blocks indicating whether this block is private or shared. Initially all lists are private having a reference count of 0, so that modification are allowed. If a list is copied, the reference count is incremented which prohibits further modifications. Before a modification then can be made, the block must be copied decrementing the block's reference count and setting the reference count of the copy to 0. The reference count of a block is then decremented by the finalizer of BigList. 3. Benchmarks To prove the excellence of BigList in time and memory consumption, we compare it with some other List implementations. And here are the nominees: Type Library Description BigList brownie-collections List optimized for storing large number of elements. Elements stored in fixed size blocks which are maintained in a tree. GapList brownie-collections Fastest list implementation known. Fast access by index and fast insertion/removal at end and at beginning. ArrayList JDK Maintains elements in a single array. Fast access by index, fast insertion/removal at end, but slow at beginning. LinkedList JDK Elements stored using a linked list. Slow access by index. Memory overhead for each element stored. TreeList commons-collections Elements stored in a tree. All operations are not really fast, but there are no very slow operations. Memory overhead for each element stored. FastTable javolution Elements stored in a "fractal"-like data structure. Good performance and use of memory. However no bulk operations and collection does not shrink. 3.1 Handling Objects In the first part of the benchmark, we compare memory consumption and performance of the different list implementations. Let's first have a look at the memory consumption. The following table shows the bytes used to hold a list with 1'000'000 null elements: BigList GapList ArrayList LinkedList TreeList FastTable 32 bit 4'298'466 4'021'296 4'861'992 8'000'028 18'000'028 4'142'892 64 bit 8'544'254 8'042'552 9'723'964 16'000'044 26'000'044 8'222'988 We can see that BigList, GapList, ArrayList, and FastTable only add small overhead to the stored elements, where as Linkedlist needs twice the memory and TreeList even more. Now to the performance. Here are the results of 9 benchmarks which have been run for each of the 6 candidates with JDK 8 in a 32 bit Windows environment and a list of 1'000'000 elements: The result table can be read as follows: the fastest candidate for each test has a relative performance indicator of 1 the value for the other candidates indicate how many times they have been slower, so a factor of 3 means that this implementation was 3 times slower than the best one The different factor are colored like this: 
factor 1: green (best)
 factor <5: blue (good) 
factor <25: yellow (moderate) 
factor >25: red (poor) If we look the benchmark result, we can see that the performance of BigList is best for all expect two benchmarks. The only moderate result is produces in getting elements in a totally random order. This could be expected as there is no locality of reference which can be exploited, so for each access, the block tree must be traversed to find the correct block. Luckily this is a rare use case in real applications. And the benchmark "Get local" shows that performance is back to good as soon as elements next to each other must be retrieved - as it is the case if we iterate over a range. 3.2 Handling Primitives In the second part of the benchmark, we want see how big the savings are if we use a data structure specialized for storing primitives compared to strong wrapped objects. For this reason, we compare IntBigList and BigList. The following table shows memory needed to store 1'000'000 integer values: BigList IntBigList 32 bit 16'298'454 4'534'840 64 bit 28'544'234 4'570'432 Obviously it is easy to save a lot of memory. In a 32 bit environment, IntBigList just needs 25% percent of memory, in a 64 bit environment only 14%! These figures become plausible if you recall that a simple object needs 8 bytes in a 32 bit, but already 16 bytes in a 64 bit environment, where as a primitive integer value always only needs 4 bytes. The measurable performance gain is not so impressive, it is something below 10% for simple get operations and something above 10% for add and remove operations. These numbers show that the JVM is impressively fast in creating wrapper objects and boxing and unboxing primitive values. We must however also consider that each created object will need to be garbage collected once and therefore adds to the total load of the JVM. 4. Summary BigList is a scalable high-performance list for storing large collections. Its design guarantees that all operations will be predictable and efficient both in term of performance and memory consumption, even copying large collections is tremendous fast. Benchmarks haven proven this and shown that BigList outperform other known list implementations. The library also offers specialized implementations for primitive types like IntBigList which save much memory and provide superior performance. BigList for handling objects and the specializations for handling primitives are part of the Brownies Collections library and can be downloaded from http://www.magicwerk.org/collections.
November 3, 2014
by Thomas Mauch
· 32,953 Views · 10 Likes
article thumbnail
How to Setup Custom SSLSocketFactory's TrustManager per Each URL Connection
We can see from javadoc that javax.net.ssl.HttpsURLConnection provided a static method to override withsetDefaultSSLSocketFory() method. This allow you to supply a custom javax.net.ssl.TrustManagerthat may verify your own CA certs handshake and validation etc. But this will override the default for all "https" URLs per your JVM! So how can we override just a single https URL? Looking at javax.net.ssl.HttpsURLConnection again we see instance method for setSSLSocketFactory(), but we can't instantiate HttpsURLConnection object directly! It took me some digging to realized that the java.net.URL is actually an factory class for its implementation! One can get an instance like this using new URL("https://localhost").openConnection() To complete this article, I will provide a simple working example that demonstrate this. package zemian; import java.io.BufferedReader; import java.io.InputStream; import java.io.InputStreamReader; import java.net.URL; import java.net.URLConnection; import java.security.SecureRandom; import java.security.cert.X509Certificate; import javax.net.ssl.HttpsURLConnection; import javax.net.ssl.SSLContext; import javax.net.ssl.SSLSocketFactory; import javax.net.ssl.TrustManager; import javax.net.ssl.X509TrustManager; public class WGetText { public static void main(String[] args) throws Exception { String urlString = System.getProperty("url", "https://google.com"); URL url = new URL(urlString); URLConnection urlConnection = url.openConnection(); HttpsURLConnection httpsUrlConnection = (HttpsURLConnection) urlConnection; SSLSocketFactory sslSocketFactory = createSslSocketFactory(); httpsUrlConnection.setSSLSocketFactory(sslSocketFactory); try (InputStream inputStream = httpsUrlConnection.getInputStream()) { BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream)); String line = null; while ((line = reader.readLine()) != null) { System.out.println(line); } } } private static SSLSocketFactory createSslSocketFactory() throws Exception { TrustManager[] byPassTrustManagers = new TrustManager[] { new X509TrustManager() { public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; } public void checkClientTrusted(X509Certificate[] chain, String authType) { } public void checkServerTrusted(X509Certificate[] chain, String authType) { } } }; SSLContext sslContext = SSLContext.getInstance("TLS"); sslContext.init(null, byPassTrustManagers, new SecureRandom()); return sslContext.getSocketFactory(); } }
October 31, 2014
by Zemian Deng
· 38,875 Views · 1 Like
article thumbnail
Python: Converting a Date String to Timestamp
I’ve been playing around with Python over the last few days while cleaning up a data set and one thing I wanted to do was translate date strings into a timestamp. I started with a date in this format: date_text = "13SEP2014" So the first step is to translate that into a Python date – the strftime section of the documentation is useful for figuring out which format code is needed: import datetime date_text = "13SEP2014" date = datetime.datetime.strptime(date_text, "%d%b%Y") print(date) $ python dates.py 2014-09-13 00:00:00 The next step was to translate that to a UNIX timestamp. I thought there might be a method or property on the Date object that I could access but I couldn’t find one and so ended up using calendar to do the transformation: import datetime import calendar date_text = "13SEP2014" date = datetime.datetime.strptime(date_text, "%d%b%Y") print(date) print(calendar.timegm(date.utctimetuple())) $ python dates.py 2014-09-13 00:00:00 1410566400 It’s not too tricky so hopefully I shall remember next time.
October 29, 2014
by Mark Needham
· 50,180 Views
article thumbnail
R: Linear models with the lm function, NA values and Collinearity
In my continued playing around with R I’ve sometimes noticed ‘NA’ values in the linear regression models I created but hadn’t really thought about what that meant. On the advice of Peter Huber I recently started working my way through Coursera’s Regression Models which has a whole slide explaining its meaning: So in this case ‘z’ doesn’t help us in predicting Fertility since it doesn’t give us any more information that we can’t already get from ‘Agriculture’ and ‘Education’. Although in this case we know why ‘z’ doesn’t have a coefficient sometimes it may not be clear which other variable the NA one is highly correlated with. Multicollinearity (also collinearity) is a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated, meaning that one can be linearly predicted from the others with a non-trivial degree of accuracy. In that situation we can make use of the alias function to explain the collinearity as suggested in this StackOverflow post: library(datasets); data(swiss); require(stats); require(graphics) z <- swiss$Agriculture + swiss$Education fit = lm(Fertility ~ . + z, data = swiss) > alias(fit) Model : Fertility ~ Agriculture + Examination + Education + Catholic + Infant.Mortality + z Complete : (Intercept) Agriculture Examination Education Catholic Infant.Mortality z 0 1 0 1 0 0 In this case we can see that ‘z’ is highly correlated with both Agriculture and Education which makes sense given its the sum of those two variables. When we notice that there’s an NA coefficient in our model we can choose to exclude that variable and the model will still have the same coefficients as before: > require(dplyr) > summary(lm(Fertility ~ . + z, data = swiss))$coefficients Estimate Std. Error t value Pr(>|t|) (Intercept) 66.9151817 10.70603759 6.250229 1.906051e-07 Agriculture -0.1721140 0.07030392 -2.448142 1.872715e-02 Examination -0.2580082 0.25387820 -1.016268 3.154617e-01 Education -0.8709401 0.18302860 -4.758492 2.430605e-05 Catholic 0.1041153 0.03525785 2.952969 5.190079e-03 Infant.Mortality 1.0770481 0.38171965 2.821568 7.335715e-03 > summary(lm(Fertility ~ ., data = swiss))$coefficients Estimate Std. Error t value Pr(>|t|) (Intercept) 66.9151817 10.70603759 6.250229 1.906051e-07 Agriculture -0.1721140 0.07030392 -2.448142 1.872715e-02 Examination -0.2580082 0.25387820 -1.016268 3.154617e-01 Education -0.8709401 0.18302860 -4.758492 2.430605e-05 Catholic 0.1041153 0.03525785 2.952969 5.190079e-03 Infant.Mortality 1.0770481 0.38171965 2.821568 7.335715e-03 If we call alias now we won’t see any output: > alias(lm(Fertility ~ ., data = swiss)) Model : Fertility ~ Agriculture + Examination + Education + Catholic + Infant.Mortality
October 26, 2014
by Mark Needham
· 16,881 Views
article thumbnail
How to Avoid Hash Collisions When Using MySQL’s CRC32 Function
Originally Written by Arunjith Aravindan Percona Toolkit’s pt-table-checksum performs an online replication consistency check by executing checksum queries on the master, which produces different results on replicas that are inconsistent with the master – and the tool pt-table-sync synchronizes data efficiently between MySQL tables. The tools by default use the CRC32. Other good choices include MD5 and SHA1. If you have installed the FNV_64 user-defined function, pt-table-sync will detect it and prefer to use it, because it is much faster than the built-ins. You can also use MURMUR_HASH if you’ve installed that user-defined function. Both of these are distributed with Maatkit. For details please see the tool’s documentation. Below are test cases similar to what you might have encountered. By using the table checksum we can confirm that the two tables are identical and useful to verify a slave server is in sync with its master. The following test cases with pt-table-checksum and pt-table-sync will help you use the tools more accurately. For example, in a master-slave setup we have a table with a primary key on column “a” and a unique key on column “b”. Here the master and slave tables are not in sync and the tables are having two identical values and two distinct values. The pt-table-checksum tool should be able to identify the difference between master and slave and the pt-table-sync in this case should sync the tables with two REPLACE queries. +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 3 | 3 | | 3 | 4 | | 4 | 4 | +-----+-----+ +-----+-----+ Case 1: Non-cryptographic Hash function (CRC32) and the Hash collision. The tables in the source and target have two different columns and in general way of thinking the tools should identify the difference. But the below scenarios explain how the tools can be wrongly used and how to avoid them – and make things more consistent and reliable when using the tools in your production. The tools by default use the CRC32 checksums and it is prone to hash collisions. In the below case the non-cryptographic function (CRC32) is not able to identify the two distinct values as the function generates the same value even we are having the distinct values in the tables. CREATE TABLE `t1` ( `a` int(11) NOT NULL, `b` int(11) NOT NULL, PRIMARY KEY (`a`), UNIQUE KEY `b` (`b`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 3 | 3 | | 3 | 4 | | 4 | 4 | +-----+-----+ +-----+-----+ Master: [root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 09-17T00:59:45 0 0 4 1 0 1.081 db1.t1 Slave: [root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=*** --verbose --sync-to-master 192.**.**.** # Syncing via replication h=192.**.**.**,p=...,u=root # DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE Narrowed down to BIT_XOR: Master: mysql> SELECT BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) FROM `db1`.`t1`; +------------------------------------------------------------+ | BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) | +------------------------------------------------------------+ | 6581445 | +------------------------------------------------------------+ 1 row in set (0.00 sec) Slave: mysql> SELECT BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) FROM `db1`.`t1`; +------------------------------------------------------------+ | BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) | +------------------------------------------------------------+ | 6581445 | +------------------------------------------------------------+ 1 row in set (0.16 sec) Case 2: As the tools are not able to identify the difference, let us add a new row to the slave and check if the tools are able to identify the distinct values. So I am adding a new row (5,5) to the slave. mysql> insert into db1.t1 values(5,5); Query OK, 1 row affected (0.05 sec) Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 3 | 3 | | 3 | 4 | | 4 | 4 | +-----+-----+ | 5 | 5 | +-----+-----+ [root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 09-17T01:01:13 0 1 4 1 0 1.054 db1.t1 [root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=*** --verbose --sync-to-master 192.**.**.** # Syncing via replication h=192.**.**.**,p=...,u=root # DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE DELETE FROM `db1`.`t1` WHERE `a`='5' LIMIT 1 /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**. 10,p=...,u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205 user:root host:localhost.localdomain*/; REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('3', '4') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**, p=...,u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205 user:root host:localhost.localdomain*/; REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('4', '3') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**, p=...,u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205 user:root host:localhost.localdomain*/; # 1 2 0 0 Chunk 01:01:43 01:01:43 2 db1.t1 Well, apparently the tools are now able to identify the newly added row in the slave and the two other rows having the difference. Case 3: Advantage of Cryptographic Hash functions (Ex: Secure MD5) As such let us make the tables as in the case1 and ask the tools to use the cryptographic (secure MD5) hash functions instead the usual non-cryptographic function. The default CRC32 function provides no security due to their simple mathematical structure and too prone to hash collisions but the MD5 provides better level of integrity. So let us try with the –function=md5 and see the result. Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 3 | 3 | | 3 | 4 | | 4 | 4 | +-----+-----+ +-----+-----+ Narrowed down to BIT_XOR: Master: mysql> SELECT 'test', 't2', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING (@crc, 1, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'), LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := md5(CONCAT_WS('#', `a`, `b`)) , 17, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'))), 0) AS crc FROM `db1`.`t1`; +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | cnt | crc | +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | 4 | 000000000000000063f65b71e539df48 | +------+----+---+------+------+------+-----+----------------------------------+ 1 row in set (0.00 sec) Slave: mysql> SELECT 'test', 't2', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING (@crc, 1, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'), LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := md5(CONCAT_WS('#', `a`, `b`)) , 17, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'))), 0) AS crc FROM `db1`.`t1`; +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | cnt | crc | +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | 4 | 0000000000000000df024e1a4a32c31f | +------+----+---+------+------+------+-----+----------------------------------+ 1 row in set (0.00 sec) [root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --function=md5 --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 09-23T23:57:52 0 1 12 1 0 0.292 db1.t1 [root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=amma --verbose --function=md5 --sync-to-master 192.***.***.*** # Syncing via replication h=192.168.56.102,p=...,u=root # DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('3', '4') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.168.56.101,p=..., u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.***.***.***,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5608 user:root host:localhost.localdomain*/; REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('4', '3') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.168.56.101,p=..., u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.***.**.***,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5608 user:root host:localhost.localdomain*/; # 0 2 0 0 Chunk 04:46:04 04:46:04 2 db1.t1 Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 4 | 3 | | 3 | 4 | | 3 | 4 | +-----+-----+ +-----+-----+ The MD5 did the trick and solved the problem. See the BIT_XOR result for the MD5 given above and the function is able to identify the distinct values in the tables and resulted with the different crc values. The MD5 (Message-Digest algorithm 5) is a well-known cryptographic hash function with a 128-bit resulting hash value. MD5 is widely used in security-related applications, and is also frequently used to check the integrity but MD5() and SHA1() are very CPU-intensive with slower checksumming if chunk-time is included.
October 24, 2014
by Peter Zaitsev
· 7,090 Views
article thumbnail
Parsing a Tab Delimited File Using OpenCSV
I prefer OpenCSV for CSV parsing in Java. That library also supports parsing of tab delimited files, here's how: Just a quick gist: import au.com.bytecode.opencsv.CSVReader; public class LoadTest { @Test public void testLoad(String row) throws IOException, JobNotFoundException, InterruptedException { CSVReader reader = new CSVReader(new FileReader("/Users/bone/Desktop/foo_data.tab"), '\t'); String[] record; while ((record = reader.readNext()) != null) { for (String value : record) { System.out.println(value); // Debug only } } } }
October 24, 2014
by Brian O' Neill
· 24,271 Views
article thumbnail
Metal Kernel Functions / Compute Shaders in Swift
As part of a project to create a GPU based reaction diffusion simulation, I stated to look at using Metal in Swift this weekend. I've done similar work in the past targeting the Flash Player and using AGAL. Metal is a far higher level language than AGAL: it's based on C++ with a richer syntax and includes compute functions. Whereas in AGAL, to run cellular automata, I'd create a rectangle out of two triangles with a vertex shader and execute the reaction diffusion functions in a separate fragment shader, a compute shader is more direct: I can get and set textures and it can operate of individual pixels of that texture without the need for a vertex shader. The Swift code I discuss in this article is based heavily on two articles at Metal By Example: Introduction to Compute Programming in Metal and Fundamentals of Image Processing in Metal. Both of which include Objective-C source code, so hopefully my Swift implementation will help some. My application has four main steps: initialise Metal, create a Metal texture from a UIImage, apply a kernel function to that texture, convert the newly generated texture back into a UIImage and display it. I'm using a simple example shader that changes the saturation of the input image. so I've also added a slider that changes the saturation value. Let's look at each step one by one: Initialising Metal Initialising Metal is pretty simple: inside my view controller's overridden viewDidLoad(), I create a pointer to the default Metal device: var device: MTLDevice! = nil [...] device = MTLCreateSystemDefaultDevice() I also need to create a library and command queue: defaultLibrary = device.newDefaultLibrary() commandQueue = device.newCommandQueue() Finally, I add a reference to my Metal function to the library and synchronously create and compile a compute pipeline state: et kernelFunction = defaultLibrary.newFunctionWithName("kernelShader") pipelineState = device.newComputePipelineStateWithFunction(kernelFunction!, error: nil) The kernelShader points to the saturation image processing function, written in Metal, that lives in my Shaders.metal file: kernel void kernelShader(texture2d inTexture [[texture(0)]], texture2d outTexture [[texture(1)]], constant AdjustSaturationUniforms &uniforms [[buffer(0)]], uint2 gid [[thread_position_in_grid]]) { float4 inColor = inTexture.read(gid); float value = dot(inColor.rgb, float3(0.299, 0.587, 0.114)); float4 grayColor(value, value, value, 1.0); float4 outColor = mix(grayColor, inColor, uniforms.saturationFactor); outTexture.write(outColor, gid); } Creating a Metal Texture from a UIIMage There are a few steps in converting a UIImage into a MTLTexture instance. I create an array of UInt8 to hold an emptyCGBitmapInfo, then use CGContextDrawImage() to copy the image into a bitmap context let image = UIImage(named: "grand_canyon.jpg") let imageRef = image.CGImage let imageWidth = CGImageGetWidth(imageRef) let imageHeight = CGImageGetHeight(imageRef) let bytesPerRow = bytesPerPixel * imageWidth var rawData = [UInt8](count: Int(imageWidth * imageHeight * 4), repeatedValue: 0) let bitmapInfo = CGBitmapInfo(CGBitmapInfo.ByteOrder32Big.toRaw() | CGImageAlphaInfo.PremultipliedLast.toRaw()) let context = CGBitmapContextCreate(&rawData, imageWidth, imageHeight, bitsPerComponent, bytesPerRow, rgbColorSpace, bitmapInfo) CGContextDrawImage(context, CGRectMake(0, 0, CGFloat(imageWidth), CGFloat(imageHeight)), imageRef) Once all of those steps have executed, I can create a new texture use its replaceRegion() method to write the image into it: let textureDescriptor = MTLTextureDescriptor.texture2DDescriptorWithPixelFormat(MTLPixelFormat.RGBA8Unorm, width: Int(imageWidth), height: Int(imageHeight), mipmapped: true) texture = device.newTextureWithDescriptor(textureDescriptor) let region = MTLRegionMake2D(0, 0, Int(imageWidth), Int(imageHeight)) texture.replaceRegion(region, mipmapLevel: 0, withBytes: &rawData, bytesPerRow: Int(bytesPerRow)) I also create an empty texture which the kernel function will write into: let outTextureDescriptor = MTLTextureDescriptor.texture2DDescriptorWithPixelFormat(texture.pixelFormat, width: texture.width, height: texture.height, mipmapped: false) outTexture = device.newTextureWithDescriptor(outTextureDescriptor) Invoking the Kernel Function The next block of work is to set the textures and another variable on the kerne function and execute the shader. The first step is to instantiate a command buffer and command encoder: let commandBuffer = commandQueue.commandBuffer() let commandEncoder = commandBuffer.computeCommandEncoder() ...then set the pipeline state (we got from device.newComputePipelineStateWithFunction() earlier) and textures on the command encoder: commandEncoder.setComputePipelineState(pipelineState) commandEncoder.setTexture(texture, atIndex: 0) commandEncoder.setTexture(outTexture, atIndex: 1) The filter requires an addition parameter that defines the saturation amount. This is passed into the shader via anMTLBuffer. To populate the buffer, I've created a small struct: struct AdjustSaturationUniforms { var saturationFactor: Float } Then newBufferWithBytes() to pass in my saturationFactor float value: var saturationFactor = AdjustSaturationUniforms(saturationFactor: self.saturationFactor) var buffer: MTLBuffer = device.newBufferWithBytes(&saturationFactor, length: sizeof(AdjustSaturationUniforms), options: nil) commandEncoder.setBuffer(buffer, offset: 0, atIndex: 0) This is now accessible inside the shader as an argument to its kernel function: constant AdjustSaturationUniforms &uniforms [[buffer(0)]] Now I'm ready invoke the function itself. Metal kernel functions use thread groups to break up their workload into chunks. In my example, I create 64 thread groups, then send them off to the GPU: let threadGroupCount = MTLSizeMake(8, 8, 1) let threadGroups = MTLSizeMake(texture.width / threadGroupCount.width, texture.height / threadGroupCount.height, 1) commandQueue = device.newCommandQueue() commandEncoder.dispatchThreadgroups(threadGroups, threadsPerThreadgroup: threadGroupCount) commandEncoder.endEncoding() commandBuffer.commit() commandBuffer.waitUntilCompleted() Converting the Texture to a UIImage Finally, now that the kernel function has executed, we need to do the reverse of above and get the image held inoutTexture into a UIImage so it can be displayed. Again, I use a region to define the size and the texture's getBytes() to populate an array on UInt8: let imageSize = CGSize(width: texture.width, height: texture.height) let imageByteCount = Int(imageSize.width * imageSize.height * 4) let bytesPerRow = bytesPerPixel * UInt(imageSize.width) var imageBytes = [UInt8](count: imageByteCount, repeatedValue: 0) let region = MTLRegionMake2D(0, 0, Int(imageSize.width), Int(imageSize.height)) outTexture.getBytes(&imageBytes, bytesPerRow: Int(bytesPerRow), fromRegion: region, mipmapLevel: 0) Now that imageBytes holds the raw data, it's a few lines to create a CGImage: let providerRef = CGDataProviderCreateWithCFData( NSData(bytes: &imageBytes, length: imageBytes.count * sizeof(UInt8)) ) let bitmapInfo = CGBitmapInfo(CGBitmapInfo.ByteOrder32Big.toRaw() | CGImageAlphaInfo.PremultipliedLast.toRaw()) let renderingIntent = kCGRenderingIntentDefault let imageRef = CGImageCreate(UInt(imageSize.width), UInt(imageSize.height), bitsPerComponent, bitsPerPixel, bytesPerRow, rgbColorSpace, bitmapInfo, providerRef, nil, false, renderingIntent) imageView.image = UIImage(CGImage: imageRef) ...and we're done! Metal requires an A7 or A8 processor and this code has been built and tested under Xcode 6. All the source code is available at my GitHub repository here.
October 23, 2014
by Simon Gladman
· 12,287 Views
article thumbnail
Functional Programming with Java 8 Functions
Learn how to use lambda expressions and anonymous functions in Java 8.
October 20, 2014
by Edwin Dalorzo
· 330,133 Views · 25 Likes
article thumbnail
Sound Synthesis in Swift: A Core Audio Tone Generator
The other day, Morgan at Swift London mentioned it may be interesting to use my Swift node based user interface as the basis for an audio synthesiser. Always ready for a challenge, I've spent a little time looking into Core Audio to see how this could be done and created a little demonstration app: a multi channel tone generator. I've done something not too dissimilar targeting Flash in the past. In that project, I had to manually create 8192 samples to synthesise a tone and ended up using ActionScript Workers to do that work in the background. After some poking around, it seems a similar approach can be taken in Core Audio, but to create pure sine waves there's a far simpler way: using Audio Toolbox. Luckily, I found this excellent article by Gene De Lisa. His SwiftSimpleGraph project has all the boilerplate code to create sine waves, so all I had to do was add a user interface. My project contains four ToneWidget instances, each of which contain numeric dials for frequency and velocity and asine wave renderer. There's also an additional sine wave renderer that displays the composite frequency of all four channels. The sine wave renderer creates a bitmap image of the wave based on some code Joseph Lord tweaked in my Swift reaction diffusion code in the summer. It exposes a setFrequencyVelocityPairs() method which accepts an array ofFrequencyVelocityPair instances. When that's invoked, the drawSineWave() method loops over the array creating and summing vertical values for each column in the bitmap: for i in 1 ..< Int(frame.width) { let scale = M_PI * 5 let curveX = Double(i) var curveY = Double(frame.height / 2) for pair in frequencyVelocityPairs { let frequency = Double(pair.frequency) / 127.0 let velocity = Double(pair.velocity) / 127.0 curveY += ((sin(curveX / scale * frequency * 5)) * (velocity * 10)) } [...] Then, rather than simply plotting the generated position for each column, it draws a line between the newly calculated vertical position for the current column and the previous vertical position for the previous column: for yy in Int(min(previousCurveY, curveY)) ... Int(max(previousCurveY, curveY)) { let pixelIndex : Int = (yy * Int(frame.width) + i); pixelArray[pixelIndex].r = UInt8(255 * colorRef[0]); pixelArray[pixelIndex].g = UInt8(255 * colorRef[1]); pixelArray[pixelIndex].b = UInt8(255 * colorRef[2]); } The result is a continuous wave: Because the sine wave renderer accepts an array of values, I'm able to use the same component for both the individual channels (where the array length is one) and for the composite of all four (where the array length is four). I was ready to use GCD to do the sine wave drawing in a background thread, but both the simulator and my old iPad happily do this in the main thread keeping the user interface totally responsive. Up in the view controller, I have an array containing each widget and another array containing the currently playing notes which is populated inside viewDidLoad(): override func viewDidLoad() { super.viewDidLoad() view.addSubview(sineWaveRenderer) for i in 0 ... 3 { let toneWidget = ToneWidget(index: i, frame: CGRectZero) toneWidget.addTarget(self, action: "toneWidgetChangeHandler:", forControlEvents: UIControlEvents.ValueChanged) toneWidgets.append(toneWidget) view.addSubview(toneWidget) currentNotes.append(toneWidget.getFrequencyVelocityPair()) soundGenerator.playNoteOn(UInt32(toneWidget.getFrequencyVelocityPair().frequency), velocity: UInt32(toneWidget.getFrequencyVelocityPair().velocity), channelNumber: UInt32(toneWidget.getIndex())) } } When the frequency or the velocity is changed by the user in any of those widgets, I use the value in thecurrentNotes array to switch off the currently playing note, update the master sine wave renderer and play the new note: func toneWidgetChangeHandler(toneWidget : ToneWidget) { soundGenerator.playNoteOff(UInt32(currentNotes[toneWidget.getIndex()].frequency), channelNumber: UInt32(toneWidget.getIndex())) updateSineWave() soundGenerator.playNoteOn(UInt32(toneWidget.getFrequencyVelocityPair().frequency), velocity: UInt32(toneWidget.getFrequencyVelocityPair().velocity), channelNumber: UInt32(toneWidget.getIndex())) currentNotes[toneWidget.getIndex()] = toneWidget.getFrequencyVelocityPair() } The updateSineWave() method simply loops over the widgets, gets each value and passes those into the sine wave renderer: func updateSineWave() { var values = [FrequencyVelocityPair]() for widget in toneWidgets { values.append(widget.getFrequencyVelocityPair()) } sineWaveRenderer.setFrequencyVelocityPairs(values) } Once again, big thanks to Gene De Lisa who has really done all the hard work navigating the Core Audio and Audio Toolbox API research on this. All of the source code for this project is available at my GitHub repository here.
October 17, 2014
by Simon Gladman
· 12,268 Views · 1 Like
article thumbnail
Building HTML5 Form Validation Bubble Replacements
Written by TJ VanToll I’ve written and about HTML5 form validation over the last few years, and one of the most common questions I get is about bubbles. By bubbles I mean the UI controls browsers display validation errors in. Below you can see Chrome’s, Firefox’s, and IE’s implementations: For whatever reason, we developers (or more likely our designer colleagues) have a deep-seated desire to style these things. But unfortunately we can’t, as zero browsers provide styling hooks that target these bubbles. Chromeused to provide a series of vendor prefixed pseudo-elements (::-webkit-validation-bubble-*), but theywere removed in Chrome 28. So what’s a developer to do? Well, although browsers don’t allow you to customize their bubbles, the constraint validation spec does allow you to suppress the browser’s bubble UI and build your own. The rest of the article shows you how to do just that. Warning: Don’t go down the path of building a bubble replacement lightly. With the default bubbles you get some pretty complex functionality for free, such as positioning and accessibility. With custom bubbles you have to address those concerns yourself (or use a library that does). Suppressing the Default Bubbles The first step to building a custom UI is suppressing the native bubbles. You do that by listening to each form control’s invalid event and preventing its default behavior. For example, the following form uses no validation bubbles because of the event.preventDefault() call in the invalid event handler. Submit The invalid event does not bubble (no pun intended), so if you want to prevent the validation bubbles on multiple elements you must attach a capture-phase listener. If you’re confused on the difference between the bubbling and capturing phases of DOM events, check out this MDN article for a pretty good explanation. The following code prevents bubbles on both inputs with a single invalid event listener on the parent element: Submit You can use this approach to remove the browser’s UI for form validation, but once you do so, you have build something custom to replace it. Building alternative UIs There are countless ways of displaying form validation errors, and none of them are necessarily right or wrong. (Ok, there are some wrong ways of doing this.) Let’s look at a few common approaches that you may want to take. For each, we’ll use the very simple name and email address form below. We’ll also use some simple CSS to make this form look halfway decent. Name: Email: Submit One important note before we get started: all of these UIs only work in Internet Explorer version 10 and up, as theconstraint validation API is not available in older versions. If you need to support old IE and still want to use HTML5 form validation, check out this slide deck in which I outline some options for doing so. Alternative UI #1: List of messages A common way of showing validation errors is in a box on the top of the screen, and this behavior is relatively easy to build with the HTML5 form validation APIs. Before we get into the code, here’s what it looks like in action. Here’s the code you need to build that UI: function replaceValidationUI( form ) { // Suppress the default bubbles form.addEventListener( "invalid", function( event ) { event.preventDefault(); }, true ); // Support Safari, iOS Safari, and the Android browser—each of which do not prevent // form submissions by default form.addEventListener( "submit", function( event ) { if ( !this.checkValidity() ) { event.preventDefault(); } }); // Add a container to hold error messages form.insertAdjacentHTML( "afterbegin", "" ); var submitButton = form.querySelector( "button:not([type=button]), input[type=submit]" ); submitButton.addEventListener( "click", function( event ) { var invalidFields = form.querySelectorAll( ":invalid" ), listHtml = "", errorMessages = form.querySelector( ".error-messages" ), label; for ( var i = 0; i < invalidFields.length; i++ ) { label = form.querySelector( "label[for=" + invalidFields[ i ].id + "]" ); listHtml += "" + label.innerHTML + " " + invalidFields[ i ].validationMessage + ""; } // Update the list with the new error messages errorMessages.innerHTML = listHtml; // If there are errors, give focus to the first invalid field and show // the error messages container if ( invalidFields.length > 0 ) { invalidFields[ 0 ].focus(); errorMessages.style.display = "block"; } }); } // Replace the validation UI for all forms var forms = document.querySelectorAll( "form" ); for ( var i = 0; i < forms.length; i++ ) { replaceValidationUI( forms[ i ] ); } This example assumes that each form field has a corresponding , where the id attribute of the form field matches the for attribute of the . You may want to tweak the code that builds the messages themselves to match your applications, but other than that this should be something can simply drop in. Alternative UI #2: Messages under fields Sometimes, instead of showing a list of messages on the top of the screen you want to associate a message with its corresponding field. Here’s a UI that does that. To build this UI, most of the code remains the same as the first approach, with a few subtle differences in the submit button’s click event handler. function replaceValidationUI( form ) { // Suppress the default bubbles form.addEventListener( "invalid", function( event ) { event.preventDefault(); }, true ); // Support Safari, iOS Safari, and the Android browser—each of which do not prevent // form submissions by default form.addEventListener( "submit", function( event ) { if ( !this.checkValidity() ) { event.preventDefault(); } }); var submitButton = form.querySelector( "button:not([type=button]), input[type=submit]" ); submitButton.addEventListener( "click", function( event ) { var invalidFields = form.querySelectorAll( ":invalid" ), errorMessages = form.querySelectorAll( ".error-message" ), parent; // Remove any existing messages for ( var i = 0; i < errorMessages.length; i++ ) { errorMessages[ i ].parentNode.removeChild( errorMessages[ i ] ); } for ( var i = 0; i < invalidFields.length; i++ ) { parent = invalidFields[ i ].parentNode; parent.insertAdjacentHTML( "beforeend", "" + invalidFields[ i ].validationMessage + "" ); } // If there are errors, give focus to the first invalid field if ( invalidFields.length > 0 ) { invalidFields[ 0 ].focus(); } }); } // Replace the validation UI for all forms var forms = document.querySelectorAll( "form" ); for ( var i = 0; i < forms.length; i++ ) { replaceValidationUI( forms[ i ] ); } Alternative UI #3: Replacement bubbles The last UI I’ll present is a way of mimicking the browser’s validation bubble with a completely custom (and styleable) bubble built with JavaScript. Here’s the implementation in action. In this example, I’m using a Kendo UI tooltip because I don’t want to worry about handling the positioning logic of the bubbles myself. The code I’m using to build this UI is below. For this implementation I chose to use jQuery to clean up the DOM code (as Kendo UI depends on jQuery). $( "form" ).each(function() { var form = this; // Suppress the default bubbles form.addEventListener( "invalid", function( event ) { event.preventDefault(); }, true ); // Support Safari, iOS Safari, and the Android browser—each of which do not prevent // form submissions by default $( form ).on( "submit", function( event ) { if ( !this.checkValidity() ) { event.preventDefault(); } }); $( "input, select, textarea", form ) // Destroy the tooltip on blur if the field contains valid data .on( "blur", function() { var field = $( this ); if ( field.data( "kendoTooltip" ) ) { if ( this.validity.valid ) { field.kendoTooltip( "destroy" ); } else { field.kendoTooltip( "hide" ); } } }) // Show the tooltip on focus .on( "focus", function() { var field = $( this ); if ( field.data( "kendoTooltip" ) ) { field.kendoTooltip( "show" ); } }); $( "button:not([type=button]), input[type=submit]", form ).on( "click", function( event ) { // Destroy any tooltips from previous runs $( "input, select, textarea", form ).each( function() { var field = $( this ); if ( field.data( "kendoTooltip" ) ) { field.kendoTooltip( "destroy" ); } }); // Add a tooltip to each invalid field var invalidFields = $( ":invalid", form ).each(function() { var field = $( this ).kendoTooltip({ content: function() { return field[ 0 ].validationMessage; } }); }); // If there are errors, give focus to the first invalid field invalidFields.first().trigger( "focus" ).eq( 0 ).focus(); }); }); Although replacing the validation bubbles requires an unfortunate amount of code, this does come fairly close to replicating the browser’s implementation. The difference being that the JavaScript implementation is far more customizable, as you can change it to your heart’s desire. For instance, if you need to add some pink, green, and Comic Sans to your bubbles, you totally can now: The Kendo UI tooltip widget is one of the 25+ widgets available in Kendo UI Core, the free and open source distribution of Kendo UI. So you can use this code today without worrying about licensing restrictions — or paying money. You candownload the Kendo UI core source directly, use our CDN, or grab the library from Bower (bower install kendo-ui-core). Wrapping Up Although you cannot style the browser’s validation bubbles, you can suppress them and build whatever UI you’d like. Feel free to try and alter the approaches shown in this article to meet your needs. If you have any other approaches you’ve used in the past feel free to share them in the comments.
October 15, 2014
by Burke Holland
· 13,044 Views
article thumbnail
MySQL Replication: 'Got fatal error 1236' Causes and Cures
Originally Written by Muhammad Irfan MySQL replication is a core process for maintaining multiple copies of data – and replication is a very important aspect in database administration. In order to synchronize data between master and slaves you need to make sure that data transfers smoothly, and to do so you need to act promptly regarding replication errors to continue data synchronization. Here on the Percona Support team, we often help customers with replication broken-related issues. In this post I’ll highlight the top most critical replication error code 1236 along with the causes and cure. MySQL replication error “Got fatal error 1236” can be triggered by multiple reasons and I will try to cover all of them. Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: ‘log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event ‘binlog.000201′ at 5480571 This is a typical error on the slave(s) server. It reflects the problem around max_allowed_packet size. max_allowed_packet refers to single SQL statement sent to the MySQL server as binary log event from master to slave. This error usually occurs when you have a different size of max_allowed_packet on the master and slave (i.e. master max_allowed_packet size is greater then slave server). When the MySQL master server tries to send a bigger packet than defined on the slave server, the slave server then fails to accept it and hence the error. In order to alleviate this issue please make sure to have the same value for max_allowed_packet on both slave and master. You can read more about max_allowed_packet here. This error usually occurs when updating a huge number of rows on the master and it doesn’t fit into the value of slave max_allowed_packet size because slave max_allowed_packet size is lower then the master. This usually happens with queries “LOAD DATA INFILE” or “INSERT .. SELECT” queries. As per my experience, this can also be caused by application logic that can generate a huge INSERT with junk data. Take into account, that one new variable introduced in MySQL 5.6.6 and later slave_max_allowed_packet_size which controls the maximum packet size for the replication threads. It overrides the max_allowed_packet variable on slave and it’s default value is 1 GB. In this post, “max_allowed_packet and binary log corruption in MySQL,”my colleague Miguel Angel Nieto explains this error in detail. Got fatal error 1236 from master when reading data from binary log: ‘Could not find first log file name in binary log index file’ This error occurs when the slave server required binary log for replication no longer exists on the master database server. In one of the scenarios for this, your slave server is stopped for some reason for a few hours/days and when you resume replication on the slave it fails with above error. When you investigate you will find that the master server is no longer requesting binary logs which the slave server needs to pull in order to synchronize data. Possible reasons for this include the master server expired binary logs via system variable expire_logs_days – or someone manually deleted binary logs from master via PURGE BINARY LOGS command or via ‘rm -f’ command or may be you have some cronjob which archives older binary logs to claim disk space, etc. So, make sure you always have the required binary logs exists on the master server and you can update your procedures to keep binary logs that the slave server requires by monitoring the “Relay_master_log_file” variable from SHOW SLAVE STATUS output. Moreover, if you have set expire_log_days in my.cnf old binlogs expire automatically and are removed. This means when MySQL opens a new binlog file, it checks the older binlogs, and purges any that are older than the value of expire_logs_days (in days). Percona Server added a feature to expire logs based on total number of files used instead of the age of the binlog files. So in that configuration, if you get a spike of traffic, it could cause binlogs to disappear sooner than you expect. For more information check Restricting the number of binlog files. In order to resolve this problem, the only clean solution I can think of is to re-create the slave server from a master server backup or from other slave in replication topology. – Got fatal error 1236 from master when reading data from binary log: ‘binlog truncated in the middle of event; consider out of disk space on master; the first event ‘mysql-bin.000525′ at 175770780, the last event read from ‘/data/mysql/repl/mysql-bin.000525′ at 175770780, the last byte read from ‘/data/mysql/repl/mysql-bin.000525′ at 175771648.’ Usually, this caused by sync_binlog <>1 on the master server which means binary log events may not be synchronized on the disk. There might be a committed SQL statement or row change (depending on your replication format) on the master that did not make it to the slave because the event is truncated. The solution would be to move the slave thread to the next available binary log and initialize slave thread with the first available position on binary log as below: mysql>CHANGE MASTERTOMASTER_LOG_FILE='mysql-bin.000526',MASTER_LOG_POS=4; – [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: ‘Client requested master to start replication from impossible position; the first event ‘mysql-bin.010711′ at 55212580, the last event read from ‘/var/lib/mysql/log/mysql-bin.000711′ at 4, the last byte read from ‘/var/lib/mysql/log/mysql-bin.010711′ at 4.’, Error_code: 1236 I foresee master server crashed or rebooted and hence binary log events not synchronized on disk. This usually happens when sync_binlog != 1 on the master. You can investigate it as inspecting binary log contents as below: $mysqlbinlog--base64-output=decode-rows--verbose--verbose--start-position=55212580mysql-bin.010711 You will find this is the last position of binary log and end of binary log file. This issue can usually be fixed by moving the slave to the next binary log. In this case it would be: mysql>CHANGE MASTER TOMASTER_LOG_FILE='mysql-bin.000712',MASTER_LOG_POS=4; This will resume replication. To avoid corrupted binlogs on the master, enabling sync_binlog=1 on master helps in most cases. sync_binlog=1 will synchronize the binary log to disk after every commit. sync_binlog makes MySQL perform on fsync on the binary log in addition to the fsync by InnoDB. As a reminder, it has some cost impact as it will synchronize the write-to-binary log on disk after every commit. On the other hand, sync_binlog=1 overhead can be very minimal or negligible if the disk subsystem is SSD along with battery-backed cache (BBU). You can read more about this here in the manual. sync_binlog is a dynamic option that you can enable on the fly. Here’s how: mysql-master>SET GLOBAL sync_binlog=1; To make the change persistent across reboot, you can add this parameter in my.cnf. As a side note, along with replication fixes, it is always a better option to make sure your replica is in the master and to validate data between master/slaves. Fortunately, Percona Toolkit has tools for this purpose: pt-table-checksum & pt-table-sync. Before checking for replication consistency, be sure to check the replication environment and then, later, to sync any differences.
October 15, 2014
by Peter Zaitsev
· 40,824 Views
article thumbnail
Creating Executable Uber Jar’s and Native Applications with Java 8 and Maven
creating uber jar’s in java is nothing particular new, even creating executable jar’s was possible with maven long before java 8. with the first release of javafx 2, oracle introduced the javafxpackager tool, which has now been renamed to javapackager (java 8 u20). this enables developers to create native executables for any common platform, even mac app store packages; the drawbacks of the javapackager are that you must create the executable on the target platform and that it will than contain the whole jre to run the application. this means that your 100kb application gets more than 40mb, depending on your target platform. the first step on your way to an executable uber jar is to build your project and to collect all dependencies in a folder or a fat jar. this folder/fat jar will be the input for the javapackager tool, which creates the executable part. solution 1: the maven-dependency-plugin the maven-dependency-plugin contains the goal “unpack-dependencies”. we can use this goal to enhance the default “target/classes” folder with all the dependencies you need to run your application. we assume that the default maven build creates all classes of your project in the “target/classes” folder and the “unpack-dependencies” plugin copies all the project dependencies to this folder. the result is a valid input for the javapackager tool, which creates the executable. the “unpack-dependencies” goal is easy to use; you just need to set the output directory to the “target/classes” folder to ensure everything is together in one folder. a typical configuration looks like this: maven-dependency-plugin 2.6 unpack-dependencies package unpack-dependencies system org.springframework.jmx ${project.build.directory}/classes to create an executable jar from the target/classes directory we use the “exec-maven-plugin” to execute the javapackager commandline tool. org.codehaus.mojo exec-maven-plugin package-jar package exec ${env.java_home}/bin/javapackager -createjar -appclass ${app.main.class} -srcdir ${project.build.directory}/classes -outdir ./target -outfile ${project.artifactid}-app -v now we can create an executable jar, which contains all the project dependencies and that can simply be executed with “java –jar myapp.jar”. in the next step we want to create a native executable or an installer. specific configuration details for the javapackager can be found here: http://docs.oracle.com/javase/8/docs/technotes/tools/unix/javapackager.html , for this tutorial we assume that we want to create a native installer. to do so, we add a second “execution” to your “exec-maven-plugin” like this: package-jar2 package exec ${env.java_home}/bin/javapackager -deploy -native installer -appclass ${app.main.class} -srcfiles ${project.build.directory}/${artifactid}-app.jar -outdir ./target -outfile ${project.artifactid}-app -v once the configuration is done, you can run “mvn clean package” and you will find your executable jar as well as the native installer in your target folder. this solution works fine in most cases, but sometimes you can get in trouble with this plugin. when you have configuration files in your project that also exist in one of your dependencies, the unpack-dependency goal will overwrite your configuration file. for example your project contains a file like “meta-inf/service/com.myconf.file” and any dependency does contain the same file. in this case solution 2 may be a better approach. solution 2: the maven-shade plugin the maven-shade plugin provides the capability to package artifacts into an uber-jar, including its dependencies. it also provides various transformers to merge configuration files or to define the manifest of your jar file. this capability allows us to create an executable uber jar without involving the javapackager, so the packager is only needed to create the native executable. to build an executable uber jar following plugin configuration is needed: org.apache.maven.plugins maven-shade-plugin 2.3 package shade junit:junit jmock:* *:xml-apis … meta-inf/services/conf.file ${app.main.class} ${maven.compile.java.version} ${maven.compile.java.version} this configuration defines a main-class entry in the manifest and merges all “meta-inf/services/conf.files” together. the resulting executable jar file is now a valid input for the javapacker to create a native installer. the configuration to create a native installer with “exec-maven-plugin” and javapacker tool is exactly the same like in solution 1. i provided two example projects on github https://github.com/amoahcp/mvndemos where you can test both configurations. these are two simple projects with a main class starting a jetty webserver on port 8080, so the only dependency is jetty. both solutions can be used for any type of java projects, even with swing or javafx. reference: http://jacpfx.org/2014/10/08/uber-jars.html
October 12, 2014
by Andy Moncsek
· 29,451 Views
article thumbnail
Do it in Java 8: The State Monad
In a previous article (Do it in Java 8: Automatic memoization), I wrote about memoization and said that memoization is about handling state between function calls, although the value returned by the function does not change from one call to another as long as the argument is the same. I showed how this could be done automatically. There are however other cases when handling state between function calls is necessary but cannot be done this simple way. One is handling state between recursive calls. In such a situation, the function is called only once from the user point of view, but it is in fact called several times since it calls itself recursively. Most of these functions will not benefit internally from memoization. For example, the factorial function may be implemented recursively, and for one call to f(n), there will be n calls to the function, but not twice with the same value. So this function would not benefit from internal memoization. On the contrary, the Fibonacci function, if implemented according to its recursive definition, will call itself recursively a huge number of times will the same arguments. Here is the standard definition of the Fibonacci function: f(n) = f(n – 1) + f(n – 2) This definition has a major problem: calculating f(n) implies evaluating n² times the function with different values. The result is that, in practice, it is impossible to use this definition for n > 50 because the time of execution increases exponentially. But since we are calculating more than n values, it is obvious that some values are calculated several times. So this definition would be a good candidate for internal memoization. Warning: Java is not a true recursive language, so using any recursive method will eventually blow the stack, unless we use TCO (Tail Call Optimization) as described in my previous article (Do it in Java 8: recursive and corecursive Fibonacci). However, TCO can't be applied to the original Fibonacci definition because it is not tail recursive. So we will write an example working for n limited to a few thousands. The naive implementation Here is how we might implement the original definition: static BigInteger fib(BigInteger n) { return n.equals(BigInteger.ZERO) || n.equals(BigInteger.ONE) ? n : fib(n.subtract(BigInteger.ONE)).add(fib(n.subtract(BigInteger.ONE.add(BigInteger.ONE)))); } Do not try this implementation for values greater that 50. On my machine, fib(50) takes 44 minutes to return! Basic memoized version To avoid computing several times the same values, we can use a map were computed values are stored. This way, for each value, we first look into the map to see if it has already been computed. If it is present, we just retrieve it. Otherwise, we compute it, store it into the map and return it. For this, we will use a special class called Memo. This is a normal HashMap that mimic the interface of functional map, which means that insertion of a new (key, value) pair returns a map, and looking up a key returns an Optional: public class Memo extends HashMap { public Optional retrieve(BigInteger key) { return Optional.ofNullable(super.get(key)); } public Memo addEntry(BigInteger key, BigInteger value) { super.put(key, value); return this; } } Note that this is not a true functional (immutable and persistent) map, but this is not a problem since it will not be shared. We will also need a Tuple class that we define as: public class Tuple { public final A _1; public final B _2; public Tuple(A a, B b) { _1 = a; _2 = b; } } With this class, we can write the following basic implementation: static BigInteger fibMemo1(BigInteger n) { return fibMemo(n, new Memo().addEntry(BigInteger.ZERO, BigInteger.ZERO) .addEntry(BigInteger.ONE, BigInteger.ONE))._1; } static Tuple fibMemo(BigInteger n, Memo memo) { return memo.retrieve(n).map(x -> new Tuple<>(x, memo)).orElseGet(() -> { BigInteger x = fibMemo(n.subtract(BigInteger.ONE), memo)._1 .add(fibMemo(n.subtract(BigInteger.ONE).subtract(BigInteger.ONE), memo)._1); return new Tuple<>(x, memo.addEntry(n, x)); }); } This implementation works fine, provided values of n are not too big. For f(50), it returns in less that 1 millisecond, which should be compared to the 44 minutes of the naive version. Remark that we do not have to test for terminal values (f(0) and f(1). These values are simply inserted into the map at start. So, is there something we should make better? The main problem is that we have to handle the passing of the Memo map by hand. The signature of the fibMemo method is no longer fibMemo(BigInteger n), but fibMemo(BigInteger n, Memo memo). Could we simplify this? We might think about using automatic memoization as described in my previous article (Do it in Java 8: Automatic memoization ). However, this will not work: static Function fib = new Function() { @Override public BigInteger apply(BigInteger n) { return n.equals(BigInteger.ZERO) || n.equals(BigInteger.ONE) ? n : this.apply(n.subtract(BigInteger.ONE)).add(this.apply(n.subtract(BigInteger.ONE.add(BigInteger.ONE)))); } }; static Function fibm = Memoizer.memoize(fib); Beside the fact that we could not use lambdas because reference to this is not allowed, which may be worked around by using the original anonymous class syntax, the recursive call is made to the non memoized function, so it does not make things better. Using the State monad In this example, each computed value is strictly evaluated by the fibMemo method, and this is what makes the memo parameter necessary. Instead of a method returning a value, what we would need is a method returning a function that could be evaluated latter. This function would take a Memo as parameter, and this Memo instance would be necessary only at evaluation time. This is what the State monad will do. Java 8 does not provide the state monad, so we have to create it, but it is very simple. However, we first need an implementation of a list that is more functional that what Java offers. In a real case, we would use a true immutable and persistent List. The one I have written is about 1 000 lines, so I can't show it here. Instead, we will use a dummy functional list, backed by a java.util.ArrayList. Although this is less elegant, it does the same job: public class List { private java.util.List list = new ArrayList<>(); public static List empty() { return new List(); } @SafeVarargs public static List apply(T... ta) { List result = new List<>(); for (T t : ta) result.list.add(t); return result; } public List cons(T t) { List result = new List<>(); result.list.add(t); result.list.addAll(list); return result; } public U foldRight(U seed, Function> f) { U result = seed; for (int i = list.size() - 1; i >= 0; i--) { result = f.apply(list.get(i)).apply(result); } return result; } public List map(Function f) { List result = new List<>(); for (T t : list) { result.list.add(f.apply(t)); } return result; } public List filter(Function f) { List result = new List<>(); for (T t : list) { if (f.apply(t)) { result.list.add(t); } } return result; } public Optional findFirst() { return list.size() == 0 ? Optional.empty() : Optional.of(list.get(0)); } @Override public String toString() { StringBuilder s = new StringBuilder("["); for (T t : list) { s.append(t).append(", "); } return s.append("NIL]").toString(); } } The implementation is not functional, but the interface is! And although there are lots of missing capabilities, we have all we need. Now, we can write the state monad. It is often called simply State but I prefer to call it StateMonad in order to avoid confusion between the state and the monad: public class StateMonad { public final Function> runState; public StateMonad(Function> runState) { this.runState = runState; } public static StateMonad unit(A a) { return new StateMonad<>(s -> new StateTuple<>(a, s)); } public static StateMonad get() { return new StateMonad<>(s -> new StateTuple<>(s, s)); } public static StateMonad getState(Function f) { return new StateMonad<>(s -> new StateTuple<>(f.apply(s), s)); } public static StateMonad transition(Function f) { return new StateMonad<>(s -> new StateTuple<>(Nothing.instance, f.apply(s))); } public static StateMonad transition(Function f, A value) { return new StateMonad<>(s -> new StateTuple<>(value, f.apply(s))); } public static StateMonad> compose(List> fs) { return fs.foldRight(StateMonad.unit(List.empty()), f -> acc -> f.map2(acc, a -> b -> b.cons(a))); } public StateMonad flatMap(Function> f) { return new StateMonad<>(s -> { StateTuple temp = runState.apply(s); return f.apply(temp.value).runState.apply(temp.state); }); } public StateMonad map(Function f) { return flatMap(a -> StateMonad.unit(f.apply(a))); } public StateMonad map2(StateMonad sb, Function> f) { return flatMap(a -> sb.map(b -> f.apply(a).apply(b))); } public A eval(S s) { return runState.apply(s).value; } } This class is parameterized by two types: the value type A and the state type S. In our case, A will be BigInteger and S will be Memo. This class holds a function from a state to a tuple (value, state). This function is hold in the runState field. This is similar to the value hold in the Optional monad. To make it a monad, this class needs a unit method and a flatMap method. The unit method takes a value as parameter and returns a StateMonad. It could be implemented as a constructor. Here, it is a factory method. The flatMap method takes a function from A (a value) to StateMonad and return a new StateMonad. (In our case, A is the same as B.) The new type contains the new value and the new state that result from the application of the function. All other methods are convenience methods: map allows to bind a function from A to B instead of a function from A to StateMonad. It is implemented in terms of flatMap and unit. eval allows easy retrieval of the value hold by the StateMonad. getState allows creating a StateMonad from a function S -> A. transition takes a function from state to state and a value and returns a new StateMonad holding the value and the state resulting from the application of the function. In other words, it allows changing the state without changing the value. There is also another transition method taking only a function and returning a StateMonad. Nothing is a special class: public final class Nothing { public static final Nothing instance = new Nothing(); private Nothing() {} } This class could be replaced by Void, to mean that we do not care about the type. However, Void is not supposed to be instantiated, and the only reference of type Void is normally null. The problem is that null does not carry its type. We could instantiate a Void instance through introspection: Constructor constructor; constructor = Void.class.getDeclaredConstructor(); constructor.setAccessible(true); Void nothing = constructor.newInstance(); but this is really ugly, so we create a Nothing type with a single instance of it. This does the trick, although to be complete, Nothing should be able to replace any type (like null), which does not seem to be possible in Java. Using the StateMonad class, we can rewrite our program: static BigInteger fibMemo2(BigInteger n) { return fibMemo(n).eval(new Memo().addEntry(BigInteger.ZERO, BigInteger.ZERO).addEntry(BigInteger.ONE, BigInteger.ONE)); } static StateMonad fibMemo(BigInteger n) { return StateMonad.getState((Memo m) -> m.retrieve(n)) .flatMap(u -> u.map(StateMonad:: unit).orElse(fibMemo(n.subtract(BigInteger.ONE)) .flatMap(x -> fibMemo(n.subtract(BigInteger.ONE).subtract(BigInteger.ONE)) .map(x::add) .flatMap(z -> StateMonad.transition((Memo m) -> m.addEntry(n, z), z))))); } Now, the fibMemo method only takes a BigInteger as its parameter and returns a StateMonad, which means that when this method returns, nothing has been evaluated yet. The Memo doesn't even exist! To get the result, we may call the eval method, passing it the Memo instance. If you find this code difficult to understand, here is an exploded commented version using longer identifiers: static StateMonad fibMemo(BigInteger n) { /* * Create a function of type Memo -> Optional with a closure * over the n parameter. */ Function> retrieveValueFromMapIfPresent = (Memo memoizationMap) -> memoizationMap.retrieve(n); /* * Create a state from this function. */ StateMonad> initialState = StateMonad.getState(retrieveValueFromMapIfPresent); /* * Create a function for converting the value (BigInteger) into a State * Monad instance. This function will be bound to the Optional resulting * from the lookup into the map to give the result if the value was found. */ Function> createStateFromValue = StateMonad:: unit; /* * The value computation proper. This can't be easily decomposed because it * make heavy use of closures. It first calls recursively fibMemo(n - 1), * producing a StateMonad. It then flatMaps it to a new * recursive call to fibMemo(n - 2) (actually fibMemo(n - 1 - 1)) and get a * new StateMonad which is mapped to BigInteger addition * with the preceding value (x). Then it flatMaps it again with the function * y -> StateMonad.transition((Memo m) -> m.addEntry(n, z), z) which adds * the two values and returns a new StateMonad with the computed value added * to the map. */ StateMonad computedValue = fibMemo(n.subtract(BigInteger.ONE)) .flatMap(x -> fibMemo(n.subtract(BigInteger.ONE).subtract(BigInteger.ONE)) .map(x::add) .flatMap(z -> StateMonad.transition((Memo m) -> m.addEntry(n, z), z))); /* * Create a function taking an Optional as its parameter and * returning a state. This is the main function that returns the value in * the Optional if it is present and compute it and put it into the map * before returning it otherwise. */ Function, StateMonad> computeFiboValueIfAbsentFromMap = u -> u.map(createStateFromValue).orElse(computedValue); /* * Bind the computeFiboValueIfAbsentFromMap function to the initial State * and return the result. */ return initialState.flatMap(computeFiboValueIfAbsentFromMap); } The most important part is the following: StateMonad computedValue = fibMemo_(n.subtract(BigInteger.ONE)) .flatMap(x -> fibMemo_(n.subtract(BigInteger.ONE).subtract(BigInteger.ONE)) .map(x::add) .flatMap(z -> StateMonad.transition((Memo m) -> m.addEntry(n, z), z))); This kind of code is essential to functional programming, although it is sometimes replaced in other languages with “for comprehensions”. As Java 8 does not have for comprehensions we have to use this form. At this point, we have seen that using the state monad allows abstracting the handling of state. This technique can be used every time you have to handle state. More uses of the state monad The state monad may be used for many other cases were state must be maintained in a functional way. Most programs based upon maintaining state use a concept known as a State Machine. A state machine is defined by an initial state and a series of inputs. Each input submitted to the state machine will produce a new state by applying one of several possible transitions based upon a list of conditions concerning both the input and the actual state. If we take the example of a bank account, the initial state would be the initial balance of the account. Possible transition would be deposit(amount) and withdraw(amount). The conditions would be true for deposit and balance >= amount for withdraw. Given the state monad that we have implemented above, we could write a parameterized state machine: public class StateMachine { Function> function; public StateMachine(List, Transition>> transitions) { function = i -> StateMonad.transition(m -> Optional.of(new StateTuple<>(i, m)).flatMap((StateTuple t) -> transitions.filter((Tuple, Transition> x) -> x._1.test(t)).findFirst().map((Tuple, Transition> y) -> y._2.apply(t))).get()); } public StateMonad process(List inputs) { List> a = inputs.map(function); StateMonad> b = StateMonad.compose(a); return b.flatMap(x -> StateMonad.get()); } } This machine uses a bunch of helper classes. First, the inputs are represented by an interface: public interface Input { boolean isDeposit(); boolean isWithdraw(); int getAmount(); } There are two instances of inputs: public class Deposit implements Input { private final int amount; public Deposit(int amount) { super(); this.amount = amount; } @Override public boolean isDeposit() { return true; } @Override public boolean isWithdraw() { return false; } @Override public int getAmount() { return this.amount; } } public class Withdraw implements Input { private final int amount; public Withdraw(int amount) { super(); this.amount = amount; } @Override public boolean isDeposit() { return false; } @Override public boolean isWithdraw() { return true; } @Override public int getAmount() { return this.amount; } } Then come two functional interfaces for conditions and transitions: public interface Condition extends Predicate> {} public interface Transition extends Function, S> {} These act as type aliases in order to simplify the code. We could have used the predicate and the function directly. In the same manner, we use a StateTuple class instead of a normal tuple: public class StateTuple { public final A value; public final S state; public StateTuple(A a, S s) { value = Objects.requireNonNull(a); state = Objects.requireNonNull(s); } } This is exactly the same as an ordinary tuple with named members instead of numbered ones. Numbered members allows using the same class everywhere, but a specific class like this one make the code easier to read as we will see. The last utility class is Outcome, which represent the result returned by the state machine: public class Outcome { public final Integer account; public final List> operations; public Outcome(Integer account, List> operations) { super(); this.account = account; this.operations = operations; } public String toString() { return "(" + account.toString() + "," + operations.toString() + ")"; } } This again could be replaced with a Tuple>>, but using named parameters make the code easier to read. (In some functional languages, we could use type aliases for this.) Here, we use an Either class, which is another kind of monad that Java does not offer. I will not show the complete class, but only the parts that are useful for this example: public interface Either { boolean isLeft(); boolean isRight(); A getLeft(); B getRight(); static Either right(B value) { return new Right<>(value); } static Either left(A value) { return new Left<>(value); } public class Left implements Either { private final A left; private Left(A left) { super(); this.left = left; } @Override public boolean isLeft() { return true; } @Override public boolean isRight() { return false; } @Override public A getLeft() { return this.left; } @Override public B getRight() { throw new IllegalStateException("getRight() called on Left value"); } @Override public String toString() { return left.toString(); } } public class Right implements Either { private final B right; private Right(B right) { super(); this.right = right; } @Override public boolean isLeft() { return false; } @Override public boolean isRight() { return true; } @Override public A getLeft() { throw new IllegalStateException("getLeft() called on Right value"); } @Override public B getRight() { return this.right; } @Override public String toString() { return right.toString(); } } } This implementation is missing a flatMap method, but we will not need it. The Either class is somewhat like the Optional Java class in that it may be used to represent the result of an evaluation that may return a value or something else like an exception, an error message or whatever. What is important is that it can hold one of two things of different types. We now have all we need to use our state machine: public class Account { public static StateMachine createMachine() { Condition predicate1 = t -> t.value.isDeposit(); Transition transition1 = t -> new Outcome(t.state.account + t.value.getAmount(), t.state.operations.cons(Either.right(t.value.getAmount()))); Condition predicate2 = t -> t.value.isWithdraw() && t.state.account >= t.value.getAmount(); Transition transition2 = t -> new Outcome(t.state.account - t.value.getAmount(), t.state.operations.cons(Either.right(- t.value.getAmount()))); Condition predicate3 = t -> true; Transition transition3 = t -> new Outcome(t.state.account, t.state.operations.cons(Either.left(new IllegalStateException(String.format("Can't withdraw %s because balance is only %s", t.value.getAmount(), t.state.account))))); List, Transition>> transitions = List.apply( new Tuple<>(predicate1, transition1), new Tuple<>(predicate2, transition2), new Tuple<>(predicate3, transition3)); return new StateMachine<>(transitions); } } This could not be simpler. We just define each possible condition and the corresponding transition, and then build a list of tuples (Condition, Transition) that is used to instantiate the state machine. There are however to rules that must be enforced: Conditions must be put in the right order, with the more specific first and the more general last. We must be careful to be sure to match all possible cases. Otherwise, we will get an exception. At this stage, nothing has been evaluated. We did not even use the initial state! To run the state machine, we must create a list of inputs and feed it in the machine, for example: List inputs = List.apply( new Deposit(100), new Withdraw(50), new Withdraw(150), new Deposit(200), new Withdraw(150)); StateMonad = Account.createMachine().process(inputs); Again, nothing has been evaluated yet. To get the result, we just evaluate the result, using an initial state: Outcome outcome = state.eval(new Outcome(0, List.empty())) If we run the program with the list above, and call toString() on the resulting outcome (we can't do more useful things since the Either class is so minimal!) we get the following result: // // (100,[-150, 200, java.lang.IllegalStateException: Can't withdraw 150 because balance is only 50, -50, 100, NIL]) This is a tuple of the resulting balance for the account (100) and the list of operations that have been carried on. We can see that successful operations are represented by a signed integer, and failed operations are represented by an error message. This of course is a very minimal example, and as usual, one may think it would be much easier to do it the imperative way. However, think of a more complex example, like a text parser. All there is to do to adapt the state machine is to define the state representation (the Outcome class), define the possible inputs and create the list of (Condition,Transition). Going the functional way does not make the whole thing simpler. However, it allows abstracting the implementation of the state machine from the requirements. The only thing we have to do to create a new state machine is to write the new requirements!
October 9, 2014
by Pierre-Yves Saumont
· 25,933 Views · 7 Likes
article thumbnail
Spring Integration with JMS and Map Transformers
in this article i explained how spring built-in transformers works for while transforming object message to map message. sometimes the messages need to be transformed before they can be consumed to achieve a business purpose. for example, a producer uses a plain xml as its payload to produce a message, while a consumer is interested in java object or types like plain text ,name-value pairs, or json model. spring integration provides endpoints such as service activators, channel adapters, message bridges, gateways, transformers, filters, and routers. in this example how transformers endpoint transform object message to map message. references: spring integration spring with jms spring with junit mockrunner sts high level view spring-mockrunner.xml in spring-mockrunner.xml file, i defined mockqueue, mockqueueconnectionfactory for inbound queue, and outbound queue for quick testing purpose. inboundqueue is where you will publish object message from objecttomaptransformertest.java class. outboundqueue where this queue expecting mapmessage type object and this queue is listing mapmessagelistener.java class. for more information mockrunner works please check my previous article mockrunner with spring jms . pom.xml 4.0.0 org.springframework.samples spring-int-jms-basic 0.0.1-snapshot 1.6 utf-8 utf-8 3.2.3.release 1.0.13 1.7.5 4.11 org.springframework spring-context ${spring-framework.version} org.springframework spring-tx ${spring-framework.version} org.springframework.integration spring-integration-core 2.2.4.release org.springframework.integration spring-integration-jmx 2.2.4.release org.springframework.integration spring-integration-jms 2.2.4.release org.slf4j slf4j-api ${slf4j.version} compile ch.qos.logback logback-classic ${logback.version} runtime org.springframework spring-test ${spring-framework.version} test junit junit ${junit.version} test com.mockrunner mockrunner-jms 1.0.3 javax.jms jms 1.1 org.codehaus.jackson jackson-mapper-asl 1.9.3 compile spring-int-jms.xml the endpoint is configured to connect to a jms server, fetch the messages,and publish them onto a local channel i.e inputchannel. where as connection-factory, and destination referred mockqueueconnectionfactory, and mockqueue(inboundqueue) beans from spring-mockrunner.xml file. inputchannel and outputchannel defined as queue channel objecttomaptransformer: object-to-map-transformer element that takes the payload from the input channel original here mockrunner-in-queue object message and emits a name-value paired map object onto the output channel i.e outputchannel and outboundjmsadapter bean fetch this message and publish to queue i.e mockrunner-out-queue. inboundjmsadapter : inbound-channel-adapter bean is responsible for receiving messages from a jms server here it is reading from mock queue name mockrunner-in-queue see objecttomaptransformertest.java class. outboundjmsadapter : outbound-channel-adapter bean is responsible to fetch messages from the channel i.e outputchannel and publish them to jms queue or topic. in this outbounjmsadapter reading message outputchannel as mapmessage and publish to outboundqueue(mockrunner-out-queue). mapmessagelistener.java package com.spijb.listener; import javax.jms.jmsexception; import javax.jms.mapmessage; import javax.jms.session; import org.slf4j.logger; import org.slf4j.loggerfactory; import org.springframework.jms.listener.sessionawaremessagelistener; public class mapmessagelistener implements sessionawaremessagelistener { private static final logger log = loggerfactory.getlogger(mapmessagelistener.class); @override public void onmessage(mapmessage message, session session) throws jmsexception { log.info("message received \r\n"+message); } } it is plain mapmessagelistener class to print received message from queue. department.java package com.spijb.domain; import java.io.serializable; public class department implements serializable{ private static final long serialversionuid = 1l; private final integer deptno; private final string name; private final string location; public department() { deptno=10; name="sales"; location="tx"; } public department(integer dno,string name,string loc) { this.deptno=dno; this.name=name; this.location=loc; } public integer getdeptno() { return deptno; } public string getname() { return name; } public string getlocation() { return location; } @override public string tostring() { return this.deptno+"-> "+this.name+"->"+this.location; } } domain object to send as a message, by default constructor assign deptno 10 , name as sales, location as tx also provide parameter constructor. spring junit class objecttomaptransformertest.java package com.spijb.invoker; import javax.jms.jmsexception; import javax.jms.message; import javax.jms.objectmessage; import javax.jms.session; import org.junit.test; import org.junit.runner.runwith; import org.springframework.beans.factory.annotation.autowired; import org.springframework.jms.core.jmstemplate; import org.springframework.jms.core.messagecreator; import org.springframework.test.context.contextconfiguration; import org.springframework.test.context.junit4.springjunit4classrunner; import com.mockrunner.mock.jms.mockqueue; import com.spijb.domain.department; @runwith(springjunit4classrunner.class) @contextconfiguration({"classpath:spring-mockrunner.xml","classpath:spring-int-jms.xml"}) public class objecttomaptransformertest { @autowired private jmstemplate jmstemplate; @autowired private mockqueue inboundqueue; @test public void shouldsendmessage() throws interruptedexception { final department defaultdepartment = new department(); jmstemplate.send(inboundqueue,new messagecreator() { @override public message createmessage(session session) throws jmsexception { objectmessage objectmessage = session.createobjectmessage(); objectmessage.setobject(defaultdepartment); return objectmessage; } }); thread.sleep(5000); } } spring with junit class where you can send message to inputchannel i.e inboundqueue using mockrunner. output : info: started inboundjmsadapter oct 06, 2014 1:24:25 pm org.springframework.integration.endpoint.abstractendpoint start info: started org.springframework.integration.config.consumerendpointfactorybean#1 13:24:26.882 [org.springframework.jms.listener.defaultmessagelistenercontainer#0-1] info c.spijb.listener.mapmessagelistener - message received com.mockrunner.mock.jms.mockmapmessage: {location=tx, name=sales, deptno=10} oct 06, 2014 1:24:30 pm org.springframework.context.support.abstractapplicationcontext doclose info: closing org.springframework.context.support.genericapplicationcontext@5840979b: startup date [mon oct 06 13:24:25 cdt 2014]; root of context hierarchy oct 06, 2014 1:24:30 pm org.springframework.context.support.defaultlifecycleprocessor$lifecyclegroup stop info: stopping beans in phase 2147483647 in the above highlighted one is output as map.
October 9, 2014
by Upender Chinthala
· 23,014 Views
  • Previous
  • ...
  • 406
  • 407
  • 408
  • 409
  • 410
  • 411
  • 412
  • 413
  • 414
  • 415
  • ...
  • Next
  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook
×