DZone

The Latest Databases Topics

What Is a Monolith (Monoliths vs. Microservices)?
There is currently a strong trend for microservice-based architectures and frequent discussions comparing them to monoliths. There is much advice about breaking up monoliths into microservices and also some amusing fights between proponents of the two paradigms - see the great Microservices vs Monolithic Melee. The term 'monolith' is increasingly being used as a generic insult in the same way that 'legacy' is!

However, I believe that there is a great deal of misunderstanding about exactly what a 'monolith' is, and those discussing it are often talking about completely different things.

A monolith can be considered an architectural style or a software development pattern (or anti-pattern if you view it negatively). Styles and patterns usually fit into different viewtypes (a viewtype is a set, or category, of views that can be easily reconciled with each other [Clements et al., 2010]) and some basic viewtypes we can discuss are:

  • Module - the code units and their relation to each other at compile time.
  • Allocation - the mapping of the software onto its environment.
  • Runtime - the static structure of the software elements and how they interact at runtime.

A monolith could refer to any of the basic viewtypes above.

Module monolith

If you have a module monolith then all of the code for a system is in a single codebase that is compiled together and produces a single artifact. The code may still be well structured (classes and packages that are coherent and decoupled at a source level rather than a big ball of mud) but it is not split into separate modules for compilation. Conversely, a non-monolithic module design may have code split into multiple modules or libraries that can be compiled separately, stored in repositories and referenced when required. There are advantages and disadvantages to both, but this tells you very little about how the code is used - it is primarily done for development management.
Allocation monolith

For an allocation monolith, all of the code is shipped/deployed at the same time. In other words, once the compiled code is 'ready for release' then a single version is shipped to all nodes. All running components have the same version of the software running at any point in time. This is independent of whether the module structure is a monolith. You may have compiled the entire codebase at once before deployment or you may have created a set of deployment artifacts from multiple sources and versions. Either way, this version of the system is deployed everywhere at once (often by stopping the entire system, rolling out the software and then restarting).

A non-monolithic allocation would involve deploying different versions to individual nodes at different times. This is again independent of the module structure, as different versions of a module monolith could be deployed individually.

Runtime monolith

A runtime monolith will have a single application or process performing the work for the system (although the system may have multiple, external dependencies). Many systems have traditionally been written like this (especially line-of-business systems such as payroll, accounts payable, CMS, etc.). Whether the runtime is a monolith is independent of whether the system code is a module monolith or not. A runtime monolith often implies an allocation monolith if there is only one main node/component to be deployed (although this is not the case if a new version of software is rolled out across regions, with separate users, over a period of time).

Note that my examples above are slightly forced for the viewtypes and it won't be as hard-and-fast in the real world.

Conclusion

Be very careful when arguing about 'microservices vs monoliths'. A direct comparison is only possible when discussing the runtime viewtype and properties.
You should also not assume that moving away from a module or allocation monolith will magically enable a microservice architecture (although it will probably help). If you are moving to a microservice architecture then I'd advise you to consider all these viewtypes and align your boundaries across them, i.e. don't just code, build and distribute a monolith that exposes subsets of itself on different nodes.
November 20, 2014
by Robert Annett
· 15,838 Views · 1 Like
How to Compress Responses in Java REST API with GZip and Jersey
There may be cases when your REST API provides responses that are very long, and we all know how important transfer speed and bandwidth still are on mobile devices/networks. I think this is the first performance optimization point one needs to address when developing REST APIs that support mobile apps. Guess what? Because responses are text, we can compress them. And with the power of today's smartphones and tablets, uncompressing them on the client side should not be a big deal. So in this post I will present how you can SELECTIVELY compress your REST API responses, if you've built your API in Java with Jersey, which is the JAX-RS Reference Implementation (and more).

1. Jersey filters and interceptors

Well, thanks to Jersey's powerful Filters and Interceptors features, the implementation is fairly easy. Whereas filters are primarily intended to manipulate request and response parameters like HTTP headers, URIs and/or HTTP methods, interceptors are intended to manipulate entities, via manipulating entity input/output streams.

You've seen the power of filters in my posts How to add CORS support on the server side in Java with Jersey, where I've shown how to CORS-enable a REST API, and How to log in Spring with SLF4J and Logback, where I've shown how to log requests and responses from the REST API. For compressing, we will be using a GZip WriterInterceptor. A writer interceptor is used for cases where an entity is written to the "wire", which on the server side, as in this case, means when writing out a response entity.

1.1. GZip Writer Interceptor

So let's have a look at our GZip Writer Interceptor:

    package org.codingpedia.demo.rest.interceptors;

    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;

    import javax.ws.rs.WebApplicationException;
    import javax.ws.rs.core.MultivaluedMap;
    import javax.ws.rs.ext.Provider;
    import javax.ws.rs.ext.WriterInterceptor;
    import javax.ws.rs.ext.WriterInterceptorContext;

    @Provider
    @Compress
    public class GZIPWriterInterceptor implements WriterInterceptor {

        @Override
        public void aroundWriteTo(WriterInterceptorContext context)
                throws IOException, WebApplicationException {

            // Tell the client that the entity body is gzip-compressed
            MultivaluedMap<String, Object> headers = context.getHeaders();
            headers.add("Content-Encoding", "gzip");

            // Wrap the original stream so the entity bytes are compressed on the way out
            final OutputStream outputStream = context.getOutputStream();
            context.setOutputStream(new GZIPOutputStream(outputStream));
            context.proceed();
        }
    }

Note:

  • It implements WriterInterceptor, which is an interface for message body writer interceptors that wrap around calls to javax.ws.rs.ext.MessageBodyWriter.writeTo.
  • Providers implementing the WriterInterceptor contract must be either programmatically registered in a JAX-RS runtime or annotated with the @Provider annotation to be automatically discovered by the JAX-RS runtime during a provider scanning phase.
  • @Compress is the name binding annotation, which we will discuss in more detail in the coming paragraph.
  • "The interceptor gets a output stream from the WriterInterceptorContext and sets a new one which is a GZIP wrapper of the original output stream. After all interceptors are executed the output stream lastly set to the WriterInterceptorContext will be used for serialization of the entity. In the example above the entity bytes will be written to the GZIPOutputStream which will compress the stream data and write them to the original output stream. The original stream is always the stream which writes the data to the "wire". When the interceptor is used on the server, the original output stream is the stream into which writes data to the underlying server container stream that sends the response to the client." [2]
  • "The overridden method aroundWriteTo() gets WriterInterceptorContext as a parameter. This context contains getters and setters for header parameters, request properties, entity, entity stream and other properties." [2]
  • When you compress your response you should set the "Content-Encoding" header to "gzip".

1.2. Compress annotation

Filters and interceptors can be name-bound. Name binding is a concept that allows you to tell a JAX-RS runtime that a specific filter or interceptor will be executed only for a specific resource method. When a filter or an interceptor is limited only to a specific resource method we say that it is name-bound. Filters and interceptors that do not have such a limitation are called global.

In our case we've built the @Compress annotation:

    package org.codingpedia.demo.rest.interceptors;

    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;

    import javax.ws.rs.NameBinding;

    // @Compress annotation is the name binding annotation
    @NameBinding
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Compress {}

and used it to mark methods on resources which should be gzipped (e.g. when GET-ing all the podcasts with the PodcastsResource):

    @Component
    @Path("/podcasts")
    public class PodcastsResource {

        @Autowired
        private PodcastService podcastService;

        ...........................

        /*
         * *********************************** READ ***********************************
         */

        /**
         * Returns all resources (podcasts) from the database
         *
         * @return
         * @throws IOException
         * @throws JsonMappingException
         * @throws JsonGenerationException
         * @throws AppException
         */
        @GET
        @Compress
        @Produces({ MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML })
        public List getPodcasts(
                @QueryParam("orderByInsertionDate") String orderByInsertionDate,
                @QueryParam("numberDaysToLookBack") Integer numberDaysToLookBack)
                throws IOException, AppException {
            List podcasts = podcastService.getPodcasts(
                    orderByInsertionDate, numberDaysToLookBack);
            return podcasts;
        }

        ...........................
    }

2. Testing

2.1. SOAPui

Well, if you are testing with SOAPui, you can issue the following request against the PodcastsResource.

Request:

    GET http://localhost:8888/demo-rest-jersey-spring/podcasts/?orderByInsertionDate=DESC HTTP/1.1
    Accept-Encoding: gzip,deflate
    Accept: application/json, application/xml
    Host: localhost:8888
    Connection: Keep-Alive
    User-Agent: Apache-HttpClient/4.1.1 (java 1.5)

Response:

    HTTP/1.1 200 OK
    Content-Type: application/json
    Content-Encoding: gzip
    Content-Length: 409
    Server: Jetty(9.0.7.v20131107)

    [
      {
        "id": 2,
        "title": "Quarks & Co - zum Mitnehmen",
        "linkOnPodcastpedia": "http://www.podcastpedia.org/quarks",
        "feed": "http://podcast.wdr.de/quarks.xml",
        "description": "Quarks & Co: Das Wissenschaftsmagazin",
        "insertionDate": "2014-10-29T10:46:13.00+0100"
      },
      {
        "id": 1,
        "title": "- The Naked Scientists Podcast - Stripping Down Science",
        "linkOnPodcastpedia": "http://www.podcastpedia.org/podcasts/792/-The-Naked-Scientists-Podcast-Stripping-Down-Science",
        "feed": "feed_placeholder",
        "description": "The Naked Scientists flagship science show brings you a lighthearted look at the latest scientific breakthroughs, interviews with the world top scientists, answers to your science questions and science experiments to try at home.",
        "insertionDate": "2014-10-29T10:46:02.00+0100"
      }
    ]

SOAPui recognizes the Content-Encoding: gzip header we've added in the GZIPWriterInterceptor, automatically uncompresses the response, and displays it in human-readable form.

Well, that's it. You've learned how Jersey makes it straightforward to compress REST API responses.

Tip: If you really want to learn how to design and implement a REST API in Java, read the following: Tutorial – REST API design and implementation in Java with Jersey and Spring
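To see what the interceptor's stream wrapping does without a running server, the following is a minimal sketch that uses only the JDK. It compresses a JSON-like payload with GZIPOutputStream, as the interceptor does on the way out, and then reads it back with GZIPInputStream, as a client honoring Content-Encoding: gzip would. The class name GzipRoundTrip, its method names, and the sample payload are illustrative assumptions and are not part of the original project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the article's project.
class GzipRoundTrip {

    /** Compresses text the way the interceptor's wrapped stream would. */
    static byte[] compress(String text) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        // Closing the GZIPOutputStream writes the gzip trailer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(wire)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wire.toByteArray();
    }

    /** Decompresses a gzip body, as a client would after seeing Content-Encoding: gzip. */
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up, repetitive payload standing in for a long list response.
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < 200; i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append("{\"id\":").append(i).append(",\"title\":\"podcast\"}");
        }
        json.append(']');

        byte[] compressed = compress(json.toString());
        System.out.println("original bytes:   " + json.length());
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok:    "
                + json.toString().equals(decompress(compressed)));
    }
}
```

How much is saved depends on the payload. Short responses can even grow slightly because of the gzip header and trailer, which is one reason to bind compression selectively to the methods that return long lists.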
November 18, 2014
by Adrian Matei
· 62,692 Views · 2 Likes
Coldfusion Example: Using jQuery UI Accordion with a ColdFusion Query
A reader pinged me yesterday with a simple problem that I thought would be good to share on the blog. He had a query of events that he wanted to use with jQuery UI's Accordion control. The Accordion control simply takes content and splits it into various "panes" with one visible at a time. For his data, he wanted to split his content into panes designated by a unique month and year. Here is a quick demo of that in action.

I began by creating a query to store my data. I created a query with a date and title property and then randomly chose to add 0 to 3 "events" over the next twelve months. I specifically wanted to support 0 to ensure my demo handled months without any data.

    q = queryNew("date,title");
    for(i=1; i<12; i++) {
        //for each month, we add 0-3 events (some months may not have data)
        toAdd = randRange(0, 3);

        for(k=0; k
    [remainder of listing not preserved]

To handle creating the accordion, I had to follow the rules jQuery UI set up for the control. Basically: wrap the entire set of data in a div, and separate each "pane" with an h3 and inner div. To handle this, I have to know when a new unique month/year "block" starts. I store this in a variable, lastDateStr, and just check it in every iteration over the query. I also need to ensure that on the last row I close the div.

[Template listing not preserved; the surviving output expressions are #thisDateStr# and #title#.]

And the end result: [screenshot not preserved]

So, not rocket science, but hopefully helpful to someone. Here is the entire template if you want to try it yourself.

[Full template listing not preserved; the surviving part repeats the query setup shown above.]
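The grouping logic described above (track lastDateStr, open a new h3 and inner div when the month/year changes, and close the last div at the end) can be sketched as follows. This is written in Java rather than ColdFusion and is not the author's template. The class AccordionSketch, the Event type, and the markup details such as the accordion id are assumptions made for illustration. The sketch also assumes the events are already sorted by date and does not HTML-escape titles.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the grouping rule, not the original template.
class AccordionSketch {

    /** One row of the events query: a date and a title. */
    static final class Event {
        final LocalDate date;
        final String title;

        Event(LocalDate date, String title) {
            this.date = date;
            this.title = title;
        }
    }

    private static final DateTimeFormatter MONTH_YEAR =
            DateTimeFormatter.ofPattern("MMMM yyyy", Locale.ENGLISH);

    /** Builds accordion markup; events must already be sorted by date. */
    static String render(List<Event> events) {
        StringBuilder html = new StringBuilder("<div id=\"accordion\">\n");
        String lastDateStr = "";
        for (Event e : events) {
            String thisDateStr = e.date.format(MONTH_YEAR);
            if (!thisDateStr.equals(lastDateStr)) {
                // A new month/year block starts: close the previous pane, if any.
                if (!lastDateStr.isEmpty()) {
                    html.append("</div>\n");
                }
                html.append("<h3>").append(thisDateStr).append("</h3>\n<div>\n");
                lastDateStr = thisDateStr;
            }
            html.append("<p>").append(e.title).append("</p>\n");
        }
        // Close the last pane, but only if at least one was opened.
        if (!lastDateStr.isEmpty()) {
            html.append("</div>\n");
        }
        return html.append("</div>").toString();
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event(LocalDate.of(2014, 11, 3), "Event A"),
                new Event(LocalDate.of(2014, 11, 20), "Event B"),
                new Event(LocalDate.of(2015, 1, 5), "Event C"));
        System.out.println(render(events));
    }
}
```

Months with no events simply never produce a header, which covers the "0 events" case that the demo data was designed to exercise.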
November 13, 2014
by Raymond Camden
· 4,533 Views
How to Deal with MySQL Deadlocks
Originally Written by Peiran Song

A deadlock in MySQL happens when two or more transactions mutually hold and request locks, creating a cycle of dependencies. In a transaction system, deadlocks are a fact of life and not completely avoidable. InnoDB automatically detects transaction deadlocks, rolls back a transaction immediately and returns an error. It uses a metric to pick the easiest transaction to roll back. Though an occasional deadlock is not something to worry about, frequent occurrences call for attention.

Before MySQL 5.6, only the latest deadlock could be reviewed, using the SHOW ENGINE INNODB STATUS command. But with Percona Toolkit's pt-deadlock-logger you can have deadlock information retrieved from SHOW ENGINE INNODB STATUS at a given interval and saved to a file or table for later diagnosis. For more information on using pt-deadlock-logger, see this post. With MySQL 5.6, you can enable a new variable, innodb_print_all_deadlocks, to have all deadlocks in InnoDB recorded in the mysqld error log.

Before and above all diagnosis, it is always an important practice to have the applications catch the deadlock error (MySQL error no. 1213) and handle it by retrying the transaction.

How to diagnose a MySQL deadlock

A MySQL deadlock could involve more than two transactions, but the LATEST DETECTED DEADLOCK section only shows the last two transactions. Also, it only shows the last statement executed in the two transactions, and the locks from the two transactions that created the cycle. What is missing are the earlier statements that might have really acquired the locks. I will show some tips on how to collect the missing statements.

Let's look at two examples to see what information is given.
Example 1:

     1 141013 6:06:22
     2 *** (1) TRANSACTION:
     3 TRANSACTION 876726B90, ACTIVE 7 sec setting auto-inc lock
     4 mysql tables in use 1, locked 1
     5 LOCK WAIT 9 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 4
     6 MySQL thread id 155118366, OS thread handle 0x7f59e638a700, query id 87987781416 localhost msandbox update
     7 INSERT INTO t1 (col1, col2, col3, col4) values (10, 20, 30, 'hello')
     8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
     9 TABLE LOCK table `mydb`.`t1` trx id 876726B90 lock mode AUTO-INC waiting
    10 *** (2) TRANSACTION:
    11 TRANSACTION 876725B2D, ACTIVE 9 sec inserting
    12 mysql tables in use 1, locked 1
    13 876 lock struct(s), heap size 80312, 1022 row lock(s), undo log entries 1002
    14 MySQL thread id 155097580, OS thread handle 0x7f585be79700, query id 87987761732 localhost msandbox update
    15 INSERT INTO t1 (col1, col2, col3, col4) values (7, 86, 62, "a lot of things"), (7, 76, 62, "many more")
    16 *** (2) HOLDS THE LOCK(S):
    17 TABLE LOCK table `mydb`.`t1` trx id 876725B2D lock mode AUTO-INC
    18 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
    19 RECORD LOCKS space id 44917 page no 529635 n bits 112 index `PRIMARY` of table `mydb`.`t2` trx id 876725B2D lock mode S locks rec but not gap waiting
    20 *** WE ROLL BACK TRANSACTION (1)

Line 1 gives the time when the deadlock happened. If your application code catches and logs deadlock errors, which it should, then you can match this timestamp with the timestamps of deadlock errors in the application log. That gives you the transaction that got rolled back. From there, retrieve all statements from that transaction.

Lines 3 & 11: take note of the transaction number and ACTIVE time. If you log SHOW ENGINE INNODB STATUS output periodically (which is a good practice), then you can search previous outputs for the transaction number to hopefully see more statements from the same transaction. The ACTIVE sec value gives a hint on whether the transaction is a single-statement or multi-statement one.

Lines 4 & 12: the tables in use and locked are only with respect to the current statement. So having 1 table in use does not necessarily mean that the transaction involves 1 table only.

Lines 5 & 13: these are worth attention as they tell how many changes the transaction had made, which is the "undo log entries", and how many row locks it held, which is "row lock(s)". This information hints at the complexity of the transaction.

Lines 6 & 14: take note of the thread id, connecting host and connecting user. If you use different MySQL users for different application functions, which is another good practice, then you can tell which application area the transaction comes from based on the connecting host and user.

Line 9: for the first transaction, it only shows the lock it was waiting for, in this case the AUTO-INC lock on table t1. Other possible values are S for shared lock and X for exclusive, with or without gap locks.

Lines 16 & 17: for the second transaction, it shows the lock(s) it held, in this case the AUTO-INC lock, which was what TRANSACTION (1) was waiting for.

Lines 18 & 19 show which lock TRANSACTION (2) was waiting for. In this case, it was a shared record lock (not a gap lock) on another table's primary key. There are only a few sources for a shared record lock in InnoDB:

1) use of SELECT … LOCK IN SHARE MODE
2) on foreign key referenced record(s)
3) with INSERT INTO… SELECT, shared locks on the source table

The current statement of trx(2) is a simple insert to table t1, so 1 and 3 are eliminated. By checking SHOW CREATE TABLE t1, you could confirm that the S lock was due to a foreign key constraint to the parent table t2.
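When deadlock reports are collected regularly, the fields called out above (transaction number, ACTIVE time, thread id, and which transaction was rolled back) can be extracted mechanically so they are easy to match against application and query logs. The following is a minimal sketch that assumes the report layout shown in Example 1. The class DeadlockLogSketch, its method names, and the regular expressions are illustrative assumptions, and other MySQL versions may format these lines differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the report layout shown in Example 1.
class DeadlockLogSketch {

    /** The identifying fields of one transaction in the report. */
    static final class Trx {
        final String id;
        final int activeSeconds;
        final long threadId;

        Trx(String id, int activeSeconds, long threadId) {
            this.id = id;
            this.activeSeconds = activeSeconds;
            this.threadId = threadId;
        }
    }

    // Matches e.g. "TRANSACTION 876726B90, ACTIVE 7 sec" (report lines 3 and 11).
    private static final Pattern HEADER =
            Pattern.compile("TRANSACTION ([0-9A-F]+), ACTIVE (\\d+) sec");
    // Matches e.g. "MySQL thread id 155118366" (report lines 6 and 14).
    private static final Pattern THREAD =
            Pattern.compile("MySQL thread id (\\d+)");
    // Matches e.g. "WE ROLL BACK TRANSACTION (1)" (report line 20).
    private static final Pattern VICTIM =
            Pattern.compile("WE ROLL BACK TRANSACTION \\((\\d+)\\)");

    static final String SAMPLE =
            "*** (1) TRANSACTION:\n"
            + "TRANSACTION 876726B90, ACTIVE 7 sec setting auto-inc lock\n"
            + "MySQL thread id 155118366, OS thread handle 0x7f59e638a700\n"
            + "*** (2) TRANSACTION:\n"
            + "TRANSACTION 876725B2D, ACTIVE 9 sec inserting\n"
            + "MySQL thread id 155097580, OS thread handle 0x7f585be79700\n"
            + "*** WE ROLL BACK TRANSACTION (1)\n";

    /** Pairs each transaction header with the first thread id that follows it. */
    static List<Trx> transactions(String report) {
        List<Trx> result = new ArrayList<>();
        Matcher header = HEADER.matcher(report);
        Matcher thread = THREAD.matcher(report);
        while (header.find() && thread.find(header.end())) {
            result.add(new Trx(header.group(1),
                    Integer.parseInt(header.group(2)),
                    Long.parseLong(thread.group(1))));
        }
        return result;
    }

    /** Returns the number of the rolled-back transaction, or -1 if absent. */
    static int victim(String report) {
        Matcher m = VICTIM.matcher(report);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        for (Trx t : transactions(SAMPLE)) {
            System.out.println("trx " + t.id + " active " + t.activeSeconds
                    + "s thread " + t.threadId);
        }
        System.out.println("rolled back: transaction (" + victim(SAMPLE) + ")");
    }
}
```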
Example 2: With MySQL community version, each record lock has the record content printed:

     1 2014-10-11 10:41:12 7f6f912d7700
     2 *** (1) TRANSACTION:
     3 TRANSACTION 2164000, ACTIVE 27 sec starting index read
     4 mysql tables in use 1, locked 1
     5 LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
     6 MySQL thread id 9, OS thread handle 0x7f6f91296700, query id 87 localhost root updating
     7 update t1 set name = 'b' where id = 3
     8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
     9 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164000 lock_mode X locks rec but not gap waiting
    10 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
    11 0: len 4; hex 80000003; asc ;;
    12 1: len 6; hex 000000210521; asc ! !;;
    13 2: len 7; hex 180000122117cb; asc ! ;;
    14 3: len 4; hex 80000008; asc ;;
    15 4: len 1; hex 63; asc c;;
    16
    17 *** (2) TRANSACTION:
    18 TRANSACTION 2164001, ACTIVE 18 sec starting index read
    19 mysql tables in use 1, locked 1
    20 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
    21 MySQL thread id 10, OS thread handle 0x7f6f912d7700, query id 88 localhost root updating
    22 update t1 set name = 'c' where id = 2
    23 *** (2) HOLDS THE LOCK(S):
    24 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164001 lock_mode X locks rec but not gap
    25 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
    26 0: len 4; hex 80000003; asc ;;
    27 1: len 6; hex 000000210521; asc ! !;;
    28 2: len 7; hex 180000122117cb; asc ! ;;
    29 3: len 4; hex 80000008; asc ;;
    30 4: len 1; hex 63; asc c;;
    31
    32 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
    33 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164001 lock_mode X locks rec but not gap waiting
    34 Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
    35 0: len 4; hex 80000002; asc ;;
    36 1: len 6; hex 000000210520; asc ! ;;
    37 2: len 7; hex 17000001c510f5; asc ;;
    38 3: len 4; hex 80000009; asc ;;
    39 4: len 1; hex 62; asc b;;

Lines 9 & 10: The 'space id' is the tablespace id, and 'page no' gives which page the record lock is on inside the tablespace. The 'n bits' value is not the page offset; it is the number of bits in the lock bitmap. The page offset is the 'heap no' on line 10.

Lines 11~15: These show the record data in hex numbers. Field 0 is the clustered index (primary key). Ignoring the highest bit, the value is 3. Field 1 is the transaction id of the transaction which last modified this record; hex 210521 is 2164001 in decimal, which is TRANSACTION (2). Field 2 is the rollback pointer. Starting from field 3 is the rest of the row data. Field 3 is an integer column, value 8. Field 4 is a string column with the character 'c'. By reading the data, we know exactly which row is locked and what its current value is.

What else can we learn from analysis?

Since most MySQL deadlocks happen between two transactions, we could start the analysis based on that assumption. In Example 1, trx (2) was waiting on a shared lock, so trx (1) either held a shared or exclusive lock on that primary key record of table t2. Let's say col2 is the foreign key column. By checking the current statement of trx(1), we know it did not require the same record lock, so it must be some previous statement in trx(1) that required S or X lock(s) on t2's PK record(s). Trx (1) only made 4 row changes in 7 seconds.
From this you have learned a few characteristics of trx(1): it does a lot of processing but makes few changes; its changes involve tables t1 and t2, with a single record insertion to t2. This information, combined with other data, could help developers locate the transaction.

Where else can we find previous statements of the transactions?

Besides the application log and previous SHOW ENGINE INNODB STATUS output, you may also leverage the binlog, slow log and/or general query log. With the binlog, if binlog_format=statement, each binlog event would have the thread_id. Only committed transactions are logged into the binlog, so we could only look for Trx(2) in the binlog. In the case of Example 1, we know when the deadlock happened, and we know Trx(2) started 9 seconds earlier. We can run mysqlbinlog on the right binlog file and look for statements with thread_id = 155097580. It is always good to then cross-reference the statements with the application code to confirm.

    $ mysqlbinlog -vvv --start-datetime="2014-10-13 6:06:12" --stop-datetime="2014-10-13 6:06:22" mysql-bin.000010 > binlog_1013_0606.out

With Percona Server 5.5 and above, you can set log_slow_verbosity to include the InnoDB transaction id in the slow log. Then if you have long_query_time = 0, you would be able to catch all statements, including those rolled back, in the slow log file. With the general query log, the thread id is included and could be used to look for related statements.

How to avoid a MySQL deadlock

There are things we could do to eliminate a deadlock after we understand it.

– Make changes to the application. In some cases, you could greatly reduce the frequency of deadlocks by splitting a long transaction into smaller ones, so locks are released sooner. In other cases, the deadlock arises because two transactions touch the same sets of data, either in one or more tables, in different orders. Then change them to access data in the same order; in other words, serialize the access. That way you would have a lock wait instead of a deadlock when the transactions happen concurrently.

– Make changes to the table schema, such as removing a foreign key constraint to detach two tables, or adding indexes to minimize the rows scanned and locked.

– In case of gap locking, you may change the transaction isolation level to read committed for the session or transaction to avoid it. But then the binlog format for the session or transaction would have to be ROW or MIXED.
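The advice earlier in the article, to catch MySQL error 1213 in the application and retry the transaction, can be sketched with plain JDBC types. The following is a minimal illustration and not a complete data-access layer. The names DeadlockRetry, SqlWork, and withRetry are assumptions made for this example. It assumes the supplied work re-runs the whole transaction from its first statement, since InnoDB rolls back the entire victim transaction, and it omits backoff between attempts for brevity.

```java
import java.sql.SQLException;

// Hypothetical retry helper; names are illustrative.
class DeadlockRetry {

    /** MySQL vendor error code for "Deadlock found when trying to get lock". */
    static final int ER_LOCK_DEADLOCK = 1213;

    /** A unit of work that runs one complete transaction. */
    @FunctionalInterface
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    /**
     * Runs the work, retrying only when it fails with a deadlock error.
     * Any other SQLException is rethrown immediately.
     */
    static <T> T withRetry(int maxAttempts, SqlWork<T> work) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                if (e.getErrorCode() != ER_LOCK_DEADLOCK) {
                    throw e; // not a deadlock: do not retry
                }
                last = e; // deadlock victim: the whole transaction was rolled back
            }
        }
        throw last; // every attempt deadlocked
    }

    public static void main(String[] args) throws SQLException {
        int[] calls = {0};
        // Simulated transaction: deadlocks twice, then succeeds.
        String result = withRetry(3, () -> {
            if (++calls[0] < 3) {
                throw new SQLException("Deadlock found", "40001", ER_LOCK_DEADLOCK);
            }
            return "committed";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a real application the work passed to withRetry would open the transaction, execute its statements, and commit. Logging each caught deadlock with a timestamp is what makes it possible to match application errors against the deadlock report.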
November 12, 2014
by Peter Zaitsev
· 31,552 Views
Building Microservices with Spring Boot and Apache Thrift. Part 1
In the modern world of microservices it's important to provide strict and polyglot clients for your service. It's better if your API is self-documented. One of the best tools for this is Apache Thrift. I want to explain how to use it with my favorite platform for microservices - Spring Boot.

All project source code is available on GitHub: https://github.com/bsideup/spring-boot-thrift

Project skeleton

I will use Gradle to build our application. First, we need our main build.gradle file:

    buildscript {
        repositories {
            jcenter()
        }
        dependencies {
            classpath("org.springframework.boot:spring-boot-gradle-plugin:1.1.8.RELEASE")
        }
    }

    allprojects {
        repositories {
            jcenter()
        }

        apply plugin: 'base'
        apply plugin: 'idea'
    }

    subprojects {
        apply plugin: 'java'
    }

Nothing special for a Spring Boot project. Then we need a gradle file for thrift protocol modules (we will reuse it in the next part):

    import org.gradle.internal.os.OperatingSystem

    repositories {
        ivy {
            artifactPattern "http://dl.bintray.com/bsideup/thirdparty/[artifact]-[revision](-[classifier]).[ext]"
        }
    }

    buildscript {
        repositories {
            jcenter()
        }
        dependencies {
            classpath "ru.trylogic.gradle.plugins:gradle-thrift-plugin:0.1.1"
        }
    }

    apply plugin: ru.trylogic.gradle.thrift.plugins.ThriftPlugin

    task generateThrift(type : ru.trylogic.gradle.thrift.tasks.ThriftCompileTask) {
        generator = 'java:beans,hashcode'
        destinationDir = file("generated-src/main/java")
    }

    sourceSets {
        main {
            java {
                srcDir generateThrift.destinationDir
            }
        }
    }

    clean {
        delete generateThrift.destinationDir
    }

    idea {
        module {
            sourceDirs += [file('src/main/thrift'), generateThrift.destinationDir]
        }
    }

    compileJava.dependsOn generateThrift

    dependencies {
        def thriftVersion = '0.9.1';
        Map platformMapping = [
            (OperatingSystem.WINDOWS) : 'win',
            (OperatingSystem.MAC_OS) : 'osx'
        ].withDefault { 'nix' }

        thrift "org.apache.thrift:thrift:$thriftVersion:${platformMapping.get(OperatingSystem.current())}@bin"

        compile "org.apache.thrift:libthrift:$thriftVersion"
        compile 'org.slf4j:slf4j-api:1.7.7'
    }

We're using my Thrift plugin for Gradle. Thrift will generate source to the "generated-src/main/java" directory. By default, Thrift uses slf4j v1.5.8, while Spring Boot uses v1.7.7. This causes an error at runtime when you run your application, which is why we have to force the slf4j-api dependency.

Calculator service

Let's start with a simple calculator service. It will have 2 modules: protocol and app. We will start with protocol. Your project should look as follows:

    calculator/
        protocol/
            src/
                main/
                    thrift/
                        calculator.thrift
            build.gradle
    build.gradle
    settings.gradle
    thrift.gradle

Where calculator/protocol/build.gradle contains only one line:

    apply from: rootProject.file('thrift.gradle')

Don't forget to put these lines in settings.gradle, otherwise your modules will not be visible to Gradle:

    include 'calculator:protocol'
    include 'calculator:app'

Calculator protocol

Even if you're not familiar with Thrift, its protocol description file (calculator/protocol/src/main/thrift/calculator.thrift) should be very clear to you:

    namespace cpp com.example.calculator
    namespace d com.example.calculator
    namespace java com.example.calculator
    namespace php com.example.calculator
    namespace perl com.example.calculator
    namespace as3 com.example.calculator

    enum TOperation {
        ADD = 1,
        SUBTRACT = 2,
        MULTIPLY = 3,
        DIVIDE = 4
    }

    exception TDivisionByZeroException {
    }

    service TCalculatorService {
        i32 calculate(1:i32 num1, 2:i32 num2, 3:TOperation op) throws (1:TDivisionByZeroException divisionByZero);
    }

Here we define TCalculatorService with only one method - calculate. It can throw an exception of type TDivisionByZeroException. Note how many languages we're supporting out of the box (in this example we will use only Java as a target, though).

Now run ./gradlew generateThrift and you will get the generated Java protocol source in the calculator/protocol/generated-src/main/java/ folder.

Calculator application

Next, we need to create the service application itself.
Just create the calculator/app/ folder with the following structure:

    src/
        main/
            java/
                com/
                    example/
                        calculator/
                            handler/
                                CalculatorServiceHandler.java
                            service/
                                CalculatorService.java
                            CalculatorApplication.java
    build.gradle

Our build.gradle file for the app module should look like this:

    apply plugin: 'spring-boot'

    dependencies {
        compile project(':calculator:protocol')

        compile 'org.springframework.boot:spring-boot-starter-web'

        testCompile 'org.springframework.boot:spring-boot-starter-test'
    }

Here we have a dependency on protocol and the typical starters for a Spring Boot web app.

CalculatorApplication is our main class. In this example I will configure Spring in the same file, but in your apps you should use a separate config class instead.

    package com.example.calculator;

    import com.example.calculator.handler.CalculatorServiceHandler;
    import org.apache.thrift.protocol.*;
    import org.apache.thrift.server.TServlet;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
    import org.springframework.context.annotation.*;

    import javax.servlet.Servlet;

    @Configuration
    @EnableAutoConfiguration
    @ComponentScan
    public class CalculatorApplication {

        public static void main(String[] args) {
            SpringApplication.run(CalculatorApplication.class, args);
        }

        @Bean
        public TProtocolFactory tProtocolFactory() {
            // We will use binary protocol, but it's possible to use JSON and few others as well
            return new TBinaryProtocol.Factory();
        }

        // The bean name "calculator" determines the servlet's URL mapping
        @Bean
        public Servlet calculator(TProtocolFactory protocolFactory, CalculatorServiceHandler handler) {
            return new TServlet(new TCalculatorService.Processor<>(handler), protocolFactory);
        }
    }

You may ask why the Thrift servlet bean is called "calculator". Spring Boot maps a servlet bean to a URL derived from its bean name, so our servlet will be available at /calculator/.
After that we need a Thrift handler class:

    package com.example.calculator.handler;

    import com.example.calculator.*;
    import com.example.calculator.service.CalculatorService;
    import org.apache.thrift.TException;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Component;

    @Component
    public class CalculatorServiceHandler implements TCalculatorService.Iface {

        @Autowired
        CalculatorService calculatorService;

        @Override
        public int calculate(int num1, int num2, TOperation op) throws TException {
            switch(op) {
                case ADD:
                    return calculatorService.add(num1, num2);
                case SUBTRACT:
                    return calculatorService.subtract(num1, num2);
                case MULTIPLY:
                    return calculatorService.multiply(num1, num2);
                case DIVIDE:
                    try {
                        return calculatorService.divide(num1, num2);
                    } catch(IllegalArgumentException e) {
                        // Translate the service error into the exception declared in the Thrift IDL
                        throw new TDivisionByZeroException();
                    }
                default:
                    throw new TException("Unknown operation " + op);
            }
        }
    }

In this example I want to show you that a Thrift handler can be a normal Spring bean and you can inject dependencies into it.

Now we need to implement CalculatorService itself:

    package com.example.calculator.service;

    import org.springframework.stereotype.Component;

    @Component
    public class CalculatorService {

        public int add(int num1, int num2) {
            return num1 + num2;
        }

        public int subtract(int num1, int num2) {
            return num1 - num2;
        }

        public int multiply(int num1, int num2) {
            return num1 * num2;
        }

        public int divide(int num1, int num2) {
            if(num2 == 0) {
                throw new IllegalArgumentException("num2 must not be zero");
            }

            return num1 / num2;
        }
    }

That's it. Well... almost. We still need to test our service somehow. And it should be an integration test.

Usually, even if your application provides a JSON REST API, you still have to implement a client for it. Thrift will do it for you, so we don't have to care about it. Also, it will support different protocols.
Let's use a generated client in our test:

    package com.example.calculator;

    import org.apache.thrift.protocol.*;
    import org.apache.thrift.transport.THttpClient;
    import org.apache.thrift.transport.TTransport;
    import org.junit.*;
    import org.junit.runner.RunWith;
    import org.springframework.beans.factory.annotation.*;
    import org.springframework.boot.test.IntegrationTest;
    import org.springframework.boot.test.SpringApplicationConfiguration;
    import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
    import org.springframework.test.context.web.WebAppConfiguration;

    import static org.junit.Assert.*;

    @RunWith(SpringJUnit4ClassRunner.class)
    @SpringApplicationConfiguration(classes = CalculatorApplication.class)
    @WebAppConfiguration
    @IntegrationTest("server.port:0")
    public class CalculatorApplicationTest {

        @Autowired
        protected TProtocolFactory protocolFactory;

        @Value("${local.server.port}")
        protected int port;

        protected TCalculatorService.Client client;

        @Before
        public void setUp() throws Exception {
            TTransport transport = new THttpClient("http://localhost:" + port + "/calculator/");
            TProtocol protocol = protocolFactory.getProtocol(transport);
            client = new TCalculatorService.Client(protocol);
        }

        @Test
        public void testAdd() throws Exception {
            assertEquals(5, client.calculate(2, 3, TOperation.ADD));
        }

        @Test
        public void testSubtract() throws Exception {
            assertEquals(3, client.calculate(5, 2, TOperation.SUBTRACT));
        }

        @Test
        public void testMultiply() throws Exception {
            assertEquals(10, client.calculate(5, 2, TOperation.MULTIPLY));
        }

        @Test
        public void testDivide() throws Exception {
            assertEquals(2, client.calculate(10, 5, TOperation.DIVIDE));
        }

        @Test(expected = TDivisionByZeroException.class)
        public void testDivisionByZero() throws Exception {
            client.calculate(10, 0, TOperation.DIVIDE);
        }
    }

This test will run your Spring Boot application, bind it to a random port and test it. All client-server communication goes over HTTP, exactly as it would for a real-world client.
Note how easy to use our service is from the client side. We're just calling methods and catching exceptions.
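The handler's dispatch-and-translate shape can also be tried without Thrift or Spring on the classpath. The following is only a sketch under that assumption: `CalculatorDispatchSketch`, `Operation` and `DivisionByZeroException` are hypothetical plain-Java stand-ins for the generated `TCalculatorService`, `TOperation` and `TDivisionByZeroException` types, not the generated code itself.

```java
// Plain-Java stand-in for the Thrift-generated types; every name here is hypothetical.
class CalculatorDispatchSketch {

    // Mirrors the role of the generated TOperation enum.
    enum Operation { ADD, SUBTRACT, MULTIPLY, DIVIDE }

    // Stand-in for the generated TDivisionByZeroException (a checked exception).
    static class DivisionByZeroException extends Exception {
        DivisionByZeroException() {
            super("division by zero");
        }
    }

    // Same dispatch shape as the handler: one switch over the operation,
    // with the arithmetic failure mapped to a declared, checked exception.
    static int calculate(int num1, int num2, Operation op) throws DivisionByZeroException {
        switch (op) {
            case ADD:
                return num1 + num2;
            case SUBTRACT:
                return num1 - num2;
            case MULTIPLY:
                return num1 * num2;
            case DIVIDE:
                if (num2 == 0) {
                    throw new DivisionByZeroException();
                }
                return num1 / num2;
            default:
                throw new IllegalStateException("Unknown operation " + op);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(calculate(2, 3, Operation.ADD));
        System.out.println(calculate(10, 5, Operation.DIVIDE));
        try {
            calculate(10, 0, Operation.DIVIDE);
        } catch (DivisionByZeroException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Declaring the failure as a checked exception mirrors what the client sees in the test above: a division by zero surfaces as a typed exception rather than a generic transport error.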
November 9, 2014
by Sergei Egorov
· 45,235 Views · 3 Likes
Sketching API Connections
daniel bryant, simon and i recently had a discussion about how to represent system communication with external apis. the requirement for integration with external apis is now extremely common but it's not immediately obvious how to show them clearly in architectural diagrams.

how to represent an external system?

the first thing we discussed was what symbol to use for a system supplying an api. traditionally, uml has used the actor (stick man) symbol to represent a "user or any other system that interacts with the subject" (uml superstructure specification, v2.1.2). a system providing an api may therefore be drawn as an actor. i've found that this symbol tends to confuse those who aren't well versed in uml, as most people assume that the actor symbol always represents a *person* rather than a system. sometimes the actor is stereotyped to make this more obvious. however, the symbol is very powerful and tends to overpower the stereotype. therefore i prefer to use a stereotyped box for an external system supplying an api. let's compare two context diagrams using boxes vs stick actors. in which diagram is it more obvious what are systems or people? note that archimate has a specific symbol for application service that can be used to represent an api: (application service notation from the open group's archimate 2.1 specification)

an api or the system that supplies it?

whatever symbol we choose, what we've done is to show the *system* rather than the actual api. the api is a definition of a service provided by the system in question. how should we provide more details about the api? there are a number of ways we could do this but my preference is to give details of the api on the connector (the line connecting two elements/boxes). in c4 the guidelines for a container diagram include listing protocol information on the connector, and an api can be viewed as the layer above the protocol.
for example:

multiple apis per external system

many api providers supply multiple services/apis (i'm not referring to different operations within an api but to multiple sets of operations in different apis, which may even use different underlying protocols). for example a financial marketplace may have apis that do the following:
  • allow a bulk, batch download of static data (such as details of companies listed on a stock market) via xml over http.
  • supply real time, low latency updates of market prices via bespoke messages over udp.
  • allow entry of trades via industry standard fpml over a queuing system.
  • supply a bulk, batch download of trades for end-of-day reconciliation via fpml over http.

two of the services use the same protocol (xml over http) but have very different content and use. one of the apis is used to constantly supply information after user subscription (market data) and the last service involves the user supplying all the information with no acknowledgment (although it should reconcile at eod). there are multiple ways of showing this. we could:
  • have a single service element, list the apis on it and have all components linking to it.
  • show each service/api as a separate box and connect the components that use the individual service to the relevant box.
  • show a single service element with multiple connections. each connection is labeled and represents an api.
  • use a port and connector style notation to represent each api from the service provider. provide a key for the ports.
  • use a uml style 'cup and ball' notation to define interfaces and their usage.

some examples are below:

a single service element and simple description

in the above diagram the containers are stating what they are using but contain no information about how to use the apis. we don't know if it is a single api (with different operations) or anything about the mechanisms used to transport the data. this isn't very useful for anyone implementing a solution or resolving operational issues.
single service box with descriptive connectors

in this diagram there is a single service box with descriptive connectors. the above diagram shows all the information so is much more useful as a diagnostic or implementation tool. however it does look quite crowded.

services/apis shown as separate boxes

here the external system has its services/apis shown as separate boxes. this contains all the information but might be mistaken as defining the internal structure of the external system. we want to show the services it provides but we know nothing about the internal structure.

using ports to represent apis

in the above diagram the services/apis are shown as 'ports' on the external system and the details have been moved into a separate key/table. this is less likely to be mistaken as showing any internal structure of the external service. (note that i could have also shown outgoing ports from the brokerage system.)

uml interfaces

this final diagram is using a uml style interface provider and requirer. this is a clean diagram but requires the user to be aware of what the cup and ball means (although i could have explained this in the key).

conclusion

any of these solutions could be appropriate depending on the complexity of the api set you are trying to represent. i'd suggest starting with a simple representation (i.e. fully labeled connections) and moving to a more complex one if needed, but remember to use a key to explain any elements you use!
November 7, 2014
by Robert Annett
· 8,123 Views · 1 Like
Hibernate Collections: Optimistic Locking
Introduction

Hibernate provides an optimistic locking mechanism to prevent lost updates even for long conversations. In conjunction with entity storage that spans multiple user requests (an extended persistence context or detached entities), Hibernate can guarantee application-level repeatable reads. The dirty checking mechanism detects entity state changes and increments the entity version. While basic property changes are always taken into consideration, Hibernate collections are more subtle in this regard.

Owned vs. Inverse Collections

In relational databases, two records are associated through a foreign key reference. In this relationship, the referenced record is the parent while the referencing row (the foreign key side) is the child. A non-null foreign key may only reference an existing parent record. In the object-oriented space this association can be represented in both directions. We can have a many-to-one reference from a child to its parent, and the parent can also have a one-to-many children collection. Because both sides could potentially control the database foreign key state, we must ensure that only one side is the owner of this association. Only the owning side state changes are propagated to the database. The non-owning side has traditionally been referred to as the inverse side. Next I’ll describe the most common ways of modelling this association.

The Unidirectional Parent-Owning-Side-Child Association Mapping

Only the parent side has a @OneToMany non-inverse children collection. The child entity doesn’t reference the parent entity at all.

    @Entity(name = "post")
    public class Post {
        ...
        @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
        private List<Comment> comments = new ArrayList<Comment>();
        ...
    }

The Unidirectional Parent-Owning-Side-Child Component Association Mapping

The child side doesn’t always have to be an entity and we might model it as a component type instead.
An Embeddable object (component type) may contain both basic types and association mappings but it can never contain an @Id. The Embeddable object is persisted/removed along with its owning entity. The parent has an @ElementCollection children association. The child may only reference the parent through the non-queryable, Hibernate-specific @Parent annotation.

    @Entity(name = "post")
    public class Post {
        ...
        @ElementCollection
        @JoinTable(name = "post_comments", joinColumns = @JoinColumn(name = "post_id"))
        @OrderColumn(name = "comment_index")
        private List<Comment> comments = new ArrayList<Comment>();
        ...
        public void addComment(Comment comment) {
            comment.setPost(this);
            comments.add(comment);
        }
    }

    @Embeddable
    public class Comment {
        ...
        @Parent
        private Post post;
        ...
    }

The Bidirectional Parent-Owning-Side-Child Association Mapping

The parent is the owning side so it has a @OneToMany non-inverse (without a mappedBy directive) children collection. The child entity references the parent entity through a @ManyToOne association that’s neither insertable nor updatable:

    @Entity(name = "post")
    public class Post {
        ...
        @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
        private List<Comment> comments = new ArrayList<Comment>();
        ...
        public void addComment(Comment comment) {
            comment.setPost(this);
            comments.add(comment);
        }
    }

    @Entity(name = "comment")
    public class Comment {
        ...
        @ManyToOne
        @JoinColumn(name = "post_id", insertable = false, updatable = false)
        private Post post;
        ...
    }

The Bidirectional Child-Owning-Side-Parent Association Mapping

The child entity references the parent entity through a @ManyToOne association, and the parent has a mappedBy @OneToMany children collection. The parent side is the inverse side so only the @ManyToOne state changes are propagated to the database. Even if there’s only one owning side, it’s always a good practice to keep both sides in sync by using the add/removeChild() methods.

    @Entity(name = "post")
    public class Post {
        ...
        @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true, mappedBy = "post")
        private List<Comment> comments = new ArrayList<Comment>();
        ...
        public void addComment(Comment comment) {
            comment.setPost(this);
            comments.add(comment);
        }
    }

    @Entity(name = "comment")
    public class Comment {
        ...
        @ManyToOne
        private Post post;
        ...
    }

The Unidirectional Child-Owning-Side-Parent Association Mapping

The child entity references the parent through a @ManyToOne association. The parent doesn’t have a @OneToMany children collection so the child entity becomes the owning side. This association mapping resembles the relational data foreign key linkage.

    @Entity(name = "comment")
    public class Comment {
        ...
        @ManyToOne
        private Post post;
        ...
    }

Collection Versioning

Section 3.4.2 of the JPA 2.1 specification defines optimistic locking as:

    The version attribute is updated by the persistence provider runtime when the object is written to the database. All non-relationship fields and properties and all relationships owned by the entity are included in version checks[35].

    [35] This includes owned relationships maintained in join tables

N.B. Only an owning-side children collection can update the parent version.

Testing Time

Let’s test how the parent-child association type affects the parent versioning. Because we are interested in the children collection dirty checking, the unidirectional child-owning-side-parent association is going to be skipped, as in that case the parent doesn’t contain a children collection.
Test Case

The following test case is going to be used for all collection type use cases:

    protected void simulateConcurrentTransactions(final boolean shouldIncrementParentVersion) {
        final ExecutorService executorService = Executors.newSingleThreadExecutor();
        doInTransaction(new TransactionCallable<Void>() {
            @Override
            public Void execute(Session session) {
                try {
                    P post = postClass.newInstance();
                    post.setId(1L);
                    post.setName("Hibernate training");
                    session.persist(post);
                    return null;
                } catch (Exception e) {
                    throw new IllegalArgumentException(e);
                }
            }
        });
        doInTransaction(new TransactionCallable<Void>() {
            @Override
            public Void execute(final Session session) {
                final P post = (P) session.get(postClass, 1L);
                try {
                    executorService.submit(new Callable<Void>() {
                        @Override
                        public Void call() throws Exception {
                            return doInTransaction(new TransactionCallable<Void>() {
                                @Override
                                public Void execute(Session _session) {
                                    try {
                                        P otherThreadPost = (P) _session.get(postClass, 1L);
                                        int loadTimeVersion = otherThreadPost.getVersion();
                                        assertNotSame(post, otherThreadPost);
                                        assertEquals(0L, otherThreadPost.getVersion());
                                        C comment = commentClass.newInstance();
                                        comment.setReview("Good post!");
                                        otherThreadPost.addComment(comment);
                                        _session.flush();
                                        if (shouldIncrementParentVersion) {
                                            assertEquals(otherThreadPost.getVersion(), loadTimeVersion + 1);
                                        } else {
                                            assertEquals(otherThreadPost.getVersion(), loadTimeVersion);
                                        }
                                        return null;
                                    } catch (Exception e) {
                                        throw new IllegalArgumentException(e);
                                    }
                                }
                            });
                        }
                    }).get();
                } catch (Exception e) {
                    throw new IllegalArgumentException(e);
                }
                post.setName("Hibernate Master Class");
                session.flush();
                return null;
            }
        });
    }

The Unidirectional Parent-Owning-Side-Child Association Testing

    #create tables
    Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), primary key (id))][]}
    Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]}
    Query:{[create table post_comment (post_id
    bigint not null, comments_id bigint not null, comment_index integer not null, primary key (post_id, comment_index))][]}
    Query:{[alter table post_comment add constraint FK_se9l149iyyao6va95afioxsrl foreign key (comments_id) references comment][]}
    Query:{[alter table post_comment add constraint FK_6o1igdm04v78cwqre59or1yj1 foreign key (post_id) references post][]}

    #insert post in primary transaction
    Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]}

    #select post in secondary transaction
    Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]}

    #insert comment in secondary transaction
    #optimistic locking post version update in secondary transaction
    Query:{[insert into comment (id, review) values (default, ?)][Good post!]}
    Query:{[update post set name=?, version=? where id=? and version=?][Hibernate training,1,1,0]}
    Query:{[insert into post_comment (post_id, comment_index, comments_id) values (?, ?, ?)][1,0,1]}

    #optimistic locking exception in primary transaction
    Query:{[update post set name=?, version=? where id=?
    and version=?][Hibernate Master Class,1,1,0]}
    org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.EntityOptimisticLockingOnUnidirectionalCollectionTest$Post#1]

The Unidirectional Parent-Owning-Side-Child Component Association Testing

    #create tables
    Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]}
    Query:{[create table post_comments (post_id bigint not null, review varchar(255), comment_index integer not null, primary key (post_id, comment_index))][]}
    Query:{[alter table post_comments add constraint FK_gh9apqeduab8cs0ohcq1dgukp foreign key (post_id) references post][]}

    #insert post in primary transaction
    Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]}

    #select post in secondary transaction
    Query:{[select entityopti0_.id as id1_0_0_, entityopti0_.name as name2_0_0_, entityopti0_.version as version3_0_0_ from post entityopti0_ where entityopti0_.id=?][1]}
    Query:{[select comments0_.post_id as post_id1_0_0_, comments0_.review as review2_1_0_, comments0_.comment_index as comment_3_0_ from post_comments comments0_ where comments0_.post_id=?][1]}

    #insert comment in secondary transaction
    #optimistic locking post version update in secondary transaction
    Query:{[update post set name=?, version=? where id=? and version=?][Hibernate training,1,1,0]}
    Query:{[insert into post_comments (post_id, comment_index, review) values (?, ?, ?)][1,0,Good post!]}

    #optimistic locking exception in primary transaction
    Query:{[update post set name=?, version=? where id=?
    and version=?][Hibernate Master Class,1,1,0]}
    org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.EntityOptimisticLockingOnComponentCollectionTest$Post#1]

The Bidirectional Parent-Owning-Side-Child Association Testing

    #create tables
    Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), post_id bigint, primary key (id))][]}
    Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]}
    Query:{[create table post_comment (post_id bigint not null, comments_id bigint not null)][]}
    Query:{[alter table post_comment add constraint UK_se9l149iyyao6va95afioxsrl unique (comments_id)][]}
    Query:{[alter table comment add constraint FK_f1sl0xkd2lucs7bve3ktt3tu5 foreign key (post_id) references post][]}
    Query:{[alter table post_comment add constraint FK_se9l149iyyao6va95afioxsrl foreign key (comments_id) references comment][]}
    Query:{[alter table post_comment add constraint FK_6o1igdm04v78cwqre59or1yj1 foreign key (post_id) references post][]}

    #insert post in primary transaction
    Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]}

    #select post in secondary transaction
    Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]}
    Query:{[select comments0_.post_id as post_id1_1_0_, comments0_.comments_id as comments2_2_0_, entityopti1_.id as id1_0_1_, entityopti1_.post_id as post_id3_0_1_, entityopti1_.review as review2_0_1_, entityopti2_.id as id1_1_2_, entityopti2_.name as name2_1_2_, entityopti2_.version as version3_1_2_ from post_comment comments0_ inner join comment entityopti1_ on comments0_.comments_id=entityopti1_.id left outer join post entityopti2_ on entityopti1_.post_id=entityopti2_.id where
    comments0_.post_id=?][1]}

    #insert comment in secondary transaction
    #optimistic locking post version update in secondary transaction
    Query:{[insert into comment (id, review) values (default, ?)][Good post!]}
    Query:{[update post set name=?, version=? where id=? and version=?][Hibernate training,1,1,0]}
    Query:{[insert into post_comment (post_id, comments_id) values (?, ?)][1,1]}

    #optimistic locking exception in primary transaction
    Query:{[update post set name=?, version=? where id=? and version=?][Hibernate Master Class,1,1,0]}
    org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [com.vladmihalcea.hibernate.masterclass.laboratory.concurrency.EntityOptimisticLockingOnBidirectionalParentOwningCollectionTest$Post#1]

The Bidirectional Child-Owning-Side-Parent Association Testing

    #create tables
    Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), post_id bigint, primary key (id))][]}
    Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]}
    Query:{[alter table comment add constraint FK_f1sl0xkd2lucs7bve3ktt3tu5 foreign key (post_id) references post][]}

    #insert post in primary transaction
    Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]}

    #select post in secondary transaction
    Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]}

    #insert comment in secondary transaction
    #post version is not incremented in secondary transaction
    Query:{[insert into comment (id, post_id, review) values (default, ?, ?)][1,Good post!]}
    Query:{[select count(id) from comment where post_id =?][1]}

    #update works in primary transaction
    Query:{[update post set name=?, version=? where id=?
    and version=?][Hibernate Master Class,1,1,0]}

Overruling Default Collection Versioning

If the default owning-side collection versioning is not suitable for your use case, you can always overrule it with the Hibernate @OptimisticLock annotation (http://docs.jboss.org/hibernate/annotations/3.5/reference/en/html_single/#d0e2903). Let’s overrule the default parent version update mechanism for the bidirectional parent-owning-side-child association:

    @Entity(name = "post")
    public class Post {
        ...
        @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
        @OptimisticLock(excluded = true)
        private List<Comment> comments = new ArrayList<Comment>();
        ...
        public void addComment(Comment comment) {
            comment.setPost(this);
            comments.add(comment);
        }
    }

    @Entity(name = "comment")
    public class Comment {
        ...
        @ManyToOne
        @JoinColumn(name = "post_id", insertable = false, updatable = false)
        private Post post;
        ...
    }

This time, the children collection changes won’t trigger a parent version update:

    #create tables
    Query:{[create table comment (id bigint generated by default as identity (start with 1), review varchar(255), post_id bigint, primary key (id))][]}
    Query:{[create table post (id bigint not null, name varchar(255), version integer not null, primary key (id))][]}
    Query:{[create table post_comment (post_id bigint not null, comments_id bigint not null)][]}
    Query:{[]}
    Query:{[alter table comment add constraint FK_f1sl0xkd2lucs7bve3ktt3tu5 foreign key (post_id) references post][]}
    Query:{[alter table post_comment add constraint FK_se9l149iyyao6va95afioxsrl foreign key (comments_id) references comment][]}
    Query:{[alter table post_comment add constraint FK_6o1igdm04v78cwqre59or1yj1 foreign key (post_id) references post][]}

    #insert post in primary transaction
    Query:{[insert into post (name, version, id) values (?, ?, ?)][Hibernate training,0,1]}

    #select post in secondary transaction
    Query:{[select entityopti0_.id as id1_1_0_, entityopti0_.name as name2_1_0_, entityopti0_.version as version3_1_0_ from post entityopti0_ where entityopti0_.id=?][1]}
    Query:{[select comments0_.post_id as post_id1_1_0_, comments0_.comments_id as comments2_2_0_, entityopti1_.id as id1_0_1_, entityopti1_.post_id as post_id3_0_1_, entityopti1_.review as review2_0_1_, entityopti2_.id as id1_1_2_, entityopti2_.name as name2_1_2_, entityopti2_.version as version3_1_2_ from post_comment comments0_ inner join comment entityopti1_ on comments0_.comments_id=entityopti1_.id left outer join post entityopti2_ on entityopti1_.post_id=entityopti2_.id where comments0_.post_id=?][1]}

    #insert comment in secondary transaction
    Query:{[insert into comment (id, review) values (default, ?)][Good post!]}
    Query:{[insert into post_comment (post_id, comments_id) values (?, ?)][1,1]}

    #update works in primary transaction
    Query:{[update post set name=?, version=? where id=?
    and version=?][Hibernate Master Class,1,1,0]}

Conclusion

It’s very important to understand how various modeling structures impact concurrency patterns. Owning-side collection changes are taken into consideration when incrementing the parent version number, and you can always bypass this using the @OptimisticLock annotation. Code available on GitHub.
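The versioned UPDATE statements in the logs above can be imitated in a few lines of plain Java. This is a minimal in-memory sketch, not Hibernate code: `OptimisticLockSketch`, `Row` and the method names are hypothetical, and a returned row count of 0 stands in for the situation where Hibernate throws StaleObjectStateException.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory illustration of a versioned UPDATE; all names are hypothetical.
class OptimisticLockSketch {

    // One row of a "post" table: a name plus a version column.
    static final class Row {
        final String name;
        final int version;

        Row(String name, int version) {
            this.name = name;
            this.version = version;
        }
    }

    private final Map<Long, Row> table = new HashMap<>();

    synchronized void insert(long id, String name) {
        table.put(id, new Row(name, 0));
    }

    synchronized Row select(long id) {
        return table.get(id);
    }

    // Emulates: update post set name=?, version=? where id=? and version=?
    // Returns the number of rows updated; 0 means the expected version was stale.
    synchronized int update(long id, String newName, int expectedVersion) {
        Row current = table.get(id);
        if (current == null || current.version != expectedVersion) {
            return 0;
        }
        table.put(id, new Row(newName, expectedVersion + 1));
        return 1;
    }

    public static void main(String[] args) {
        OptimisticLockSketch db = new OptimisticLockSketch();
        db.insert(1L, "Hibernate training");

        int primaryVersion = db.select(1L).version;   // primary transaction loads version 0
        int secondaryVersion = db.select(1L).version; // secondary transaction loads version 0

        // Secondary transaction: an owning-side collection change bumps the parent version.
        System.out.println(db.update(1L, "Hibernate training", secondaryVersion));

        // Primary transaction: its update still carries the stale version 0.
        System.out.println(db.update(1L, "Hibernate Master Class", primaryVersion));
    }
}
```

When the collection is on the inverse side, or excluded with @OptimisticLock, the first update is simply never issued, so the primary transaction's version check still matches and its update succeeds.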
November 4, 2014
by Vlad Mihalcea
· 61,720 Views · 1 Like
Using REST with the CQRS Pattern to Blend NoSQL & SQL Data
REST Easy with SQL/NoSQL Integration and CQRS Pattern implementation

New demands are being put on IT organizations every day to deliver agile, high-performance, integrated mobile and web applications. In the meantime, the technology landscape is getting more complex every day with the advent of new technologies like REST, NoSQL, and the cloud, while existing technologies like SOAP and SQL still rule everyday work. Rather than taking a religious side in the debate, we can let NoSQL successfully co-exist with SQL in this ‘polyglot’ of data storage and formats. However, this integration also adds another layer of complexity in both architecture and implementation. This document offers a guide on how some of the relatively newer technologies like REST can help bridge the gap between SQL and NoSQL, with an example of a well-known pattern called CQRS. This document is organized as follows:
  • Introduction to SQL development process
  • NoSQL
  • Do I have to choose between SQL and NoSQL?
  • CQRS Pattern
  • How to implement CQRS pattern using REST services

Introduction to SQL development process

Developers have been using SQL databases for decades to build and deliver enterprise business applications. The process of creating tables, attributes, and relationships is second nature for most developers. Data architects think in terms of tables and columns and navigate relationships for data. The basic concepts of delivery and transformation take place at the web server level, which means the server developer is reading and ‘binding’ to the tables and mapping attributes to a REST response. The application development lifecycle meant changes to the database schema first, followed by the bindings, then internal schema mapping, and finally the SOAP or JSON services, and eventually the client code. This all costs the project time and money. It also means that the ‘code’ (pick your language here) and the business logic would also need to be modified to handle the changes to the model.
NoSQL

NoSQL is gaining supporters among many SQL shops for various reasons including:
  • Low cost
  • Ability to handle unstructured data
  • Scalability
  • Performance

The first thing database folks notice is that there is no schema. These document-style storage engines can handle huge volumes of structured, semi-structured, and unstructured data. The very nature of schema-less documents allows change to a document structure without having to go through the formal change management process (or data architect). The other major difference is that NoSQL (no-schema) also means no joins or relationships. The document itself contains the embedded information by design. So an order entry would contain the customer with all the orders and line items for each order in a single document. There are many different NoSQL vendors (popular NoSQL databases include MongoDB and Cassandra) whose products are being used for BI and analytics (read-only) purposes. We are also seeing many customers starting to use NoSQL for auditing, logging, and archival transactions.

Do I have to choose between SQL and NoSQL?

The purpose of this article is not to get into the religious debate about whether to use SQL or NoSQL. The bottom line is that both have their place and are suited to certain types of data: SQL for structured data and NoSQL for unstructured data. So why not have the capability to mix and match this data depending on the application? This can be done by creating a single REST API across both SQL and NoSQL databases. Why a single REST API? The answer is simple: the new agile and mobile world demands this ‘mashup’ of data into a document-style JSON response.

CQRS (Command Query Responsibility Segregation) Pattern

There are many design patterns for delivery of high-performance RESTful services but the one that stands out was described in an article written by Martin Fowler, one of the software industry veterans.
He described the pattern called CQRS, which is more relevant today in a ‘polyglot’ of servers, data, services, and connections.

    “We may want to look at the information in a different way to the record store, perhaps collapsing multiple records into one, or forming virtual records by combining information for different places. On the update side we may find validation rules that only allow certain combinations of data to be stored, or may even infer data to be stored that’s different from that we provide.” – Martin Fowler 2011

In this design pattern, the REST API requests (GET) return documents from multiple sources (e.g. mashups). In the update process, the data is subject to business logic derivations, validations, event processing, and database transactions. This data may then be pushed back into the NoSQL store using asynchronous events. With the widespread adoption of NoSQL databases like MongoDB as schema-less, high-capacity data stores, most developers are challenged with providing security, business logic, event handling, and integration with other systems. MongoDB, one of the popular NoSQL databases, shares many concepts with SQL databases. However, the MongoDB query language itself is very different from the SQL we all know.

How to implement CQRS pattern using a RESTful Architecture

A REST server should meet certain requirements to support the CQRS pattern. The server should run on-premise or in the cloud and appear to the mobile and web developer as an HTTP endpoint.
The server architecture should implement the following:
  • Connections and mapping necessary for SQL and NoSQL connectivity, and the API services needed to create and return GET, PUT, POST, and DELETE REST responses
  • Security
  • Business logic

Connections and Mapping

There are two main approaches to creating REST servers and APIs for SQL and NoSQL databases:
  • Open source frameworks like Apache Tomcat, Spring/Hibernate
  • Commercial frameworks like Espresso Logic

Open source Frameworks

Using various open source frameworks like Tomcat, Spring/Hibernate, Node.js, JDBC and MongoDB drivers, a REST server can be created, but we would still be left with the following tasks:
  • Create and map the necessary SQL objects
  • Create a REST server container and configurations
  • Create Jersey/Jackson classes and annotations
  • Create and define REST API for tables, views, and procedures
  • Hand write validation, event and business logic
  • Handle persistence, optimistic locking, transaction paging
  • Add identity management and security by roles

Now we can start down the same path to connect to MongoDB and write code to connect, select, and return data in JSON, and then create the REST calls to merge these two different document styles into a single RESTful endpoint. This is a lot of work for a development team to manage and control, and frankly it is pretty boring and repetitive and is better done by a well-designed framework.

Commercial Frameworks

Many commercial frameworks may take care of this complexity without the need to do extensive programming. Here is an example from Espresso Logic and how it handles this complexity with a point-and-click interface:
  • Running REST server in the cloud or on-premise
  • Connections to external SQL databases
  • Object mapping to tables, views, and procedures
  • Automatic creation of RESTful endpoints from model
  • Reactive business rules and rich event model
  • Integrated role-based security and authentication services
  • Point-and-click document API creation for SQL and MongoDB endpoints

In the example below, the editor shows an SQL resource (customersTransactions) joined with archived details from MongoDB (archivedTransactions). The MongoDB document for each customer may include transaction details, check images, customer service notes and other relevant account information. This new mashup becomes a single REST call that can be published to mobile and web application developers.

Security
Security is an important part of building and delivering RESTful services, and it can be broken down into two parts: authentication and access control.

Authentication
Before allowing anyone access to corporate data, you want to use the existing corporate identity management (some call this authentication services) to capture and validate the user. This identity management service is based on existing corporate standards such as LDAP, Windows AD, or a SQL database.

Role-based Access Control
Each user may be assigned one or more corporate roles, and these roles are then assigned specific access privileges to each resource (e.g. READ, INSERT, UPDATE, and DELETE). Role-based access should also be able to restrict permissions to specific rows and columns of the API (e.g. sales reps can see only their own orders, or a manager can see and change his department's salaries but cannot change his own). This restriction should be applied regardless of how or where the API is used or called. Remember, the SQL database already provides some level of security and access control, which must be considered when designing and delivering new front-end services to internal and external users.

Business Logic for REST
When data is updated through a REST server, several things need to happen. First, authentication and access control should determine whether this is a valid request and whether the user has rights to the endpoint. In addition, the server may need to de-alias REST attributes back to the actual SQL column names.
In a full-featured business logic server, there should be a series of events and business rules to perform various calculations and validations and to fire other events on dependent tables. Finally, the entire multi-table update is written back to the SQL database in a single transaction. Updates are then sent asynchronously to MongoDB as part of the commit event (after the SQL transaction has completed).

Conclusion
In the real world of API services, more complex document-style RESTful services are increasingly a requirement. That is, the ability to create ‘mashups’ of data from multiple tables, NoSQL collections, and other external systems is a large part of this new design pattern. In addition, the ability to alias attribute names and formats from these source fields has become critical for partner and customer systems. Using REST with the CQRS pattern to blend MongoDB and SQL seamlessly with your existing data will become a major part of your future mobile strategy. To implement these REST services, one can use open source tools and spend a lot of time, or select the right commercial framework. This framework should support cloud or on-premise connectivity, security, API integration, and business logic. This will make the design and delivery of new application services more rapid and agile in the heterogeneous world of information.
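The update flow described above (validate, commit to the SQL side, then push to the document store asynchronously) can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: the class `CqrsSketch`, its methods, and the two in-memory maps standing in for the SQL database and MongoDB are hypothetical and are not part of Espresso Logic or any other framework mentioned here.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical CQRS sketch: commands are validated and written to a "SQL" store,
 * then an asynchronous event refreshes a denormalized "document" read model.
 * Both stores are in-memory maps standing in for real databases.
 */
public class CqrsSketch {
    // Write side: stands in for the transactional SQL database.
    private final Map<String, Double> sqlBalances = new ConcurrentHashMap<>();
    // Read side: stands in for a NoSQL collection of ready-to-serve documents.
    private final Map<String, String> documents = new ConcurrentHashMap<>();

    /** Command: validate, commit to the write store, then push asynchronously. */
    public CompletableFuture<Void> deposit(String customer, double amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        double balance = sqlBalances.merge(customer, amount, Double::sum);
        // Runs after the "commit"; callers may wait on the returned future.
        return CompletableFuture.runAsync(() -> documents.put(customer,
                "{\"customer\":\"" + customer + "\",\"balance\":" + balance + "}"));
    }

    /** Query: served only from the read model, never from the write store. */
    public String getDocument(String customer) {
        return documents.get(customer);
    }

    public static void main(String[] args) {
        CqrsSketch api = new CqrsSketch();
        api.deposit("alice", 25.0).join();
        System.out.println(api.getDocument("alice"));
    }
}
```

The point of the split is that a GET never touches the transactional store, so the read model can be shaped for the client (aliased names, merged sources) without affecting how updates are validated.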
November 4, 2014
by Val Huber, DZone Core
· 16,180 Views
Configuring an OpenStack VM with Multiple Network Cards
[This article was written by Barak Merimovich.] We have discussed OpenStack networking extensively in previous posts. In this post, I’d like to dive into a more advanced OpenStack networking scenario. Many cloud images are not configured to automatically bring up all network cards that are available. They will usually only have a single network card configured. To correctly set up a host in the cloud with multiple network cards, log on to the machine and bring up the additional interfaces:
echo $'auto eth1\niface eth1 inet dhcp' | sudo tee /etc/network/interfaces.d/eth1.cfg > /dev/null
sudo ifup eth1

Networks in the cloud
A complex network architecture is a mainstay of modern IaaS clouds. Understanding how to configure your cloud-based networks, and hosts, is critical to getting your application working in the cloud. This is especially true with Cloudify, the open source cloud orchestration platform I work on.

The cloud, like the world, used to be flat
It was not that long ago that most IaaS providers only supported flat networks: all of your hosts were in one large network. Separation between services running in the cloud was enforced in software or with firewalls/security-groups. But technically, all of the hosts were connected to the same network and visible to each other. The flat network model is simple, and therefore easy to reason about and understand. It was a good choice for the early days of the IaaS cloud and no doubt helped with getting applications into the cloud in the first place. It was one of the things that made EC2 so easy to use for anyone just starting out with the ‘cloud’. This model is in fact still available on Amazon Web Services under the title ‘EC2-Classic’. And for many applications, a flat network is good enough. But as cloud adoption increases, more complex applications are moving into the cloud, and issues like network separation, security, SLA and broadcast domains make more complex network models a must.
Software Defined Networks (SDN) fill that gap. They are now a staple of most major IaaS clouds. AWS has AWS-VPC, OpenStack has the Neutron project and there are many other implementations. Working with SDN requires knowing a bit more about how information moves around between your cloud resources. In this post I am going to discuss how to set up a host in the cloud so it will play nice with complex networks. I’ll be using OpenStack, but the concepts are similar for other cloud infrastructures.

OpenStack configuration
I am going to start with an empty tenant; only the public network is available. First, let’s set up our networks and router:
neutron router-create demo-router
neutron net-create demo-network-1
neutron net-create demo-network-2
neutron subnet-create --name demo-subnet-1 demo-network-1 10.0.0.0/24
neutron subnet-create --name demo-subnet-2 demo-network-2 10.0.1.0/24
neutron router-interface-add demo-router demo-subnet-1
neutron router-interface-add demo-router demo-subnet-2
neutron router-gateway-set demo-router public

Note the network IDs:
neutron net-list
| id | name | subnets |
| 2c33efe2-6204-4125-9716-3bc525630016 | demo-network-1 | 928dafa0-83ef-459c-b20d-71d8ea596fa2 10.0.0.0/24 |
| aa30627e-c181-4a4b-89bf-5dd7c26c244e | demo-network-2 | 26d573f7-7953-4a54-825b-ed7bbc0661c7 10.0.1.0/24 |
| e502de8d-929a-4ee0-bd18-efa297875cf6 | public | d40dab51-a729-452c-9ee6-b9ad08d10808 |

We’ll start with a standard Ubuntu cloud image:
glance image-create --name "Ubuntu 12.04 Standard" --location "http://uec-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img" --disk-format qcow2 --container-format bare

Create the keypair and security group:
nova keypair-add demo-keypair > demo-keypair.pem
chmod 400 demo-keypair.pem
nova secgroup-create demo-security-group "Security group for demo"
nova secgroup-add-rule demo-security-group tcp 22 22 0.0.0.0/0

Let’s spin up an instance connected to both our networks:
nova boot --flavor m1.small --image
"Ubuntu 12.04 Standard" --nic net-id=2c33efe2-6204-4125-9716-3bc525630016 --nic net-id=aa30627e-c181-4a4b-89bf-5dd7c26c244e --security-groups demo-security-group --key-name demo-keypair demo-vm And set up floating IPs for the first network: nova list | ID | Name | Status | Task State | Power State | Networks | 2b17588b-8980-4489-9a04-6539a159dc3c | demo-vm | ACTIVE | None | Running | demo-network-1=10.0.0.2; demo-network-2=10.0.1.2 | neutron floatingip-create public neutron floatingip-list | id | fixed_ip_address | floating_ip_address | port_id | | 49c8b05e-bb8f-4b07-80ed-3155ab6ffc09 | | 192.168.15.42 | | neutron port-list | id | name | mac_address | fixed_ips | | 1ccfd334-7328-4b22-b93e-24a0888276ab | | fa:16:3e:14:39:39 | {"subnet_id": "94598487-c1fc-4f55-ac1f-ef2545d5cfeb", "ip_address": "10.0.1.3"} | | a482c4f6-fa74-476e-b1ce-cd8dd0c70815 | | fa:16:3e:18:92:79 | {"subnet_id": "94598487-c1fc-4f55-ac1f-ef2545d5cfeb", "ip_address": "10.0.1.2"} | | b23d7836-30c5-4bff-b873-15c87ba051f6 | | fa:16:3e:3a:28:40 | {"subnet_id": "dec6ec74-cfa9-4a08-8792-54900631b98e", "ip_address": "10.0.0.3"} | | d421b447-2adf-406f-876b-142238683344 | | fa:16:3e:9d:fc:7f | {"subnet_id": "dec6ec74-cfa9-4a08-8792-54900631b98e", "ip_address": "10.0.0.2"} | | dcf8696b-cc80-4b48-b09c-61c0f8ab02ac | | fa:16:3e:5b:39:fb | {"subnet_id": "94598487-c1fc-4f55-ac1f-ef2545d5cfeb", "ip_address": "10.0.1.1"} | | f6a1666e-495a-4d3f-afa3-754b3cb3cfc0 | | fa:16:3e:8a:1b:fb | {"subnet_id": "dec6ec74-cfa9-4a08-8792-54900631b98e", "ip_address": "10.0.0.1"} | neutron floatingip-associate 49c8b05e-bb8f-4b07-80ed-3155ab6ffc09 d421b447-2adf-406f-876b-142238683344 Note how we matched the VM’s IP to its port, and associated the floating IP to the port. I wish there was an easier way to do this from the CLI… If everything worked correctly, you should have the following setup: Let’s make sure ssh works correctly: ssh -i demo-keypair.pem [email protected] hostname demo-vm Cool, ssh works. 
Now, we should have two network cards, right? ssh -i demo-keypair.pem [email protected] ifconfig eth0 Link encap:Ethernet HWaddr fa:16:3e:5f:a2:5f inet addr:10.0.0.4 Bcast:10.0.0.255 Mask:255.255.255.0 inet6 addr: fe80::f816:3eff:fe5f:a25f/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:230 errors:0 dropped:0 overruns:0 frame:0 TX packets:224 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:46297 (46.2 KB) TX bytes:31130 (31.1 KB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) Huh?! The VM only has one working network interface! Where is my second NIC? Was there a configuration problem with the OpenStack network setup? 
The answer is here: ssh -i demo-keypair.pem [email protected] ifconfig -a eth0 Link encap:Ethernet HWaddr fa:16:3e:5f:a2:5f inet addr:10.0.0.4 Bcast:10.0.0.255 Mask:255.255.255.0 inet6 addr: fe80::f816:3eff:fe5f:a25f/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:324 errors:0 dropped:0 overruns:0 frame:0 TX packets:332 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:69973 (69.9 KB) TX bytes:47218 (47.2 KB) eth1 Link encap:Ethernet HWaddr fa:16:3e:29:6d:22 BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B) The second NIC exists, but is not running. The issue is not with the OpenStack network configuration – it’s with the image. The image itself should be configured to work correctly with multiple NICs. All we have to do is bring up the NIC. So we ssh into the instance: ssh -i demo-keypair.pem [email protected] And run the following commands: echo $'auto eth1\niface eth1 inet dhcp' | sudo tee /etc/network/interfaces.d/eth1.cfg > /dev/null sudo ifup eth1 The second NIC should now be running: ifconfig eth1 eth1 Link encap:Ethernet HWaddr fa:16:3e:18:92:79 inet addr:10.0.1.2 Bcast:10.0.1.255 Mask:255.255.255.0 inet6 addr: fe80::f816:3eff:fe18:9279/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:81 errors:0 dropped:0 overruns:0 frame:0 TX packets:45 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:15376 (15.3 KB) TX bytes:3960 (3.9 KB) And there you go – your VM can access both networks. 
This issue can make life complicated when setting up a complex, or even a not very complex, application.

When will this issue hurt you?
Well, imagine a scenario where you have a web server and a database server. The web server is connected to both Network1 and Network2, and the database server is only connected to Network2. Network1 is connected to the external world over a router, and Network2 is completely internal, adding another layer of security to the critical database server. So what happens if the web server only has one network card? If only the NIC for Network1 is up, the web server can’t access the database. If only the NIC for Network2 is up, the web server can’t be reached from the external world. Even worse, if this web server is accessed via a floating IP, this IP will also not work, so you won’t be able to access the web server and fix the issue. Tricky.

In conclusion
The above commands will bring up your additional network card. You will of course need to repeat this process for each additional network card, and for each VM. You can use a start-up script (a.k.a. user-data script) or system service to run these commands, but there are better ways. I’ll discuss how to automate the network setup in a follow-up post. This was originally posted at Barak's blog Head in the Clouds, find it here.
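Since the same two-line stanza has to be written once per additional network card, the repetition can be captured in a small helper. The sketch below is a hypothetical illustration, not part of Cloudify or OpenStack: the class `NicConfigSketch` and its methods are invented names, and it assumes the interfaces are called eth1, eth2, and so on, with eth0 already configured by the image. It only builds the file contents; writing the files and running ifup are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds one ifupdown stanza per additional NIC. */
public class NicConfigSketch {

    /** The stanza used in the article for a single DHCP-configured interface. */
    static String stanza(String iface) {
        return "auto " + iface + "\niface " + iface + " inet dhcp\n";
    }

    /**
     * Maps each target file under /etc/network/interfaces.d to its contents
     * for eth1 .. eth(nicCount - 1); eth0 is assumed to be set up already.
     */
    static Map<String, String> additionalNics(int nicCount) {
        Map<String, String> files = new LinkedHashMap<>();
        for (int i = 1; i < nicCount; i++) {
            String iface = "eth" + i;
            files.put("/etc/network/interfaces.d/" + iface + ".cfg", stanza(iface));
        }
        return files;
    }

    public static void main(String[] args) {
        // Print what would be written for a VM with three network cards.
        additionalNics(3).forEach((path, content) ->
                System.out.println(path + ":\n" + content));
    }
}
```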
November 4, 2014
by Sharone Zitzman
· 14,688 Views
BigList: a Scalable High-Performance List for Java
As memory gets cheaper and cheaper, our applications can keep more data readily available in main memory, or even all of it, as in the case of in-memory databases. To make real use of the growing heap memory, appropriate data structures must be used. Interestingly enough, there seem to be no specialized implementations for lists, by far the most used collection. This article introduces BigList, a list designed for handling large collections, where large means that all data still fits completely in heap memory. The article will show the special requirements for handling large collections, how BigList is implemented, and how it compares to other list implementations.

1. Requirements
What are the special requirements we need to handle large collections efficiently?
Memory:
  • Sparing use of memory: The list should need little memory for its own implementation so memory can be used for storing application data.
  • Specialized versions for primitives: It must be possible to store common primitives like ints in a memory-saving way.
  • Avoid copying large data blocks: If the list grows or shrinks, only a small part of the data should be copied around, as copying becomes expensive and needs the same amount of memory again.
  • Data sharing: Copying collections is a frequent operation which should be efficient even if the collection is large. An efficient implementation requires some sort of data sharing, as copying all elements is per se a costly operation.
Performance:
  • Good performance for normal operations like reading, storing, adding or removing single elements.
  • Great performance for bulk operations like adding or removing multiple elements.
  • Predictable overhead of operations, so similar operations should need a similar amount of time without excessive worst-case scenarios.
If an implementation does not offer these features, some operations will not only be slow for really large collections, but will become infeasible because memory or CPU usage will be excessive.
Introduction to BigList
BigList is a member of the Brownies Collections library which also includes GapList, the fastest list implementation known. GapList is a drop-in replacement for ArrayList, LinkedList, or ArrayDeque and offers fast access by index and fast insertion/removal at the beginning and at the end at the same time. GapList however has not been designed to cope with large collections, so adding or removing elements can make it necessary to copy a lot of elements around, which will lead to performance problems. Also, copying a large collection becomes an expensive operation, both in terms of time and memory consumption. It will simply not be possible to make a copy of a large collection if the same amount of memory is not available a second time. And this is a common operation, as you often want to return through your API a copy of an internal list which has no reference to the original list. BigList addresses both problems. The first problem is solved by storing the collection elements in fixed-size blocks. Add or remove operations are then implemented to use only data from one block. The copying problem is solved by maintaining a reference count on the fixed-size blocks, which makes it possible to implement a copy-on-write approach. For efficient access to the fixed-size blocks, they are maintained in a specialized tree structure.

2. BigList Details
Each BigList instance stores the following information:
  • Elements are stored in fixed-size blocks
  • A single block is implemented as a GapList with a reference count for sharing
  • All blocks are maintained in a tree for fast access
  • Access information for the current block is cached for better performance
The following illustration shows these details for two instances of BigList which share one block.

2.1 Use of Blocks
Elements are stored in fixed-size blocks with a default block size of 1000.
While this default may look pretty small, it is most of the time a good choice because it guarantees that write operations only need to move a few elements. Read operations will profit from locality of reference by using the currently cached block. It is however possible to specify the block size for each created BigList instance. All blocks except the first are allocated with this fixed size and will not grow or shrink. The first block will grow to the specified block size to save memory for small lists. If a block has reached its maximum size and more data must be stored within it, the block needs to be split into two blocks before more elements can be stored. If elements are added to the head or tail of the list, the block will only be filled up to a threshold of 95%. This allows inserts into the block without the immediate need for split operations. To save memory, blocks are also merged. This happens automatically if two adjacent blocks are both filled less than 35% after a remove operation.

2.2 Locality of Reference
For each operation on BigList, the affected block must be determined first. As predicted by locality of reference, most of the time the affected block will be the same as for the last operation. The implementation of BigList has therefore been designed to profit from locality of reference, which makes common operations like iterating over a list very efficient. Instead of always traversing the block tree to determine the block needed for an operation, the lower and upper index of the last used block are cached. So if the next operation happens near the previous one, the same block can be used again without the need to traverse the tree.

2.3 Reference Counting
To support a copy-on-write approach, BigList stores a reference count for each fixed-size block indicating whether this block is private or shared. Initially all blocks are private, having a reference count of 0, so that modifications are allowed.
If a list is copied, the reference count is incremented, which prohibits further modifications. Before a modification can then be made, the block must be copied, decrementing the original block's reference count and setting the reference count of the copy to 0. The reference count of a block is also decremented by the finalizer of BigList.

3. Benchmarks
To prove the excellence of BigList in time and memory consumption, we compare it with some other List implementations. And here are the nominees:
| Type | Library | Description |
| BigList | brownie-collections | List optimized for storing large numbers of elements. Elements are stored in fixed-size blocks which are maintained in a tree. |
| GapList | brownie-collections | Fastest list implementation known. Fast access by index and fast insertion/removal at end and at beginning. |
| ArrayList | JDK | Maintains elements in a single array. Fast access by index, fast insertion/removal at end, but slow at beginning. |
| LinkedList | JDK | Elements stored using a linked list. Slow access by index. Memory overhead for each element stored. |
| TreeList | commons-collections | Elements stored in a tree. No operation is really fast, but there are no very slow operations. Memory overhead for each element stored. |
| FastTable | javolution | Elements stored in a "fractal"-like data structure. Good performance and use of memory. However, no bulk operations, and the collection does not shrink. |

3.1 Handling Objects
In the first part of the benchmark, we compare memory consumption and performance of the different list implementations. Let's first have a look at the memory consumption.
The following table shows the bytes used to hold a list with 1'000'000 null elements:
| | BigList | GapList | ArrayList | LinkedList | TreeList | FastTable |
| 32 bit | 4'298'466 | 4'021'296 | 4'861'992 | 8'000'028 | 18'000'028 | 4'142'892 |
| 64 bit | 8'544'254 | 8'042'552 | 9'723'964 | 16'000'044 | 26'000'044 | 8'222'988 |
We can see that BigList, GapList, ArrayList, and FastTable only add a small overhead to the stored elements, whereas LinkedList needs twice the memory and TreeList even more.

Now to the performance. Here are the results of 9 benchmarks which have been run for each of the 6 candidates with JDK 8 in a 32 bit Windows environment and a list of 1'000'000 elements. The result table can be read as follows:
  • the fastest candidate for each test has a relative performance indicator of 1
  • the values for the other candidates indicate how many times slower they were, so a factor of 3 means that this implementation was 3 times slower than the best one
The different factors are colored like this:
  • factor 1: green (best)
  • factor <5: blue (good)
  • factor <25: yellow (moderate)
  • factor >25: red (poor)
If we look at the benchmark results, we can see that the performance of BigList is best for all except two benchmarks. The only moderate result is produced when getting elements in a totally random order. This could be expected, as there is no locality of reference which can be exploited, so for each access the block tree must be traversed to find the correct block. Luckily this is a rare use case in real applications. And the benchmark "Get local" shows that performance is back to good as soon as elements next to each other must be retrieved - as is the case if we iterate over a range.

3.2 Handling Primitives
In the second part of the benchmark, we want to see how big the savings are if we use a data structure specialized for storing primitives compared to storing wrapped objects. For this reason, we compare IntBigList and BigList. The following table shows the memory needed to store 1'000'000 integer values:
| | BigList | IntBigList |
| 32 bit | 16'298'454 | 4'534'840 |
| 64 bit | 28'544'234 | 4'570'432 |
Obviously it is easy to save a lot of memory. In a 32 bit environment, IntBigList needs only about 28% of the memory (4'534'840 / 16'298'454), in a 64 bit environment only about 16% (4'570'432 / 28'544'234)! These figures become plausible if you recall that a simple object needs 8 bytes in a 32 bit environment, but already 16 bytes in a 64 bit environment, whereas a primitive integer value always needs only 4 bytes. The measurable performance gain is not so impressive: it is something below 10% for simple get operations and something above 10% for add and remove operations. These numbers show that the JVM is impressively fast at creating wrapper objects and boxing and unboxing primitive values. We must however also consider that each created object will need to be garbage collected at some point and therefore adds to the total load of the JVM.

4. Summary
BigList is a scalable high-performance list for storing large collections.
Its design guarantees that all operations will be predictable and efficient both in terms of performance and memory consumption; even copying large collections is tremendously fast. Benchmarks have proven this and shown that BigList outperforms other known list implementations. The library also offers specialized implementations for primitive types like IntBigList which save much memory and provide superior performance. BigList for handling objects and the specializations for handling primitives are part of the Brownies Collections library and can be downloaded from http://www.magicwerk.org/collections.
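The copy-on-write scheme from section 2.3 can be illustrated with a short, self-contained sketch. It is not the BigList source code: the class `CowBlocksSketch`, its nested `Block` type, and the simplifications (append-only growth, a plain ArrayList of blocks instead of a tree, no splitting, merging or finalizer) are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of block sharing with reference counts; not the BigList source. */
public class CowBlocksSketch {

    /** A block of elements plus a count of extra owners sharing it (0 = private). */
    static final class Block {
        final List<Integer> values;
        int refCount = 0;
        Block(List<Integer> values) { this.values = values; }
    }

    private final List<Block> blocks = new ArrayList<>();
    private final int blockSize;

    CowBlocksSketch(int blockSize) { this.blockSize = blockSize; }

    /** Appends to the last block, starting a new block when the last one is full. */
    void add(int value) {
        if (blocks.isEmpty() || blocks.get(blocks.size() - 1).values.size() == blockSize) {
            blocks.add(new Block(new ArrayList<>()));
        }
        writableBlock(blocks.size() - 1).values.add(value);
    }

    /** Reads never copy; all blocks but the last are full, so the index maps directly. */
    int get(int index) {
        return blocks.get(index / blockSize).values.get(index % blockSize);
    }

    void set(int index, int value) {
        writableBlock(index / blockSize).values.set(index % blockSize, value);
    }

    /** Cheap copy: shares every block and marks each one as shared. */
    CowBlocksSketch copy() {
        CowBlocksSketch copy = new CowBlocksSketch(blockSize);
        for (Block b : blocks) {
            b.refCount++;
            copy.blocks.add(b);
        }
        return copy;
    }

    /** Returns a block that is safe to modify, cloning it first if it is shared. */
    private Block writableBlock(int blockIndex) {
        Block b = blocks.get(blockIndex);
        if (b.refCount > 0) {
            b.refCount--;                              // give up our share of the old block
            b = new Block(new ArrayList<>(b.values));  // private copy starts with refCount 0
            blocks.set(blockIndex, b);
        }
        return b;
    }

    public static void main(String[] args) {
        CowBlocksSketch original = new CowBlocksSketch(2);
        original.add(1);
        original.add(2);
        original.add(3);
        CowBlocksSketch copy = original.copy();   // no element is copied here
        copy.set(0, 99);                          // only the first block is cloned
        System.out.println(original.get(0) + " " + copy.get(0));
    }
}
```

Only the block that is actually written gets duplicated, which is why copying a large list stays cheap as long as most of its blocks are never modified afterwards.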
November 3, 2014
by Thomas Mauch
· 32,913 Views · 10 Likes
Building a REST API with JAXB, Spring Boot and Spring Data
If someone asked you to develop a REST API on the JVM, which frameworks would you use? I was recently tasked with such a project. My client asked me to implement a REST API to ingest requests from a 3rd party. The project entailed consuming XML requests, storing the data in a database, then exposing the data to an internal application with a JSON endpoint. Finally, it would allow taking in a JSON request and turning it into an XML request back to the 3rd party. With the recent release of Apache Camel 2.14 and my success using it, I started by copying my Apache Camel / CXF / Spring Boot project and trimming it down to the bare essentials. I whipped together a simple Hello World service using Camel and Spring MVC. I also integrated Swagger into both. Both implementations were pretty easy to create (sample code), but I decided to use Spring MVC. My reasons were simple: its REST support was more mature, I knew it well, and Spring MVC Test makes it easy to test APIs.

Camel's Swagger support without web.xml
As part of the aforementioned spike, I learned how to configure Camel's REST and Swagger support using Spring's JavaConfig and no web.xml. I made this into a sample project and put it on GitHub as camel-rest-swagger.

This article shows how I built a REST API with Java 8, Spring Boot/MVC, JAXB and Spring Data (JPA and REST components). I stumbled a few times while developing this project, but figured out how to get over all the hurdles. I hope this helps the team that's now maintaining this project (my last day was Friday) and those that are trying to do something similar.

XML to Java with JAXB
The data we needed to ingest from a 3rd party was based on the NCPDP standards. As a member, we were able to download a number of XSD files, put them in our project and generate Java classes to handle the incoming/outgoing requests. I used the maven-jaxb2-plugin to generate the Java classes.
<plugin>
    <groupId>org.jvnet.jaxb2.maven2</groupId>
    <artifactId>maven-jaxb2-plugin</artifactId>
    <version>0.8.3</version>
    <executions>
        <execution>
            <goals>
                <goal>generate</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <args>
            <arg>-XtoString</arg>
            <arg>-Xequals</arg>
            <arg>-XhashCode</arg>
            <arg>-Xcopyable</arg>
        </args>
        <plugins>
            <plugin>
                <groupId>org.jvnet.jaxb2_commons</groupId>
                <artifactId>jaxb2-basics</artifactId>
                <version>0.6.4</version>
            </plugin>
        </plugins>
        <schemaDirectory>src/main/resources/schemas/ncpdp</schemaDirectory>
    </configuration>
</plugin>

The first error I ran into was about a property already being defined.
[INFO] --- maven-jaxb2-plugin:0.8.3:generate (default) @ spring-app ---
[ERROR] Error while parsing schema(s).Location [ file:/Users/mraible/dev/spring-app/src/main/resources/schemas/ncpdp/structures.xsd{1811,48}].
com.sun.istack.SAXParseException2; systemId: file:/Users/mraible/dev/spring-app/src/main/resources/schemas/ncpdp/structures.xsd; lineNumber: 1811; columnNumber: 48; Property "MultipleTimingModifierAndTimingAndDuration" is already defined. Use <jaxb:property> to resolve this conflict.
    at com.sun.tools.xjc.ErrorReceiver.error(ErrorReceiver.java:86)

I was able to work around this by upgrading to maven-jaxb2-plugin version 0.9.1. I created a controller and stubbed out a response with hard-coded data. I confirmed the incoming XML-to-Java marshalling worked by testing with a sample request provided by our 3rd party customer. I started with a curl command, because it was easy to use and could be run by anyone with the file and curl installed.
curl -X POST -H 'Accept: application/xml' -H 'Content-Type: application/xml' \
  --data-binary @sample-request.xml http://localhost:8080/api/message -v

This is when I ran into another stumbling block: the response wasn't getting marshalled back to XML correctly. After some research, I found out this was caused by the lack of @XmlRootElement annotations on my generated classes. I posted a question to Stack Overflow titled Returning JAXB-generated elements from Spring Boot Controller. After banging my head against the wall for a couple of days, I figured out the solution. I created a bindings.xjb file in the same directory as my schemas. This causes JAXB to generate @XmlRootElement on classes.
To add namespace prefixes to the returned XML, I had to modify the maven-jaxb2-plugin to add a couple of arguments.
<arg>-extension</arg>
<arg>-Xnamespace-prefix</arg>
And add a dependency:
<dependency>
    <groupId>org.jvnet.jaxb2_commons</groupId>
    <artifactId>jaxb2-namespace-prefix</artifactId>
    <version>1.1</version>
</dependency>
Then I modified bindings.xjb to include the package and prefix settings. I also moved into a global setting. I eventually had to add prefixes for all schemas and their packages. I learned how to add prefixes from the namespace-prefix plugin's page. Finally, I customized the code-generation process to generate Joda-Time's DateTime instead of the default XMLGregorianCalendar. This involved a couple of custom XmlAdapters and a couple of additional lines in bindings.xjb. You can see the adapters and bindings.xjb with all necessary prefixes in this gist. Nicolas Fränkel's Customize your JAXB bindings was a great resource for making all this work. I wrote a test to prove that the ingest API worked as desired.

@RunWith(SpringJUnit4ClassRunner.class)
@SpringApplicationConfiguration(classes = Application.class)
@WebAppConfiguration
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_CLASS)
public class InitiateRequestControllerTest {

    @Inject
    private InitiateRequestController controller;

    private MockMvc mockMvc;

    @Before
    public void setup() {
        MockitoAnnotations.initMocks(this);
        this.mockMvc = MockMvcBuilders.standaloneSetup(controller).build();
    }

    @Test
    public void testGetNotAllowedOnMessagesAPI() throws Exception {
        mockMvc.perform(get("/api/initiate")
                .accept(MediaType.APPLICATION_XML))
                .andExpect(status().isMethodNotAllowed());
    }

    @Test
    public void testPostPaInitiationRequest() throws Exception {
        // Read the whole sample request from the classpath into a String
        String request = new Scanner(new ClassPathResource("sample-request.xml").getFile()).useDelimiter("\\Z").next();

        mockMvc.perform(post("/api/initiate")
                .accept(MediaType.APPLICATION_XML)
                .contentType(MediaType.APPLICATION_XML)
                .content(request))
                .andExpect(status().isOk())
                .andExpect(content().contentType(MediaType.APPLICATION_XML))
                .andExpect(xpath("/message/header/to").string("3rdparty"))
                .andExpect(xpath("/message/header/sendersoftware/sendersoftwaredeveloper").string("hid"))
                .andExpect(xpath("/message/body/status/code").string("010"));
    }
}

Spring Data for JPA and REST
With JAXB out of the way, I turned to creating an internal API that could be used by another application. Spring Data was fresh in my mind after reading about it last summer. I created classes for entities I wanted to persist, using Lombok's @Data to reduce boilerplate. I read the Accessing Data with JPA guide, created a couple of repositories and wrote some tests to prove they worked. I ran into an issue trying to persist Joda's DateTime and found Jadira provided a solution. I added its usertype.core as a dependency to my pom.xml:
<dependency>
    <groupId>org.jadira.usertype</groupId>
    <artifactId>usertype.core</artifactId>
    <version>3.2.0.GA</version>
</dependency>
... and annotated DateTime variables accordingly.
@Column(name = "last_modified", nullable = false)
@Type(type="org.jadira.usertype.dateandtime.joda.PersistentDateTime")
private DateTime lastModified;

With JPA working, I turned to exposing REST endpoints. I used Accessing JPA Data with REST as a guide and was looking at JSON in my browser in a matter of minutes. I was surprised to see a "profile" service listed next to mine, and posted a question to the Spring Boot team. Oliver Gierke provided an excellent answer.

Swagger
Spring MVC's integration for Swagger has greatly improved since I last wrote about it. Now you can enable it with a @EnableSwagger annotation. Below is the SwaggerConfig class I used to configure Swagger and read properties from application.yml.
@configuration @enableswagger public class swaggerconfig implements environmentaware { public static final string default_include_pattern = "/api/.*"; private relaxedpropertyresolver propertyresolver; @override public void setenvironment(environment environment) { this.propertyresolver = new relaxedpropertyresolver(environment, "swagger."); } /** * swagger spring mvc configuration */ @bean public swaggerspringmvcplugin swaggerspringmvcplugin(springswaggerconfig springswaggerconfig) { return new swaggerspringmvcplugin(springswaggerconfig) .apiinfo(apiinfo()) .genericmodelsubstitutes(responseentity.class) .includepatterns(default_include_pattern); } /** * api info as it appears on the swagger-ui page */ private apiinfo apiinfo() { return new apiinfo( propertyresolver.getproperty("title"), propertyresolver.getproperty("description"), propertyresolver.getproperty("termsofserviceurl"), propertyresolver.getproperty("contact"), propertyresolver.getproperty("license"), propertyresolver.getproperty("licenseurl")); } } after getting swagger to work, i discovered that endpoints published with @repositoryrestresource aren't picked up by swagger. there is an open issue for spring data support in the swagger-springmvc project. liquibase integration i configured this project to use h2 in development and postgresql in production. i used spring profiles to do this and copied xml/yaml (for maven and application*.yml files) from a previously created jhipster project. next, i needed to create a database. i decided to use liquibase to create tables, rather than hibernate's schema-export. i chose liquibase over flyway based of discussions in the jhipster project . to use liquibase with spring boot is dead simple: add the following dependency to pom.xml, then place changelog files in src/main/resources/db/changelog . org.liquibase liquibase-core i started by using hibernate's schema-export and changing hibernate.ddl-auto to "create-drop" in application-dev.yml . 
i also commented out the liquibase-core dependency. then i setup a postgresql database and started the app with "mvn spring-boot:run -pprod". i generated the liquibase changelog from an existing schema using the following command (after downloading and installing liquibase). liquibase --driver=org.postgresql.driver --classpath="/users/mraible/.m2/repository/org/postgresql/postgresql/9.3-1102-jdbc41/postgresql-9.3-1102-jdbc41.jar:/users/mraible/snakeyaml-1.11.jar" --changelogfile=/users/mraible/dev/spring-app/src/main/resources/db/changelog/db.changelog-02.yaml --url="jdbc:postgresql://localhost:5432/mydb" --username=user --password=pass generatechangelog i did find one bug - the generatechangelog command generates too many constraints in version 3.2.2 . i was able to fix this by manually editing the generated yaml file. tip: if you want to drop all tables in your database to verify liquibase creation is working in postgesql, run the following commands: psql -d mydb drop schema public cascade; create schema public; after writing minimal code for spring data and configuring liquibase to create tables/relationships, i relaxed a bit, documented how everything worked and added a loggingfilter . the loggingfilter was handy for viewing api requests and responses. @bean public filterregistrationbean loggingfilter() { loggingfilter filter = new loggingfilter(); filterregistrationbean registrationbean = new filterregistrationbean(); registrationbean.setfilter(filter); registrationbean.seturlpatterns(arrays.aslist("/api/*")); return registrationbean; } accessing api with resttemplate the final step i needed to do was figure out how to access my new and fancy api with resttemplate . at first, i thought it would be easy. then i realized that spring data produces a hal -compliant api, so its content is embedded inside an "_embedded" json key. after much trial and error, i discovered i needed to create a resttemplate with hal and joda-time awareness. 
@bean public resttemplate resttemplate() { objectmapper mapper = new objectmapper(); mapper.configure(deserializationfeature.fail_on_unknown_properties, false); mapper.registermodule(new jackson2halmodule()); mapper.registermodule(new jodamodule()); mappingjackson2httpmessageconverter converter = new mappingjackson2httpmessageconverter(); converter.setsupportedmediatypes(mediatype.parsemediatypes("application/hal+json")); converter.setobjectmapper(mapper); stringhttpmessageconverter stringconverter = new stringhttpmessageconverter(); stringconverter.setsupportedmediatypes(mediatype.parsemediatypes("application/xml")); list> converters = new arraylist<>(); converters.add(converter); converters.add(stringconverter); return new resttemplate(converters); } the jodamodule was provided by the following dependency: com.fasterxml.jackson.datatype jackson-datatype-joda with the configuration complete, i was able to write a messagesapiitest integration test that posts a request and retrieves it using the api. the api was secured using basic authentication, so it took me a bit to figure out how to make that work with resttemplate. willie wheeler's basic authentication with spring resttemplate was a big help. 
@runwith(springjunit4classrunner.class) @contextconfiguration(classes = integrationtestconfig.class) public class messagesapiitest { private final static log log = logfactory.getlog(messagesapiitest.class); @value("http://${app.host}/api/initiate") private string initiateapi; @value("http://${app.host}/api/messages") private string messagesapi; @value("${app.host}") private string host; @inject private resttemplate resttemplate; @before public void setup() throws exception { string request = new scanner(new classpathresource("sample-request.xml").getfile()).usedelimiter("\\z").next(); responseentity response = resttemplate.exchange(gettesturl(initiateapi), httpmethod.post, getbasicauthheaders(request), org.ncpdp.schema.transport.message.class, collections.emptymap()); assertequals(httpstatus.ok, response.getstatuscode()); } @test public void testgetmessages() { httpentity request = getbasicauthheaders(null); responseentity> result = resttemplate.exchange(gettesturl(messagesapi), httpmethod.get, request, new parameterizedtypereference>() {}); httpstatus status = result.getstatuscode(); collection messages = result.getbody().getcontent(); log.debug("messages found: " + messages.size()); assertequals(httpstatus.ok, status); for (message message : messages) { log.debug("message.id: " + message.getid()); log.debug("message.datecreated: " + message.getdatecreated()); } } private httpentity getbasicauthheaders(string body) { string plaincreds = "user:pass"; byte[] plaincredsbytes = plaincreds.getbytes(); byte[] base64credsbytes = base64.encodebase64(plaincredsbytes); string base64creds = new string(base64credsbytes); httpheaders headers = new httpheaders(); headers.add("authorization", "basic " + base64creds); headers.add("content-type", "application/xml"); if (body == null) { return new httpentity<>(headers); } else { return new httpentity<>(body, headers); } } } to get spring data to populate the message id, i created a custom restconfig class to expose it. 
I learned how to do this from Tommy Ziegler.

/**
 * Used to expose ids for resources.
 */
@Configuration
public class RestConfig extends RepositoryRestMvcConfiguration {

    @Override
    protected void configureRepositoryRestConfiguration(RepositoryRestConfiguration config) {
        config.exposeIdsFor(Message.class);
        config.setBaseUri("/api");
    }
}

Summary

This article explains how I built a REST API using JAXB, Spring Boot, Spring Data and Liquibase. It was relatively easy to build, but required some tricks to access it with Spring's RestTemplate. Figuring out how to customize JAXB's code generation was also essential to make things work.

I started developing the project with Spring Boot 1.1.7, but upgraded to 1.2.0.M2 after I found it supported Log4j2 and configuring Spring Data REST's base URI in application.yml. When I handed the project off to my client last week, it was using 1.2.0.BUILD-SNAPSHOT because of a bug when running in Tomcat.

This was an enjoyable project to work on. I especially liked how easy Spring Data makes it to expose JPA entities in an API. Spring Boot made things easy to configure once again, and Liquibase seems like a nice tool for database migrations.

If someone asked me to develop a REST API on the JVM, which frameworks would I use? Spring Boot, Spring Data, Jackson, Joda-Time, Lombok and Liquibase. These frameworks worked really well for me on this particular project.
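As a language-neutral illustration of the HAL point made in the RestTemplate section above (Spring Data REST nests collection content under an "_embedded" key), here is a minimal Python sketch of unwrapping such a response. The sample body, the "messages" relation name and the helper name are assumptions made for illustration; they are not taken from the project.

```python
import json


def embedded_items(hal_body: str, rel: str) -> list:
    """Return the resources stored under _embedded[rel], or [] if absent."""
    document = json.loads(hal_body)
    return document.get("_embedded", {}).get(rel, [])


# Hypothetical HAL response body; field names are assumptions for illustration.
sample = json.dumps({
    "_links": {"self": {"href": "http://localhost:8080/api/messages"}},
    "_embedded": {"messages": [{"id": 1}, {"id": 2}]},
})

ids = [message["id"] for message in embedded_items(sample, "messages")]
print(ids)
```

A client that reads the top-level object directly would find no messages at all, which is why the RestTemplate above needs a HAL-aware ObjectMapper.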
October 30, 2014
by Matt Raible
· 64,261 Views
Sharding Pitfalls Part III: Chunk Balancing and Collection Limits
In Parts 1 and 2 we have covered a number of common issues people run into when managing a sharded MongoDB cluster. In this final post of the series we will cover a subtle but important distinction in terms of balancing a sharded cluster, as well as an interesting limitation that can be worked around relatively easily, but is nonetheless surprising when it comes up.

6. Chunk balancing != data balancing != traffic balancing

The balancer in a sharded cluster cares about just one thing: Are chunks for a given collection evenly balanced across all shards? If they are not, then it will take steps to rectify that imbalance. This all sounds perfectly logical, and even with extra complexity like tagging involved, the logic is pretty straightforward. If we assume that all chunks are equal, then we can rest assured that our data is being evenly balanced across all the shards in our cluster and rest easy at night.

Although that is sometimes, perhaps even frequently, the case, it is not always true - chunks are not always equal. There can be massive “jumbo” chunks that exceed the maximum chunk size (64MiB), completely empty chunks and everything in between. Let’s use an example from our first pitfall, the monotonically increasing shard key.

For our example, we have picked just such a key to shard on (date), and up until this point we have had just one shard and had not sharded the collection. We are about to add a second shard to our cluster and so we enable sharding on the collection and do the necessary admin work to add the new shard into the cluster. Once the collection is enabled for sharding, the first shard contains all the newly minted chunks. Let’s represent them in a simplified table of 10 chunks. This is not representative of a real data set, but it will do for illustrative purposes:

Table 1 - Initial Chunk Layout

Now we add our second shard. The balancer will kick in and attempt to distribute the chunks evenly.
It will do this by moving the lowest range chunks to the new shard until the counts are identical. Once it is finished balancing, our table now looks like this:

Table 2 - Balanced Chunk Layout

That looks pretty good at the moment, but let’s imagine that more recent chunks are more likely to have more activity (updates, say) than older chunks. Adding the traffic share estimates for each chunk shows that shard1 is taking far more traffic (72%) than shard2 (28%) despite the chunks seeming balanced overall based on the approximate size. Hence, chunk balancing is not equal to traffic balancing.

Using that same example, let’s add another wrinkle - periodic deletion of old data. Every 3 months we run a job to delete any data older than 12 months. Let’s look at the impact of that on our table after we run it for the first time (assuming the first run happens on July 1st 2015).

Table 3 - Post-Delete Chunk Layout

The distribution of data is now completely skewed toward shard1 - shard2 is in fact empty! However, the balancer is completely unaware of this imbalance - the chunk count has remained the same the entire time, and as far as it is concerned the system is in a steady state. With no data on shard2, our traffic imbalance as seen above will be even worse, and we have essentially negated the benefit of having a second shard for this collection.

Possible Mitigation Strategies
  • If data and traffic balance are important, select an appropriate shard key
  • Move chunks manually to address the imbalances - swap “hot” chunks for “cool” chunks, empty chunks for larger chunks

7. Waiting too long to shard a collection (collection too large)

This is not very common, but when it falls on your shoulders, it can be quite challenging to solve. There is a maximum data size for a collection when it is initially split, which is a function of the chunk size and data size as noted on the limits page. If your collection contains less than 256GiB of data, then there will be no issue.
If the collection size exceeds 256GiB but is less than 400GiB, then MongoDB may be able to do an initial split without any special measures being taken. Otherwise, with larger initial data sizes and the default settings, the initial split will fail. It is worth noting that once split the collection may grow as needed and without any real limitations as long as you can continue to add shards as data size grows.

Possible Mitigation Strategies

Since the limit is dictated by the chunk size and the data size, and assuming there is not much to be done about the data size, the remaining variable is the chunk size. This is adjustable (default is 64MiB) and can be raised in order to let a large collection split initially and then reduced once that has been completed. The required chunk size increase will depend on the actual data size. However, this is relatively easy to work out - simply divide your data size by 256GiB and then multiply that figure by 64MiB (and round up if it is not a nice even number). As an example, let’s consider a 4TiB collection:

4TiB divided by 256GiB = 16
64MiB x 16 = 1024MiB

Hence, set the max chunk size to 1024MiB, then perform the initial sharding of the collection, and then finally reduce the chunk size back to 64MiB using the same procedure.

Thanks for reading through the Sharding Pitfall series! If you want to learn more about managing MongoDB deployments at scale, sign up for my online education course, MongoDB Advanced Deployment and Operations.

Planning for scale? No problem: MongoDB is here to help. Get a preview of what it’s like to work with MongoDB’s Technical Services Team. Give us some details on your deployment and we can set you up with an expert who can provide detailed guidance on all aspects of scaling with MongoDB, based on our experience with hundreds of deployments.
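The chunk-size arithmetic from the mitigation strategy above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb stated in the article (data size divided by 256GiB, times 64MiB); the function name is hypothetical, and rounding the multiplier up to a whole number is one reading of "round up", not something the article spells out.

```python
GIB = 1024 ** 3


def required_chunk_size_mib(data_size_bytes: int,
                            split_limit_bytes: int = 256 * GIB,
                            default_chunk_mib: int = 64) -> int:
    """Chunk size (MiB) to set before the initial split, per the rule of thumb."""
    # Integer ceiling division: how many multiples of the 256GiB limit we need.
    factor = -(-data_size_bytes // split_limit_bytes)
    return max(1, factor) * default_chunk_mib


four_tib = 4 * 1024 * GIB
print(required_chunk_size_mib(four_tib))  # 1024, matching the worked example
```

For collections under 256GiB the sketch returns the default 64MiB, i.e. no change is needed. It is deliberately conservative for the 256GiB to 400GiB range, where the article notes the split may succeed without any adjustment.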
October 27, 2014
by Francesca Krihely
· 4,266 Views
How to Avoid Hash Collisions When Using MySQL’s CRC32 Function
Originally Written by Arunjith Aravindan

Percona Toolkit’s pt-table-checksum performs an online replication consistency check by executing checksum queries on the master, which produce different results on replicas that are inconsistent with the master, and the tool pt-table-sync synchronizes data efficiently between MySQL tables.

By default the tools use the CRC32 function. Other good choices include MD5 and SHA1. If you have installed the FNV_64 user-defined function, pt-table-sync will detect it and prefer to use it, because it is much faster than the built-ins. You can also use MURMUR_HASH if you’ve installed that user-defined function. Both of these are distributed with Maatkit. For details please see the tool’s documentation.

Below are test cases similar to what you might have encountered. A table checksum lets us confirm that two tables are identical, which is useful for verifying that a slave server is in sync with its master. The following test cases with pt-table-checksum and pt-table-sync will help you use the tools more accurately.

For example, in a master-slave setup we have a table with a primary key on column “a” and a unique key on column “b”. Here the master and slave tables are not in sync: they have two identical rows and two rows that differ. The pt-table-checksum tool should be able to identify the difference between master and slave, and pt-table-sync in this case should sync the tables with two REPLACE queries.

+-----+-----+    +-----+-----+
|  a  |  b  |    |  a  |  b  |
+-----+-----+    +-----+-----+
|  2  |  1  |    |  2  |  1  |
|  1  |  2  |    |  1  |  2  |
|  4  |  3  |    |  3  |  3  |
|  3  |  4  |    |  4  |  4  |
+-----+-----+    +-----+-----+

Case 1: Non-cryptographic hash function (CRC32) and the hash collision

The source and target tables differ in two rows, and intuitively the tools should identify the difference.
But the below scenarios explain how the tools can be wrongly used and how to avoid them – and make things more consistent and reliable when using the tools in your production. The tools by default use the CRC32 checksums and it is prone to hash collisions. In the below case the non-cryptographic function (CRC32) is not able to identify the two distinct values as the function generates the same value even we are having the distinct values in the tables. CREATE TABLE `t1` ( `a` int(11) NOT NULL, `b` int(11) NOT NULL, PRIMARY KEY (`a`), UNIQUE KEY `b` (`b`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 3 | 3 | | 3 | 4 | | 4 | 4 | +-----+-----+ +-----+-----+ Master: [root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 09-17T00:59:45 0 0 4 1 0 1.081 db1.t1 Slave: [root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=*** --verbose --sync-to-master 192.**.**.** # Syncing via replication h=192.**.**.**,p=...,u=root # DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE Narrowed down to BIT_XOR: Master: mysql> SELECT BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) FROM `db1`.`t1`; +------------------------------------------------------------+ | BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) | +------------------------------------------------------------+ | 6581445 | +------------------------------------------------------------+ 1 row in set (0.00 sec) Slave: mysql> SELECT BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS UNSIGNED)) FROM `db1`.`t1`; +------------------------------------------------------------+ | BIT_XOR(CAST(CRC32(CONCAT_WS('#', `a`, `b`)) AS 
UNSIGNED)) | +------------------------------------------------------------+ | 6581445 | +------------------------------------------------------------+ 1 row in set (0.16 sec) Case 2: As the tools are not able to identify the difference, let us add a new row to the slave and check if the tools are able to identify the distinct values. So I am adding a new row (5,5) to the slave. mysql> insert into db1.t1 values(5,5); Query OK, 1 row affected (0.05 sec) Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 3 | 3 | | 3 | 4 | | 4 | 4 | +-----+-----+ | 5 | 5 | +-----+-----+ [root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 09-17T01:01:13 0 1 4 1 0 1.054 db1.t1 [root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=*** --verbose --sync-to-master 192.**.**.** # Syncing via replication h=192.**.**.**,p=...,u=root # DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE DELETE FROM `db1`.`t1` WHERE `a`='5' LIMIT 1 /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**. 
10,p=...,u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205 user:root host:localhost.localdomain*/; REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('3', '4') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**, p=...,u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205 user:root host:localhost.localdomain*/; REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('4', '3') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.**.**.**, p=...,u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.**.**.**,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5205 user:root host:localhost.localdomain*/; # 1 2 0 0 Chunk 01:01:43 01:01:43 2 db1.t1 Well, apparently the tools are now able to identify the newly added row in the slave and the two other rows having the difference. Case 3: Advantage of Cryptographic Hash functions (Ex: Secure MD5) As such let us make the tables as in the case1 and ask the tools to use the cryptographic (secure MD5) hash functions instead the usual non-cryptographic function. The default CRC32 function provides no security due to their simple mathematical structure and too prone to hash collisions but the MD5 provides better level of integrity. So let us try with the –function=md5 and see the result. 
Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 3 | 3 | | 3 | 4 | | 4 | 4 | +-----+-----+ +-----+-----+ Narrowed down to BIT_XOR: Master: mysql> SELECT 'test', 't2', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING (@crc, 1, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'), LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := md5(CONCAT_WS('#', `a`, `b`)) , 17, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'))), 0) AS crc FROM `db1`.`t1`; +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | cnt | crc | +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | 4 | 000000000000000063f65b71e539df48 | +------+----+---+------+------+------+-----+----------------------------------+ 1 row in set (0.00 sec) Slave: mysql> SELECT 'test', 't2', '1', NULL, NULL, NULL, COUNT(*) AS cnt, COALESCE(LOWER(CONCAT(LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING (@crc, 1, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'), LPAD(CONV(BIT_XOR(CAST(CONV(SUBSTRING(@crc := md5(CONCAT_WS('#', `a`, `b`)) , 17, 16), 16, 10) AS UNSIGNED)), 10, 16), 16, '0'))), 0) AS crc FROM `db1`.`t1`; +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | cnt | crc | +------+----+---+------+------+------+-----+----------------------------------+ | test | t2 | 1 | NULL | NULL | NULL | 4 | 0000000000000000df024e1a4a32c31f | +------+----+---+------+------+------+-----+----------------------------------+ 1 row in set (0.00 sec) [root@localhost mysql]# pt-table-checksum --replicate=percona.checksum --create-replicate-table --function=md5 --databases=db1 --tables=t1 localhost --user=root --password=*** --no-check-binlog-format TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE 09-23T23:57:52 0 1 12 1 0 0.292 db1.t1 
[root@localhost bin]# ./pt-table-sync --print --execute --replicate=percona.checksum --tables db1.t1 --user=root --password=amma --verbose --function=md5 --sync-to-master 192.***.***.*** # Syncing via replication h=192.168.56.102,p=...,u=root # DELETE REPLACE INSERT UPDATE ALGORITHM START END EXIT DATABASE.TABLE REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('3', '4') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.168.56.101,p=..., u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.***.***.***,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5608 user:root host:localhost.localdomain*/; REPLACE INTO `db1`.`t1`(`a`, `b`) VALUES ('4', '3') /*percona-toolkit src_db:db1 src_tbl:t1 src_dsn:P=3306,h=192.168.56.101,p=..., u=root dst_db:db1 dst_tbl:t1 dst_dsn:h=192.***.**.***,p=...,u=root lock:1 transaction:1 changing_src:percona.checksum replicate:percona.checksum bidirectional:0 pid:5608 user:root host:localhost.localdomain*/; # 0 2 0 0 Chunk 04:46:04 04:46:04 2 db1.t1 Master Slave +-----+-----+ +-----+-----+ | a | b | | a | b | +-----+-----+ +-----+-----+ | 2 | 1 | | 2 | 1 | | 1 | 2 | | 1 | 2 | | 4 | 3 | | 4 | 3 | | 3 | 4 | | 3 | 4 | +-----+-----+ +-----+-----+ The MD5 did the trick and solved the problem. See the BIT_XOR result for the MD5 given above and the function is able to identify the distinct values in the tables and resulted with the different crc values. The MD5 (Message-Digest algorithm 5) is a well-known cryptographic hash function with a 128-bit resulting hash value. MD5 is widely used in security-related applications, and is also frequently used to check the integrity but MD5() and SHA1() are very CPU-intensive with slower checksumming if chunk-time is included.
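The collision in Case 1 can be reproduced outside MySQL. The sketch below is a simplified Python model of the BIT_XOR queries shown above, not the tools' actual implementation: it assumes Python's zlib.crc32 computes the same standard CRC-32 as MySQL's CRC32(), and it XORs the full 128-bit MD5 digest rather than two 64-bit halves. The reason the CRC32 variant collides is that CRC-32 is affine over XOR for equal-length inputs, and "4#3" XOR "3#4" equals "3#3" XOR "4#4" byte for byte.

```python
import hashlib
import zlib
from functools import reduce

# Rows (a, b) from the master and slave tables in Case 1.
MASTER = [(2, 1), (1, 2), (4, 3), (3, 4)]
SLAVE = [(2, 1), (1, 2), (3, 3), (4, 4)]


def row_text(row):
    """Mimic CONCAT_WS('#', a, b) for a row of integers."""
    return "#".join(str(col) for col in row).encode("ascii")


def xor_crc32(rows):
    """XOR of the CRC32 of every row, like BIT_XOR(CRC32(...))."""
    return reduce(lambda acc, row: acc ^ zlib.crc32(row_text(row)), rows, 0)


def xor_md5(rows):
    """XOR of the MD5 digest of every row, treated as a 128-bit integer."""
    return reduce(
        lambda acc, row: acc ^ int(hashlib.md5(row_text(row)).hexdigest(), 16),
        rows,
        0,
    )


crc_collides = xor_crc32(MASTER) == xor_crc32(SLAVE)
md5_differs = xor_md5(MASTER) != xor_md5(SLAVE)
print(crc_collides, md5_differs)
```

The point of the comparison is that the table-level checksum is an XOR of per-row hashes, so any hash with this linear structure can hide swapped values, whereas MD5 has no such structure.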
October 24, 2014
by Peter Zaitsev
· 7,075 Views
Understanding Information Retrieval by Using Apache Lucene and Tika - Part 1
Introduction

In this tutorial, the Apache Lucene and Apache Tika frameworks will be explained through their core concepts (e.g. parsing, MIME detection, content analysis, indexing, scoring, boosting) via illustrative examples that should be useful not only to seasoned software developers but also to beginners in content analysis and programming. We assume you have a working knowledge of the Java™ programming language and plenty of content to analyze.

Throughout this tutorial, you will learn:
  • how to use Apache Tika's API and its most relevant functions
  • how to develop code with the Apache Lucene API and its most important modules
  • how to integrate Apache Lucene and Apache Tika in order to build your own piece of software that stores and retrieves information efficiently (project code is available for download)

What are Lucene and Tika?

According to Apache Lucene's site, Apache Lucene is an open source Java library for indexing and searching within large collections of documents. The index size is roughly 20-30% of the size of the text indexed, and the search algorithms provide features like:
  • ranked searching - best results returned first
  • many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more (in this tutorial we will demonstrate only phrase queries)
  • fielded search (e.g. title, author, contents)
  • sorting by any field
  • flexible faceting, highlighting, joins and result grouping
  • pluggable ranking models, including the Vector Space Model and Okapi BM25

But Lucene's main purpose is to deal directly with text, and we want to manipulate documents, which come in various formats and encodings. For parsing document content and properties, the Apache Tika library is necessary.

Apache Tika is a library that provides a flexible and robust set of interfaces that can be used in any context where metadata analysis and structured text extraction are needed.
The key component of Apache Tika is the Parser (org.apache.tika.parser.Parser) interface, because it hides the complexity of different file formats while providing a simple and powerful mechanism to extract structured text content and metadata from all sorts of documents.

Criteria for Tika parsing design
  • Streamed parsing - The interface should require neither the client application nor the parser implementation to keep the full document content in memory or spooled to disk. This allows even huge documents to be parsed without excessive resource requirements.
  • Structured content - A parser implementation should be able to include structural information (headings, links, etc.) in the extracted content. A client application can use this information, for example, to better judge the relevance of different parts of the parsed document.
  • Input metadata - A client application should be able to include metadata like the file name or declared content type with the document to be parsed. The parser implementation can use this information to better guide the parsing process.
  • Output metadata - A parser implementation should be able to return document metadata in addition to document content. Many document formats contain metadata like the name of the author that may be useful to client applications.
  • Context sensitivity - While the default settings and behaviour of Tika parsers should work well for most use cases, there are still situations where more fine-grained control over the parsing process is desirable. It should be easy to inject such context-specific information into the parsing process without breaking the layers of abstraction.

Requirements
  • Maven 2.0 or higher
  • Java 1.6 SE or higher

Lesson 1: Automate metadata extraction from any file type

Our premises are the following: we have a collection of documents stored on disk or in a database and we would like to index them; these documents can be Word documents, PDFs, HTML files, plain text files etc.
As we are developers, we would like to write reusable code that extracts file properties regarding format (metadata) and file content. Apache Tika has a MIME type repository and a set of schemes (any combination of MIME magic, URL patterns, XML root characters, or file extensions) to determine if a particular file, URL, or piece of content matches one of its known types. If the content does match, Tika has detected its MIME type and can proceed to select the appropriate parser.

In the sample code, file type detection and parsing are covered inside the class com.retriever.lucene.index.IndexCreator, method indexFile.

Listing 1.1 Analyzing a file with Tika

public static DocumentWithAbstract indexFile(Analyzer analyzer, File file) throws IOException {
    Metadata metadata = new Metadata();
    ContentHandler handler = new BodyContentHandler(10 * 1024 * 1024);
    ParseContext context = new ParseContext();
    Parser parser = new AutoDetectParser();
    InputStream stream = new FileInputStream(file); // open stream
    try {
        parser.parse(stream, handler, metadata, context); // parse the stream
    } catch (TikaException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    } finally {
        stream.close(); // close the stream
    }
    // more code here
}

The above code shows how a file is parsed using org.apache.tika.parser.AutoDetectParser; this implementation was chosen because we would like to parse documents regardless of their format. Also, for handling the content, the org.apache.tika.sax.BodyContentHandler was constructed with a writeLimit given as parameter (10*1024*1024); this type of constructor creates a content handler that writes XHTML body character events to an internal string buffer and, for documents with large content, is less likely to throw a SAXException (thrown when the default write limit is reached).
As a result of our parsing we have obtained a Metadata object that we can now use to detect file properties (title or any other header specific to a document format). Metadata processing can be done as described below (com.retriever.lucene.index.IndexCreator, method indexFileDescriptors):

Listing 1.2 Processing metadata

private static Document indexFileDescriptors(String fileName, Metadata metadata) {
    Document doc = new Document();
    // store file name in a separate TextField
    doc.add(new TextField(ISearchConstants.FIELD_FILE, fileName, Store.YES));
    for (String key : metadata.names()) {
        String name = key.toLowerCase();
        String value = metadata.get(key);
        if (StringUtils.isBlank(value)) {
            continue; // skip empty metadata entries
        }
        if ("keywords".equalsIgnoreCase(key)) {
            // one stored field per keyword
            for (String keyword : value.split(",?(\\s+)")) {
                doc.add(new TextField(name, keyword, Store.YES));
            }
        } else if (ISearchConstants.FIELD_TITLE.equalsIgnoreCase(key)) {
            doc.add(new TextField(name, value, Store.YES));
        } else {
            doc.add(new TextField(name, fileName, Store.NO));
        }
    }
    return doc;
}

In the method presented above we store the file name in a separate field, along with the document's title (a document can have a title different from its file name); we are not interested in storing other information.
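To make the branching in Listing 1.2 easier to follow, here is a small Python model of the same decisions, with each field represented as a (name, value, stored) tuple. It is only a sketch of the logic, not Lucene or Tika code: the field names "file" and "title" stand in for the ISearchConstants values, which the listing does not show, and the sample metadata is invented.

```python
import re

# Same separator idea as the Java regex ",?(\\s+)": optional comma, then whitespace.
KEYWORD_SEPARATOR = re.compile(r",?\s+")


def split_keywords(value):
    """Split a keywords header into individual keywords, dropping empties."""
    return [part for part in KEYWORD_SEPARATOR.split(value.strip()) if part]


def metadata_to_fields(file_name, metadata):
    """Mirror Listing 1.2: return (name, value, stored) tuples for one file."""
    fields = [("file", file_name, True)]  # "file" is an assumed field name
    for key, value in metadata.items():
        name = key.lower()
        if not value or not value.strip():
            continue  # blank values are skipped
        if name == "keywords":
            fields.extend((name, keyword, True) for keyword in split_keywords(value))
        elif name == "title":  # "title" is an assumed field name
            fields.append((name, value, True))
        else:
            fields.append((name, file_name, False))  # indexed but not stored
    return fields


sample = {
    "Keywords": "lucene, tika  search",
    "title": "Annual Report",
    "Author": "",
    "Content-Type": "application/pdf",
}
fields = metadata_to_fields("report.pdf", sample)
print(fields)
```

Note that the separator only splits on whitespace (optionally preceded by a comma), so a value such as "a,b" with no space stays a single keyword, which matches the behaviour of the regular expression in the listing.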
October 22, 2014
by Ana-Maria Mihalceanu
· 18,730 Views · 4 Likes
Sharding Pitfalls Part II: Running a Sharded Cluster
By Adam Comerford, Senior Solutions Engineer In Part I we discussed important considerations when picking a shard key. In this post we will go through some recommendations when running a sharded cluster at scale. Scalability is one of the core benefits of sharding in MongoDB but this can give you a false sense of security; even with that flexibility, you still have to make smart decisions about how and when you deploy resources. In this post, we will cover a couple of common mistakes that people tend to make when it comes to running a sharded cluster. 3. Waiting too long to add a new shard (overloaded) You sharded your database and scaled horizontally for a reason, perhaps it was to add more memory or disk capacity. Whatever the reason, if your application usage grows over time so (generally) does your database utilization. Eventually, your current sharded cluster will pass a certain point, let’s call it 80% utilized (as a nice round estimate), such that it becomes problematic to add another shard. Why? Well, adding a new shard to a cluster is not free, and it is not instantaneous. It consumes resources and (initially) accepts very little traffic. Essentially, at the start of its existence, a newly added shard costs you capacity instead of adding capacity. The length of time it will stay in this state will depend on the balancer and how long it takes for a significant portion of “busy/active” chunks to move onto the new shard. It can often be easier to visualize this process, so let’s make up some hypothetical numbers and set the bar relatively low. Our imaginary existing cluster will be a set of 2 shards, with 2000 chunks (500 considered “active”) and to that we need to add a 3rd shard. This 3rd shard will eventually store one third of the active chunks (and total chunks). The question is, when does this shard stop adding overhead overall and instead become an asset? 
In reality, this will vary from cluster to cluster and depends on many variables; in other words, you need good metrics about your cluster, particularly your load bottleneck. Therefore we will once again use our imaginations and go with a relatively low bar: in our imaginary system we have evaluated our load levels and the expected impact of migrations, and have determined that once 5% of active chunks (that is, the chunks seeing the most traffic) have migrated to the new shard, it can be considered a net gain for the overall system. Once all chunks have been balanced the migration overhead disappears, but initially this is an expected trade-off. This chart shows how long it would take for new shards to reach net positive contribution in your cluster (the dotted line implies net gain): In this fabricated example, it takes almost 2 hours for the new shard to attain a viable level of active chunks and be considered a net gain for the overall system. Although these numbers are fictional, they are based on setups we have seen in real systems with moderate load. From there it is relatively easy to imagine this set of migrations taking even longer on an overloaded set of shards, and taking far longer for our newly added shard to cross the threshold and become a net gain. As such it is best to be proactive and add capacity before it becomes a necessity.
Possible Mitigation Strategies
  • Manually balance targeted "hot" chunks (chunks that are accessed more than others) to move activity to the new shard more quickly
  • Add the shard at a low-traffic time so that there is less competition for resources
  • Disable balancing on some collections and prioritise balancing busy collections first
4. 
Under-provisioning Config Servers
Provisioning enough resources without being wasteful is always tricky, and all the more so in a complicated distributed system like a MongoDB sharded cluster. Everyone wants to use their hardware, virtual instances, virtual machines, containers and the like in the most efficient way possible, and get the best bang for their buck. Hence it is only natural to look at the various pieces of a distributed cluster for less-utilized components that could be put on less expensive resources. The most common pitfall here with MongoDB is the config servers, which are often neglected when stress testing a cluster. In testing environments and smaller deployments (unless specific measures are taken to stress them) they are relatively lightly loaded and are usually identified as candidates for lesser instances/hardware. The problem is that these are critical pieces of infrastructure. They may not be heavily loaded all the time, but when they do see load and struggle to service requests, that can impact all queries (reads, writes, authentication) and add latency to all requests made of the cluster in question. In particular, the first config server in the list supplied to your mongos processes is vital. This is the config server that all mongos processes will read from by default when fetching or refreshing their view of the data distribution in your cluster. Similarly, this is the server that will be hit when attempting to authenticate a user. If it is under-provisioned and cannot service queries, or if it has problems with networking (packet loss, congestion), then the effects will be significant. 
Possible Mitigation Strategies Ensure the config servers are load tested, slightly over-provisioned (the first config server in particular) If using virtual machines or cloud based instances, investigate increasing available resources Turning off the balancer, disabling chunk splitting will reduce the chances of high read traffic to the config servers (no migrations, no meta data refresh) but this is only a temporary fix unless you have a perfect write distribution and may not eliminate issues completely. 5. Using the count() command on sharded collections This pitfall is very common, and it seems to hit somewhat randomly in terms of how long someone has been running a sharded environment. At some point, a question will arise along the lines of: “How are we tracking/verifying/checking how many documents we have in each collection on each shard, how balanced are they and do they agree with ?” Hopefully no one is actually constructing questions this way in your organization, but you get the basic idea. The most obvious way to do a quick check on this type of thing is to count the documents and see if the numbers make sense and/or agree with counts elsewhere. That thinking naturally leads people to the count command and they proceed to use it to gather figures for their documents and collections. Unfortunately, on a busy, mature sharded cluster, the results will very rarely be what is expected. The reason for this is that the count command as implemented today has several optimizations in place to make it faster to run in general and those speed optimizations essentially bypass a key piece of the sharding functionality needed to return accurate results in this case. This is a known bug and is being tracked in SERVER-3645, but does not stop people from consistently hitting this issue. 
The nature of the issue means that count will report documents in the results that it should not, for example:
  • Documents that are being deleted as part of a chunk migration
  • Documents that have been left behind from previous chunk migrations (also known as orphans)
  • Documents currently being copied as part of an in-flight chunk migration
A regular query (rather than a count) will have its results filtered by the respective primary and will not suffer from the same problem. Hence, if you were to manually count the results from a query client-side you would get an accurate result. This quirk of sharded environments will eventually be fixed, but for now it will inevitably crop up from time to time in all active sharded clusters used by a large team.
Possible Mitigation Strategies
  • Do counts on the client side, or use targeted, range-based queries (with a primary read preference) to count instead
  • Use cleanUpOrphaned and disable the balancer (make sure it has finished its current round) when performing counts across the cluster
If you want to learn more about managing MongoDB deployments at scale, sign up for my online education course, MongoDB Advanced Deployment and Operations. Planning for scale? No problem: MongoDB is here to help. Get a preview of what it's like to work with MongoDB's Technical Services Team. Give us some details on your deployment and we can set you up with an expert who can provide detailed guidance on all aspects of scaling with MongoDB, based on our experience with hundreds of deployments.
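To make the count discrepancy from pitfall 5 more concrete, here is a toy model in plain Java. It is only a sketch under stated assumptions: it does not use the MongoDB driver and does not reproduce how the server implements count. The class ShardCountDemo, the Shard type, the key ranges and the sample data are all hypothetical. It only illustrates why summing what each shard physically holds can exceed a count that is filtered by chunk ownership.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardCountDemo {
    /** Toy shard: the keys it physically stores and the key range it currently owns. */
    static final class Shard {
        final List<Integer> storedKeys = new ArrayList<>();
        final int ownedFrom; // inclusive
        final int ownedTo;   // exclusive

        Shard(int ownedFrom, int ownedTo) {
            this.ownedFrom = ownedFrom;
            this.ownedTo = ownedTo;
        }

        boolean owns(int key) {
            return key >= ownedFrom && key < ownedTo;
        }
    }

    // Fast path: add up everything each shard physically holds, orphans included.
    static long naiveCount(List<Shard> shards) {
        long total = 0;
        for (Shard shard : shards) {
            total += shard.storedKeys.size();
        }
        return total;
    }

    // Filtered path: only count documents in a range the shard currently owns.
    static long filteredCount(List<Shard> shards) {
        long total = 0;
        for (Shard shard : shards) {
            for (int key : shard.storedKeys) {
                if (shard.owns(key)) {
                    total++;
                }
            }
        }
        return total;
    }

    // Two shards with 100 logical documents; keys 50..59 were migrated from
    // the first shard to the second, but the old copies were never cleaned up.
    static List<Shard> sampleCluster() {
        Shard first = new Shard(0, 50);
        Shard second = new Shard(50, 100);
        for (int key = 0; key < 50; key++) first.storedKeys.add(key);
        for (int key = 50; key < 100; key++) second.storedKeys.add(key);
        for (int key = 50; key < 60; key++) first.storedKeys.add(key); // orphans
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        List<Shard> cluster = sampleCluster();
        System.out.println("naive count:    " + naiveCount(cluster));
        System.out.println("filtered count: " + filteredCount(cluster));
    }
}
```

In this model the naive count reports 110 while the filtered count reports 100; the ten extra documents are the orphans left on the first shard, which is the kind of gap the mitigation strategies above are meant to close.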
October 21, 2014
by Francesca Krihely
· 4,720 Views
AppDynamics VS New Relic – Which Tool is Right For You? The Complete Guide
New Relic VS AppDynamics: All the performance features, integrations, installation procedures and pricing plans side by side to help you decide which tool to use.
When thinking about performance, AppDynamics and New Relic are the main modern tools that come to mind. Both trace back to the same company, Wily Technology, which also dealt with performance monitoring and was acquired by CA back in 2006, making way for new technology. New Relic is an anagram of Lew Cirne, its founder and CEO. AppDynamics was founded by Jyoti Bansal, who was a Lead Software Architect at the same Wily Technology, which was also founded by Lew. The main goal of this guide is to help you understand the similarities and differences between the two, so you can decide which one fits your company's needs.
Table of Contents
  • What is APM anyhow?
  • Supported Languages and Environments
  • Features - Backend Monitoring - Frontend & Mobile Monitoring
  • How to Solve the Errors You Find
  • Installation
  • Dashboard and Usage
  • Integrations and Plugins
  • Pricing
  • Conclusion
1. What is APM anyhow?
It's the only buzzword you'll read in this article, promise. Well, maybe also DevOps, but that's it. Application Performance Management has been around for a while, though it seems like many developers are not comfortable with it yet. APM provides us with analytics around our application's performance. At its core this means timing how long it takes to execute different areas in the code and complete transactions; this is done by instrumenting the code, monitoring logs, or including network / hardware metrics. On top of this basic concept, many different implementations exist, but there are a few basic truths we can agree on: A modern solution should monitor production environments, so its overhead (in terms of CPU and throughput) becomes very important. Also, it should display what the web/mobile end users are experiencing, which was not part of traditional APMs. 
What was once considered a luxury is becoming commonplace: Rapid new deployments in production mean more chances to introduce errors to your systems architecture, slow it down, and maybe even crash it. Let’s see what AppDynamics and New Relic have in store for us. 2. Supported Environments AppDynamics: Java, Scala, .NET, PHP, Node.js, iOS and Android; including your favorite flavour of database and cloud platform. New Relic: Java, Scala, .NET, PHP, Node.js, Ruby and Python. Supported databases, cloud platforms and other plug-ins are available here. We’ll dig in deeper with extensions later on. On the user monitoring front, iOS, Android, and JavaScript support is included with both tools. Bottom line: Main difference here is New Relic’s Ruby and Python support, and different levels of support for various platforms. 3. Features Both New Relic and AppDynamics can be broken down into 6 different products, all reporting to a main dashboard interface. Let’s split these to backend, mobile and frontend to do a quick runthrough over the main offerings. Backend Monitoring The bread and butter of performance management - reporting stats, graphs and insights of your applications performance under the hood. AppDynamics and NewRelic each offer 4 approaches here: Application Performance Management High level metrics with drill downs to code level data about how your application is performing. Must have metrics include transaction response time, error rate, throughput (Requests per Minute) on NewRelic and load (calls/min) on AppDynamics. AppDynamics dashboard on the left, New Relic on the Right You’ve probably noticed the main screen at AppDynamics include a map of the services the application is using with their call loads and health index while NewRelic displays a response time graph. This might be a way to signal each tool’s monitoring priorities and AppDynamics inclination to larger enterprises. 
Anyhow, enough with this dev tool psychology, but it's worth noting that a similar map is also available on New Relic:
New Relic's application map
One of the thorny issues here is alerting and reporting: with so many metrics and moving parts, it's hard to identify which matters most. Is it a low error rate? Responsiveness? Throughput? AppDynamics and New Relic each took a different approach to distilling these metrics into performance indicators. New Relic uses the Apdex score index, which uses a user-defined response time threshold T to imply end-user satisfaction. Simply put, they require you to manually set the threshold. Here's an example of the way this score is calculated:
Calculating Apdex, now sum this over all requests for a given time and you'll get the score
AppDynamics, on the other hand, doesn't believe in Apdex (as they explained in an article called "Apdex is Fatally Flawed"). They've come up with a solution of their own that automatically creates a dynamic baseline for the app's performance, which varies by time. For example, the definition of a slow transaction might vary under low and high loads on the system.
Bottom line: We're seeing that AppDynamics puts its priority on visualizing the stack from end to end, while New Relic is focused on bottom line response times. We've also seen the difference in alerting between Apdex and a dynamic baseline.
Server Monitoring
Another monitoring capability offered by both tools focuses on the hardware your servers run on: specs, CPU usage, memory utilization, disk I/O and network I/O.
AppDynamics on the left, New Relic on the right with some sample data
In this category, AppDynamics offers a few more features than New Relic, mostly around memory: heap size & utilization, garbage collection stats divided by generation, and memory leak detection.
AppDynamics Server Monitoring - Memory features
Bottom line: AppDynamics provides deeper insights into garbage collection and memory leak detection beyond the standard metrics. 
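Returning to the Apdex score discussed earlier in this section: the sketch below shows how such a score can be computed from a batch of response times. It assumes the commonly published Apdex definition (requests at or below T count as satisfied, requests above T but at or below 4T count as tolerating and contribute half, anything slower is frustrated and contributes nothing); it is not taken from New Relic's code or from the figure referenced above. The class name ApdexDemo, the method apdex and the sample timings are illustrative assumptions.

```java
public class ApdexDemo {
    /**
     * Computes an Apdex-style score in the range [0, 1].
     * Assumed definition: (satisfied + tolerating / 2) / total.
     */
    static double apdex(double[] responseTimesSeconds, double thresholdT) {
        if (responseTimesSeconds.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        int satisfied = 0;
        int tolerating = 0;
        for (double time : responseTimesSeconds) {
            if (time <= thresholdT) {
                satisfied++;              // fast enough: counts fully
            } else if (time <= 4 * thresholdT) {
                tolerating++;             // slow but tolerable: counts half
            }                             // slower than 4T: frustrated, counts zero
        }
        return (satisfied + tolerating / 2.0) / responseTimesSeconds.length;
    }

    public static void main(String[] args) {
        // Hypothetical sample: six requests, threshold T = 0.5 seconds.
        double[] samples = {0.2, 0.4, 0.5, 1.2, 1.9, 3.0};
        System.out.println("Apdex score: " + apdex(samples, 0.5));
    }
}
```

With these made-up samples, three requests are satisfied, two are tolerating and one is frustrated, so the score works out to (3 + 1) / 6. Changing T changes the score for the same traffic, which is the manual tuning step the article refers to.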
Database Monitoring Moving on to other components in your stack, the first thing that comes to mind is the database. Here we have a greater distinction with a richer AppDynamics dashboard looking into things like resource consumption, wait states, user sessions, specific query calls and more. On New Relic’s end the situation is a bit different with the Database dashboard as part of the basic APM product. Both tools have specific database monitoring metrics available through plugins to view data from external services (we’ll talk a bit more about integrations later). Either way, both native and external feature sets here might be different depending on the database you use. AppDynamics on the left with an Oracle DB, New Relic on the right with MySQL plugin Bottom line: Beyond the shared database metrics that go a bit deeper with AppDynamics, it’s worth looking into the features available for your specific database within each tool. Insights and Analytics This one is a wildcard, going beyond traditional APM and opening up to business intelligence metrics. Since both New Relic and AppDynamics already have access to the messages that go through your application, they’ve built this opt-in additional database to store your stats and enable you to query them. AppDynamics on the left, New Relic on the right with some sample data Bottom line: If you don’t have a solution that already allows you to process such queries, it might be about time to get one. Frontend & Mobile Monitoring Switching seats from the backend, lets take a quick look at what we’re getting on the Real-User Monitoring front. Both AppDynamics and New Relic have a product targeting browsers and a product targeting mobile with iOS & Android support. 
On mobile, the flagship features include insights on slowdowns and crashes that are filtered by geographic region, device, operating system and operator network:
AppDynamics on the left, New Relic on the right - Mobile Real-User Monitoring
With end-user browser analytics, it feels like having the visibility you get into your own browser's load times through Chrome dev tools, but for the actual users of your app:
AppDynamics on the left, New Relic on the right - Browser Real-User Monitoring
Bottom line: We're seeing again how New Relic's focus is on response time bottom lines while AppDynamics emphasizes the global picture.
4. How to Solve the Errors You Find
To go beyond the reporting and alerting of errors by AppDynamics and New Relic, many of our users add Takipi to their toolbox. This allows them not only to monitor server slowdowns and errors via New Relic or AppDynamics, but also to solve them using Takipi. Whenever a new exception is thrown or a log error occurs, Takipi captures it and shows you the variable state which caused it, across methods and machines. Takipi overlays this on the actual code which executed at the moment of error, so you can analyze the exception as if you were there when it happened. The dashboard links each error to a recorded instance of all involved code when the bug happened, and includes the variable values that caused it:
Takipi plays well with AppDynamics, and it also has a New Relic plug-in that displays an exception and log error dashboard:
Takipi for New Relic
Bottom line: It's one thing to identify what's slowing you down, but solving it is a whole different issue. Java or Scala developers? Whether you're using an APM tool or not, try Takipi.
5. Dashboard and Usage
To get a better feel for each tool's user experience and way of solving problems, I think it's probably best to browse through a video. 
But before that, it’s worth noting that AppDynamics uses a dashboard based on Flash, yes… Flash, turns out it’s still out there. This felt like a drawback but its probably the best Flash application I’ve seen out there: NewRelic at TypeSafe, a webinar that gives an overview of New Relic for Play (it’s a bit long but gives a nice overview if you browse through): Bottom line: I still can’t believe the Flash didn’t scare me off completely. And it was ok actually. Both tools provide a nice experience, but still feel a bit cluttered. 6. Installation SaaS/On-Premise: AppDynamics offers a few modes of operation - SaaS, on-premises and a hybrid approach, each with its installation instructions. New Relic is only available through SaaS. Agents: Monitoring your application becomes available through attaching language specific agents to your server. For example, with Java there are 2 possible ways to instrument your code with agents, either by using a Java agent or a native agent. New Relic and AppDynamics use a Java agent to collect the performance data they’re reporting. To gather the low-level data required not only to point to an error but to help solve it, Takipi uses a native agent. Code and configuration changes: On the Real-User Monitoring front, project and configurations changes including introducing a few dependencies would be needed if you’d like to add monitoring capabilities to your web or mobile app. This includes adding JS agents to your website and native mobile agents to your mobile application. Alerting: AppDynamics computes your response time thresholds by itself and might take some time to learn your system. New Relic relies on custom thresholds defined by you for its Apdex index. Bottom line: If you require an on-premise version, the answer is clearly AppDynamics. Otherwise, ease of installation is pretty much the same - mind the alerting though. 7. 
Integrations and Plugins
Branching out, AppDynamics and New Relic offer integrations and plug-ins to hundreds of services. Let's start with New Relic. We've already mentioned the Platform program earlier: a plug-in platform with 116 (last time I checked) plugins for services like Hadoop, RabbitMQ and Redis that stream their metrics so you can view them on New Relic. On the integrations side of the table, there's Connect, with 53 integrations with tools like Jira, HipChat, Takipi and PagerDuty. AppDynamics Exchange offers 100 plugins and is also an open platform for developers to build plugins.
Bottom line: New Relic has richer integrations that feel friendlier, but whichever way you go it's an individual decision to see where and how your tools of choice integrate better.
8. Pricing
Both tools have a free lite version with limited features across all products, including 24-hour data retention, with pro trials of 14-30 days. Pricing for AppDynamics pro programs is more individual: you'll have to contact sales to get a customized plan based on the number of agents you need. Mobile monitoring is a bit different, as each agent is priced per 5000 Monthly Active Users. With New Relic, pro account pricing starts at $199 per month per host ($149 on a yearly plan); this includes APM, Servers, Platform and Browser basics. Mobile monitoring costs $49 per month ($29 on a yearly plan) with 1 week of data retention. The Insights product starts from $250 per month for up to 75 million events.
Bottom line: New Relic's pricing caters more to startups and small-to-medium businesses, while AppDynamics' focus is on customizing solutions for enterprises. With that said, each tool ventures into the other's natural playground and this distinction is not as clear today as it was.
Conclusion
AppDynamics and New Relic are top of the line APM tools, each traditionally targeted at a different type of customer, from enterprises to startups. 
But as both are stepping forward to their IPOs after experiencing huge growth, the lines are getting blurred. The choice is not clear-cut, but you can't go wrong: on-premise = AppDynamics; otherwise, it's an individual call that depends on which tool better fits your stack (and which of all these features you actually think you're going to use). Originally posted on Takipi's blog
October 16, 2014
by Chen Harel
· 12,540 Views
MySQL Replication: 'Got fatal error 1236' Causes and Cures
Originally Written by Muhammad Irfan
MySQL replication is a core process for maintaining multiple copies of data, and replication is a very important aspect of database administration. In order to synchronize data between master and slaves you need to make sure that data transfers smoothly, and to do so you need to act promptly on replication errors so that data synchronization can continue. Here on the Percona Support team, we often help customers with broken-replication issues. In this post I'll highlight the most critical replication error, code 1236, along with its causes and cures. MySQL replication error "Got fatal error 1236" can be triggered by multiple causes and I will try to cover all of them.
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event 'binlog.000201' at 5480571
This is a typical error on the slave server(s). It reflects a problem around max_allowed_packet size. max_allowed_packet applies to a single SQL statement sent to the MySQL server, and to a binary log event sent from master to slave. This error usually occurs when you have a different max_allowed_packet size on the master and slave (i.e. the master's max_allowed_packet size is greater than the slave's). When the MySQL master server tries to send a bigger packet than is allowed on the slave server, the slave server fails to accept it, hence the error. In order to alleviate this issue, please make sure to have the same value for max_allowed_packet on both slave and master. You can read more about max_allowed_packet here. This error usually occurs when updating a huge number of rows on the master and the event doesn't fit into the slave's max_allowed_packet size because it is lower than the master's. This usually happens with "LOAD DATA INFILE" or "INSERT .. SELECT" queries. 
In my experience, this can also be caused by application logic that generates a huge INSERT with junk data. Take into account that a new variable, slave_max_allowed_packet, was introduced in MySQL 5.6.6 and later; it controls the maximum packet size for the replication threads. It overrides the max_allowed_packet variable on the slave and its default value is 1 GB. In the post "max_allowed_packet and binary log corruption in MySQL," my colleague Miguel Angel Nieto explains this error in detail.
Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'
This error occurs when the binary log that the slave server requires for replication no longer exists on the master database server. In one scenario, your slave server is stopped for some reason for a few hours or days, and when you resume replication on the slave it fails with the above error. When you investigate you will find that the master server no longer has the binary logs which the slave server needs to pull in order to synchronize data. Possible reasons for this include the master server having expired binary logs via the system variable expire_logs_days, someone having manually deleted binary logs from the master via the PURGE BINARY LOGS command or via 'rm -f', or perhaps a cron job which archives older binary logs to reclaim disk space. So, make sure the required binary logs always exist on the master server; you can update your procedures to keep the binary logs that the slave server requires by monitoring the "Relay_Master_Log_File" variable from SHOW SLAVE STATUS output. Moreover, if you have set expire_logs_days in my.cnf, old binlogs expire automatically and are removed. This means that when MySQL opens a new binlog file, it checks the older binlogs and purges any that are older than the value of expire_logs_days (in days). 
Percona Server added a feature to expire logs based on the total number of files used instead of the age of the binlog files. In that configuration, a spike of traffic could cause binlogs to disappear sooner than you expect. For more information check Restricting the number of binlog files. In order to resolve this problem, the only clean solution I can think of is to re-create the slave server from a master server backup or from another slave in the replication topology.
Got fatal error 1236 from master when reading data from binary log: 'binlog truncated in the middle of event; consider out of disk space on master; the first event 'mysql-bin.000525' at 175770780, the last event read from '/data/mysql/repl/mysql-bin.000525' at 175770780, the last byte read from '/data/mysql/repl/mysql-bin.000525' at 175771648.'
Usually, this is caused by sync_binlog != 1 on the master server, which means binary log events may not have been synchronized to disk. There might be a committed SQL statement or row change (depending on your replication format) on the master that did not make it to the slave because the event is truncated. The solution is to move the slave thread to the next available binary log and initialize the slave thread with the first available position in that binary log, as below:
mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000526', MASTER_LOG_POS=4;
[ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position; the first event 'mysql-bin.010711' at 55212580, the last event read from '/var/lib/mysql/log/mysql-bin.000711' at 4, the last byte read from '/var/lib/mysql/log/mysql-bin.010711' at 4.', Error_code: 1236
I suspect the master server crashed or rebooted and hence binary log events were not synchronized to disk. This usually happens when sync_binlog != 1 on the master. 
You can investigate this by inspecting the binary log contents as below:
$ mysqlbinlog --base64-output=decode-rows --verbose --verbose --start-position=55212580 mysql-bin.010711
You will find that this is the last position in the binary log and the end of the binary log file. This issue can usually be fixed by moving the slave to the next binary log. In this case it would be:
mysql> CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000712', MASTER_LOG_POS=4;
This will resume replication. To avoid corrupted binlogs on the master, enabling sync_binlog=1 on the master helps in most cases. sync_binlog=1 will synchronize the binary log to disk after every commit; it makes MySQL perform an fsync on the binary log in addition to the fsync done by InnoDB. As a reminder, this has a cost, since the binary log write is synchronized to disk after every commit. On the other hand, the overhead of sync_binlog=1 can be minimal or negligible if the disk subsystem is SSD along with battery-backed cache (BBU). You can read more about this here in the manual. sync_binlog is a dynamic option that you can enable on the fly. Here's how:
mysql-master> SET GLOBAL sync_binlog=1;
To make the change persistent across reboots, you can add this parameter to my.cnf. As a side note, along with replication fixes, it is always a good idea to make sure your replica is in sync with the master and to validate data between master and slaves. Fortunately, Percona Toolkit has tools for this purpose: pt-table-checksum & pt-table-sync. Before checking for replication consistency, be sure to check the replication environment and then, later, to sync any differences.
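The four variants of error 1236 covered in this post can be told apart by the text that follows "reading data from binary log". As a rough triage aid, the sketch below maps a Last_IO_Error string to one of those four causes by substring matching. It is an assumption-laden illustration: the class Error1236Triage, the enum names and the matching strings are hypothetical helpers based only on the messages quoted above, and other server versions may word these messages differently.

```java
import java.util.Locale;

public class Error1236Triage {
    /** The four causes discussed in the post, plus a fallback. */
    enum Cause {
        PACKET_TOO_LARGE,          // max_allowed_packet mismatch between master and slave
        BINLOG_MISSING_ON_MASTER,  // required binary log was purged or expired
        BINLOG_TRUNCATED,          // event truncated, often with sync_binlog != 1
        IMPOSSIBLE_POSITION,       // slave requests a position past the end of the binlog
        UNKNOWN
    }

    // Classifies a Last_IO_Error message by looking for phrases quoted in the post.
    static Cause classify(String lastIoError) {
        String message = lastIoError.toLowerCase(Locale.ROOT);
        if (!message.contains("1236")) {
            return Cause.UNKNOWN; // not the error this post is about
        }
        if (message.contains("exceeded max_allowed_packet")) {
            return Cause.PACKET_TOO_LARGE;
        }
        if (message.contains("could not find first log file name")) {
            return Cause.BINLOG_MISSING_ON_MASTER;
        }
        if (message.contains("binlog truncated in the middle of event")) {
            return Cause.BINLOG_TRUNCATED;
        }
        if (message.contains("impossible position")) {
            return Cause.IMPOSSIBLE_POSITION;
        }
        return Cause.UNKNOWN;
    }

    public static void main(String[] args) {
        String sample = "Got fatal error 1236 from master when reading data from binary log: "
                + "'Could not find first log file name in binary log index file'";
        System.out.println(classify(sample));
    }
}
```

A helper like this could sit in a monitoring script that reads SHOW SLAVE STATUS output and points the on-call engineer at the matching remedy; the remedies themselves should still be applied by hand, as described above.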
October 15, 2014
by Peter Zaitsev
· 40,801 Views
JSR 199 - Compiler API
JSR 199 provides the compiler API for compiling Java code from inside another Java program. The following are the important classes and interfaces provided to facilitate compilation from a Java program.
  • JavaFileObject - Represents a compilation unit, typically a class source.
  • SimpleJavaFileObject - A simple implementation of the methods defined in JavaFileObject.
  • DiagnosticCollector - Collects compilation errors and warnings into a list of Diagnostic objects.
  • Diagnostic - Reports the kind of problem and details such as line number, character position and error reason.
  • JavaFileManager - Works with Java source and class files.
  • JavaCompiler - The compiler instance used to compile the compilation units.
  • CompilationTask - A nested interface of JavaCompiler that compiles and returns the status, with diagnostics, when its call method is invoked.
Where to start
To compile Java code, we need the Java source. The source can be a physical file on disk or a string inside the program. Using the source, we need to create an instance of type JavaFileObject.
Using String literal
Create a class which implements JavaFileObject; here I am using SimpleJavaFileObject. 
We need to create the path URI of the class file.

    package com.test;

    import java.io.IOException;
    import java.net.URI;
    import javax.tools.SimpleJavaFileObject;

    public class SampleSource extends SimpleJavaFileObject {
        private String source;

        protected SampleSource(String name, String code) {
            super(URI.create("string:///" + name.replaceAll("\\.", "/") + Kind.SOURCE.extension), Kind.SOURCE);
            this.source = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) throws IOException {
            return source;
        }
    }

Now, create the instance of JavaFileObject and, from it, the compilation units (a collection of JavaFileObject):

    String str = "package com.test;" + "\n" + "public class Test {" + "\npublic static void test() {" + "\nSystem.out.println(\"Comiler API Test\")-;" + "" + "\n}" + "\n}";
    SimpleJavaFileObject fileObject = new SampleSource("com.test.Test", str);
    JavaFileObject javaFileObjects[] = new JavaFileObject[] { fileObject };
    Iterable<? extends JavaFileObject> compilationUnits = Arrays.asList(javaFileObjects);

From File System
If the source is in a physical location, create the compilation units like this:

    File[] files = new File[] { file1, file2, file3, file4 };
    Iterable<? extends JavaFileObject> units = fileManager.getJavaFileObjectsFromFiles(Arrays.asList(files));

Create a JavaFileManager
Now we will see how to create a file manager.

    // getJavaFileObjectsFromFiles is declared on StandardJavaFileManager
    StandardJavaFileManager fileManager = compiler.getStandardFileManager(diagnostics, Locale.getDefault(), Charset.defaultCharset());

To get the file manager, we need:
  • diagnostics - A DiagnosticCollector of JavaFileObject
  • locale - The locale of the compilation
  • charset - The charset to be used
Compiler
Get the compiler instance using ToolProvider. Finally, create the CompilationTask from the compiler instance using the diagnostics, file manager and compilation units (and, optionally, a writer and compilation options). 
JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); CompilationTask task = compiler.getTask(null, fileManager, diagnostics, compilationOptionss, null, compilationUnits); The argument required to get the CompilationTask are out - A writer which writes the output of the compiler. Defaults to System.err if null listener - A diagnostic listener, the errors or warning can be accessed using. options - Compiler options (Ex : -d, like we give in command line using javac ) classes - Name of the classes to be processed compilationUnits - List of compilation units Compile Finally, call the method to compile. This method to be called only once otherwise it throws IllegalStateException on multiple calls. Once compiled, returns true for successful compilation otherwise false. We need to look the diagnosticCollector to get the error/warning details. boolean status = task.call(); All together Putting all together. public static void main(String[] args) { String str = "package com.test;" + "\n" + "public class Test {" + "\npublic static void test() {" + "\nSystem.out.println(\"Comiler API Test\")-;" + "" + "\n}" + "\n}"; SimpleJavaFileObject fileObject = new SampleSource("com.test.Test", str); JavaFileObject javaFileObjects[] = new JavaFileObject[] { fileObject }; Iterable compilationUnits = Arrays .asList(javaFileObjects); Iterable compilationOptionss = Arrays.asList(new String[] { "-d", "classes" }); DiagnosticCollector diagnostics = new DiagnosticCollector(); JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); JavaFileManager fileManager = compiler.getStandardFileManager( diagnostics, Locale.getDefault(), Charset.defaultCharset()); CompilationTask task = compiler.getTask(null, fileManager, diagnostics, compilationOptionss, null, compilationUnits); boolean status = task.call(); if(!status) { System.out.println("Found errors in compilation"); int errors = 1; for(Diagnostic diagnostic : diagnostics.getDiagnostics()) { printError(errors, diagnostic); errors++; } 
} else System.out.println("Compilation sucessfull"); try { fileManager.close(); } catch (IOException e){} } public static void printError(int number,Diagnostic diagnostic) { System.out.println(); System.out.print(diagnostic.getKind()+" : "+number+" Type : "+diagnostic.getMessage(Locale.getDefault())); System.out.print(" at column : "+diagnostic.getColumnNumber()); System.out.println(" Line number : "+diagnostic.getLineNumber()); System.out.println("Source : "+diagnostic.getSource()); } Output Output with an error will be (because of an hyphen in System.out.println in main method of Test) Found errors in compilation ERROR : 1 Type : illegal start of expression at column : 40 Line number : 4 Source : com.test.SampleSource[string:///com/test/Test.java] ERROR : 2 Type : not a statement at column : 39 Line number : 4 Source : com.test.SampleSource[string:///com/test/Test.java] To read more about JSR 199, follow the official link. Happy Learning!!!! Read more articles at blog
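The steps described in this article can also be condensed into a single helper. The following is only a minimal sketch, not the author's code: the class name InMemoryCompileSketch, the nested StringSource class and the countErrors method are hypothetical names chosen for illustration, and it assumes Java 9 or later (for List.of) running on a JDK rather than a bare JRE. Class files are written to a temporary directory, so the -d target does not need to exist beforehand.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

final class InMemoryCompileSketch {

    /** Hypothetical helper: a compilation unit whose source text lives in a String. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            // "com.test.Test" becomes string:///com/test/Test.java
            super(URI.create("string:///" + className.replace('.', '/')
                    + JavaFileObject.Kind.SOURCE.extension), JavaFileObject.Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles one class from a String and returns how many ERROR diagnostics were reported. */
    static long countErrors(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            // getSystemJavaCompiler() returns null when no compiler is available
            throw new IllegalStateException("No system Java compiler found; a JDK is required");
        }
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        try {
            // Assumption for this sketch: class files go to a throwaway temp directory
            Path outputDir = Files.createTempDirectory("compiled-classes");
            // try-with-resources closes the file manager for us; null charset = platform default
            try (StandardJavaFileManager fileManager =
                         compiler.getStandardFileManager(diagnostics, Locale.getDefault(), null)) {
                JavaCompiler.CompilationTask task = compiler.getTask(
                        null,                                   // out: null means System.err
                        fileManager,
                        diagnostics,
                        List.of("-d", outputDir.toString()),    // compiler options
                        null,                                   // no classes for annotation processing
                        List.of(new StringSource(className, code)));
                task.call();                                    // may only be called once per task
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return diagnostics.getDiagnostics().stream()
                .filter(d -> d.getKind() == Diagnostic.Kind.ERROR)
                .count();
    }

    public static void main(String[] args) {
        String broken = "package com.test;\n"
                + "public class Test {\n"
                + "public static void test() {\n"
                + "System.out.println(\"Comiler API Test\")-;\n"
                + "}\n"
                + "}";
        System.out.println("Errors reported: " + countErrors("com.test.Test", broken));
    }
}
```

Returning a count keeps the sketch short; in real code it would usually be more useful to return the Diagnostic list itself so the caller can print line and column details as the printError method above does.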
October 15, 2014
by Veeresham Kardas
· 6,473 Views
jQuery Mobile Tutorial: User Registration, Login and Logout Screens for the Meeting Room Booking App
in this jquery mobile tutorial we will create the screens that will handle user registration, login and logout in a real-world meeting room booking application. this article is part of a series of mobile application development tutorials that i have been publishing on my blog jorgeramon.me. if you are new to this series, i recommend that you read its first part, as well as this mobile ui patterns article where i provide a flowchart describing the user registration, login and logout screens in a mobile application. we will use this chart as a guide for this article. here’s a screenshot: in this part of the tutorial we will only create the static html for the screens. in future articles we will implement the programming logic that makes the pages work. the first step we are going to take is to set up a jquery mobile project for the app. how to set up a jquery mobile project while you can use mobile sdks such as kendo ui mobile and intel xdk to create, debug and deploy jquery mobile apps, in this tutorial i will show you how to create a simple jquery mobile project without using the facilities provided by those sdks. i think that it’s important to understand how you can create this type of project from scratch, and how the different pieces in the project work together in an app. the project’s directories and files we need to pick a directory in our development workstations where we will place the project’s files. in my case i named that directory “apps”. in that directory, we will create a root directory for the application, which we will name “conf-rooms”. make sure that this directory is set up so it can be accessed from your local web server. under “conf-rooms” we will create a “css” directory, where we will place the css assets of the project; and an “img” directory for the images that we will use. at the same level of the “apps” directory, we will create a “lib” directory. 
this is where we will place jquery mobile and any other libraries that our application will use. you also need to set up this directory so it can be accessed from your local web server. on my workstation the directories look as depicted below: now is a good time to download the jquery mobile and jquery libraries from their respective websites, and place them in the “jqm” and “jquery” directories, all under the “lib” directory. this is how the files look on my workstation: how jquery mobile works here is a short overview of jquery mobile for those who aren’t very familiar with it yet. as its documentation clearly explains, jquery mobile is a unified user interface system with the following characteristics: it works seamlessly across all popular mobile device platforms. it uses jquery and jquery ui as its foundations. it has a lightweight codebase built on progressive enhancement. it has a flexible and easily themeable design. an attribute that differentiates jquery mobile from other frameworks is that it targets a wide variety of mobile browsers. the reason this coverage is possible has to do with the way jquery mobile works. jquery mobile works by applying css and javascript enhancements to html pages built with clean, semantic html. the use of semantic html ensures compatibility with most web-enabled devices. the techniques applied by the framework to an html page transform the semantic page into a rich and interactive experience. we call these changes progressive enhancements, as they are applied progressively to the page, taking advantage of the capabilities of the browser on the web-enabled device. the enhancements result in pages that provide a great user experience on the latest mobile browsers and degrade gracefully on less capable browsers, without losing their intrinsic functionality.
in addition, the framework provides support for screen readers and other assistive technologies through a tight integration with the web accessibility initiative – accessible rich internet applications suite (wai-aria) technical specification. creating the landing screen the first screen that we will create is the landing screen. this screen will come up when users launch our application. as reflected in the flowchart at the beginning of this article, the landing screen is the door to all the areas of the app, and it requires that users log in before navigating any further. in the “wireframes for signing in and signing up” section of the third part of this tutorial we created the following mockup for this screen. let’s create an empty index.html file in the “conf-rooms” directory, and add a jquery mobile page template to the file as follows: book it welcome! existing users sign in don't have an account? sign up before we step through this code i want you to check out this file in a mobile browser or simulator. the result should look like this screenshot: what do you think? back in the index.html file, in the head section we have references to the jquery and jquery mobile libraries. double check that yours are pointing to the correct directories in your workstation. how to use a custom theme in jquery mobile now i want to direct your attention to the following lines: these lines mean that we are using a custom jquery mobile theme that resides in the “conf-room1.min.css” file. i created this file using jquery mobile’s theme roller. we will use this theme to give our app a look different from the standard jquery mobile themes. you can download the theme using this link. after downloading the zipped theme files, we will go to the “css” directory, create a “css/themes/1” directory and place the unzipped theme files there.
when done, the “css” directory should look like this: in the head section of the index.html file we also have this code: app.css is the css file where we will place any additional custom styles that we will use in the app. for the moment, we will add the following code to the “app.css” file: /* change html headers color */ h1,h2,h3,h4,h5 { color:#0071bc; } h2.mc-text-danger, h3.mc-text-danger { color:red; } /* change border radius of icon buttons */ .ui-btn-icon-notext.ui-corner-all { -webkit-border-radius: .3125em; border-radius: .3125em; } /* change color of jquery mobile page headers */ .ui-title { color:#fff; } /* center-aligned text */ .mc-text-center { text-align:center; } /* top margin for some elements */ .mc-top-margin-1-5 { margin-top:1.5em; } these are just a few cosmetic changes that will enhance the look of the app. notice that i prefixed non-jquerymobile classes with the characters “mc-” to avoid potential collisions with jquerymobile’s classes. the remaining lines in the head section of index.html are the references to the jquery and jquery mobile libraries. as i suggested earlier, make sure that yours are pointing to the correct directories in your project. let’s move on to the body section of the “index.html” file. there you will find the standard jquery mobile page template with header and main divs: book it welcome! existing users sign in don't have an account? sign up we have decorated the “header” div with the data-theme="c" attribute, which gives it the nice purple background color that we defined in the custom theme: in the “main” div we are using a couple of links to the sign in and sign up screens respectively. the links point to the sign-in.html and sign-up.html files that we will create in a few minutes. these links are decorated with the jquery mobile ui-btn, ui-btn-b and ui-corner-all classes, which make them look like buttons: this is all we need to do in the landing page for the moment. let’s move on to the log in screen.
creating a log in screen with jquery mobile here’s the log in screen’s mockup that we built in the third part of this tutorial: we will use the log in screen to capture the user’s credentials and validate them against the application’s user accounts database. if validation succeeds, we will direct users to a “main menu” screen that we will create in an upcoming tutorial. let’s create an empty sign-in.html file in the project’s directory. in the file, we will write the following code: book it sign in email address password remember me submit can't access your account? login failed did you enter the right credentials? ok if you open this sign-in.html with a mobile browser or emulator, you will see something like this: the head section of the html document is similar to the index.html file we created a few minutes ago, with the exception of the document’s title. no need to explain much there. in the “main” section of the jquery mobile page that we added to the file, we dropped a few controls that will allow us to capture the user’s email and password, along with a checkbox that will let us know when the user wants the app to remember their credentials: sign in email address password remember me submit can't access your account? we also added a link that will allow users to initiate the password reset process if they have problems logging in. you will also notice that the “submit” link points to the “dlg-invalid-credentials” anchor defined in the same jquery mobile page. this link is decorated with the data-rel="popup", data-transition="pop" and data-position-to="window" attributes. when we do this, we are telling jquery mobile to open the link to the element with id="dlg-invalid-credentials" as a popup dialog, using a “pop” transition, and center the element relative to the document’s window. here’s the html for the popup: login failed did you enter the right credentials?
ok notice that the “dlg-invalid-credentials” div is decorated with the data-role="popup" attribute, signaling to jquery mobile to apply popup styles to this div. if you click or tap the “submit” button, you will see the “invalid credentials” popup: one last thing on this screen. we have the popup linked directly to the “submit” button for testing purposes. in an upcoming part of this tutorial we will add programming logic that will activate the popup only when the login fails. for the moment we are only concerned with creating the html code for the pages and making sure that the jquery mobile enhancements work on them. creating an account locked screen with jquery mobile many business applications use an account locked feature as a measure to increase the app’s security. we will use this feature in our app, and this means that we need to create an account locked screen. the purpose of the screen is to notify the user that their account is locked. we will define under which conditions an account will be locked through programming logic that we will add in an upcoming chapter of this tutorial. let’s create an empty account-locked.html file, and drop the following code in it: back app name your account is locked please contact the helpdesk to resolve this issue. the file should look like the screenshot below when viewed with a mobile browser or emulator: the html code for this screen is very similar to that in the prior two screens, with the exception of one element that i want you to pay attention to: back app name the header section of the jquery mobile page we just created contains a link to the sign-in.html file. when we decorate it with the ui-btn-left, ui-btn, ui-btn-icon-notext, ui-corner-all and ui-icon-back classes, we are giving the link the appearance of a toolbar button, just like this: the data-rel="back" attribute causes any taps on the anchor to mimic a “back button”, going back one history entry and ignoring the anchor’s default href.
you can read more about navigation and linking in jquery mobile by visiting jquery mobile’s navigation documentation. you should also visit the jquery mobile buttons guide to learn how to create buttons. creating a sign up page with jquery mobile when users tap on the landing screen’s “sign up” button, they will open the sign up screen. this is where we will capture the user’s personal information so we can create an account for them. remember that our mockup of the sign up screen looks like this: let’s create an empty sign-up.html file and add the following code to it: book it sign up first name last name email address password confirm password submit almost done... confirm your email address we sent you an email with instructions on how to confirm your email address. please check your inbox and follow the instructions in the email. ok the file should look as depicted below when you open it with a mobile browser or emulator: the only difference from the mockup is that we are using a “submit” button at the bottom of the screen, instead of a “done” button in the toolbar. when you examine the html code, you will find that the “submit” button is wired to the “dlg-sign-up-sent” popup: almost done... confirm your email address we sent you an email with instructions on how to confirm your email address. please check your inbox and follow the instructions in the email. ok if you tap on the button, the popup will become visible: we will use this popup to notify the users that we have sent them a message asking them to confirm their email address. the message will contain a link to a webpage where users will need to re-enter the email address used to create their account in the app. with this step we are trying to make sure that it was a human with a valid email inbox who created the account. back in the popup’s html code, notice how the “ok” button links back to the sign in screen. you should be able to confirm that this link works when you tap the button.
creating a password reset screen with jquery mobile the app’s sign in screen has a “can’t access your account?” link that helps users initiate the password reset workflow of the app. the first step of this workflow is to present the “begin password reset” screen to the user. we will use this screen to capture the user’s email address. if we find this email address in the user accounts database of the app, we will email the user a provisional password. next, we will activate the “end password reset” screen, where the user will need to enter the provisional password and a new password of their choosing. the picture below illustrates this process. let’s create an empty begin-password-reset.html file in the project’s directory. we will write the following code in the file: book it password reset enter your email address submit password reset check your inbox we sent you an email with instructions on how to reset your password. please check your inbox and follow the instructions in the email. ok this is how the screen should look when viewed on a mobile browser or emulator: there is nothing new in the html code of this screen. we wired the “submit” button so when a user taps it, the embedded “dlg-pwd-reset-sent” popup will become active: we did this for testing purposes. remember that we will add the programming logic that activates these popups in upcoming chapters of this tutorial. when a user taps the popup’s “ok” button, the application will navigate to the “end password reset” screen, which we will create next. the end password reset screen this is the screen where the user will enter the provisional password we sent them via email, along with a new password of their choosing. to create this screen we will add an empty “end-password-reset.html” file to the project. here’s the code that goes in the file: book it reset password provisional password new password confirm new password submit done your password was changed.
ok the screen should look like the picture below when viewed with a mobile browser or emulator: we wired the “submit” button so it activates the embedded “dlg-pwd-changed” popup: this popup simply tells the user that their password was changed. tapping the “ok” button will make the app navigate back to the sign in screen, where the user can sign in with the new password. summary and next steps this concludes our first phase of work on the user registration, login and logout screens of the jquery mobile version of the app. i will emphasize again that in this phase we are not adding programming logic to the screens. we are simply creating a jquery mobile page for each screen and making sure that the visual elements within the screens adhere to the mockups that we created in previous chapters of this tutorial, as well as to the ui patterns flowchart that i mentioned at the beginning of this article. while we’ve made significant progress with the app at this point, it’s fair to say that we are just getting started. we are still missing the programming for this article’s screens, as well as the jquery mobile pages and programming for the screens that will allow users to browse and reserve meeting rooms, which is why we created the app in the first place. in the next chapter of this tutorial we will get started with the programming of the user profile screens we just created. don’t forget to sign up for my mailing list so you can be among the first to know when i publish the next update. stay tuned don’t miss any articles. get new articles and updates sent free to your inbox.
October 13, 2014
by Jorge Ramon
· 78,805 Views · 4 Likes
Neo4j: COLLECTing Multiple Values (Too Many Parameters for Function ‘Collect’)
One of my favourite functions in Neo4j’s cypher query language is COLLECT which allows us to group items into an array for later consumption. However, I’ve noticed that people sometimes have trouble working out how to collect multiple items with COLLECT and struggle to find a way to do so.

Consider the following data set:

    create (p:Person {name: "Mark"})
    create (e1:Event {name: "Event1", timestamp: 1234})
    create (e2:Event {name: "Event2", timestamp: 4567})
    create (p)-[:EVENT]->(e1)
    create (p)-[:EVENT]->(e2)

If we wanted to return each person along with a collection of the event names they’d participated in we could write the following:

    $ MATCH (p:Person)-[:EVENT]->(e)
    > RETURN p, COLLECT(e.name);
    +--------------------------------------------+
    | p                    | COLLECT(e.name)     |
    +--------------------------------------------+
    | Node[0]{name:"Mark"} | ["Event1","Event2"] |
    +--------------------------------------------+
    1 row

That works nicely, but what if we want to collect the event name and the timestamp but don’t want to return the entire event node? An approach I’ve seen a few people try during workshops is the following:

    MATCH (p:Person)-[:EVENT]->(e)
    RETURN p, COLLECT(e.name, e.timestamp)

Unfortunately this doesn’t compile:

    SyntaxException: Too many parameters for function 'collect' (line 2, column 11)
    "RETURN p, COLLECT(e.name, e.timestamp)"
               ^

As the error message suggests, the COLLECT function only takes one argument so we need to find another way to solve our problem.

One way is to put the two values into a literal array which will result in an array of arrays as our return result:

    $ MATCH (p:Person)-[:EVENT]->(e)
    > RETURN p, COLLECT([e.name, e.timestamp]);
    +----------------------------------------------------------+
    | p                    | COLLECT([e.name, e.timestamp])    |
    +----------------------------------------------------------+
    | Node[0]{name:"Mark"} | [["Event1",1234],["Event2",4567]] |
    +----------------------------------------------------------+
    1 row

The annoying thing about this approach is that as you add more items you’ll forget in which position you’ve put each bit of data so I think a preferable approach is to collect a map of items instead:

    $ MATCH (p:Person)-[:EVENT]->(e)
    > RETURN p, COLLECT({eventName: e.name, eventTimestamp: e.timestamp});
    +--------------------------------------------------------------------------------------------------------------------------+
    | p                    | COLLECT({eventName: e.name, eventTimestamp: e.timestamp})                                         |
    +--------------------------------------------------------------------------------------------------------------------------+
    | Node[0]{name:"Mark"} | [{eventName -> "Event1", eventTimestamp -> 1234},{eventName -> "Event2", eventTimestamp -> 4567}] |
    +--------------------------------------------------------------------------------------------------------------------------+
    1 row

During the Clojure Neo4j Hackathon that we ran earlier this week this proved to be a particularly pleasing approach as we could easily destructure the collection of maps in our Clojure code.
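To make the difference between the two result shapes concrete on the consuming side, here is a small Java sketch. It does not use a Neo4j driver: the class CollectShapes and its methods are hypothetical, and the sample lists are hand-built stand-ins that assume a client hands back the collected values as nested lists or as a list of maps.

```java
import java.util.List;
import java.util.Map;

final class CollectShapes {

    /** Stand-in for the result of COLLECT([e.name, e.timestamp]). */
    static List<List<Object>> sampleLists() {
        return List.of(
                List.<Object>of("Event1", 1234L),
                List.<Object>of("Event2", 4567L));
    }

    /** Stand-in for the result of COLLECT({eventName: e.name, eventTimestamp: e.timestamp}). */
    static List<Map<String, Object>> sampleMaps() {
        return List.of(
                Map.<String, Object>of("eventName", "Event1", "eventTimestamp", 1234L),
                Map.<String, Object>of("eventName", "Event2", "eventTimestamp", 4567L));
    }

    /** Array-of-arrays shape: the caller has to remember that position 1 holds the timestamp. */
    static long timestampFromLists(List<List<Object>> events, int row) {
        return (Long) events.get(row).get(1);
    }

    /** Map shape: the field is looked up by name, so adding more fields cannot shift it. */
    static long timestampFromMaps(List<Map<String, Object>> events, int row) {
        return (Long) events.get(row).get("eventTimestamp");
    }

    public static void main(String[] args) {
        System.out.println(timestampFromLists(sampleLists(), 0));
        System.out.println(timestampFromMaps(sampleMaps(), 1));
    }
}
```

Both accessors return the same data; the map version simply keeps working if another value is later added to the collected item.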
October 13, 2014
by Mark Needham
· 10,339 Views