DZone
Bozhidar Bozhanov

Software Engineer at easyProperty

Sofia, BG

Joined May 2006

http://techblog.bozho.net

About

Senior Java developer, one of the top Stack Overflow users, fluent with Java and Java technology stacks – Spring, JPA, Java EE. Creator of https://logsentinel.com and http://computoser.com . Worked on Ericsson projects, Bulgarian e-government projects, large-scale recruitment platforms, and cloud navigation synchronization. Member of the jury of the International Olympiad in Linguistics and the program committee of the North American Computational Linguistics Olympiad. @bozhobg

Stats

Reputation: 2088
Pageviews: 2.0M
Articles: 34
Comments: 74

Articles

Always Name Your Thread Pools
Make life a little easier for the non-Spring folk.
March 24, 2021
· 9,352 Views · 3 Likes
OpenSSL Key and IV Padding
OpenSSL is an omnipresent tool when it comes to encryption, but we're used to native Java implementations of cryptographic primitives.
October 30, 2020
· 5,386 Views · 1 Like
A Disk-Backed ArrayList
Learn more about disk-backed ArrayLists and Java performance.
Updated January 2, 2020
· 12,871 Views · 1 Like
JKS: Extending a Self-Signed Certificate
Learn more about extending self-signed certificates with the Java keystore.
Updated May 22, 2019
· 13,936 Views · 2 Likes
Multiple Cache Configurations With Caffeine and Spring Boot
Caching is key.
May 7, 2019
· 23,003 Views · 4 Likes
Writing Large JSON Files With Jackson
Let's take a look at how to easily write a large amount of JSON data to a file using everyone's favorite JSON library, Jackson! Click here to learn more.
August 20, 2018
· 29,986 Views · 8 Likes
Implementing White-Labelling
Why reinvent the wheel? Many companies offer others' applications under their own branding, and white-labeling your app for large clients is pretty straightforward.
July 20, 2018
· 10,185 Views · 4 Likes
User Authentication Best Practices Checklist
All sites now have the ability to provide authentication. Yet, it's still pretty tricky for devs to implement. Read on for best practices when developing authentication.
April 18, 2018
· 33,733 Views · 8 Likes
OWASP Dependency-Check Maven Plugin: A Must-Have
Though it's tough to admit, this developer hadn't heard of this plugin until recently. If you're in the same boat, read on to get an overview of this great security tool.
December 30, 2017
· 18,295 Views · 5 Likes
Enabling Two-Factor Authentication for Your Web Application
In this article, you'll learn both why two-factor authentication works as a security protocol, and how to use it in your web applications.
Updated December 6, 2017
· 55,939 Views · 10 Likes
Why I Still Prefer Eclipse Over IntelliJ IDEA
Though IDEA has grown in popularity, let's see what combination of factors makes one dev still prefer Eclipse as his IDE, with a focus on JVM language projects.
November 15, 2017
· 90,056 Views · 44 Likes
Basic API Rate-Limiting
In this post, we take a look at how to implement rate limiting for your own API at the application level. Interested? Read on for more info!
July 19, 2017
· 66,113 Views · 10 Likes
Custom Audit Log With Spring and Hibernate
If you can't use Envers to automatically audit your database operations with Hibernate, you can use event listeners instead. Here's how.
July 25, 2016
· 51,579 Views · 8 Likes
Using Spring-Managed Event Listeners in Hibernate
Learn from Bozhidar Bozhanov of TomTom about a new and better way to use listeners in Hibernate (because the old way is broken).
July 17, 2016
· 20,008 Views · 4 Likes
Installing a Java Application as a Windows Service
Ever wondered how to install a Java application as a Windows service? This article shows you how to do it.
June 27, 2016
· 43,436 Views · 20 Likes
Setting Up Distributed Infinispan Cache with Hibernate and Spring
A pretty typical setup: a Spring and Hibernate application that requires a distributed cache. But it turns out to be not so trivial to set up.
May 26, 2016
· 16,465 Views · 3 Likes
Concurrency and How to Avoid It
The issue of multiple threads or processes trying to modify the same data at the same time is one as old as programming itself. Is it possible to avoid the situation altogether? Or are we left with the current, age-old implements?
April 29, 2016
· 68,774 Views · 36 Likes
The 12-Factor App: A Java Developer's Perspective
Web app development involves just: codebase, dependencies, config, backing services, build/release/run, processes, port binding, concurrency...
August 26, 2015
· 94,022 Views · 20 Likes
Blue-Green Deployment With a Single Database
A blue-green deployment is a way to roll out incremental updates to your production stack without downtime and without the complexity of properly handling rolling updates (including rollback). I don’t need to repeat this wonderful explanation or Martin Fowler’s original piece, but I’ll extend on them.

A blue-green deployment is one where there is an “active” and a “spare” set of servers: the active set runs the current version, and the spare is ready to run any newly deployed version. “Active” and “spare” are slightly different from “blue” and “green”, because one set is always “blue” and one is always “green”, while the “active” and “spare” labels change.

On AWS, for example, you can script the deployment by having two child stacks of your main stack – active and spare (indicated by a stack label), each having one (or more) auto-scaling groups for your application layer, and a script that does the following (applicable to non-AWS setups as well):

  • push the build to an accessible location (e.g. S3)
  • set the spare auto-scaling group size to the desired value (the spare stays at 0 when not used)
  • make it fetch the pushed build on startup
  • wait for it to start
  • run sanity tests
  • switch DNS to point to an ELB in front of the spare ASG
  • switch the labels to make the spare one active and vice versa
  • set the previously active ASG size to 0

The application layer is stateless, so it’s easy to do hot replaces like that. But (as Fowler indicated) the database is the trickiest component. If you have two databases, where the spare one is a slave replica of the active one (and that changes every time you switch), the setup becomes more complicated – and you’ll still have to do schema changes. So using a single database, if possible, is the easier approach, regardless of whether you have a “regular” database or a schemaless one. In fact, it boils down to having your application modify the database on startup, in a way that works with both versions.

This includes schema changes – table creation (or the relevant schemaless-db equivalent), field addition/removal, and inserting new data (e.g. enumerations). And it can go wrong in many ways, depending on the data and datatypes: some nulls, some datatype change that makes a few values unparseable, etc. Of course, it’s harder to do with a regular SQL database. As suggested in the post I linked earlier, you can use stored procedures (which I don’t like), or you can use a database migration tool. For a schemaless database you must do things manually, but fewer actions are normally needed – you don’t have to alter tables or explicitly create new ones, as everything is handled automatically. The most important thing is to not break the running version.

But how do you make sure everything works?

  • test on staging – preferably with a replica of the production database
  • (automatically) run your behaviour/acceptance/sanity test suites against the not-yet-active new deployment before switching the DNS to point to it, and stop the process if they fail

Only after these checks pass, switch the DNS and point your domain to the previously spare group, thus promoting it to “active”. Switching can be done manually, or automatically by the deployment script. The “switch” can be something other than DNS (as you need a low TTL for that) – a load balancer or a subnet configuration, for example; the best option depends on your setup. And while it is good to automate everything, having a few manual steps isn’t necessarily a bad thing.

Overall, I’d recommend the blue-green deployment approach in order to achieve zero-downtime upgrades. But always make sure your database is properly upgraded, so that it works with both the old and the new version.
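To make the “modify the database on startup so it works with both versions” rule concrete, here is a minimal, hypothetical sketch (the class and method names are mine, and an in-memory set stands in for a real schema): additive changes are safe while the old version is still running; destructive ones must wait until it is retired.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a startup "migration" step that only makes changes
// compatible with both the old and the new application version.
public class StartupMigration {
    private final Set<String> columns = new LinkedHashSet<>();

    public StartupMigration(Collection<String> existing) {
        columns.addAll(existing);
    }

    // Safe during a blue-green switch: the old version simply ignores
    // a column it doesn't know about.
    public void addColumnIfMissing(String name) {
        columns.add(name);
    }

    // Unsafe mid-deployment: the still-active old version may read it.
    public void dropColumn(String name) {
        throw new IllegalStateException(
            "drop '" + name + "' only after the old version is retired");
    }

    public Set<String> schema() {
        return columns;
    }

    public static void main(String[] args) {
        StartupMigration m = new StartupMigration(List.of("id", "email"));
        m.addColumnIfMissing("phone"); // new version needs it, old one ignores it
        System.out.println(m.schema());
    }
}
```

A real implementation would, of course, run DDL through a migration tool; the point of the sketch is only the additive-vs-destructive distinction.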
June 26, 2015
· 6,224 Views · 1 Like
Optional Dependencies
Sometimes a library you are writing may have optional dependencies. E.g. “if Apache HTTP Client is on the classpath, use it; otherwise, fall back to HttpURLConnection”. Why would you do that? For various reasons – when distributing a library, you may not want to force a big dependency footprint, while a more advanced dependency may bring performance benefits, so whoever needs those may include it. Or you may want to allow easily pluggable implementations of some functionality – e.g. JSON serialization. Your library doesn’t care whether it’s Jackson, Gson or native Android JSON serialization – so you may provide implementations using all of these, and pick the one whose dependency is found.

One way to achieve this is to explicitly specify/pass the library to use. When the user of your library/framework instantiates its main class, they can pass a boolean useApacheClient=true, or an enum value JsonSerializer.JACKSON. That is not a bad option, as it forces the user to be aware of what dependency they are using (and is de-facto dependency injection).

Another option, used by Spring among others, is to dynamically check whether the dependency is available on the classpath. E.g.:

    private static final boolean apacheClientPresent = isApacheHttpClientPresent();

    private static boolean isApacheHttpClientPresent() {
        try {
            Class.forName("org.apache.http.client.HttpClient");
            logger.info("Apache HTTP Client detected, using it for HTTP communication.");
            return true;
        } catch (ClassNotFoundException ex) {
            logger.info("Apache HTTP Client not found, using HttpURLConnection.");
            return false;
        }
    }

and then, whenever you need to make HTTP requests (where ApacheHttpClient and HttpURLConnectionClient are your custom implementations of your own HttpClient interface):

    HttpClient client;
    if (apacheClientPresent) {
        client = new ApacheHttpClient();
    } else {
        client = new HttpURLConnectionClient();
    }

Note that it’s important to guard any code that may try to load classes from the dependency with the “isXPresent” boolean – otherwise class loading exceptions may fly. E.g. in Spring, they wrapped the Jackson dependency in a MappingJackson2HttpMessageConverter:

    if (jackson2Present) {
        this.messageConverters.add(new MappingJackson2HttpMessageConverter());
    }

That way, if Jackson is not present, the class is not instantiated and loading of Jackson classes is not attempted at all.

Whether to prefer automatic detection or to require explicit configuration of the underlying dependency is a hard question, because automatic detection may leave the user of your library unaware of the mechanism, and when they add a dependency for a different purpose, it may get picked up by your library and behaviour may change (though it shouldn’t – tiny differences are always there). You should document that, of course, and even log messages (as above), but that may not be enough to avoid (un)pleasant surprises. So I can’t answer when to use which; it should be decided case-by-case.

This approach is also applicable to internal dependencies – your core module may look for a more specific module to be present in order to use it, and otherwise fall back to a default. E.g. you provide a default implementation of “elapsed time” using System.nanoTime(), but when running on Android you’d better rely on SystemClock for that – so you may want to detect whether your elapsed-time Android implementation is present. This looks like logical coupling, though, so in this scenario it’s maybe wiser to prefer the explicit approach.

Overall, this is a nice technique for using optional dependencies with a basic fallback, or for picking one of many possible options without a fallback. It’s good to know that you can do it, and to have it in your “toolkit” of possible solutions to a problem. But you shouldn’t always use it over the explicit (dependency injection) option.
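For a self-contained variant of the detection idiom that runs anywhere, here is a sketch using JDK class names (chosen only so the example is runnable; the class and method names are mine, not from any library):

```java
public class ClasspathDetection {
    // The detection idiom from the article: try to load the class by name
    // and treat ClassNotFoundException as "dependency absent".
    static boolean isPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // java.net.http.HttpClient ships with the JDK since Java 11, so present:
        System.out.println(isPresent("java.net.http.HttpClient"));
        // A made-up optional dependency is absent:
        System.out.println(isPresent("com.example.FancyHttpClient"));
    }
}
```

The check should run once (e.g. in a static initializer, as in the snippet above), since Class.forName is comparatively expensive.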
June 10, 2015
· 6,549 Views
Log Collection With Graylog on AWS
Log collection is essential to properly analyzing issues in production. An interface to search and be notified about exceptions on all your servers is a must. If you have one server, you can easily ssh to it and check the logs, of course, but for larger deployments, collecting logs centrally is far preferable to logging in to 10 machines in order to find “what happened”.

There are many options to do that, roughly separated into two groups: 3rd-party services and software installed by you. 3rd-party (or “cloud-based”, if you like) log collection services include Splunk, Loggly, Papertrail and Sumo Logic. They are very easy to set up and you pay for what you use. Basically, you send each message (e.g. via a custom logback appender) to a provider’s endpoint and then use the dashboard to analyze the data. In many cases that would be the preferred way to go. In other cases, however, company policy may frown upon using 3rd-party services to store company-specific data, or the additional costs may be undesired. In these cases, extra effort needs to be put into installing and managing internal log collection software. It works in a similar way, but implementation details may differ (e.g. instead of sending messages with an appender to a target endpoint, the software, using some sort of agent, collects local logs and aggregates them). Open-source options include Graylog, Fluentd, Flume and Logstash. After a very quick research, I considered Graylog to fit our needs best, so below is a description of the installation procedure on AWS (though the first part applies regardless of the infrastructure).

The first thing to look at are the ready-to-use images provided by Graylog, including Docker, OpenStack, Vagrant and AWS. Unfortunately, the AWS version has two drawbacks. The first: it’s using Ubuntu rather than the Amazon AMI. That’s not a huge issue, although some generic scripts you use in your stack may have to be rewritten. The other one was the dealbreaker: when you start it, it doesn’t run a web interface, although it claims it should – only mongodb, elasticsearch and graylog-server are started. Having two instances – one for the web interface and one for the rest – would complicate things, so I opted for manual installation.

Graylog has two components – the server, which handles the input, indexing and searching, and the web interface, which is a nice UI that communicates with the server. The web interface uses mongodb for metadata, and the server uses elasticsearch to store the incoming logs. Below is a bash script (CentOS) that handles the installation. Note that there is no “sudo”, because initialization scripts are executed as root on AWS.

    #!/bin/bash

    # install pwgen for password generation
    yum upgrade ca-certificates --enablerepo=epel
    yum --enablerepo=epel -y install pwgen

    # mongodb
    cat >/etc/yum.repos.d/mongodb-org.repo <<'EOT'
    [mongodb-org]
    name=MongoDB Repository
    baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
    gpgcheck=0
    enabled=1
    EOT
    yum -y install mongodb-org
    chkconfig mongod on
    service mongod start

    # elasticsearch
    rpm --import https://packages.elasticsearch.org/GPG-KEY-elasticsearch
    cat >/etc/yum.repos.d/elasticsearch.repo <<'EOT'
    [elasticsearch-1.4]
    name=Elasticsearch repository for 1.4.x packages
    baseurl=http://packages.elasticsearch.org/elasticsearch/1.4/centos
    gpgcheck=1
    gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
    enabled=1
    EOT
    yum -y install elasticsearch
    chkconfig --add elasticsearch

    # configure elasticsearch
    sed -i -- 's/#cluster.name: elasticsearch/cluster.name: graylog2/g' /etc/elasticsearch/elasticsearch.yml
    sed -i -- 's/#network.bind_host: localhost/network.bind_host: localhost/g' /etc/elasticsearch/elasticsearch.yml
    service elasticsearch stop
    service elasticsearch start

    # java
    yum -y update
    yum -y install java-1.7.0-openjdk
    update-alternatives --set java /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java

    # graylog
    wget https://packages.graylog2.org/releases/graylog2-server/graylog-1.0.1.tgz
    tar xvzf graylog-1.0.1.tgz -C /opt/
    mv /opt/graylog-1.0.1/ /opt/graylog/
    cp /opt/graylog/bin/graylogctl /etc/init.d/graylog
    sed -i -e 's/GRAYLOG2_SERVER_JAR=\${GRAYLOG2_SERVER_JAR:=graylog.jar}/GRAYLOG2_SERVER_JAR=\${GRAYLOG2_SERVER_JAR:=\/opt\/graylog\/graylog.jar}/' /etc/init.d/graylog
    sed -i -e 's/LOG_FILE=\${LOG_FILE:=log\/graylog-server.log}/LOG_FILE=\${LOG_FILE:=\/var\/log\/graylog-server.log}/' /etc/init.d/graylog
    cat >/etc/init.d/graylog <<'EOT'
    #!/bin/bash
    # chkconfig: 345 90 60
    # description: graylog control
    sh /opt/graylog/bin/graylogctl $1
    EOT
    chkconfig --add graylog
    chkconfig graylog on
    chmod +x /etc/init.d/graylog

    # graylog web
    wget https://packages.graylog2.org/releases/graylog2-web-interface/graylog-web-interface-1.0.1.tgz
    tar xvzf graylog-web-interface-1.0.1.tgz -C /opt/
    mv /opt/graylog-web-interface-1.0.1/ /opt/graylog-web/
    cat >/etc/init.d/graylog-web <<'EOT'
    #!/bin/bash
    # chkconfig: 345 91 61
    # description: graylog web interface
    sh /opt/graylog-web/bin/graylog-web-interface > /dev/null 2>&1 &
    EOT
    chkconfig --add graylog-web
    chkconfig graylog-web on
    chmod +x /etc/init.d/graylog-web

    # configure
    mkdir --parents /etc/graylog/server/
    cp /opt/graylog/graylog.conf.example /etc/graylog/server/server.conf
    sed -i -e 's/password_secret =.*/password_secret = '$(pwgen -s 96 1)'/' /etc/graylog/server/server.conf
    sed -i -e 's/root_password_sha2 =.*/root_password_sha2 = '$(echo -n password | shasum -a 256 | awk '{print $1}')'/' /etc/graylog/server/server.conf
    sed -i -e 's/application.secret=""/application.secret="'$(pwgen -s 96 1)'"/g' /opt/graylog-web/conf/graylog-web-interface.conf
    sed -i -e 's/graylog2-server.uris=""/graylog2-server.uris="http:\/\/127.0.0.1:12900\/"/g' /opt/graylog-web/conf/graylog-web-interface.conf
    service graylog start
    sleep 30
    service graylog-web start

You may also want to set a TTL (auto-expiration) for messages, so that you don’t store old logs forever. Here’s how:

    # wait for the index to be created
    INDEXES=$(curl --silent "http://localhost:9200/_cat/indices")
    until [[ "$INDEXES" =~ "graylog2_0" ]]; do
        sleep 5
        echo "Index not yet created. Indexes: $INDEXES"
        INDEXES=$(curl --silent "http://localhost:9200/_cat/indices")
    done

    # set each indexed message auto-expiration (ttl)
    curl -XPUT "http://localhost:9200/graylog2_0/message/_mapping" -d '{"message": {"_ttl": {"enabled": true, "default": "15d"}}}'

Now you have everything running on the instance. Then you have to do some AWS-specific things (if using CloudFormation, that would include a pile of JSON). Here’s the list:

  • You can either have an auto-scaling group with one instance, or a single instance. I prefer the ASG, though the other option is a bit simpler; the ASG gives you auto-respawn if the instance dies.
  • Set the above script to be invoked in the UserData of the launch configuration of the instance/ASG (e.g. by fetching it from S3 first).
  • Allow UDP port 12201 (the default logging port). That should happen for the instance/ASG security group (inbound), for the application nodes’ security group (outbound), and also in a network ACL of your VPC. Test the UDP connection to make sure it really goes through. Keep access restricted for all sources except your instances.
  • You need to pass the private IP address of your Graylog server instance to all the application nodes. That’s tricky on AWS, as private IP addresses change, so you need something stable – and you can’t use an ELB (load balancer), because it doesn’t support UDP. There are two options. The first: associate an Elastic IP with the node on startup and pass that IP to the application nodes. But there’s a catch – if they connect to the Elastic IP, the traffic would go via NAT (if you have one), and you may have to open your instance “to the world”. So you must turn the Elastic IP into its corresponding public DNS name, which will then be resolved to the private IP. You can do that manually (and hackily):

        GRAYLOG_ADDRESS="ec2-${GRAYLOG_ADDRESS//./-}.us-west-1.compute.amazonaws.com"

    or you can use the AWS EC2 CLI to obtain the details of the instance that the Elastic IP is associated with, and then with another call obtain its public DNS name. The second option: instead of an Elastic IP, which limits you to a single instance, use Route53 (the AWS DNS manager). That way, when a Graylog server instance starts, it can append itself to a Route53 record, allowing for round-robin DNS over multiple Graylog instances in a cluster. Manipulating Route53 records is again done via the AWS CLI. Then you just pass the domain name to the application nodes, so that they can send messages.
  • Alternatively, you can install graylog-server on all the nodes (as an agent) and point them to an elasticsearch cluster. But that’s more complicated and probably not the intended way to do it.
  • Configure your logging framework to send messages to Graylog. There are standard GELF (the Graylog format) appenders, e.g. this one, and the only thing you have to do is use the public DNS environment variable in logback.xml (which supports environment-variable resolution).
  • You should make the web interface accessible outside the network, so you can use an ELB for that, or the round-robin DNS mentioned above. Just make sure the security rules are tight and don’t allow external tampering with your log data.
  • If you are not running a Graylog cluster (which I won’t cover), then the single instance can potentially fail. That isn’t a great loss, as log messages can be obtained from the instances and are short-lived anyway. But the metadata of the web interface is important – dashboards, alerts, etc. So it’s good to do regular backups (e.g. with mongodump). Using an EBS volume is also an option.
Even though you send your log messages to the centralized log collector, it’s a good idea to also keep local logs, with the proper log rotation and cleanup. It’s not a trivial process, but it’s essential to have log collection, so I hope the guide has been helpful.
May 14, 2015
· 19,284 Views
Interrupting Executor Tasks
There’s this use case, not quite rare, where you want to cancel a running executor task. For example, you have ongoing downloads that you want to stop, or ongoing file copying that you want to cancel. So you do:

    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<?> future = executor.submit(new Runnable() {
        @Override
        public void run() {
            // Time-consuming or possibly blocking I/O
        }
    });
    ...
    executor.shutdownNow();
    // or
    future.cancel(true);

Unfortunately, that doesn’t work. Calling shutdownNow() or cancel() doesn’t stop the ongoing runnable. What these methods do is simply call .interrupt() on the respective thread(s). The problem is, your runnable doesn’t handle InterruptedException (and it can’t). It’s a pretty common problem, described in multiple books and articles, but still a bit counterintuitive.

So what do you do? You need a way to stop the slow or blocking operation. If you have a long/endless loop, you can just add a condition checking Thread.currentThread().isInterrupted() and not continue if it is set. Generally, however, the blocking happens outside of your code, so you have to instruct the underlying code to stop – usually by closing a stream or disconnecting a connection. But in order to do that, you need to do quite a few things:

  • extend Runnable
  • make the “cancellable” resources (e.g. the input stream) an instance field, and provide a cancel() method in your extended runnable, where you get the “cancellable” resource and cancel it (e.g. call inputStream.close())
  • implement a custom ThreadFactory that in turn creates custom Thread instances that override the interrupt() method and invoke the cancel() method on your extended runnable
  • instantiate the executor with the custom thread factory (the static factory methods take it as an argument)
  • handle abrupt closing/stopping/disconnecting of your blocking resources in the run() method

The bad news is, you need access to the particular cancellable runnable inside your thread factory. You cannot use instanceof to check whether it’s of the appropriate type, because executors wrap the runnables you submit to them in Worker instances, which do not expose their underlying runnables.

For single-threaded executors that’s easy – you simply hold a reference to the currently submitted runnable in your outermost class, and access it in the interrupt method, e.g.:

    private final CancellableRunnable runnable;
    ...

    runnable = new CancellableRunnable() {
        private MutableBoolean bool = new MutableBoolean();

        @Override
        public void run() {
            bool.setValue(true);
            while (bool.booleanValue()) {
                // emulating a blocking operation with an endless loop
            }
        }

        @Override
        public void cancel() {
            bool.setValue(false);
            // usually here you'd have inputStream.close() or connection.disconnect()
        }
    };

    ExecutorService executor = Executors.newSingleThreadExecutor(new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            return new Thread(r) {
                @Override
                public void interrupt() {
                    super.interrupt();
                    runnable.cancel();
                }
            };
        }
    });

    Future<?> future = executor.submit(runnable);
    ...
    future.cancel(true);

(CancellableRunnable is a custom interface that simply defines the cancel() method.)

But what happens if your executor has to run multiple tasks at the same time? If you want to cancel all of them, you can keep a list of submitted CancellableRunnable instances and simply cancel all of them when interrupted. Runnables will thus be cancelled multiple times, so you have to account for that.

If you want fine-grained control, e.g. cancelling particular futures, then there is no easy solution. You can’t even extend ThreadPoolExecutor, because the addWorker method is private; you’d have to copy-paste it. The only option is not to rely on future.cancel() or executor.shutdownNow() and instead keep your own map from CancellableRunnable instances to their corresponding futures. Whenever you want to cancel some (or all) runnables, you do it the other way around – get the desired runnable you want to cancel, call its .cancel() (as shown above), then get its corresponding Future and cancel it as well. Something like:

    Map<CancellableRunnable, Future<?>> cancellableFutures = new HashMap<>();
    Future<?> future = executor.submit(runnable);
    cancellableFutures.put(runnable, future);

    // now you want to abruptly cancel a particular task
    runnable.cancel();
    cancellableFutures.get(runnable).cancel(true);

(Instead of using the runnable as the key, you may use some identifier that makes sense in your use case, and store both the runnable and the future as a value under that key.)

That’s a neat workaround, but anyway I’ve submitted a request for enhancement of the java.util.concurrent package, so that in a future release we have the option to manage that use case.
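The isInterrupted() check for long loops, mentioned at the start, is easy to demonstrate in a self-contained sketch (the names here are mine): the task exits cooperatively once future.cancel(true) interrupts its thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class InterruptibleLoop {
    // A long-running loop that checks the interrupt flag on every iteration
    // and stops as soon as the thread is interrupted.
    public static boolean runUntilInterrupted() {
        while (!Thread.currentThread().isInterrupted()) {
            // simulate ongoing work
        }
        return true; // reached only after interruption
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Boolean> future = executor.submit(InterruptibleLoop::runUntilInterrupted);
        Thread.sleep(100);    // let the loop spin for a bit
        future.cancel(true);  // interrupts the worker thread
        executor.shutdown();
        // the task observed the interrupt and terminated, so this returns true
        System.out.println(executor.awaitTermination(5, TimeUnit.SECONDS));
    }
}
```

This cooperative check only helps when the loop is your own code; for blocking I/O you still need the resource-closing approach described above.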
November 21, 2014
· 45,500 Views · 7 Likes
Caveats of HttpURLConnection
While running some performance tests and trying to track down a bottleneck, we figured out what was wrong in our code.
September 9, 2014
· 23,819 Views · 4 Likes
You Probably Don’t Need a Message Queue
I’m a minimalist, and I don’t like to complicate software too early and unnecessarily. And adding components to a software system is one of the things that adds a significant amount of complexity. So, let’s talk about message queues.

Message queues are systems that let you have a fault-tolerant, distributed, decoupled, etc., etc. architecture. That sounds good on paper. Message queues may fit several use cases in your application. You can check this nice article about the benefits of MQs to see what some of those use cases might be. But don’t be hasty in picking an MQ because “decoupling is good”, for example. Let’s use an example – you want your email sending to be decoupled from your order processing. So you post a message to a message queue, and the email processing system picks it up and sends the emails. How would you do that in a monolithic, single-classpath application? Just make your order processing service depend on an email service and call sendEmail(..) rather than sendToMQ(emailMessage). If you use an MQ, you define a message format to be recognized by the two systems; if you don’t, you define a method signature. What is the practical difference? Not much, if any.

But then you probably want to be able to add another consumer that does additional things with a given message? That might indeed happen, but it’s just not the case for the regular project out there. And even if it is, it’s not worth it compared to adding just another method call. Coupled – yes. But not inconveniently coupled.

What if you want to handle spikes? Message queues give you the ability to put requests in a persistent queue and process all of them. That is a very useful feature, but again it’s limited by several factors – are your requests processed in the UI background, or do they require an immediate response? The servlet container’s thread pool can be used as a sort-of queue – the response will be served eventually, but the user will have to wait (though if the thread-acquisition timeout is too small, requests will be dropped). Or you can use an in-memory queue for the heavier requests (that are handled in the UI background). And note that by default your MQ might not be highly available – e.g. if an MQ node dies, you lose messages – so that’s not necessarily a benefit over an in-memory queue in your application node.

Which leads us to asynchronous processing – this is indeed a useful feature. You don’t want to do some heavy computation while the user is waiting. But you can use an in-memory queue, or simply start a new thread (à la Spring’s @Async annotation). Here comes another aspect – does it matter if a message is lost? If your application node, processing the request, dies, can you recover? You’ll be surprised how often it doesn’t actually matter, and you can function properly without guaranteeing all messages are processed. So just asynchronously handling heavier invocations might work well.

Even if you can’t afford to lose messages, for the use case where a message is put into a queue in order for another component to process it, there’s still a simple solution – the database. You put a row with a processed=false flag in the database. A scheduled job runs, picks all unprocessed rows and processes them asynchronously. Then, when processing is finished, it sets the flag to true. I’ve used this approach a number of times, including in large production systems, and it works pretty well. And you can still scale your application nodes endlessly, as long as you don’t have any persistent state in them, regardless of whether you are using an MQ or not (temporary in-memory processing queues are not persistent state).

Why am I trying to give alternatives to common usages of message queues? Because if chosen for the wrong reason, an MQ can be a burden. They are not as easy to use as it sounds.
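The database-as-queue approach above can be sketched with an in-memory list standing in for the table (all names here are hypothetical, not from the article): rows start with processed=false, and a scheduled job picks up unprocessed rows, handles them, and flips the flag – which also makes repeated runs idempotent.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of the "processed flag" pattern: an in-memory
// list stands in for a database table with a processed column.
public class PendingEmailQueue {
    static final class EmailRow {
        final String recipient;
        volatile boolean processed = false; // the processed=false flag
        EmailRow(String recipient) { this.recipient = recipient; }
    }

    private final List<EmailRow> table = new CopyOnWriteArrayList<>();
    private final List<String> sent = new CopyOnWriteArrayList<>();

    // stand-in for "INSERT ... processed=false"
    public void enqueue(String recipient) {
        table.add(new EmailRow(recipient));
    }

    // The scheduled job: pick unprocessed rows, process them, flip the flag.
    public void runScheduledJob() {
        for (EmailRow row : table) {
            if (!row.processed) {
                sent.add(row.recipient); // stand-in for actually sending
                row.processed = true;    // stand-in for "UPDATE ... SET processed=true"
            }
        }
    }

    public List<String> sentEmails() { return sent; }

    public static void main(String[] args) {
        PendingEmailQueue q = new PendingEmailQueue();
        q.enqueue("a@example.com");
        q.enqueue("b@example.com");
        q.runScheduledJob();
        q.runScheduledJob(); // second run is a no-op: flags already flipped
        System.out.println(q.sentEmails());
    }
}
```

In a real system the list would be a table, the loop a scheduled job (e.g. cron or Spring’s @Scheduled), and the flag update a transactional UPDATE.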
First, there’s a learning curve. Generally, the more separate components you integrate, the more problems may arise. Then there’s setup and configuration. E.g. when the MQ has to run in a cluster, across multiple data centers (for HA), that becomes complex. High availability itself is not trivial – it’s normally not turned on by default. And how does your application node connect to the MQ? Via a refreshing connection pool, using a short-lived DNS record, via a load balancer? Then your queues have tons of configuration options – what’s their size, what’s their behaviour (should consumers explicitly acknowledge receipt, should they explicitly acknowledge failure to process messages, should multiple consumers get the same message or not, should messages have a TTL, etc.)? Then there’s the network and message transfer overhead – especially given that people often choose JSON or XML for transferring messages. If you overuse your MQ, it adds latency to your system. And last, but not least – it’s harder to track the program flow when analyzing problems. You can’t just see the “call hierarchy” in your IDE, because once you send a message to the MQ, you have to go and find where it is handled. And that’s not always as trivial as it sounds. You see, an MQ adds a lot of complexity and things to take care of. Certainly, MQs are very useful in some contexts. I’ve been using them in projects where they were really a good fit – e.g. where we couldn’t afford to lose messages and we needed fast processing (so polling the database wasn’t an option). I’ve also seen them used in non-trivial scenarios, where we used one to consume messages on a single application node, regardless of which node posts the message (pub/sub). And you can also check this stackoverflow question. And maybe you really need to have multiple languages communicate (but don’t want an ESB), or maybe your flow is getting so complex that adding a new method call instead of a new message consumer is overkill.
So all I’m trying to say here is the trite truism “use the right tool for the job”. Don’t pick a message queue if you haven’t identified a real use for it that can’t be handled in a different manner that is easier to set up and maintain. And don’t start with an MQ “just in case” – add it when you realize the actual need for it. Because probably, in the regular project out there, a message queue is not needed.
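For the email-sending example earlier, the in-process alternative amounts to an ordinary dependency and a method call. A minimal sketch, with hypothetical names – the point is only that the "contract" is a method signature rather than a message format:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the in-process alternative: the order service depends on an
// email service directly, replacing sendToMQ(emailMessage) with a plain call.
interface EmailService {
    void sendEmail(String to, String subject, String body);
}

class OrderService {
    private final EmailService emailService;

    OrderService(EmailService emailService) {
        this.emailService = emailService;
    }

    void completeOrder(String orderId, String customerEmail) {
        // ... order processing ...
        // direct call instead of posting a message to an MQ
        emailService.sendEmail(customerEmail, "Order confirmed",
                "Your order " + orderId + " has been processed.");
    }
}
```

If a second "consumer" is ever needed, it is another method call on another injected service – coupled, but, as argued above, not inconveniently so.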
July 7, 2014
· 19,289 Views · 1 Like
Common Misconceptions About Java
Java is the most widely used language in the world ([citation needed]), and everyone has an opinion about it. Due to it being mainstream, it is usually mocked, sometimes rightly so, but sometimes the criticism just doesn’t touch reality. I’ll try to explain my favorite 5 misconceptions about Java. Java is slow – that might have been true for Java 1.0, and it may initially sound logical, since Java is not compiled to binary, but to bytecode, which is in turn interpreted. However, modern versions of the JVM are very, very optimized (JVM optimizations are a topic worth not just an article, but a whole book) and this is no longer remotely true. As noted here, Java is even on par with C++ in some cases. And it is certainly not a good idea to joke about Java being slow if you are a Ruby or PHP developer. Java is too verbose – here we need to split the language from the SDK and from other libraries. There is some verbosity in the JDK (e.g. java.io), which is: 1. easily overcome with de-facto standard libraries like Guava, and 2. arguably a good thing. As for language verbosity, the only reasonable complaint was anonymous classes – which are no longer an issue in Java 8, with the functional additions. Getters and setters, or Foo foo = new Foo() instead of using val – that is (possibly) boilerplate, but it’s not verbose – it doesn’t add conceptual weight to the code. It doesn’t take more time to write, read or understand. Other libraries – it is indeed pretty scary to see a class like AbstractCommonAsyncFacadeFactoryManagerImpl. But that has nothing to do with Java. It can be argued that sometimes these long names make sense; it can also be argued that they are that complex because the underlying abstraction is unnecessarily complicated. Either way, it is a design decision taken per-library, and nothing that the language or the SDK imposes per se.
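The anonymous-class complaint mentioned above is easy to illustrate: here is the same sort written pre-Java 8 and with the Java 8 functional additions (a small self-contained example, not from the original article):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// The same sort, before and after Java 8's functional additions.
class SortExamples {
    static List<String> sortOld(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        // pre-Java 8: an anonymous Comparator class, five lines for one comparison
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return a.compareTo(b);
            }
        });
        return copy;
    }

    static List<String> sortNew(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        // Java 8: the same thing in one line
        copy.sort(Comparator.naturalOrder());
        return copy;
    }
}
```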
It is common to see overengineered stuff, but Java in no way pushes you in that direction – things can be done in a simple way in any language. You could certainly have AbstractCommonAsyncFacadeFactoryManagerImpl in Ruby; it’s just that no architect who thought that was a good idea happened to be using Ruby. If the “big, serious, heavy” companies were using Ruby, I bet we’d see the same. Enterprise Java frameworks are bloatware – that was certainly true back in 2002, when EJB 2 was in use (or so I’ve heard – I’m too young to remember). And there are still some overengineered and bloated application servers that you don’t really need. The fact that people are using them is their own problem. You can have a perfectly nice, readable, easy-to-configure-and-deploy web application with a framework like Spring, Guice or even CDI; with a web framework like Spring MVC, Play, Wicket, or even the latest JSF. Or even without any framework, if you feel like you don’t want to reuse the evolved-through-real-world-use frameworks. You can have an application using a message queue, a NoSQL and a SQL database, Amazon S3 file storage, and whatnot, without any accidental complexity. It’s true that people still like to overengineer stuff and add a couple of layers where they are not needed, but the fact that frameworks give you this ability doesn’t mean they make you do it. For example, here’s an application that crawls government documents, indexes them, and provides a UI for searching and subscribing. Sounds sort-of simple, and it is. It is written in Scala (in a very Java way), but uses only Java frameworks – Spring, Spring MVC, Lucene, Jackson, Guava. I guess you could start maintaining it pretty quickly, because it is straightforward.
You can’t prototype quickly with Java – this is sort-of related to the previous point – it is assumed that working with Java is slow, and that’s why if you are a startup, or a weekend/hackathon project, you should use Ruby (with Rails), Python, Node.js or anything else that allows you to quickly prototype, to save-and-refresh, to painlessly iterate. Well, that is simply not true, and I don’t even know where it comes from. Maybe from the fact that big companies with heavy processes use Java, and so making a Java app appears to take more time. And save-and-refresh might look daunting to a beginner, but anyone who has programmed in Java (for the web) for a while has to know a way to automate that (otherwise he’s a n00b, right?). I’ve summarized the possible approaches, and all of them are mostly OK. Another example here (which may serve as an example for the above point as well) – I made this project for verifying the secure password storage of websites within a weekend, plus one day to fix stuff in the evening – including the security research. Spring MVC, JSP templates, MongoDB. Again – quick and easy. You can do nothing in Java without an IDE – of course you can – you can use Notepad++, vim, emacs. You will just lack refactoring, compile-on-save, call hierarchies. It would be just like programming in PHP or Python or JavaScript. The IDE vs editor debate is a long one, but you can use Java without an IDE. It just doesn’t make sense to do so, because you get so much more from the IDE than from a text editor plus command-line tools. You may argue that I’m able to write nice and simple Java applications quickly because I have a lot of experience and know precisely which tools to use (and which not to), or that I’m of some rare breed of developers with common sense. And while I’d be flattered by that, I am no different from the good Ruby developer or the Python guru you may be. It’s just that Java is too widespread to have only good developers and tools.
If so many people were using another language, then probably the same amount of crappy code would have been generated (and PHP is already way ahead, even with less usage). I’m the last person not to laugh at jokes about Java, and it certainly isn’t a silver-bullet language, but I’d be happier if people had fewer misconceptions, whether due to anecdotal evidence or to previous bad experience a-la “I hate Java since my previous company, where the project was very bloated”. Not only because I don’t like people being biased, but because you may start your next project with a language that will not work out, just because you’ve heard “Java is bad”.
April 4, 2014
· 21,234 Views · 1 Like
Embedding Maven
It is a very rare use case, but sometimes you need it: how to embed Maven in your application, so that you can programmatically run goals? Short answer: it's tricky. I dabbled in the matter for my Java webapp automatic syncing project, and at some point I decided not to embed it. Ultimately, I used a library that does what I needed, but anyway, here are the steps and tools that might be helpful. What you usually need embedded Maven for is to execute some goals on a Maven project. There are two scenarios. The first one is if you are running inside the Maven container, i.e. you are writing a mojo/plugin. Then it's fairly easy, because you have everything managed by the already-initialized Plexus container. In that case you can use the mojo-executor. Easy to use, but it expects a "project", "pluginManager" and "session", which you can't easily obtain otherwise. The second scenario is completely embedded Maven. There is a library that does what I needed it to do (thanks to MariuszS for pointing it out) - the Maven Embedder. Its usage is described in this SO question; use both the first and the second answer. Before finding that library, I tried two more: the Jenkins maven-embedder and the Maven Invoker. The problem with both of those libraries is that they need a maven home - that is, the path to where a Maven installation resides. Which is kind of contrary to the idea of "embedded" Maven. If the Maven Embedder suits you, you can stop reading. However, there might be cases where the Maven Embedder is not what you are looking for; in that case, you should use one of the two aforementioned libraries. So, how do you find and set a maven home?
- Ask the user to specify it. Not too much of a hassle, probably.
- Use M2_HOME. One of the libraries uses that by default, but the problem is it might not be set - I don't usually set it, for example. If it is not, you can fall back to the previous approach.
- Scan the entire file system for a Maven installation. Sounds OK, and it can be done only once and then stored in some setting. The problem is - there might not be a Maven installation. Even on a developer's machine, IDEs (Eclipse, at least) have an "embedded" Maven, and while it probably stores it internally in the same format a manual installation would, its path or structure may change depending on the version. You can, of course, re-scan the file tree every once in a while to find such an installation.
- Download Maven programmatically yourself. Then you can be sure where it is located and that it will always be there in the same format. The problem here is version mismatch - the user might be using another version of Maven. Making the version configurable is an option.
All of these work in some cases and don't work in others. So, in order of preference: 1. make sure you really need to embed Maven, 2. use the Maven Embedder, 3. use another option, with its considerations.
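The fallback order for finding a maven home can be sketched as a small helper. This is a hedged sketch with hypothetical names; the resolved path would then be handed to the invoking library (e.g. the Maven Invoker's setMavenHome(..)):

```java
import java.util.Map;

// Hedged sketch of the maven-home fallback order described above:
// user-specified path first, then M2_HOME, then null (meaning: scan or download).
class MavenHomeResolver {
    static String resolve(String userConfigured, Map<String, String> env) {
        if (userConfigured != null && !userConfigured.isEmpty()) {
            return userConfigured;          // 1. explicitly configured by the user
        }
        String m2Home = env.get("M2_HOME"); // 2. fall back to the environment
        if (m2Home != null && !m2Home.isEmpty()) {
            return m2Home;
        }
        return null;                        // 3. scan the file system or download Maven
    }
}
```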
November 13, 2013
· 17,317 Views · 2 Likes
A Simple Plugin System for Web Applications
We need to make multiple web-based projects with a lot of shared functionality. For that, some sort of a plugin system would be a good option (as an alternative to copy-pasting stuff). Some frameworks (like Grails) have the option to make web plugins, but most don’t, so something custom-made has to be implemented. First, let’s define the required functionality. The “plugin”:
- should be included simply by importing it via maven/ivy
- should register all classes (either automatically, or via a one-line configuration) in a dependency injection container, if one is used
- should be vertical – i.e. contain all files, from javascript, css and templates, through controllers, to service layer classes
- should not require complex configuration that needs to be copy-pasted from project to project
- should allow easy development and debugging without redeployment
The java classes are put into a jar file and added to the lib directory, and therefore to the classpath, so that’s the easy part. But we need to get the web resources extracted to the respective locations, where they can be used by the rest of the code. There are three general approaches to that: build-time extraction, runtime extraction, and runtime loading from the classpath. The last approach would require a controller (or servlet) that loads the resources from the classpath (the respective jar), caches them, and serves them. That has a couple of significant drawbacks, one of which is that, being in a jar, they can’t be easily replaced during development. Working with classpath resources is also tricky, as you don’t know the names of the files in advance. The other two approaches are very similar. Grails, for example, uses build-time extraction – the plugin is a zip file containing all the needed resources, and they are extracted to the respective locations while the project is built.
This is fine, but it would require a little more configuration (Maven, in our case), which would also probably have to be copied over from project to project. So we picked the runtime extraction approach. It happens on startup – when the application is loaded, a startup listener of some sort (a Spring component with @PostConstruct, in our case) iterates through all jar files in the lib folder, and extracts the files from a specific folder (e.g. “web”). So, the structure of the jar file looks like this:

com/
    company/
        pkg/
            Foo.class
            Bar.class
web/
    plugin-name/
        css/
            main.css
        js/
            foo.js
            bar.js
        images/
            logo.png
        views/
            foo.jsp
            bar.jsp

The end result is that after the application is started, you get all the needed web resources accessible from the application, so you can include them in the pages (views) of your main application. And the code that does the extraction is rather simple (using zip4j for the zip part). This can be a servlet context listener rather than a Spring bean – it doesn’t make any difference.
import java.io.File;
import java.util.List;

import javax.annotation.PostConstruct;
import javax.inject.Inject;
import javax.servlet.ServletContext;

import net.lingala.zip4j.core.ZipFile;
import net.lingala.zip4j.exception.ZipException;
import net.lingala.zip4j.model.FileHeader;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

/**
 * Component that locates modules (in the form of jar files) and extracts
 * their web elements, if any, on startup
 *
 * @author Bozhidar
 */
@Component
public class ModuleExtractor {

    private static final Logger logger = LoggerFactory.getLogger(ModuleExtractor.class);

    @Inject
    private ServletContext ctx;

    @SuppressWarnings("unchecked")
    @PostConstruct
    public void init() {
        File lib = new File(ctx.getRealPath("/WEB-INF/lib"));
        File[] jars = lib.listFiles();
        String targetPath = ctx.getRealPath("/");
        String viewPath = "/WEB-INF/views"; // that can be made configurable
        for (File jar : jars) {
            try {
                ZipFile file = new ZipFile(jar);
                for (FileHeader header : (List<FileHeader>) file.getFileHeaders()) {
                    if (header.getFileName().startsWith("web/") && !fileExists(header)) {
                        // extract views into WEB-INF (inaccessible to the outside world);
                        // all other files are extracted into the root of the application
                        if (header.getFileName().contains("/views/")) {
                            file.extractFile(header, targetPath + viewPath);
                        } else {
                            file.extractFile(header, targetPath);
                        }
                    }
                }
            } catch (ZipException ex) {
                logger.warn("Error opening jar file and looking for a web-module in: " + jar, ex);
            }
        }
    }

    private boolean fileExists(FileHeader header) {
        return new File(ctx.getRealPath(header.getFileName())).exists();
    }
}

So, in order to make a plugin, you just make a maven project with jar packaging and add it as a dependency to your main project – everything else is taken care of. You might need to register the ModuleExtractor manually if classpath scanning for beans is not enabled (or if you choose to make it a listener), but that’s it. Note: this solution doesn’t aim to be a full-featured plugin system that solves all problems. It doesn’t support versioning, submodules, etc. That’s why the title is “simple”. But you can do many things with it, and it has very low complexity. Note 2: Servlet 3.0 has a native way of doing almost the same thing, but it doesn’t allow dynamically changing the assets.
If you don’t need to change them and don’t need save-and-refresh, then it’s probably the better option.
August 6, 2013
· 10,556 Views · 1 Like
Runtime Classpath vs Compile-Time Classpath
This should really be a simple distinction, but I’ve been answering a slew of similar questions on Stackoverflow, and people often misunderstand the matter. So, what is a classpath? A set of all the classes (and jars with classes) that are required by your application. But there are two, or actually three, distinct classpaths:
- compile-time classpath – contains the classes that you’ve added in your IDE (assuming you use an IDE) in order to compile your code. In other words, this is the classpath passed to “javac” (though you may be using another compiler).
- runtime classpath – contains the classes that are used when your application is running. That’s the classpath passed to the “java” executable. In the case of web apps this is your /lib folder, plus any other jars provided by the application server/servlet container.
- test classpath – this is also a sort of runtime classpath, but it is used when you run tests. Tests do not run inside your application server/servlet container, so their classpath is a bit different.
Maven defines dependency scopes that are really useful for explaining the differences between the different types of classpaths; read the short description of each scope. Many people assume that if they successfully compiled the application with a given jar file present, the application will run fine. But it won’t – you need the same jars that you used to compile your application to be present on your runtime classpath as well. Well, not necessarily all of them, and not necessarily only them. A few examples: you compile the code with a given library on the compile-time classpath, but forget to add it to the runtime classpath. The JVM throws NoClassDefFoundError, which means that a class is missing that was present when the code was compiled. This error is a clear sign that you are missing a jar file on your runtime classpath that you have on your compile-time classpath.
It is also possible that a jar you depend on in turn depends on a jar that you don’t have anywhere. That’s why libraries (must) have their dependencies declared, so that you know which jars to put on your runtime classpath. Another example: containers (servlet containers, application servers) have some libraries built in. Normally you can’t override the built-in dependencies, and even when you can, it requires additional configuration. So, for example, you use Tomcat, which provides servlet-api.jar. You compile your application with servlet-api.jar on your compile-time classpath, so that you can use HttpServletRequest in your classes, but you do not include it in your WEB-INF/lib folder, because Tomcat will put its own jar on the runtime classpath. If you duplicate the dependency, you may get bizarre results, as classloaders get confused. One more: a framework you are using (let’s say Spring MVC) relies on another library to do JSON serialization (usually Jackson). You don’t actually need Jackson on your compile-time classpath, because you are not referring to any of its classes, or even to Spring classes that refer to them. But Spring needs Jackson internally, so the Jackson jar must be in WEB-INF/lib (the runtime classpath) for JSON serialization to work. The cases might be complicated even further when you consider compile-time constants and version mismatches, but the general point is this: the classpaths that you use for compiling and for running the application are different, and you should be aware of that.
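One practical consequence of the runtime-classpath examples above: you can probe for a class by name at startup and fail fast with a clear message, instead of hitting a NoClassDefFoundError deep inside a request. A minimal sketch (the Jackson class name below is just an illustrative example of a runtime-only dependency):

```java
// Hedged sketch: a runtime classpath sanity check via Class.forName.
class DependencyCheck {
    static boolean isPresent(String className) {
        try {
            // load without initializing, using this class's classloader
            Class.forName(className, false, DependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (!isPresent("com.fasterxml.jackson.databind.ObjectMapper")) {
            System.err.println("Jackson is missing from the runtime classpath");
        }
    }
}
```

Note that this succeeds or fails based purely on the runtime classpath – the compile-time classpath is irrelevant here, which is exactly the distinction the article draws.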
May 12, 2012
· 29,110 Views · 2 Likes
Replacing a JSON Message Converter With MessagePack
You may be using JSON to transfer data (we were using it in our message queue). While this works, its main benefit is being human-readable. If you don’t care about readability, you’d probably want a more efficient serialization mechanism. Multiple options exist: protobuf, MessagePack, protostuff, Java serialization. The easiest of them to use is Java serialization, but it is less efficient (in both memory and time) than the other solutions. There are benchmarks that will help you choose the most efficient solution, but if you want an easy, almost drop-in replacement for your JSON solution, MessagePack might be the best option. I made a simple test to compare the JSON output to the MessagePack output in terms of size: 2300 vs 150 bytes for a simple message. A pretty good reduction, and if there are a lot of messages, optimizing is a must. However, you need to register all classes with MessagePack. There are two options:
- use @Message on all the objects in the serialized graph. This is a bit tedious, especially if you already have a lot of classes that are transferred – you have to go through the whole graph.
- manually register all classes with MessagePack. Again tedious, because you also have to register all classes that the message class contains as fields (recursively).
That’s why I wrote the following code to loop through all our message classes and register them with MessagePack on startup.
It partly relies on Spring classes, but if you are not using Spring, you can replace them:

private MessagePack serializer = new MessagePack();
private ClassMapper classMapper = new DefaultClassMapper();

@PostConstruct
public void init() {
    // we need to find all messages and register their classes, and also all their fields' classes, recursively
    ClassPathScanningCandidateComponentProvider provider =
            new ClassPathScanningCandidateComponentProvider(false);
    Set<BeanDefinition> classes = provider.findCandidateComponents("com.foo.bar.messages");

    // hacking MessagePack to allow Set handling
    Field fld = ReflectionUtils.findField(MessagePack.class, "registry");
    ReflectionUtils.makeAccessible(fld);
    TemplateRegistry registry = (TemplateRegistry) ReflectionUtils.getField(fld, serializer);
    registry.register(Set.class, new SetTemplate(new AnyTemplate(registry)));
    registry.registerGeneric(Set.class, new GenericCollectionTemplate(registry, SetTemplate.class));

    try {
        for (BeanDefinition def : classes) {
            Class<?> clazz = Class.forName(def.getBeanClassName());
            registerHierarchy(clazz, serializer, Sets.<Class<?>>newHashSet());
        }
    } catch (ClassNotFoundException e) {
        throw new IllegalStateException(e);
    }
}

private void registerHierarchy(Class<?> clazz, MessagePack serializer, Set<Class<?>> handledClasses) {
    if (!isEligibleForRegistration(clazz)) {
        return;
    }
    Class<?> currentClass = clazz;
    while (currentClass != null && !currentClass.isEnum() && currentClass != Object.class) {
        for (Field field : currentClass.getDeclaredFields()) {
            registerHierarchy(field.getType(), serializer, handledClasses);
            // type parameters
            Type type = field.getGenericType();
            if (type instanceof ParameterizedType) {
                for (Type typeParam : ((ParameterizedType) type).getActualTypeArguments()) {
                    // avoid circular generics references, resulting in stackoverflow
                    Class<?> typeParamClass = (Class<?>) typeParam;
                    if (!handledClasses.contains(typeParamClass)) {
                        handledClasses.add(typeParamClass);
                        registerHierarchy(typeParamClass, serializer, handledClasses);
                    }
                }
            }
        }
        currentClass = currentClass.getSuperclass();
    }
    try {
        serializer.register(clazz);
    } catch (Exception ex) {
        logger.warn("Problem registering class " + clazz, ex.getMessage());
    }
}

private boolean isEligibleForRegistration(Class<?> clazz) {
    return !(clazz.isAnnotationPresent(Entity.class)
            || clazz == Class.class
            || Type.class.isAssignableFrom(clazz)
            || clazz.isInterface()
            || clazz.isArray()
            || ClassUtils.isPrimitiveOrWrapper(clazz)
            || clazz == String.class
            || clazz == Date.class
            || clazz == Object.class);
}
April 26, 2012
· 10,125 Views
Avoid Lazy JPA Collections
Hibernate (and actually JPA) has collection mappings: @OneToMany, @ManyToMany, @ElementCollection. All of these are lazy by default. This means the collections are specific implementations of the List or Set interface that hold a reference to the persistent session, and the values are loaded from the database only when the collection is accessed. That saves unnecessary database queries if you only occasionally use the collection. However, there’s a problem with that. The problem manifests itself through the exception that, in my observation, is the 2nd most commonly asked-about exception (after NullPointerException) – the LazyInitializationException. The problem is that the session is usually open for your service layer and is closed as soon as you return the entity to the view layer. And when you try to iterate the uninitialized collection in your view (a JSP, for example), the collection throws LazyInitializationException, because the session it holds a reference to is already closed, so it can’t fetch the items. How is this solved? The so-called OpenSessionInView / OpenEntityManagerInView “patterns”. In short: you make a filter that opens the session when the request starts and closes it after the view has been rendered (and not after the service layer finishes). Some people call that an anti-pattern, because it leaks persistence handling into the view layer and complicates the setup. I wouldn’t say it’s that bad: generally it solves the problem without introducing other problems. But in all recent projects I’ve been involved in, we aren’t using OpenSessionInView, and it works fine. It works fine because we aren’t using lazy collections. But then, you’ll rightly point out, you will be fetching “the whole world” when you load a single entity. Well, no. There are two types of *ToMany mappings. The first is value-type mappings, where the collection logically does not hold more than a dozen elements.
This is in most cases @ElementCollection, and also @*ToMany with items like “Category” or “Price” that are just more complex value objects, but that do not hold any other mappings themselves. Another common feature of these types of collections is that they are usually displayed in the UI together with their owning entity – it is most likely that you want to display the categories of an article, for example. For this type of collection, EAGER is the better option. You’ll have to fetch them anyway, so why not let Hibernate (or any JPA implementation) think of some clever join? As I said – these collections are logically no bigger than a dozen or two elements, so fetching them won’t be a performance hit. And, logically, they won’t fetch a big object graph with them. The second type is mappings across the big, core entities. This can be “all orders made by the user” or “all users of the organization”, “all items of the supplier”, etc. You certainly don’t want to fetch those eagerly. Because if you fetch 2000 users for an organization, which in turn have 1000 orders each, and an order has 3 items on average, which in turn have a collection of all people who have purchased them... you’ll end up with your entire database in memory. Obviously you need lazy collections, right? Well, no. In that case you should not be using collection mappings at all. These types of relations are, in 99% of the cases, displayed as paged lists in the UI, or in search results. They are never (and should never be) displayed all on one screen (and should rarely be returned in one API call, if your application provides something like a REST API). You have to make queries for them, and use query.setMaxResults() and query.setFirstResult() (or limit them with some restrictive criteria). Furthermore, having the collections mapped means someone will try to use them at some point, which may fail. And if the object is serialized (to xml, json, etc.), the collection contents will be fetched – something you almost certainly don’t want to happen.
(A draft idea here: JPA could have a PagedList collection that would allow paged lazy fetching, thus eliminating the need for a query.) So, what did I just say – that you should never use lazy collections? Use eager collections for very simple, shallow mappings, and use paged queries for the bigger ones? Well, not exactly. Lazy collections are there and they have their application, though it is rather limited – or at least they are far less applicable than how often they are used. Here’s an example scenario where I found them applicable. In my side-project I have a Message entity, and it holds a collection of Picture entities. When a user uploads a picture, it is stored in that collection. A message can have no more than 10 pictures, so the collection could very well be eager. But then, Message is the most commonly used entity – it’s fetched on virtually every request. And only some messages have pictures (how many of the tweets on your stream have a picture upload?). So I don’t want Hibernate to make queries just to find out there are no pictures for a given message. Hence I store the number of pictures in a separate field, make the pictures collection lazy, and Hibernate.initialize(..) it manually, only if the number of pictures is > 0. So there are scenarios where the entity has optional collections that fall into the first category above (“small, shallow collections”). If a collection is small, shallow and optional (say, used in less than 20% of the cases), then you should go with lazy to save unnecessary queries. For everything else – lazy collections will make your life harder. From http://techblog.bozho.net/?p=645
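The paged-query alternative for the big, core-entity relations boils down to translating a page request into setFirstResult/setMaxResults arguments. A hedged sketch, with a hypothetical Order entity shown only in the comment (the helper itself is plain Java):

```java
// Hedged sketch: the offset computation behind a JPA paged query, e.g.
//   em.createQuery("SELECT o FROM Order o WHERE o.user.id = :id", Order.class)
//     .setFirstResult(Paging.firstResult(page, pageSize))
//     .setMaxResults(pageSize)
//     .getResultList();
// (entity and field names above are hypothetical)
class Paging {
    // first row index for a zero-based page number
    static int firstResult(int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        return page * pageSize;
    }
}
```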
October 28, 2011
· 20,716 Views · 1 Like

Comments

A Simple Plugin System for Web Applications

Jan 20, 2019 · James Sugrue

It was a long time ago and I've lost all code except the one in the blogpost..


Basic API Rate-Limiting

Jun 10, 2018 · Duncan Brown

If you want to be precise, yes. And it's not so applicable to the scenario of constantly coming and going servers. But that's not the general use case out there

Why I Still Prefer Eclipse Over IntelliJ IDEA

Nov 15, 2017 · Mike Gates

Eclipse is not without issues, of course, and I also like IDEA's extras.

Why I Still Prefer Eclipse Over IntelliJ IDEA

Nov 15, 2017 · Mike Gates

which one I didn't know how to use? The call hierarchy critique was not about me not liking it, but an observation of IDEA users. I use it heavily (and find it lacking when trying to find a default constructor call hierarchy).

The rest are not features (I know about projects vs modules - but modules imply logical coupling and that's not applicable in many cases)


A Simple Plugin System for Web Applications

Aug 09, 2013 · James Sugrue

.......

A Simple Plugin System for Web Applications

Aug 09, 2013 · James Sugrue

Good point - the only drawback of the native solution is that the resources stay in the jars, which means you can't change them in development mode, and save-and-refresh won't work.

Skills Matter Coding Dojo - video

May 23, 2013 · Skills Matter Marketing

Automatic reloading of jsps is a standard JavaEE servlet container option, there is no issue with that.

Don’t Use JSON And XML As Internal Transfer Formats

Nov 05, 2012 · Bozhidar Bozhanov

Disagreed about both points. 1. Testing works OK, for two reasons: first, you can deserialize the message manually or log it to a console; second, see point 2. 2. It is no problem to switch between JSON and a binary format if you do things properly. If we ever need to get back to JSON, for example, all we need to change is the converter implementation. This also means the JSON converter can be used for quick tests, while the MessagePack converter can be used in staging/production.
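The "only change the converter implementation" point can be sketched as a small abstraction. This is a minimal, hypothetical example (the interface and class names are illustrative, not from the original post); a real setup would put Jackson or MessagePack behind the same contract:

```java
import java.nio.charset.StandardCharsets;

public class ConverterDemo {
    // Hypothetical transfer-format abstraction: the rest of the system
    // depends only on this interface, so swapping JSON for a binary
    // format means changing one implementation class.
    interface MessageConverter {
        byte[] serialize(String payload);
        String deserialize(byte[] bytes);
    }

    // Trivial text-based converter, the kind you'd keep around for
    // quick tests and debugging while production uses a binary one.
    static class PlainTextConverter implements MessageConverter {
        public byte[] serialize(String payload) {
            return payload.getBytes(StandardCharsets.UTF_8);
        }
        public String deserialize(byte[] bytes) {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        MessageConverter converter = new PlainTextConverter();
        byte[] wire = converter.serialize("hello");
        System.out.println(converter.deserialize(wire)); // prints "hello"
    }
}
```

Because callers never see the wire format, switching environments (JSON in development, MessagePack in production) is a configuration decision rather than a code change.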
Avoid Lazy JPA Collections

Oct 28, 2011 · James Sugrue

@Gordon Yorke - yes, but how is that achieved in EclipseLink? Isn't it OSIV in disguise?
Avoid Lazy JPA Collections

Oct 25, 2011 · Bozhidar Bozhanov

I wouldn't write a thing like that if it was based on only one application. It's about 6 or 7. And those extended object graphs - did you need collections at all for that? Didn't it fall in the 2nd category? Yes, Hibernate.initialize(..) ties you to Hibernate. Perhaps a better approach would be MyPersistenceUtil.initialize(..), which in turn could delegate to the provider's initializer.
Avoid Lazy JPA Collections

Oct 25, 2011 · Bozhidar Bozhanov

What exactly do you mean? How do EclipseLink or OpenJPA handle the situation? As you've noticed, I have included only JPA annotations. The issue is conceptual, not "implementational". It's the same in other places, unless they have decided to make something super-custom (like my draft idea about PagedList). But that's not JPA anyway.
Avoid Lazy JPA Collections

Oct 25, 2011 · Bozhidar Bozhanov

The title is a broad generalization, I agree. But the article simply says that "lazy" is used in more situations than it is applicable for. And it is not applicable in too many situations. As I noted, I prefer Hibernate.initialize(..) rather than calling the accessor. And having 2 versions of the same entity looks like a wrong approach to me.
Getters and Setters Are Not Evil

Oct 13, 2011 · Bozhidar Bozhanov

Generating them with IDE is simple. I agree having something like properties in C# would be nicer, but it's not bad with IDEs either
Extract method: an in-depth motivation

Oct 12, 2011 · Andries Inzé

That's very similar to my solution of the same problem http://java.dzone.com/articles/making-spring-and-quartz-work . It really seems spring should consider putting such support in their code - it is obviously a requirement in many cases.
Lessons learnt using Hibernate

Sep 28, 2011 · Shamanth Murthy

I disagree, again. Spring has nothing to do with that, and I can assure you it is not over-engineered. Hibernate has its issues, but once you move one step further from plain JDBC (which almost everyone does eventually), you will stumble upon many problems hibernate has already solved. I don't get your skepticism - have Gavin and Rod done anything bad to you? :)
Lessons learnt using Hibernate

Sep 27, 2011 · Shamanth Murthy

I disagree. And so do hibernate designers, as we can see - read carefully http://community.jboss.org/wiki/EqualsAndHashCode. Object identity != database identity. They are used for different purposes. That said, it is usually fine to use the primary key (I do so in most cases), but you should be aware that it bears some complications.
Lessons learnt using Hibernate

Sep 26, 2011 · Shamanth Murthy

The recommendation is to use the "semi"-unique attributes for hashCode & equals, not for primary key.
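A minimal sketch of that recommendation, using a hypothetical entity with a natural business key (the class and field names are illustrative): equals and hashCode are based on the semi-unique attribute, while the database-generated id plays no part in equality.

```java
import java.util.Objects;

public class BookDemo {
    // Sketch of an entity whose equals/hashCode use a "semi"-unique
    // business attribute (the ISBN) rather than the generated primary key.
    static class Book {
        private Long id;      // database identity - NOT used for equality
        private String isbn;  // business identity - used for equality

        Book(String isbn) { this.isbn = isbn; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Book)) return false;
            return Objects.equals(isbn, ((Book) o).isbn);
        }

        @Override
        public int hashCode() {
            return Objects.hash(isbn);
        }
    }

    public static void main(String[] args) {
        // A transient instance and its persisted counterpart compare equal,
        // so Sets and other hash-based collections stay stable across a save.
        System.out.println(new Book("978-0321356680").equals(new Book("978-0321356680")));
    }
}
```

With an id-based equals, a transient entity (id still null) would change identity the moment it is saved; the business-key approach avoids that, at the cost of the complications mentioned above.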
The Logging Mess: What to use for Java Logging

Sep 26, 2011 · mitchp

Well, it comes from my blog, and is aggregated in JCG and java.dzone.com.
Steve Ballmer: "I would love to see all open source innovation happen on top of Windows"

Sep 07, 2011 · Mr B Loid

They use dynamic proxies, which is still an OOP approach, although, for concrete classes, there's no good support in the language and its libraries. Btw, in normal (eager) scenarios JPA implementations don't need to make proxies.

I called it "black magic" because the dependency appears to come from nowhere. With JPA proxies you get that from the entity manager, so even though it may look strange, it is obvious where it comes from. With AspectJ you have "new Foo()", and suddenly your instances have all sorts of things happening to them - contrary to what you'd expect from a normal object.

I agree that aspectj is not a bad thing, but note also my point that it is only supported by spring. I was suggesting something more standard.

Steve Ballmer: "I would love to see all open source innovation happen on top of Windows"

Sep 07, 2011 · Mr B Loid

DDD with ORM + dependency injection is problematic, because you can't have the DI framework inject your dependencies into the domain objects.
Those evil frameworks and their complexity

Aug 31, 2011 · Byron Kiourtzoglou

As I noted in the article, there are always some projects that will probably be handled better without any (major) framework. But that's not true of most projects. "Going directly to code" doesn't exclude frameworks. Just the other day I started a small project with Spring, Spring MVC and Google App Engine. The project is already deployed and working. It's simple, but the frameworks made it even simpler.
Entity Framework object graphs and viewstate

Aug 31, 2011 · Tony Thomas

If the project is using a JavaEE container, perhaps that's the better option, yes. But is the app server configuration easily portable?
Entity Framework object graphs and viewstate

Aug 31, 2011 · Tony Thomas

web.xml is part of the build, so I don't think it's nice to put the path there.

I don't like the database option, because it is much easier for Ops to deploy, redeploy and reconfigure text files than a database. It's simpler for development as well.

Entity Framework object graphs and viewstate

Aug 31, 2011 · Tony Thomas

Of course, that is up to you, the containers I noted were just examples. They would work fine in some scenarios, and won't in other, more complex ones. Since I'm mostly using spring, it does all that for me.
Entity Framework object graphs and viewstate

Aug 30, 2011 · Tony Thomas

Yes, the mechanism for getting the values into your classes depends on the framework used. @Resource is an option in many DI frameworks. But the important bit is populating the properties in the first place, and not putting them inside the build.
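The "populate outside the build" idea can be sketched without any framework at all: load a Properties file from a location supplied at runtime, with in-code defaults. This is a hypothetical, minimal example - the system property name, file path, and keys are all illustrative:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class ExternalConfig {
    // Sketch: configuration lives outside the packaged artifact; its
    // location is supplied at runtime, e.g.
    //   java -Dconfig.location=/etc/myapp/app.properties ...
    static Properties load() {
        Properties props = new Properties();
        props.setProperty("db.url", "jdbc:h2:mem:test"); // in-code default
        String location = System.getProperty("config.location");
        if (location != null) {
            try (FileInputStream in = new FileInputStream(location)) {
                props.load(in); // external values override the defaults
            } catch (IOException e) {
                throw new IllegalStateException("Cannot read config: " + location, e);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(load().getProperty("db.url"));
    }
}
```

Ops can then edit and redeploy the text file without touching the build, which is the point made above; a DI framework would merely wire the loaded values into beans (e.g. via @Resource or a property placeholder).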
What to do with IDE project files

Aug 20, 2011 · Bozhidar Bozhanov

I didn't get the line about the custom settings.
What to do with IDE project files

Aug 20, 2011 · Bozhidar Bozhanov

I wouldn't suggest it if project setup hadn't been non-trivial on numerous occasions. Yes, projects that use Maven are a bit easier, but they may still take up a lot of time. And yes, I know it's an unpopular opinion, but these files do no harm if properly handled.
Google plus the dawn of facebook

Jul 01, 2011 · Alex Hortopan

"dawn" like "before sunrise", "beginning"?
Those evil frameworks and their complexity

May 09, 2011 · Bozhidar Bozhanov

I believe I made that point - if you think standards are not good, and know how to improve it - do it. Like Gavin King and Rod Johnson did. But if you don't know how exactly - stick with the standards.
Those evil frameworks and their complexity

May 02, 2011 · Bozhidar Bozhanov

DI != coding to an interface. I have used DI without interfaces for classes that do not require one.
Those evil frameworks and their complexity

May 01, 2011 · Bozhidar Bozhanov

And there goes the typical argument in this case "this is not programming", "this is complex", "it's unneeded". I'm perfectly fine with people writing crappy code, I just don't want it in my projects ;) (and no, crappy code != non-spring code, but it usually comes from people in complete denial of concepts like DI)
Those evil frameworks and their complexity

May 01, 2011 · Bozhidar Bozhanov

My Spring XML file is 100 lines, which is mainly configuration of the DB, pools, and external utilities. Everything else is through annotations. So no, I'm not programming in XML. I just save one "new FooImpl()" everywhere.
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

now that's just random jabber and trolling ;)
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

So? In what projects, in what context is he using "new"? If Gosling tells you he manually sorts arrays (because he needed some particular optimization), will you deprecate Arrays.sort?
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

Yes, but so far JPA has leaked only on this occasion. Which is rather different from your claim that it leaks completely and entirely.
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

Only one hibernate abstraction is leaky, and that is the autoflushing (http://techblog.bozho.net/?p=227). In my current hibernate project I'm using 1. my entities as DTOs, as if they had no relation with hibernate. 2. a service layer that is completely unaware of hibernate, and I'm already switching some parts of the systems to NoSQL stores, by only changing DAOs. So, no, hibernate doesn't leak. It may leak if you use it the wrong way.
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

sorry, clicked the wrong button, should've been an upvote rather than a downvote. DI is _almost_ a silver bullet. It need not be spring, it can be CDI or guice. But it greatly simplifies the architecture and maintainability. And I'm talking about this kind of people that throw DI away because they had some confusion with it. Well, the problem is not DI.
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

It didn't work the last time I pressed it.
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

I claim to know Spring very in-depth (http://stackoverflow.com/tags/spring/topusers), and I haven't found a single fault in its design. Yes, some classes do not fit your case completely, but then you extend them, override one method, and it works. And honestly, whenever someone claims something's wrong with Spring (or Hibernate), it's either without any concrete examples, or the examples prove their incompetence.
Those evil frameworks and their complexity

Apr 30, 2011 · Bozhidar Bozhanov

Spring and hibernate are not over-engineered. That is my point - people that don't get them in-depth accuse them of being complex. And another thing - discussing the pros and cons is what I'm for. But usually I see arguments like "they are overengineered" .. from people that haven't done a single project with them.
Creating a Copy of JavaBeans

Feb 28, 2011 · James Sugrue

This is not deep copying. Both clone() and BeanUtils.cloneBean() make shallow copies. Deep copying means that the whole data hierarchy is replaced. With the above approaches if you call foo.getBar().setBaz(..), the change will be reflected in both the clone and the original object.

There are two ways to do deep cloning in Java - one is reflection and the other is serialization. commons-lang has SerializationUtils.clone(..). And for reflection there's java-deep-cloning-library.

See this stackoverflow answer.
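The serialization route can also be done with plain JDK classes - a sketch of roughly what SerializationUtils.clone(..) does under the hood (the utility class here is illustrative, not the commons-lang implementation):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class DeepCopy {
    // Deep-clones a Serializable object graph by round-tripping it through
    // Java serialization. Unlike clone() or BeanUtils.cloneBean(), nested
    // objects are copied too, so mutating the copy never affects the original.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepClone(T original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deep clone failed", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<ArrayList<String>> original = new ArrayList<>();
        original.add(new ArrayList<>(List.of("a")));
        ArrayList<ArrayList<String>> copy = deepClone(original);
        copy.get(0).set(0, "b");                  // mutate the nested list of the copy
        System.out.println(original.get(0).get(0)); // prints "a" - original untouched
    }
}
```

The constraint, of course, is that every object in the graph must implement Serializable; the reflection-based libraries exist precisely to avoid that requirement.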

ThreadLocal variables and thread pools – it can go wrong

Feb 22, 2011 · Bozhidar Bozhanov

That is as a last resort, when the vendor does not want to fix the issues with his code. Anyway, luckily there aren't many closed-source libraries in the java world.
Why startups should not choose NoSQL

Dec 30, 2010 · Bozhidar Bozhanov

No. But you would agree that the decision between learning curve vs no learning curve, if timing is crucial, leans towards no learning curve. Anyway, the point was (as I specified in the comments under the post) that people should be careful not to choose NoSQL for the wrong reasons.
10 Effective Ways to Become a Good Programmer

Dec 28, 2010 · James Sugrue

After a year on stackoverflow, answering questions, I can vote with two hands for point 3 ;)
RE: Moving from Spring to Java EE 6: The Age of Frameworks is Over

Oct 18, 2010 · Peter Thomas

I wouldn't agree that frameworks like Spring will be obsoleted either. Here is my extended opinion: http://www.dzone.com/links/views_on_cdi_and_whats_new_in_javaee6.html
Throwing Undeclared Checked Exceptions

Sep 17, 2010 · Luigi Viggiano

And what is the problem of

throw new RuntimeException(originalException);
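In code, wrapping keeps the original exception as the cause, so nothing is lost from the stack trace. A minimal sketch (the method and message are hypothetical stand-ins for real I/O code):

```java
import java.io.IOException;

public class Wrapping {
    // Wraps a checked exception in an unchecked one: callers are not forced
    // to declare or catch IOException, yet the original survives as the cause.
    static String readConfig() {
        try {
            throw new IOException("disk error"); // stand-in for a real I/O failure
        } catch (IOException originalException) {
            throw new RuntimeException(originalException);
        }
    }

    public static void main(String[] args) {
        try {
            readConfig();
        } catch (RuntimeException e) {
            System.out.println(e.getCause().getMessage()); // prints "disk error"
        }
    }
}
```

The trade-off versus throwing the undeclared checked exception directly is that the type in the signature changes, but the full cause chain is preserved for logging and debugging.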

C Programming FAQs for orkut forums.

Sep 05, 2010 · Sharath A.V

Reread this post and ask yourself whether it's not just a rant without any sign of concreteness.

In this thread I've made some (I hope) less biased points about CDI

Btw, Spring commercial? VMware is, but not the Spring framework.

CDI is based on Seam, Guice and Spring. (less on spring, alas). And it looks good when you get into depth - it is a mature DI framework. Whether it's better than spring or guice - time will tell.

Quick comparison of Apollo vs JavaFX vs Silverlight vs Flash/Flex

May 23, 2010 · erik ooostent

Can you create a generic repository (DAO) with this scenario? "createBean" is type-unsafe, because you are passing the bean name and it returns an Object. If we extend this approach to JPA, we'll have to consider caches, the entity manager lifecycle combined with a prototype lifecycle, etc.

My view on DDD with spring is a bit different - you shouldn't use spring to manage your domain objects (if using JPA at least)

How to restart a remote computer

Aug 20, 2009 · Denis Gobo

Well, your statement is wrong from the very beginning. It is not hard to google 'JSF ajax' (or something of the sort) and stumble upon wonderful frameworks like RichFaces or PrimeFaces. There are ajaxified and ajaxifying JSF components. Then all of your arguments go nowhere. I've worked with RichFaces for some time, and it is really state of the art. You use Ajax while writing no JavaScript whatsoever. Validators can use your model objects directly (with Hibernate Validator), so - automatic Ajax validation. And so on, and so forth.

So better go check those frameworks ;)

Real Programmers Don't Use Pascal

Aug 16, 2009 · Lebon Bon Lebon

Actually, JSF is not that much 'ivory towery'. I'm using JSF 1.2 now + RichFaces, and I must say, this is pretty much sufficient. It offers almost all the abstractions that can possibly be needed (and even such that would not be needed). Yes, it has drawbacks, which are being addressed in JSF 2.0.

NetBeans 6.1 - not even 10% a useful IDE

Jul 09, 2008 · Bozhidar Bozhanov

Look, that was my point - it has a good tool for reverse engineering, good Ruby support, even a good Matisse (I actually have no complaints about Matisse - it is fine). But basic editing and general features are somewhat buggy, and hence everything is spoiled.
