The Latest Coding Topics

All About Java Modifier Keywords
I've been a Java programmer for a while now; however, recently someone asked me a question about one of the Java modifier keywords and I had no clue what it was. This made it obvious to me that I needed to brush up on some Java that goes beyond actual coding and algorithms. After a few Google searches, I got bits and pieces on the topic, but never really the full story, so I'm using this post as a way to document the subject. This is a great interview question to test your computer science book-smarts.

Modifiers in Java are keywords that you add to variables, classes, and methods in order to change their meaning. They can be broken into two groups:

- Access control modifiers
- Non-access modifiers

Let's first take a look at the access control modifiers and see some code examples on how to use them.

- public: visible to the world
- private: visible to the class
- protected: visible to the package and all subclasses

So how do you use these three access control modifiers? Let's take the following two classes. Please ignore how inefficient they may or may not be, as that is beside the point for this tutorial.

Create a file called project/mypackage/Person.java and add the following code:

```java
package mypackage;

class Person {

    private String firstname;
    private String lastname;

    protected void setFirstname(String firstname) {
        this.firstname = firstname;
    }

    protected void setLastname(String lastname) {
        this.lastname = lastname;
    }

    protected String getFirstname() {
        return this.firstname;
    }

    protected String getLastname() {
        return this.lastname;
    }

}
```

The above Person class has private variables and protected methods. This means that the variables are only accessible from within the class, and the methods are only accessible from the mypackage package and from subclasses.

Next create a file called project/mypackage/Company.java and add the following code:

```java
package mypackage;

import java.util.*;

public class Company {

    private ArrayList<Person> people;

    public Company() {
        this.people = new ArrayList<Person>();
    }

    public void addPerson(String firstname, String lastname) {
        Person p = new Person();
        p.setFirstname(firstname);
        p.setLastname(lastname);
        this.people.add(p);
    }

    public void printPeople() {
        for (int i = 0; i < this.people.size(); i++) {
            System.out.println(this.people.get(i).getFirstname() + " " + this.people.get(i).getLastname());
        }
    }

}
```

The above class is public, so it can be accessed from any future classes inside and outside of the package. It has a private variable that is only accessible from within the class, and it has a bunch of public methods. Because the Person class and the Company class share the same package, the Company class can access the Person class as well as all its methods.

To complete the demonstration of the access control modifiers, let's create a driver class in a new project/MainDriver.java file:

```java
import mypackage.*;

public class MainDriver {

    public static void main(String[] args) {
        Company c = new Company();
        c.addPerson("Nic", "Raboy");
        c.printPeople();
        Person p = new Person();
        p.setFirstname("Maria");
        p.setLastname("Campos");
    }

}
```

Remember, because the Company class is public, we won't have issues adding and printing people. However, because the Person class is package-private (it has no access modifier) and its methods are protected, we're going to get a compile-time error, since MainDriver is not part of the mypackage package.

Now let's take a look at the available non-access modifiers and some example code on how to use them.

- static: used for creating class methods and variables
- final: used for finalizing implementations of classes, variables, and methods
- abstract: used for creating abstract methods and classes
- synchronized: used in threads; locks the method or variable so it can only be used by one thread at a time
- volatile: used in threads; keeps the variable in main memory rather than caching it locally in each thread

So how do you use these five non-access modifiers? A good example of the static modifier is the following in Java:

```java
int max = Integer.MAX_VALUE;
int numeric = Integer.parseInt("1234");
```

Notice in the above example we make use of variables and methods in the Integer class without first instantiating it. This is because those particular methods and variables are static.

The abstract modifier is a little different. You can create a class with methods, but they are essentially nothing more than definitions; you cannot add logic to them. For example:

```java
abstract class Shape {
    abstract int getArea(int width, int height);
}
```

Then inside a child class you would add code similar to this:

```java
class Rectangle extends Shape {
    int getArea(int width, int height) {
        return width * height;
    }
}
```

This brings us to the synchronized and volatile modifiers. Let's take a look at a threading example where we try to access the same method from two different threads:

```java
import java.lang.*;

public class ThreadExample {

    public static void main(String[] args) {
        Thread thread1 = new Thread(new Runnable() {
            public void run() {
                print("THREAD 1");
            }
        });
        Thread thread2 = new Thread(new Runnable() {
            public void run() {
                print("THREAD 2");
            }
        });
        thread1.start();
        thread2.start();
    }

    public static void print(String s) {
        for (int i = 0; i < 5; i++) {
            System.out.println(s + ": " + i);
        }
    }

}
```

Running the above code will result in output that is printed in a random order. It could be sequential, or not; it depends on the CPU. However, if we make use of the synchronized modifier, the first thread must complete before the second one can start printing. The print(String s) method will now look like this:

```java
public static synchronized void print(String s) {
    for (int i = 0; i < 5; i++) {
        System.out.println(s + ": " + i);
    }
}
```

Next let's take a look at an example using the volatile modifier:

```java
import java.lang.*;

public class ThreadExample {

    public static volatile boolean isActive;

    public static void main(String[] args) {
        isActive = true;
        Thread thread1 = new Thread(new Runnable() {
            public void run() {
                while (true) {
                    if (isActive) {
                        System.out.println("THREAD 1");
                        isActive = false;
                    }
                }
            }
        });
        Thread thread2 = new Thread(new Runnable() {
            public void run() {
                while (true) {
                    if (!isActive) {
                        System.out.println("THREAD 2");
                        try {
                            Thread.sleep(100);
                        } catch (Exception e) { }
                        isActive = true;
                    }
                }
            }
        });
        thread1.start();
        thread2.start();
    }

}
```

Running the above code will print the thread number and alternate between the two threads, because our volatile variable is a status flag that is kept in main memory. If we remove the volatile keyword, the threads will only alternate once, because each thread may work with its own locally cached copy of the flag and the two threads' updates are essentially hidden from each other.

Conclusion

Java modifiers can be a bit tricky to understand, and it is actually common for programmers to be unfamiliar with a lot of them. This is a great interview question to test your book knowledge, too. If I've missed any, or you think my explanations could be better, definitely share in the comments section.
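One keyword from the non-access list that didn't get its own snippet above is final. As a minimal illustrative sketch (the class and field names here are made up for the example), final prevents subclassing, overriding, and reassignment:

```java
// A final class cannot be extended
final class MathConstants {
    // A final variable must be assigned exactly once and cannot be reassigned
    static final double PI_APPROX = 3.14159;
}

class Greeter {
    // A final method cannot be overridden in a subclass
    final String greet(String name) {
        return "Hello, " + name;
    }
}

// class ExtendedConstants extends MathConstants {} // compile-time error: cannot inherit from a final class
```

Trying to extend MathConstants, override greet, or assign a new value to PI_APPROX all fail at compile time.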
May 15, 2015
by Nic Raboy
· 12,767 Views
Log Collection With Graylog on AWS
Log collection is essential to properly analyzing issues in production. An interface to search and be notified about exceptions on all your servers is a must. Well, if you have one server, you can easily ssh to it and check the logs, of course, but for larger deployments, collecting logs centrally is far preferable to logging into 10 machines in order to find "what happened". There are many options to do that, roughly separated into two groups: 3rd-party services and software you install yourself.

3rd-party (or "cloud-based", if you want) log collection services include Splunk, Loggly, Papertrail and Sumologic. They are very easy to set up and you pay for what you use. Basically, you send each message (e.g. via a custom logback appender) to a provider's endpoint and then use the dashboard to analyze the data. In many cases that would be the preferred way to go. In other cases, however, company policy may frown upon using 3rd-party services to store company-specific data, or the additional costs may be undesired. In these cases extra effort needs to be put into installing and managing internal log collection software. They work in a similar way, but implementation details may differ (e.g. instead of sending messages with an appender to a target endpoint, the software collects local logs using some sort of agent and aggregates them). Open-source options include Graylog, FluentD, Flume and Logstash. After some very quick research, I decided Graylog fit our needs best, so below is a description of the installation procedure on AWS (though the first part applies regardless of the infrastructure).

The first thing to look at is the set of ready-to-use images provided by Graylog, including Docker, OpenStack, Vagrant and AWS. Unfortunately, the AWS version has two drawbacks. The first is that it uses Ubuntu rather than the Amazon AMI. That's not a huge issue, although some generic scripts you use in your stack may have to be rewritten. The other was the dealbreaker: when you start it, it doesn't run a web interface, although it claims it should. Only mongodb, elasticsearch and graylog-server are started. Having two instances, one for the web interface and one for the rest, would complicate things, so I opted for manual installation.

Graylog has two components: the server, which handles the input, indexing and searching, and the web interface, which is a nice UI that communicates with the server. The web interface uses mongodb for metadata, and the server uses elasticsearch to store the incoming logs. Below is a bash script (CentOS) that handles the installation. Note that there is no "sudo", because initialization scripts are executed as root on AWS.

```bash
#!/bin/bash

# install pwgen for password-generation
yum upgrade ca-certificates --enablerepo=epel
yum --enablerepo=epel -y install pwgen

# mongodb
cat >/etc/yum.repos.d/mongodb-org.repo <<'EOT'
[mongodb-org]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
EOT

yum -y install mongodb-org
chkconfig mongod on
service mongod start

# elasticsearch
rpm --import https://packages.elasticsearch.org/GPG-KEY-elasticsearch
cat >/etc/yum.repos.d/elasticsearch.repo <<'EOT'
[elasticsearch-1.4]
name=Elasticsearch repository for 1.4.x packages
baseurl=http://packages.elasticsearch.org/elasticsearch/1.4/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
EOT

yum -y install elasticsearch
chkconfig --add elasticsearch

# configure elasticsearch
sed -i -- 's/#cluster.name: elasticsearch/cluster.name: graylog2/g' /etc/elasticsearch/elasticsearch.yml
sed -i -- 's/#network.bind_host: localhost/network.bind_host: localhost/g' /etc/elasticsearch/elasticsearch.yml
service elasticsearch stop
service elasticsearch start

# java
yum -y update
yum -y install java-1.7.0-openjdk
update-alternatives --set java /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java

# graylog
wget https://packages.graylog2.org/releases/graylog2-server/graylog-1.0.1.tgz
tar xvzf graylog-1.0.1.tgz -C /opt/
mv /opt/graylog-1.0.1/ /opt/graylog/
cp /opt/graylog/bin/graylogctl /etc/init.d/graylog
sed -i -e 's/GRAYLOG2_SERVER_JAR=\${GRAYLOG2_SERVER_JAR:=graylog.jar}/GRAYLOG2_SERVER_JAR=\${GRAYLOG2_SERVER_JAR:=\/opt\/graylog\/graylog.jar}/' /etc/init.d/graylog
sed -i -e 's/LOG_FILE=\${LOG_FILE:=log\/graylog-server.log}/LOG_FILE=\${LOG_FILE:=\/var\/log\/graylog-server.log}/' /etc/init.d/graylog
cat >/etc/init.d/graylog <<'EOT'
#!/bin/bash
# chkconfig: 345 90 60
# description: graylog control
sh /opt/graylog/bin/graylogctl $1
EOT
chkconfig --add graylog
chkconfig graylog on
chmod +x /etc/init.d/graylog

# graylog web
wget https://packages.graylog2.org/releases/graylog2-web-interface/graylog-web-interface-1.0.1.tgz
tar xvzf graylog-web-interface-1.0.1.tgz -C /opt/
mv /opt/graylog-web-interface-1.0.1/ /opt/graylog-web/
cat >/etc/init.d/graylog-web <<'EOT'
#!/bin/bash
# chkconfig: 345 91 61
# description: graylog web interface
sh /opt/graylog-web/bin/graylog-web-interface > /dev/null 2>&1 &
EOT
chkconfig --add graylog-web
chkconfig graylog-web on
chmod +x /etc/init.d/graylog-web

# configure
mkdir --parents /etc/graylog/server/
cp /opt/graylog/graylog.conf.example /etc/graylog/server/server.conf
sed -i -e 's/password_secret =.*/password_secret = '$(pwgen -s 96 1)'/' /etc/graylog/server/server.conf
sed -i -e 's/root_password_sha2 =.*/root_password_sha2 = '$(echo -n password | shasum -a 256 | awk '{print $1}')'/' /etc/graylog/server/server.conf
sed -i -e 's/application.secret=""/application.secret="'$(pwgen -s 96 1)'"/g' /opt/graylog-web/conf/graylog-web-interface.conf
sed -i -e 's/graylog2-server.uris=""/graylog2-server.uris="http:\/\/127.0.0.1:12900\/"/g' /opt/graylog-web/conf/graylog-web-interface.conf

service graylog start
sleep 30
service graylog-web start
```

You may also want to set a TTL (auto-expiration) for messages, so that you don't store old logs forever. Here's how:

```bash
# wait for the index to be created
INDEXES=$(curl --silent "http://localhost:9200/_cat/indices")
until [[ "$INDEXES" =~ "graylog2_0" ]]; do
    sleep 5
    echo "Index not yet created. Indexes: $INDEXES"
    INDEXES=$(curl --silent "http://localhost:9200/_cat/indices")
done

# set each indexed message auto-expiration (ttl)
curl -XPUT "http://localhost:9200/graylog2_0/message/_mapping" -d '{"message": {"_ttl" : { "enabled" : true, "default" : "15d" }}}'
```

Now you have everything running on the instance. Then you have to do some AWS-specific things (if using CloudFormation, that would include a pile of JSON). Here's the list:

- You can either have an auto-scaling group with one instance, or a single instance. I prefer the ASG, though the other one is a bit simpler. The ASG gives you auto-respawn if the instance dies.
- Set the above script to be invoked in the UserData of the launch configuration of the instance/ASG (e.g. by getting it from S3 first).
- Allow UDP port 12201 (the default logging port). That should happen for the instance/ASG security group (inbound), for the application nodes' security group (outbound), and also in a network ACL of your VPC. Test the UDP connection to make sure it really goes through. Keep access restricted for all sources except your instances.
- You need to pass the private IP address of your Graylog server instance to all the application nodes. That's tricky on AWS, as private IP addresses change, so you need something stable. You can't use an ELB (load balancer), because it doesn't support UDP. There are two options:
  - Associate an Elastic IP with the node on startup and pass that IP to the application nodes. But there's a catch: if they connect to the Elastic IP, traffic would go via NAT (if you have one), and you may have to open your instance "to the world". So you must turn the Elastic IP into its corresponding public DNS; the DNS will then be resolved to the private IP. You can do that manually (and hackily):

```bash
GRAYLOG_ADDRESS="ec2-${GRAYLOG_ADDRESS//./-}.us-west-1.compute.amazonaws.com"
```

  or you can use the AWS EC2 CLI to obtain the details of the instance that the Elastic IP is associated with, and then with another call obtain its public DNS.
  - Instead of using an Elastic IP, which limits you to a single instance, you can use Route53 (the AWS DNS manager). That way, when a Graylog server instance starts, it can append itself to a Route53 record, allowing for round-robin DNS of multiple Graylog instances in a cluster. Manipulating the Route53 records is again done via the AWS CLI. Then you just pass the domain name to the application nodes so that they can send messages.
- Alternatively, you can install graylog-server on all the nodes (as an agent) and point them to an elasticsearch cluster. But that's more complicated and probably not the intended way to do it.
- Configure your logging framework to send messages to Graylog. There are standard GELF (the Graylog format) appenders, e.g. this one, and the only thing you have to do is use the public DNS environment variable in the logback.xml (which supports environment variable resolution).
- You should make the web interface accessible outside the network, so you can use an ELB for that, or the round-robin DNS mentioned above. Just make sure the security rules are tight and do not allow external tampering with your log data.
- If you are not running a Graylog cluster (which I won't cover), then the single instance can potentially fail. That isn't a great loss, as log messages can be obtained from the instances, and they are short-lived anyway. But the metadata of the web interface is important: dashboards, alerts, etc. So it's good to do regular backups (e.g. with mongodump). Using an EBS volume is also an option.
- Even though you send your log messages to the centralized log collector, it's a good idea to also keep local logs, with proper log rotation and cleanup.

It's not a trivial process, but it's essential to have log collection, so I hope the guide has been helpful.
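Once the stack is up, it can be handy to push a test message before wiring in a real appender. The snippet below is only a smoke-test sketch: the host name is a placeholder for your instance's public DNS, it assumes a GELF UDP input listening on the default port 12201, and it sends a minimal uncompressed GELF payload (version, host and short_message are the required fields):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class GelfSmokeTest {

    public static void main(String[] args) throws Exception {
        // Placeholder: substitute the public DNS / Route53 name of your Graylog instance
        String graylogHost = "graylog.example.com";
        int gelfUdpPort = 12201; // default GELF UDP input port

        // Minimal uncompressed GELF payload; version, host and short_message are required
        String payload = "{\"version\":\"1.1\",\"host\":\"test-client\","
                + "\"short_message\":\"hello from the smoke test\",\"level\":6}";
        byte[] bytes = payload.getBytes("UTF-8");

        DatagramSocket socket = new DatagramSocket();
        try {
            socket.send(new DatagramPacket(bytes, bytes.length,
                    InetAddress.getByName(graylogHost), gelfUdpPort));
            System.out.println("GELF message sent; search for it in the Graylog web interface");
        } finally {
            socket.close();
        }
    }
}
```

If the message does not show up in the search UI, the usual suspects are the security group, network ACL or VPC routing described in the list above.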
May 14, 2015
by Bozhidar Bozhanov
· 19,629 Views
HashMap Custom Implementation in Java
Contents of page:
- Custom HashMap
- Entry
- Putting 5 key-value pairs in HashMap (step-by-step)
- Methods used in custom HashMap
- What will happen if the map already contains a mapping for the key?
- Complexity calculation of put and get methods in HashMap (put method worst-case and best-case complexity, get method worst-case and best-case complexity)
- Summary of complexity of methods in HashMap

Custom HashMap

This is a very important and trending topic. In this post I will be explaining a custom HashMap implementation in lots of detail, with diagrams which will help you in visualizing the HashMap implementation. I will be explaining how we put and get a key-value pair in HashMap by overriding:
- the equals method, which helps in checking equality of entry objects, and
- the hashCode method, which helps in finding the bucket index on which the data will be stored.

We will maintain a bucket (an ArrayList) which will store Entry objects (as a linked list).

Entry

We store each key-value pair using an Entry. An Entry contains K key, V value and Entry next (i.e. the next entry at that location of the bucket).

```java
static class Entry<K, V> {
    K key;
    V value;
    Entry<K, V> next;

    public Entry(K key, V value, Entry<K, V> next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }
}
```

Putting 5 key-value pairs in the custom HashMap (step-by-step)

I will explain the whole concept of HashMap by putting 5 key-value pairs into it. Initially, we have a bucket of capacity 4 (all indexes of the bucket, i.e. 0, 1, 2, 3, are pointing to null).

Let's put the first key-value pair in the HashMap: key=21, value=12. A newEntry object will be formed. We calculate the hash by using our hash(K key) method; in this case it returns key % capacity = 21 % 4 = 1. So 1 will be the index of the bucket on which the newEntry object will be stored. We go to the 1st index; as it is pointing to null, we put our newEntry object there.

Let's put the second key-value pair in the HashMap: key=25, value=121. A newEntry object will be formed. We calculate the hash: key % capacity = 25 % 4 = 1. So 1 will be the index of the bucket on which the newEntry object will be stored. We go to the 1st index; it contains the entry with key=21, so we compare the two keys (i.e. compare 21 with 25 using the equals method). As the two keys are different, we check whether the next of the entry with key=21 is null; since it is, we put our newEntry object on next.

Let's put the third key-value pair in the HashMap: key=30, value=151. A newEntry object will be formed. We calculate the hash: key % capacity = 30 % 4 = 2. So 2 will be the index of the bucket on which the newEntry object will be stored. We go to the 2nd index; as it is pointing to null, we put our newEntry object there.

Let's put the fourth key-value pair in the HashMap: key=33, value=15. An Entry object will be formed. We calculate the hash: key % capacity = 33 % 4 = 1, so 1 will be the index of the bucket on which the newEntry object will be stored. We go to the 1st index:
- It contains the entry with key=21; we compare the two keys (compare 21 with 33 using the equals method). As the two keys are different, we proceed to the next of the entry with key=21 (we proceed only if next is not null).
- Now next contains the entry with key=25; we compare the two keys (compare 25 with 33 using the equals method). As the two keys are different and the next of the entry with key=25 is pointing to null, we don't proceed further and put our newEntry object on next.

Let's put the fifth key-value pair in the HashMap: key=35, value=89. Repeat the above-mentioned steps.

Must read: LinkedHashMap Custom implementation

Methods used in the custom HashMap

- public void put(K newKey, V data): allows you to put a key-value pair in the HashMap. If the map already contains a mapping for the key, the old value is replaced. Shows the complete functionality of how to override the equals and hashCode methods.
- public V get(K key): returns the value corresponding to the key.
- public boolean remove(K deleteKey): removes the key-value pair from HashMapCustom.
- public void display(): displays all key-value pairs present in HashMapCustom. Insertion order is not guaranteed; for maintaining insertion order refer to LinkedHashMapCustom.
- private int hash(K key): implements the hashing functionality, which helps in finding the appropriate bucket location to store our data. This is a very important method, as the performance of HashMapCustom is very much dependent on its implementation.

What will happen if the map already contains a mapping for the key? The old value is replaced.

Complexity calculation of put and get methods in HashMap

put method, worst-case complexity: O(n). But how is the complexity O(n)? Initially, let's say the map already contains the entry with key=21 in bucket 1, and we have to insert a newEntry object with key=25, value=121. We calculate the hash: key % capacity = 25 % 4 = 1, so 1 will be the index of the bucket on which the newEntry object will be stored. We go to the 1st index; it contains the entry with key=21, so we compare the two keys using the equals method. As the two keys are different, we check whether the next of the entry with key=21 is null; since it is, we put our newEntry object on next. Now let's do the complexity calculation: earlier there was 1 element in the HashMap and for putting the newEntry object we iterated over it. Hence the complexity was O(n). Note: we may calculate complexity by adding more elements to the HashMap as well, but to keep the explanation simple I kept fewer elements.

put method, best-case complexity: O(1). But how is the complexity O(1)? Let's say we have to insert a newEntry object with key=30, value=151. We calculate the hash: key % capacity = 30 % 4 = 2, so 2 will be the index of the bucket on which the newEntry object will be stored. We go to the 2nd index; as it is pointing to null, we put our newEntry object there. Now let's do the complexity calculation: earlier there were 2 elements in the HashMap, but we were able to put the newEntry object in the first go. Hence the complexity was O(1).

get method, worst-case complexity: O(n). But how is the complexity O(n)? Let's say we have to get the Entry object with key=25, value=121. We calculate the hash: key % capacity = 25 % 4 = 1, so 1 will be the index of the bucket on which the Entry object is stored. We go to the 1st index; it contains the entry with key=21, so we compare the two keys using the equals method. As the two keys are different, we check whether the next of the entry with key=21 is null; it is not null, so we repeat the same process and ultimately get the Entry object. Now let's do the complexity calculation: there were 2 elements in the HashMap and for getting the Entry object we iterated over both of them. Hence the complexity was O(n). Note: we may calculate complexity by using a HashMap of larger size, but to keep the explanation simple I kept fewer elements.

get method, best-case complexity: O(1). But how is the complexity O(1)? Let's say we have to get the Entry object with key=30, value=151. We calculate the hash: key % capacity = 30 % 4 = 2, so 2 will be the index of the bucket on which the Entry object is stored. We go to the 2nd index and get the Entry object. Now let's do the complexity calculation: there were 3 elements in the HashMap, but we were able to get the Entry object in the first go. Hence the complexity was O(1).

Summary of complexity of methods in HashMap

- put(K key, V value): worst case O(n), best case O(1)
- get(Object key): worst case O(n), best case O(1)

REFER: http://javamadesoeasy.com/2015/02/hashmap-custom-implementation.html (HashMap Custom implementation - put, get, remove Employee object).
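To tie the walkthrough together, here is a compact sketch of the structure described above: a small bucket array of Entry chains, a hash method based on hashCode modulo capacity, and put/get that walk the chain with equals. The class and method names mirror the description but this is my own sketch, not the linked implementation:

```java
public class HashMapCustomSketch<K, V> {

    static class Entry<K, V> {
        K key;
        V value;
        Entry<K, V> next;

        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final int capacity = 4; // bucket capacity used in the walkthrough

    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = new Entry[capacity];

    // hashCode() modulo capacity picks the bucket index (21 % 4 = 1, 30 % 4 = 2, ...)
    private int hash(K key) {
        return Math.abs(key.hashCode()) % capacity;
    }

    public void put(K newKey, V data) {
        int index = hash(newKey);
        Entry<K, V> current = table[index];
        while (current != null) {
            if (current.key.equals(newKey)) {     // existing mapping: replace the old value
                current.value = data;
                return;
            }
            if (current.next == null) {           // end of the chain: append the new entry
                current.next = new Entry<K, V>(newKey, data, null);
                return;
            }
            current = current.next;
        }
        table[index] = new Entry<K, V>(newKey, data, null); // empty bucket
    }

    public V get(K key) {
        Entry<K, V> current = table[hash(key)];
        while (current != null) {
            if (current.key.equals(key)) {
                return current.value;
            }
            current = current.next;
        }
        return null;                              // no mapping for this key
    }
}
```

With capacity 4, Integer keys 21, 25 and 33 all land in bucket 1 and chain together exactly as in the step-by-step diagrams.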
May 13, 2015
by Ankit Mittal
· 74,203 Views · 10 Likes
Docker Machine on Windows - How to Set Up Your Hosts
I've been playing around with Docker a lot lately. There are many reasons for that; one for sure is that I love to play around with the latest technology and even help out building a demo or two, or a lab. The main difference between me and most of my coworkers is that I run my setup on Windows, like most of the middleware developers out there. So, if you followed Arun's blog about "Docker Machine to Setup Docker Host" you might have tried to make this work on Windows already. Here is the ultimate short how-to guide on using Docker Machine to administer and spin up your Docker hosts.

Docker Machine

Machine lets you create Docker hosts on your computer, on cloud providers, and inside your own data center. It creates servers, installs Docker on them, then configures the Docker client to talk to them. You basically don't have to have anything installed on your machine prior to this, which is a whole lot easier than having to manually install boot2docker first. So, let's try this out.

You want to have at least one thing in place before starting with anything Docker or Machine: go and get Git for Windows (aka msysgit). It has all kinds of helpful Unix tools in its belly, which you need anyway.

Prerequisites - The One-For-All Solution

The first option is to install the Windows boot2docker distribution, which I showed in an earlier blog. It contains the following bits configured and ready for you to use:

- VirtualBox
- Docker Windows Client

Prerequisites - The Bits and Pieces

I dislike the boot2docker installer for a variety of reasons, mostly because I want to know what exactly is going on on my machine. So I played around a bit, and here is the bits-and-pieces installation if you decide against the one-for-all solution.

Start with the virtualization solution. We need something like that on Windows, because Windows just can't run Linux, and that is what Docker is based on, at least for now. So, get VirtualBox and ensure that version 4.3.18 is correctly installed on your system (VirtualBox-4.3.18-96516-Win.exe, 105 MB). WARNING: There is a strange issue when you run Windows itself in VirtualBox; you might run into a problem with starting the host.

And while you're at it, go and get the Docker Windows Client. One way is to grab the final release from the test servers as a direct download (docker-1.6.0.exe, x86_64, 7.5 MB). Rename it to "docker" and put it into a folder of your choice (I assume it will be c:\docker\). Now you also need to download Docker Machine, which is another single executable (docker-machine_windows-amd64.exe, 11.5 MB). Rename it to "docker-machine" and put it into the same folder. Now add this folder to your PATH:

```
set PATH=%PATH%;C:\docker
```

If you change your standard PATH environment variable, this might save you a lot of typing. That's it. Now you're ready to create your first Machine-managed Docker host.

Create Your Docker Host With Machine

All you need is a simple command:

```
docker-machine create --driver virtualbox dev
```

And the output should state:

```
INFO[0000] Creating SSH key...
INFO[0001] Creating VirtualBox VM...
INFO[0016] Starting VirtualBox VM...
INFO[0022] Waiting for VM to start...
INFO[0076] "dev" has been created and is now the active machine.
INFO[0076] To point your Docker client at it, run this in your shell: eval "$(docker-machine.exe env dev)"
```

This means you just created a Docker host using the VirtualBox provider and the name "dev". Now you need to find out which IP address the host is running on:

```
docker-machine ip
192.168.99.102
```

If you want an easier way to configure the environment variables needed by the client, just use the following command:

```
docker-machine env dev
export DOCKER_TLS_VERIFY=1
export DOCKER_CERT_PATH="C:\\Users\\markus\\.docker\\machine\\machines\\dev"
export DOCKER_HOST=tcp://192.168.99.102:2376
```

This outputs the Linux form of the environment variable definitions. All you have to do is change the "export" keyword to "set", remove the quotes and the double backslashes, and you are ready to go:

```
C:\Users\markus\Downloads>set DOCKER_TLS_VERIFY=1
C:\Users\markus\Downloads>set DOCKER_CERT_PATH=C:\Users\markus\.docker\machine\machines\dev
C:\Users\markus\Downloads>set DOCKER_HOST=tcp://192.168.99.102:2376
```

Time to Test Our Docker Client

And here we go: now run WildFly on your freshly created host:

```
docker run -it -p 8080:8080 jboss/wildfly
```

Watch the container being downloaded and check that it is running by pointing your browser to http://192.168.99.102:8080/. Congratulations on having set up your very first Docker host with Machine on Windows.
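If you would rather script that last check than click around in a browser, a few lines of plain Java can poll the published port. The IP address below is the one from the example output and will differ on your machine; this is only an illustrative check, not part of the original walkthrough:

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class WildFlyCheck {

    public static void main(String[] args) throws Exception {
        // Host IP taken from the docker-machine output above; substitute your own
        URL url = new URL("http://192.168.99.102:8080/");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        connection.setRequestMethod("GET");

        int status = connection.getResponseCode(); // the WildFly welcome page answers with 200
        System.out.println("WildFly responded with HTTP " + status);
        connection.disconnect();
    }
}
```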
May 12, 2015
by Markus Eisele
· 19,835 Views
Collecting Transaction Per Minute from SQL Server and HammerDB
A SQL Server script file can be created to run in a loop, collecting transactions-per-minute data for a given amount of time at a specified interval.
May 11, 2015
by Greg Schulz
· 9,995 Views
Groovy vs Java for Testing [video]
Video and slideshow for Trisha Gee's lecture on whether Groovy is better for testing than Java and why, along with other resources to learn more about testing.
May 10, 2015
by Trisha Gee
· 10,854 Views
Python: Equivalent to flatMap for Flattening an Array of Arrays
I found myself wanting to flatten an array of arrays while writing some Python code earlier this afternoon, and being lazy, my first attempt involved building the flattened array manually:

```python
episodes = [
    {"id": 1, "topics": [1, 2, 3]},
    {"id": 2, "topics": [4, 5, 6]}
]

flattened_episodes = []
for episode in episodes:
    for topic in episode["topics"]:
        flattened_episodes.append({"id": episode["id"], "topic": topic})

for episode in flattened_episodes:
    print episode
```

If we run that we'll see this output:

```
$ python flatten.py
{'topic': 1, 'id': 1}
{'topic': 2, 'id': 1}
{'topic': 3, 'id': 1}
{'topic': 4, 'id': 2}
{'topic': 5, 'id': 2}
{'topic': 6, 'id': 2}
```

What I was really looking for was the Python equivalent to the flatMap function, which I learnt can be achieved in Python with a list comprehension like so:

```python
flattened_episodes = [{"id": episode["id"], "topic": topic}
                      for episode in episodes
                      for topic in episode["topics"]]

for episode in flattened_episodes:
    print episode
```

We could also choose to use itertools, in which case we'd have the following code:

```python
from itertools import chain, imap

flattened_episodes = chain.from_iterable(
    imap(lambda episode: [{"id": episode["id"], "topic": topic}
                          for topic in episode["topics"]], episodes))

for episode in flattened_episodes:
    print episode
```

We can then simplify this approach a little by wrapping it up in a 'flatmap' function:

```python
def flatmap(f, items):
    return chain.from_iterable(imap(f, items))

flattened_episodes = flatmap(
    lambda episode: [{"id": episode["id"], "topic": topic} for topic in episode["topics"]],
    episodes)

for episode in flattened_episodes:
    print episode
```

I think the list comprehension approach still works, but I need to look into itertools more; it looks like it could work well for other list operations.
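For readers coming from the JVM side, the flatMap name in the title will be familiar from Java 8 streams (and Scala). For comparison only, a rough Java 8 equivalent of the episode example could look like this; the Episode holder class is my own stand-in for the Python dictionaries:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapExample {

    // Simple holder mirroring the {"id": ..., "topics": [...]} dictionaries
    static class Episode {
        final int id;
        final List<Integer> topics;

        Episode(int id, List<Integer> topics) {
            this.id = id;
            this.topics = topics;
        }
    }

    public static void main(String[] args) {
        List<Episode> episodes = Arrays.asList(
                new Episode(1, Arrays.asList(1, 2, 3)),
                new Episode(2, Arrays.asList(4, 5, 6)));

        // flatMap turns each episode into a stream of "id -> topic" pairs and flattens them
        List<String> flattened = episodes.stream()
                .flatMap(episode -> episode.topics.stream()
                        .map(topic -> "{id: " + episode.id + ", topic: " + topic + "}"))
                .collect(Collectors.toList());

        flattened.forEach(System.out::println);
    }
}
```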
May 9, 2015
by Mark Needham
· 35,457 Views · 1 Like
8 Questions You Need to Ask About Microservices, Containers & Docker in 2015
In containers and microservices, we’re facing the greatest potential change in how we deliver and run software services since the arrival of virtual machines.
May 9, 2015
by Andrew Phillips
· 14,730 Views · 1 Like
Break Single Responsibility Principle
The Single Responsibility Principle (SRP) is not absolute. It exists to help code maintainability and readability. But from time to time you may see solutions and patterns that break SRP and are kind of OK. This is also true for other principles, but this time I would like to talk about SRP.

Singleton breaks SRP

The oldest and simplest pattern that breaks SRP is the singleton pattern. This pattern restricts the creation of an object so that there is a single instance of a certain class. Many think that singleton is actually an anti-pattern, and I also tend to believe that it is better to use some container to manage the lifecycle of objects than to hard-code singletons or other home-made factories. The anti-pattern-ness of singleton generally comes from the fact that it breaks SRP. A singleton has two responsibilities:

- Manage the creation of the instance of the class
- Do something that is the original responsibility of the class

You can easily create a singleton that does not violate SRP by keeping the first responsibility and dropping the second one:

```java
public class Singleton {
    private static final Singleton instance = new Singleton();

    public static Singleton getInstance() {
        return instance;
    }

    private Singleton() {}
}
```

but there is not much use for such a beast. Singletons are simple and discussed more than enough in blogs. Let me look at something more complex that breaks SRP.

Mockito breaks SRP

Mockito is a mocking framework which we usually use in unit tests. I assume that you are familiar with mocking and Mockito. A typical test looks like the following:

```java
import static org.mockito.Mockito.*;

List mockedList = mock(List.class);
when(mockedList.get(0)).thenReturn("first");
System.out.println(mockedList.get(0));

mockedList.add("one");
mockedList.clear();

verify(mockedList).add("one");
verify(mockedList).clear();
```

(The sample is taken from the Mockito page, actually mixing two examples.) The mock object is created using the static call

```java
List mockedList = mock(List.class);
```

and after that it is used for three different things:

- Setting up the mock object for its mocking task.
- Behaving as a mock, mocking the real-life object during testing.
- Helping verification of the mock usage.

The call

```java
when(mockedList.get(0)).thenReturn("first");
```

sets up the mock object. The calls

```java
System.out.println(mockedList.get(0));
mockedList.add("one");
mockedList.clear();
```

use the core responsibility of the mock object, and finally the lines

```java
verify(mockedList).add("one");
verify(mockedList).clear();
```

act as verification. These are three different tasks, not one. I get the point that they are closely related to each other. You can even say that they are just three aspects of a single responsibility. One could argue that verification only uses the mock object as a parameter and is not the functionality of the mock object. The fact is that the mock object keeps track of its usage and actively takes part in the verification process behind the scenes. Okay, okay: these all may be true, more or less. The real question is: does it matter? So what? Does the readability of the code of Mockito suffer from treating SRP this way? Does the usability of the Mockito API suffer from this? The answer is a definite NO for both of the questions. The code is as readable as it gets (imho it is more readable than many other open source projects) and it is not hurt by the fact that the mock objects have multiple responsibilities. As for the API you can even say more: it is even more readable and usable with this approach.

Former mocking frameworks used strings to specify the method calls, like

```java
mailer.expects(once()).method("send");
warehouse.expects(once()).method("hasInventory");
```

(fragment from the page), which is less readable and error prone. A typo in the name of the method is discovered at test run time instead of compile time.

What is the moral? Don't be dogmatic. Care about programming principles, since they are there to help you write good code. I do not urge anyone to ignore them every day. On the other hand, if you feel that some of the principles restrict you and your code would be better without them, do not hesitate to consider writing code that breaks the principles. Discuss it with your peers (programming is teamwork anyway) and come to a conclusion. The conclusion will be that you were wrong to consider breaking SRP in 90% of the cases. In 10%, however, you may come up with brilliant ideas.
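To make the "let a container manage the lifecycle" alternative mentioned at the start concrete, here is a minimal sketch using Spring's Java configuration; the class and bean names are illustrative, not from the article. The class itself keeps a single responsibility, and the container guarantees there is only one instance:

```java
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// The class only does its real work; it knows nothing about being a singleton
class PriceCalculator {
    double priceWithTax(double net) {
        return net * 1.27;
    }
}

@Configuration
class AppConfig {
    // Spring beans are singletons by default, so instance management lives in the container
    @Bean
    PriceCalculator priceCalculator() {
        return new PriceCalculator();
    }
}

public class ContainerManagedSingleton {
    public static void main(String[] args) {
        AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext(AppConfig.class);
        PriceCalculator first = context.getBean(PriceCalculator.class);
        PriceCalculator second = context.getBean(PriceCalculator.class);
        System.out.println("Same instance: " + (first == second)); // true: singleton scope
        context.close();
    }
}
```

The creation responsibility moves entirely into the configuration class, which is exactly the split the hand-rolled singleton cannot offer.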
May 8, 2015
by Peter Verhas
· 6,782 Views
Logging Stop-the-world Pauses in JVM
Different events can cause the JVM to pause all the application threads. Such pauses are called Stop-The-World (STW) pauses. The most common cause for an STW pause to be triggered is garbage collection (example in github), but different JIT actions (example), biased lock revocation (example), certain JVMTI operations, and many more also require the application to be stopped. The points at which the application threads may be safely stopped are called, surprise, safepoints. This term is also often used to refer to all the STW pauses.

It is more or less common for GC logs to be enabled. However, this does not capture information on all the safepoints. To get it all, use these JVM options:

```
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime
```

If you are wondering about the naming explicitly referring to GC, don't be alarmed: turning on these options logs all of the safepoints, not just garbage collection pauses. If you run the following example (source in github) with the flags specified above:

```java
public class FullGc {
    private static final Collection
```
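The two flags above make the JVM print a line for every safepoint pause and for every stretch of uninterrupted application time. Assuming the usual wording of those lines ("Total time for which application threads were stopped: ... seconds"), a small helper, which is my own sketch and not part of the article's example, can total the stopped time from a log file:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StoppedTimeSummary {

    // Matches lines such as:
    // Total time for which application threads were stopped: 0.0001140 seconds
    private static final Pattern STOPPED =
            Pattern.compile("Total time for which application threads were stopped: ([0-9.]+) seconds");

    public static void main(String[] args) throws IOException {
        String logFile = args.length > 0 ? args[0] : "gc.log"; // the default path is an assumption
        double totalStoppedSeconds = 0;
        int pauses = 0;

        for (String line : Files.readAllLines(Paths.get(logFile))) {
            Matcher matcher = STOPPED.matcher(line);
            if (matcher.find()) {
                totalStoppedSeconds += Double.parseDouble(matcher.group(1));
                pauses++;
            }
        }

        System.out.printf("%d safepoint pauses, %.3f seconds stopped in total%n", pauses, totalStoppedSeconds);
    }
}
```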
May 7, 2015
by Nikita Salnikov-Tarnovski
· 15,851 Views
Binding to Data Services with Spring Boot in Cloud Foundry
Written by Dave Syer on the Spring blog In this article we look at how to bind a Spring Boot application to data services (JDBC, NoSQL, messaging etc.) and the various sources of default and automatic behaviour in Cloud Foundry, providing some guidance about which ones to use and which ones will be active under what conditions. Spring Boot provides a lot of autoconfiguration and external binding features, some of which are relevant to Cloud Foundry, and many of which are not. Spring Cloud Connectors is a library that you can use in your application if you want to create your own components programmatically, but it doesn’t do anything “magical” by itself. And finally there is the Cloud Foundry java buildpack which has an “auto-reconfiguration” feature that tries to ease the burden of moving simple applications to the cloud. The key to correctly configuring middleware services, like JDBC or AMQP or Mongo, is to understand what each of these tools provides, how they influence each other at runtime, and and to switch parts of them on and off. The goal should be a smooth transition from local execution of an application on a developer’s desktop to a test environment in Cloud Foundry, and ultimately to production in Cloud Foundry (or otherwise) with no changes in source code or packaging, per the twelve-factor application guidelines. There is some simple source code accompanying this article. To use it you can clone the repository and import it into your favourite IDE. You will need to remove two dependencies from the complete project to get to the same point where we start discussing concrete code samples, namely spring-boot-starter-cloud-connectors and auto-reconfiguration. NOTE: The current co-ordinates for all the libraries being discussed are org.springframework.boot:spring-boot-*:1.2.3.RELEASE,org.springframework.boot:spring-cloud-*-connector:1.1.1.RELEASE,org.cloudfoundry:auto-reconfiguration:1.7.0.RELEASE. TIP: The source code in github includes a docker-compose.yml file (docs here). You can use that to create a local MySQL database if you don’t have one running already. You don’t actually need it to run most of the code below, but it might be useful to validate that it will actually work. Punchline for the Impatient If you want to skip the details, and all you need is a recipe for running locally with H2 and in the cloud with MySQL, then start here and read the rest later when you want to understand in more depth. (Similar options exist for other data services, like RabbitMQ, Redis, Mongo etc.) Your first and simplest option is to simply do nothing: do not define a DataSource at all but put H2 on the classpath. Spring Boot will create the H2 embedded DataSource for you when you run locally. The Cloud Foundry buildpack will detect a database service binding and create a DataSource for you when you run in the cloud. If you add Spring Cloud Connectors as well, your app will also work in other cloud platforms, as long as you include a connector. That might be good enough if you just want to get something working. If you want to run a serious application in production you might want to tweak some of the connection pool settings (e.g. the size of the pool, various timeouts, the important test on borrow flag). In that case the buildpack auto-reconfiguration DataSource will not meet your requirements and you need to choose an alternative, and there are a number of more or less sensible choices. 
The best choice is probably to create a DataSource explicitly using Spring Cloud Connectors, but guarded by the “cloud” profile: @Configuration @Profile("cloud") public class DataSourceConfiguration { @Bean public Cloud cloud() { return new CloudFactory().getCloud(); } @Bean @ConfigurationProperties(DataSourceProperties.PREFIX) public DataSource dataSource() { return cloud().getSingletonServiceConnector(DataSourceclass, null); } } You can use spring.datasource.* properties (e.g. in application.properties or a profile-specific version of that) to set the additional properties at runtime. The “cloud” profile is automatically activated for you by the buildpack. Now for the details. We need to build up a picture of what’s going on in your application at runtime, so we can learn from that how to make a sensible choice for configuring data services. Layers of Autoconfiguration Let’s take a a simple app with DataSource (similar considerations apply to RabbitMQ, Mongo, Redis): @SpringBootApplication public class CloudApplication { @Autowired private DataSource dataSource; public static void main(String[] args) { SpringApplication.run(CloudApplication.class, args); } } This is a complete application: the DataSource can be @Autowired because it is created for us by Spring Boot. The details of the DataSource (concrete class, JDBC driver, connection URL, etc.) depend on what is on the classpath. Let’s assume that the application uses Spring JDBC via the spring-boot-starter-jdbc (or spring-boot-starter-data-jpa), so it has aDataSource implementation available from Tomcat (even if it isn’t a web application), and this is what Spring Boot uses. Consider what happens when: Classpath contains H2 (only) in addition to the starters: the DataSource is the Tomcat high-performance pool from DataSourceAutoConfiguration and it connects to an in memory database “testdb”. Classpath contains H2 and MySQL: DataSource is still H2 (same as before) because we didn’t provide any additional configuration for MySQL and Spring Boot can’t guess the credentials for connecting. Add spring-boot-starter-cloud-connectors to the classpath: no change inDataSource because the Spring Cloud Connectors do not detect that they are running in a Cloud platform. The providers that come with the starter all look for specific environment variables, which they won’t find unless you set them, or run the app in Cloud Foundry, Heroku, etc. Run the application in “cloud” profile with spring.profiles.active=cloud: no change yet in the DataSource, but this is one of the things that the Java buildpack does when your application runs in Cloud Foundry. Run in “cloud” profile and provide some environment variables to simulate running in Cloud Foundry and binding to a MySQL service: VCAP_APPLICATION={"name":"application","instance_id":"FOO"} VCAP_SERVICES={"mysql":[{"name":"mysql","tags":["mysql"],"credentials":{"uri":"mysql://root:root@localhost/test"}]} (the “tags” provides a hint that we want to create a MySQL DataSource, the “uri” provides the location, and the “name” becomes a bean ID). The DataSource is now using MySQL with the credentials supplied by the VCAP_* environment variables. Spring Boot has some autoconfiguration for the Connectors, so if you looked at the beans in your application you would see a CloudFactory bean, and also the DataSource bean (with ID “mysql”). Theautoconfiguration is equivalent to adding @ServiceScan to your application configuration. 
It is only active if your application runs in the “cloud” profile, and only if there is no existing @Bean of type Cloud, and the configuration flagspring.cloud.enabled is not “false”. Add the “auto-reconfiguration” JAR from the Java buildpack (Maven co-ordinatesorg.cloudfoundry:auto-reconfiguration:1.7.0.RELEASE). You can add it as a local dependency to simulate running an application in Cloud Foundry, but it wouldn’t be normal to do this with a real application (this is just for experimenting with autoconfiguration). The auto-reconfiguration JAR now has everything it needs to create a DataSource, but it doesn’t (yet) because it detects that you already have a bean of type CloudFactory, one that was added by Spring Boot. Remove the explicit “cloud” profile. The profile will still be active when your app starts because the auto-reconfiguration JAR adds it back again. There is still no change to theDataSource because Spring Boot has created it for you via the @ServiceScan. Remove the spring-boot-starter-cloud-connectors dependency, so that Spring Boot backs off creating a CloudFactory. The auto-reconfiguration JAR actually has its own copy of Spring Cloud Connectors (all the classes with different package names) and it now uses them to create a DataSource (in a BeanFactoryPostProcessor). The Spring Boot autoconfigured DataSource is replaced with one that binds to MySQL via theVCAP_SERVICES. There is no control over pool properties, but it does still use the Tomcat pool if available (no support for Hikari or DBCP2). Remove the auto-reconfiguration JAR and the DataSource reverts to H2. TIP: use web and actuator starters with endpoints.health.sensitive=false to inspect the DataSource quickly through “/health”. You can also use the “/beans”, “/env” and “/autoconfig” endpoints to see what is going in in the autoconfigurations and why. NOTE: Running in Cloud Foundry or including auto-reconfiguration JAR in classpath locally both activate the “cloud” profile (for the same reason). The VCAP_* env vars are the thing that makes Spring Cloud and/or the auto-reconfiguration JAR create beans. NOTE: The URL in the VCAP_SERVICES is actually not a “jdbc” scheme, which should be mandatory for JDBC connections. This is, however, the format that Cloud Foundry normally presents it in because it works for nearly every language other than Java. Spring Cloud Connectors or the buildpack auto-reconfiguration, if they are creating a DataSource, will translate it into a jdbc:* URL for you. NOTE: The MySQL URL also contains user credentials and a database name which are valid for the Docker container created by the docker-compose.yml in the sample source code. If you have a local MySQL server with different credentials you could substitute those. TIP: If you use a local MySQL server and want to verify that it is connected, you can use the “/health” endpoint from the Spring Boot Actuator (included in the sample code already). Or you could create a schema-mysql.sql file in the root of the classpath and put a simple keep alive query in it (e.g. SELECT 1). Spring Boot will run that on startupso if the app starts successfully you have configured the database correctly. The auto-reconfiguration JAR is always on the classpath in Cloud Foundry (by default) but it backs off creating any DataSource if it finds a org.springframework.cloud.CloudFactorybean (which is provided by Spring Boot if the CloudAutoConfiguration is active). 
Thus the net effect of adding it to the classpath, if the Connectors are also present in a Spring Boot application, is only to enable the “cloud” profile. You can see it making the decision to skip auto-reconfiguration in the application logs on startup: 015-04-14 15:11:11.765 INFO 12727 --- [ main] urceCloudServiceBeanFactoryPostProcessor : Skipping auto-reconfiguring beans of type javax.sql.DataSource 2015-04-14 15:11:57.650 INFO 12727 --- [ main] ongoCloudServiceBeanFactoryPostProcessor : Skipping auto-reconfiguring beans of type org.springframework.data.mongodb.MongoDbFactory 2015-04-14 15:11:57.650 INFO 12727 --- [ main] bbitCloudServiceBeanFactoryPostProcessor : Skipping auto-reconfiguring beans of type org.springframework.amqp.rabbit.connection.ConnectionFactory 2015-04-14 15:11:57.651 INFO 12727 --- [ main] edisCloudServiceBeanFactoryPostProcessor : Skipping auto-reconfiguring beans of type org.springframework.data.redis.connection.RedisConnectionFactory ... etc. Create your own DataSource The last section walked through most of the important autoconfiguration features in the various libraries. If you want to take control yourself, one thing you could start with is to create your own instance of DataSource. You could do that, for instance, using aDataSourceBuilder which is a convenience class and comes as part of Spring Boot (it chooses an implementation based on the classpath): @SpringBootApplication public class CloudApplication { @Bean public DataSource dataSource() { return DataSourceBuilder.create().build(); } ... } The DataSource as we’ve defined it is useless because it doesn’t have a connection URL or any credentials, but that can easily be fixed. Let’s run this application as if it was in Cloud Foundry: with the VCAP_* environment variables and the auto-reconfiguration JAR but not Spring Cloud Connectors on the classpath and no explicit “cloud” profile. The buildpack activates the “cloud” profile, creates a DataSource and binds it to the VCAP_SERVICES. As already described briefly, it removes your DataSource completely and replaces it with a manually registered singleton (which doesn’t show up in the “/beans” endpoint in Spring Boot). Now add Spring Cloud Connectors back into the classpath the application and see what happens when you run it again. It actually fails on startup! What has happened? The@ServiceScan (from Connectors) goes and looks for bound services, and creates bean definitions for them. That’s a bit like the buildpack, but different because it doesn’t attempt to replace any existing bean definitions of the same type. So you get an autowiring error because there are 2 DataSources and no way to choose one to inject into your application in various places where one is needed. To fix that we are going to have to take control of the Cloud Connectors (or simply not use them). Using a CloudFactory to create a DataSource You can disable the Spring Boot autoconfiguration and the Java buildpack auto-reconfiguration by creating your own Cloud instance as a @Bean: @Bean public Cloud cloud() { return new CloudFactory().getCloud(); } @Bean @ConfigurationProperties(DataSourceProperties.PREFIX) public DataSource dataSource() { return cloud().getSingletonServiceConnector(DataSource.class, null); } Pros: The Connectors autoconfiguration in Spring Boot backed off so there is only oneDataSource. It can be tweaked using application.properties via spring.datasource.*properties, per the Spring Boot User Guide. 
Cons: It doesn’t work without VCAP_* environment variables (or some other cloud platform). It also relies on user remembering to ceate the Cloud as a @Bean in order to disable the autoconfiguration. Summary: we are still not in a comfortable place (an app that doesn’t run without some intricate wrangling of environment variables is not much use in practice). Dual Running: Local with H2, in the Cloud with MySQL There is a local configuration file option in Spring Cloud Connectors, so you don’t have to be in a real cloud platform to use them, but it’s awkward to set up despite being boiler plate, and you also have to somehow switch it off when you are in a real cloud platform. The last point there is really the important one because you end up needing a local file to run locally, but only running locally, and it can’t be packaged with the rest of the application code (for instance violates the twelve factor guidelines). So to move forward with our explicit @Bean definition it’s probably better to stick to mainstream Spring and Spring Boot features, e.g. using the “cloud” profile to guard the explicit creation of a DataSource: @Configuration @Profile("cloud") public class DataSourceConfiguration { @Bean public Cloud cloud() { return new CloudFactory().getCloud(); } @Bean @ConfigurationProperties(DataSourceProperties.PREFIX) public DataSource dataSource() { return cloud().getSingletonServiceConnector(DataSource.class, null); } } With this in place we have a solution that works smoothly both locally and in Cloud Foundry. Locally Spring Boot will create a DataSource with an H2 embedded database. In Cloud Foundry it will bind to a singleton service of type DataSource and switch off the autconfigured one from Spring Boot. It also has the benefit of working with any platform supported by Spring Cloud Connectors, so the same code will run on Heroku and Cloud Foundry, for instance. Because of the @ConfigurationProperties you can bind additional configuration to the DataSource to tweak connection pool properties and things like that if you need to in production. NOTE: We have been using MySQL as an example database server, but actually PostgreSQL is at least as compelling a choice if not more. When paired with H2 locally, for instance, you can put H2 into its “Postgres compatibility” mode and use the same SQL in both environments. Manually Creating a Local and a Cloud DataSource If you like creating DataSource beans, and you want to do it both locally and in the cloud, you could use 2 profiles (“cloud” and “local”), for example. But then you would have to find a way to activate the “local” profile by default when not in the cloud. There is already a way to do that built into Spring because there is always a default profile called “default” (by default). So this should work: @Configuration @Profile("default") // or "!cloud" public class LocalDataSourceConfiguration { @Bean @ConfigurationProperties(DataSourceProperties.PREFIX) public DataSource dataSource() { return DataSourceBuilder.create().build(); } } @Configuration @Profile("cloud") public class CloudDataSourceConfiguration { @Bean public Cloud cloud() { return new CloudFactory().getCloud(); } @Bean @ConfigurationProperties(DataSourceProperties.PREFIX) public DataSource dataSource() { return cloud().getSingletonServiceConnector(DataSource.class, null); } } The “default” DataSource is actually identical to the autoconfigured one in this simple example, so you wouldn’t do this unless you needed to, e.g. 
to create a custom concrete DataSource of a type not supported by Spring Boot. You might think it’s all getting a bit complicated, but in fact Spring Boot is not making it any harder; we are just dealing with the consequences of needing to control the DataSource construction in two environments.

Using a Non-Embedded Database Locally

If you don’t want to use H2 or any in-memory database locally, then you can’t really avoid having to configure it (Spring Boot can guess a lot from the URL, but it will need that at least). So at a minimum you need to set some spring.datasource.* properties (the URL, for instance). That isn’t hard to do, and you can easily set different values in different environments using additional profiles, but as soon as you do that you need to switch off the default values when you go into the cloud. To do that you could define the spring.datasource.* properties in a profile-specific file (or document in YAML) for the “default” profile, e.g. application-default.properties, and these will not be used in the “cloud” profile.

A Purely Declarative Approach

If you prefer not to write Java code, or don’t want to use Spring Cloud Connectors, you might want to try using Spring Boot autoconfiguration and external properties (or YAML) files for everything. For example, Spring Boot creates a DataSource for you if it finds the right stuff on the classpath, and it can be completely controlled through application.properties, including all the granular features on the DataSource that you need in production (like pool sizes and validation queries). So all you need is a way to discover the location and credentials for the service from the environment. The buildpack translates Cloud Foundry VCAP_* environment variables into usable property sources in the Spring Environment. Thus, for instance, a DataSource configuration might look like this:

spring.datasource.url: ${cloud.services.mysql.connection.jdbcurl:jdbc:h2:mem:testdb}
spring.datasource.username: ${cloud.services.mysql.connection.username:sa}
spring.datasource.password: ${cloud.services.mysql.connection.password:}
spring.datasource.testOnBorrow: true

The “mysql” part of the property names is the service name in Cloud Foundry (so it is set by the user). And of course the same pattern applies to all kinds of services, not just a JDBC DataSource. Generally speaking it is good practice to use external configuration, and in particular @ConfigurationProperties, since they allow maximum flexibility, for instance to override using System properties or environment variables at runtime.

Note: similar features are provided by Spring Boot, which provides vcap.services.* instead of cloud.services.*, so you actually end up with more than one way to do this. However, the JDBC URLs are not available from the vcap.services.* properties (non-JDBC services work fine with the corresponding vcap.services.*.credentials.url).

One limitation of this approach is that it doesn’t apply if the application needs to configure beans that are not provided by Spring Boot out of the box (e.g. if you need two DataSources), in which case you have to write Java code anyway, and may or may not choose to use properties files to parameterize it. Before you try this yourself, though, beware that it actually doesn’t work unless you also disable the buildpack auto-reconfiguration (and Spring Cloud Connectors if they are on the classpath). If you don’t do that, then they create a new DataSource for you and Spring Boot cannot bind it to your properties file.
Thus even for this declarative approach, you end up needing an explicit @Bean definition, and you need this part of your “cloud” profile configuration:

@Configuration
@Profile("cloud")
public class CloudDataSourceConfiguration {

    @Bean
    public Cloud cloud() {
        return new CloudFactory().getCloud();
    }

}

This is purely to switch off the buildpack auto-reconfiguration (and the Spring Boot autoconfiguration, but that could have been disabled with a properties file entry).

Mixed Declarative and Explicit Bean Definition

You can also mix the two approaches: declare a single @Bean definition so that you control the construction of the object, but bind additional configuration to it using @ConfigurationProperties (and do the same locally and in Cloud Foundry). Example:

@Configuration
public class LocalDataSourceConfiguration {

    @Bean
    @ConfigurationProperties(DataSourceProperties.PREFIX)
    public DataSource dataSource() {
        return DataSourceBuilder.create().build();
    }

}

(where the DataSourceBuilder would be replaced with whatever fancy logic you need for your use case). And the application.properties would be the same as above, with whatever additional properties you need for your production settings.

A Third Way: Discover the Credentials and Bind Manually

Another approach that lends itself to platform and environment independence is to declare explicit bean definitions for the @ConfigurationProperties beans that Spring Boot uses to bind its autoconfigured connectors. For instance, to set the default values for a DataSource you can declare a @Bean of type DataSourceProperties:

@Bean
@Primary
public DataSourceProperties dataSourceProperties() {
    DataSourceProperties properties = new DataSourceProperties();
    properties.setInitialize(false);
    return properties;
}

This sets a default value for the “initialize” flag, and allows other properties to be bound from application.properties (or other external properties). Combine this with the Spring Cloud Connectors and you can control the binding of the credentials when a cloud service is detected:

@Autowired(required = false)
Cloud cloud;

@Bean
@Primary
public DataSourceProperties dataSourceProperties() {
    DataSourceProperties properties = new DataSourceProperties();
    properties.setInitialize(false);
    if (cloud != null) {
        List<ServiceInfo> infos = cloud.getServiceInfos(RelationalServiceInfo.class);
        if (infos.size() == 1) {
            RelationalServiceInfo info = (RelationalServiceInfo) infos.get(0);
            properties.setUrl(info.getJdbcUrl());
            properties.setUsername(info.getUserName());
            properties.setPassword(info.getPassword());
        }
    }
    return properties;
}

and you still need to define the Cloud bean in the “cloud” profile. It ends up being quite a lot of code, and it is quite unnecessary in this simple use case, but it might be handy if you have more complicated bindings, or need to implement some logic to choose a DataSource at runtime. Spring Boot has similar *Properties beans for the other middleware you might commonly use (e.g. RabbitProperties, RedisProperties, MongoProperties). An instance of such a bean marked as @Primary is enough to reset the defaults for the autoconfigured connector.

Deploying to Multiple Cloud Platforms

So far, we have concentrated on Cloud Foundry as the only cloud platform in which to deploy the application. One of the nice features of Spring Cloud Connectors is that it supports other platforms, either out of the box or via extension points. The spring-boot-starter-cloud-connectors even includes Heroku support (a typical dependency declaration is sketched below).
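A minimal sketch of pulling that starter into a Maven build, assuming the usual Spring Boot group id and a parent (or dependency management section) that supplies the version:

<dependency>
    <!-- brings in Spring Cloud Connectors, including Cloud Foundry and Heroku support -->
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cloud-connectors</artifactId>
</dependency>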
If you do nothing at all and rely on the autoconfiguration (the lazy programmer’s approach), then your application will be deployable in all clouds where you have a connector on the classpath (i.e. Cloud Foundry and Heroku if you use the starter). If you take the explicit @Bean approach then you need to ensure that the “cloud” profile is active in the non-Cloud Foundry platforms, e.g. through an environment variable. And if you use the purely declarative approach (or any combination involving properties files) you need to activate the “cloud” profile and probably also another profile specific to your platform, so that the right properties files end up in the Environment at runtime.

Summary of Autoconfiguration and Provided Behaviour

Spring Boot provides a DataSource (also a RabbitMQ or Redis ConnectionFactory, Mongo, etc.) if it finds all the right stuff on the classpath. Using the “spring-boot-starter-*” dependencies is sufficient to activate the behaviour.

Spring Boot also provides an autowirable CloudFactory if it finds Spring Cloud Connectors on the classpath (but it switches off only if it finds a @Bean of type Cloud).

The CloudAutoConfiguration in Spring Boot also effectively adds a @CloudScan to your application, which you would want to switch off if you ever needed to create your own DataSource (or similar).

The Cloud Foundry Java buildpack detects a Spring Boot application and activates the “cloud” profile, unless it is already active. Adding the buildpack auto-reconfiguration JAR does the same thing if you want to try it locally.

Through the auto-reconfiguration JAR, the buildpack also kicks in and creates a DataSource (ditto RabbitMQ, Redis, Mongo, etc.) if it does not find a CloudFactory bean or a Cloud bean (amongst others). So including Spring Cloud Connectors in a Spring Boot application switches off this part of the “auto-reconfiguration” behaviour (the bean creation).

Switching off the Spring Boot CloudAutoConfiguration is easy, but if you do that, you have to remember to switch off the buildpack auto-reconfiguration as well if you don’t want it. The only way to do that is to define a bean definition (it can be of type Cloud or CloudFactory, for instance).

Spring Boot binds application.properties (and other sources of external properties) to @ConfigurationProperties beans, including but not limited to the ones that it autoconfigures. You can use this feature to tweak pool properties and other settings that need to be different in production environments.

General Advice and Conclusion

We have seen quite a few options and autoconfigurations in this short article, and we’ve only really used three libraries (Spring Boot, Spring Cloud Connectors, and the Cloud Foundry buildpack auto-reconfiguration JAR) and one platform (Cloud Foundry), not counting local deployment. The buildpack features are really only useful for very simple applications because there is no flexibility to tune the connections in production. That said, it is a nice thing to be able to do when prototyping. There are only three main approaches if you want to achieve the goal of deploying the same code locally and in the cloud, yet still be able to make the necessary tweaks in production:

1. Use Spring Cloud Connectors to explicitly create DataSource and other middleware connections and protect those @Beans with @Profile("cloud"). This approach always works, but leads to more code than you might need for many applications.
2. Use the Spring Boot default autoconfiguration and declare the cloud bindings using application.properties (or in YAML). To take full advantage you have to explicitly switch off the buildpack auto-reconfiguration as well.

3. Use Spring Cloud Connectors to discover the credentials, and bind them to the Spring Boot @ConfigurationProperties as default values if present.

The three approaches are actually not incompatible, and can be mixed using @ConfigurationProperties to provide profile-specific overrides of default configuration (e.g. for setting up connection pools in a different way in a production environment). If you have a relatively simple Spring Boot application, the only way to choose between the approaches is probably personal taste. If you have a non-Spring Boot application then the explicit @Bean approach will win, and it may also win if you plan to deploy your application to more than one cloud platform (e.g. Heroku and Cloud Foundry).

NOTE: This blog has been a journey of discovery (who knew there was so much to learn?). Thanks go to all those who helped with reviews and comments, in particular Scott Frederick, who spotted most of the mistakes in the drafts and always had time to look at a new revision.
May 6, 2015
by Pieter Humphrey
· 26,559 Views · 2 Likes
article thumbnail
Adding Version Details on MANIFEST.MF via Jenkins
Article Purpose: The following article suggests a solution for adding custom information to your web application using the MANIFEST.MF file and exposing that information via an API. The necessity for this solution came from solving "blindness" in deployment; in simpler words, I want to know which WAR I deployed, what build version it has, and any other useful information I might need. Since the information is exposed via an API, it can also function as an application "health check", which is a nice little addition to the solution.

Techs: WAR (your web application), Maven (build tool), maven-war-plugin, a Jenkins job.

Step 1 - Adding the Maven WAR plugin

First you should add the maven-war-plugin to your pom.xml. In its manifest entries tag you should add the custom information you need; in case you want to use the default information the plugin gives you out of the box, just set the corresponding value to true. In the following example you can see I added fields like buildVersion, build time and so on; the build tool, in this case Jenkins, is going to provide this data (a fuller sketch of this plugin block appears after Step 5): org.apache.maven.plugins maven-war-plugin false ${project.version} ${build.number} ${maven.build.timestamp} ${agent.name} ${user.name} ${[yourWarName].war.finalName}

Step 2 - Define Maven variables

The properties you want to add should be provided from outside, so you should add variables in your pom that Jenkins can assign values to. In the example below you can see the build.number variable I used in the previous step, with a default value of SNAPSHOT: SNAPSHOT

Step 3 - Adding Maven goals to your Jenkins job definition

Jenkins has environment variables available that you can pass to the Maven goal, such as BUILD_NUMBER, BUILD_ID and so on. Assign the Jenkins variable to the Maven variable in the build step (see the example goals line after Step 5). When Jenkins builds the code, it is going to stamp the MANIFEST.MF file with the custom information we wanted.

Step 4 - Exposing the information of the WAR

We did all this hard work for one purpose: so that this information is available at the deployment stage. You should expose this information via an API. In this example I used a Spring MVC controller and mapped the data to JSON format, but you can have your pick.

package your.package;

@Controller
@RequestMapping(value = "/version", produces = MediaType.APPLICATION_JSON_VALUE)
public class VersionController {

    @Autowired
    ApplicationContext applicationContext;

    @RequestMapping(method = RequestMethod.GET)
    @ResponseBody
    public JSONObject getVersion() {
        JSONObject result = new JSONObject();
        Resource resource = applicationContext.getResource("/META-INF/MANIFEST.MF");
        try {
            Manifest manifest = new Manifest(resource.getInputStream());
            if (manifest != null) {
                Attributes mainAttributes = manifest.getMainAttributes();
                if (mainAttributes != null) {
                    for (Object key : mainAttributes.keySet()) {
                        result.put(key, mainAttributes.get(key));
                    }
                }
            }
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return result;
    }
}

Step 5 - Enjoy :)
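Since the pom.xml snippet in Step 1 lost its XML markup, here is a rough sketch of what such a maven-war-plugin block could look like. The manifest entry names (Build-Version, Build-Number and so on) are illustrative placeholders of my own; only the plugin coordinates and the property values come from the article:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-war-plugin</artifactId>
    <configuration>
        <archive>
            <manifest>
                <!-- set to true if the plugin's default implementation entries are enough -->
                <addDefaultImplementationEntries>false</addDefaultImplementationEntries>
            </manifest>
            <manifestEntries>
                <!-- entry names below are illustrative; pick your own -->
                <Build-Version>${project.version}</Build-Version>
                <Build-Number>${build.number}</Build-Number>
                <Build-Time>${maven.build.timestamp}</Build-Time>
                <Build-Agent>${agent.name}</Build-Agent>
                <Built-By>${user.name}</Built-By>
                <War-Name>${[yourWarName].war.finalName}</War-Name>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>

For Step 3, the Goals field of the Jenkins Maven build step would then pass the Jenkins environment variables into those Maven properties, along these lines (BUILD_NUMBER and NODE_NAME are standard Jenkins variables; the property names must match the ones in your pom):

clean package -Dbuild.number=${BUILD_NUMBER} -Dagent.name=${NODE_NAME}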
May 5, 2015
by Amazia Gur
· 17,776 Views
article thumbnail
How to Create Multiple Workspaces with NetBeans
Examples

Windows:
netbeans.exe --userdir C:\MyOtherUserdir --cachedir "%HOME%\Locale Settings\Application Data\NetBeans\7.1\cache"

Unix:
./netbeans --userdir ~/my-other-userdir

Mac OS:
/Applications/NetBeans.app/Contents/MacOS/executable --userdir ~/my-other-userdir

Ref: http://wiki.netbeans.org/FaqAlternateUserdir
May 3, 2015
by Zemian Deng
· 8,312 Views
article thumbnail
How To: Neo4j Data Import - Minimal Example
The easiest way to import data from relational or legacy systems, like plain CSV files without headers, into Neo4j with Cypher and a graph model.
April 30, 2015
by Michael Hunger
· 7,483 Views
article thumbnail
Introduction to Probabilistic Data Structures
When processing large data sets, we often want to do some simple checks, such as the number of unique items, the most frequent items, and whether some items exist in the data set. The common approach is to use a deterministic data structure like a HashSet or Hashtable for such purposes. But when the data set we are dealing with becomes very large, such data structures are simply not feasible because the data is too big to fit in memory. It becomes even more difficult for streaming applications, which typically require data to be processed in one pass with incremental updates.

Probabilistic data structures are a group of data structures that are extremely useful for big data and streaming applications. Generally speaking, these data structures use hash functions to randomize and compactly represent a set of items. Collisions are ignored, but errors can be well controlled under a certain threshold. Compared with error-free approaches, these algorithms use much less memory and have constant query time. They usually support union and intersection operations and can therefore be easily parallelized. This article will introduce three commonly used probabilistic data structures: Bloom filter, HyperLogLog, and Count-Min sketch.

Membership Query - Bloom filter

A Bloom filter is a bit array of m bits initialized to 0. To add an element, feed it to k hash functions to get k array positions and set the bits at these positions to 1. To query an element, feed it to the k hash functions to obtain k array positions. If any of the bits at these positions is 0, then the element is definitely not in the set. If the bits are all 1, then the element might be in the set. A Bloom filter with a 1% false positive rate only requires 9.6 bits per element, regardless of the size of the elements.

For example, suppose we have inserted x, y and z into the Bloom filter with k = 3 hash functions. Each of these three elements has three bits set to 1 in the bit array. When we look up w in the set, because one of its bits is not set to 1, the Bloom filter will tell us that it is not in the set.

A Bloom filter has the following properties:
False positives are possible when the queried positions are already set to 1, but false negatives are impossible.
Query time is O(k).
Union and intersection of Bloom filters with the same size and hash functions can be implemented with bitwise OR and AND operations.
An element cannot be removed from the set.

A Bloom filter requires the following inputs:
m: size of the bit array
n: estimated number of insertions
p: false positive probability

The optimum number of hash functions k can be determined using the formula:

k = (m / n) * ln 2

Given the false positive probability p and the estimated number of insertions n, the length of the bit array can be calculated as:

m = - (n * ln p) / (ln 2)^2

The hash functions used for a Bloom filter should generally be faster than cryptographic hash algorithms, with good distribution and collision resistance. Commonly used hash functions for Bloom filters include Murmur hash, the FNV series of hashes, and Jenkins hashes. Murmur hash is the fastest among them. MurmurHash3 is used by the Google Guava library's Bloom filter implementation (a short usage sketch appears before the Summary below).

Cardinality - HyperLogLog

HyperLogLog is a streaming algorithm used for estimating the number of distinct elements (the cardinality) of very large data sets. A HyperLogLog counter can count one billion distinct items with an accuracy of 2% using only 1.5 KB of memory.
It is based on the bit pattern observation that, for a stream of randomly distributed numbers, if there is a number x whose maximum number of leading 0 bits is k, the cardinality of the stream is very likely equal to 2^k. For each element si in the stream, a hash function h(si) transforms si into a string of random bits (0 or 1, each with probability 1/2). The probability P of the bit patterns:

0xxxx... → P = 1/2
01xxx... → P = 1/4
001xx... → P = 1/8

The intuition is that when we see a prefix 0^k 1..., it is likely that there are n ≥ 2^(k+1) different strings. By keeping track of the prefixes 0^k 1... that have appeared in the data stream, we can estimate the cardinality to be 2^p, where p is the length of the largest prefix.

Because the variance is very high when using a single counter, in order to get a better estimate the data is split into m sub-streams using the first few bits of the hash. The counters are maintained in m registers, each with a memory space of a multiple of 4 bytes. If the standard deviation for each sub-stream is σ, then the standard deviation of the averaged value is only σ/√m. This is called stochastic averaging. For instance, for m = 4, the elements are split into m sub-streams using the first 2 bits (00, 01, 10, 11), which are then discarded. Each register stores the rest of the hash bits that contain the largest 0^k 1 prefix. The values in the m registers are then averaged to obtain the cardinality estimate.

The HyperLogLog algorithm uses the harmonic mean to normalize the result. The algorithm also makes adjustments for small and very large values. The resulting error is equal to 1.04/√m. Each of the m registers uses at most log2 log2 n + O(1) bits when cardinalities ≤ n need to be estimated. The union of two HyperLogLog counters can be calculated by first taking the maximum value of the two counters for each of the m registers, and then calculating the estimated cardinality.

Frequency - Count-Min Sketch

Count-Min sketch is a probabilistic sub-linear space streaming algorithm. It is somewhat similar to a Bloom filter. The main difference is that a Bloom filter represents a set as a bitmap, while a Count-Min sketch represents a multi-set, which keeps a frequency distribution summary. The basic data structure is a two-dimensional d x w array of counters with d pairwise independent hash functions h1 ... hd of range w. Given parameters (ε, δ), set w = ⌈e/ε⌉ and d = ⌈ln(1/δ)⌉. ε is the accuracy we want to have and δ is the certainty with which we reach that accuracy. The two-dimensional array consists of w·d counters.

To increment the counts, calculate the hash positions with the d hash functions and update the counts at those positions. The estimate of the count for an item is the minimum of the counts at the array positions determined by the d hash functions. The space used by a Count-Min sketch is the array of w·d counters. By choosing appropriate values for d and w, very small error and high probability can be achieved.

Example Count-Min sketch sizes for different error and probability combinations:

ε      1 - δ    w      d    w·d
0.1    0.9      28     3    84
0.1    0.99     28     5    140
0.1    0.999    28     7    196
0.01   0.9      272    3    816
0.01   0.99     272    5    1360
0.01   0.999    272    7    1904
0.001  0.999    2719   7    19033

Count-Min sketch has the following properties:
Union can be performed by a cell-wise ADD operation.
O(d) query time (one counter lookup per hash function).
Better accuracy for higher frequency items (heavy hitters).
It can only over-count, never under-count.

Count-Min sketch can be used for querying a single item's count or the "heavy hitters", which can be obtained by keeping a heap structure of all the counts.
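To make the Bloom filter section above concrete, here is a minimal usage sketch with Google Guava's BloomFilter (the implementation mentioned above and in the Summary below). The expected insertion count and false positive rate are arbitrary example values:

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

import java.nio.charset.StandardCharsets;

public class BloomFilterExample {

    public static void main(String[] args) {
        // 1,000,000 expected insertions with a 1% false positive rate;
        // Guava derives the bit array size m and the number of hash functions k from these.
        BloomFilter<String> seen = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8), 1000000, 0.01);

        seen.put("x");
        seen.put("y");
        seen.put("z");

        System.out.println(seen.mightContain("x")); // true (no false negatives)
        System.out.println(seen.mightContain("w")); // false with very high probability
    }
}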
Summary

Probabilistic data structures have many applications in modern web and data applications where the data arrives in a streaming fashion and needs to be processed on the fly using limited memory. Bloom filter, HyperLogLog, and Count-Min sketch are the most commonly used probabilistic data structures. There is a lot of research on various streaming algorithms, synopsis data structures and optimization techniques that is worth investigating and studying. If you haven't tried these data structures yet, you will be amazed how powerful they can be once you start using them. It may be a little intimidating to understand the concepts initially, but the implementations are actually quite simple. Google Guava has a Bloom filter implementation using Murmur hash. Clearspring's Java library stream-lib and Twitter's Scala library Algebird have implementations of all three data structures, plus other useful data structures that you can play with. I have included the links below.

Links
http://bigsnarf.wordpress.com/2013/02/08/probabilistic-data-structures-for-data-analytics/
http://en.wikipedia.org/wiki/Bloom_filter
http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf
http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/40671.pdf
http://research.neustar.biz/2012/10/25/sketch-of-the-day-hyperloglog-cornerstone-of-a-big-data-infrastructure/
http://dimacs.rutgers.edu/~graham/pubs/papers/cm-full.pdf
http://www.moneyscience.com/pg/blog/ThePracticalQuant/read/438348/realtime-analytics-hokusai-adds-a-temporal-component-to-countmin-sketch
http://people.cs.umass.edu/~mcgregor/711S12/sketches1.pdf
https://github.com/addthis/stream-lib
https://github.com/twitter/algebird
April 30, 2015
by Yin Niu
· 34,998 Views · 6 Likes
article thumbnail
Linux - Simulating Degraded Network Conditions
When testing an application or service, it can be very useful to simulate degraded network conditions. This allows you to see how they might perform for users.
April 30, 2015
by Corey Goldberg
· 4,863 Views
article thumbnail
Gradle Goodness: Use Git Commit Id in Build Script
The nice thing about Gradle is that we can use Java libraries in our build script. This way we can add extra functionality to our build script in an easy way. We must use the classpath dependency configuration for our build script to include the library. For example, we can include the library Grgit, which provides an easy way to interact with Git from Java or Groovy code. This library is also the basis for the Gradle Git plugin.

In the next example build file we add the Grgit library to our build script classpath. Then we use the open method of the Grgit class. From the returned object we invoke the head() method and use its id property to get the commit id. With the abbreviatedId property we get the shorter version of the Git commit id. The build file also includes the application plugin. We customize the applicationDistribution CopySpec from the plugin and expand the properties in a VERSION file. This way our distribution always includes a plain text file VERSION with the Git commit id of the code.

buildscript {
    repositories {
        jcenter()
    }
    dependencies {
        // Add dependency for build script,
        // so we can access Git from our
        // build script.
        classpath 'org.ajoberstar:grgit:1.1.0'
    }
}

apply plugin: 'java'
apply plugin: 'application'

ext {
    // Open the Git repository in the current directory.
    git = org.ajoberstar.grgit.Grgit.open(file('.'))

    // Get commit id of HEAD.
    revision = git.head().id

    // Alternative is using abbreviatedId of head() method.
    // revision = git.head().abbreviatedId
}

// Use abbreviatedId commit id in the version.
version = "2.0.1.${git.head().abbreviatedId}"

// application plugin extension properties.
mainClassName = 'sample.Hello'
applicationName = 'sample'

// Customize applicationDistribution
// CopySpec from application plugin extension.
applicationDistribution.with {
    from('src/dist') {
        include 'VERSION'
        expand(
            buildDate: new Date(),
            // Use revision with Git commit id:
            revision : revision,
            version : project.version,
            appName : applicationName)
    }
}

// Contents for src/dist/VERSION:
/*
Version: ${version}
Revision: ${revision}
Build-date: ${buildDate.format('dd-MM-yyyy HH:mm:ss')}
Application-name: ${appName}
*/

assemble.dependsOn installDist

When we run the build task for our project we get the following contents in our VERSION file:

Version: 2.0.1.e2ab261
Revision: e2ab2614011ff4be18c03e4dc1f86ab9ec565e6c
Build-date: 22-04-2015 13:53:31
Application-name: sample

Written with Gradle 2.3.
April 28, 2015
by Hubert Klein Ikkink
· 11,125 Views · 1 Like
article thumbnail
Implementing Filter and Bakery Locks in Java
Implementing custom locks is a good way to understand how locks work. This post will show how to implement Filter and Bakery locks (which are spin locks) in Java and will compare their performance with Java's ReentrantLock. Filter and Bakery locks satisfy mutual exclusion and are starvation-free algorithms; the Bakery lock is also a first-come-first-served lock [1]. For performance testing, a counter value is incremented up to 10000000 with different lock types, different numbers of threads and different numbers of runs. The test system configuration is: Intel Core i7 (8 cores, 4 of them physical), Ubuntu 14.04 LTS and Java 1.7.0_60.

The Filter lock has n-1 levels, which may be considered "waiting rooms". A thread must traverse these waiting rooms before acquiring the lock. There are two important properties for levels [2]:

1) At least one thread trying to enter level l succeeds.
2) If more than one thread is trying to enter level l, then at least one is blocked (i.e., continues to wait at that level).

The Filter lock is implemented as follows:

/**
 * @author Furkan KAMACI
 */
public class Filter extends AbstractDummyLock implements Lock {

    /* Due to Java Memory Model, int[] not used for level and victim variables.
       Java programming language does not guarantee linearizability, or even sequential consistency,
       when reading or writing fields of shared objects
       [The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.61.] */
    private AtomicInteger[] level;
    private AtomicInteger[] victim;

    private int n;

    /**
     * Constructor for Filter lock
     *
     * @param n thread count
     */
    public Filter(int n) {
        this.n = n;
        level = new AtomicInteger[n];
        victim = new AtomicInteger[n];
        for (int i = 0; i < n; i++) {
            level[i] = new AtomicInteger();
            victim[i] = new AtomicInteger();
        }
    }

    /**
     * Acquires the lock.
     */
    @Override
    public void lock() {
        int me = ConcurrencyUtils.getCurrentThreadId();
        for (int i = 1; i < n; i++) {
            level[me].set(i);
            victim[i].set(me);
            for (int k = 0; k < n; k++) {
                while ((k != me) && (level[k].get() >= i && victim[i].get() == me)) {
                    //spin wait
                }
            }
        }
    }

    /**
     * Releases the lock.
     */
    @Override
    public void unlock() {
        int me = ConcurrencyUtils.getCurrentThreadId();
        level[me].set(0);
    }
}

The Bakery lock algorithm maintains the first-come-first-served property by using a distributed version of the number-dispensing machines often found in bakeries: each thread takes a number in the doorway, and then waits until no thread with an earlier number is trying to enter [3]. The Bakery lock is implemented as follows:

/**
 * @author Furkan KAMACI
 */
public class Bakery extends AbstractDummyLock implements Lock {

    /* Due to Java Memory Model, int[] not used for flag and label variables.
       Java programming language does not guarantee linearizability, or even sequential consistency,
       when reading or writing fields of shared objects
       [The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.61.] */
    private AtomicBoolean[] flag;
    private AtomicInteger[] label;

    private int n;

    /**
     * Constructor for Bakery lock
     *
     * @param n thread count
     */
    public Bakery(int n) {
        this.n = n;
        flag = new AtomicBoolean[n];
        label = new AtomicInteger[n];
        for (int i = 0; i < n; i++) {
            flag[i] = new AtomicBoolean();
            label[i] = new AtomicInteger();
        }
    }

    /**
     * Acquires the lock.
     */
    @Override
    public void lock() {
        int i = ConcurrencyUtils.getCurrentThreadId();
        flag[i].set(true);
        label[i].set(findMaximumElement(label) + 1);
        for (int k = 0; k < n; k++) {
            while ((k != i) && flag[k].get()
                    && ((label[k].get() < label[i].get())
                    || ((label[k].get() == label[i].get()) && k < i))) {
                //spin wait
            }
        }
    }

    /**
     * Releases the lock.
     */
    @Override
    public void unlock() {
        flag[ConcurrencyUtils.getCurrentThreadId()].set(false);
    }

    /**
     * Finds the maximum element within an {@link java.util.concurrent.atomic.AtomicInteger} array
     *
     * @param elementArray element array
     * @return maximum element
     */
    private int findMaximumElement(AtomicInteger[] elementArray) {
        int maxValue = Integer.MIN_VALUE;
        for (AtomicInteger element : elementArray) {
            if (element.get() > maxValue) {
                maxValue = element.get();
            }
        }
        return maxValue;
    }
}

For these kinds of algorithms, a thread id scheme is needed that starts from 0 or 1 and increments one by one. The threads' names are set appropriately for that purpose (a small driver sketch appears after the references below). It should also be considered that the Java programming language does not guarantee linearizability, or even sequential consistency, when reading or writing fields of shared objects [4]. So, the level and victim variables for the Filter lock, and the flag and label variables for the Bakery lock, are defined as atomic variables. Anyone who wants to test the effects of the Java Memory Model can change those variables into int[] and boolean[] and run the algorithm with more than 2 threads. They will then see that the algorithm hangs for either Filter or Bakery, even though the threads are alive.

To test lock performance, a custom counter class is implemented with a getAndIncrement method as follows:

/**
 * gets and increments value up to a maximum number
 *
 * @return value before increment if it didn't exceed a defined maximum number. Otherwise returns maximum number.
 */
public long getAndIncrement() {
    long temp;
    lock.lock();
    try {
        if (value >= maxNumber) {
            return value;
        }
        temp = value;
        value = temp + 1;
    } finally {
        lock.unlock();
    }
    return temp;
}

There is a maximum-number barrier so that multiple application configurations can be tested fairly. The idea is that there is a fixed amount of work (incrementing a variable up to a desired number), and the question is how fast it can be finished with different numbers of threads. So, for the comparison, the "job" should be equal. This approach also measures the extra workload of that piece of code:

if (value >= maxNumber) { return value; }

for multiple threads, compared to an approach that calculates the unit work performance of threads (i.e. not putting up a maximum barrier, iterating in a loop up to a maximum number and then dividing the last value by the thread count).

This configuration was used for the performance comparison:

Threads: 1, 2, 3, 4, 5, 6, 7, 8
Retry count: 20
Maximum number: 10000000

This is the chart of results, which includes standard errors:

First of all, when you run a block of code in Java several times, there is internal optimization of the code. When the algorithm is run multiple times and the first output is compared to the second, the effect of this optimization can be seen: the first elapsed time should usually be greater than the second. For example:

currentTry = 0, threadCount = 1, maxNumber = 10000000, lockType = FILTER, elapsedTime = 500 (ms)
currentTry = 1, threadCount = 1, maxNumber = 10000000, lockType = FILTER, elapsedTime = 433 (ms)

Conclusion: From the chart it can be seen that the Bakery lock is faster than the Filter lock, with a low standard error. The reason is the Filter lock's lock method.
In the Bakery lock, as a fair approach, threads run one by one, but in the Filter lock they compete with each other. Java's ReentrantLock performs best compared to the others. On the other hand, the Filter lock gets worse linearly, while Bakery and ReentrantLock do not (the Filter lock's linear trend may become even clearer when run with many more threads). A higher thread count does not mean less elapsed time: 2 threads may be worse than 1 thread because of thread creation and locking/unlocking overhead. When the thread count starts to increase, elapsed time gets better for Bakery and ReentrantLock. However, when the thread count keeps increasing, it gets worse again. The reason is the number of real cores of the test computer running the algorithms.

The source code for the Filter and Bakery locks in Java can be downloaded from here: https://github.com/kamaci/filbak

[1] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.31-33.
[2] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.28.
[3] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.31.
[4] The Art of Multiprocessor Programming. Maurice Herlihy, Nir Shavit, 2008, pp.61.
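To make the thread-naming requirement concrete, here is a minimal driver sketch. It assumes, as the article describes, that ConcurrencyUtils.getCurrentThreadId() derives the id from the thread's name, so the threads are simply named "0", "1", and so on; the Counter constructor shown here (taking the lock and the maximum number) is hypothetical:

public class FilterLockDriver {

    public static void main(String[] args) throws InterruptedException {
        final int n = 4;                       // number of threads
        final long maxNumber = 10000000L;      // same barrier as in the article
        final Filter lock = new Filter(n);     // the spin lock under test
        final Counter counter = new Counter(lock, maxNumber); // hypothetical constructor

        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    // getAndIncrement() acquires and releases the lock internally, as shown above
                    while (counter.getAndIncrement() < maxNumber - 1) {
                        // keep incrementing until the barrier is reached
                    }
                }
            });
            threads[i].setName(String.valueOf(i)); // thread ids "0".."n-1" for ConcurrencyUtils
            threads[i].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
    }
}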
April 28, 2015
by Furkan Kamaci
· 8,826 Views · 2 Likes
article thumbnail
uniVocity-parsers: A powerful CSV/TSV/Fixed-width file parser library for Java
uniVocity-parsers is an open-source CSV/TSV/fixed-width file parser library for Java, providing many capabilities to read and write files with a simplified API, plus the powerful features shown below. Unlike other libraries out there, uniVocity-parsers built its own architecture for parsing text files, which focuses on maximum performance and flexibility while making it easy to extend and build new parsers.

Contents
1. Overview
2. Installation
3. Features Overview
4. Reading CSV/TSV/Fixed-width Files
5. Writing CSV/TSV/Fixed-width Files
6. Performance and Flexibility
7. Design and Implementations

1. Overview

I'm a Java developer working on a web-based system to evaluate telecommunication carriers' networks and produce reports. In the system, the CSV format was heavily involved for the network-related data, such as real-time network status (online/offline) for the broadband subscribers, and real-time traffic for each subscriber. Generally the size of a single CSV file would exceed 1 GB, with millions of rows included, and we were using the library JavaCSV as the CSV file parser. As the capacity of the carriers' networks and the time span our system monitors grew, the amount of data in CSV grew as well. My team and I had to work out a solution to achieve better performance (even by seconds) in CSV file processing, and better extensibility to provide much more customized functionality.

We came across the library uniVocity-parsers as our final solution after a lot of testing and analysis, and we found it great. In addition to better performance and extensibility, the library provides developers with simplified APIs, detailed documentation and tutorials, and commercial support for highly customized functionality. The project is hosted at GitHub with 62 stars and 8 forks (at the time of writing). Extensive documentation and tutorials are provided here and here. You can find more examples and news here as well. In addition, the well-known open-source project Apache Camel integrates uniVocity-parsers for reading and writing CSV/TSV/fixed-width files. Find more details here.

2. Installation

I'm using version 1.5.1, but refer to the official download page to see if there's a more recent version available. The project is also available in the Maven central repository, so you can add this to your pom.xml:

<dependency>
    <groupId>com.univocity</groupId>
    <artifactId>univocity-parsers</artifactId>
    <version>1.5.1</version>
</dependency>

3. Features Overview

uniVocity-parsers provides a list of powerful features which can fulfill all the requirements you might have for processing tabular representations of data. Check the following overview chart for the features:

4. Reading CSV/TSV/Fixed-width Files

Read all rows of a CSV:

CsvParser parser = new CsvParser(new CsvParserSettings());
List<String[]> allRows = parser.parseAll(getReader("/examples/example.csv"));

For the full list of demos of the reading features, refer to: https://github.com/uniVocity/univocity-parsers#reading-csv

5. Writing CSV/TSV/Fixed-width Files

Write data in CSV format with just 2 lines of code:

List<Object[]> rows = someMethodToCreateRows();
CsvWriter writer = new CsvWriter(outputWriter, new CsvWriterSettings());
writer.writeRowsAndClose(rows);

For the full list of demos of the writing features, refer to: https://github.com/uniVocity/univocity-parsers/blob/master/README.md#writing

6.
Performance and Flexibility

Here is the performance comparison we measured for uniVocity-parsers and JavaCSV in our system:

File size              Duration for JavaCSV parsing    Duration for uniVocity-parsers parsing
10 MB, 145453 rows     1138 ms                         836 ms
100 MB, 809008 rows    23 s                            6 s
434 MB, 4499959 rows   91 s                            28 s
1 GB, 23803502 rows    245 s                           70 s

Here are some performance comparison tables for almost all CSV parser libraries in existence, and you can see that uniVocity-parsers comes out significantly ahead of the other libraries in performance. uniVocity-parsers achieves its performance and flexibility with the following mechanisms (a short settings sketch appears at the end of this article):

Read input on a separate thread (enabled by invoking CsvParserSettings.setReadInputOnSeparateThread())
Concurrent row processor (refer to ConcurrentRowProcessor, which implements RowProcessor)
Extend ColumnProcessor to process columns with your own business logic
Extend RowProcessor to read rows with your own business logic

7. Design and Implementations

A bunch of processors in uniVocity-parsers are core modules, which are responsible for reading/writing data in rows and columns and executing data conversions. Here is the diagram of processors:

You can create your own processors easily by implementing the RowProcessor interface or extending the provided implementations. In the following example I simply used an anonymous class:

CsvParserSettings settings = new CsvParserSettings();
settings.setRowProcessor(new RowProcessor() {

    /**
     * initialize whatever you need before processing the first row, with your own business logic
     **/
    @Override
    public void processStarted(ParsingContext context) {
        System.out.println("Started to process rows of data.");
    }

    /**
     * process the row with your own business logic
     **/
    StringBuilder stringBuilder = new StringBuilder();

    @Override
    public void rowProcessed(String[] row, ParsingContext context) {
        System.out.println("The row in line #" + context.currentLine() + ": ");
        for (String col : row) {
            stringBuilder.append(col).append("\t");
        }
    }

    /**
     * After all rows were processed, perform any cleanup you need
     **/
    @Override
    public void processEnded(ParsingContext context) {
        System.out.println("Finished processing rows of data.");
        System.out.println(stringBuilder);
    }
});

CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new FileReader("/myFile.csv"));

The library offers a whole lot more features. I recommend you have a look, as it really made a difference in our project.
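As referenced in the feature list above, here is a minimal sketch of enabling the separate reading thread before parsing. The method name comes from the article; the boolean argument and the file path are assumptions of mine:

import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

import java.io.FileReader;
import java.util.List;

public class FastCsvRead {

    public static void main(String[] args) throws Exception {
        CsvParserSettings settings = new CsvParserSettings();
        // read the input on a separate thread, as listed in the performance features above
        settings.setReadInputOnSeparateThread(true);

        CsvParser parser = new CsvParser(settings);
        // parse the whole file into memory, as in the reading example above
        List<String[]> allRows = parser.parseAll(new FileReader("/myFile.csv"));
        System.out.println("Parsed " + allRows.size() + " rows");
    }
}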
April 27, 2015
by Jerry Joe
· 8,516 Views
article thumbnail
Diagnosing SST Errors with Percona XtraDB Cluster for MySQL
[This article was written by Stephane Combaudon]

State Snapshot Transfer (SST) is used in Percona XtraDB Cluster (PXC) when a new node joins the cluster, or to resync a failed node if Incremental State Transfer (IST) is no longer available. SST is triggered automatically, but there is no magic: if it is not configured properly, it will not work and new nodes will never be able to join the cluster. Let’s have a look at a few classic issues.

Port for SST is not open

The donor and the joiner communicate on port 4444, and if the port is closed on one side, SST will always fail. You will see in the error log of the donor that SST is started:

[...]
141223 16:08:48 [Note] WSREP: Node 2 (node1) requested state transfer from '*any*'. Selected 0 (node3)(SYNCED) as donor.
141223 16:08:48 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 6)
141223 16:08:48 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
141223 16:08:48 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.101:4444/xtrabackup_sst' --auth 'sstuser:s3cret' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '04c085a1-89ca-11e4-b1b6-6b692803109b:6''
[...]

But then nothing happens, and some time later you will see a bunch of errors:

[...]
2014/12/23 16:09:52 socat[2965] E connect(3, AF=2 192.168.234.101:4444, 16): Connection timed out
WSREP_SST: [ERROR] Error while getting data from donor node: exit codes: 0 1 (20141223 16:09:52.057)
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20141223 16:09:52.064)
WSREP_SST: [INFO] Cleaning up temporary directories (20141223 16:09:52.068)
141223 16:09:52 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.101:4444/xtrabackup_sst' --auth 'sstuser:s3cret' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid '04c085a1-89ca-11e4-b1b6-6b692803109b:6'
[...]

On the joiner side, you will see a similar sequence: SST is started, then hangs and is finally aborted:

[...]
141223 16:08:48 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 6)
141223 16:08:48 [Note] WSREP: Requesting state transfer: success, donor: 0
141223 16:08:49 [Note] WSREP: (f9560d0d, 'tcp://0.0.0.0:4567') turning message relay requesting off
141223 16:09:52 [Warning] WSREP: 0 (node3): State transfer to 2 (node1) failed: -32 (Broken pipe)
141223 16:09:52 [ERROR] WSREP: gcs/src/gcs_group.cpp:long int gcs_group_handle_join_msg(gcs_group_t*, const gcs_recv_msg_t*)():717: Will never receive state. Need to abort.

The solution is of course to make sure that the ports are open on both sides.

SST is not correctly configured

Sometimes you will see an error like this on the donor:

141223 21:03:15 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.234.102:4444/xtrabackup_sst' --auth 'sstuser:s3cretzzz' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --gtid 'e63f38f2-8ae6-11e4-a383-46557c71f368:0''
[...]
WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20141223 21:03:26.973)

And if you look at innobackup.backup.log:

141223 21:03:26 innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_file=/etc/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'sstuser' (using password: YES).
innobackupex: got a fatal error with the following stacktrace: at /usr//bin/innobackupex line 2995
main::mysql_connect('abort_on_error', 1) called at /usr//bin/innobackupex line 1530
innobackupex: Error: Failed to connect to MySQL server: DBI connect(';mysql_read_default_file=/etc/my.cnf;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock','sstuser',...) failed: Access denied for user 'sstuser'@'localhost' (using password: YES) at /usr//bin/innobackupex line 2979

What happened? The default SST method is xtrabackup-v2, and for it to work you need to specify a username/password in the my.cnf file:

[mysqld]
wsrep_sst_auth=sstuser:s3cret

And you also need to create the corresponding MySQL user:

mysql> GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost' IDENTIFIED BY 's3cret';

So you should check that the user has been correctly created in MySQL and that wsrep_sst_auth is correctly set.

Galera versions do not match

Here is another set of errors you may see in the error log of the donor:

141223 21:14:27 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101
141223 21:14:30 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101
141223 21:14:33 [Warning] WSREP: unserialize error invalid flags 2: 71 (Protocol error) at gcomm/src/gcomm/datagram.hpp:unserialize():101

Here the issue is that you are trying to connect a node using Galera 2.x to a node running Galera 3.x. This can happen if you try to use a PXC 5.5 node and a PXC 5.6 node. The right solution is probably to understand why you ended up with such inconsistent versions and make sure all nodes are using the same Percona XtraDB Cluster version and Galera version. But if you know what you are doing, you can also instruct the node using Galera 3.x to communicate with Galera 2.x nodes by specifying the following in the my.cnf file:

[mysqld]
wsrep_provider_options="socket.checksum=1"

Conclusion

SST errors can occur for multiple reasons, and the best way to diagnose an issue is to have a look at the error logs of the donor and the joiner. Galera is in general quite verbose, so you can follow the progress of SST on both nodes and see where it fails. Then it is mostly about being able to interpret the error messages.
April 27, 2015
by Peter Zaitsev
· 11,560 Views