DZone
The Latest Tools Topics

You Should Use JAXB Generated Classes for RESTful Web Services
Learn how to build RESTful web services in Spring that can utilize either JSON or XML.
August 10, 2015
by John Thompson
· 13,178 Views · 5 Likes
Use Your Raspberry Pi to Decode Sound on FM Frequencies
Learn how to configure rtl_fm and Direwolf for Decoding Amateur Radio Packet on the Raspberry Pi
August 9, 2015
by Kevin Hooke
· 3,629 Views · 2 Likes
ZooKeeper for Microservice Registration and Discovery
Learn how to use the service registration and discovery services in ZooKeeper to manage microservices when refactoring from an existing monolithic application.
August 6, 2015
by Arun Gupta
· 39,499 Views · 7 Likes
Liquibase: Git for the Database
Liquibase is a great tool, and comparable to Git for databases. While it might not technically be a source control system, it's packed with similar functionality.
August 5, 2015
by Nathan Voxland
· 9,978 Views · 2 Likes
Docker – How to SSH to a Running Container
Learn how to install SSH to a Docker container and how to SSH to other Docker containers.
August 3, 2015
by Ajitesh Kumar
· 61,446 Views · 4 Likes
This Week in Modern Software: State of DevOps 2015
Read about the state of DevOps, including Puppet Labs' 2015 report, cloud computing, and Apple Watches.
July 27, 2015
by Fredric Paul
· 2,569 Views
Unit Testing w/ JUnit Using Maven and IntelliJ - Pt.1
Take your first steps in using JUnit to unit test your Java code with the help of tools like Maven, a build tool, and IntelliJ IDEA, a popular IDE.
July 13, 2015
by John Thompson
· 59,676 Views · 5 Likes
Docker in Action: The Shared Memory Namespace
In this article, excerpted from the book Docker in Action, I will show you how to open access to shared memory between containers.

Linux provides a few tools for sharing memory between processes running on the same computer. This form of inter-process communication (IPC) performs at memory speeds. It is often used when the latency associated with network or pipe-based IPC drags software performance below requirements. The best examples of shared-memory-based IPC usage are in scientific computing and some popular database technologies like PostgreSQL.

Docker creates a unique IPC namespace for each container by default. The Linux IPC namespace partitions shared memory primitives like named shared memory blocks and semaphores, as well as message queues. It is okay if you are not sure what these are. Just know that they are tools used by Linux programs to coordinate processing. The IPC namespace prevents processes in one container from accessing the memory on the host or in other containers.

Sharing IPC Primitives Between Containers

I've created an image named allingeek/ch6_ipc that contains both a producer and a consumer. They communicate using shared memory. Listing 1 will help you understand the problem with running these in separate containers.

Listing 1: Launch a Communicating Pair of Programs

    # start a producer
    docker run -d -u nobody --name ch6_ipc_producer \
        allingeek/ch6_ipc -producer

    # start the consumer
    docker run -d -u nobody --name ch6_ipc_consumer \
        allingeek/ch6_ipc -consumer

Listing 1 starts two containers. The first creates a message queue and starts broadcasting messages on it. The second should pull from the message queue and write the messages to the logs. You can see what each is doing by using the following commands to inspect the logs of each:

    docker logs ch6_ipc_producer
    docker logs ch6_ipc_consumer

If you executed the commands in Listing 1, something should be wrong. The consumer never sees any messages on the queue. Each process used the same key to identify the shared memory resource, but they referred to different memory. The reason is that each container has its own shared memory namespace.

If you need to run programs that communicate with shared memory in different containers, then you will need to join their IPC namespaces with the --ipc flag. The --ipc flag has a container mode that will create a new container in the same IPC namespace as another target container.

Listing 2: Joining Shared Memory Namespaces

    # remove the original consumer
    docker rm -v ch6_ipc_consumer

    # start a new consumer with a joined IPC namespace
    docker run -d --name ch6_ipc_consumer \
        --ipc container:ch6_ipc_producer \
        allingeek/ch6_ipc -consumer

Listing 2 rebuilds the consumer container and reuses the IPC namespace of the ch6_ipc_producer container. This time the consumer should be able to access the same memory location where the producer is writing. You can see this working by using the following commands to inspect the logs of each:

    docker logs ch6_ipc_producer
    docker logs ch6_ipc_consumer

Remember to clean up your running containers before moving on:

    # remember: the v option will clean up volumes,
    # the f option will kill the container if it is running,
    # and the rm command takes a list of containers
    docker rm -vf ch6_ipc_producer ch6_ipc_consumer

There are obvious security implications to reusing the shared memory namespaces of containers, but this option is available if you need it. Sharing memory between containers is a safer alternative to sharing memory with the host.
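The idea that both programs must resolve the same key to the same memory can be illustrated outside Docker. The sketch below is not the program inside allingeek/ch6_ipc; it assumes Python 3.8+ and uses the standard library's `multiprocessing.shared_memory` module, and the names `produce`, `consume`, and `ch6_demo_block` are hypothetical. A writer creates a named block and a reader attaches to it by name alone.

```python
from multiprocessing import shared_memory


def produce(name: str, payload: bytes) -> shared_memory.SharedMemory:
    """Create a named shared memory block and copy the payload into it."""
    block = shared_memory.SharedMemory(name=name, create=True, size=len(payload))
    block.buf[:len(payload)] = payload
    return block  # the caller closes and unlinks the block when finished


def consume(name: str, length: int) -> bytes:
    """Attach to an existing block by name and read `length` bytes from it."""
    block = shared_memory.SharedMemory(name=name)
    try:
        return bytes(block.buf[:length])
    finally:
        block.close()


if __name__ == "__main__":
    # "ch6_demo_block" is a made-up name used only for this illustration.
    producer_block = produce("ch6_demo_block", b"hello")
    try:
        print(consume("ch6_demo_block", 5))
    finally:
        producer_block.close()
        producer_block.unlink()
```

The reader only succeeds because the name resolves to the block the writer created. When the same name resolves to different memory, as the article describes for containers with separate namespaces, the reader never sees the writer's data.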
July 9, 2015
by Jeff Nickoloff
· 39,325 Views · 1 Like
Playing with Percona XtraDB Cluster in Docker
[This article was written by Sveta Smirnova]

Like any good, thus lazy, engineer, I don't like to start things manually. Creating directories and configuration files, and specifying paths and ports on the command line, is too boring. I have already written about how I cope when I need to start a MySQL server (here). There is also MySQL Sandbox, which can be used for the same purpose. But what do you do if you want to start Percona XtraDB Cluster (PXC) this way?

Fortunately we, at Percona, have engineers who created an automation solution for starting PXC. This solution uses Docker. To explore it you need to:

1. Clone the pxc-docker repository:
       git clone https://github.com/percona/pxc-docker
2. Install Docker Compose as described here.
3. cd pxc-docker/docker-bld
4. Follow the instructions from the README file:
   a) ./docker-gen.sh 5.6 (docker-gen.sh takes a PXC branch as an argument, 5.6 is the default, and it looks for the branch on github.com/percona/percona-xtradb-cluster)
   b) Optional: docker-compose build (if you see it is not updating with changes)
   c) docker-compose scale bootstrap=1 members=2 for a 3-node cluster
5. Check which ports are assigned to the containers:
       $ docker port dockerbld_bootstrap_1 3306
       0.0.0.0:32768
       $ docker port dockerbld_members_1 4567
       0.0.0.0:32772
       $ docker port dockerbld_members_2 4568
       0.0.0.0:32776
   Now you can connect with the MySQL client as usual:
       $ mysql -h 0.0.0.0 -P 32768 -uroot
       Welcome to the MySQL monitor. Commands end with ; or \g.
       Your MySQL connection id is 10
       Server version: 5.6.21-70.1 MySQL Community Server (GPL), wsrep_25.8.rXXXX
       Copyright (c) 2009-2015 Percona LLC and/or its affiliates
       Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.
       Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
       Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
       mysql>
6. To change MySQL options, either pass the configuration file as a mount at runtime, with something like volume: /tmp/my.cnf:/etc/my.cnf in docker-compose.yml, or connect to the container's bash (docker exec -i -t container_name /bin/bash), then change my.cnf and run docker restart container_name.

Notes:
1. If you don't want to build, use the ready-to-use images.
2. If you don't want to run Docker Compose as the root user, add yourself to the docker group.
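The port-lookup step can be scripted. The following is a minimal sketch, assuming the `docker port` output has the `host:port` shape shown in the article (for example `0.0.0.0:32768`); the helper names `parse_port_mapping` and `mysql_command` are hypothetical and not part of pxc-docker. It turns one line of that output into the argument list for the `mysql` client call used above.

```python
from __future__ import annotations


def parse_port_mapping(line: str) -> tuple[str, int]:
    """Split a 'host:port' line, as printed by `docker port`, into its parts."""
    host, _, port = line.strip().rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"unexpected port mapping: {line!r}")
    return host, int(port)


def mysql_command(mapping: str, user: str = "root") -> list[str]:
    """Build the mysql client argument list for a published port mapping."""
    host, port = parse_port_mapping(mapping)
    return ["mysql", "-h", host, "-P", str(port), f"-u{user}"]


if __name__ == "__main__":
    # The mapping string is copied from the article's example output.
    print(" ".join(mysql_command("0.0.0.0:32768")))
```

Returning a list rather than a single string keeps the result ready for `subprocess.run` without shell quoting concerns.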
July 3, 2015
by Peter Zaitsev
· 4,881 Views
Git Workflows: The 4 Major Types
Git offers several types of workflows. Learn what they are and which type is best suited for your specific purpose.
July 3, 2015
by Madhuka Udantha
· 34,680 Views · 2 Likes
Using Camel, CDI Inside Kubernetes With Fabric8
Learn about how to integrate Apache Camel and Fabric8 into an existing Kubernetes CDI service.
July 2, 2015
by Ioannis Canellos
· 19,672 Views · 1 Like
Azure Service Bus – As I Understand It: Part II (Queues & Messages)
Continuing from my previous post about Azure Service Bus, in this post I will share my learning about queues and messages. The focus of this post will be on some of the undocumented things I found as we implemented support for queues and messages in Cloud Portam.

Queues

As mentioned in my previous post, queues are the simplest of the Azure Service Bus services and compare with the Azure Storage queue service in the sense that they provide a unidirectional messaging infrastructure, where a publisher publishes a message and the message is received by a receiver. There can be many receivers ready to receive messages, but only one receiver can receive a given message. No two receivers can receive a single message simultaneously.

Now some learning about queues.

Queue Name

A queue name can be up to 260 characters in length and can contain letters, numbers, periods (.), hyphens (-), and underscores (_). A queue name is case-insensitive.

Queue Size

When creating a queue, you must define the size of the queue. The queue size can be one of the following values: 1 GB, 2 GB, 3 GB, 4 GB or 5 GB. A queue size can't be changed once the queue is created. However, if you create a "partition enabled queue", then Service Bus creates 16 partitions, so your queue size is automatically multiplied by 16 and becomes 16 GB, 32 GB, 48 GB, 64 GB or 80 GB depending on the size you selected (this confused me initially :)).

Queue Properties

A Service Bus queue has many properties. Some of the properties can only be set at queue creation time, while some can only be set if you are using the "Standard" tier of Service Bus. (The original post shows screenshots from Cloud Portam for creating a queue.)

Status: indicates the status of a queue, active or disabled. Once a queue is disabled, it cannot send or receive messages.

Max delivery count (maxdeliverycount): indicates the maximum number of times a message can be delivered. Once this count has been exceeded, the message will either be removed from the queue or dead-lettered. The way I understand it, this property is used to manage poison messages. If a message is not processed successfully by receivers "x" number of times, just move it somewhere else for further inspection or remove it.

Message time to live (messagettl): indicates the time span for which a message will live inside a queue. If the message is not processed by that time, it will either be removed or dead-lettered. One interesting thing I noticed is that if you're using the "Standard" tier, a message can live forever in a queue, whereas in the "Basic" tier a message can only live for a maximum of 14 days.

Lock duration (lockduration): indicates the number of seconds for which a message will be locked by a receiver once it receives it, so that no other receiver can receive that message. It essentially gives the receiver time to process the message. Once this elapses, the message will be available to be received by another receiver. The maximum value for lock duration is 5 minutes (300 seconds).

Enable partitioning (enablepartitioning): indicates if the queue should be partitioned across multiple message brokers. As mentioned above, Service Bus automatically creates 16 partitions if this is enabled. This also increases the maximum size of the queue by a factor of 16. This property can only be set at queue creation time.

Enable deadlettering (enabledeadlettering): indicates if the messages in the queue should be moved to the dead-letter sub-queue once they expire. If this property is not set, then the messages will be removed from the queue once they expire.

Enable batching (enablebatchedoperations): indicates if server-side batched operations are supported. This is used to improve the throughput of a queue, as Service Bus holds the messages for up to 20 ms before writing or deleting them in a batch.

Enable message ordering (supportordering): indicates if the queue supports ordering.

Requires duplicate detection (requiresduplicatedetection): indicates if the queue requires duplicate detection. This property can only be set at queue creation time and is only available for the "Standard" tier.

Enable express (enableexpress): indicates if the queue is an express queue. An express queue holds a message in memory temporarily before writing it to persistent storage. This property can only be set at queue creation time and is only available for the "Standard" tier.

Requires session (requiressession): indicates if the queue supports the concept of sessions. This property can only be set at queue creation time and is only available for the "Standard" tier.

Auto delete queue: specifies a time period after which an idle queue should be deleted automatically by Service Bus. The minimum period allowed is 5 minutes. This can only be set for the "Standard" tier.

Duplicate detection history time window (duplicatedetectionhistorytimewindow): defines the duration of the duplicate detection history. This can only be set for the "Standard" tier.

Forward messages to queue/topic (forwardto): you can use this property to automatically forward messages from a queue to another queue or topic. When setting this property, the queue/topic must exist in the account. This can only be set for the "Standard" tier.

Forward dead-lettered messages to queue/topic (forwarddeadletteredmessagesto): you can use this property to automatically forward dead-lettered messages to another queue or topic. When setting this property, the queue/topic must exist in the account.

User metadata (usermetadata): you can use this property to define any custom metadata for a queue.

The following table summarizes property applicability by tier and whether each property is editable.

    Property                                        Tier              Editable?
    Size                                            Basic, Standard   No
    Status                                          Basic, Standard   Yes
    Max delivery count                              Basic, Standard   Yes
    Message time to live                            Basic, Standard   Yes
    Lock duration                                   Basic, Standard   Yes
    Enable partitioning                             Basic, Standard   No
    Enable deadlettering                            Basic, Standard   Yes
    Enable batching                                 Basic, Standard   Yes
    Enable message ordering                         Basic, Standard   Yes
    Requires duplicate detection                    Standard          No
    Enable express                                  Standard          No
    Require session                                 Standard          No
    Auto delete queue                               Standard          Yes
    Duplicate detection history time window         Standard          Yes
    Forward messages to queue/topic                 Standard          Yes
    Forward dead-lettered messages to queue/topic   Basic, Standard   Yes
    User metadata                                   Basic, Standard   Yes

To learn more about these properties, please see this link: https://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.queuedescription.aspx .

Messages

The way I see it, messages are the entities that contain information about the work a sender wants a receiver to do. As mentioned earlier, a sender sends a message to a queue and a receiver receives the message. At any time, a message will be received by one and only one receiver.

Message Processing

There are two ways in which a receiver can receive a message: peek and lock, and receive and delete.

Peek and lock: in this mode, the message is locked by the receiver for the duration specified by the queue's "lock duration" property. In other words, the message is hidden from other receivers for that duration. The receiver then processes the message and afterwards marks it as "complete", which essentially deletes the message from the queue. If the lock duration expires, other receivers will be able to fetch this message.

Receive and delete: in this mode, once the message is received by a receiver it is deleted from the queue automatically. If the receiver fails to process that message, then the message is lost forever. So use this mode cautiously, unless you're sure the receiver will never fail or you don't care whether the message is processed successfully.

Message Composition

A message in Service Bus consists of three things: the message body, standard properties and custom properties. The message body is the actual content of the message. There are some predefined properties of a message, and those fall under standard properties. Apart from that, you can define custom properties on a message, which are essentially a collection of name/value pairs. The maximum total size of a message is 256 KB.

Message Properties

Now let's take a look at some of the standard properties of a message that I found interesting.

Message ID: this is the identifier of a message. You can set it at the time of sending a message. Because it is an identifier, one would assume that it needs to be unique, but that's not the case. Different messages can have the same message ID.

Sequence number: when a message is created, Service Bus assigns a number to the message. That number is stored in this property. Please note that it is a read-only property.

Message time to live (message TTL): this is the time period for which a message will remain in the queue. If you recall, you can also define a default message time to live at the queue level. Service Bus actually picks the lower of the two values as the message TTL. For example, if you have defined that a message will expire after 14 days at the queue level but after 5 minutes at the message level, then the message will expire after 5 minutes.

Lock token: whenever a message is received by a receiver in "peek and lock" mode, Service Bus returns a (lock) token that must be used to perform further operations (e.g. delete the message or dead-letter the message) on that message. This token is valid for the duration specified by the "lock duration" property. After the lock duration expires, the lock token becomes invalid and any attempt to use it for an otherwise allowed operation will result in an error. Once a lock token expires, a receiver must receive the message again.

There are other properties as well, which I have not included for the sake of brevity. For a complete list of properties, please see this link: https://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.brokeredmessage_properties.aspx .

Summary

That's it for this post. In the next posts in this series, I will share my learning about topics and other Service Bus services. So stay tuned for that! Again, if you think that I have provided some incorrect information, please let me know and I will fix it ASAP.
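Several of the rules described in this post can be modelled in a few lines. The sketch below is a toy in-memory model, not the Service Bus SDK: every name in it (`partitioned_size_gb`, `effective_ttl`, `ToyQueue`) is hypothetical, time is passed in explicitly as seconds, and it assumes that a message is dead-lettered once another delivery would exceed the max delivery count. It illustrates the 16x partitioned size, the "lower of the two TTLs" rule, and how an expired or superseded lock token is rejected in peek and lock mode.

```python
import itertools
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional, Tuple

PARTITION_COUNT = 16  # number of partitions, as stated in the post


def partitioned_size_gb(selected_gb: int, partitioned: bool) -> int:
    """Return the resulting queue size for a selected size of 1 to 5 GB."""
    if selected_gb not in (1, 2, 3, 4, 5):
        raise ValueError("queue size must be 1, 2, 3, 4 or 5 GB")
    return selected_gb * PARTITION_COUNT if partitioned else selected_gb


def effective_ttl(queue_ttl: timedelta, message_ttl: Optional[timedelta]) -> timedelta:
    """The lower of the queue-level and message-level TTL wins."""
    return queue_ttl if message_ttl is None else min(queue_ttl, message_ttl)


@dataclass(eq=False)
class _Entry:
    body: str
    delivery_count: int = 0
    locked_until: float = 0.0
    token: Optional[int] = None


class ToyQueue:
    """Toy model of peek-and-lock versus receive-and-delete."""

    def __init__(self, lock_seconds: float, max_delivery_count: int) -> None:
        self.lock_seconds = lock_seconds
        self.max_delivery_count = max_delivery_count
        self.dead_letters: List[str] = []
        self._entries: List[_Entry] = []
        self._tokens = itertools.count(1)

    def send(self, body: str) -> None:
        self._entries.append(_Entry(body))

    def receive_and_delete(self, now: float) -> Optional[str]:
        """Hand out the first unlocked message and delete it immediately."""
        for entry in self._entries:
            if entry.locked_until <= now:
                self._entries.remove(entry)
                return entry.body
        return None

    def peek_lock(self, now: float) -> Optional[Tuple[int, str]]:
        """Lock the first available message and return (lock token, body)."""
        for entry in list(self._entries):
            if entry.locked_until > now:
                continue  # still locked by another receiver
            if entry.delivery_count >= self.max_delivery_count:
                # Assumption: poison messages go to the dead-letter list.
                self._entries.remove(entry)
                self.dead_letters.append(entry.body)
                continue
            entry.delivery_count += 1
            entry.locked_until = now + self.lock_seconds
            entry.token = next(self._tokens)
            return entry.token, entry.body
        return None

    def complete(self, token: int, now: float) -> None:
        """Delete a locked message; fails if the token is stale or expired."""
        for entry in self._entries:
            if entry.token == token and entry.locked_until > now:
                self._entries.remove(entry)
                return
        raise ValueError("lock token is unknown or has expired")
```

A receiver that calls `peek_lock`, lets the lock lapse, and then calls `complete` with the old token gets an error and has to receive the message again, which matches the lock token behaviour described above.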
July 2, 2015
by Gaurav Mantri
· 8,625 Views
Announcing More Docker Support
It's a big week with DockerCon going on, and we have some great updates. At the show, we are demoing UrbanCode Build and Deploy building containers, storing them in registries, and deploying them through test environments and into production across hybrid clouds. Check out this quick overview from the team: For a deep dive on any of it, find the guys at the IBM booth at DockerCon. They'll be happy to show you!
July 2, 2015
by Eric Minick
· 1,534 Views · 1 Like
Interoute Virtual Data Centre is the fastest transatlantic cloud service
Double the throughput and lower latency than the leading global cloud providers between the US and Europe in independent comparison research

London & New York, 1 July, 2015. Interoute has today announced that its global cloud platform, Interoute Virtual Data Centre (VDC), has been proven to deliver nearly double the throughput across the Atlantic than the next best cloud provider in comparison research conducted by Cloud Spectator.

The research from March 2015 compared Interoute VDC with three leading cloud providers (Amazon AWS, Rackspace and Microsoft Azure), testing network throughput and latency between Europe and USA and between providers' European data centres. In all of the comparisons, Interoute VDC demonstrated the highest throughputs and lowest latencies. Cloud Spectator's full research report, and more information about Interoute VDC's performance and features, can be viewed here: http://bit.ly/1GHyzwJ

Network performance is a significant factor in cloud computing for business services requiring the highest network capacity (throughput) and the shortest possible time from the server to the client (latency), to meet the needs of the businesses and their users. Innovating new applications and business services in the cloud needs network performance to match, and this report shows the advantages of building the cloud into a huge global high performance network.

Key research findings:

Transatlantic:
  • Interoute VDC delivered 1.1 Gbit/s throughput, which was 96% better than Amazon AWS, 141% better than Rackspace, and 195% better than Microsoft Azure.
  • Interoute VDC had the lowest latency, between its London and New York data centres.
  • Interoute was the only provider in the comparison with both of its transatlantic data centres located in key business cities, meaning that VDC users can access compute and storage resources, and deliver data to their customers, from two centres of European and US business activity.

Within Europe:
  • Interoute VDC achieved 1.3 Gbit/s throughput between its London and Amsterdam data centres. This was 52% better than Amazon AWS (Dublin - Frankfurt) and 73% better than Microsoft Azure (Dublin - Amsterdam).
  • Interoute VDC achieved a latency of 6 milliseconds between London and Amsterdam, over three times better than the inter-data centre latency of the comparison providers.

Matthew Finnie, CTO of Interoute, commented: "This independent report confirms and validates our networked cloud strategy. Building cloud into a world class network provides our customers with significantly better performance when compared with the traditional cloud models. Businesses looking to grow between Europe and US should definitely be looking at the importance of these network characteristics for their ability to shift workloads into the cloud. Interoute's fourteen global zones are all built into high performance network with over 300 interconnects in Europe alone. So wherever you choose to put your data and connect to us, your services are typically going to perform faster on Interoute than on many other global providers."

Danny Gee, Senior Analyst, Cloud Spectator: "Users want to transfer large amounts of data between data centres quickly. Our study revealed that for a trans-Atlantic connection between cloud data centers, Interoute provided the highest throughput and lowest latency out of AWS, Rackspace and Azure. Interoute also had the higher network throughput and lowest latency in European testing compared to Azure and AWS (Rackspace was excluded, having only one location in Europe), making it a good option for users operating servers within this region. Interoute also provided the best latency, ideal for real-time communications. Users running geographically dispersed environments for such things as geo-redundancy would benefit from Interoute's high performance cloud connectivity."
July 1, 2015
by Fran Cator
· 1,123 Views
DevOps Tools for Continuous Delivery: Workloads Distribution and Jenkins Installation
The vast majority of software development companies place great emphasis on continuous integration and rapid delivery of new versions of their product. Obviously, when supplying enterprise-level projects, such processes need to be automated as much as possible. This is where cloud DevOps tools come in handy.

In today's article we'd like to pay special attention to the DevOps tools that automate continuous integration and delivery within the Jelastic PaaS, which can be installed on any bare metal or cloud infrastructure as a virtual private cloud or hybrid cloud. This is a fairly complex example of an enterprise application life cycle, with continuous integration and seamless migration through the DevOps pipeline from development to several production environments (you can use a simplified process if you have a less complex project).

The instructions below will be useful for Jelastic cluster administrators such as systems integrators, hosting service providers, enterprises, and ISV customers, who can easily implement them in their Jelastic Cloud installations. The guide also describes plenty of features and continuous integration tips that may interest other developers. So, let's get started with the first part of the instructions!

Setting Up Dedicated User Groups

First of all, you need to allocate separate hardware sets for all your project teams (one per development phase, i.e. development > testing > production) and adjust the access permissions so that the sets are completely isolated and not influenced by each other. The multi-region support for hybrid cloud, which became available in the recently released Jelastic 3.3 version, is optimally suited for this task.

To start with, create three hardware node groups (within one region) and, for convenience, name them after the corresponding stages (e.g. dev, test, production).

The next step is to prepare three user groups and attach them to the corresponding hardware. In our case the dev group has access to the dev hardware node group only, qa to the test one, and ops works specifically with the production set. This way, users from each group can use only the specified set of hardware, but they can still transfer their environments throughout the whole platform, between different teams' accounts.

Jenkins Continuous Integration Server Configuration

Now we need the integration tool that will control and perform all of the required operations automatically, i.e. build the cloud DevOps pipeline. We chose Jenkins as one of the most popular solutions for this purpose. It can be easily installed from our Marketplace, either at the corresponding site page or directly via the dashboard.

As a result, you'll get a clean Jenkins installation, which should be properly adjusted before you start organizing your application life cycle. Select the Open in browser button and proceed with the following configuration steps:

  • On the home page, click the Manage Jenkins option in the left-hand menu and select the Manage Plugins link in the list that appears.
  • After you've been redirected to the Plugin Manager, switch to the Available tab, find the following plugins using the search filter field above, and tick them for installation. The Git plugin is required for building our project's source (stored in the GitHub repository). The EnvFile plugin is used for storing system environment variables (it is needed because of security restrictions implemented at Jelastic, which forbid directly exporting environment variables from the Tomcat server). Click Install without restart when ready.
  • During the installation process, tick the Restart Jenkins when installation is complete and no jobs are running option to automatically restart Jenkins and enable the chosen plugins.
  • Then, you also need to install Maven, which will be used for building the project. For that, navigate to the Manage Jenkins > Configure System menu, scroll down to the Maven section and click Add Maven. Within the expanded section, type the desired name for your Maven installation (e.g. maven) and save the changes using the Save button at the bottom of the page. This way, the tool will be installed automatically when required (i.e. during the first app build).

Now your Jenkins server is well equipped for the further work.

Add Deployment Process Scripts to the Jenkins Container

The next step is to upload the scripts that you are going to use for automating the different organizational actions that have to be applied to your application at the intermediate development life cycle phases (like deploying it, placing it on the appropriate hardware according to the stage, running auto-tests, etc.).

The easiest way to do this is to access your Jenkins container via the Jelastic SSH gateway. If you haven't performed similar operations before, you need to:

  • generate an SSH keypair
  • add your public SSH key to the dashboard
  • access your account via the SSH protocol

Once inside, create a new folder for your project (we'll use demo) and move into it:

    mkdir /opt/tomcat/demo
    cd /opt/tomcat/demo

This location can be used for storing your scripts, variables, logs, etc. Here, you can upload the required scripts using a command of the following type:

    curl -fsSL {link_to_script} -o {file_name}

We also provide a set of script examples, which can be used as templates for your own:

  • install.sh: gets a user session and creates a new environment via the Jelastic API according to the specified manifest file. It also defines that the name of this environment will be equal to its creation date and time (a unique name is required for every script execution, but you won't be able to set it manually because this operation runs automatically). However, you can set your own dynamic name pattern to be used here.
  • transfer.sh: changes the environment ownership, based on the Jelastic environment transferring feature.
  • migrate.sh: physically moves an environment to another hardware set (hardnode group).

Note: before use, each of the script templates presented above has to be adjusted to work properly within a particular Jelastic installation. The parameters below must be substituted according to your platform's settings:

  • /path/to/scripts/: the full path to your scripts folder (created in the previous step)
  • {cloud_domain}: your Jelastic platform domain name
  • {jca_dashboard_appid}: your dashboard ID, which can be seen in the platform.dashboard_appid parameter at the JCA > About section
  • {jca_appstore_appid}: the appstore ID, listed in the same section (in the platform.appstore_appid parameter)
  • {url_to_manifest}: the link to the manifest file created according to our documentation (you may also use this one as an example; it sets up two Tomcat application servers with an NGINX load balancer in front of them)

Note: one more script, runtest.sh, is also uploaded. It simulates the testing activities for demonstration purposes, so we don't provide its code in this tutorial. If required, create your own according to the specifics of your application and upload it alongside the rest of the scripts.

In addition, you need to create a separate file for storing the variable with the environment name (as it needs to be changed dynamically each time a new environment is created):

    echo env_name= > /opt/tomcat/demo/variables

These are the main preparation steps for achieving automatic continuous integration and delivery of your web application with the help of Jenkins within the Jelastic Cloud platform.

In the second part of this blog series, we'll configure the set of jobs on the Jenkins server, which represents the core of our automation. Each of them will be devoted to a particular operation that has to be run at the corresponding application life cycle phase:

    create environment > build and deploy > dev tests > migrate to qa > qa tests > migrate to production

Stay tuned to see the next steps. If you still don't have a Jelastic installation, contact us to get access to our free demo for cloud platform evaluation, or just start with a trial registration at one of our hosting partners.
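The article says that install.sh names each new environment after its creation date and time, and that a variables file holds that name for later jobs. A minimal Python sketch of that naming step follows. The `env-YYYYMMDD-HHMMSS` pattern, the function names, and the single `env_name=` line are assumptions made for illustration; they are not taken from the real install.sh, and a given Jelastic installation may restrict which names are valid.

```python
from datetime import datetime
from pathlib import Path


def make_env_name(created: datetime, prefix: str = "env") -> str:
    """Build a name that is unique per second from the creation timestamp."""
    return f"{prefix}-{created:%Y%m%d-%H%M%S}"


def write_variables_file(path: Path, env_name: str) -> None:
    """Overwrite the variables file with the current environment name."""
    path.write_text(f"env_name={env_name}\n")


def read_env_name(path: Path) -> str:
    """Read the environment name back, as a later job might do."""
    for line in path.read_text().splitlines():
        key, _, value = line.partition("=")
        if key == "env_name":
            return value
    raise KeyError("env_name is not set in the variables file")


if __name__ == "__main__":
    # Hypothetical local path; the article uses /opt/tomcat/demo/variables.
    variables = Path("variables")
    write_variables_file(variables, make_env_name(datetime.now()))
    print(read_env_name(variables))
```

Overwriting the file on each run keeps exactly one current name in it, which mirrors the `echo env_name= > ...` redirect used to create the file.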
June 30, 2015
by Tetiana Markova
· 3,124 Views · 1 Like
Wrangling the Different Docker APIs
[This article was written by Alex Harford.]

Docker APIs are a convenient way for your systems to talk to Docker infrastructure, but working with them comes with some challenges. In this post I outline the steps you need to take, and the things you need to look out for, when working with Docker APIs.

Initial Docker Setup

Ensure you have the latest Docker client installed. It should be v1.6 or newer.

    [alexh:~/work] docker pull ubuntu
    latest: Pulling from ubuntu
    428b411c28f0: Pull complete
    435050075b3f: Pull complete
    9fd3c8c9af32: Pull complete
    6d4946999d4f: Already exists
    ubuntu:latest: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
    Digest: sha256:45e42b43f2ff4850dcf52960ee89c21cda79ec657302d36faaaa07d880215dd9
    Status: Downloaded newer image for ubuntu:latest
    [alexh:~/work] docker run -ti ubuntu /bin/bash
    root@1092e8ca2ead:/# ps
      PID TTY          TIME CMD
        1 ?        00:00:00 bash
       14 ?        00:00:00 ps
    root@1092e8ca2ead:/# exit
    exit

Daemons, Registries, Hubs

The Docker registry is used to host Docker images for download. In the simplest case, it can be a process serving static images. This would be a read-only registry supporting GET operations only. If you need something more complex, you need to use a Docker registry web service. You can run your own private Docker registry (http://www.activestate.com/blog/2014/01/deploying-your-own-private-docker-registry) or use the public official Docker Hub. The Docker Hub contains a Docker registry, but also includes other features, like user authentication. In our examples, we will run an unauthenticated Docker registry.

Setup

Most people using standard Docker images will pull from the Docker Hub, which is a publicly accessible Docker registry. However, a more complicated service may be talking to private Docker registries running different versions of the API.
Let’s assemble a test environment with both versions of the Docker registry API so we can see the different ways you can access it. First, pull down two versions of the Docker registry from the Docker Hub:

    docker pull registry:0.9.1
    0.9.1: Pulling from registry
    e9e06b06e14c: Pull complete
    a82efea989f9: Pull complete
    37bea4ee0c81: Pull complete
    07f8e8c5e660: Pull complete
    1f4ab7282e19: Pull complete
    3c27027cdae8: Pull complete
    7e0e5314436e: Pull complete
    2696504d3685: Pull complete
    012772dbb1c6: Pull complete
    e24d9fce1d00: Pull complete
    fd2726a79da8: Pull complete
    bffc32d7113a: Pull complete
    0cd49aa0e23c: Pull complete
    4e698fa80441: Already exists
    registry:0.9.1: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
    Digest: sha256:98937757728eecbd72c9276bf711260aa29896f15217ce05be0562287e73232d
    Status: Downloaded newer image for registry:0.9.1
    [alexh:~/work] docker pull registry:2.0.1
    2.0.1: Pulling from registry
    39bb80489af7: Pull complete
    df2a0347c9d0: Pull complete
    7a3871ba15f8: Pull complete
    a2703ed272d7: Pull complete
    68769176e114: Pull complete
    ab2ab59d7d1b: Pull complete
    882ecee9f360: Pull complete
    40de65f8e79f: Pull complete
    0c4f9c7d798f: Pull complete
    ca29675fe853: Pull complete
    89d10e9463e5: Pull complete
    1a5aa415e484: Pull complete
    3ea7a9e93b04: Pull complete
    769d811a57fd: Pull complete
    ae8a4a3af1aa: Pull complete
    85cc9a791bb5: Pull complete
    9cd2c8646022: Pull complete
    048c32c549b9: Pull complete
    cbbbda28c189: Pull complete
    2602c005e534: Pull complete
    136beb445cfa: Pull complete
    0c5e5ef1d7da: Already exists
    registry:2.0.1: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
    Digest: sha256:0cd177d687589aff586aa2c66c64d1c25657b8d09cff9e1492192f496e7786c3
    Status: Downloaded newer image for registry:2.0.1

The next step is to start them.
We will start the v1 registry on port 5000, and the v2 registry on port 6000. The v1 registry occasionally fails when starting due to a lock file race condition, so tell it to restart if necessary.

    [alexh:~/work] docker run -p 5000:5000 -d --restart=on-failure:3 registry:0.9.1
    896c651b9bfa9780b14e3710d20428baab8497c30b9bc89946b192e1d1c145aa
    [alexh:~/work] docker run -p 6000:5000 -d registry:2.0.1
    e09d4204921c732879ee9b7544cd40a25275e0d1f1702cacd954412cfd586ffb
    [alexh:~/work] docker ps
    CONTAINER ID   IMAGE            COMMAND                 CREATED          STATUS          PORTS                    NAMES
    e09d4204921c   registry:2.0.1   "registry cmd/regist    4 seconds ago    Up 3 seconds    0.0.0.0:6000->5000/tcp   silly_albattani
    896c651b9bfa   registry:0.9.1   "docker-registry"       35 seconds ago   Up 34 seconds   0.0.0.0:5000->5000/tcp   jovial_leakey

Understanding Docker Namespaces

Docker has a concept of namespaces for its repositories, which can be confusing. Official Repositories (https://docs.docker.com/docker-hub/official_repos/) can be referred to without a username prefix:
  • CentOS
  • Ubuntu

Internally these are prefixed by library/. This means that commands like docker pull ubuntu:15.10 and docker pull library/ubuntu:15.10 are equivalent. If the name includes a '/' character (samalba/docker-registry), the left side refers to the username, and the right side refers to the image name in their public repository.

It gets more complex when accessing private registries. The format becomes HOST:PORT/[USERNAME/]IMAGE. However, you should note that there is no authentication performed at this layer of our Docker registry environment: anyone can push, pull, or delete from any 'user'. If the USERNAME is omitted, it is internally treated as being an 'official' image, and prefixed with library/.
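The naming rules above can be sketched in a few lines of Python. This is only an illustration, not the Docker client's code: the function name normalize_image_name is made up for this example, and the test used to tell a registry host from a username (a first component containing '.' or ':', or equal to localhost) is an assumption, as is the default tag of latest when none is given.

```python
def normalize_image_name(name):
    """Split an image name into (registry, repository, tag).

    Hypothetical helper for illustration; it mirrors the naming rules
    described in the text, not Docker's actual implementation.
    """
    registry = None
    remainder = name
    first, slash, rest = name.partition("/")
    # Assumption: a first component that looks like HOST or HOST:PORT
    # is a registry address rather than a username.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    repository, colon, tag = remainder.rpartition(":")
    if not colon:
        # No tag given: assume "latest".
        repository, tag = remainder, "latest"
    if "/" not in repository:
        # No username: treated as an 'official' image under library/.
        repository = "library/" + repository
    return registry, repository, tag


if __name__ == "__main__":
    for example in ("ubuntu:15.10",
                    "library/ubuntu:15.10",
                    "samalba/docker-registry",
                    "127.0.0.1:5000/test-ubuntu"):
        print(example, "->", normalize_image_name(example))
```

With these rules, ubuntu:15.10 and library/ubuntu:15.10 resolve to the same repository, and 127.0.0.1:5000/test-ubuntu resolves to library/test-ubuntu on the private registry.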
    docker pull 127.0.0.1:5000/library/test-ubuntu
    Pulling repository 127.0.0.1:5000/library/test-ubuntu
    FATA[0004] Error: image library/test-ubuntu:latest not found
    [alexh:~/work] docker tag 0fe5a10d2cf8 127.0.0.1:5000/test-ubuntu
    [alexh:~/work] docker push 127.0.0.1:5000/test-ubuntu
    The push refers to a repository [127.0.0.1:5000/test-ubuntu] (len: 1)
    Sending image list
    Pushing repository 127.0.0.1:5000/test-ubuntu (1 tags)
    Image 5c1d0c04c3b8 already pushed, skipping
    Image 8c63e4ac9a5f already pushed, skipping
    Image 5fc05c0feaea already pushed, skipping
    Image 0fe5a10d2cf8 already pushed, skipping
    Pushing tag for rev [0fe5a10d2cf8] on {http://127.0.0.1:5000/v1/repositories/test-ubuntu/tags/latest}
    [alexh:~/work] docker pull 127.0.0.1:5000/library/test-ubuntu
    Pulling repository 127.0.0.1:5000/library/test-ubuntu
    0fe5a10d2cf8: Download complete
    5c1d0c04c3b8: Download complete
    8c63e4ac9a5f: Download complete
    5fc05c0feaea: Download complete
    Status: Image is up to date for 127.0.0.1:5000/library/test-ubuntu:latest

In the v2 Docker registry, the URI scheme (https://docs.docker.com/registry/spec/api/#overview) has changed to allow the repository name to be broken up into multiple components. However, the Docker client does not yet support this flexibility. In the future, you should be able to extend the namespace of your registries, e.g. redhat/centos/beta or redhat/fedora/stable.

Populating the Registries

We'll use Ubuntu 15.10 as our example image:

    docker pull ubuntu:15.10
    15.10: Pulling from ubuntu
    5c1d0c04c3b8: Pull complete
    8c63e4ac9a5f: Pull complete
    5fc05c0feaea: Pull complete
    0fe5a10d2cf8: Already exists
    ubuntu:15.10: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
    Digest: sha256:d569b6ebfc62f35f9792392724bd4a74a4f5f5af10ccbc1880974ae2f0660898
    Status: Downloaded newer image for ubuntu:15.10

The image needs to be tagged with the new URL in order to push it to the private registries:

    [alexh:~/work] docker tag ubuntu:15.10 127.0.0.1:5000/ubuntu:15.10
    [alexh:~/work] docker tag ubuntu:15.10 127.0.0.1:6000/ubuntu:15.10
    [alexh:~/work] docker push 127.0.0.1:5000/ubuntu:15.10
    The push refers to a repository [127.0.0.1:5000/ubuntu] (len: 1)
    Sending image list
    Pushing repository 127.0.0.1:5000/ubuntu (1 tags)
    5c1d0c04c3b8: Image successfully pushed
    8c63e4ac9a5f: Image successfully pushed
    5fc05c0feaea: Image successfully pushed
    0fe5a10d2cf8: Image successfully pushed
    Pushing tag for rev [0fe5a10d2cf8] on {http://127.0.0.1:5000/v1/repositories/ubuntu/tags/15.10}
    [alexh:~/work] docker push 127.0.0.1:6000/ubuntu:15.10
    The push refers to a repository [127.0.0.1:6000/ubuntu] (len: 1)
    0fe5a10d2cf8: Image already exists
    5fc05c0feaea: Image successfully pushed
    8c63e4ac9a5f: Image successfully pushed
    5c1d0c04c3b8: Image successfully pushed
    Digest: sha256:1f93077ce8f2fa1da8aae87735f395eae93a1c21928d3e2d130717c9aeff177d

Note that the output from the v1 registry (on port 5000) and the v2 registry (on port 6000) is slightly different, but the result is the same: the Ubuntu image is now available on each registry.

Docker Registry APIs

At this point, we're able to compare the different APIs. In April 2015, Docker released version 1.6 (http://docs.docker.com/v1.6/release-notes/), which included v2 of the Registry. Your software should be aware of the different versions of the Docker Registry API to handle these differences. Let's look at what it takes to download the image layers through the various APIs in order to make an offline cache.
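As a rough map of the v1 walk-through that follows: a tag is resolved to an image ID, the image ID is resolved to its ancestry (the list of layer IDs), and each layer is then fetched by ID. The sketch below only builds the URLs for those three steps, using the URL shapes that appear in the curl commands in this article. The function names are made up for this example, and no network calls are made.

```python
def v1_tag_url(base, image, tag):
    """URL that returns the image ID for a tag (images without a username live under library/)."""
    return "{0}/v1/repositories/library/{1}/tags/{2}".format(base, image, tag)


def v1_ancestry_url(base, image_id):
    """URL that returns the list of layer IDs for an image, newest first."""
    return "{0}/v1/images/{1}/ancestry".format(base, image_id)


def v1_layer_url(base, image_id):
    """URL that returns the layer tarball for one image ID."""
    return "{0}/v1/images/{1}/layer".format(base, image_id)


def v1_download_plan(base, ancestry):
    """Pair each layer URL with the local file name convention used in this article."""
    return [(v1_layer_url(base, image_id), image_id + ".tar.gz")
            for image_id in ancestry]


if __name__ == "__main__":
    base = "http://127.0.0.1:5000"
    print(v1_tag_url(base, "ubuntu", "15.10"))
    # Shortened placeholder IDs; a real ancestry holds full 64-character IDs.
    for url, filename in v1_download_plan(base, ["aaa", "bbb"]):
        print(url, "->", filename)
```

Against the Docker Hub the same URL shapes apply, but each request also needs the Authorization: Token header shown later in the article.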
First, we'll prepare our environment: [alexh:~/work] export image=ubuntu [alexh:~/work] export tag=15.10 v1 The v1 private registry can be examined at this point: [alexh:~/work] curl -s http://127.0.0.1:5000/v1/repositories/library/$image/tags/$tag | python -m json.tool "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547" export v1_image_id=`curl -s http://127.0.0.1:5000/v1/repositories/library/$image/tags/$tag | sed 's/"//g'` [alexh:~/work] curl -s http://127.0.0.1:5000/v1/images/$v1_image_id/ancestry | python -m json.tool [ "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824", "5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73" ] [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547/layer > 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1/layer > 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824/layer > 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:5000/v1/images/5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73/layer > 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73.tar.gz v1 on Docker Hub The Docker Hub currently implements the v1 API, but requires an authentication token for certain operations. It also allows multiple endpoints to be returned by the server. 
We'll take the simple approach of always using the first endpoint: [alexh:~/work] export endpoint=`curl -sSL -o /dev/null -D- "https://index.docker.io/v1/repositories/$image/images" | awk '/X-Docker-Endpoints/{print $2}' | tr -d '\r' | sed 's/,//'` [alexh:~/work] echo $endpoint registry-1.docker.io [alexh:~/work] export token=`curl -sSL -o /dev/null -D- -H 'X-Docker-Token: true' "https://index.docker.io/v1/repositories/$image/images" | tr -d '\r' | awk '/X-Docker-Token/{print $2}'` The token needs to be used for authentication for the rest of the commands, but otherwise they are the same as the v1 private registry: [alexh:~/work] export v1_image_id=`curl -s -H "Authorization: Token $token" https://$endpoint/v1/repositories/library/$image/tags/$tag | sed 's/"//g'` [alexh:~/work] curl -sSL -H "Authorization: Token $token" "https://registry-1.docker.io/v1/images/$v1_image_id/ancestry" | python -m json.tool [ "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824", "5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73" ] [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547/layer > 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1/layer > 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" https://$endpoint/v1/images/8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824/layer > 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824.tar.gz [alexh:~/work] curl -sSL -H "Authorization: Token $token" 
https://$endpoint/v1/images/5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73/layer > 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73.tar.gz v2 API The v2 API works with manifest files that include checksums. It's also slightly simpler. A manifest file for a tag contains all of the layer information, rather than requiring an image ID to be looked up for a tag, and then the ancestry for that image to be looked up. [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/manifests/$tag | python -c 'import sys, json, pprint; pprint.pprint(json.load(sys.stdin)["fsLayers"])' [{u'blobSum': u'sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'}, {u'blobSum': u'sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4'}, {u'blobSum': u'sha256:d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d'}, {u'blobSum': u'sha256:23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca'}, {u'blobSum': u'sha256:7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107'}] [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 > a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d > d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca > 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca.tar.gz [alexh:~/work] curl -sSL http://127.0.0.1:6000/v2/$image/blobs/sha256:7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107 > 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107.tar.gz We can get the checksum for these files to verify that they are what is described in the manifest file: 
[alexh:~/work] sha256sum *.tar.gz a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4 a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4.tar.gz d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d d4d342aa9da086ca4b7f7273858072e81021f4379c486223bc4708df6862b55d.tar.gz 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca 23dc26e1038ae691b1a7e8e0152f974a358c42c929104c18c8e20b6d363c41ca.tar.gz 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107 7772c716a45a828e124d20bc67199e77f2e63fb62589d0046f974f99b406e107.tar.gz The Remote (daemon) API Another API that is available is the Docker daemon running locally. It can be accessed over a Unix socket, or over TCP if the daemon is configured to allow it. [alexh:~/work] echo -e "GET /images/json HTTP/1.0\r\n" | nc -U /var/run/docker.sock | tail -n +6 | python -m json.tool [ { "Created": 1433116930, "Id": "0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547", "Labels": {}, "ParentId": "5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1", "RepoDigests": [], "RepoTags": [ "127.0.0.1:6000/ubuntu:15.10", "ubuntu:15.10", "127.0.0.1:5000/ubuntu:15.10" ], "Size": 0, "VirtualSize": 132392276 }, { "Created": 1432704049, "Id": "0c5e5ef1d7dac23c7164ea48faafc79f0c921f6cf87d2d8ea7469832ea31e4ca", "Labels": {}, "ParentId": "136beb445cfa7f48dbe4e36a80a83d4b7945682827fd8bfb1510ac17b6a200c0", "RepoDigests": [], "RepoTags": [ "registry:2.0.1" ], "Size": 0, "VirtualSize": 548626543 }, { "Created": 1432703977, "Id": "4e698fa804417b34b334793bab8a143403be9384e0651067b0c3933fe8d90eb2", "Labels": {}, "ParentId": "0cd49aa0e23cfe176cbea4bf622d552a6f16b21965cf52d633f8c9e27438f52c", "RepoDigests": [], "RepoTags": [ "registry:0.9.1" ], "Size": 0, "VirtualSize": 413940033 } ] A tarball containing all of the layers for a tag can be generated: [alexh:~/work] echo -e "GET /images/get?names=$image:$tag HTTP/1.0\r\n" | nc -U /var/run/docker.sock | tail -n +5 > 
$image-$tag.tar [alexh:~/work] mkdir tmp [alexh:~/work] tar -C tmp -xf ubuntu-15.10.tar [alexh:~/work] ls -l tmp total 20 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 0fe5a10d2cf8cdb378a39a81d87b0c8fcfa8fcaaf11bba895a1b6f72baf9a547 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 5c1d0c04c3b846fffd1d70886c956927a5c5f6a1c96f5e9f61c02f2ec1a45a73 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 5fc05c0feaeab977e52b7c2490bffacaba0e3d58e7955b683f271041d3558ad1 drwxr-xr-x 2 alexh alexh 4096 Jun 2 15:33 8c63e4ac9a5f31e482d25a149b022209653b5948cb4f045c2ede9331a18e5824 -rw-r--r-- 1 alexh alexh 87 Jun 2 15:33 repositories Conclusions Docker is a great technology and there are a lot of improvements and new features coming out at a rapid pace. Fortunately it's well documented and discussions about bugs are in the open on GitHub. However, there are still some edge cases to be aware of when talking to the Docker APIs. With some good design choices, your applications can be made backwards and forwards compatible, and will be able to use a wide range of Docker client versions and remote APIs.
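As a companion to the v2 section above, here is a minimal Python sketch of the same bookkeeping: read the fsLayers list from a manifest, skip repeated blobSum entries (the example manifest lists one digest twice), build the blob URLs, and check downloaded bytes against the digest. The function names are made up for this example, the manifest is assumed to be already parsed into a dict, and the digest is assumed to have the algorithm:hex form shown in the output above.

```python
import hashlib


def unique_blob_sums(manifest):
    """Return the blobSum digests from a parsed v2 manifest, in order, without repeats."""
    seen = set()
    digests = []
    for layer in manifest["fsLayers"]:
        digest = layer["blobSum"]
        if digest not in seen:
            seen.add(digest)
            digests.append(digest)
    return digests


def blob_url(base, image, digest):
    """URL of one blob, following the /v2/<image>/blobs/<digest> shape used above."""
    return "{0}/v2/{1}/blobs/{2}".format(base, image, digest)


def matches_digest(data, digest):
    """True if the bytes hash to the digest, e.g. 'sha256:<hex>'."""
    algorithm, _, expected = digest.partition(":")
    return hashlib.new(algorithm, data).hexdigest() == expected


if __name__ == "__main__":
    # Tiny stand-in manifest; real blobSum values are full sha256 digests.
    payload = b"example layer bytes"
    digest = "sha256:" + hashlib.sha256(payload).hexdigest()
    manifest = {"fsLayers": [{"blobSum": digest}, {"blobSum": digest}]}
    for d in unique_blob_sums(manifest):
        print(blob_url("http://127.0.0.1:6000", "ubuntu", d))
    print(matches_digest(payload, digest))
```

Checking the digest in code plays the same role as the sha256sum step in the article: a cached layer is only trusted if its hash matches the manifest.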
June 30, 2015
by Kathy Thomas
· 1,904 Views · 2 Likes
Azure Service Bus – As I Understand It: Part I (Overview)
Recently we started working on including support for Azure Service Bus in Cloud Portam. Prior to this, I had no experience with this service, though it has been around for quite some time. I always wanted to try it out, but one thing or another (oh, my stupid excuses :)!) prevented me from doing so. I learned a lot (and I am still learning) about this service while including support for it in Cloud Portam, and this blog post talks about what I learned. Please note that at the time of writing, all in all I have had about a week of learning about this service, so it is quite possible that I may be wrong about certain things. If that’s the case, please let me know and I will fix them ASAP. Now that the tone is set, let’s start!

Azure Service Bus Offering

The way I understand it, “Azure Service Bus” is a cloud-based messaging service that enables you to connect virtually anything, be it applications, services or devices. The beauty of Service Bus is that these things need not be in the cloud. They can run anywhere, even inside firewalled networks!

Another thing I learned is that “Azure Service Bus” is essentially an umbrella service. At the time of writing this post, there are actually four distinct services that are collectively offered under the “Service Bus” umbrella: Queues, Topics & Subscriptions, Relays and Notification Hubs. Each service serves a different purpose, yet the common theme is that all of them provide rich messaging infrastructure. To give you an analogy, if you have used Azure Storage Service you may already know that it offers four distinct services: Blobs, Files, Queues and Tables. It is the same with Service Bus as well.

Queues

Queues is the simplest of the services and compares with Azure Storage Queue Service in the sense that it provides a unidirectional messaging infrastructure where a publisher publishes a message and the message is received by a receiver.
There can be many receivers ready to receive messages; however, any given message is received by only one receiver. No two receivers can receive a single message simultaneously. For an in-depth comparison of Service Bus Queues and Storage Queues, please see this link: https://msdn.microsoft.com/en-us/library/azure/hh767287.aspx.

Topics

Topics are like queues in the sense that they also provide a unidirectional messaging infrastructure where a publisher publishes a message and receivers receive the message. The key difference is that the same message can be received by multiple receivers (subscribers). Each subscriber can optionally specify filter criteria so that it only receives the messages matching those criteria.

To understand the difference between the two, let’s consider an example. Let’s say you run an e-commerce site and, on successful completion of an order, you have two tasks: 1) send an email to the customer about the order and 2) notify the warehouse. If you were using Queues, you would either create two queues, putting the email notification message in one and the warehouse notification message in the other, or build a workflow where you send an order confirmation message to a queue. A receiver would take that message, send out an email, and then put a warehouse notification message in the same queue (or another queue); another receiver would then receive that message and notify the warehouse. However, if you were using Topics, things would be much simpler logistically. Essentially you would have just one message (order confirmation) but two subscribers: one responsible for sending the email confirmation and the other responsible for notifying the warehouse.

Relays

Unlike Queues and Topics, which provide a unidirectional flow of messages, a Relay provides a bi-directional flow. Using Relays, two disparate applications, services or devices can exchange messages.
The other key difference is that a Relay doesn’t store messages the way Queues and Topics do. It just passes the messages from source to destination.

Event Hubs

The Event Hubs service is meant for ingesting events and telemetry data in the cloud at massive scale (millions of events per second). Event Hubs are now more important than ever, considering the push for connected devices (Internet of Things).

Azure Service Bus Tiers

Azure Service Bus is offered in two tiers (or SKUs, if you like): Basic and Standard. The difference is the level of functionality offered in each tier and the pricing. For example, Topics, Relays and Notification Hubs are only offered under the Standard tier. Even with Queues, only a limited set of functionality is exposed under the Basic tier. For a list of features offered under each tier, please see this link: http://azure.microsoft.com/en-in/pricing/details/service-bus/.

Summary

That’s it for this post. In the next posts in this series, I will share what I learn about Queues and other Service Bus services. So stay tuned for that! Again, if you think that I have provided some incorrect information, please let me know and I will fix it ASAP.
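To make the Queues versus Topics difference from the e-commerce example a little more concrete, here is a toy, in-memory Python sketch. It does not use the Azure SDK and says nothing about how Service Bus is implemented. The class names ToyQueue and ToyTopic, the subscription names, and the "physical" flag used as a filter are all invented for illustration.

```python
from collections import deque


class ToyQueue:
    """Each message is handed to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns the oldest message, or None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class ToyTopic:
    """Each message is copied to every subscription whose filter accepts it."""

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, name, accepts=lambda message: True):
        # Each subscription gets its own private queue of matching messages.
        self._subscriptions[name] = (accepts, ToyQueue())

    def publish(self, message):
        for accepts, queue in self._subscriptions.values():
            if accepts(message):
                queue.send(message)

    def receive(self, name):
        return self._subscriptions[name][1].receive()


if __name__ == "__main__":
    orders = ToyTopic()
    orders.subscribe("email")
    # Hypothetical filter: the warehouse only cares about physical goods.
    orders.subscribe("warehouse", accepts=lambda m: m["physical"])

    orders.publish({"order_id": 1, "physical": True})
    orders.publish({"order_id": 2, "physical": False})

    print(orders.receive("email"), orders.receive("email"))
    print(orders.receive("warehouse"), orders.receive("warehouse"))
```

One published order confirmation reaches both subscribers without the publisher knowing about either of them, which is the logistical simplification described in the Topics section.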
June 30, 2015
by Gaurav Mantri
· 1,262 Views
The Secret to More Efficient Data Science with Neo4j and R [OSCON Preview]
It’s a sad but true fact: Most data scientists spend 50-80% of their time cleaning and munging data and only a fraction of their time actually building predictive models. This is most often true in a traditional stack, where most of this data munging consists of writing lines upon lines of some flavor of SQL, leaving little time for model-building code in statistical programming languages such as R. These long, cryptic SQL queries not only slow development but also prevent useful collaboration on analytics projects, as contributors struggle to understand each other’s SQL code.

For example, in graduate school, I was on a project team where we used Oracle to store Twitter data. The kinds of queries my classmates and I were writing were unmaintainable and impossible to understand unless the author was sitting next to you. No one worked on the same queries together because they were so unwieldy. This not only hindered our collaboration efforts but also slowed our progress on the project. If we had been using an appropriate data store (like a graph database) we would have spent significantly less time pulling our hair out over the queries.

Why Today’s Data Is Different

This data-munging problem has persisted in the data science field because data is becoming increasingly social and highly connected. Forcing this kind of interconnected data into an inherently tabular SQL database, where relationships are only abstract, leads to complicated schemas and overly complex queries. Yet several NoSQL solutions, specifically in the graph database space, exist to store today’s highly connected data. That is, data where relationships matter.

A lot of data analysis today is performed in the context of better understanding people’s behavior or needs, such as:
  • How likely is this visitor to click on advertisement X?
  • Which products should I recommend to this user?
  • How are User A and User B connected?
[Written by Nicole White]

People, as we know, are inherently social, so most of these questions can be answered by understanding the connections between people: User A is similar to User B, and we already know that User B likes this product, so let’s recommend this product to User A.

The Good News: Data-Munging No More

Data science doesn’t have to be 80% data munging. With the appropriate technology stack, a data scientist’s development process is seamless and short. It’s time to spend less time writing queries and more time building models by combining the flexibility of an open-source, NoSQL graph database with the maturity and breadth of R – an open-source statistical programming language. The combination of Neo4j’s ability to store highly-connected, possibly-unstructured data and R’s functional, ad-hoc nature creates the ideal data analysis environment. You don’t have to spend an hour writing CREATE TABLE statements. You don’t have to spend all day on StackOverflow figuring out how to traverse a tree in SQL. Just Cypher and go.

Learn More at OSCON 2015

At my upcoming OSCON session we will walk through a project in which we analyze #OSCON Twitter data in a reproducible, low-effort workflow without writing a single line of SQL. For this highly-connected dataset we will use Neo4j, an open-source graph database, to store and query the data while highlighting the advantages of storing such data in a graph versus a relational schema. Finally, we will cover how to connect to Neo4j from an R environment for the purposes of performing common data science tasks, such as analysis, prediction and visualization.
June 30, 2015
by Mark Needham
· 1,630 Views
Sync issues with your codes on GitHub
It’s no surprise that many, if not all, programmers use GitHub today to store their code, but it can be frustrating to keep everyone up to date with code changes. Recently, GitHub has been integrated with Quire, a tree-structured task management tool that lets programmers easily keep track of code changes. By linking GitHub commits to tasks (issues), users can refer to those tasks when they look at code changes, and trace back to the code when they look at the tasks. In a blog article, Quire goes into more detail about the new integration, what exactly users can do with it, and how they benefit from it. Check out the details at the link below.

Hello GitHub, We’re Quire | Quire Blog
June 30, 2015
by Crystal Chen
· 825 Views
Get CoreOS Logs into ELK in 5 Minutes
CoreOS Linux is the operating system for “Super Massive Deployments”. We wanted to see how easily we could get CoreOS logs into an Elasticsearch / ELK-powered centralized logging service. Here’s how to get your CoreOS logs into ELK in about 5 minutes, give or take. If you’re familiar with CoreOS and Logsene, you can grab the CoreOS/Logsene config files from GitHub. Here’s an example Kibana Dashboard you can get in the end:

CoreOS Kibana Dashboard

CoreOS is based on the following:
  • Docker and rkt for containers
  • systemd for startup scripts, and restarting services automatically
  • etcd as centralized configuration key/value store
  • fleetd to distribute services over all machines in the cluster. Yum.
  • journald to manage logs. Another yum.

Amazingly, with CoreOS managing a cluster feels a lot like managing a single machine! We’ve come a long way since ENIAC!

There’s one thing people notice when working with CoreOS: the repetitive inspection of local or remote logs using “journalctl -M machine-N -f | grep something”. It’s great to have easy access to logs from all machines in the cluster, but … grep? Really? Could this be done better? Of course, it’s 2015!

Here is a quick example that shows how to centralize logging with CoreOS with just a few commands. The idea is to forward the output of “journalctl -o short” to Logsene’s Syslog Receiver and take advantage of all its functionality: log searching, alerting, anomaly detection, integrated Kibana, even correlation of logs with Docker performance metrics. Hey, why not, it’s all available right there, so we may as well make use of it all! Let’s get started!

Preparation:

1) Get a list of IP addresses of your CoreOS machines:

    fleetctl list-machines

2) Create a new Logsene App (here)

3) Change the Logsene App Settings, and authorize the CoreOS host IP addresses from step 1) (here’s how/where)

Congratulations – you just made it possible for your CoreOS machines to ship their logs to your new Logsene app!
Test it by running the following on any of your CoreOS machines:

    journalctl -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com 10514

…and check if the logs arrive in Logsene (here). If they don’t, yell at us @sematext – there’s nothing better than public shaming on Twitter to get us to fix things. :)

Create a fleet unit file called logsene.service:

    [Unit]
    Description=Logsene Log Forwarder

    [Service]
    Restart=always
    RestartSec=10s
    ExecStartPre=/bin/sh -c "if [ -n \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" ]; then echo \"Value Exists: /sematext.com/logsene/`hostname`/lastlog $(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\"; else etcdctl set /sematext.com/logsene/`hostname`/lastlog \"`date +\"%%Y-%%m-%%d %%H:%%M:%%S\"`\"; true; fi"
    ExecStart=/bin/sh -c "journalctl --since \"$(etcdctl get /sematext.com/logsene/`hostname`/lastlog)\" -o short -f | ncat --ssl logsene-receiver-syslog.sematext.com 10514"
    ExecStopPost=/bin/sh -c "export D=\"`date +\"%%Y-%%m-%%d %%H:%%M:%%S\"`\"; /bin/etcdctl set /sematext.com/logsene/$(hostname)/lastlog \"$D\""

    [Install]
    WantedBy=multi-user.target

    [X-Fleet]
    Global=true

Activate cluster-wide logging to Logsene with fleet

To start logging to Logsene from all machines, activate logsene.service:

    fleetctl load logsene.service
    fleetctl start logsene.service

There. That’s all there is to it! Hope this worked for you! At this point all your CoreOS logs should be going to Logsene. Now you have a central place to see all your CoreOS logs. If you want to send your app logs to Logsene, you can do that, too: anything that can send logs via Syslog or to Elasticsearch can also ship logs to Logsene. If you want some Docker container and host monitoring to go with your CoreOS logs, just pull spm-agent-docker from Docker Registry. Enjoy!
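For readers who find the unit file above dense, its checkpoint logic can be restated in a few lines of Python. Before starting, the unit makes sure a per-host lastlog key exists in etcd. While running, it tails the journal from that timestamp. After stopping, it stores the current time under the same key, so the next start resumes from there. The sketch below only builds the key, the timestamp and the journalctl argument list. The function names are made up for this example, and nothing here talks to etcd or journald.

```python
from datetime import datetime

# Same shape as the date format used in the unit file.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def checkpoint_key(hostname):
    """etcd key under which a host's last-shipped timestamp is stored."""
    return "/sematext.com/logsene/{0}/lastlog".format(hostname)


def format_checkpoint(moment):
    """Render a datetime the way the unit file stores it in etcd."""
    return moment.strftime(TIMESTAMP_FORMAT)


def journalctl_command(last_log):
    """Argument list equivalent to the journalctl part of ExecStart."""
    return ["journalctl", "--since", last_log, "-o", "short", "-f"]


if __name__ == "__main__":
    # "core-01" is a placeholder host name.
    print(checkpoint_key("core-01"))
    stamp = format_checkpoint(datetime(2015, 6, 29, 12, 30, 5))
    print(journalctl_command(stamp))
```

Storing the timestamp per host is what lets the unit run with Global=true: every machine resumes from its own checkpoint.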
June 29, 2015
by Stefan Thies
· 2,608 Views