The 9.2 release of Komodo IDE is live and includes new features such as Docker and Vagrant integration, collaboration improvements, a package installer, and UI changes.
Learn how to use ZooKeeper's service registration and discovery features to manage microservices when refactoring an existing monolithic application.
Distributed transaction tracing is a useful way to monitor and evaluate the performance of a microservices architecture, particularly when measuring requests end to end.
In this article, excerpted from the book Docker in Action, I will show you how to open access to shared memory between containers.

Linux provides a few tools for sharing memory between processes running on the same computer. This form of inter-process communication (IPC) performs at memory speeds. It is often used when the latency associated with network- or pipe-based IPC drags software performance below requirements. The best examples of shared-memory-based IPC usage are in scientific computing and some popular database technologies like PostgreSQL.

Docker creates a unique IPC namespace for each container by default. The Linux IPC namespace partitions shared memory primitives like named shared memory blocks and semaphores, as well as message queues. It is okay if you are not sure what these are; just know that they are tools used by Linux programs to coordinate processing. The IPC namespace prevents processes in one container from accessing the memory on the host or in other containers.

Sharing IPC Primitives Between Containers

I've created an image named allingeek/ch6_ipc that contains both a producer and a consumer. They communicate using shared memory. Listing 1 will help you understand the problem with running these in separate containers.

Listing 1: Launch a Communicating Pair of Programs

# start a producer
docker run -d -u nobody --name ch6_ipc_producer \
  allingeek/ch6_ipc -producer
# start the consumer
docker run -d -u nobody --name ch6_ipc_consumer \
  allingeek/ch6_ipc -consumer

Listing 1 starts two containers. The first creates a message queue and starts broadcasting messages on it. The second should pull from the message queue and write the messages to the logs. You can see what each is doing by using the following commands to inspect the logs of each:

docker logs ch6_ipc_producer
docker logs ch6_ipc_consumer

If you executed the commands in Listing 1, you will notice that something is wrong: the consumer never sees any messages on the queue. Each process used the same key to identify the shared memory resource, but they referred to different memory. The reason is that each container has its own shared memory namespace.

If you need to run programs that communicate with shared memory in different containers, then you will need to join their IPC namespaces with the --ipc flag. The --ipc flag has a container mode that will create a new container in the same IPC namespace as another target container.

Listing 2: Joining Shared Memory Namespaces

# remove the original consumer
docker rm -v ch6_ipc_consumer
# start a new consumer with a joined IPC namespace
docker run -d --name ch6_ipc_consumer \
  --ipc container:ch6_ipc_producer \
  allingeek/ch6_ipc -consumer

Listing 2 rebuilds the consumer container and reuses the IPC namespace of the ch6_ipc_producer container. This time the consumer should be able to access the same memory location where the server is writing. You can see this working by using the same commands to inspect the logs of each:

docker logs ch6_ipc_producer
docker logs ch6_ipc_consumer

Remember to clean up your running containers before moving on:

# remember: the v option will clean up volumes,
# the f option will kill the container if it is running,
# and the rm command takes a list of containers
docker rm -vf ch6_ipc_producer ch6_ipc_consumer

There are obvious security implications to reusing the shared memory namespaces of containers, but this option is available if you need it. Sharing memory between containers is a safer alternative to sharing memory with the host.
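If you want to verify the join in Listing 2, one quick check (a sketch of my own, not from the book excerpt) is to compare the IPC namespaces of the two containers' main processes from the host. It assumes a Linux Docker host with direct access to /proc:

# look up each container's main PID on the host, then compare namespace
# links; identical ipc:[inode] targets mean a single shared IPC namespace
P1=$(docker inspect --format '{{.State.Pid}}' ch6_ipc_producer)
P2=$(docker inspect --format '{{.State.Pid}}' ch6_ipc_consumer)
sudo readlink /proc/$P1/ns/ipc /proc/$P2/ns/ipc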
[This article was written by Sveta Smirnova]

Like any good, thus lazy, engineer I don't like to start things manually. Creating directories and configuration files, and specifying paths and ports on the command line, is too boring. I have already written about how I survive when I need to start a MySQL server (here). There is also MySQL Sandbox, which can be used for the same purpose.

But what do you do if you want to start Percona XtraDB Cluster this way? Fortunately we, at Percona, have engineers who created an automation solution for starting PXC. This solution uses Docker. To explore it you need to:

1. Clone the pxc-docker repository: git clone https://github.com/percona/pxc-docker
2. Install Docker Compose as described here
3. cd pxc-docker/docker-bld
4. Follow the instructions from the README file:
   a) ./docker-gen.sh 5.6 (docker-gen.sh takes a PXC branch as its argument; 5.6 is the default, and it looks for the branch on github.com/percona/percona-xtradb-cluster)
   b) Optional: docker-compose build (if you see it is not updating with your changes)
   c) docker-compose scale bootstrap=1 members=2 for a 3-node cluster
5. Check which ports were assigned to the containers:

$ docker port dockerbld_bootstrap_1 3306
0.0.0.0:32768
$ docker port dockerbld_members_1 4567
0.0.0.0:32772
$ docker port dockerbld_members_2 4568
0.0.0.0:32776

Now you can connect with the usual MySQL clients:

$ mysql -h 0.0.0.0 -P 32768 -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.6.21-70.1 MySQL Community Server (GPL), wsrep_25.8.rXXXX

Copyright (c) 2009-2015 Percona LLC and/or its affiliates
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

6. To change MySQL options, either pass a custom configuration as a mount at runtime, with something like volumes: /tmp/my.cnf:/etc/my.cnf in docker-compose.yml, or connect to the container's bash (docker exec -i -t container_name /bin/bash), change my.cnf, and run docker restart container_name.

Notes:
- If you don't want to build the images yourself, use the ready-to-use images.
- If you don't want to run Docker Compose as the root user, add yourself to the docker group.
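One extra sanity check, not part of the original steps: once connected, you can confirm that all three nodes joined the cluster via the standard Galera status variable wsrep_cluster_size (use whichever port docker port reported for your bootstrap container):

$ mysql -h 0.0.0.0 -P 32768 -uroot \
    -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"
# a healthy 3-node cluster reports wsrep_cluster_size = 3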
Written by Tim Brock.

Scatter plots are a wonderful way of showing (apparent) relationships in bivariate data. Patterns and clusters that you wouldn't see in a huge block of data in a table can become instantly visible on a page or screen. With all the hype around big data in recent years, it's easy to assume that having more data is always an advantage. But as we add more and more data points to a scatter plot, we can start to lose these patterns and clusters. This problem, a result of overplotting, is demonstrated in the animation below.

The data in the animation above is randomly generated from a pair of simple bivariate distributions. The distinction between the two distributions becomes less and less clear as we add more and more data.

So what can we do about overplotting? One simple option is to make the data points smaller. (Note that this is a poor "solution" if many data points share exactly the same values.) We can also make them semi-transparent. And we can combine these two options. These refinements certainly help when we have ten thousand data points. However, by the time we've reached a million points, the two distributions have seemingly merged into one again. Making points smaller and more transparent might help; nevertheless, at some point we may have to consider a change of visualization. We'll get on to that later.

But first let's try to supplement our visualization with some extra information. Specifically, let's visualize the marginal distributions. We have several options. There's far too much data for a rug plot, but we can bin the data and show histograms. Or we can use a smoother option: a kernel density plot. Finally, we could use the empirical cumulative distribution. This last option avoids any binning or smoothing, but the results are probably less intuitive. I'll go with the kernel density option here, but you might prefer a histogram. The animated GIF below is the same as the GIF above, but with the smoothed marginal distributions added. I've left scales off to avoid clutter and because we're only really interested in rough judgements of relative height.

Adding marginal distributions, particularly the distribution of variable 2, helps clarify that two different distributions are present in the bivariate data. The twin-peaked nature of variable 2 is evident whether there are a thousand data points or a million. The relative sizes of the two components are also clear. By contrast, the marginal distribution of variable 1 has only a single peak, despite coming from two distinct distributions. This should make it clear that adding marginal distributions is by no means a universal solution to overplotting in scatter plots.

To reinforce this point, the animation below shows a completely different set of (generated) data points in a scatter plot with marginal distributions. The data again comes from a random sample of two different 2D distributions, but both marginal distributions of the complete dataset fail to highlight this separation. As previously, when the number of data points is large, the distinction between the two clusters can't be seen from the scatter plot either.

Returning to point size and opacity, what do we get if we make the data points very small and almost completely transparent? We can now clearly distinguish two clusters in each dataset. It's difficult to make out any fine detail, though. Since we've lost that fine detail anyway, it seems apt to question whether we really want to draw a million data points.
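The original post doesn't include its plotting code, but the point-size and transparency tweaks are easy to reproduce. Here is a minimal stand-in using Python with NumPy and matplotlib; the two Gaussian clusters are made up for illustration and only approximate the article's generated data:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
# two hypothetical bivariate normal clusters standing in for the article's data
a = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=500_000)
b = rng.multivariate_normal([2.0, -1.0], [[1.0, -0.4], [-0.4, 1.0]], size=500_000)
xy = np.vstack((a, b))

# fight overplotting: tiny markers, heavy transparency, no marker edges
plt.scatter(xy[:, 0], xy[:, 1], s=1, alpha=0.01, edgecolors="none")
plt.xlabel("variable 1")
plt.ylabel("variable 2")
plt.show()

With alpha=0.01, on the order of a hundred points must overlap before a pixel saturates, which is what lets the dense cores of the clusters stand out again.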
Drawing a million points can be tediously slow, and in certain contexts impossible. 2D histograms are an alternative. By binning the data we can reduce the number of points to plot and, if we pick an appropriate color scale, pick out some of the features that were lost in the clutter of the scatter plot. After some experimenting I picked a color scale that runs from black through green to white at the high end. Note that this is (almost) the reverse of the effect created by overplotting in the scatter plots above.

In both 2D histograms we can clearly see the two different clusters representing the two distributions from which the data is drawn. In the first case we can also see that there are more counts in the upper-left cluster than in the bottom-right cluster, a detail that is lost in the scatter plot with a million data points (but more obvious from the marginal distributions). Conversely, in the case of the second dataset we can see that the "heights" of the two clusters are roughly comparable.

3D charts are overused, but here (see below) I think they actually work quite well in terms of providing a broad picture of where the data is and isn't concentrated. Feature occlusion is a problem with 3D charts, so if you're going to go down this route when exploring your own data, I highly recommend using software that allows for user interaction through rotation and zooming.

In summary, scatter plots are a simple and often effective way of visualizing bivariate data. If, however, your chart suffers from overplotting, try reducing point size and opacity. Failing that, a 2D histogram or even a 3D surface plot may be helpful. In the latter case, be wary of occlusion.
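For completeness, the 2D-histogram fallback is a one-liner on the same sketch data from the earlier snippet; a custom LinearSegmentedColormap gives a rough version of the black-through-green-to-white scale described above:

from matplotlib.colors import LinearSegmentedColormap

# build the black -> green -> white scale and bin the same xy points as before
bgw = LinearSegmentedColormap.from_list("bgw", ["black", "green", "white"])
plt.hist2d(xy[:, 0], xy[:, 1], bins=200, cmap=bgw)
plt.colorbar(label="count")
plt.show()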