Adam Warski

Software Engineer at SoftwareMill

Warsaw, PL

Joined Apr 2006

https://softwaremill.com

About

I am one of the co-founders of SoftwareMill, where I code mainly using Scala and other interesting technologies. I am involved in open-source projects such as sttp, MacWire, Quicklens, ElasticMQ and others. I have been a speaker at major conferences such as JavaOne, LambdaConf, Devoxx and ScalaDays. Apart from writing closed- and open-source software, in my free time I try to read the Internet on various (functional) programming-related subjects. Any ideas or insights usually end up in a post on my blog (https://softwaremill.com/blog).

Stats

Reputation: 451
Pageviews: 287.5K
Articles: 3
Comments: 30

Articles

Benchmarking SQS
SQS (Simple Queue Service) is a message-queue-as-a-service offering from Amazon Web Services. It supports only a handful of messaging operations, far from the complexity of e.g. AMQP, but thanks to its easy-to-understand interface and as-a-service nature, it is very useful in a number of situations. But how fast is SQS? How does it scale? Is it useful only for low-volume messaging, or can it be used for high-load applications as well? If you know how SQS works and want to skip the details of the testing methodology, you can jump straight to the test results.

SQS Semantics

SQS exposes an HTTP-based interface. To access it, you need AWS credentials to sign the requests, but that's usually handled by a client library (there are libraries for most popular languages; we'll use the official Java SDK). The basic message-related operations are:

  • Send a message, up to 256 KB in size, encoded as a string. Messages can be sent in batches of up to 10 (but the total size is still capped at 256 KB).
  • Receive a message. Up to 10 messages can be received in a batch, if available in the queue.
  • Long-poll for messages. The request will wait up to 20 seconds if no messages are available initially.
  • Delete a message.

There are also some other operations, concerning security, delaying message delivery, and changing a message's visibility timeout, but we won't use them in the tests.

SQS offers an at-least-once delivery guarantee. When a message is received, it is blocked for a period called the "visibility timeout". Unless the message is deleted within that period, it will become available for delivery again. Hence if a node processing a message crashes, the message will be delivered again. However, we also run the risk of processing a message twice (if, e.g., the network connection dies while deleting the message, or if an SQS server dies), which we have to manage on the application side.
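The at-least-once contract described above can be modeled in a few lines. The sketch below is purely illustrative (a toy in-memory queue with a logical clock, not the SQS API or the benchmark code): a received message becomes invisible for the visibility timeout, and is redelivered unless deleted in time.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of SQS at-least-once semantics. A logical clock replaces real
// time so the behavior is deterministic; all names here are made up.
class VisibilityQueue {
    private final long visibilityTimeout;
    private long now = 0;                       // logical clock, in "seconds"
    private final Map<Long, String> messages = new LinkedHashMap<>();
    private final Map<Long, Long> invisibleUntil = new HashMap<>();
    private long nextId = 0;

    VisibilityQueue(long visibilityTimeout) { this.visibilityTimeout = visibilityTimeout; }

    void tick(long by) { now += by; }

    long send(String body) { messages.put(nextId, body); return nextId++; }

    // Returns the id of the first visible message (blocking it for the
    // visibility timeout), or -1 if nothing is currently deliverable.
    long receive() {
        for (long id : messages.keySet()) {
            if (invisibleUntil.getOrDefault(id, -1L) <= now) {
                invisibleUntil.put(id, now + visibilityTimeout);
                return id;
            }
        }
        return -1;
    }

    // Deleting within the visibility window is what prevents redelivery.
    void delete(long id) { messages.remove(id); invisibleUntil.remove(id); }

    public static void main(String[] args) {
        VisibilityQueue q = new VisibilityQueue(30);
        long id = q.send("hello");
        System.out.println(q.receive() == id);  // first delivery
        System.out.println(q.receive() == -1);  // invisible: no redelivery yet
        q.tick(31);                             // consumer "crashed"; timeout expired
        System.out.println(q.receive() == id);  // redelivered: at-least-once
        q.delete(id);
    }
}
```

This also makes the duplicate-processing risk concrete: a consumer that receives, processes, and then fails to delete will see the same message again after the timeout.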
SQS is a replicated message queue, so you can be sure that once a message is sent, it is safe and will be delivered. Quoting from the website:

Amazon SQS runs within Amazon's high-availability data centers, so queues will be available whenever applications need them. To prevent messages from being lost or becoming unavailable, all messages are stored redundantly across multiple servers and data centers.

Testing Methodology

To test how fast SQS is and how it scales, we will run various numbers of nodes, each running various numbers of threads, either sending or receiving simple, 100-byte messages.

Each sending node is parametrized with the number of messages to send, and it tries to do so as fast as possible. Messages are sent in batches, with batch sizes chosen randomly between 1 and 10. Message sends are synchronous, that is, we want to be sure that a request completed successfully before sending the next batch. At the end, the node reports the average number of messages per second that were sent.

Each receiving node receives messages in batches of at most 10. The AmazonSQSBufferedAsyncClient is used, which pre-fetches messages to speed up delivery. After receiving, each message is asynchronously deleted. The node assumes that testing is complete once it hasn't received any messages for a minute, and reports the average number of messages per second that it received.

Each test sends from 10,000 to 50,000 messages per thread, so the tests are relatively short: 2-5 minutes. There are also longer tests, which last about 15 minutes. The full (but still short) code is here: sender, receiver, sqsmq. One set of nodes runs the mqsender code, the other runs the mqreceiver code.

The sending and receiving nodes are m3.large EC2 servers in the eu-west region, hence with the following parameters:

  • 2 cores (Intel Xeon E5-2670 v2)
  • 7.5 GB RAM

The queue is, of course, also created in the eu-west region.
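The sending loop described above (synchronous sends, random batch sizes of 1-10, a throughput report at the end) can be sketched roughly as follows. This is an assumption-laden illustration, not the actual benchmark code: the `sendBulk` callback stands in for the real SQS batch-send call.

```java
import java.util.Random;
import java.util.function.IntConsumer;

// Sketch of a sending node's main loop: send `total` messages in synchronous
// batches of random size 1..10 (capped so we never overshoot), then report
// throughput. `sendBulk` is a stand-in for the real network call.
class BulkSender {
    // Returns the number of batch requests issued.
    static int sendAll(int total, IntConsumer sendBulk, Random rnd) {
        int sent = 0, requests = 0;
        while (sent < total) {
            int bulk = Math.min(1 + rnd.nextInt(10), total - sent); // size 1..10
            sendBulk.accept(bulk);   // synchronous: completes before the next batch
            sent += bulk;
            requests++;
        }
        return requests;
    }

    public static void main(String[] args) {
        int[] delivered = {0};
        long start = System.nanoTime();
        int requests = sendAll(10_000, bulk -> delivered[0] += bulk, new Random(42));
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.printf("sent %d msgs in %d batches (%.0f msg/s)%n",
                delivered[0], requests, delivered[0] / secs);
    }
}
```

With an average batch size of 5.5, 10,000 messages cost roughly 1,800 round-trips, which is why batching matters so much for the throughput numbers below.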
Minimal Setup

The minimal setup consists of 1 sending node and 1 receiving node, both running a single thread. The results, in messages/second:

         | average | min | max
Sender   | 429     | 365 | 466
Receiver | 427     | 363 | 463

Scaling Threads

How do these results scale when we add more threads (still using one sender and one receiver node)? The tests were run with 1, 5, 25, 50 and 75 threads. The numbers are average msg/second throughput:

Number of threads    | 1      | 5        | 25       | 50        | 75
Sender, per thread   | 429,33 | 407,35   | 354,15   | 289,88    | 193,71
Sender, total        | 429,33 | 2 036,76 | 8 853,75 | 14 493,83 | 14 528,25
Receiver, per thread | 427,86 | 381,55   | 166,38   | 83,92     | 47,46
Receiver, total      | 427,86 | 1 907,76 | 4 159,50 | 4 196,17  | 3 559,50

As you can see, on the sender side we get near-linear scalability as the number of threads increases, peaking at 14k msgs/second sent (on a single node!) with 50 threads. Going any further doesn't seem to make a difference.

The receiving side is slower, and that is to be expected, as receiving a single message is in fact two operations (receive + delete), while sending is a single operation. Scalability is worse here, but we can still get as much as 4k msgs/second received.

Scaling Nodes

Another (more promising) method of scaling is adding nodes, which is quite easy as we are "in the cloud". The test results when running multiple nodes, each running a single thread:

Number of nodes    | 1      | 2      | 4        | 8
Sender, per node   | 429,33 | 370,36 | 350,30   | 337,84
Sender, total      | 429,33 | 740,71 | 1 401,19 | 2 702,75
Receiver, per node | 427,86 | 360,60 | 329,54   | 306,40
Receiver, total    | 427,86 | 721,19 | 1 318,15 | 2 451,23

In this case, both on the sending and the receiving side, we get near-linear scalability, reaching 2.5k messages sent and received per second with 8 nodes.

Scaling Nodes and Threads

The natural next step is, of course, to scale both the nodes and the threads!
Here are the results when using 25 threads on each node:

Number of nodes             | 1        | 2         | 4         | 8
Sender, per node & thread   | 354,15   | 338,52    | 305,03    | 317,33
Sender, total               | 8 853,75 | 16 925,83 | 30 503,33 | 63 466,00
Receiver, per node & thread | 166,38   | 159,13    | 170,09    | 174,26
Receiver, total             | 4 159,50 | 7 956,33  | 17 008,67 | 34 851,33

Again, we get great scalability results, with the number of receive operations about half the number of send operations per second. 34k msgs/second processed is a very nice number!

To the Extreme

The highest results I managed to get are:

  • 108k msgs/second sent, using 50 threads and 8 nodes
  • 35k msgs/second received, using 25 threads and 8 nodes

I also tried running longer "stress" tests with 200k messages/thread, 8 nodes and 25 threads, and the results were the same as with the shorter tests.

Running the Tests, Technically

To run the tests, I built Docker images containing the sender/receiver binaries, pushed them to Docker's hub, and had them downloaded onto the nodes by Chef. To provision the servers, I used Amazon OpsWorks. This enabled me to quickly spin up and provision a lot of nodes for testing (up to 16 in the above tests). For details on how this works, see my "Cluster-wide Java/Scala application deployments with Docker, Chef and Amazon OpsWorks" blog post.

The sender/receiver daemons monitored a file on S3 (by checking its last-modification date each second). If a modification was detected, the file, which contained the test parameters, was downloaded, and the test started.

Summing Up

SQS has good performance and really great scalability characteristics. I wasn't able to reach the peak of its possibilities, which would probably require more than 16 nodes in total. But once your requirements get above 35k messages per second, chances are you need a custom solution anyway; not to mention that while SQS is cheap, it may become expensive under such loads.
From the results above, I think it is clear that SQS can be safely used for high-volume messaging applications and scaled on demand. Together with its reliability guarantees, it is a great fit for both small and large applications that do any kind of asynchronous processing, especially if your service already resides in the Amazon cloud. As benchmarking isn't easy, any remarks on the testing methodology or ideas on how to improve the testing code are welcome!
June 25, 2014
· 7,023 Views · 3 Likes
Akka vs Storm
I was recently working a bit with Twitter's Storm, and it got me wondering: how does it compare to another high-performance, concurrent-data-processing framework, Akka?

WHAT'S AKKA AND STORM?

Let's start with a short description of both systems.

Storm is a distributed, real-time computation system. On a Storm cluster, you execute topologies, which process streams of tuples (data). Each topology is a graph consisting of spouts (which produce tuples) and bolts (which transform tuples). Storm takes care of cluster communication, fail-over and distributing topologies across cluster nodes.

Akka is a toolkit for building distributed, concurrent, fault-tolerant applications. In an Akka application, the basic construct is an actor; actors process messages asynchronously, and each actor instance is guaranteed to be run using at most one thread at a time, making concurrency much easier. Actors can also be deployed remotely. There's a clustering module coming, which will handle automatic fail-over and distribution of actors across cluster nodes.

Both systems scale very well and can handle large amounts of data. But when to use one, and when the other? There's another good blog post on the subject, but I wanted to take the comparison a bit further: let's see how elementary constructs in Storm compare to elementary constructs in Akka.

COMPARING THE BASICS

Firstly, the basic unit of data in Storm is a tuple. A tuple can have any number of elements, and each tuple element can be any object, as long as there's a serializer for it. In Akka, the basic unit is a message, which can be any object, but it should be serializable as well (for sending to remote actors). So here the concepts are almost equivalent.

Let's take a look at the basic unit of computation. In Storm, we have components: bolts and spouts. A bolt can be any piece of code, which does arbitrary processing on the incoming tuples. It can also store some mutable data, e.g. to accumulate results. Moreover, bolts run in a single thread, so unless you start additional threads in your bolts, you don't have to worry about concurrent access to the bolt's data. This is very similar to an actor, isn't it? Hence a Storm bolt/spout corresponds to an Akka actor.

How do these two compare in detail? Actors can receive arbitrary messages; bolts can receive arbitrary tuples. Both are expected to do some processing based on the data received. Both have internal state, which is private and protected from concurrent thread access.

ACTORS & BOLTS: DIFFERENCES

One crucial difference is how actors and bolts communicate. An actor can send a message to any other actor, as long as it has the ActorRef (and if not, an actor can be looked up by name). It can also send back a reply to the sender of the message that is being handled. Storm, on the other hand, is one-way: you cannot send back messages, and you can't send messages to arbitrary bolts. You can only send a tuple to a named channel (stream), which will cause the tuple (message) to be delivered to all listeners defined in the topology. (Bolts also ack messages to the ackers, which is another form of communication.)

In Storm, multiple copies of a bolt's/spout's code can be run in parallel (depending on the parallelism setting). This corresponds to a set of (potentially remote) actors with a load-balancer actor in front of them, a concept well known from Akka's routing. There are a couple of choices for how tuples are routed to bolt instances in Storm (random, consistent hashing on a field), and this roughly corresponds to the various router options in Akka (round robin, consistent hashing on the message).

There's also a difference in the "weight" of a bolt and an actor. In Akka, it is normal to have lots of actors (up to millions). In Storm, the expected number of bolts is significantly smaller; this isn't in any way a downside of Storm, but rather a design decision. Also, Akka actors typically share threads, while each bolt instance tends to have a dedicated thread.

OTHER FEATURES

Storm also has one crucial feature which isn't implemented in Akka out of the box: guaranteed message delivery. Storm tracks the whole tree of tuples that originate from any tuple produced by a spout. If all tuples aren't acknowledged, the original tuple is replayed.

Also, the cluster management in Storm is more advanced (automatic fail-over, automatic balancing of workers across the cluster, based on Zookeeper); however, the upcoming Akka clustering module should address that.

Finally, the layout of the communication in Storm (the topology) is static and defined upfront. In Akka, the communication patterns can change over time and can be totally dynamic: actors can send messages to any other actors, and can even send addresses (ActorRefs).

So overall, Storm implements a specific range of use cases very well, while Akka is more of a general-purpose toolkit. It would be possible to build a Storm-like system on top of Akka, but not the other way round (or at least it would be very hard).
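The property the whole comparison hinges on, that all messages for one actor (or tuples for one bolt instance) are processed by a single thread, so its private state needs no locks, can be shown in miniature. This is a sketch in plain Java, not Akka's API: the "mailbox" is just a single-threaded executor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// A bare-bones "actor": messages are closures queued onto a single-threaded
// executor, so the private state below is only ever touched by one thread.
// Illustrative only; a real Akka actor (or Storm bolt) is far richer.
class MiniActor {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private int state = 0;   // private state: no locks needed

    // "Send a message": enqueue work; callers never touch `state` directly.
    void tell(int delta) { mailbox.execute(() -> state += delta); }

    // Read the state by asking on the mailbox thread (a crude ask pattern).
    int ask() throws Exception {
        return mailbox.submit(() -> state).get();
    }

    void stop() { mailbox.shutdown(); }

    public static void main(String[] args) throws Exception {
        MiniActor counter = new MiniActor();
        for (int i = 0; i < 1000; i++) counter.tell(1);
        System.out.println(counter.ask());   // prints 1000
        counter.stop();
    }
}
```

Because the executor processes messages one at a time and in order, `state += delta` is safe even if `tell` is called from many threads, which is exactly the guarantee actors and bolts share.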
June 26, 2013
· 20,668 Views
Generational Caching and Envers
Konrad recently shared in our company's technical room an interesting article on how caching is done in a big Polish social network, nk.pl. One of the central concepts in the algorithm is generational caching (see here or here).

The basic idea is that for cache keys you use some entity-specific string + a version number. The version number increases whenever the data changes, thus invalidating any old cache entries and preventing stale data reads. This assumes that the cache has some garbage collection; e.g. it may simply be an LRU cache.

Of course, on each request we must know the version number; that's why it must be stored in a global cache (though depending on our consistency requirements, it may also be distributed across the cluster asynchronously). The data itself, however, can be stored in local caches. So if our system is read-mostly, the only "expensive" operation that we have to do per request is retrieving the version numbers of the entities we are interested in, and this is usually very simple information, which can be kept entirely in memory.

Depending on the type of data and the usage patterns, you can cache individual entities (e.g. for a Person entity, the cache key could be person-9128-123, 9128 being the id, 123 the version number), or the whole lot (e.g. for a Countries entity, the cache key could be countries-8, 8 being the version number). Moreover, in the global cache you can keep the latest version number per-id or per-entity, meaning that when the version changes, you invalidate either a specific entity or all of them.

Having written most of Envers, it quite naturally occurred to me that you may use the entity revision numbers as the cache versions. Subsequent Envers revisions are monotonically increasing numbers; for each transaction you get the next one. So whenever a cached entity changes, you would have to populate the global cache with the latest revision number.

Envers provides several ways to get the revision numbers. During a transaction, you can call the AuditReader.getCurrentRevision() method, which will give you the revision metadata, including the revision number. If you want more fine-grained control, you may implement your own listener (EntityTrackingRevisionListener, see the docs), get notified whenever an entity changes, and update the global cache there. You can also register an after-transaction-completed callback, and update the cache outside of the transaction boundaries. Or, if you know the entity ids, you may look up the maximum revision number using either AuditReader.getRevisions or an AuditQueryCreator.

As you can obtain the current revision number during a transaction, you may even update the version/revision in the global cache atomically, if you use a transactional cache such as Infinispan. All of that, of course, in addition to auditing, which is still the main purpose of Envers :)
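The versioned-key scheme above fits in a few lines of code. This is a minimal sketch with made-up names (plain maps standing in for the "global" version cache and the local LRU data cache), not the Envers API:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Generational caching in miniature: cache keys embed a version number that
// a writer bumps on every change, so stale entries are never read again and
// simply age out of an LRU-style cache.
class GenerationalCache {
    private final Map<Long, Long> versions = new HashMap<>();        // the "global" cache
    private final Map<String, String> data = new LinkedHashMap<>();  // a local data cache

    // e.g. person-9128-123: entity-specific string + id + current version
    private String key(long id) {
        return "person-" + id + "-" + versions.getOrDefault(id, 0L);
    }

    String get(long id) { return data.get(key(id)); }       // null = cache miss

    void put(long id, String value) { data.put(key(id), value); }

    // On update (e.g. after a transaction commits), bump the version:
    // old keys instantly become unreachable, i.e. invalidated.
    void invalidate(long id) { versions.merge(id, 1L, Long::sum); }

    public static void main(String[] args) {
        GenerationalCache c = new GenerationalCache();
        c.put(9128, "Adam, v1");
        System.out.println(c.get(9128));  // hit under key person-9128-0
        c.invalidate(9128);               // data changed: version bump
        System.out.println(c.get(9128));  // miss: person-9128-1 not cached yet
    }
}
```

With Envers, the `invalidate` step would amount to writing the transaction's revision number into the global cache instead of incrementing a counter; the key-construction idea is the same.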
July 24, 2012
· 4,278 Views

Comments

FireBug Like Tool Inbuilt In MAC Safari.

Jan 28, 2014 · kumar app

Yes, I saw these bindings as well, however I couldn't find any version that would be deployed to any maven repository.

Also, while an API closely matching the C++ one may be good, it may also be a bit weird to use from Java :) However I think JavaCV follows the C++ version quite closely also.

JSON to Java with JDK6

Dec 27, 2011 · Julien Viet

It would be interesting to see a performance comparison vs. Java JSON parsing solutions :)
Compiling Your Ruby App with RubyScript2Exe

Jun 30, 2011 · Gerd Storm

Well, compilation doesn't hide anything apart from variable and parameter names; you can easily decompile everything ;) And for a larger project, compilation takes much more than a couple of seconds... But I'm with you of course on the static types front :) Adam
Compiling Your Ruby App with RubyScript2Exe

Jun 29, 2011 · Gerd Storm

But JAX, JMS, JTA, JPA can also be used in e.g. a JRuby app deployed on an app server (see Torquebox for example). It doesn't require compilation to work. As for Play, I think it uses similar hacks as JRebel ;) Adam
Compiling Your Ruby App with RubyScript2Exe

Jun 28, 2011 · Gerd Storm

True, tools like Eclipse/NetBeans or JRebel, which I was writing about, make writing JSF or other Java/Scala frameworks much more feasible. However, the amount of tooling needed for that (and that's tooling with quite important limitations, e.g. as you write: hot swap works as long as you don't add/remove methods or fields) indicates to me that something is wrong. I'd like a statically typed language with the ease of use of a dynamic language :) Adam
Dependency injection discourages object-oriented programming?

Oct 27, 2010 · Pawel Wrzeszcz

Follow up, part one: http://www.warski.org/blog/?p=289
Dependency injection discourages object-oriented programming?

Oct 15, 2010 · Pawel Wrzeszcz

Could be; in the comments to the blog I posted some links about DI in Scala. I think it may also be an interesting read from the point of view of an OO/functional language (although whether Scala is functional or OO is also controversial for some ;) )
Dependency injection discourages object-oriented programming?

Oct 14, 2010 · Pawel Wrzeszcz

Sure, that's what I meant when I wrote: "I’m not saying that achieving the above is not possible with a DI framework; only that DI encourages the ProductService approach: it’s just easier to code it procedurally, instead of e.g. creating a ProductShipper factory, passing all the needed dependencies to it and so on." I suspect that at first, you would write a ProductService. At least that's what I often find in projects, which use DI.
Dependency injection discourages object-oriented programming?

Oct 14, 2010 · Pawel Wrzeszcz

I read up about AbstractFactories, but I can't see how they would help me with getting dependencies in non-container managed objects? Or to manage all objects with a container, but somehow instantiate them with custom data? Esp how do they differ from normal Factories usage, in the scopes of the problems I wrote about in the blog?
Dependency injection discourages object-oriented programming?

Oct 14, 2010 · Pawel Wrzeszcz

Sure, using a constructor instead of a method argument by itself isn't more or less OO :). But I think that encapsulating the data (Product) inside an object is more OO than passing around the product to ship as an argument. It's not only new ProductShipper(product).ship() vs productShipperService.ship(product), but also the fact that you can, for example, pass the ProductShipper object to another place. And that's better and more OO in my opinion than passing the product to ship between many service method invocations. Of course, whether that is better depends much on what exactly the product and the shipping are ;). But I was just hoping to illustrate my thinking. I'll read up on the Abstract Factories, thank you very much for the pointer.
Dependency injection discourages object-oriented programming?

Oct 14, 2010 · Pawel Wrzeszcz

Ah yes, typed it in the wrong place, sorry. Well then how would you solve the problems I described in the blog entry nicely? Again, please note: I'm not saying that solving them is not possible. Just that you often need several steps, which can indicate that the framework is missing something.
Dependency injection discourages object-oriented programming?

Oct 14, 2010 · Pawel Wrzeszcz

Well maybe inaccurate in the statement that you can have only one managed instance - but that is not the main point. The fact that you cannot encapsulate objects and still have DI still holds in my opinion.
Fluent Navigation in JSF 2

Mar 10, 2010 · Dan Allen

Moreover, any parameters included in seem to be stripped, which leaves me with a good question of how to redirect after a POST to a page which has the parameters :).
Fluent Navigation in JSF 2

Mar 10, 2010 · Dan Allen

Also, the ?faces-redirect=true parameter doesn't seem to have any effect.
Fluent Navigation in JSF 2

Mar 10, 2010 · Dan Allen

Unless I'm doing something wrong, the element can't have any children or attributes, and you mention two:

  • view-param
  • include-view-params

Are you referring to some future version of JSF, or were these two removed from the final spec? Is there a way to include the view parameters after a POST?

Adam

