DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

The Latest Data Topics

article thumbnail
Modeling Data in Neo4j: Bidirectional Relationships
transitioning from the relational world to the beautiful world of graphs requires a shift in thinking about data. although graphs are often much more intuitive than tables, there are certain mistakes people tend to make when modelling their data as a graph for the first time. in this article, we look at one common source of confusion: bidirectional relationships. directed relationships relationships in neo4j must have a type, giving the relationship a semantic meaning, and a direction. frequently, the direction becomes part of the relationship's meaning. in other words, the relationship would be ambiguous without it. for example, the following graph shows that the czech republic defeated sweden in ice hockey. had the direction of the relationship been reversed, the swedes would be much happier. with no direction at all, the relationship would be ambiguous, since it would not be clear who the winner was. note that the existence of this relationship implies a relationship of a different type going in the opposite direction, as the next graph illustrates. this is often the case. to give another example, the fact that pulp fiction was directed_by quentin tarantino implies that quentin tarantino is_director_of pulp fiction. you could come up with a huge number of such relationship pairs. one common mistake people often make when modelling their domain in neo4j is creating both types of relationships. since one relationship implies the other, this is wasteful, both in terms of space and traversal time. neo4j can traverse relationships in both directions. more importantly, thanks to the way neo4j organizes its data, the speed of traversal does not depend on the direction of the relationships being traversed. bidirectional relationships some relationships, on the other hand, are naturally bidirectional. a classic example is facebook or real-life friendship. this relationship is mutual - when someone is your friend, you are (hopefully) his friend, too. depending on how we look at the model, we could also say such relationship is undirected. graphaware and neo technology are partner companies. since this is a mutual relationship, we could model it as bidirectional or undirected relationship, respectively. but since none of this is directly possible in neo4j, beginners often resort to the following model, which suffers from the exact same problem as the incorrect ice hockey model: an extra unnecessary relationship. neo4j apis allow developers to completely ignore relationship direction when querying the graph, if they so desire. for example, in neo4j's own query language, cypher, the key part of a query finding all partner companies of neo technology would look something like match (neo)-[:partner]-(partner) the result would be the same as executing and merging the results of the following two different queries: match (neo)-[:partner]->(partner) and match (neo)<-[:partner]-(partner) therefore, the correct (or at least most efficient) way of modelling the partner relationships is using a single partner relationship with an arbitrary direction . conclusion relationships in neo4j can be traversed in both directions with the same speed. moreover, direction can be completely ignored. therefore, there is no need to create two different relationships between nodes, if one implies the other.
November 6, 2013
by Michal Bachman
· 28,367 Views · 2 Likes
article thumbnail
LINQ - AddRange Method in C#
In this post I’m going to explain you about Linq AddRange method. This method is quite useful when you want to add multiple elements to a end of list. Following is a method signature for this. public void AddRange( IEnumerable collection ) To Understand let’s take simple example like following. using System; using System.Collections.Generic; namespace Linq { class Program { static void Main(string[] args) { List names=new List {"Jalpesh"}; string[] newnames=new string[]{"Vishal","Tushar","Vikas","Himanshu"}; foreach (var newname in newnames) { names.Add(newname); } foreach (var n in names) { Console.WriteLine(n); } } } } Here in the above code I am adding content of array to a already created list via foreach loop. You can use AddRange method instead of for loop like following.It will same output as above. using System; using System.Collections.Generic; namespace Linq { class Program { static void Main(string[] args) { List names=new List {"Jalpesh"}; string[] newnames=new string[]{"Vishal","Tushar","Vikas","Himanshu"}; names.AddRange(newnames); foreach (var n in names) { Console.WriteLine(n); } } } } Now when you run that example output is like following. Add Range in more complex scenario: You can also use add range to more complex scenarios also like following.You can use other operator with add range as following. using System; using System.Collections.Generic; using System.Linq; namespace Linq { class Program { static void Main(string[] args) { List names=new List {"Jalpesh"}; string[] newnames=new string[]{"Vishal","Tushar","Vikas","Himanshu"}; names.AddRange(newnames.Where(nn=>nn.StartsWith("Vi"))); foreach (var n in names) { Console.WriteLine(n); } } } } Here in the above code I have created array with string and filter it with where operator while adding it to an existing list. Following is output as expected. That’s it. Hope you like it. Stay tuned for more..
November 3, 2013
by Jalpesh Vadgama
· 62,419 Views · 2 Likes
article thumbnail
EasyNetQ: Publisher Confirms
Publisher confirms are a RabbitMQ addition to AMQP to guarantee message delivery. You can read all about them here and here. In short they provide a asynchronous confirmation that a publish has successfully reached all the queues that it was routed to. To turn on publisher confirms with EasyNetQ set the publisherConfirms connection string parameter like this: var bus = RabbitHutch.CreateBus("host=localhost;publisherConfirms=true"); When you set this flag, EasyNetQ will wait for the confirmation, or a timeout, before returning from the Publish method: bus.Publish(new MyMessage { Text = "Hello World!" }); // here the publish has been confirmed. Nice and easy. There’s a problem though. If I run the above code in a while loop without publisher confirms, I can publish around 4000 messages per second, but with publisher confirms switched on that drops to around 140 per second. Not so good. With EasyNetQ 0.15 we introduced a new PublishAsync method that returns a Task. The Task completes when the publish is confirmed: bus.PublishAsync(message).ContinueWith(task => { if (task.IsCompleted) { Console.WriteLine("Publish completed fine."); } if (task.IsFaulted) { Console.WriteLine(task.Exception); } }); Using this code in a while loop gets us back to 4000 messages per second with publisher confirms on. Happy confirms!
November 1, 2013
by Mike Hadlow
· 8,792 Views
article thumbnail
Adding Appsec to Agile: Security Stories, Evil User Stories and Abuse(r) Stories
Because Agile development teams work from a backlog of stories, one way to inject application security into software development is by writing up application security risks and activities as stories, making them explicit and adding them to the backlog so that application security work can be managed, estimated, prioritized and done like everything else that the team has to do. Security Stories SAFECode has tried to do this by writing a set of common, non-functional Security Stories following the well-known “As a [type of user] I want {something} so that {reason}” template. These stories are not customer- or user-focused: not the kind that a Product Owner would understand or care about. Instead, they are meant for the development team (architects, developers and testers). Example: As a(n) architect/developer, I want to ensure AND as QA, I want to verify that sensitive data is kept restricted to actors authorized to access it. There are stories to prevent/check for the common security vulnerabilities in applications: XSS, path traversal, remote execution, CSRF, OS command injection, SQL injection, password brute forcing. Checks for information exposure through error messages, proper use of encryption, authentication and session management, transport layer security, restricted uploads and URL redirection to un-trusted sites; and basic code quality issues: NULL pointer checking, boundary checking, numeric conversion, initialization, thread/process synchronization, exception handling, use of unsafe/restricted functions. SAFECode also includes a list of secure development practices (operational tasks) for the team that includes making sure that you’re using the latest compiler, patching the run-time and libraries, static analysis, vulnerability scanning, code reviews of high-risk code, tracking and fixing security bugs; and more advanced practices that require help from security experts like fuzzing, threat modeling, pen tests, environmental hardening. Altogether this is a good list of problems that need to be watched out for and things that should be done on most projects. But although SAFECode’s stories look like stories, they can’t be used as stories by the team. These Security Stories are non-functional requirements (NFRs) and technical constraints that (like requirements for scalability and maintainability and supportability) need to be considered in the design of the system, and may need to be included as part of the definition of done and conditions of acceptance for every user story that the team works on. Security Stories can’t be pulled from the backlog and delivered like other stories and removed from the backlog when they are done, because they are never “done”. The team has to keep worrying about them throughout the life of the project and of the system. As Rohit Sethi points out, asking developers to juggle long lists of technical constraints like this is not practical: If you start adding in other NFR constraints, such as accessibility, the list of constraints can quickly grow overwhelming to developers. Once the list grows unwieldy, our experience is that developers tend to ignore the list entirely. They instead rely on their own memories to apply NFR constraints. Since the number of NFRs continues to grow in increasingly specialized domains such as application security, the cognitive burden on developers’ memories is substantial. OWASP Evil User Stories – Hacking the Backlog Someone at OWASP has suggested an alternative, much smaller set of non-functional Evil User Stories that can be "hacked" into the backlog: A way for a security guy to get security on the agenda of the development team is by “hacking the backlog”. The way to do this is by crafting Evil User Stories, a few general negative cases that the team needs to consider when they implement other stories. Example #1. "As a hacker, I can send bad data in URLs, so I can access data and functions for which I'm not authorized." Example #2. "As a hacker, I can send bad data in the content of requests, so I can access data and functions for which I'm not authorized." Example #3. "As a hacker, I can send bad data in HTTP headers, so I can access data and functions for which I'm not authorized." Example #4. "As a hacker, I can read and even modify all data that is input and output by your application." Thinking like a Bad Guy – Abuse Cases and Abuser Stories Another way to beef up security in software development is to get the team to carefully look at the system they are building from the bad guy's perspective. In “Misuse and Abuse Cases: Getting Past the Positive”, Dr. Gary McGraw at Cigital talks about the importance of anticipating things going wrong, and thinking about behaviour that the system needs to prevent. Assume that the customer/user is not going to behave, or is actively out to attack the application. Question all of the assumptions in the design (the can’ts and won’ts), especially trust conditions – what if the bad guy can be anywhere along the path of an action (for example, using an attack proxy between the client and the server)? Abuse Cases are created by security experts working with the team as part of a critical review – either of the design or of an existing application. The goal of a review like this is to understand how the system behaves under attack/failure conditions, and document any weaknesses or gaps that need to be addressed. At Agile 2013 Judy Neher presented a hands-on workshop on how to write Abuser Stories, a lighter-weight, Agile practice which makes “thinking like a bad guy” part of the team’s job of defining and refining user requirements. Take a story, and as part of elaborating the story and listing the scenarios, step back and look at the story through a security lens. Don’t just think of what the user wants to do and can do - think about what they don’t want to do and can’t do. Get the same people who are working on the story to “put their black hats on” and think evil for a little while, brainstorm to come up with negative cases. As {some kind of bad guy} I want to {do some bad thing}… The {bad guy} doesn’t have to be a hacker. They could be an insider with a grudge or a selfish customer who is willing to take advantage of other users, or an admin user who needs to be protected from making expensive mistakes, or an external system that may not always function correctly. Ask questions like: How do I know who the user is and that I can trust them? Who is allowed to do what, and where are the authorization checks applied? Look for holes in multi-step workflows – what happens if somebody bypasses a check or tries to skip a step or do something out of sequence? What happens if an action or a check times-out or blocks or fails – what access should be allowed, what kind of information should be shown, what kind shouldn’t be? Are we interacting with children? Are we dealing with money? With dangerous command-and-control/admin functions? With confidential or pirvate data? Look closer at the data. Where is it coming from? Can I trust it? Is the source authenticated? Where is it validated – do I have to check it myself? Where is it stored (does it have to be stored)? If it has to be stored, should it be encrypted or masked (including in log files)? Who should be able to see it? Who shouldn’t be able to see it? Who can change it, and to the changes need to be audited? Do we need to make sure the data hasn't been tampered with (checksum, HMAC, digital signature)? Use this exercise to come up with refutation criteria (user can do this, but can’t do that; they can see this but they can’t see that), instead of, or as part of the conditions of acceptance for the story. Prioritize these cases based on risk, add the cases that you agree need to be taken care of as scenarios to the current story, or as new stories to the backlog if they are big enough. “Thinking like a bad guy” as you are working on a story seems more useful and practical than other story-based approaches. It doesn’t take a lot of time, and it’s not expensive. You don’t need to write Abuser Stories for every user Story and the more Abuser Stories that you do, the easier it will get – you'll get better at it, and you’ll keep running into the same kinds of problems that can be solved with the same patterns. You end up with something concrete and functional and actionable, work that has to be done and can be tested. Concrete, actionable cases like this are easier for the team to understand and appreciate – including the Product Owner, which is critical in Scrum, because the Product Owner decides what is important and what gets done. And because Abuser Stories are done in phase, by the people who are working on the stories already (rather than a separate activity that needs to be setup and scheduled) they are more likely to get done. Simple, quick, informal threat modeling like this isn’t enough to make a system secure – the team won’t be able to find and plug all of the security holes in the system this way, even if the developers are well-trained in secure software development and take their work seriously. Abuser Stories are good for identifying business logic vulnerabilities, reviewing security features (authentication, access control, auditing, password management, licensing) improving error handling and basic validation, and keeping onside of privacy regulations. Effective software security involves a lot more work than this: choosing a good framework and using it properly, watching out for changes to the system's attack surface, carefully reviewing high-risk code for design and coding errors, writing good defensive code as much as possible, using static analysis to catch common coding mistakes, and regular security testing (pen testing and dynamic analysis). But getting developers and testers to think like a bad guy as they build a system should go a long way to improving the security and robustness of your app.
October 31, 2013
by Jim Bird
· 21,824 Views · 1 Like
article thumbnail
How to Use MongoDB as a Pure In-memory DB (Redis Style)
The Idea There has been a growing interest in using MongoDB as an in-memory database, meaning that the data is not stored on disk at all. This can be super useful for applications like: a write-heavy cache in front of a slower RDBMS system embedded systems PCI compliant systems where no data should be persisted unit testing where the database should be light and easily cleaned That would be really neat indeed if it was possible: one could leverage the advanced querying / indexing capabilities of MongoDB without hitting the disk. As you probably know the disk IO (especially random) is the system bottleneck in 99% of cases, and if you are writing data you cannot avoid hitting the disk. One sweet design choice of MongoDB is that it uses memory-mapped files to handle access to data files on disk. This means that MongoDB does not know the difference between RAM and disk, it just accesses bytes at offsets in giant arrays representing files and the OS takes care of the rest! It is this design decision that allows MongoDB to run in RAM with no modification. How it is done This is all achieved by using a special type of filesystem called tmpfs. Linux will make it appear as a regular FS but it is entirely located in RAM (unless it is larger than RAM in which case it can swap, which can be useful!). I have 32GB RAM on this server, let’s create a 16GB tmpfs: # mkdir /ramdata # mount -t tmpfs -o size=16000M tmpfs /ramdata/ # df Filesystem 1K-blocks Used Available Use% Mounted on /dev/xvde1 5905712 4973924 871792 86% / none 15344936 0 15344936 0% /dev/shm tmpfs 16384000 0 16384000 0% /ramdata Now let’s start MongoDB with the appropriate settings. smallfiles and noprealloc should be used to reduce the amount of RAM wasted, and will not affect performance since it’s all RAM based. nojournal should be used since it does not make sense to have a journal in this context! dbpath=/ramdata nojournal = true smallFiles = true noprealloc = true After starting MongoDB, you will find that it works just fine and the files are as expected in the FS: # mongo MongoDB shell version: 2.3.2 connecting to: test > db.test.insert({a:1}) > db.test.find() { "_id" : ObjectId("51802115eafa5d80b5d2c145"), "a" : 1 } # ls -l /ramdata/ total 65684 -rw-------. 1 root root 16777216 Apr 30 15:52 local.0 -rw-------. 1 root root 16777216 Apr 30 15:52 local.ns -rwxr-xr-x. 1 root root 5 Apr 30 15:52 mongod.lock -rw-------. 1 root root 16777216 Apr 30 15:52 test.0 -rw-------. 1 root root 16777216 Apr 30 15:52 test.ns drwxr-xr-x. 2 root root 40 Apr 30 15:52 _tmp Now let’s add some data and make sure it behaves properly. We will create a 1KB document and add 4 million of them: > str = "" > aaa = "aaaaaaaaaa" aaaaaaaaaa > for (var i = 0; i < 100; ++i) { str += aaa; } > for (var i = 0; i < 4000000; ++i) { db.foo.insert({a: Math.random(), s: str});} > db.foo.stats() { "ns" : "test.foo", "count" : 4000000, "size" : 4544000160, "avgObjSize" : 1136.00004, "storageSize" : 5030768544, "numExtents" : 26, "nindexes" : 1, "lastExtentSize" : 536600560, "paddingFactor" : 1, "systemFlags" : 1, "userFlags" : 0, "totalIndexSize" : 129794000, "indexSizes" : { "_id_" : 129794000 }, "ok" : 1 } The document average size is 1136 bytes and it takes up about 5GB of storage. The index on _id takes about 130MB. Now we need to verify something very important: is the data duplicated in RAM, existing both within MongoDB and the filesystem? Remember that MongoDB does not buffer any data within its own process, instead data is cached in the FS cache. Let’s drop the FS cache and see what is in RAM: # echo 3 > /proc/sys/vm/drop_caches # free total used free shared buffers cached Mem: 30689876 6292780 24397096 0 1044 5817368 -/+ buffers/cache: 474368 30215508 Swap: 0 0 0 As you can see there is 6.3GB of used RAM of which 5.8GB is in FS cache (buffers). Why is there still 5.8GB of FS cache even after all caches were dropped?? The reason is that Linux is smart and it does not duplicate the pages between tmpfs and its cache… Bingo! That means your data exists with a single copy in RAM. Let’s access all documents and verify RAM usage is unchanged: > db.foo.find().itcount() 4000000 # free total used free shared buffers cached Mem: 30689876 6327988 24361888 0 1324 5818012 -/+ buffers/cache: 508652 30181224 Swap: 0 0 0 # ls -l /ramdata/ total 5808780 -rw-------. 1 root root 16777216 Apr 30 15:52 local.0 -rw-------. 1 root root 16777216 Apr 30 15:52 local.ns -rwxr-xr-x. 1 root root 5 Apr 30 15:52 mongod.lock -rw-------. 1 root root 16777216 Apr 30 16:00 test.0 -rw-------. 1 root root 33554432 Apr 30 16:00 test.1 -rw-------. 1 root root 536608768 Apr 30 16:02 test.10 -rw-------. 1 root root 536608768 Apr 30 16:03 test.11 -rw-------. 1 root root 536608768 Apr 30 16:03 test.12 -rw-------. 1 root root 536608768 Apr 30 16:04 test.13 -rw-------. 1 root root 536608768 Apr 30 16:04 test.14 -rw-------. 1 root root 67108864 Apr 30 16:00 test.2 -rw-------. 1 root root 134217728 Apr 30 16:00 test.3 -rw-------. 1 root root 268435456 Apr 30 16:00 test.4 -rw-------. 1 root root 536608768 Apr 30 16:01 test.5 -rw-------. 1 root root 536608768 Apr 30 16:01 test.6 -rw-------. 1 root root 536608768 Apr 30 16:04 test.7 -rw-------. 1 root root 536608768 Apr 30 16:03 test.8 -rw-------. 1 root root 536608768 Apr 30 16:02 test.9 -rw-------. 1 root root 16777216 Apr 30 15:52 test.ns drwxr-xr-x. 2 root root 40 Apr 30 16:04 _tmp # df Filesystem 1K-blocks Used Available Use% Mounted on /dev/xvde1 5905712 4973960 871756 86% / none 15344936 0 15344936 0% /dev/shm tmpfs 16384000 5808780 10575220 36% /ramdata And that verifies it! :) What about replication? You probably want to use replication since a server loses its RAM data upon reboot! Using a standard replica set you will get automatic failover and more read capacity. If a server is rebooted MongoDB will automatically rebuild its data by pulling it from another server in the same replica set (resync). This should be fast enough even in cases with a lot of data and indices since all operations are RAM only :) It is important to remember that write operations get written to a special collection called oplog which resides in the local database and takes 5% of the volume by default. In my case the oplog would take 5% of 16GB which is 800MB. In doubt, it is safer to choose a fixed oplog size using the oplogSize option. If a secondary server is down for a longer time than the oplog contains, it will have to be resynced. To set it to 1GB, use: oplogSize = 1000 What about sharding? Now that you have all the querying capabilities of MongoDB, what if you want to implement a large service with it? Well you can use sharding freely to implement a large scalable in-memory store. Still the config servers (that contain the chunk distribution) should be disk based since their activity is small and rebuilding a cluster from scratch is not fun. What to watch for RAM is a scarce resource, and in this case you definitely want the entire data set to fit in RAM. Even though tmpfs can resort to swapping the performance would drop dramatically. To make best use of the RAM you should consider: usePowerOf2Sizes option to normalize the storage buckets run a compact command or resync the node periodically. use a schema design that is fairly normalized (avoid large document growth) Conclusion Sweet, you can now use MongoDB and all its features as an in-memory RAM-only store! Its performance should be pretty impressive: during the test with a single thread / core I was achieving 20k writes per second, and it should scale linearly over the number of cores.
October 28, 2013
by Antoine Girbal
· 61,035 Views
article thumbnail
HashMap – Single Key and Multiple Values Example
Sometimes you want to store multiple values for the same hash key. The following code examples show you three different ways to do this.
October 26, 2013
by Jagadeesh Motamarri
· 799,629 Views · 9 Likes
article thumbnail
Extracting File Metadata with C# and the .NET Framework
How to extract extended image metadata using C# and the Windows API Code Pack, simplifying access to detailed file properties typically seen in Windows Explorer.
October 26, 2013
by Rob Sanders
· 39,936 Views · 2 Likes
article thumbnail
Too Many Parameters in Java Methods, Part 7: Mutable State
In this seventh post of my series on addressing the issue of too many parameters in a Java method or constructor, I look at using state to reduce the need to pass parameters. One of the reasons I have waited until the 7th post of this series to address this is that it is one of my least favorite approaches for reducing parameters passed to methods and constructors. That stated, there are multiple flavors of this approach and I definitely prefer some flavors over others. Perhaps the best known and most widely scorned approach in all of software development for using state to reduce parameter methods is the use global variables. Although it may be semantically accurate to say thatJava does not have global variables, the reality is that for good or for bad the equivalent of global variables is achieved in Java via public static constructs. A particularly popular way to achieve this in Java is via the Stateful Singleton. In Patterns of Enterprise Application Architecture, Martin Fowler wrote that "any global data is always guilty until proven innocent." Global variables and "global-like" constructs in Java are considered bad form for several reasons. They can make it difficult for developers maintaining and reading code to know where the values are defined or last changed or even come from. By their very nature and intent, global data violates the principles of encapsulation and data hiding. Miško Hevery has written the following regarding the problems of static globals in an object-oriented language: Accessing global state statically doesn’t clarify those shared dependencies to readers of the constructors and methods that use the Global State. Global State and Singletons make APIs lie about their true dependencies. ... The root problem with global state is that it is globally accessible. In an ideal world, an object should be able to interact only with other objects which were directly passed into it (through a constructor, or method call). Having state available globally reduces the need for parameters because there is no need for one object to pass data to another object if both objects already have direct access to that data. However, as Hevery put it, that's completely orthogonal to the intent of object-oriented design. Mutable state is also an increasing problem as concurrent applications become more common. In his JavaOne 2012 presentation on Scala, Scala creator Martin Odersky stated that "every piece of mutable state you have is a liability" in a highly concurrent world and added that the problem is "non-determinism caused by concurrent threads accessing shared mutable state." Although there are reasons to avoid mutable state, it still remains a generally popular approach in software development. I think there are several reasons for this including that it's superfically easy to write mutable state sharing code and mutable shared code does provide ease of access. Some types of mutable data are popular because those types of mutable data have been taught and learned as effective for years. Finally, three are times when mutable state may be the most appropriate solution. For that last reason and to be complete, I now look at how the use of mutable state can reduce the number of parameters a method must expect. Stateful Singleton and Static Variables A Java implementation of Singleton and other public Java static fields are generally available to any Java code within the same Java Virtual Machine (JVM) and loaded with the same classloader [for more details, see When is a Singleton not a Singleton?]. Any data stored universally (at least from JVM/classloader perspective) is already available to client code in the same JVM and loaded with the same class loader. Because of this, there is no need to pass that data between clients and methods or constructors in that same JVM/classloader combination. Instance State While "statics" are considered "globally available," narrower instance-level state can also be used in a similar fashion to reduce the need to pass parameters between methods of the same class. An advantage of this over global variables is that the accessibility is limited to instances of the class (private fields) or instances of the class's children (private fields). Of course, if the fields are public, accessibility is pretty wide open, but the same data is not automatically available to other code in the same JVM/classloader. The next code listing demonstrates how state data can and sometimes is used to reduce the need for parameters between two methods internal to a given class. Example of Instance State Used to Avoid Passing Parameters /** * Simple example of using instance variable so that there is no need to * pass parameters to other methods defined in the same class. */ public void doSomethingGoodWithInstanceVariables() { this.person = Person.createInstanceWithNameAndAddressOnly( new FullName.FullNameBuilder(new Name("Flintstone"), new Name("Fred")).createFullName(), new Address.AddressBuilder(new City("Bedrock"), State.UN).createAddress()); printPerson(); } /** * Prints instance of Person without requiring it to be passed in because it * is an instance variable. */ public void printPerson() { out.println(this.person); } The above example is somewhat contrived and simplified, but does illustrate the point: the instance variableperson can be accessed by other instance methods defined in the same class, so that instance does not need to be passed between those instance methods. This does reduce the signature of potentially (public accessibility means it may be used by external methods) internal methods, but also introduces state and now means that the invoked method impacts the state of that same object. In other words, the benefit of not having to pass the parameter comes at the cost of another piece of mutable state. The other side of the trade-off, needing to pass the instance of Person because it is not an instance variable, is shown in the next code listing for comparison. Example of Passing Parameter Rather than Using Instance Variable /** * Simple example of passing a parameter rather than using an instance variable. */ public void doSomethingGoodWithoutInstanceVariables() { final Person person = Person.createInstanceWithNameAndAddressOnly( new FullName.FullNameBuilder(new Name("Flintstone"), new Name("Fred")).createFullName(), new Address.AddressBuilder(new City("Bedrock"), State.UN).createAddress()); printPerson(person); } /** * Prints instance of Person that is passed in as a parameter. * * @param person Instance of Person to be printed. */ public void printPerson(final Person person) { out.println(person); } The previous two code listings illustrate that parameter passing can be reduced by using instance state. I generally prefer to not use instance state solely to avoid parameter passing. If instance state is needed for other reasons, than the reduction of parameters to be passed is a nice side benefit, but I don't like introducing unnecessary instance state simply to remove or reduce the number of parameters. Although there was a time when the readability of reduced parameters might have justified instance state in a large single-threaded environment, I feel that the slight readability gain from reduced parameters is not worth the cost of classes that are not thread-safe in an increasingly multi-threaded world. I still don't like to pass a whole lot of parameters between methods of the same class, but I can use the parameters object (perhaps with a package-private scope class) to reduce the number of these parameters and pass that parameters object around instead of the large number of parameters. JavaBean Style Construction The JavaBeans convention/style has become extremely popular in the Java development community. Many frameworks such as Spring Framework and Hibernate rely on classes adhering to the JavaBeans conventions and some of the standards like Java Persistence API also are built around the JavaBeans conventions. There are multiple reasons for the popularity of the JavaBeans style including its ease-of-use and the ability to usereflection against this code adhering to this convention to avoid additional configuration. The general idea behind the JavaBean style is to instantiate an object with a no-argument constructor and then set its fields via single-argument "set" methods and access it fields via no-argument "get" methods. This is demonstrated in the next code listings. The first listing shows a simple example of a PersonBean class with no-arguments constructor and getter and setter methods. That code listing also includes some of the JavaBeans-style classes it uses. That code listing is followed by code using that JavaBean style class. Examples of JavaBeans Style Class public class PersonBean { private FullNameBean name; private AddressBean address; private Gender gender; private EmploymentStatus employment; private HomeownerStatus homeOwnerStatus; /** No-arguments constructor. */ public PersonBean() {} public FullNameBean getName() { return this.name; } public void setName(final FullNameBean newName) { this.name = newName; } public AddressBean getAddress() { return this.address; } public void setAddress(final AddressBean newAddress) { this.address = newAddress; } public Gender getGender() { return this.gender; } public void setGender(final Gender newGender) { this.gender = newGender; } public EmploymentStatus getEmployment() { return this.employment; } public void setEmployment(final EmploymentStatus newEmployment) { this.employment = newEmployment; } public HomeownerStatus getHomeOwnerStatus() { return this.homeOwnerStatus; } public void setHomeOwnerStatus(final HomeownerStatus newHomeOwnerStatus) { this.homeOwnerStatus = newHomeOwnerStatus; } } /** * Full name of a person in JavaBean style. * * @author Dustin */ public final class FullNameBean { private Name lastName; private Name firstName; private Name middleName; private Salutation salutation; private Suffix suffix; /** No-args constructor for JavaBean style instantiation. */ private FullNameBean() {} public Name getFirstName() { return this.firstName; } public void setFirstName(final Name newFirstName) { this.firstName = newFirstName; } public Name getLastName() { return this.lastName; } public void setLastName(final Name newLastName) { this.lastName = newLastName; } public Name getMiddleName() { return this.middleName; } public void setMiddleName(final Name newMiddleName) { this.middleName = newMiddleName; } public Salutation getSalutation() { return this.salutation; } public void setSalutation(final Salutation newSalutation) { this.salutation = newSalutation; } public Suffix getSuffix() { return this.suffix; } public void setSuffix(final Suffix newSuffix) { this.suffix = newSuffix; } @Override public String toString() { return this.salutation + " " + this.firstName + " " + this.middleName + this.lastName + ", " + this.suffix; } } package dustin.examples; /** * Representation of a United States address (JavaBeans style). * * @author Dustin */ public final class AddressBean { private StreetAddress streetAddress; private City city; private State state; /** No-arguments constructor for JavaBeans-style instantiation. */ private AddressBean() {} public StreetAddress getStreetAddress() { return this.streetAddress; } public void setStreetAddress(final StreetAddress newStreetAddress) { this.streetAddress = newStreetAddress; } public City getCity() { return this.city; } public void setCity(final City newCity) { this.city = newCity; } public State getState() { return this.state; } public void setState(final State newState) { this.state = newState; } @Override public String toString() { return this.streetAddress + ", " + this.city + ", " + this.state; } } Example of JavaBeans Style Instantiation and Population public PersonBean createPerson() { final PersonBean person = new PersonBean(); final FullNameBean personName = new FullNameBean(); personName.setFirstName(new Name("Fred")); personName.setLastName(new Name("Flintstone")); person.setName(personName); final AddressBean address = new AddressBean(); address.setStreetAddress(new StreetAddress("345 Cave Stone Road")); address.setCity(new City("Bedrock")); person.setAddress(address); return person; } The examples just shown demonstrate how the JavaBeans style approach can be used. This approach makes some concessions to reduce the need to pass a large number of parameters to a class's constructor. Instead, no parameters are passed to the constructor and each individual attribute that is needed must be set. One of the advantages of the JavaBeans style approach is that readability is enhanced as compared to a constructor with a large number of parameters because each of the "set" methods is hopefully named in a readable way. The JavaBeans approach is simple to understand and definitely achieves the goal of reducing lengthy parameters in the case of constructors. However, there are some disadvantages to this approach as well. One advantage is a lot of tedious client code for instantiating the object and setting its attributes one-at-a-time. It is easy with this approach to neglect to set a required attribute because there is no way for the compiler to enforce all required parameters be set without leaving the JavaBeans convention. Perhaps most damaging, there are several objects instantiated in this last code listing and these objects exist in different incomplete states from the time they are instantiated until the time the final "set" method is called. During that time, the objects are in what is really an "undefined" or "incomplete" state. The existence of "set" methods necessarily means that the class's attributes cannot be final, rendering the entire object highly mutable. Regarding the prevalent use of the JavaBeans pattern in Java, several credible authors have called into questionits value. Allen Holub's controversial article Why getter and setter methods are evil starts off with no holds barred: Though getter/setter methods are commonplace in Java, they are not particularly object oriented (OO). In fact, they can damage your code's maintainability. Moreover, the presence of numerous getter and setter methods is a red flag that the program isn't necessarily well designed from an OO perspective. Josh Bloch, in his less forceful and more gently persuasive tone, says of the JavaBeans getter/setter style: "The JavaBeans pattern has serious disadvantages of its own" (Effective Java, Second Edition, Item #2). It is in this context that Bloch recommends the builder pattern instead for object construction. I'm not against using the JavaBeans get/set style when the framework I've selected for other reasons requires it and the reasons for using that framework justify it. There are also areas where the JavaBeans style class is particularly well suited such as interacting with a data store and holding data from the data store for use by the application. However, I am not a fan of using the JavaBeans style for instantiating a question simply to avoid the need to pass parameters. I prefer one of the other approaches such as builder for that purpose. Benefits and Advantages I've covered different approaches to reducing the number of arguments to a method or constructor in this post, but they also share the same trade-off: exposing mutable state to reduce or eliminate the number of parameters that must be passed to a method or to a constructor. The advantages of these approaches are simplicity, generally readable (though "globals" can be difficult to read), and ease of first writing and use. Of course, their biggest advantage from this post's perspective is that they generally eliminate the need for any parameter passing. Costs and Disadvantages The trait that all approaches covered in this post share is the exposure of mutable state. This can lead to an extremely high cost if the code is used in a highly concurrent environment. There is a certain degree of unpredictability when object state is exposed for anyone to tinker with it as they like. It can be difficult to know which code made the wrong change or failed to make a necessary change (such as failing to call a "set" method when populating a newly instantiated object). Conclusion Some of the approaches covered in this post are highly popular despite their drawbacks. This may be for a variety of reasons including prevalence of use in popular frameworks (forcing users of the framework to use that style and also providing examples to others for their own code development). Other reasons for these approaches' popularity is the relative ease of initial development and the seemingly (deceptively) relatively little thought that needs to go into design with these approaches. In general, I prefer to spend a little more design and implementation effort to use builders and less mutable approaches when practical. However, there are cases where these mutable approaches work well in reducing the number of parameters passed around and introduce no more risk than was already present. My feeling is that Java developers should carefully consider use of any mutable Java classes and ensure that the mutability is either desired or is a cost that is justified by the reasons for using a mutable state approach.
October 25, 2013
by Dustin Marx
· 16,608 Views
article thumbnail
What’s the Difference Between System.String and string?
One of the questions that lot of developers ask is – Is there any difference between string andSystem.String and what should be used? Short Answer There is no difference between the two. You can use either of them in your code. Explanation System.String is a class (reference type) defined the mscorlib in the namespace System. In other words, System.String is a type in the CLR. string is a keyword in C# Before we understand the difference, let us understand BCL and FCL terms. BCL is Common Language Infrastructure (CLI) available to languages like C#, A#, Boo, Cobra, F#, IronRuby, IronPython and other CLI languages. It includes common functions such as File Read/Write or IO and database/XML interactions. BCL was first implemented in Microsoft .NET in the form of mscorlib.dll FCL is standard Microsoft .NET specific library containing reusable classes/assets like System, System.CodeDom, System.Collections, System.Diagnostics, System.Globalization, System.IO, System.Resources and System.Text Now in C#, string (keyword in BCL) directly maps to System.String (an FCL type). Similarly, intmaps directly to System.Int32. Here int is mapped to a integer type that is 32 bit. But in other language, you could probably map int (keyword in BCL) to a 64 bit integer (FCL type). So the fact that using string and System.String in C# makes no difference is well established. Is it better to still use string instead of System.String? There is no universally agreed answer to this. But, as per me, even though both string and System.String mean the same and have no difference in performance of the application, it is better to use string. This is because string is a C# language specific keyword. Also C# language specification states, As a matter of style, use of the keyword is favored over use of the complete system type name Following this practice ensures that your code consistently uses keywords wherever possible rather than having a code with BCL and FCL types used.
October 25, 2013
by Punit Ganshani
· 11,387 Views · 3 Likes
article thumbnail
Reasons to Move from DataTables to Generic Collections
These days, no community member writes or speaks about using DataTables and DataSets for data operations. But, there are a number of real projects built using them, and many developers still feel happy when they use them in their projects. Sometimes it is not easy to completely replace DataTables with typed generic lists, particularly in bulky projects. But now is the right time to move, as future developers may not even learn about DataTables :). Generic collections have a number of advantages over DataTables. One cannot imagine a day without generic collections once he/she gets to know how beneficial they are. The following is a list of the reasons to move from DataTables to collections that I could think of now: DataTable stores boxed objects, and one needs to unbox values when needed. This adds overhead on the runtime environment. However, values in generic collections are strongly typed, so no boxing involved. Unboxing happens at runtime, as does the type checking. If there is a mismatch between types of source and target, it leads to a runtime exception. This may lead to a number of issues while using DataTables. In case of collections, as the types are checked at the compile time, such type mismatches are caught during compilation. .NET languages got very nice support for creating collections, like object initializer and collection initializer. We don’t have such features for DataTables. LINQ queries can be used on both DataTables and collections. But the experience of writing the queries on generic collections is better because of IntelliSense support provided by Visual Studio. DataTables are framework specific; we often see issues with serializing and de-serializing them in web services. Generic collections are easier to serialize and de-serialize, so they can be easily used in any service and consumed from a client written in any language. ORMs are becoming increasingly popular, and they use generic collections for all data operations. Mocking DataTables in unit tests is a pain, as it involves creating the structure of the table wherever needed. But a generic collection needs a class defined just once. These are my opinions on preferring collections over DataTables. Any feedback is welcome. Happy coding!
October 21, 2013
by Rabi Kiran Srirangam
· 30,143 Views · 3 Likes
article thumbnail
Database vs. Data Science
One thing that Big Data certainly made happen is that it brought the database/infrastructure community and the data analysis/statistics/machine learning communities closer together. As always, each community had it’s own set of models, methods, and ideas about how to structure and interpret the world. You can still tell these differences when looking at current Big Data projects, and I think it’s important to be aware of the distinctions in order to better understand the relationships between different projects. Because, let’s face it, every project claims to re-invent Big Data. Hadoop and MapReduce being something like the founding fathers of Big Data, other’s projects have since appeared. Most notably, there are stream processing projects like Twitter’s Storm who move from batch-oriented processing to event-based processing which is more suited for real-time, low-latency processing. Spark is yet something different, a bit like Hadoop, but puts greater emphasis on iterative algorithms, and in-memory processing to achieve that landmark “100x faster than Hadoop” every current project seems to need to sport. Twitter’s summingbird project tries to bridge the gap between MapReduce and stream processing by providing us with a high-level set of operators which can then either run on MapReduce or Storm. However, both Spark or summingbird leave me sort of flat because you can see that they come from a database background, which means that there will still be a considerable gap to serious machine learning. So, what exactly is the difference? In the end, it’s the difference between relational and linear algebra. In the database world, you model relationships between objects, which you encode in tables, and foreign keys to link up entries between different tables. Probably the most important insight of the database world was to develop a query language, a declarative description of what you want to extract from your database, leaving the optimization of the query and the exact details of how to perform them efficiently to the database guys. The machine learning community, on the other hand, has its roots in linear algebra and probability theory. Objects are usually encoded as a feature vector, that is, a list of numbers describing different properties of an object. Data is often collected in matrices where each row corresponds to an object, and each column to a feature, not much unlike a table in a database. However, the operations you perform in order to do data analysis are quite different from the data base world. Take something as basic as linear regression: your try to learn a linear function f(x)=di=1wixi in a d-dimensional space (that is, where your objects are described by a d-dimensional vector) given n examples Xi, and Yi, where Xi are the features describing your objects and Yi is the real number you attach to Xi. One way to “learn” w is to tune it such that the quadratic error on the training examples is minimal. The solution can be written in closed form as w=(XXT)−1XY where X is the matrix built from the Xi (putting the Xi in the columns of X), and Y is the vector of outputs Yi. In order to solve this, you need to solve the linear equation (XXT)w=XY which can be done by one of a large number of algorithms, starting with Gaussian elimination, which you’ve probably learned in your undergrad studies, or the conjugate gradient algorithm, or by first computing a Cholesky decomposition. All of these algorithms have in common that they are iterative. They go through a number of operations, for example O(d3) for the Gaussian elimination case. They also need to store intermediate results. Gaussian elimination and Cholesky decomposition have rather elementary operations acting on individual entries, while the conjugate gradient algorithm performs a matrix-vector multiplication in each iteration. Most importantly, these algorithms can only be expressed very badly in SQL! It’s certainly not impossible, but you’d need to store your data in much different ways than you would in idiomatic database usage. So, it’s not about whether or not your framework can support iterative algorithms without significant latency, it’s about understanding that joins, group bys, and count() won’t get you far, but you need scalar products, matrix-vector and matrix-matrix multiplications. You don’t need indices for most ML algorithms, maybe except for being able to quickly find the k-nearest neighbors, because most algorithms tend to either take in the whole data set in each iteration or otherwise stream the whole set by some model which is iteratively updated like in stochastic gradient descent. I’m not sure projects like Spark or Stratosphere have fully grasped the significance of this yet. Database infrastructure-inspired Big Data has it’s place when it comes to extracting and preprocessing data, but eventually, you move from database land to machine learning land, which invariably means linear algebra land (or probability theory land, which often also reduces to linear algebra like computations). What often happens today is that you either painstakingly have to break down your linear algebra into MapReduce jobs, or you actively look for algorithms which fit the database view better. I think we’re still at the beginning of what is possible. Or, to be a bit more aggressive, claims that existing (infrastructure, database, parallelism inspired) frameworks provide you with sophistic data analytics are widely exaggerated. They take care of a very important problem by giving you a reliable infrastructure to scale your data analysis code, but there’s still a lot of work that needs to be done on your side. High-level DSLs like Apache Hive or Pig are a first step in this direction but still too much rooted in the database world IMHO. In summary, one should be aware of the difference between a framework which mostly is concerned with scaling and a tool which actually provides some piece of data analysis. And even if it comes with basic database-like analytics mechanisms, there is still a long way to go to do some serious data science. That’s why we’re also thinking that streamdrill occupies an interesting spot, because it is a bit of infrastructure, allowing you to process a serious amount of event data, but it also provides valuable analysis based on algorithms you wouldn’t want to implement yourself, even if you had some Big Data framework like Hadoop at hand. That’s an interesting direction I also would like to see more of in the future. Note: Just saw that Spark has a logistic regression example on their landing page. Well, doing matrix operations explicitly via map() on collections doesn’t count in my view ;)
October 18, 2013
by Mikio Braun
· 11,386 Views · 1 Like
article thumbnail
Generating SQL Railroad Diagrams
simple talk - How to get SQL Railroad Diagrams from MSDN BNF syntax notation. On SQL Server Books-On-Line, in the Transact-SQL Reference (database Engine), every SQL Statement has its syntax represented in ‘Backus–Naur Form’ notation (BNF) syntax. For a programmer in a hurry, this should be ideal because It is the only quick way to understand and appreciate all the permutations of the syntax. It is a great feature once you get your eye in. It isn’t the only way to get the information; You can, of course, reverse-engineer an understanding of the syntax from the examples, but your understanding won’t be complete, and you’ll have wasted time doing it. BNF is a good start in representing the syntax: Oracle and SQLite go one step further, and have proper railroad diagrams for their syntax, which is a far more accessible way of doing it. There are three problems with the BNF on MSDN. Firstly, it is isn’t a standard version of BNF, but an ancient fork from EBNF, inherited from Sybase. Secondly, it is excruciatingly difficult to understand, and thirdly it has a number of syntactic and semantic errors. The page describing DML triggers, for example, currently has the absurd BNF error that makes it state that all statements in the body of the trigger must be separated by commas. There are a few other detail problems too. Here is the offending syntax for a DML trigger, pasted from MSDN. ... I’ve been trying to create railroad diagrams for all the important SQL Server SQL statements, as good as you’d find for Oracle, and have so far published the CREATE TABLE and ALTER TABLE railroad diagrams based on the BNF. Although I’ve been aware of them, I’ve never realised until recently how many errors there are. Then, Colin Daley created a translator for the SQL Server dialect of BNF which outputs standard EBNF notation used by the W3C. The example MSDN BNF for the trigger would be rendered as … ... Colin’s intention was to allow anyone to paste SQL Server’s BNF notation into his website-based parser, and from this generate classic railroad diagrams via Gunther Rademacher's Railroad Diagram Generator. Colin's application does this for you: you're not aware that you are moving to a different site. Because Colin's 'translator' it is a parser, it will pick up syntax errors. Once you’ve fixed the syntax errors, you will get the syntax in the form of a human-readable railroad diagram and, in this form, the semantic mistakes become flamingly obvious. Gunter’s Railroad Diagram Generator is brilliant. To be able, after correcting the MSDN dialect of BNF, to generate a standard EBNF, and from thence to create railroad diagrams for SQL Server’s syntax that are as good as Oracle’s, is a great boon, and many thanks to Colin for the idea. Here is the result of the W3C EBNF from Colin’s application then being run through the Railroad diagram generator. Now that’s much better, you’ll agree. This is pretty easy to understand, and at this point any error is immediately obvious. This should be seriously useful, and it is to me. However there is that snag. The BNF is generally incorrect, and you can’t expect the average visitor to mess about with it. The answer is, of course, to correct the BNF on MSDN and maybe even add railroad diagrams for the syntax. Stop giggling! I agree it won’t happen. In the meantime, we need to collaboratively store and publish these corrected syntaxes ourselves as we do them. How? GitHub? SQL Server Central? Simple-Talk? What should those of us who use the system do with our corrected EBNF so that anyone can use them without hassle? Grammar Translator If you are familiar with the Grammar Translator, go ahead and create railroad diagrams from the Transact-SQL Reference. Otherwise, please see the FAQ. In particular, be sure to try thetutorial. Welcome to Railroad Diagram Generator! This is a tool for creating syntax diagrams, also known as railroad diagrams, from context-free grammars specified in EBNF. Syntax diagrams have been used for decades now, so the concept is well-known, and some tools for diagram generation are in existence. The features of this one are usage of the W3C's EBNF notation, web-scraping of grammars from W3C specifications, online editing of grammars, diagram presentation in SVG, and it was completely written in web languages (XQuery, XHTML, CSS, JavaScript). There's nothing like a diagram to help grok something (and the MSDN BNF SQL stuff really makes my brain hurt...)
October 18, 2013
by Greg Duncan
· 9,162 Views
article thumbnail
Extracting File Metadata with C# and the .NET Framework
The Windows Explorer (shell) provides extended file property information which can be quite valuable. The challenge was how to extract this information, given that the .NET Framework has somewhat limited support for this type of extraction?
October 14, 2013
by Rob Sanders
· 64,229 Views
article thumbnail
Oracle Weblogic Stuck Thread Detection
The following question will again test your knowledge of the Oracle Weblogic threading model. I’m looking forward for your comments and experience on the same. If you are a Weblogic administrator, I’m certain that you heard of this common problem: stuck threads. This is one of the most common problems you will face when supporting a Weblogic production environment. A Weblogic stuck thread simply means a thread performing the same request for a very long time and more than the configurable Stuck Thread Max Time. Question: How can you detect the presence of STUCK threads during and following a production incident? Answer: As we saw from our last article “Weblogic Thread Monitoring Tips”, Weblogic provides functionalities allowing us to closely monitor its internal self-tuning thread pool. It will also highlight you the presence of any stuck thread. This monitoring view is very useful when you do a live analysis but what about after a production incident? The good news is that Oracle Weblogic will also log any detected stuck thread to the server log. Such information includes details on the request and more importantly, the thread stack trace. This data is crucial and will allow you to potentially better understand the root cause of any slowdown condition that occurred at a certain time. < ExecuteThread: '11' for queue: 'weblogic.kernel.Default (self-tuning)'> <[STUCK] ExecuteThread: '35' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "608" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 608213 ms POST /App1/jsp/test.jsp HTTP/1.1 Accept: application/x-ms-application... Referer: http://.. Accept-Language: en-US User-Agent: Mozilla/4.0 .. Content-Type: application/x-www-form-urlencoded Accept-Encoding: gzip, deflate Content-Length: 539 Connection: Keep-Alive Cache-Control: no-cache Cookie: JSESSIONID= ]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace: ................................... javax.servlet.http.HttpServlet.service(HttpServlet.java:727) javax.servlet.http.HttpServlet.service(HttpServlet.java:820) weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227) weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125) weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:301) weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:184) weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.... weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run() weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321) weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120) weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2281) weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2180) weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1491) weblogic.work.ExecuteThread.execute(ExecuteThread.java:256) weblogic.work.ExecuteThread.run(ExecuteThread.java:221) Here is one more tip: the generation and analysis of a JVM thread dump will also highlight you stuck threads. As we can see from the snapshot below, the Weblogic thread state is now updated to STUCK, which means that this particular request is being executed since at least 600 seconds or 10 minutes. This is very useful information since the native thread state will typically remain to RUNNABLE. The native thread state will only get updated when dealing with BLOCKED threads etc. You have to keep in mind that RUNNABLE simply means that this thread is healthy from a JVM perspective. However, it does not mean that it truly is from a middleware or Java EE container perspective. This is why Oracle Weblogic has its own internal ExecuteThread state. Finally, if your organization or client is using any commercial monitoring tool, I recommend that you enable some alerting around both hogging thread and stuck thread. This will allow your support team to take some pro-active actions before the affected Weblogic managed server(s) become fully unresponsive.
October 9, 2013
by Pierre - Hugues Charbonneau
· 55,017 Views
article thumbnail
Cache Scope with EHCache
In another blog post we explained how you can use a new feature of Mule 3.3 to cache data in your Mule flows. Here we look at how to configure Mule to use EHCache to handle the caching part, rather than storing the data in the default InMemoryObjectStore. Let’s get going. First let’s start by saying that there are millions of different ways to do this... We’ve taken the route of configuring everything through Spring. So just because you configured your EHCache differently, does not mean it’s wrong or ours is better. We prefer this way since Mule integrates very nicely with Spring. Now we have settled that, let’s make a list of what we need to do: Define cache manager Define cache factory bean Create a custom object store Define a Mule caching strategy The first job is to define a cache manager and cache factory bean. Spring provides two very handy classes for this, specific for EHCache: EhCacheManagerFactoryBean and EhCacheFactoryBean. The cache manager just needs to be defined. However, on the cache factory bean you can configure all EHCache details such as time to live, time to idle, when to overflow on disk, the eviction policy, and much more. For more information, you can check the API here or the EHCache website. Also, from the cache factory bean, you need to refer back to the cache manager. An example is shown in the following gist: Once the cache and the cache manager are configured, we need to define a custom object store that uses EHCache to store and retrieve the data. This is very easy to do, we just need to create a new class that implements the standard Mule’s ObjectStore interface, and use EHCache to do the operations. A working custom EHCache object store is shown in the following gist: package com.ricston.cache; import java.io.Serializable; import net.sf.ehcache.Ehcache; import net.sf.ehcache.Element; import org.mule.api.store.ObjectStore; import org.mule.api.store.ObjectStoreException; public class EhcacheObjectStore implements ObjectStore { private Ehcache cache; @Override public synchronized boolean contains(Serializable key) throws ObjectStoreException { return cache.isKeyInCache(key); } @Override public synchronized void store(Serializable key, T value) throws ObjectStoreException { Element element = new Element(key, value); cache.put(element); } @SuppressWarnings("unchecked") @Override public synchronized T retrieve(Serializable key) throws ObjectStoreException { Element element = cache.get(key); if (element == null) { return null; } return (T) element.getValue(); } @Override public synchronized T remove(Serializable key) throws ObjectStoreException { T value = retrieve(key); cache.remove(key); return value; } @Override public boolean isPersistent() { return false; } public Ehcache getCache() { return cache; } public void setCache(Ehcache cache) { this.cache = cache; } } As you can clearly see, this object store encapsulates an EHCache instance. This should be set before we start using this object store. As you can imagine, we will do this through Spring. The next step is to configure a caching strategy which uses our brand new EHCache object store. The caching strategy using our custom object store, and in the object store, we are using Spring to inject the cache defined earlier in this blog post. The rest in Mule can be exactly the same as in the other blog post we explained before. So here we have shown you how we can use EHCache as the caching engine for the cache scopes provided by Mule 3.3. A reason why you would do this is that with EHCache, you have a very good and proven caching product with a ton of settings that you can exploit and tune for your application. As a side note, if you have issues with EHCache classloading in Mule, place the EHCache jars inside $MULE_HOME/lib/user rather than in your application. Enjoy.
October 4, 2013
by Alan Cassar
· 14,187 Views · 1 Like
article thumbnail
TestNG @Test Annotation and DataProviderClass Example
In the previous post, we have seen an example where dataProvider attribute has been used3 to test methods with different sets of input data for the same test method. TestNG provides another attribute dataProviderClass in conjunction with dataProvider to fetch the input data for the test methods from an external class. The actual class that holds input data is set to the dataProviderClass attribute and datProvider by itself holds the method name where the input data is actually fetched. Here is a quick example to show how to use dataProviderClass and dataProvide attribute Code Service Class ? view source print? 01.package com.skilledmonster.example; 02./** 03.* Simple calculator service to demonstrate TestNG Framework 04.* 05.* @author Jagadeesh Motamarri 06.* @version 1.0 07.*/ 08.public interface CalculatorService { 09.int sum(int a, int b); 10.int multiply(int a, int b); 11.int div(int a, int b); 12.int sub(int a, int b); 13.} Service Implementation Class ? view source print? 01.package com.skilledmonster.example; 02./** 03.* Simple calculator service implementation to demonstrate TestNG Framework 04.* 05.* @author Jagadeesh Motamarri 06.* @version 1.0 07.*/ 08.public class SimpleCalculator implements CalculatorService { 09.public int sum(int a, int b) { 10.return a + b; 11.} 12.public int multiply(int a, int b) { 13.return a * b; 14.} 15.public int div(int a, int b) { 16.return a / b; 17.} 18.public int sub(int a, int b) { 19.return a - b; 20.} 21.} Data Provider Class ? view source print? 01.package com.skilledmonster.common; 02.import org.testng.annotations.DataProvider; 03./** 04.* Data Provider class for TestNG test cases 05.* 06.* @author Jagadeesh Motamarri 07.* @version 1.0 08.*/ 09.public class TestNGDataProvider { 10./** 11.* Data Provider for testing sum of 2 numbers 12.* 13.* @return 14.*/ 15.@DataProvider 16.public static Object[][] testSumInput() { 17.return new Object[][] { { 5, 5 }, { 10, 10 }, { 20, 20 } }; 18.} 19./** 20.* Data Provider for testing multiplication of 2 numbers 21.* 22.* @return 23.*/ 24.@DataProvider 25.public static Object[][] testMultipleInput() { 26.return new Object[][] { { 5, 5 }, { 10, 10 }, { 20, 20 } }; 27.} 28.} Finally, test class that uses dataProviderClass attribute to feed the input data for the test methods ? package com.skilledmonster.example; import org.testng.Assert; import org.testng.annotations.BeforeClass; import org.testng.annotations.Test; import com.skilledmonster.common.TestNGDataProvider; /** * Example to demonstrate use of dataProviderClass and dataProvide attributes of TestNG framework * * @author Jagadeesh Motamarri * @version 1.0 */ public class TestNGAnnotationTestDataProviderExample { public CalculatorService service; @BeforeClass public void init() { System.out.println("@BeforeClass: The annotated method will be run before the first test method in the current class is invoked."); System.out.println("init service"); service = new SimpleCalculator(); } @Test(dataProviderClass = TestNGDataProvider.class, dataProvider = "testSumInput") public void testSum(int a, int b) { System.out.println("@Test : testSum()"); int result = service.sum(a, b); Assert.assertEquals(result, a + b); } @Test(dataProviderClass = TestNGDataProvider.class, dataProvider = "testMultipleInput") public void testMultiple(int a, int b) { System.out.println("@Test : testMultiple()"); int result = service.multiply(a, b); Assert.assertEquals(result, a * b); } } Output As shown in the above console output, each of the testSum() and testMutiple() methods are invoked with different sets of input data using an external class with dataProviderClass attribute. Advantage More flexibility and re-usability of commonly used data across several test classes. Download Download TestNG DataProvider Example
October 2, 2013
by Jagadeesh Motamarri
· 25,472 Views
article thumbnail
Clojure: Converting a string to a date
I wanted to do some date manipulation in Clojure recently and figured that since clj-time is a wrapper around Joda Time it’d probably do the trick. The first thing we need to do is add the dependency to our project file and then run lein reps to pull down the appropriate JARs. The project file should look something like this: project.clj (defproject ranking-algorithms "0.1.0-SNAPSHOT" :license {:name "Eclipse Public License" :url "http://www.eclipse.org/legal/epl-v10.html"} :dependencies [[org.clojure/clojure "1.4.0"] [clj-time "0.6.0"]]) Now let’s load the clj-time.format namespace into the REPL since we know we’ll be parsing dates: > (require '(clj-time [format :as f])) The string that I want to convert into a date looks like this: (def string-date "18 September 2012") The first thing we should do is check whether there is an existing formatter that we can use by evaluating the following function: > (f/show-formatters) ... :hour-minute 06:45 :hour-minute-second 06:45:22 :hour-minute-second-fraction 06:45:22.473 :hour-minute-second-ms 06:45:22.473 :mysql 2013-09-20 06:45:22 :ordinal-date 2013-263 :ordinal-date-time 2013-263T06:45:22.473Z :ordinal-date-time-no-ms 2013-263T06:45:22Z :rfc822 Fri, 20 Sep 2013 06:45:22 +0000 ... There are a lot of different built in formatters but unfortunately I couldn’t find one that exactly matched our date format so we’ll have to write our own one. For that we’ll need to refresh our knowledge of Java date formatting: We end up with the following formatter: > (f/parse (f/formatter "dd MMM YYYY") string-date) # It took me much longer than it should have to remember that ‘MMM’ is the pattern to match a short form of a month but it’s just the same as what we’d have to do in Java but with some neat wrapper functions.
October 2, 2013
by Mark Needham
· 5,848 Views
article thumbnail
Getting Started with NHibernate and ASP.NET MVC- CRUD Operations
In this post we are going to learn how we can use NHibernate in ASP.NET MVC application. What is NHibernate: ORMs(Object Relational Mapper) are quite popular this days. ORM is a mechanism to map database entities to Class entity objects without writing a code for fetching data and write some SQL queries. It automatically generates SQL Query for us and fetch data behalf on us. NHibernate is also a kind of Object Relational Mapper which is a port of popular Java ORM Hibernate. It provides a framework for mapping an domain model classes to a traditional relational databases. Its give us freedom of writing repetitive ADO.NET code as this will be act as our database layer. Let’s get started with NHibernate. How to download: There are two ways you can download this ORM. From nuget package and from the source forge site. Nuget - http://www.nuget.org/packages/NHibernate/ Source Forge-http://sourceforge.net/projects/nhibernate/ Creating a table for CRUD: I am going to use SQL Server 2012 express edition as a database. Following is a table with four fields Id, First Name, Last name, Designation. Creating ASP.NET MVC project for NHibernate: Let’s create a ASP.NET MVC project for NHibernate via click on File-> New Project –> ASP.NET MVC 4 web application. Installing NuGet package for NHibernate: I have installed nuget package from Package Manager console via following Command. It will install like following. NHibertnate configuration file: Nhibernate needs one configuration file for setting database connection and other details. You need to create a file with ‘hibernate.cfg.xml’ in model Nhibernate folder of your application with following details. NHibernate.Connection.DriverConnectionProvider NHibernate.Driver.SqlClientDriver Server=(local);database=LocalDatabase;Integrated Security=SSPI; NHibernate.Dialect.MsSql2012Dialect Here you have got different settings for NHibernate. You need to selected driver class, connection provider as per your database. If you are using other databases like Orcle or MySQL you will have different configuration. ThisNHibernate ORM can work with any databases. Creating a model class for NHibernate: Now it’s time to create model class for our CRUD operations. Following is a code for that. Property name is identical to database table columns. namespace NhibernateMVC.Models { public class Employee { public virtual int Id { get; set; } public virtual string FirstName { get; set; } public virtual string LastName { get; set; } public virtual string Designation { get; set; } } } Creating a mapping file between class and table: Now we need a xml mapping file between class and model with name “Employee.hbm.xml” like following in Nhibernate folder. Creating a class to open session for NHibernate I have created a class in models folder called NHIbernateSession and a static function it to open a session for NHibertnate. using System.Web; using NHibernate; using NHibernate.Cfg; namespace NhibernateMVC.Models { public class NHibertnateSession { public static ISession OpenSession() { var configuration = new Configuration(); var configurationPath = HttpContext.Current.Server.MapPath(@"~\Models\Nhibernate\hibernate.cfg.xml"); configuration.Configure(configurationPath); var employeeConfigurationFile = HttpContext.Current.Server.MapPath(@"~\Models\Nhibernate\Employee.hbm.xml"); configuration.AddFile(employeeConfigurationFile); ISessionFactory sessionFactory = configuration.BuildSessionFactory(); return sessionFactory.OpenSession(); } } } Listing: Now we have our open session method ready its time to write controller code to fetch data from the database. Following is a code for that. using System; using System.Web.Mvc; using NHibernate; using NHibernate.Linq; using System.Linq; using NhibernateMVC.Models; namespace NhibernateMVC.Controllers { public class EmployeeController : Controller { public ActionResult Index() { using (ISession session = NHibertnateSession.OpenSession()) { var employees = session.Query().ToList(); return View(employees); } } } } Here you can see I have get a session via OpenSession method and then I have queried database for fetching employee database. Let’s create a new for this you can create this via right lick on view on above method.We are going to create a strongly typed view for this. Our listing screen is ready once you run project it will fetch data as following. Create/Add: Now its time to write add employee code. Following is a code I have written for that. Here I have used session.save method to save new employee. First method is for returning a blank view and another method with HttpPost attribute will save the data into the database. public ActionResult Create() { return View(); } [HttpPost] public ActionResult Create(Employee emplolyee) { try { using (ISession session = NHibertnateSession.OpenSession()) { using (ITransaction transaction = session.BeginTransaction()) { session.Save(emplolyee); transaction.Commit(); } } return RedirectToAction("Index"); } catch(Exception exception) { return View(); } } Now let’s create a create view strongly typed view via right clicking on view and add view. Once you run this application and click on create new it will load following screen. Edit/Update: Now let’s create a edit functionality with NHibernate and ASP.NET MVC. For that I have written two action result method once for loading edit view and another for save data. Following is a code for that. public ActionResult Edit(int id) { using (ISession session = NHibertnateSession.OpenSession()) { var employee = session.Get(id); return View(employee); } } [HttpPost] public ActionResult Edit(int id, Employee employee) { try { using (ISession session = NHibertnateSession.OpenSession()) { var employeetoUpdate = session.Get(id); employeetoUpdate.Designation = employee.Designation; employeetoUpdate.FirstName = employee.FirstName; employeetoUpdate.LastName = employee.LastName; using (ITransaction transaction = session.BeginTransaction()) { session.Save(employeetoUpdate); transaction.Commit(); } } return RedirectToAction("Index"); } catch { return View(); } } Here in first action result I have fetched existing employee via get method of NHibernate session and in second I have fetched and changed the current employee with update details. You can create view for this via right click –>add view like below. I have created a strongly typed view for edit. Once you run code it will look like following. Details: Now it’s time to create a detail view where user can see the employee detail. I have written following logic for details view. public ActionResult Details(int id) { using (ISession session = NHibertnateSession.OpenSession()) { var employee = session.Get(id); return View(employee); } } You can add view like following via right click on actionresult view. now once you run this in browser it will look like following. Delete: Now its time to write delete functionality code. Following code I have written for that. public ActionResult Delete(int id) { using (ISession session = NHibertnateSession.OpenSession()) { var employee = session.Get(id); return View(employee); } } [HttpPost] public ActionResult Delete(int id, Employee employee) { try { using (ISession session = NHibertnateSession.OpenSession()) { using (ITransaction transaction = session.BeginTransaction()) { session.Delete(employee); transaction.Commit(); } } return RedirectToAction("Index"); } catch(Exception exception) { return View(); } } Here in the above first action result will have the delete confirmation view and another will perform actual delete operation with session delete method. When you run into the browser it will look like following. That’s it. It’s very easy to have crud operation with NHibernate. Stay tuned for more.
October 1, 2013
by Jalpesh Vadgama
· 47,294 Views
article thumbnail
ElasticSearch: Java API
ElasticSearch provides Java API, thus it executes all operations asynchronously by using client object.
September 30, 2013
by Hüseyin Akdoğan DZone Core CORE
· 137,562 Views · 4 Likes
article thumbnail
Clojure: Converting an Array/Set into a Hash Map
When I was implementing the Elo Rating algorithm a few weeks ago one thing I needed to do was come up with a base ranking for each team. I started out with a set of teams that looked like this: (def teams #{ "Man Utd" "Man City" "Arsenal" "Chelsea"}) and I wanted to transform that into a map from the team to their ranking e.g. Man Utd -> {:points 1200} Man City -> {:points 1200} Arsenal -> {:points 1200} Chelsea -> {:points 1200} I had read the documentation of array-map, a function which can be used to transform a collection of pairs into a map, and it seemed like it might do the trick. I started out by building an array of pairs using mapcat: > (mapcat (fn [x] [x {:points 1200}]) teams) ("Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200}) array-map constructs a map from pairs of values e.g. > (array-map "Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200}) ("Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200}) Since we have a collection of pairs rather than individual pairs we need to use the apply function as well: > (apply array-map ["Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200}]) {"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200} And if we put it all together we end up with the following: > (apply array-map (mapcat (fn [x] [x {:points 1200}]) teams)) {"Man Utd" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Chelsea" {:points 1200} It works but the function we pass to mapcat feels a bit clunky. Since we just need to create a collection of team/ranking pairs we can use the vector and repeat functions to build that up instead: > (mapcat vector teams (repeat {:points 1200})) ("Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200}) And if we put the apply array-map code back in we still get the desired result: > (apply array-map (mapcat vector teams (repeat {:points 1200}))) {"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200} Alternatively we could use assoc like this: > (apply assoc {} (mapcat vector teams (repeat {:points 1200}))) {"Man Utd" {:points 1200}, "Arsenal" {:points 1200}, "Man City" {:points 1200}, "Chelsea" {:points 1200} I also came across the into function which seemed useful but took in a collection of vectors: > (into {} [["Chelsea" {:points 1200}] ["Man City" {:points 1200}] ["Arsenal" {:points 1200}] ["Man Utd" {:points 1200}] ]) We therefore need to change the code to use map instead of mapcat: > (into {} (map vector teams (repeat {:points 1200}))) {"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200} However, my favourite version so far uses the zipmap function like so: > (zipmap teams (repeat {:points 1200})) {"Man Utd" {:points 1200}, "Arsenal" {:points 1200}, "Man City" {:points 1200}, "Chelsea" {:points 1200} I’m sure there are other ways to do this as well so if you know any let me know in the comments.
September 28, 2013
by Mark Needham
· 13,089 Views
  • Previous
  • ...
  • 416
  • 417
  • 418
  • 419
  • 420
  • 421
  • 422
  • 423
  • 424
  • 425
  • ...
  • Next
  • RSS
  • X
  • Facebook

ABOUT US

  • About DZone
  • Support and feedback
  • Community research

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 215
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

  • RSS
  • X
  • Facebook
×