Open Source Resources for Engineers

The Latest Open Source Topics

What is strus?

Strus is a collection of libraries and tools written in C++ for building a competitive search engine. It is currently a single-person project that started in September 2014, so its competitiveness in terms of features is more a promise than a fact. It definitely needs more brains put into it to catch up with the big players among open source search engines, Lucene and Xapian.

But strus is not just a me-too project for search. It introduces expression matching and information extraction on a different level than other known open source engines. Strus also simplifies the architecture of a search engine by “outsourcing” components such as the key/value store database that stores the data blocks. This componentization (see the components of strus) drastically reduces the amount of code, and it creates opportunities for experts on a specific topic to contribute.

Strus is not the first attempt at this, but it is the first open source project to try it with performance within reach of the big open source search engines, and it does so without a ten-year history of optimization behind it. Strus might not be at eye level yet, but let’s see what happens when more diverse reasoning and competition are put into it.

Who is strus for?

The people I primarily want to address with this blog are developers and hackers, as potential contributors or as a source of feedback. The project could also already be interesting for experimental projects that can afford to go along with the development of strus. As a stakeholder, you can influence the project too. As the demo project (a search over the complete English Wikipedia collection) shows, it is already possible to build projects with strus, but you have to be aware that hard deadlines should not exist, because you might hit a point where a feature you need is not immediately available. Project planning is difficult at the current stage. Furthermore, the state of the documentation is still quite poor.

Programming paradigms

All interfaces of strus are pure. No inheritance is used in the main header files. Strus is more like a Lego kit than a provider of solution classes. If, for example, you want to build a sequence of terms as a feature for your search, you have to build its expression tree with the help of a stack, rather than picking a class that implements a sequence query. In PHP this looks as follows:

    $terms = [ "hello", "world" ];
    $query->pushTerm( "word" /*feature type*/, $terms[0] );
    $query->pushTerm( "word" /*feature type*/, $terms[1] );
    $query->pushExpression( "sequence", 2 /*nof terms*/, 2 /*position range*/ );
    $query->defineFeature( "docfeat" /*name addressing this feature set*/ );

The number of interface classes is small (see for example the interface classes of the core), but you have to understand them. If you want to contribute, you should also have a closer look at the programming guidelines.

Try it

There is a guide on how to fetch, build and install strus. Unfortunately a tutorial is still missing. There will be one soon!

Support

I will reply to questions. Please mail me at contact at project dash strus dot net.

Thanks

I want to thank the authors of LevelDB here. For some time I was looking for a key/value store database that had an upper bound seek function in its interface. The upper bound seek is crucial because it allows you to minimize block accesses on disk when joining sets. A key/value store without an upper bound seek would have forced me to create virtual blocks that point to other blocks, which would mean more disk accesses to fetch the data blocks needed. LevelDB has it. Any alternative candidate for implementing the database interface has to have it too.

Social Media

Github: patrickfrey
Twitter: @ProjectStrus
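To illustrate why an upper bound seek matters when joining sets, here is a minimal Java sketch. It is not strus or LevelDB code: the class `UpperBoundJoin` is hypothetical, the sorted sets stand in for on-disk blocks of document numbers, and it assumes that "upper bound seek" means positioning at the first key greater than or equal to a target, which `TreeSet.ceiling` provides in memory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical illustration only; not part of strus or LevelDB. */
final class UpperBoundJoin {

    /** Intersects two sorted sets by alternately seeking each one forward to the other's position. */
    static List<Integer> intersect(TreeSet<Integer> a, TreeSet<Integer> b) {
        List<Integer> result = new ArrayList<>();
        if (a.isEmpty() || b.isEmpty()) {
            return result;
        }
        Integer x = a.first();
        Integer y = b.first();
        while (x != null && y != null) {
            int cmp = x.compareTo(y);
            if (cmp == 0) {
                result.add(x);       // present in both sets
                x = a.higher(x);     // advance both sides past the match
                y = b.higher(y);
            } else if (cmp < 0) {
                x = a.ceiling(y);    // seek: first element of a that is >= y
            } else {
                y = b.ceiling(x);    // seek: first element of b that is >= x
            }
        }
        return result;
    }
}
```

Each seek skips every element that cannot match, instead of stepping through them one by one. On disk, the same idea would mean skipping whole blocks, which is presumably the saving in block accesses that the author describes.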

June 23, 2015

by Patrick Frey


PostgreSQL Powers All New Apps for 77% of the Database's Users

Survey of open source PostgreSQL users found adoption continues to rise, with 55% of users deploying it for mission-critical applications

Bedford, MA – June 23, 2015 – EnterpriseDB (EDB), the leading provider of enterprise-class Postgres products and database compatibility solutions, today announced the results of its “PostgreSQL Adoption Survey 2015,” a biennial survey of open source PostgreSQL users. Conducted by EnterpriseDB, the survey found PostgreSQL adoption continuing to rise, with 55% of users (up from 40% two years ago) deploying it for mission-critical applications and 77% of users dedicating all new application deployments to PostgreSQL.

These findings give voice to end users and confirm industry indicators, such as increasing job listings and monthly rankings on DB-Engines, that have pointed to rising interest in and demand for PostgreSQL, also called Postgres. The growing popularity of Postgres also comes as traditional software vendors suffer setbacks in the marketplace. Meanwhile, the enterprise-class performance, security and stability of Postgres, on par with traditional database vendors for most corporate workloads, have helped position Postgres among the solutions from the world’s largest vendors.

The opportunity to transform their data center economics has helped fuel downloads of Postgres as well. End users reported cutting costs with Postgres, with 41% reporting first-year cost savings of 50% or more. They are using Postgres to build web 2.0 applications using unstructured data, as evidenced by the 64% of respondents who said they were working with JSON/JSONB and the 47% who said they were using Postgres for collaboration applications.

“Postgres is empowering organizations to transform the economics of IT. IT can invest in the customer engagement applications that differentiate their operations from their competition instead of continuing to pay the steep and rising licensing and support fees charged by traditional database vendors,” said Marc Linster, senior vice president of products and services of EnterpriseDB. “With the expanding adoption, EnterpriseDB has experienced dramatic growth year over year, providing the software, services and support that organizations need to be successful with Postgres.”

Database Migrations, Replacements

The findings also support statements in a recent Gartner report that reflect the widespread acceptance of open source databases. “By 2018, more than 70% of new in-house applications will be developed on an OSDBMS, and 50% of existing commercial RDBMS instances will have been converted or will be in process,” according to the April 2015 Gartner report, The State of Open-Source RDBMs, 2015.*

Among Postgres users, the survey findings show migrations are already under way, with 37% reporting they had migrated applications from Oracle or Microsoft SQL Server to Postgres. Many users were still planning further migrations, with 37% of PostgreSQL users saying they will gradually replace their legacy systems with Postgres, compared to 29% who said that in the 2013 survey. Further, end users predict their deployments of Postgres will expand significantly, with 32% saying they anticipate production deployments of Postgres to increase by at least 50% over the next year.

The survey, conducted by EnterpriseDB using an online tool in May 2015, queried registered users of PostgreSQL and drew 274 respondents worldwide from government organizations and companies ranging in size and industry.

*The State of Open-Source RDBMs, 2015, by Donald Feinberg and Merv Adrian, published on April 21, 2015.

June 23, 2015

by Fran Cator


This Week In Modern Software: Inside Obama’s Geek Squad

[This article was written by Kevin Casey]

Welcome to This Week in Modern Software, or TWiMS, New Relic’s weekly roundup of the need-to-know news, stories, and events of interest surrounding software analytics, cloud computing, application monitoring, development methodologies, programming languages, and the myriad of other issues that influence modern software. This week, our top story goes inside President Obama’s secret team of tech geeks, 140 of them and counting.

TWiMS Top Story: Inside Obama’s Stealth Startup (Fast Company)

What it’s about: If the President of the United States walked into the room and personally recruited you to rebuild the country’s technology infrastructure, could you turn him down? He’s serious, and that room is the Roosevelt Room in the West Wing of the White House, by the way. As Lisa Gelobter says: “What are you going to say to that?” Gelobter’s answer was “Yes.” She’s now chief digital officer for the US Department of Education, part of a 140-person-and-counting tech team that’s functioning something like an elite startup embedded inside the federal government. Its business? Only modernizing the technical infrastructure, applications, and processes of just about every federal agency.

Why you should care: What was once something of a tech desert, the federal government, is beginning to draw top private-sector talent inside the Beltway. The team, led by Mikey Dickerson (who helped lead the team that rescued Healthcare.gov) and former US CTO Todd Park, also includes the likes of former Googler Matthew Weaver, and it hopes to hit 500 people by the end of 2016, shortly before President Obama will leave office. Its challenges are immense, from tackling government bureaucracy (to test just how entrenched the suits were, Weaver requested the official title “Rogue Leader,” and he got it) to the fact that its recruiting pitch includes the phrase: “You’ll have to take a pay cut.” But its mission is both noble and necessary, and the appeal of working on major problems with enormous public impacts appears to be working. Recommended reading.

Further reading: Mikey Dickerson’s 10 Tips for Dealing with Bureaucracy (New Relic Blog) [Video]

Airbnb Open Sources Software to Lure Talent Amid ‘Insane’ Competition (CIO Journal)

What it’s about: Airbnb added three new apps to its open source portfolio earlier this month, but the motivation wasn’t just trying to give employees the best business tools or contribute to the software community at large. Sure, that might have been part of the equation, but the rental booking site hopes open-sourcing some of its toolkit will help recruit the best software talent in the face of what director of engineering Mike Curtis calls “insane” competition in the Silicon Valley labor market.

Why you should care: In the software arms race, any little edge counts. Curtis tells CIO Journal that Airbnb will keep the proprietary stuff closely guarded, of course. But it will open source “generic” tools with wider industry use cases, such as its recently released Aerosolve machine-learning package and its Airpal cloud-based data querying tool. The latter, which works with Facebook’s open source PrestoDB, aims to simplify SQL queries to the point where you don’t need to be a big data wonk or business intelligence guru to run it. Indeed, one in three Airbnb employees have run a query on it in the year since it launched. Airbnb has contributed a dozen open source tools on its aptly named Nerds site (gotta love that!) to date, something the company hopes both contributes to the greater good and advertises its software innovation to potential hires.

Google Is Wielding Its Own Secret Weapon in the Cloud (The New York Times)

What it’s about: In the cutthroat competition for public cloud business, Google may be its own best customer testimonial. In advance of this week’s Open Network Summit, the Times’ Bits blog looked at Google’s plan to not only unveil cloud customers such as HTC but reveal much more than ever before about its own infrastructure. Google did just that on Wednesday, offering a look inside its data center networking, including its massive-capacity, lightning-fast Jupiter network.

Why you should care: As major cloud players continue to zap prices with their shrink-rays, it’s increasingly clear that features and underlying platforms will distinguish one from the other when enterprise users make their pick. Google is taking a big step toward writing its own story in this regard, and the synopsis might read something like: “We’re pretty good at this stuff.” Its Jupiter fabrics deliver 1 petabit per second of bisection bandwidth, according to Google, or “enough for 100,000 servers to exchange information at 10Gb/s each, enough to read the entire scanned contents of the Library of Congress in less than 1/10th of a second.” If it sounds like a bit of bragging, well, yeah, it is. But it’s bragging with a purpose: attracting devs who want access to the same technology without having to build it themselves. Google’s Amin Vahdat connected the dots in a blog post: “The same networks that power all of Google’s internal infrastructure and services also power Google Cloud Platform.”

Move Over, Meeker: Byron Deeter’s State of the Cloud Report (Bessemer Venture Partners)

What it’s about: With a nod to Mary Meeker’s classic State of the Internet report, Bessemer Venture Partners’ Byron Deeter checks in with his 2015 State of the Cloud Report. Given cloud computing’s relative youth and rampant ascension, it’s no surprise the stats are staggering. Here’s one to start: Cloud revenues have increased tenfold in the last six years, from a scant $5.6 billion in 2008 to more than $56 billion in 2014. And they are going to double again in the next four years, according to BVP’s projections, to $127.5 billion in 2018.

Why you should care: Deeter’s full presentation is worth a weekend watch or read, but it’s the forward-looking slides that may be most compelling for software pros. Deeter notes both the immense risks and opportunities in cloud security, unveiling a 10-point security plan for cloud startups on slide 37. To underscore the security landscape, Deeter quotes an unnamed cloud CEO who says a DDoS attack that took down the firm’s API caused more customer churn in one day than in the rest of its history. Wow. He also addresses the exploding market for cloud services built specifically for developers including, yes, New Relic. And for mobile developers, slide 44 underscores something we’ve talked about before in this space: the real money’s in enterprise apps, and it’s still a largely untapped market. Click through the full slide deck here or watch video of Deeter’s presentation here.

Bandwidth: The Next Frontier of Cloud Computing (ZDnet)

What it’s about: Is networking the next big thing in the everything-as-a-service age? It just might be, as firms like Pacnet vie to deliver networking capacity on a pay-for-what-you-use model that some industry folks say better suits cloud environments facing significant but uneven networking needs.

Why you should care: As author Drew Turney notes, there’s a common blind spot when it comes to cloud computing’s many shapes and sizes: moving all that data from points A to Z, and everywhere in between, which can cause both performance problems and undue financial pressures. The promise of Networking-as-a-Service (NaaS), industry execs tell Turney, is that it can provide more efficient, scalable networking for short-term usage bursts such as customer traffic spikes or large cloud backup-and-storage jobs, enabling companies to later dial down their capacity as needed. Combined with Software-Defined Networking (SDN), NaaS makes it possible to build intelligent applications that manage their own networking needs, which might be the most significant enterprise potential of NaaS, says Nuage Networks architect Marten Hauville.

Page Bloat: Average Web Page Now More Than 2MB (The Performance Beacon, SOASTA)

What it’s about: Do you need to put your website on a diet? Apparently so: The average Web page topped 2 MB as of May 2015, according to ongoing tracking at The Performance Beacon. That’s double the average page weight from just three years ago. The site projects average page weight will exceed 3 MB in late 2017.

Why you should care: Performance, performance, performance: Slow speeds are a killer in the modern software era. While author and SOASTA UX evangelist Tammy Everts rightly notes that page weight is not the only factor in Web optimization, we’re simply not paying it enough attention when designing and building Web pages. Images are the big culprit in the Web’s expanding waistline: they comprise nearly two-thirds of the average page’s weight, and video is a growing part of our Web diet, too. But other factors such as custom fonts play a role, adding weight even as the Web sheds previous performance hogs like Flash. The ideal weight? 1 MB, she says, which will save crucial seconds in load times. Sounds like it’s time to hit the virtual treadmill.

June 23, 2015

by Fredric Paul


Why 12 Factor Application Patterns, Microservices and CloudFoundry Matter (Part 2)

Learn why 12 Factor Application Patterns, Microservices and CloudFoundry matter when trying to change the way your product is produced.

June 12, 2015

by Tim Spann


Ecosystem of Hadoop Animal Zoo

Hadoop is best known for MapReduce and its distributed file system (HDFS). More recently, other productivity tools have been developed on top of these, and together they form a complete Hadoop ecosystem. Most of the projects are hosted under the Apache Software Foundation. The Hadoop ecosystem projects are listed below.

Hadoop Common

A set of components and interfaces for distributed file systems and I/O (serialization, Java RPC, persistent data structures).
http://hadoop.apache.org/

HDFS

The Hadoop Distributed File System (HDFS), renamed from NDFS, is a distributed file system that runs on large clusters of commodity hardware. It is a scalable data store that holds semi-structured, unstructured and structured data.
http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/hdfsuserguide.html
http://wiki.apache.org/hadoop/hdfs

MapReduce

MapReduce is the distributed, parallel computing programming model for Hadoop, inspired by Google's MapReduce research paper. Hadoop includes an implementation of the MapReduce programming model. In MapReduce there are two phases, not surprisingly map and reduce. To be precise, between the map and reduce phases there is another phase called sort and shuffle. The job tracker on the name node machine manages the other cluster nodes. MapReduce programs can be written in Java. If you prefer a non-Java language, you are still in luck: you can use a utility called Hadoop Streaming. If you like SQL, see Hive below.
http://wiki.apache.org/hadoop/hadoopmapreduce

Hadoop Streaming

A utility that enables MapReduce code in many languages such as C, Perl, Python, C++ and Bash. Examples include a Python mapper and an AWK reducer.
http://hadoop.apache.org/docs/r1.2.1/streaming.html

Avro

A serialization system for efficient, cross-language RPC and persistent data storage. Avro is a framework for performing remote procedure calls and data serialization. In the context of Hadoop, it can be used to pass data from one program or language to another, e.g. from C to Pig. It is particularly suited for use with scripting languages such as Pig, because data is always stored with its schema in Avro.
http://avro.apache.org/

Apache Thrift

Apache Thrift allows you to define data types and service interfaces in a simple definition file. Taking that file as input, the compiler generates code that can be used to easily build RPC clients and servers that communicate seamlessly across programming languages. Instead of writing a load of boilerplate code to serialize and transport your objects and invoke remote methods, you can get right down to business.
http://thrift.apache.org/

Hive and Hue

If you like SQL, you will be delighted to hear that you can write SQL and have Hive convert it to a MapReduce job. However, you don't get a full ANSI SQL environment. Hue gives you a browser-based graphical interface for your Hive work. Hue features a file browser for HDFS, a job browser for MapReduce/YARN, an HBase browser, and query editors for Hive, Pig, Cloudera Impala and Sqoop2. It also ships with an Oozie application for creating and monitoring workflows, a ZooKeeper browser and an SDK.

Pig

A high-level data flow programming language and execution environment for MapReduce coding. The Pig language is called Pig Latin. You may find the naming conventions somewhat unconventional, but you get incredible price-performance and high availability.
https://pig.apache.org/

Jaql

Jaql is a functional, declarative programming language designed especially for working with large volumes of structured, semi-structured and unstructured data. As its name implies, a primary use of Jaql is to handle data stored as JSON documents, but Jaql can work on various types of data. For example, it can support XML, comma-separated values (CSV) data and flat files. A "SQL within Jaql" capability lets programmers work with structured SQL data while employing a JSON data model that is less restrictive than its Structured Query Language counterparts.
1. Jaql in Google Code
2. What is Jaql? by IBM

Sqoop

Sqoop provides bi-directional data transfer between Hadoop (HDFS) and your favorite relational database. For example, you might be storing your application data in a relational store such as Oracle, and now you want to scale your application with Hadoop. You can migrate the Oracle database data to Hadoop HDFS using Sqoop.
http://sqoop.apache.org/

Oozie

Manages Hadoop workflows. It doesn't replace your scheduler or BPM tooling, but it provides if-then-else branching and control for Hadoop jobs.
https://oozie.apache.org/

ZooKeeper

A distributed, highly available coordination service. ZooKeeper provides primitives such as distributed locks that can be used for building highly scalable applications. It is used to manage synchronization for the cluster.
http://zookeeper.apache.org/

HBase

Based on Google's Bigtable, HBase "is an open-source, distributed, versioned, column-oriented store" that sits on top of HDFS. It is a super scalable key-value store that works very much like a persistent hash map (Python developers can think of a dictionary). It is not a conventional relational database; it is a distributed, column-oriented database. HBase uses HDFS for its underlying storage. It supports both batch-style computations using MapReduce and point queries for random reads.
https://hbase.apache.org/

Cassandra

A column-oriented NoSQL data store that offers scalability and high availability without compromising on performance. It is a perfect platform for commodity hardware and cloud infrastructure. Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, strong support for denormalization and materialized views, and powerful built-in caching.
http://cassandra.apache.org/

Flume

A real-time loader for streaming your data into Hadoop. It stores data in HDFS and HBase. Flume "channels" data between "sources" and "sinks", and its data harvesting can be either scheduled or event-driven. Possible sources for Flume include Avro, files, and system logs, and possible sinks include HDFS and HBase.
http://flume.apache.org/

Mahout

Machine learning for Hadoop, used for predictive analytics and other advanced analysis. There are currently four main groups of algorithms in Mahout:
- recommendations, a.k.a. collaborative filtering
- classification, a.k.a. categorization
- clustering
- frequent item set mining, a.k.a. parallel frequent pattern mining
Mahout is not simply a collection of pre-existing algorithms. Many machine learning algorithms are intrinsically non-scalable; that is, given the types of operations they perform, they cannot be executed as a set of parallel processes. Algorithms in the Mahout library belong to the subset that can be executed in a distributed fashion.
http://en.wikipedia.org/wiki/list_of_machine_learning_algorithms
https://www.coursera.org/course/machlearning
https://mahout.apache.org/

Fuse

Makes HDFS look like a regular file system so that you can use ls, rm, cd, etc. directly on HDFS data.

Whirr

Apache Whirr is a set of libraries for running cloud services. Whirr provides:
- A cloud-neutral way to run services. You don't have to worry about the idiosyncrasies of each provider.
- A common service API. The details of provisioning are particular to the service.
- Smart defaults for services. You can get a properly configured system running quickly, while still being able to override settings as needed.
You can also use Whirr as a command line tool for deploying clusters.
https://whirr.apache.org/

Giraph

An open source graph processing API, like Pregel from Google.
https://giraph.apache.org/

Chukwa

Chukwa, an incubator project on Apache, is a data collection and analysis system built on top of HDFS and MapReduce. Tailored for collecting logs and other data from distributed monitoring systems, Chukwa provides a workflow that allows for incremental data collection, processing and storage in Hadoop. It is included in the Apache Hadoop distribution as an independent module.
https://chukwa.apache.org/

Drill

Apache Drill, an incubator project on Apache, is an open source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Drill is the open source version of Google's Dremel system, which is available as an IaaS service called Google BigQuery. One explicitly stated design goal is that Drill should be able to scale to 10,000 servers or more and to process petabytes of data and trillions of records in seconds.
http://incubator.apache.org/drill/

Impala (Cloudera)

Released by Cloudera, Impala is an open source project which, like Apache Drill, was inspired by Google's paper on Dremel. The purpose of both is to facilitate real-time querying of data in HDFS or HBase. Impala uses an SQL-like language that, though similar to HiveQL, is currently more limited than HiveQL. Because Impala relies on the Hive metastore, Hive must be installed on a cluster in order for Impala to work. The secret behind Impala's speed is that it "circumvents map reduce to directly access the data through a specialized distributed query engine that is very similar to those found in commercial parallel rdbmss." (source: Cloudera)
http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html
http://training.cloudera.com/elearning/impala/
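The three phases named in the MapReduce entry above (map, sort and shuffle, reduce) can be sketched in plain Java without any Hadoop dependency. This is only an in-memory illustration of the programming model using the classic word count example. The class `MapReduceSketch` and its methods are hypothetical names and do not correspond to the Hadoop API, and nothing here is distributed.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical single-process illustration of the model; not the Hadoop API. */
final class MapReduceSketch {

    /** Map phase: emit a (word, 1) pair for every word in one input line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Sort and shuffle phase: group all values by key, with keys kept in sorted order. */
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    /** Reduce phase: collapse the values collected for one key into a single count. */
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    /** Runs the three phases in order over all input lines. */
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        shuffle(pairs).forEach((word, ones) -> counts.put(word, reduce(ones)));
        return counts;
    }
}
```

In a real cluster the map calls would run on many nodes in parallel and the shuffle would move data across the network. The sketch only shows how data flows between the phases.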

June 3, 2015

by Umashankar Ankuri


Agrona Event Counters

Efficient open source event counters from the Agrona library.

Agrona

The Agrona library is an open source Java library of utility code. Unlike libraries such as Google Guava or Apache Commons, which are general purpose Java utility libraries, Agrona is targeted at providing high performance code. It initially consists of code from the open source Aeron messaging library.

Event Counters

One of the features of the Agrona library is the event counters framework. One of the design goals of Aeron was to be easy to monitor. We wanted to make sure that people could easily check up on what Aeron is doing with services such as Nagios, internally written monitoring software or just from the command line. Writing integrations with many services is a herculean task in and of itself, but we definitely wanted to be able to expose an API. We also didn't want to incorporate large third-party dependencies, any allocation-heavy code or things whose performance we couldn't control. This meant that we were going to have to write our own event counters rather than using something like the Coda Hale metrics framework. Our requirements for monitoring were very simple though:

- Update or increment the counter's value.
- Read or write the counter value from a different thread or process.
- No garbage creation after initial setup.
- Labels should be associated with each counter value for readability's sake.

Design

A thread-safe update of a long value is a very simple operation and is already supported in Java through the AtomicLong class. The problem with using an AtomicLong as your event counter is that an external program, running in a different process, can't read the value from your Java heap. Consequently Agrona's event counters are allocated in an off-heap buffer. This can be placed on a memory-mapped file, which means it can be shared between two different processes. In order to give our counters names we store a table of name and counter id entries on another buffer. This can be placed on the same memory-mapped file for convenience.

API

The CountersManager is responsible for allocating event counters. It needs to be instantiated with the buffers upon which to store the event counters and their labels. Here is an example of how to instantiate a counter from the counters manager.

    AtomicCounter conductorProxyFails = countersManager.newCounter("Failed offers to DriverConductorProxy");

You can also iterate over the current counter names and their ids. Here is some code that uses that to print a table of the event counter values:

    countersManager.forEach((id, label) -> {
        final int offset = CountersManager.counterOffset(id);
        final long value = valuesBuffer.getLongVolatile(offset);
        System.out.format("%3d: %,20d - %s\n", id, value, label);
    });

Each instance of AtomicCounter represents one event counter. Here are some examples of using the atomic counter in code.

    // Increment the counter in a thread-safe manner
    conductorProxyFails.increment();
    // Increment the counter if you're only writing from a single thread
    conductorProxyFails.orderedIncrement();
    // Atomically add 5 to the counter value
    conductorProxyFails.add(5);
    // Reset the counter
    conductorProxyFails.set(0);

Conclusions

I've just gone through a few simple examples of how to use the event counters from Agrona, which hopefully you've found useful. This isn't the only code in Agrona though: there are utilities for agents and for executing timing events, as well as collections such as queues, ring buffers and hash maps. We're also expanding the library, which is already on Maven Central. Currently documentation is a little bit thin on the ground, but contributions are always welcome.

Thanks to Martin Thompson and Chris West for feedback on this blog post.
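The design idea described above, a counter stored off-heap in a memory-mapped file so that another process can read it, can be sketched with only the JDK. This is not Agrona's implementation. The class `MappedCounters`, its layout (one 8-byte slot per counter id, no labels) and its method names are assumptions made for illustration, and it requires Java 9 or later for `VarHandle`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of file-backed counters; not the Agrona API. */
final class MappedCounters {
    // Gives atomic access to 8-byte long slots inside a ByteBuffer.
    private static final VarHandle LONGS =
        MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());

    private final ByteBuffer buffer;

    /** Maps (and creates if needed) a file holding counterCount slots of 8 bytes each. */
    MappedCounters(Path file, int counterCount) {
        ByteBuffer mapped;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0, (long) counterCount * Long.BYTES);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.buffer = mapped;
    }

    /** Atomically adds one to the counter and returns the new value. */
    long increment(int id) {
        return (long) LONGS.getAndAdd(buffer, id * Long.BYTES, 1L) + 1;
    }

    /** Reads the counter with volatile semantics. */
    long get(int id) {
        return (long) LONGS.getVolatile(buffer, id * Long.BYTES);
    }
}
```

A monitoring tool could, under these assumptions, map the same file in another process and call `get` without touching the writer's heap. No objects are allocated per increment, which matches the "no garbage after setup" requirement in spirit. How quickly writes become visible through a second mapping depends on the operating system.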

April 23, 2015

by Richard Warburton


Using Multiple Grok Statements to Parse a Java Stack Trace

Parse your Java stack trace log information with the Logstash tool.

April 14, 2015

by Bipin Patwardhan


To Shard, or Not to Shard

When I talk with customers about sharding decisions I often start by telling the following true story.

A couple of years ago, a customer came to me looking for advice on how to shard his system. He told me he was already convinced he needed to do that, since he had read that some smart people at MySQL giants like Facebook and Twitter were sharding, so naturally this was something he should be doing, too. I paused for a moment and then I asked him what the size of his database was. “10GB,” he said. I nodded and asked if he handled many queries or if they were very complicated. “No,” he said. “Just a few hundred queries per second, and they have not been loading down the system by more than a few percent.” I asked him whether he was expecting exponential growth in the near future, looking to double every week or something like that. “No, our load and data size grew about 7 percent last year and we expect about the same growth this year and for the foreseeable future.” My recommendation to him was not to waste time and effort on sharding, because it was simply not needed in his company’s case.

Before you decide how to shard, you’d best understand whether or not you really need to shard to begin with. Yes, on the extremely large-scale side of database demands, sharding is the only game in town. And not just for MySQL, but for pretty much any technology out there. Yet thanks to emerging technologies there is an increasing number of applications that can run databases without sharding. Today we can easily run with terabytes of data per MySQL instance and serve tens of thousands of queries in many OLTP environments. This allows organizations to build very large applications without needing to shard.

And keep this in mind: Sharding is a pain under all circumstances. Even if you have sharding provided out of the box by the database system, it is a pain because it introduces more components and complexity. Creating good distributed query execution plans is a very complicated task that needs to take network topology and load into account in addition to the data distribution and load of individual nodes.

Before you decide if you need to shard, you should look at alternatives to scale your application. In the MySQL world, the solutions are typically as follows:

Alternatives to Sharding

Functional Partitioning: In many environments a single MySQL instance becomes a dumping ground for all kinds of databases. You might end up having your main application share a database instance with Drupal, which powers your website, with WordPress, which powers your blog, and with vBulletin, which powers your forums. Splitting those pieces into different database instances is something you should consider before you look into sharding. Custom-made systems will often have many applications using different data sets that can be easily split out.

Replication: Many applications are read-heavy, so scaling reads becomes an issue earlier than scaling writes does. Replication is a great solution for this. MySQL’s built-in replication is very robust, though due to its asynchronous nature it adds complexity to the application. The developer must decide which reads can be served from the replica servers and which cannot because they must be absolutely certain to see the most recent data. This is the reason that alternative, synchronous replication technologies for MySQL, like Percona XtraDB Cluster, are gaining popularity: They provide single database-like behavior from the cluster in most cases.

Caching and Queueing: Caching is a great technology for reducing the number of reads that hit the database. There are many applications that have reduced read load on the database by 80-95% using this technology. Queueing, in contrast, optimizes writes. It does this by merging multiple write operations together so they hit the database efficiently. Most large-scale applications should rely heavily on both of these technologies. Memcached and Redis are two popular caching technologies in the MySQL space. For queueing, the most popular technologies are ActiveMQ and RabbitMQ [1].

Supplemental Technologies: MySQL is great at many things but not at everything. If you’re looking for high-performance full-text search, consider ElasticSearch, Sphinx, or Lucene. If you’re looking at large-scale data analytics, a Hadoop-based infrastructure or Vertica might work well for you. You should let MySQL handle the things it is good at, and leave the rest to supporting tools.

Optimizations to Make Before Sharding

Scaling isn’t just about architecture either. You also need to make sure your system is reasonably optimized. Many people decide sharding is inevitable for them even though there are much easier and more cost-effective ways to get the performance and scale they are looking for. All of which, I might add, are also going to be valuable if sharding is indeed eventually needed.

Hardware: Are you using the right hardware? I’ve seen many people looking into sharding when in fact simply purchasing decent hardware would solve their problems for years to come. Make sure you have plenty of memory and high-performance flash storage if you’re working with a large database. In many cases it can transform your system so much it will look like magic.

MySQL version and Configuration: Use a recent MySQL version. By that I mean the latest GA version (MySQL 5.6 at the time of this article’s publication). Percona Server, which is free, often offers additional performance improvements for demanding workloads. Use the most recent operating system too, especially if you’re using modern hardware. Finally, make sure MySQL is configured properly. The difference in MySQL performance between poorly configured MySQL and well-tuned MySQL can be 10x or more.
Schema and Queries: The same application logic can be expressed using a variety of schema and queries. I’ve seen a lot of similar applications approaching things differently, and the difference in the performance between an optimal approach and a poor one (but still used in production) can be 100x or more. Many of the changes can be retrofitted to existing schema—such as minor query changes and changes to the index structure—however, if your schema doesn’t fit your application needs well, then you might be looking at a complete redesign. So it is a good idea to think things through early. When to Shard So when should you start thinking about sharding? Basically, if none of the measures listed above have given you the performance you need, it might be time to consider sharding. Sharding does have the advantage of allowing you to potentially use lower-cost hardware or cheaper cloud instances. Most developers are using agile development methods these days and there is a common term, “Architectural Runway,” which defines how far the application can go with its current architecture. If you’ve already found success using replication in particular, it might be a bad decision to add sharding because it will force your developers to deal with the complexity of sharding and asynchronous replication. However, replication is still typically used to achieve high availability even if you’re already sharding, but in this case it’s not for scaling reads. If you’ve come to the point where you’re sure you need to shard, here are some of the questions you need to ask about how you’ll implement your sharding strategy: Shard Level: At which level should we shard? It does not have to be at the database level. Many applications, SaaS in particular, often “shard” on higher levels, deploying multiple copies of their full stack to offer complete isolation for availability, performance, security etc. 
In many large-scale applications you will see multiple copies of a full stack deployed, each having its own sharded MySQL environment. Shard Key: How do we shard? In many cases the choice is natural, such as the user account or the organization you authenticate against, but in other cases it is not so obvious. When making a sharding choice, you need to think about two things: 1) as many data accesses as possible should be served from a single shard, because cross-shard access is expensive if supported at all, and 2) the chosen key should not produce a shard that is too large to handle, either in terms of data size or traffic. For example, sharding by country is a poor idea because the resources needed to handle traffic from Belgium are not the same as for the United States or China, which require a lot more. Shard by Schema or Instance: What is the unit of your shard? The typical choices are MySQL instance or database (schema). I like the shard = database approach, which doesn’t limit you to a single MySQL instance per physical box. That way you do not have to run too many MySQL instances, but you can run more than one if the application works better that way. Shard Unit: If each shard is a single MySQL server, you will run into a problem with high availability very soon. When you have 100 MySQL servers there are roughly 100 times more chances for one of them to crash compared with having only one, so ensuring there is a high availability solution becomes critical. Instead of sharding across individual MySQL servers you will usually be sharding across “Replication Clusters,” such as one MySQL primary node and one or several replica or PXC (Percona XtraDB Cluster) nodes. Shard Technology: What technology can you use to assist you with sharding? Within the MySQL world there is as yet no standard sharding technology that everyone uses.
Most of the large web properties have implemented something in-house for their sharding needs, and some have released their solutions as open source projects. One example is Vitess, contributed by Google, and another is JetPants, contributed by Tumblr. Rolling out your own simple sharding framework might look easy to some developers until you have to deal with operational issues like balancing the shards, resharding, etc., on a large scale. There are a number of purpose-built technologies that can help you with sharding if this doesn’t sound like something your team can manage. Sharding Technologies Here are technologies that you should consider: MySQL Fabric: This is the sharding technology being developed by the MySQL team at Oracle. MySQL Fabric is GA, but its functionality right now is rather limited, especially in terms of its support for multi-shard queries. Given more time, however, it has the potential to become the standard sharding technology for MySQL. Tesora: Tesora has a proxy-based solution for MySQL sharding that became open source some time ago. I would look at Tesora especially if you’re also looking at deploying OpenStack, as they’ve invested a lot into the integration. ScaleArc: ScaleArc is a commercial database proxy solution that can do caching, filtering, routing, and sharding. It is a pretty mature solution that handles multiple database technologies and not just MySQL. ScaleBase: ScaleBase is a sharding solution designed specifically for MySQL and the cloud, which, like the other proxy-based products listed here, operates at the proxy level. There are many technologies in the MySQL space that can help you scale your application without sharding. If you’re going to build the next “Facebook,” however, you will surely need to shard, and there are a number of technologies that can help you do it as painlessly as possible.
Large-scale applications on large-scale databases will always introduce complexity, which makes them more complicated to develop against and manage. Success comes with cost. [1] http://dzone.com/research/guide-to-enterprise-integration Peter Zaitsev co-founded Percona in 2006, assuming the role of CEO. Percona helps companies of all sizes maximize their success with MySQL. Peter enjoys mixing business leadership with hands on technical expertise. Peter is also the co-author of O’Reilly’s High Performance MySQL, one of the most popular books on MySQL performance.
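To make the "Shard Key" and "shard = database" points from the article above a little more concrete, here is a minimal Java sketch that routes a user ID to a logical shard (a database) and that shard to a MySQL instance. Everything specific in it is an assumption made for illustration and is not prescribed by the article: the modulo-based mapping, the app_shard_NNN database naming, and the host names are all hypothetical.

```java
class ShardRouterDemo {
    public static void main(String[] args) {
        // 8 logical shards (databases) spread over 2 hypothetical MySQL instances.
        ShardRouter router = new ShardRouter(8, "db-a.example.internal", "db-b.example.internal");
        for (long userId : new long[] {42L, 43L, 1_000_003L}) {
            System.out.println(userId + " -> " + router.databaseFor(userId)
                    + " on " + router.instanceFor(userId));
        }
    }
}

/** Maps a shard key (here: a user ID) to a logical shard and to the instance hosting it. */
class ShardRouter {
    private final int shardCount;      // number of logical shards, one database each
    private final String[] instances;  // hypothetical MySQL instance host names

    ShardRouter(int shardCount, String... instances) {
        if (shardCount <= 0 || instances.length == 0) {
            throw new IllegalArgumentException("need at least one shard and one instance");
        }
        this.shardCount = shardCount;
        this.instances = instances.clone();
    }

    /** Logical shard number; floorMod keeps negative keys inside 0..shardCount-1. */
    int shardFor(long userId) {
        return (int) Math.floorMod(userId, (long) shardCount);
    }

    /** Database (schema) name for the shard, using an assumed naming scheme. */
    String databaseFor(long userId) {
        return String.format("app_shard_%03d", shardFor(userId));
    }

    /** Instance hosting the shard; several databases may share one instance. */
    String instanceFor(long userId) {
        return instances[shardFor(userId) % instances.length];
    }
}
```

All rows for one user land in one database, which keeps most access within a single shard. A plain modulo makes changing the shard count expensive, and a lookup table in place of the second modulo would be needed to move a single database to another instance, which hints at the operational issues the article mentions.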

April 2, 2015

by Peter Zaitsev

· 21,045 Views · 1 Like

Test your C++ skills - find bugs in popular open-source projects

The authors of the PVS-Studio static code analyzer invite programmers to test their eyesight by trying to find errors in C/C++ code fragments. Code analyzers work tirelessly and are able to find many bugs that can be difficult to notice. We chose some code fragments in which we had found errors using PVS-Studio.

The quiz is not intended to check C++ language knowledge. There are many good and interesting tests for that; for instance, we would recommend this C++ Quiz. We made our test just for fun.

We quite frequently hear the opinion that code analyzers are pointless tools: it is possible to find a misplaced parenthesis or comma in five seconds, an analyzer will not find difficult logical errors, and therefore such a tool is useful only for students. We decided to troll these people. There is a time limit in the test, so we ask them to find an error in five seconds. Well, OK, not in five seconds, but in a minute. Fifteen randomly selected problems are shown. Every solved problem is worth one point, but only if the answer is given within one minute.

We want to stress that we are not talking about syntax errors. We found all these code fragments in open-source projects, and all of them compile flawlessly. Let us explain with a pair of examples how to point out the correct answer.

First example. Suppose you get a code fragment in which the programmer accidentally made a misprint and wrote index 3 instead of index 2. In the illustrated example the bug is highlighted in red; of course, there is no such highlighting in a quiz problem. Moving the mouse cursor highlights fragments of code, such as words and numbers. You should point the cursor at the number 3 and press the left mouse button. This would be the correct answer.

Second example. It is not always possible to point out the error exactly. Suppose the buffer size should be compared with the number 48, but an extra sizeof() operator was put there by accident. As a result, the buffer size is compared with the size of the int type.
In my opinion, the error there is in the sizeof operator, and that is what you need to point at to score a correct answer. However, without knowing the rest of the code, it is also possible to reason another way: the sizeof operator should have evaluated the size of some buffer, but was accidentally applied to the macro, so the error is in the use of “SSL3_MASTER_SECRET_LENGTH”. In this case, the answer will be scored no matter which you choose: “sizeof” or “SSL3_MASTER_SECRET_LENGTH”.

Good luck! You can start the game.

Footnote. The test does not support mobile devices: it is very easy to miss with a finger. We are working on a new version of the test with better support for mobile devices, new problems to solve, etc. However, it is not implemented yet. We invite you to follow us on Twitter to read our news and to learn about new things in the C++ world.

December 26, 2014

by Andrey Karpov

· 12,547 Views

5 Error Tracking Tools Java Developers Should Know

Raygun, Stack Hunter, Sentry, Takipi and Airbrake: Modern developer tools to help you crush bugs before bugs crush your app With the Java ecosystem going forward, web applications serving growing numbers of requests and users’ demand for high performance - comes a new breed of modern development tools. A fast paced environment with rapid new deployments requires tracking errors and gaining insight to an application's behavior on a level traditional methods can’t sustain. In this post we’ve decided to gather 5 of those tools, see how they integrate with Java and find out what kind of tricks they have up their sleeves. It’s time to smash some bugs. Raygun Mindscape’s Raygun is a web based error management system that keeps track of exceptions coming from your apps. It supports various desktop, mobile and web programming languages, including Java, Scala, .NET, Python, PHP, and JavaScript. Besides that, sending errors to Raygun is possible through a REST API and a few more Providers (that’s how they call language and framework integrations) came to life thanks to developer community involvement. Key Features: Error grouping - Every occurrence of a bug is presented within one group with access to single instances of it, including its stack trace. Full text search - Error groups and all collected data is searchable. View app activity - Every action on an error group is displayed for all your team to see: status updates, comments and more. Affected users - Counts of affected users appear by each error. External integrations - Github, Bitbucket, Asana, JIRA, HipChat and many more. The Java angle: To use Raygun with Java, you’ll need to add some dependencies to your pom.xml file if you’re using Maven or add the jars manually. The second step would be to add an UncaughtExceptionHandler that would create an instance of RaygunClient and send your exceptions to it. In addition, you can also add custom data fields to your exceptions and send them together to Raygun. 
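The handler step described above can be sketched with nothing but the JDK. This is a minimal illustration under stated assumptions, not Raygun's actual integration: ErrorReporter is a hypothetical stand-in for whatever client object forwards the exception to the tracking service, and here it only collects exceptions in a list.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class ReportingHandlerDemo {
    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a real client: just remember what would have been sent.
        List<Throwable> sent = new CopyOnWriteArrayList<>();
        Thread.setDefaultUncaughtExceptionHandler(new ReportingHandler(sent::add));

        Thread worker = new Thread(() -> {
            throw new IllegalStateException("boom");
        });
        worker.start();
        worker.join(); // the handler runs on the worker thread before it terminates

        System.out.println("Reported: " + sent);
    }
}

/** Hypothetical abstraction over an error tracking client. */
interface ErrorReporter {
    void report(Throwable error);
}

/** Forwards every uncaught exception to the configured reporter. */
class ReportingHandler implements Thread.UncaughtExceptionHandler {
    private final ErrorReporter reporter;

    ReportingHandler(ErrorReporter reporter) {
        this.reporter = reporter;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable error) {
        reporter.report(error);
    }
}
```

In a real setup the reporter would wrap the vendor's client and attach any custom data fields before sending.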
The full walkthrough is available here. Behind the curtain: Meet Robie Robot, the certified operator of Raygun. As in, the actual ray gun. Check it out on: https://raygun.io Sentry Started as a side-project, Sentry is an open-source, web-based solution that serves as a real-time event logging and aggregation platform. It monitors errors and displays when, where and to whom they happen, promising to do so without relying solely on user feedback. Supported languages and frameworks include Ruby, Python, JS, Java, Django, iOS, .NET and more. Key Features: See the impact of new deployments in real time. Provide support to specific users interrupted by an error. Detect and thwart fraud as it’s attempted - notifications of unusual amounts of failures on purchases, authentication, and other sensitive areas. External integrations - GitHub, HipChat, Heroku, and many more. The Java angle: Sentry’s Java client is called Raven and supports major existing logging frameworks like java.util.logging, Log4j, Log4j2 and Logback with Slf4j. An independent method to send events directly to Sentry is also available. To set up Sentry for Java with Logback, for example, you’ll need to add the dependencies manually or through Maven, then add a new Sentry appender configuration, and you’re good to go. Instructions are available here. Behind the curtain: Sentry started as an internal project at Disqus back in 2010, created by Chris Jennings and David Cramer to solve exception logging on a Django application. Check it out on: https://www.getsentry.com/ Takipi Unlike most of the other tools, Takipi is far more than a stack trace prettifier. It was built with a simple objective in mind: telling developers exactly when and why production code breaks. Whenever a new exception is thrown or a log error occurs, Takipi captures it and shows you the variable state which caused it, across methods and machines.
Takipi will overlay this on the actual code that was executing at the moment of the error, so you can analyze the exception as if you were there when it happened. Key features: Detect - Caught/uncaught exceptions, HTTP and logged errors. Prioritize - How often errors happen across your cluster, whether they involve new or modified code, and whether that rate is increasing. Analyze - See the actual code and variable state, even across different machines and applications. Easy to install - No code or configuration changes needed. Less than 2% overhead. The Java angle: Takipi was built for production environments in Java and Scala. The installation takes less than a minute, and includes attaching a Java agent to your JVM. Behind the curtain: Each exception type and error has a unique monster that represents it. You can find these monsters here. Check it out on: http://www.takipi.com/ Airbrake Another tool that has exception tracking in its sights is Rackspace’s Airbrake, taking on the mission of “No More Searching Log Files”. It provides users with a web-based interface that includes a dashboard with error details and an application-specific view. Supported languages include Ruby, PHP, Java, .NET, Python and even… Swift. Key Features: Detailed stack traces, grouping by error type, users and environment variables. Team productivity - Filter important errors from the noise. Team collaboration - See who’s causing bugs and who’s fixing them. External integrations - HipChat, GitHub, JIRA, Pivotal and over 30 more. The Java angle: Airbrake officially supports only Log4j, although a Logback library is also available. Log4j2 support is currently lacking. The installation procedure is similar to Sentry’s: add a few dependencies manually or through Maven, add an appender, and you’re ready to start. Similarly, a direct way to send messages to Airbrake is also available with AirbrakeNotice and AirbrakeNotifier. More details are available here.
Behind the curtain: Airbrake was acquired by Exceptional, which then got acquired by Rackspace. Check it out on: https://airbrake.io/ StackHunter Currently in beta, Stack Hunter provides a self hosted tool to track your Java exceptions. A change of scenery from the past hosted tools. Other than that, it aims to provide a similar feature set to inform developers of their exceptions and help solve them faster. Key Features: A single self hosted web interface to view all exceptions Collections of stack trace data and context including key metrics such as total exceptions, unique exceptions, users affected, & sessions affected Instant email alerts when exceptions occur Exceptions grouping by root cause The Java angle: Built specifically for Java, StackHunter runs on any servlet container running Java 6 or above. Installation includes running StackHunter on a local servlet, configuring an outgoing mail server for alerts, and configuring the application you’re wishing to log. Full instructions are available here. Behind the curtain: StackHunter is developed by Dele Taylor, who also works on Data Pipeline - a tool for transforming and migrating data in Java. Check it out on: http://stackhunter.com/ Bonus: ABRT Another approach to error tracking worth mentioning is used by ABRT, an automatic bug detection and reporting tool from the Fedora ecosystem, which is a Red Hat sponsored community project. Unlike the 5 tools we covered here, this one is intended to be used not only by app developers - but their users as well. Reporting bugs back to Red Hat with richer context that otherwise would have been harder to understand and debug. The Java angle: Support for Java exceptions is still in its proof of concept stage. A Java connector developed by Jakub Filák is available here. Behind the curtain: ABRT is an open-source project developed by Red Hat. Check it out on: https://github.com/abrt/abrt Did we miss any other tools? How do you keep track of your exceptions? 
Please let me know in the comments section below.

September 18, 2014

by Chen Harel

· 8,797 Views · 2 Likes

Jar Hell Made Easy - Demystifying the Classpath

Some of the hardest problems a Java developer will ever have to face are classpath errors: ClassNotFoundException, NoClassDefFoundError, Jar Hell, Xerces Hell and company. In this post we will go through the root causes of these problems, and see how a minimal tool (JHades) can help solve them quickly. We will see why Maven cannot (always) prevent classpath duplicates, and also:

The only way to deal with Jar Hell
Class loaders
The Class loader chain
Class loader priority: Parent First vs Parent Last
Debugging server startup problems
Making sense of Jar Hell with JHades
Simple strategy for avoiding classpath problems
The classpath gets fixed in Java 9?

The only way to deal with Jar Hell

Classpath problems can be time-consuming to debug, and tend to happen at the worst possible times and places: before releases, and often in environments where the development team has little to no access. They can also happen at the IDE level, and become a source of reduced productivity. We developers tend to find these problems early and often, and the usual response is frustration. Let's try to save ourselves some hair and get to the bottom of this.

These types of problems are hard to approach via trial and error. The only real way to solve them is to really understand what is going on, but where to start? It turns out that Jar Hell problems are simpler than they look, and only a few concepts are needed to solve them. In the end, the common root causes for Jar Hell problems are:

a Jar is missing
there is one Jar too many
a class is not visible where it should be

But if it's that simple, then why are classpath problems so hard to debug?

Jar Hell stack traces are incomplete

One reason is that the stack traces for classpath problems are missing a lot of the information needed to troubleshoot the problem.
Take for example this stack trace:

    java.lang.IncompatibleClassChangeError: Class org.jhades.SomeServiceImpl does not implement the requested interface org.jhades.SomeService
        org.jhades.TestServlet.doGet(TestServlet.java:19)

It says that a class does not implement a certain interface. But if we look at the class source:

    public class SomeServiceImpl implements SomeService {
        @Override
        public void doSomething() {
            System.out.println("Call successful!");
        }
    }

Well, the class clearly implements the missing interface! So what is going on then? The problem is that the stack trace is missing a lot of information that is critical to understanding the problem. The stack trace should probably have contained an error message such as this (we will learn what this means):

    The Class SomeServiceImpl of class loader /path/to/tomcat/lib does not implement the interface SomeService loaded from class loader Tomcat - WebApp - /path/to/tomcat/webapps/test

This would be at least an indication of where to start:

Someone new to Java would at least know that there is this notion of a class loader that is essential to understanding what is going on
It would make clear that one of the classes involved (SomeServiceImpl) was not being loaded from a WAR, but somehow from some directory on the server

What is a Class Loader?

To start, a Class Loader is just a Java class, or more exactly an instance of a class at runtime. It is NOT an inaccessible internal component of the JVM like, for example, the garbage collector. Take for example the WebAppClassLoader of Tomcat, here is its javadoc. As you can see it's just a plain Java class, and we can even write our own class loader if needed. Any subclass of ClassLoader will qualify as a class loader. The main responsibilities of a class loader are to know where class files are located, and then to load classes on JVM demand.
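To illustrate the point that a class loader is just a Java class, here is a minimal sketch of a ClassLoader subclass. It is an illustration only and not part of JHades or Tomcat: LoggingClassLoader is a hypothetical name, and all it does is record which class names were requested before delegating to the standard behavior.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class LoggingClassLoaderDemo {
    public static void main(String[] args) throws ClassNotFoundException {
        LoggingClassLoader loader =
                new LoggingClassLoader(LoggingClassLoaderDemo.class.getClassLoader());

        Class<?> listClass = loader.loadClass("java.util.ArrayList");

        System.out.println("Requested:  " + loader.requestedClassNames());
        // JDK core classes report a null class loader (the bootstrap loader).
        System.out.println("Defined by: " + listClass.getClassLoader());
    }
}

/** A plain subclass of ClassLoader that records every class name it is asked for. */
class LoggingClassLoader extends ClassLoader {
    private final List<String> requested = new CopyOnWriteArrayList<>();

    LoggingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        requested.add(name);
        // Keep the standard lookup behavior of java.lang.ClassLoader.
        return super.loadClass(name, resolve);
    }

    List<String> requestedClassNames() {
        return requested;
    }
}
```

A loader that actually finds class files in its own locations would additionally override findClass; this sketch only shows that the loading entry point is ordinary, overridable Java code.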
Everything is linked to a class loader

Each object in the JVM is linked to its Class via getClass(), and each class is linked to a class loader via getClassLoader(). This means that every object in the JVM is linked to a class loader! Let's see how this fact can be used to troubleshoot a classpath error scenario.

How to find where a class file really is

Let's take an object and see where its class file is located in the file system:

    System.out.println(service.getClass()
        .getClassLoader()
        .getResource("org/jhades/SomeServiceImpl.class"));

This is the full path to the class file:

    jar:file:/Users/user1/.m2/repository/org/jhades/jar-2/1.0-SNAPSHOT/jar-2-1.0-SNAPSHOT.jar!/org/jhades/SomeServiceImpl.class

As we can see, the class loader is just a runtime component that knows where in the file system to look for class files and how to load them. But what happens if the class loader cannot find a given class?

The Class loader Chain

In the JVM, class loaders are linked in a chain: a request for a class that one class loader does not resolve itself is passed on to its parent class loader, and so forth. This continues all the way up to the JVM bootstrap class loader (more on this later). This chain of class loaders is the class loader delegation chain.

Class loader priority: Parent First vs Parent Last

Some class loaders delegate requests immediately to the parent class loader, without searching first in their own known set of directories for the class file. A class loader operating in this mode is said to be in Parent First mode. If a class loader first looks for a class locally and only afterwards queries the parent if the class is not found, then that class loader is said to be working in Parent Last mode.

Do all applications have a class loader chain?
Even the simplest Hello World main method has 3 class loaders:

The Application class loader, responsible for loading the application classes (parent first)
The Extensions class loader, which loads jars from $JAVA_HOME/jre/lib/ext (parent first)
The Bootstrap class loader, which loads any class shipped with the JDK, such as java.lang.String (no parent class loader)

What does the class loader chain of a WAR application look like?

In the case of application servers like Tomcat or Websphere, the class loader chain is configured differently than in a simple Hello World main method program. Take for example the Tomcat class loader chain: each WAR runs in a WebAppClassLoader that works in parent last mode (it can be set to parent first as well), while the Common class loader loads libraries installed at the level of the server.

What does the Servlet spec say about class loading?

Only a small part of the class loader chain behavior is defined by the Servlet container specification:

The WAR application runs in its own application class loader, which might or might not be shared with other applications
The files in WEB-INF/classes take precedence over everything else

After that, it's anyone's guess! The rest is completely open to interpretation by container providers.

Why isn't there a common approach for class loading across vendors?

Usually open source containers like Tomcat or Jetty are configured by default to look for classes in the WAR first, and only then search in server class loaders. This allows applications to use their own versions of libraries that override the ones available on the server.

What about the big iron servers?

Commercial products like Websphere will try to 'sell' you their own server-provided libraries, which by default take precedence over the ones installed in the WAR. This is done assuming that if you bought the server you also want to use the JEE libraries and versions it provides, which is often NOT the case.
This makes deploying to certain commercial products a huge hassle, as they behave differently than the Tomcat or Jetty that developers use to run applications on their workstations. We will see a solution for this further on.

Common Problem: duplicate class versions

At this moment you probably have a huge question: What if there are two jars inside a WAR that contain the exact same class? The answer is that the behavior is indeterminate: only at runtime will one of the two classes be chosen. Which one gets chosen depends on the internal implementation of the class loader; there is no way to know upfront. But luckily most projects these days use Maven, and Maven solves this problem by ensuring only one version of a given jar is added to the WAR. So a Maven project is immune to this particular type of Jar Hell, right?

Why Maven does not prevent classpath duplicates

Unfortunately Maven cannot help in all Jar Hell situations. In fact, many Maven projects that don't use certain quality control plugins can have hundreds of duplicate class files on the classpath (I have seen trunks with over 500 duplicates). There are several reasons for that:

Library publishers occasionally change the artifact name of a jar: This happens due to re-branding or other reasons. Take the JAXB jar as an example: when the same library is published under different artifact names, there is no way Maven can identify those artifacts as being the same jar!
Some jars are published with and without dependencies: Some library providers provide a 'with dependencies' version of a jar, which includes other jars inside. If we have transitive dependencies on the two versions, we will end up with duplicates.
Some classes are copied between jars: Some library creators, when faced with the need for a certain class, will just grab it from another project and copy it to a new jar without changing the package name.

Are all class file duplicates dangerous?
If the duplicate class files exist inside the same class loader, and the two duplicate class files are exactly identical, then it does not matter which one gets chosen first - this situation is not dangerous. If the two class files are inside the same class loader and they are not identical, then there is no way to know which one will be chosen at runtime - this is problematic and can manifest itself when deploying to different environments. If the class files are in two different class loaders, then they are never considered identical (see the class identity crisis section further on).

How can WAR classpath duplicates be avoided?

This problem can be avoided, for example, by using the Maven Enforcer Plugin with the extra rule Ban Duplicate Classes turned on. You can also quickly check if your WAR is clean using the JHades WAR duplicate classes report. This tool has an option to filter 'harmless' duplicates (same class file size). But even a clean WAR might have deployment problems: classes missing, classes taken from the server instead of the WAR and thus with the wrong version, class cast exceptions, etc.

Debugging the classpath with JHades

Classpath problems often show up when the application server is starting up, which is a particularly bad moment, especially when deploying to an environment where there is limited access. JHades is a tool to help deal with Jar Hell (disclaimer: I wrote it). It's a single Jar with no dependencies other than JDK 7 itself. This is an example of how to use it:

    new JHades()
        .printClassLoaders()
        .printClasspath()
        .overlappingJarsReport()
        .multipleClassVersionsReport()
        .findClassByName("org.jhades.SomeServiceImpl");

This prints to the screen the class loader chain, jars, duplicate classes, etc.

Debugging server startup problems

JHades works well in scenarios where the server does not start properly.
A servlet listener is provided that allows printing classpath debugging information even before any other component of the application starts running.

ClassCastException and the Class Identity Crisis

When troubleshooting Jar Hell, beware of ClassCastExceptions. A class is identified in the JVM not only by its fully qualified class name, but also by its class loader. This is counterintuitive but in hindsight makes sense: we can create two different classes with the same package and name, ship them in two jars and put them in two different class loaders. One, let's say, extends ArrayList and the other is a Map. The classes are therefore completely different (despite the same name) and cannot be cast to each other! The runtime will throw a ClassCastException (CCE) to prevent this potential error case, because there is no guarantee that the classes are castable. Adding the class loader to the class identifier was the outcome of the Class Identity Crisis that occurred in earlier Java days.

A Strategy for Avoiding Classpath Problems

This is easier said than done, but the best way to avoid classpath-related deployment problems is to run the production server in Parent Last mode. This way the class versions of the WAR take precedence over the ones on the server, and the same classes are used in production and on a developer workstation, where it's likely that Tomcat, Jetty or another open source Parent Last server is being used. In certain servers like WebSphere this is not sufficient, and you also have to provide special properties in the manifest file to explicitly turn off certain libraries, for example JAX-WS.

Fixing the classpath in Java 9

In Java 9 the classpath is expected to be completely revamped by the new Jigsaw modularity system. Under the current plans, a jar can be declared as a module and it will run in its own isolated class loader, which reads class files from other similar module class loaders in an OSGi sort of way.
This should allow multiple versions of the same jar to coexist in the same application if needed.

Conclusions

In the end, Jar Hell problems are not as low level or unapproachable as they might seem at first. It's all about zip files (jars) being present or not being present in certain directories, how to find those directories, and how to debug the classpath in environments with limited access. By knowing a limited set of concepts such as Class Loaders, the Class Loader Chain and Parent First / Parent Last modes, these problems can be tackled effectively.

External links

The presentation "Do you really get class loaders?" by Jevgeni Kabanov of ZeroTurnaround (the JRebel company) is a great resource about Jar Hell and the different types of classpath-related exceptions.
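
As a rough illustration of the duplicate class problem discussed in this article, the sketch below asks the class loader chain for every location that offers a given class file. It uses only the standard ClassLoader.getResources call; it is not how JHades is implemented, and the class name DuplicateClassFinder and its methods are made up for this example. The default class name is the placeholder used earlier in the article.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of JHades: lists where a class can be found.
class DuplicateClassFinder {

    /** Converts a fully qualified class name into its classpath resource name. */
    static String toResourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns every location in the class loader chain that offers this class file. */
    static List<URL> locations(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            // getResources walks the parent loaders as well as this loader.
            return Collections.list(loader.getResources(toResourceName(className)));
        } catch (IOException e) {
            throw new IllegalStateException("could not scan classpath for " + className, e);
        }
    }

    public static void main(String[] args) {
        String className = args.length > 0 ? args[0] : "org.jhades.SomeServiceImpl";
        List<URL> found = locations(className);
        System.out.println(className + " found in " + found.size() + " location(s)");
        for (URL url : found) {
            System.out.println("  " + url);
        }
        if (found.size() > 1) {
            System.out.println("More than one copy is visible; which one wins depends on the class loader.");
        }
    }
}
```

Run from inside the application (for example from a servlet listener), a check like this shows whether a suspicious class is visible from more than one jar. It does not tell you whether the copies are identical; comparing the class files is a separate step.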

September 8, 2014

by Vasco Cavalheiro

· 55,126 Views · 7 Likes

Using the OpenXML SDK Productivity Tool to "decompile" Office Documents

Ode To Code - Easily Generate Microsoft Office Files From C#

"... These days, Office files are no longer in a proprietary binary format, and we can create the files directly without using COM automation. A .docx Word file, for example, is a collection of XML documents zipped into a single file. The official name of the format is Open XML. There is an SDK to help with reading and writing OpenXML, and a Productivity Tool that can generate C# code for a given file. All you need to do is load a document, presentation, or workbook into the tool and press the “Reflect Code” button. The downside to this tool is that even a simple document will generate 4,000 lines of code. Another downside is that the generated code assumes it will write directly to the file system, however it is easy to pass in an abstract Stream object instead. So while this code isn’t perfect, the code does produce valid document and..."

I've been blogging about the OpenXML SDK for years now, but I think this is the first time I've seen this part of it, this utility. And like he says, 4K LoC is, well, a lot, but it does look like an awesome way to learn the low level OpenXML SDK ins and outs.

Related Past Post XRef: Open Sesame - Open XML SDK is now open source Using OpenXML to load an Excel Worksheet into a DataTable (or just how different OpenXML is from the old Excel API we're used too) Using OpenXML SDK to generate Word documents via templates (and without Word being installed) Checking for Microsoft Word DocX/DocM Revisions/Track Changes without using Word... (via OpenXML SDK, LINQ to XML or XML DOM) LINQ to XlsX... Using VB.Net, LINQ, the OpenXML SDK and a little C# helper, to query an Excel XlsX Using native OpenXML to create an XlsX (Which provides an example of why I highlight tools that make OpenXML easier...) Generating Xlsx's on the Server? You're using OpenXML, right? With help from the PowerTools for OpenXML? 
Official boat-load, as in supertanker, sized OpenXML content list (Insert "One OpenXML content list to rule them all" here) So how do I get from here to OpenXML? Got a map for you, an Open XML SDK Blog Map… Where to go to scratch your OpenXML dev info itch… "Open XML Explained" Free eBook (PDF) The Noob's Guide to Open XML Dev (If you know how to spell OpenXML but that's about it, this is your Getting Started guide...) Reusing the PowerShell PowerTools for Open XML in your C# or VB.Net world PowerShell, OpenXML, WMI and the PowerTools for OpenXML = Doc generation for our inner geek Because it’s a PowerShell kind of day… PowerTools for Open XML V1.1 Released OpenXML PowerTools updated – Cell your Excel via PowerShell Powering into OpenXML with PowerShell Open XML SDK 2.0 for Microsoft Office Released – Automate Office documents without Office Open XML 2.0 Code Snippets for VS2010 (and VS2008 too) Open XML Format SDK 2.0 Code Snippets for Visual Studio 2008 – 52 C#/VB Code Snippets to help ease your Open XML coding Open XML File Format Code Snippets for Visual Studio 2005 (Office 2007 NOT required) Open XML SDK v1 Released OpenXML Viewer 1.0 Released – Open source DocX to HTML conversion, with IE, Firefox and Opera (and/or command line) support
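
To make the "collection of XML documents zipped into a single file" point from the quoted post concrete, here is a small sketch that lists the parts inside an Open XML package using nothing but the JDK's zip support. It is written in Java rather than C# and does not use the OpenXML SDK; the class OpenXmlParts and its methods are invented for this illustration, and the in-memory sample is only a stand-in with two typical part names, not a valid Word document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: treats a .docx/.xlsx/.pptx file as the zip archive it is.
class OpenXmlParts {

    /** Lists the names of the parts (zip entries) inside an Open XML package. */
    static List<String> listParts(InputStream in) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a tiny stand-in package in memory; it is NOT a valid Word document. */
    static byte[] samplePackage() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            for (String name : new String[] {"[Content_Types].xml", "word/document.xml"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<placeholder/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Pass a path to a real .docx to inspect it; otherwise the stand-in is used.
        try (InputStream in = args.length > 0
                ? Files.newInputStream(Paths.get(args[0]))
                : new ByteArrayInputStream(samplePackage())) {
            for (String part : listParts(in)) {
                System.out.println(part);
            }
        }
    }
}
```

Pointing this at a real document is a quick way to see which XML parts the generated code from the Productivity Tool ends up writing.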

July 31, 2014

by Greg Duncan

· 16,591 Views

The Mobile Landscape: Cross-Platform Problems and Solutions

This article was originally published in DZone's 2014 Guide to Mobile Development Mobile development has become a ubiquitous part of the software industry, and most developers understand the central dilemma organizations face when building a mobile app: cross-platform development. What options exist for deploying an app to multiple platforms simultaneously? What are the strengths and weaknesses of each platform? The backbone of mobile development is the native application, but there are a growing number of alternatives: web apps provide a browser-based solution, hybrid apps leverage web development skills in a native package, and code translators apply one platform’s native development skillset to the codebase of another. However, the differences can be subtle, and every option carries its own set of drawbacks. NATIVE DEVELOPMENT Native applications are built from the ground up for a specific platform and tailored to fit it. The precise, platform-centered nature of native development means that these apps have no limits in terms of access to APIs and device features, performance optimization, and platform-specific best practices for user interface design. Ideally, every mobile app would be built this way: to suit its exact purpose while utilizing all of the available resources. One of the major benefits of native mobile development is the availability of resources. For example, developers targeting Android have the Android Software Development Kit (SDK) at their disposal, which includes a suite of tools to streamline the development process: the SDK Manager condenses updates and tool installations into a single menu, the AVD Manager provides access to the Android Emulator and other virtual devices, and the Dalvik Debug Monitor Server (DDMS) is a powerful debugging tool, just to name a few. 
iOS and Windows Phone developers have similar toolsets available in their SDKs, covering everything from the UI and device feature tools of Cocoa Touch in the iOS SDK to the real world testing conditions of the Simulation Dashboard for Windows Phone 8. These toolsets make native SDKs invaluable and thorough resources. Unfortunately, the native SDKs are all robust toolsets that a native developer has to learn for each platform. To develop native apps from scratch (rather than through an intermediate tool), developers must be skilled with the required language, IDE, and development tools for each targeted platform, and if developers with diverse skillsets are not available, additional developers must be hired. This can be a serious problem, given the increasing push to develop on multiple platforms. For example, according to DZone’s 2014 Mobile Developer Survey, 62% of respondents targeted both Android and iOS. The economic constraints of native development are a major factor in the growing popularity of web apps, hybrid apps, code translators, and Mobile Application Development Platforms (MADPs), which allow developers to reach multiple platforms with just one tooling ecosystem. WEB APPS The skillset for building a basic mobile web app is more common than that of native development. Essentially, mobile web apps are just regular websites optimized to look good and function well on mobile devices, and they can provide a quality app-like experience if the developer is very skilled in web technologies. Widely understood front-end web development languages such as HTML, CSS, and JavaScript provide the logic behind a web app, and there are plenty of tools and libraries out there to help web developers direct their skills toward mobile devices. jQuery Mobile and Sencha Touch are two examples of mobile web frameworks that provide UI components and logic for sliders, swipes, and other touch-activated controls that are common to native mobile applications. 
The community around open source web technologies is another key difference between native and web development. Web technologies like Node.js and AngularJS are some of the most popular projects in the open source community according to GitHub statistics. This suggests that the community support and knowledge base around web technologies is broader than native technologies. In addition to being a more common skill set, mobile web development can also solve a fundamental issue with native application development. Aside from possible browser compatibility issues, web apps present a near-universal cross-platform option. Most APIs and hardware features will not be accessible by web apps, and because they are not discrete applications in the same way that native apps are, web apps cannot be distributed through common means, such as Apple’s App Store and Google’s Android Marketplace. Web apps may be a particularly flexible option, but they lack a presence on fundamental mobile distribution. HYBRID APPS Many of the drawbacks for web apps are alleviated by another cross-platform option built on the same core web development skillset: the hybrid app. Like web apps, hybrid apps require web development skills, but unlike web apps, they include some native features to allow greater flexibility. It gets the name hybrid because it is built with web languages and technologies at its core. With the help of a native packaging tool, it can be deployed just like a native app and access more native device capabilities (device APIs) than a pure web application. A hybrid app is created by first coding the application to run in the device’s native webview, which is basically a stripped-down version of the browser. For iOS this view is called UIWebView, while on Android it’s called WebView. This view can present the HTML and JavaScript files in a full-screen format, and pure web apps can achieve this full-screen view as well. 
WebKit is the most commonly targeted browser rendering engine because it is used on iOS, Android, and Blackberry. Where a web app really starts to become a hybrid app is when the app is placed inside of a native wrapper, which packages the hybrid app as a discrete application and makes it viable for app store distribution. In addition to the native wrapper, a native bridge allows the app to communicate with device APIs, such as alarm settings, accelerometers, and cameras. The native bridge is an abstraction layer that exposes the device APIs to the hybrid app as a JavaScript API. This is one feature that clearly separates hybrid and pure web apps, because web apps are unable to pass through the security structures between the browser and native device APIs. Access to many of the hardware features on mobile devices makes hybrid apps feel more like native apps than web apps from the user perspective. MADPS AND CODE TRANSLATORS Some tools can go even further in terms of taking a single codebase and deploying it on multiple mobile platforms. MADPs are development tools, sometimes including a mobile middleware server, that build hybrid or native apps for each platform using one codebase. Some MADPs, such as Appcelerator’s Titanium and Trigger.io, can take advantage of native elements where native is necessary or higher performing. UI widgets may be native, for instance, while a more flexible JavaScript API condenses the universal parts of mobile development and maximizes code reuse. As more native elements are introduced, some of the drawbacks of native development reappear, such as the costly need for multiple skillsets. MADPs are most useful in scenarios where an application needs to work with many back-end data sources, many other mobile apps, or many operating systems. (Inspired by Trigger.io) A less comprehensive but more straightforward solution is to use code translators when building native apps for multiple operating systems. 
These tools take native code and translate it into another platform’s native code, or translate native code into a neutral low-level alternative, such as bytecode. One example is Google’s J2ObjC, which translates Java classes into their Objective-C equivalents, taking care of a lot of the initial development of an iOS version of the app. Although it’s much more than a code translator, a product called Xamarin does something similar by allowing developers working with C# and .NET in Visual Studio to produce a native ARM executable. They can then take advantage of ahead-of-time (AOT) or just-in-time (JIT) compilation to run their apps on iOS and Android in addition to Windows Phone. As is the case with hybrid apps, the UI presents a problem. Because UI development cannot be translated between platforms, code translators still require a significant knowledge of the native platform to write the UI. In other words, code translators can provide substantial benefits in terms of cutting down development time, but they’re not necessarily a “write once, run anywhere” solution.

NO SILVER BULLETS

Between native apps, web apps, hybrid apps, and the growing number of MADPs, there are a lot of options for mobile development. It’s important to note that there is no one solution that does everything. Some sacrifice affordability and accessibility for pure native performance, UI for easy cross-platform deployment, or ease of development for native authenticity. Even the simplest tools come with some degree of a learning curve. If a method with no trade-offs existed, the industry would adopt it en masse, and you would know about it. Because there are trade-offs, developers and decision-makers will have to recognize their needs, and the needs of their users, in order to determine the best way to approach mobile development.

Want to read more articles like this? Download the free guide today! 
2014 Guide to Mobile Development

DZone's 2014 Guide to Mobile Development provides an analysis of the current state of mobile development along with important strategies, tools, and insights for accelerating mobile development. It includes:

In-depth articles written by industry experts
Survey results from over 1000 mobile developers
Profiles on 39 mobile development tools and frameworks
And much more!

June 11, 2014

by Alec Noller

· 11,897 Views

Cyclop: A Web Based Editor for Cassandra Query Language

Cyclop is a web-based tool for querying Cassandra databases with features like syntax highlighting and query completion.

May 9, 2014

by Comsysto Gmbh

· 10,519 Views

The 7 Log Management Tools Java Developers Should Know

Splunk vs. Sumo Logic vs. Logstash vs. Graylog vs. Loggly vs. Papertrails vs. Splunk>Storm

Splunk, Sumo Logic, Logstash, Graylog, Loggly, Papertrails - did I miss someone? I'm pretty sure I did. Logs are like fossil fuels - we've been wanting to get rid of them for the past 20 years, but we're not quite there yet. Well, if that's the case I want a BMW! To deal with the growth of log data, a host of log management & analysis tools have been built over the last few years to help developers and operations make sense of the growing data. I thought it'd be interesting to look at our options and at each tool's selling point, from a developer's standpoint.

Splunk

As the biggest tool in this space, I decided to put Splunk in a category of its own. That's not to say it's the best tool for what you need, but more to give credit to a product that essentially created a new category.

Pros: Splunk is probably the most feature rich solution in the space. It's got hundreds of apps (I counted 537) to make sense of almost every format of log data, from security to business analytics to infrastructure monitoring. Splunk's search and charting tools are feature rich to the point that there's probably no set of data you can't get to through its UI or APIs.

Cons: Splunk has two major cons. The first, which is more subjective, is that it's an on-premise solution, which means that setup costs in terms of money and complexity are high. To deploy in a high-scale environment you will need to install and configure a dedicated cluster. As a developer, it's usually something you can't or don't want to do as your first choice. Splunk's second con is that it's expensive. To support a real-world application you're looking at tens of thousands of dollars, which most likely means you'll need sign-offs from higher-ups in your organization, and the process is going to be slow. If you've got a new app and you want something fast that you can quickly spin up and ramp as things progress - keep reading.
Some more enterprise log analyzers can be found here.

SaaS log analyzers

Sumo Logic

Sumo was founded as a SaaS version of Splunk, going so far as to imitate some of Splunk's features and visuals early on. Having said that, SL has developed into a full-fledged enterprise class log management solution.

Pros: SL is chock-full of features to reduce, search and chart mass amounts of data. Out of all the SaaS log analyzers, it's probably the most feature rich. Also, being a SaaS offering inherently means setup and ongoing operation are easier. One of Sumo Logic's main points of attraction is the ability to establish baselines and to actively notify you when key metrics change after an event such as a new version rollout or a breach attempt.

Cons: This one is shared across all SaaS log analyzers: you need to get the data to the service to actually do something with it. This means that you'll be looking at possible GBs (or more) uploaded from your servers. This can create issues on multiple fronts - as a developer, if you're logging sensitive data or PII you need to make sure it's redacted. There may be a lag between the time data is logged and the time it's visible to the service. There's additional overhead on your machines transmitting GBs of data, which really depends on your logging throughput. Sumo's pricing is also not transparent, which means you might be looking at a buying process which is more complex than swiping your team's credit card to get going.

Loggly

Loggly is also a robust log analyzer, focusing on simplicity and ease of use for a DevOps audience.

Pros: Whereas Sumo Logic has a strong enterprise and security focus, Loggly is geared more towards helping DevOps find and fix operational problems. This makes it very developer-friendly. Things like creating custom performance and DevOps dashboards are super-easy to do. Pricing is also transparent, which makes getting started easier.
Cons: Don't expect Loggly to scale into a full-blown infrastructure, security or analytics solution. If you need forensics or infrastructure monitoring you're in the wrong place. This is a tool mainly for DevOps to parse data coming from your app servers. Anything beyond that you'll have to build yourself.

Papertrails

Papertrails is a simple way to look and search through logs from multiple machines, in one consolidated easy-to-use interface. Think of it like tailing your log in the cloud, and you won't be too far off.

Pros: PT is what it is. A simple way to look at log files from multiple machines in a singular view in the cloud. The UX itself is very similar to looking at a log on your machine, and so are the search commands. It aims to do something simple and useful, and does it elegantly. It's also very affordable.

Cons: PT is mostly text based. Looking for any advanced integrations, predictive or reporting capabilities? You're barking up the wrong tree.

Splunk>Storm

This is Splunk's little (some may say step) SaaS brother. It's a pretty similar offering that's hosted on Splunk's servers.

Pros: Storm lets you experiment with Splunk without having to install the actual software on-premise, and contains many of the features available in the full version.

Cons: This isn't really a commercial offering, and you're limited in the amount of data you can send. It seems to be more of an online limited version of Splunk meant to help people test out the product without having to deploy first. A new service called Splunk Cloud is aimed at providing a full-blown Splunk SaaS experience.

Open source analyzers

Logstash

Logstash is an open source tool for collecting and managing log files. It's part of an open-source stack which includes Elasticsearch for indexing and searching through data and Kibana for charting and visualizing data. Together they form a powerful log management solution.
Pros: Being an open-source solution means you're inherently getting a lot of control and a very good price. Logstash uses three mature and powerful components, all heavily maintained, to create a very robust and extensible package. For an open-source solution it's also very easy to install and start using. We use Logstash and love it.

Cons: As Logstash is essentially a stack, you're dealing with three different products. That means that extensibility also becomes complex. Logstash filters are written in Ruby, Kibana is pure JavaScript and Elasticsearch has its own REST API as well as JSON templates. When you move to production, you'll also need to separate the three onto different machines, which adds to the complexity.

Graylog2

A fairly new player in the space, GL2 is an open-source log analyzer backed by MongoDB as well as Elasticsearch (similar to Logstash) for storing and searching through log errors. It's mainly focused on helping developers detect and fix errors in their apps. Also in this category you can find Fluentd, and Kafka, for which storing log data is also one of the main use cases. Phew, so many choices!

Takipi for logs

While this post is not about Takipi, I thought there's one feature it has which you might find relevant to all of this. The biggest disadvantage of all log analyzers, and of log files in general, is that the right data has to be put there by you first. From a dev perspective, it means that if an exception isn't logged, or the variable data you need to understand why it happened isn't there, no log file or analyzer in the world can help you. Production debugging sucks. One of the things we've added to Takipi is the ability to jump into a recorded debugging session straight from a log file error. This means that for every log error you can see the actual source code and variable values at the moment of error. You can learn more about it here.
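
The point about the right data having to be put into the log by you first can be shown with a few lines of plain java.util.logging. This is only a sketch: the logger name, the parseQuantity method and the order id are made up for the example, and it has nothing to do with how Takipi or any of the analyzers above work. It simply logs the exception together with the variable values that explain it, so that whichever tool ends up indexing the line has something useful to show.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example: log the exception AND the values needed to understand it.
class ContextualLogging {

    static final Logger LOG = Logger.getLogger("orders");

    /** Parses a quantity; on bad input it logs full context and returns 0 as a fallback. */
    static int parseQuantity(String orderId, String rawQuantity) {
        try {
            return Integer.parseInt(rawQuantity);
        } catch (NumberFormatException e) {
            // Passing the exception as the last argument keeps the stack trace in the
            // record; the message carries the variable data an analyzer can search on.
            LOG.log(Level.SEVERE,
                    "Bad quantity for order " + orderId + ": '" + rawQuantity + "'", e);
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseQuantity("A-17", "3"));
        System.out.println(parseQuantity("A-17", "3x"));
    }
}
```

Compare this with a bare "parse error" message: the stack trace and the offending values are exactly what is missing when you try to reconstruct a production failure from logs alone.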
This is one post where I would love to hear from you guys about your experiences with some of the tools mentioned (and some that I didn't). I'm sure there are things you would disagree with or would like to correct me on - so go ahead, the comment section is below and I would love to hear from you.

Originally posted on the Takipi blog

April 29, 2014

by Chen Harel

· 37,818 Views

OpenSource License Manager

What is a License Manager?

License managers are used to enforce license rights, or at least to support the enforcement. When you develop an open source program, there is not much you need to or can do to enforce license rights. The code is there, and if anyone just wants to abuse the program there is nothing technical that could stop them. Closed source programs are different. (Are they?) In that case the source code is not available to the client. It is not possible to alter the program so that it circumvents the license enforcement code, and thus there is a real role for license rights enforcement. But this is not true. The truth is that there is no fundamental difference between closed and open source code in this respect. Closed source code can also be altered. The ultimate “source” for the execution is there after all: the machine code. There are tools that help to analyze and decode the binary to a more or less human readable format, and thus it is possible to circumvent the license management. It is possible, and there is a great source of examples for it. On some sites hosted in some countries you can simply download the cracked version of practically any software. I do not recommend doing that, and not only for ethical reasons. You just never know which of the sites are funded by secret services or criminals (if there is any difference), and you never know if you install spy software on your machine along with the cracked version. Once I worked for a company where one of the success measurements of their software was the number of days after release until the cracked versions appeared on the different sites, compared to the same value for the competitor. The smaller the number was for their software, the happier they were. Were they crazy? Why were they happy to know that their software was cracked? 
When the number of days was only a single day, why did they not consider applying stronger license enforcement measures, like morphing code, hardware keys and so on? The answer is the following. This company knows very well that license management is not there to prevent unauthorized use. It can be used that way, but it will have two major effects which will ruin your business:

Writing license management code, you spend your time on non-productive code.
License management (this way) works against your customer.

Never implement license management against your customer. When your license management solution is too restrictive, you may restrict your customer's use of the software. When you deliver your code using a hardware key, you impose inconvenience on your customer. When you bind your license to the Ethernet MAC address of the machine the application is running on, again: you work against your customer.

Set != Set

Face the sad truth: there will always be people who use your software without paying for it. They are not your customers. Do they steal from you? Not necessarily. If there is someone who is not buying your software, he is not your customer. If you knew that there was no way they would pay for the software, and the decision was in your hands whether they use your software or that of your competitor, what would you choose? I guess you would like your software to be used, to get more feedback and more knowledge even from non-customers. People using your software are more likely to become your customers than people not using it. This is why big companies sell educational licenses to universities and other academic institutions. Should we use license management at all in that case? Is license management bad to the core in all aspects? My answer is that it is not. There is a correct use case for license management, even when the software is open source (but not free, like Atlassian products). 
To find and understand this use case there is one major thing to understand: the software is for the customer, and any line in the code has to support the customers in reaching their business goals. Paying the fee for the software is also for the customers. If nobody finances a piece of software, the software will die. There is no such thing as a free lunch. Somebody has to pay for it. To become a customer and pay for the software used is the most straightforward business model, and it provides the strongest feedback and control for the customer over the vendor to get the features needed. At the same time, paying for the software is not the core business of the customer. Paying for the resources used supports their business goals only indirectly. This is where license management comes into the picture. It helps the customers to do their duties. It helps them remember their long term needs. This also means that license management should not prevent functionality. No functionality should stop if a license expires. Not to mention functionality that may prevent access to data that actually belongs to the customer. If you approach license management with this mindset, you can see that even open source (but not free) software may need it.

License Management Tool: license3j

Many years ago I was looking for a license management library and I found that none of them was open source. I wanted to create an open source (but not free) application, and that required the license management to be open source as well. What I found was also overpriced, taking into account our budget, which was just zero for a part time start-up software (which actually failed miserably business wise, but that is another story). For this reason I created License3j, which surprisingly became one of the most used libraries among my open source projects. License3j is very simple in terms of business objects. It uses a simple property file and lets the application check the content of the individual fields. 
The added value is handling the electronic signature and checking the authenticity of the license file. Essentially it is hardly more than a single class file.

Maven coordinates: groupId com.verhas, artifactId license3j, version 1.0.4

Feel free to use it if you like.
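
The idea of a property file whose authenticity is protected by an electronic signature can be sketched with the JDK alone. The code below is not License3j's API and makes no claim about how License3j signs its files; the class name, the field names and the choice of SHA256withRSA are assumptions made for the illustration. The vendor signs the license text with a private key, and the application checks it with the matching public key before reading individual fields.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Properties;

// Hypothetical sketch of a signed property-file license; not License3j's API.
class SignedLicenseSketch {

    static final String ALGORITHM = "SHA256withRSA";

    /** Generates a throwaway vendor key pair for the demo. */
    static KeyPair newVendorKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Vendor side: signs the license text with the private key. */
    static byte[] sign(String licenseText, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(key);
            signer.update(licenseText.getBytes(StandardCharsets.UTF_8));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Application side: true only if the text is exactly what the vendor signed. */
    static boolean isAuthentic(String licenseText, byte[] signature, PublicKey key) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(key);
            verifier.update(licenseText.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // a malformed signature counts as not authentic
        }
    }

    /** Reads one field from the property-style license text. */
    static String readField(String licenseText, String name) {
        Properties fields = new Properties();
        try {
            fields.load(new StringReader(licenseText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return fields.getProperty(name);
    }

    public static void main(String[] args) {
        KeyPair vendor = newVendorKeyPair();
        String license = "edition=team\nvalid-until=2015-12-31\n";
        byte[] signature = sign(license, vendor.getPrivate());

        System.out.println("authentic: " + isAuthentic(license, signature, vendor.getPublic()));
        System.out.println("edition:   " + readField(license, "edition"));

        String tampered = license.replace("team", "enterprise");
        System.out.println("tampered:  " + isAuthentic(tampered, signature, vendor.getPublic()));
    }
}
```

In line with the argument above, what the application does with the result is a separate decision: a failed or expired check can remind the customer to renew rather than switch off functionality.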

April 14, 2014

by Peter Verhas

CORE

· 16,208 Views · 1 Like

Be a Lazy but a Productive Android Developer, Part 4: Card UI

Welcome to part 4 of the “Be a lazy but a productive android developer” series. If you are an Android developer who is too lazy to create row items for a ListView/GridView by hand, but you still want an awesome ListView/GridView in a few easy steps, then this article is for you.

This series so far:

Part 1: We looked at RoboGuice, a dependency injection library with which we can reduce boilerplate code, save time and thereby achieve productivity during Android app development.
Part 2: We explored Genymotion, an emulator that is much faster than the native emulator. We can use Genymotion to test apps quickly while developing and thereby achieve productivity.
Part 3: We explored JSON parsing libraries (GSON and Jackson), with which we can increase app performance, decrease boilerplate code and thereby optimize productivity.

In this Part

In this part, we are going to explore 2-3 card UI libraries which are open source and available on GitHub. We can use any of them in our app development to quickly get a ListView/GridView with an awesome card view.

What is Card UI and Why Should We Follow Card UI Design?

Ever wondered about the Google Play Store UI, which is built around cards? A card is nothing but a single row item of a ListView or GridView. As depicted below, a card can be of various sizes and can be an app card, a movie, books or games card, an app suggestions card, a birthday card, or even a simple list/grid item. The main benefit of designing an app with a card UI is that it gives a consistent look throughout the application, no matter whether it is loaded on a mobile or a tablet.

Cards Libraries

Now, I am sure you are excited to read about and explore the card libraries that exist on the web. 
As I said, the Google Play Store UI is built around cards. We can build the same card UI either by defining our own custom adapter with styles/images, or directly by using an open-source card library. I am sure you are a lazy Android developer who wants to be productive, so you would go for a card UI library. A card library simply provides an easy way to display card UIs in your Android app. I have found 3 widely used card libraries in Android development:

Cardslib by Gabriele Mariotti – https://github.com/gabrielemariotti/cardslib
Cards UI by Aidan Follestad – https://github.com/afollestad/Cards-UI
CardsUI by Nadavfima – https://github.com/nadavfima/cardsui-for-android

Being a lazy but productive Android developer, so far I have used Cardslib by Gabriele. In my experience with Cardslib, you don’t need to define a row layout or custom adapter to display a simple card list, but you do have to design a custom XML layout if you want to customize the card layout to your designs and requirements. I would recommend Cardslib because it’s very well documented and is being improved actively. Gabriele has been putting a lot of effort into adding new features to the library; for example, he recently added support for a staggered grid of cards.

How to Use Cardslib?

Cardslib is available as a separate library project, so you can reference it as a local library. It’s also published as an AAR on Maven Central. Read the detailed instructions on how to include, build or reference Cardslib.

Example 1: Simple Card UI Example

For this demo I have used Eclipse, so I have downloaded the Cardslib library project and will reference it from our example projects. Let’s develop a simple card view example using the first library listed above.

row_card.xml

Java code to set the row_card XML layout, title, header, image, etc.:
    // Create a Card that uses the custom row_card layout
    Card card = new Card(this, R.layout.row_card);

    // Create a CardHeader
    CardHeader header = new CardHeader(this);
    header.setTitle("Hello world");
    card.setTitle("Simple card demo");

    // Add a thumbnail image to the card
    CardThumbnail thumb = new CardThumbnail(this);
    thumb.setDrawableResource(R.drawable.ic_launcher);
    card.addCardThumbnail(thumb);

    // Add Header to card
    card.addCardHeader(header);

    // Set card in the cardView
    CardView cardView = (CardView) findViewById(R.id.carddemo);
    cardView.setCard(card);

Example 2: Card list example

activity_list.xml

CardListActivity.java

    package com.technotalkative.cardslibdemo;

    import it.gmariotti.cardslib.library.internal.Card;
    import it.gmariotti.cardslib.library.internal.CardArrayAdapter;
    import it.gmariotti.cardslib.library.internal.CardHeader;
    import it.gmariotti.cardslib.library.internal.CardThumbnail;
    import it.gmariotti.cardslib.library.view.CardListView;

    import java.util.ArrayList;

    import android.app.Activity;
    import android.os.Bundle;

    public class CardListActivity extends Activity {

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_list);

            // One thumbnail per card
            int listImages[] = new int[]{R.drawable.angry_1, R.drawable.angry_2,
                    R.drawable.angry_3, R.drawable.angry_4, R.drawable.angry_5};

            ArrayList<Card> cards = new ArrayList<Card>();

            for (int i = 0; i < listImages.length; i++) {
                // Create a Card
                Card card = new Card(this);

                // Create a CardHeader and add it to the card
                CardHeader header = new CardHeader(this);
                header.setTitle("Angry bird: " + i);
                card.setTitle("sample title");
                card.addCardHeader(header);

                // Add the thumbnail for this row
                CardThumbnail thumb = new CardThumbnail(this);
                thumb.setDrawableResource(listImages[i]);
                card.addCardThumbnail(thumb);

                cards.add(card);
            }

            // The library's adapter replaces a hand-written custom adapter
            CardArrayAdapter mCardArrayAdapter = new CardArrayAdapter(this, cards);

            CardListView listView = (CardListView) this.findViewById(R.id.myList);
            if (listView != null) {
                listView.setAdapter(mCardArrayAdapter);
            }
        }
    }

Download Source Code

You can download the source code of the above examples from here: https://github.com/PareshMayani/CardslibDemo. To run the examples, first download the library project and then reference it from the example project.

These were just simple examples. If you explore the card library further, you will understand how to use it and can reduce boilerplate code by not writing adapter/layout code again, which improves productivity.

I hope you liked this part of the “Lazy android developer: Be productive” series. Till the next part, keep building card UI, card list, card grid and enjoy!

April 10, 2014

by Paresh Mayani

· 57,964 Views

Be a Lazy but Productive Android Developer, Part 3: JSON Parsing Library

If you are an Android developer who is too lazy to parse JSON by hand but wants to be productive by using a JSON parsing library, then this article is for you.

April 2, 2014

by Paresh Mayani

· 83,332 Views · 1 Like

Hunting for an SWT Test Framework? Say Hello to Red Deer

This is the first in a series of posts on the new “Red Deer” (https://github.com/jboss-reddeer/reddeer) open source testing framework for Eclipse. In this post, we’ll introduce Red Deer and take a look at some of the advantages that it offers by building a sample test program from scratch.

Some of the features that Red Deer offers are:

An easy to use, high-level API for testing standard Eclipse components
Support for creating custom extensions for your own applications
A requirements validation mechanism to assist you in configuring complex tests
Eclipse tooling to assist in creating new projects
A record and playback tool to enable you to quickly create automated tests
An integration with Selenium for testing web based applications
Support for running tests in a Jenkins CI environment

Note that as of this writing, Red Deer is in an incubation stage. The current release is at level 0.5. The target date for the 1.0 release of Red Deer is late 2014. But, as a community-based, open source project, now is a great time to try Red Deer and make suggestions or even contribute code!

A Look at Red Deer’s Architecture

The Red Deer project itself consists of utilities and the API that supports the development and execution of automated tests. The API (the parts of the above diagram that are enclosed in dashed line boxes) can be thought of as having three layers:

The top layer consists of extensions to Red Deer’s abstract classes or implementations for Eclipse components such as Views, Editors, Wizards, or Shells. For example, if you are writing tests for a feature that uses a custom Eclipse View, you can extend Red Deer’s View class by adding support for the specific functions of the feature. The advantage that this API layer gives you is that your test programs do not have to focus on manipulating the individual UI elements directly to perform operations.
Your programs can instead create an instance of an Eclipse component such as a View, and then use that instance’s methods to perform operations on the View. This layer of abstraction makes your test programs easier to write, understand, and maintain.

The middle layer consists of the Red Deer implementations for SWT UI elements such as: Button, Combo, Label, Menu, Shell, TabItem, Table, ToolBar, Tree. This API layer supports the API’s higher level by providing the building blocks for the API’s Views, Editors, Shells, and Wizards. This middle layer of the API also provides Red Deer packages that enable your tests to enforce requirements, so that necessary setup tasks are performed before a test is run.

The bottom layer consists of Red Deer packages that support the execution of tests such as: Conditions, Matchers, Widgets, Workbench, and Red Deer extensions to JUnit.

What Makes Red Deer different from other Tools?

A Layer of Abstraction

The top-most layer of the API enables you to instantiate Eclipse UI elements as objects, and then manipulate them through their methods. The resulting code is easier to read and maintain, instead of being brittle and subject to failures when the UI changes.

For example, for a test that has to open a view and press a button, without Red Deer, the test would have to navigate the top level menu, find the view menu, then the view type in that menu, then find the view open dialog, then locate the “OK” button, etc. Your test would have to spend a lot of time navigating through the UI elements before it could even begin to perform the test’s steps.
With Red Deer, the code to open a view (in this case, the servers view) is simply:

    ServersView view = new ServersView();
    view.open();

Furthermore, within that ServersView, your test program can perform operations on the View through methods which are defined in the view (and are incidentally also well debugged by the Red Deer team), instead of having to explicitly locate and manipulate the UI elements directly. For example, to obtain a list of all the servers, instead of locating the UI tree that contains the server list, and extracting that list of servers into an array, your Red Deer program can simply call the “getServers()” method.

Likewise, the code to open a PackageExplorer, and then select a project within that PackageExplorer is as follows:

    PackageExplorer packageExplorer = new PackageExplorer();
    packageExplorer.open();
    packageExplorer.getProject("myTestProject").select();

And, the code to retrieve all the projects within that PackageExplorer is simply:

    packageExplorer.getProjects();

The result is that your tests are easier to write and maintain, and you can focus on testing your application’s logic instead of writing brittle code to navigate through the application.

Installing Red Deer

The only prerequisites to using Red Deer are Eclipse and Java. In this post, we’ll use Eclipse Kepler and OpenJDK 1.7, running on Red Hat Enterprise Linux (RHEL) 6.

To install Red Deer 0.4 (this is the latest stable milestone version as of this writing) follow these steps:

1. Open up Eclipse
2. Navigate to: Help->Install New Software
3. Define a new download site using the Red Deer update site URL: http://download.jboss.org/jbosstools/updates/stable/kepler/core/reddeer/0.4.0/
4. Select Red Deer, click on the Finish button and Red Deer will install

Now that you have Red Deer installed, let’s move on to building a new Red Deer test.
Building your First Red Deer Test

To create a new Red Deer test project, you make use of the Red Deer UI tooling and select New->Project->Other->Red Deer Test:

Before we move on, let’s take a look at the META-INF/MANIFEST.MF file that is created in the project:

    Manifest-Version: 1.0
    Bundle-ManifestVersion: 2
    Bundle-Name: com.example.reddeer.sample
    Bundle-SymbolicName: com.example.reddeer.sample;singleton:=true
    Bundle-Version: 1.0.0.qualifier
    Bundle-ActivationPolicy: lazy
    Bundle-Vendor: Sample Co
    Bundle-RequiredExecutionEnvironment: JavaSE-1.6
    Require-Bundle: org.junit, org.jboss.reddeer.junit, org.jboss.reddeer.swt, org.jboss.reddeer.eclipse

The line we’re interested in is the final line in the file. These are the bundles that are required by Red Deer.

After the empty project is created by the wizard, you can define a package and create a test class. Here's the code for a minimal functional test. The test will verify that the Eclipse configuration is not empty.

    package com.example.reddeer.sample;

    import static org.junit.Assert.assertFalse;

    import java.util.List;

    import org.jboss.reddeer.swt.api.TreeItem;
    import org.jboss.reddeer.swt.impl.button.PushButton;
    import org.jboss.reddeer.swt.impl.menu.ShellMenu;
    import org.jboss.reddeer.swt.impl.tree.DefaultTree;
    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.jboss.reddeer.junit.runner.RedDeerSuite;

    @RunWith(RedDeerSuite.class)
    public class SimpleTest {

        @Test
        public void TestIt() {
            // Open Help -> About Eclipse Platform, then the installation details
            new ShellMenu("Help", "About Eclipse Platform").select();
            new PushButton("Installation Details").click();

            // Read every item of the configuration tree
            DefaultTree ConfigTree = new DefaultTree();
            List<TreeItem> ConfigItems = ConfigTree.getAllItems();

            assertFalse("The list is empty!", ConfigItems.isEmpty());
            for (TreeItem item : ConfigItems) {
                System.out.println("Found: " + item.getText());
            }
        }
    }

After you save the test's source file, you can run the test. To run the test, select the Run As->Red Deer Test option:

And - there's the green bar!
Simplifying Tests with Requirements

Red Deer requirements enable you to define actions that you want to happen before a test is executed. The advantage of using requirements is that you define the actions with annotations instead of using a @BeforeClass method. The result is that your test code is easier to read and maintain. The biggest difference between a Red Deer requirement and the @BeforeClass annotation from the JUnit framework is that if a requirement cannot be fulfilled, the test is not executed.

Like everything else in Red Deer, you can make use of predefined requirements, or you can extend the feature by adding your own custom requirements. These custom requirements can be made complex and for convenience can be stored in external properties files. (We’ll take a look at defining custom requirements in a later post in this series when we examine how to create and contribute extensions to Red Deer.)

The current milestone release of Red Deer provides predefined requirements that enable you to clean out your current workspace and open a perspective. Let’s add these to our example.
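To make the difference from @BeforeClass concrete, here is a minimal plain-Java sketch of the idea that a test class is skipped entirely when its declared requirement cannot be fulfilled. It is not Red Deer's implementation: RequiresProperty, SketchTest, ServerTest and the sketch.server.url system property are hypothetical names invented for this illustration, and a tiny reflection-based runner stands in for the real JUnit runner.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical annotations and runner, not Red Deer's API.
public class RequirementSketch {

    /** Hypothetical requirement: the named system property must be set. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface RequiresProperty {
        String value();
    }

    /** Hypothetical marker for test methods, standing in for JUnit's @Test. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface SketchTest {
    }

    /** The requirement is declared with an annotation instead of setup code. */
    @RequiresProperty("sketch.server.url")
    public static class ServerTest {
        @SketchTest
        public void serverIsReachable() {
            System.out.println("running against " + System.getProperty("sketch.server.url"));
        }
    }

    /**
     * Runs the annotated test methods only if the class's requirement can be
     * fulfilled, and returns the names of the methods that were executed.
     */
    public static List<String> run(Class<?> testClass) throws Exception {
        List<String> executed = new ArrayList<>();
        RequiresProperty requirement = testClass.getAnnotation(RequiresProperty.class);
        if (requirement != null && System.getProperty(requirement.value()) == null) {
            // Unfulfilled requirement: skip the whole class instead of failing it.
            System.out.println("skipping " + testClass.getSimpleName() + ": requirement not fulfilled");
            return executed;
        }
        Object instance = testClass.getDeclaredConstructor().newInstance();
        for (Method method : testClass.getDeclaredMethods()) {
            if (method.isAnnotationPresent(SketchTest.class)) {
                method.invoke(instance);
                executed.add(method.getName());
            }
        }
        return executed;
    }

    public static void main(String[] args) throws Exception {
        System.clearProperty("sketch.server.url");
        System.out.println(run(ServerTest.class)); // requirement missing: nothing runs
        System.setProperty("sketch.server.url", "http://localhost:8080");
        System.out.println(run(ServerTest.class)); // requirement fulfilled: the test runs
    }
}
```

A setup method that throws would report the test as failed, whereas a runner that checks requirements first can simply skip tests that cannot be set up.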
To add the predefined requirements to our example, we need these import statements:

    import org.jboss.reddeer.eclipse.ui.perspectives.JavaBrowsingPerspective;
    import org.jboss.reddeer.requirements.cleanworkspace.CleanWorkspaceRequirement.CleanWorkspace;
    import org.jboss.reddeer.requirements.openperspective.OpenPerspectiveRequirement.OpenPerspective;

And these annotations:

    @CleanWorkspace
    @OpenPerspective(JavaBrowsingPerspective.class)

We also have to add a reference to org.jboss.reddeer.requirements to the required bundle list in our example’s MANIFEST.MF file:

    Require-Bundle: org.junit, org.jboss.reddeer.junit, org.jboss.reddeer.swt, org.jboss.reddeer.eclipse, org.jboss.reddeer.requirements

When we’re done, our example looks like this:

    package com.example.reddeer.sample;

    import static org.junit.Assert.assertFalse;

    import java.util.List;

    import org.jboss.reddeer.swt.api.TreeItem;
    import org.jboss.reddeer.swt.impl.button.PushButton;
    import org.jboss.reddeer.swt.impl.menu.ShellMenu;
    import org.jboss.reddeer.swt.impl.tree.DefaultTree;
    import org.junit.Test;
    import org.junit.runner.RunWith;
    import org.jboss.reddeer.junit.runner.RedDeerSuite;
    import org.jboss.reddeer.eclipse.ui.perspectives.JavaBrowsingPerspective;
    import org.jboss.reddeer.requirements.cleanworkspace.CleanWorkspaceRequirement.CleanWorkspace;
    import org.jboss.reddeer.requirements.openperspective.OpenPerspectiveRequirement.OpenPerspective;

    @RunWith(RedDeerSuite.class)
    @CleanWorkspace
    @OpenPerspective(JavaBrowsingPerspective.class)
    public class SimpleTest {

        @Test
        public void TestIt() {
            // Open Help -> About Eclipse Platform, then the installation details
            new ShellMenu("Help", "About Eclipse Platform").select();
            new PushButton("Installation Details").click();

            // Read every item of the configuration tree
            DefaultTree ConfigTree = new DefaultTree();
            List<TreeItem> ConfigItems = ConfigTree.getAllItems();

            assertFalse("The list is empty!", ConfigItems.isEmpty());
            for (TreeItem item : ConfigItems) {
                System.out.println("Found: " + item.getText());
            }
        }
    }

Notice how we were able to add those functions to the test code, while only adding a very small amount of actual new code? Yes, it can pay to be a lazy programmer. ;-)

What’s Next?

What’s next for Red Deer is its continued development as it progresses through its incubation stage until its 1.0 release. What’s next for this series of posts will be discussions about:

The Red Deer Recorder - To enable you to capture manual actions and convert them into test programs.
How you can Extend Red Deer - To provide test coverage for your plugins’ specific functions. And how you can contribute these extensions to the Red Deer project.
How you can Define Complex Requirements - To enable you to perform setup tasks for your tests.
Red Deer’s Integration with Selenium - To enable you to test web interfaces provided by your plugins.
Running Red Deer tests with Jenkins - To enable you to take advantage of Jenkins’ Continuous Integration (CI) test framework.

Author’s Acknowledgements

I’d like to thank all the contributors to Red Deer for their vision and contributions. It’s a new project, but it is growing fast! The contributors (in alphabetic order) are: Stefan Bunciak, Radim Hopp, Jaroslav Jankovic, Lucia Jelinkova, Marian Labuda, Martin Malina, Jan Niederman, Vlado Pakan, Jiri Peterka, Andrej Podhradsky, Milos Prchlik, Radoslav Rabara, Petr Suchy, and Rastislav Wagner.

January 7, 2014

by Len DiMaggio

· 7,779 Views

Top 24 Java-Based Content Management Systems

CMS, or content management systems, are platforms for managing and administering website content. There is no denying that CMSes are important in today's web ecosystem. These content management systems not only provide an easy way to build and maintain websites, but they also lend a helping hand in updating and editing website content without the need to spend hours or days writing and altering code and scripts. Some of the leading CMSes are PHP-based, Ruby on Rails-based, ASP.NET-based, and Java-based. Among these, Java-based CMSes are getting quite a lot of attention lately, especially for enterprise websites, because of the scalable, modern, open source technology behind most of them. There are plenty of CMS tools based on Java to help developers create multi-lingual and multi-channel websites. But how do we decide on the best one for our use case? In this article, we’re going to explore the top 24 content management systems based on Java. Let’s have a look at each of them in detail:

1. Alfresco: Alfresco is one of the top open-source content management systems written in Java. It comes with enterprise repository and portlet capabilities along with document management, collaboration, records management, knowledge management, web content management, imaging, and a lot more. Alfresco has a modular architecture and enables end users to efficiently manage websites across cloud, mobile, hybrid and on-premise environments using open source Java technologies such as Spring, Hibernate, Lucene and JSF.

2. Magnolia: Magnolia is a well-documented, easy to use, enterprise-grade open source CMS based on the Java Content Repository standard. It is a highly popular CMS due to its out-of-the-box functionality and ease of use under an open source license. Moreover, Magnolia supports content delivery in a search-engine optimized manner and follows W3C standards. Magnolia CMS has been deployed by enterprises and governments in more than 100 countries across the world. Here's a case study on Magnolia-based website development.

3. LogicalDOC: Though less known than other software such as Alfresco, LogicalDOC is emerging as a powerful and more affordable alternative. With a primary focus on document management, it offers content management, knowledge management and collaboration features in an efficient way. A distinctive aspect of the interface is its use of Google GWT, which makes the user interface very responsive while keeping data transfer with the server to a minimum. The availability of free apps for Android and Apple devices (iPhone and iPad) is also an interesting feature.

4. Asbru: Asbru is another powerful, fully-featured, easy to use content management system with database-driven capabilities. It is built on the Spring framework with integrated community, database, eCommerce and statistics modules, which help developers create, publish and manage rich and user-friendly internet, extranet and intranet websites. Available in various editions, Asbru provides users with a simple platform to manage websites along with a host of other features such as custom templates and data, password protected content, multi-lingual content, communities, eCommerce and website analytics, a WYSIWYG content editor and a lot more.

5. OpenCMS: OpenCMS is based on Java and XML technology and allows you to build highly customizable and interactive websites and portals. It comes integrated with a WYSIWYG editor and a fully-featured template engine that is compliant with W3C standards. OpenCMS can be deployed both in an open-source environment (Linux, Apache, Tomcat, MySQL) and in a commercial environment (Windows NT, IIS, BEA Weblogic, Oracle).

6. Walrus: Walrus is yet another Spring-based CMS that provides effective content management capabilities with a smart administrative interface and drag-and-drop facilities. Easy setup and undo/redo features make Walrus a suitable CMS for government and non-profit enterprises.

7. Pulse: Pulse is a Java-based framework and portal solution that offers easy-to-use and extensible patterns for creating rich browser web applications and responsive websites. It brings a bunch of powerful components including content management, web shops, user management and more. A few of its key features include a WebDAV based virtual file system for digital asset management, mature user and role management, built-in internationalization, and more.

8. MeshCMS: MeshCMS is an easy to use online editing system written in Java. It comes with the features you would expect from a content management system; however, it takes a conventional approach to managing and editing website content. It is considered one of the fastest CMSes for editing files online, managing files, and building some very common components like menus, breadcrumbs, mail forms and so on. MeshCMS comes with cross-browser capabilities, a WYSIWYG editor, hot-linking prevention, and a tag library.

9. Liferay: Liferay is one of the most popular CMSes based on Java, and is recommended by many industry experts. It comes with features that can make your content management tasks simple. Liferay is very popular for developing personal as well as professional websites with ease.

10. DotCMS: DotCMS is a next-gen enterprise CMS that wears an open-source hat. It is a highly popular and widely used CMS due to its open APIs and its extensible and scalable architecture, which are used to create personalized and engaging websites, intranets, extranets and applications with ease.

11. Jease: Jease, known as ‘Java with ease’, is another open source content management system that is built on popular Java technologies like db4o, Perst, Lucene and ZK. It is an extremely lightweight CMS with an excellent Ajax interface. Due to its intuitive and interactive interface, it is simple to customize and deploy websites in Jease even for inexperienced Java developers.

12. Hippo: Hippo is again a powerful open-source CMS made in Java, with enterprise level capabilities that help in delivering personalized websites and channels. Hippo sets itself apart from its competitors by focusing on customer experience. Hippo has come a long way since 1999, serving medium to large organizations by offering a personalized multichannel content distribution platform including website, mobile, tablet, extranets and intranets. Its last major version update was in December 2012, and since then it has seen minor updates every couple of months.

13. Apache Lenya: Apache Lenya is another open-source Java CMS that features revision control, multisite management, scheduling, search, WYSIWYG editors, and workflow, which make website development and management easier for developers. Available in a variety of languages, Apache Lenya is a preferred CMS among enterprises that want to develop multi-lingual websites.

14. Contelligent: Contelligent is another smart CMS solution built on the Java technology stack. It is fully compliant with J2EE and offers a good solution for creating and managing personalized websites.

15. InfoGlue: InfoGlue again is a Java-based CMS that is known for its advanced, scalable and robust open-source architecture. It is a highly flexible CMS built on JSR-168 and comes with full multi-language support, excellent information reuse and high integration capabilities.

16. OpenEdit: OpenEdit CMS is a dynamic tool for managing website content with online editing capabilities. Built on an open-source architecture, OpenEdit provides facilities like a user manager, file manager, version control and notification tools for managing media-rich websites. OpenEdit features enterprise grade plugins such as eCommerce, Content Management, Blog, Events Calendar, Social Networking Tools and more.

17. AtLeap: AtLeap is a multi-lingual CMS based on Java which offers content delivery assistance with SEO and full text search functionality. AtLeap, a product of Blandware, is not only a CMS but also a robust framework for developing websites and web applications.

18. Weceem: Weceem is yet another open source content management system. Unlike other CMSes, it is built on the well-known Grails and Spring frameworks and on Java itself. Weceem has garnered positive reviews and is an ideal CMS when it comes to Grails, but faces tough competition in the best Java CMS category. I came across a LinkedIn discussion which was enough for me to put this CMS in the Best Java CMS list.

19. Nuxeo: Nuxeo is a powerful open source CMS built on a Java-based architecture. It offers solutions related to document management, case management and digital asset management. It is free of licensing fees but does cost you when you reach out for support and maintenance help. It has a strong group of customers including Electronic Arts and the U.S. Navy and, as stated on the company website, it has been used in over 145 countries across thousands of organizations.

20. XperienCentral: XperienCentral is currently the only CMS that offers unique content to a visitor based on his earlier journey, so you can tailor the content to increase conversion. It offers multi-channel content delivery across website, mobile, social media channels and applications. It is built on Java and hence it is extremely scalable and agile.

21. Atex: Atex is a web CMS that uses Polopoly technology to deliver content. As per its claims, it is the only industry leading CMS with a built-in paywall. Atex again is one of the premium CMSes; it offers solutions for managing websites and helps marketers deliver the right content to relevant audiences. It has a rich set of clientele.

22. Escenic: Customers of Escenic include News of the World, The Sun, The Times, and the Independent titles. It’s a closed source Java framework. Both Atex and Escenic are highly popular in Sweden, where some of the biggest sites use these CMSes: idg.se uses Atex and Aftonbladet.se uses Escenic.

23. Adobe Experience Manager/CQ5: A best CMS list cannot be complete without Adobe Experience Manager. It is an all-round CMS which offers the agility and flexibility an organization may want. It helps deliver a unique customer experience by delivering different content on different channels. Adobe Experience Manager was recently named a leader in web content management by Gartner's Magic Quadrant. It was earlier known as CQ5 and came to Adobe through its 2010 acquisition of Day Software.

24. SDL Tridion: Again a well-known CMS, highly recommended by industry experts. Its simple, intuitive UI makes it easy to manage content and deliver it uniformly across all channels. It recently received the top score in overall content management experience according to an independent research firm, Forrester Research, Inc.

This completes the list of the top 24 Java-based content management systems. I hope that after reading about all of them you have enough insight into which CMS would be best for your website development project.

December 9, 2013

by Boni Satani

· 328,686 Views · 4 Likes