Language Resources

The Latest Languages Topics

Generating CSV-files on .NET

I have a project where I need to output some reports as CSV files. I found a good library called CsvHelper on NuGet, and it works perfectly for me. After some playing with it I was able to generate CSV files that display correctly in Excel. Here is some sample code, along with extensions that make it easier to work with DataTables.

Simple report

Here’s a simple fragment of code that illustrates how to use CsvHelper.

    using (var writer = new StreamWriter(Response.OutputStream))
    using (var csvWriter = new CsvWriter(writer))
    {
        // Excel in many locales expects a semicolon as the field delimiter
        csvWriter.Configuration.Delimiter = ";";

        // Custom header row
        csvWriter.WriteField("Task No");
        csvWriter.WriteField("Customer");
        csvWriter.WriteField("Title");
        csvWriter.WriteField("Manager");
        csvWriter.NextRecord();

        // One record per project
        foreach (var project in data)
        {
            csvWriter.WriteField(project.Code);
            csvWriter.WriteField(project.CustomerName);
            csvWriter.WriteField(project.Name);
            csvWriter.WriteField(project.ProjectManagerName);
            csvWriter.NextRecord();
        }
    }

Of course, you can use other methods to output a whole object or a list of objects in one shot. I just needed custom headers here that don’t match the property names 1:1.

Generic helper for DataTable

Some of my projects come from the service layer as a DataTable. I don’t want to add new models or Data Transfer Objects (DTO) for no good reason, and DataTable is actually flexible enough if you need to add new fields to a report and want to do it fast. As DataTables are not supported by default (yet?), I wrote simple extension methods that work on DataTable views. When called on a DataTable, the extension selects the default view automatically. The idea is that you can set a filter on the default data view and leave out the rows you don’t need. If you just want to show a DataTable on screen as a table, check out my posting Simple view to display contents of DataTable.
    public static class CsvHelperExtensions
    {
        public static void WriteDataTable(this CsvWriter csvWriter, DataTable table)
        {
            WriteDataView(csvWriter, table.DefaultView);
        }

        public static void WriteDataView(this CsvWriter csvWriter, DataView view)
        {
            // Header row: one field per column name
            foreach (DataColumn col in view.Table.Columns)
            {
                csvWriter.WriteField(col.ColumnName);
            }
            csvWriter.NextRecord();

            // Data rows: only the rows that pass the view's filter
            foreach (DataRowView row in view)
            {
                foreach (DataColumn col in view.Table.Columns)
                {
                    csvWriter.WriteField(row[col.ColumnName]);
                }
                csvWriter.NextRecord();
            }
        }
    }

And here is a simple MVC controller action that gets data as a DataTable and returns it as a CSV file. The result is a CSV file that opens correctly in Excel.

    [HttpPost]
    public void ExportIncomesReport()
    {
        var data = // Get DataTable here

        Response.ContentType = "text/csv";
        Response.AddHeader("Content-disposition", "attachment;filename=IncomesReport.csv");

        // Write the UTF-8 preamble (BOM) so Excel detects the encoding
        var preamble = Encoding.UTF8.GetPreamble();
        Response.OutputStream.Write(preamble, 0, preamble.Length);

        using (var writer = new StreamWriter(Response.OutputStream))
        using (var csvWriter = new CsvWriter(writer))
        {
            csvWriter.Configuration.Delimiter = ";";
            csvWriter.WriteDataTable(data);
        }
    }

One thing to notice: with CsvHelper we have full control over the stream we write data to, and this way we can write more performant code.

The post Generating CSV-files on .NET appeared first on Gunnar Peipman - Programming Blog.
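For readers outside .NET, the two details that make the file open cleanly in Excel (a semicolon delimiter and a UTF-8 preamble) can be sketched with Python's standard csv module. This is only an illustration of the same idea, not part of CsvHelper; the function name write_report and the sample values are hypothetical.

```python
import csv
import io


def write_report(rows, headers, delimiter=";"):
    """Return CSV bytes prefixed with a UTF-8 BOM (hypothetical helper).

    The "utf-8-sig" codec prepends the same three-byte preamble that
    Encoding.UTF8.GetPreamble() returns in .NET.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\r\n")
    writer.writerow(headers)      # custom header row
    for row in rows:
        writer.writerow(row)      # fields containing the delimiter get quoted
    return buffer.getvalue().encode("utf-8-sig")


if __name__ == "__main__":
    payload = write_report(
        [("P-1", "Acme; Ltd", "Site", "Ann")],
        ["Task No", "Customer", "Title", "Manager"],
    )
    print(payload)
```

Building the whole payload in memory keeps the sketch short; for large reports, writing straight to the response stream, as the C# code does, is the better fit.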

June 26, 2015

by Gunnar Peipman

· 4,736 Views · 1 Like

R: dplyr -- Segfault Cause 'memory not mapped'

In my continued playing around with web logs in R I wanted to process the logs for a day and see what the most popular URIs were. I first read in all the lines using the read_lines function in readr and put the vector it produced into a data frame so I could process it using dplyr.

    library(readr)
    dlines = data.frame(column = read_lines("~/projects/logs/2015-06-18-22-docs"))

In the previous post I showed some code to extract the URI from a log line. I extracted this code out into a function and adapted it so that I could pass in a list of values instead of a single value:

    library(stringr)  # provides str_extract_all and str_match

    extract_uri = function(log) {
      parts = str_extract_all(log, "\"[^\"]*\"")
      return(lapply(parts, function(p) str_match(p[1], "GET (.*) HTTP")[2] %>% as.character))
    }

Next I ran the following code to count the number of times each URI appeared in the logs:

    library(dplyr)
    pages_viewed = dlines %>%
      mutate(uri = extract_uri(column)) %>%
      count(uri) %>%
      arrange(desc(n))

This crashed my R process with the following error message:

    segfault cause 'memory not mapped'

I narrowed it down to a problem when doing a group by operation on the ‘uri’ field and came across this post which suggested that it was handled more cleanly in more recent versions of dplyr. I upgraded to 0.4.2 and tried again:

    ## Error in eval(expr, envir, enclos): cannot group column uri, of class 'list'

That makes more sense. We’re returning a list from extract_uri rather than a vector, which would fit nicely back into the data frame. That’s fixed easily enough by unlisting the result:

    extract_uri = function(log) {
      parts = str_extract_all(log, "\"[^\"]*\"")
      return(unlist(lapply(parts, function(p) str_match(p[1], "GET (.*) HTTP")[2] %>% as.character)))
    }

And now when we run the count function it’s happy again, good times!
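The underlying point, that a grouping key has to be one plain value per row rather than a nested list, is not specific to dplyr. As a rough Python analogue (not the author's code; the function names and sample log lines are made up for illustration), a Counter needs hashable keys, so the extractor returns a single string per line:

```python
import re
from collections import Counter

# Assumed log shape: the request appears as "GET <uri> HTTP/x.y" in quotes.
REQUEST = re.compile(r'"GET (.*?) HTTP')


def extract_uri(line):
    """Return one URI string per log line, or None if no GET request is found."""
    match = REQUEST.search(line)
    return match.group(1) if match else None


def count_uris(lines):
    """Count URIs and return (uri, count) pairs, most frequent first."""
    uris = (extract_uri(line) for line in lines)
    return Counter(uri for uri in uris if uri is not None).most_common()


if __name__ == "__main__":
    sample = [
        'a "GET /docs/a.html HTTP/1.1" 200',
        'b "GET /docs/b.html HTTP/1.1" 200',
        'c "GET /docs/a.html HTTP/1.1" 304',
    ]
    print(count_uris(sample))
```

Returning a list per line instead of a string would fail here too, since lists cannot be used as Counter keys.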

June 24, 2015

by Mark Needham

· 1,660 Views

R: Regex -- Capturing Multiple Matches of the Same Group

I’ve been playing around with some web logs using R and I wanted to extract everything that existed in double quotes within a logged entry. This is an example of a log entry that I want to parse:

    log = '2015-06-18-22:277:548311224723746831\t2015-06-18T22:00:11\t2015-06-18T22:00:05Z\t93317114\tip-127-0-0-1\t127.0.0.5\tUser\tNotice\tneo4j.com.access.log\t127.0.0.3 - - [18/Jun/2015:22:00:11 +0000] "GET /docs/stable/query-updating.html HTTP/1.1" 304 0 "http://neo4j.com/docs/stable/cypher-introduction.html" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.124 Safari/537.36"'

And I want to extract these 3 things:

    /docs/stable/query-updating.html
    http://neo4j.com/docs/stable/cypher-introduction.html
    Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.124 Safari/537.36

i.e. the URI, the referrer and browser details. I’ll be using the stringr library which seems to work quite well for this type of work. To extract these values we need to find all the occurrences of double quotes and get the text inside those quotes. We might start by using the str_match function:

    > library(stringr)
    > str_match(log, "\"[^\"]*\"")
         [,1]
    [1,] "\"GET /docs/stable/query-updating.html HTTP/1.1\""

Unfortunately that only picked up the first occurrence of the pattern so we’ve got the URI but not the referrer or browser details. I tried str_extract with similar results before I found str_extract_all which does the job:

    > str_extract_all(log, "\"[^\"]*\"")
    [[1]]
    [1] "\"GET /docs/stable/query-updating.html HTTP/1.1\""
    [2] "\"http://neo4j.com/docs/stable/cypher-introduction.html\""
    [3] "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.124 Safari/537.36\""

We still need to do a bit of cleanup to get rid of the ‘GET’ and ‘HTTP/1.1’ in the URI and the quotes in all of them:

    parts = str_extract_all(log, "\"[^\"]*\"")[[1]]
    uri = str_match(parts[1], "GET (.*) HTTP")[2]
    referer = str_match(parts[2], "\"(.*)\"")[2]
    browser = str_match(parts[3], "\"(.*)\"")[2]

    > uri
    [1] "/docs/stable/query-updating.html"
    > referer
    [1] "http://neo4j.com/docs/stable/cypher-introduction.html"
    > browser
    [1] "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.124 Safari/537.36"

We could then go on to split out the browser string into its sub components but that’ll do for now!
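The same distinction between "first match" and "all matches" exists in most regex libraries. As a hedged comparison in Python (not part of the original post; the helper names are hypothetical), re.search stops at the first quoted string while re.findall returns every one, and a capture group drops the surrounding quotes in the same step:

```python
import re

# The capture group keeps only the text between the double quotes.
QUOTED = re.compile(r'"([^"]*)"')


def quoted_fields(log):
    """Return every double-quoted value in the entry, in order."""
    return QUOTED.findall(log)


def parse_entry(log):
    """Split an access-log entry into uri, referer and browser.

    Assumes the first three quoted values are the request line, the
    referrer and the user agent, as in the sample entry above.
    Returns None when fewer than three quoted values are present.
    """
    fields = quoted_fields(log)
    if len(fields) < 3:
        return None
    request, referer, browser = fields[:3]
    match = re.search(r"GET (.*) HTTP", request)
    return {
        "uri": match.group(1) if match else None,
        "referer": referer,
        "browser": browser,
    }


if __name__ == "__main__":
    entry = ('127.0.0.3 - - [18/Jun/2015:22:00:11 +0000] '
             '"GET /docs/stable/query-updating.html HTTP/1.1" 304 0 '
             '"http://neo4j.com/docs/stable/cypher-introduction.html" '
             '"Mozilla/5.0 (Macintosh)"')
    print(parse_entry(entry))
```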

June 24, 2015

by Mark Needham

· 1,041 Views

PostgreSQL Powers All New Apps for 77% of the Database's Users

Survey of open source PostgreSQL users found adoption continues to rise with 55% of users deploying it for mission-critical applications

Bedford, MA – June 23, 2015 – EnterpriseDB (EDB), the leading provider of enterprise-class Postgres products and database compatibility solutions, today announced the results of its “PostgreSQL Adoption Survey 2015,” a biennial survey of open source PostgreSQL users. Conducted by EnterpriseDB, the survey found PostgreSQL adoption continuing to rise, with 55% of users – up from 40% two years ago – deploying it for mission-critical applications and 77% of users dedicating all new application deployments to PostgreSQL.

These findings give voice to end users and confirm such industry indicators as increasing job listings and monthly rankings on DB-Engines that have pointed to rising interest in and demand for PostgreSQL, also called Postgres. The growing popularity of Postgres also comes as traditional software vendors suffer setbacks in the marketplace. The enterprise-class performance, security and stability of Postgres, on par with traditional database vendors for most corporate workloads, meanwhile have helped position Postgres among the solutions from the world’s largest vendors.

The opportunity to transform their data center economics has helped fuel downloads of Postgres as well. End users reported cutting costs with Postgres, with 41% reporting they had first-year cost savings of 50% or more. They’re using Postgres to build web 2.0 applications using unstructured data as evidenced by the 64% of respondents who said they were working with JSON/JSONB and the 47% who said they were using Postgres for collaboration applications.

“Postgres is empowering organizations to transform the economics of IT. IT can invest in the customer engagement applications that differentiate their operations from their competition instead of continuing to pay the steep and rising licensing and support fees charged by traditional database vendors,” said Marc Linster, senior vice president of products and services of EnterpriseDB. “With the expanding adoption, EnterpriseDB has experienced dramatic growth year over year, providing the software, services and support that organizations need to be successful with Postgres.”

Database Migrations, Replacements

The findings also support statements in a recent Gartner report that reflect the widespread acceptance of open source databases. “By 2018, more than 70% of new in-house applications will be developed on an OSDBMS, and 50% of existing commercial RDBMS instances will have been converted or will be in process,” according to the April 2015 Gartner report, The State of Open-Source RDBMs, 2015.*

Among Postgres users, the survey findings show migrations are already under way with 37% reporting they had migrated applications from Oracle or Microsoft SQL Server to Postgres. Many users were still planning further migrations, with 37% of PostgreSQL users saying they will gradually replace their legacy systems with Postgres, compared to 29% who said that in the 2013 survey. Further, end users predict their deployments of Postgres will expand significantly, with 32% saying they anticipate production deployments of Postgres to increase by at least 50% over the next year.

The survey, conducted by EnterpriseDB using an online tool in May 2015, queried registered users of PostgreSQL and drew 274 respondents worldwide from government organizations and companies ranging in size and industry.

*The State of Open-Source RDBMs, 2015, by Donald Feinberg and Merv Adrian, published on April 21, 2015.

June 23, 2015

by Fran Cator

· 1,007 Views

Optimized Text-Stamp Operations, Enhanced PDF to HTML & DOC Conversion in Java Apps

What's New in this Release?

The Aspose team is pleased to announce the release of Aspose.Pdf for Java 10.3.0. It provides better license initialization capabilities. As shared in earlier blogs, we introduced a clear() method in the com.aspose.pdf.MemoryCleaner class, which provides memory cleanup features so that memory is freed from unused objects. This method optimizes API performance as system resources are released, leaving the API with ample resources to perform various PDF creation and manipulation operations. In this new release, we have also optimized the TextStamp operation. Beyond these improvements, better support for UTF8 and UTF16 characters is provided when converting TEXT files to PDF format.

Cross file format conversions are one of the salient features offered by our API. Therefore, the PDF to HTML, PDF to DOC, PDF pages to Image and Image to PDF conversion features are specifically improved. Text manipulation is also improved when searching for and replacing TextFragments inside a PDF file. Starting with this release, we are providing a single code base (.jar) file that targets JDK 1.6 and is compatible with JDK 1.6, 1.7 and later versions.
Some important improved features included in this release are given below:

- Increase TextStamp creation performance
- com.aspose.pdf.MemoryCleaner.clear() method nulls the license object as well
- Aspose.Pdf 9.5.2 to HTML conversion issue on particular file
- UTF-8 characters not appearing properly
- License implementation difference in 9.3.0 and 10.2.0 with Java web application
- java.awt.HeadlessException in Headless Mode
- PDF to Image - Conversion process gets stuck in infinite loop
- Text to PDF: Incorrect rendering of UTF8 text in output PDF
- Text to PDF: Incorrect rendering of UTF16 text in output PDF
- gets wrong coordinates of searched Text
- Image to PDF: API throws IllegalArgumentException
- PDF to PNG - Process hangs during conversion
- PDF to HTML: text is distorted in output HTML
- PDF to DOC: Text renders incorrectly
- Image to PDF throws IllegalArgumentException exception
- PDF to HTML - StringIndexOutOfBoundsException being generated
- PDF to Image - conversion method stuck and never returns
- Hyperlink text/contents are not visible in PDF file

Overview: Aspose.Pdf for Java

Aspose.Pdf is a Java PDF component to create PDF documents without using Adobe Acrobat. It supports floating boxes, PDF form fields, PDF attachments, security, footnotes and endnotes, multiple-column documents, tables of contents, lists of tables, nested tables, rich text format, images, hyperlinks, JavaScript, annotations, bookmarks, headers, footers and many more. You can create PDFs via the API, XML and XSL-FO files. It also enables you to convert HTML, XSL-FO and Excel files into PDF.

June 22, 2015

by David Zondray

· 1,068 Views

Making litigation more affordable

Last year some data from the Citizens Advice Bureau revealed that 7 out of 10 potentially successful employment cases are not being pursued, with a good 50 percent of those being down to financial issues. Whilst it’s tempting to think that we are all equal in front of the law, there remains a distinct sense that we are anything but.

It’s a major reason why companies such as Logikcull are trying to make the whole process easier and more efficient. It’s believed that the e-discovery process can contribute to around 70 percent of the costs of any legal proceeding, so reducing the time involved in that can be a huge cost saver.

Using the crowd

Other organizations are attempting to make the legal process more affordable by recruiting the crowd to help meet the legal costs involved. For instance, I wrote about LexShares towards the end of last year, who are a kind of crowd based investment site. You can ‘invest’ in a particular case, thus giving the plaintiff funds to pursue their case. If the case is successful, the backer gets their money back plus a bit of the damages. If the case fails, then they lose their money.

Another crowd based venture launched in the UK recently. The site, called CrowdJustice, aims to provide funding to cases that would normally struggle to obtain it.

Supporting public interest cases

The site was founded by Julia Salasky, who previously worked for the UN, and aims to specialize in so called public interest cases.

“CrowdJustice allows communities to band together to access the courts to protect their communal assets – like their local hospital – or shared values – like human rights. Successive governments have made access to justice harder and more expensive but we are using the power of the crowd to try and stem the tide,” she says.

She suggests that cuts to legal aid have made it harder for poorer people to access adequate legal protection, especially when it comes to challenging large institutions. This is especially so when the end game doesn’t necessarily result in a large payout. This could include, for instance, the destruction of a local bird sanctuary or even much larger issues such as torture.

Despite affecting huge numbers of people, it is often very difficult for communities to channel their energies towards fighting the case collectively. As such, these kinds of cases typically require a determined individual to pursue the cause on their own. The hope is that the CrowdJustice platform will make this considerably easier.

Whether it’s CrowdJustice or LexShares or Logikcull, there are certainly a wide range of projects aiming to change the legal industry for the better. It will be fascinating to watch them as they unfold and witness the impact they have.

June 22, 2015

by Adi Gaskell

· 1,015 Views

Heroku PostgreSQL vs. Amazon RDS for PostgreSQL

Written by Barry Jones.

PostgreSQL is becoming the relational database of choice for web development for a whole host of good reasons. That means that development teams have to make a decision on whether to host their own or use a database-as-a-service provider. The two biggest players in the world of PostgreSQL are Heroku PostgreSQL and Amazon RDS for PostgreSQL. Today I’m going to compare both platforms.

Heroku was the first big provider to make a push for PostgreSQL instead of MySQL for application development. They launched their Heroku PostgreSQL platform back in 2007. Amazon Web Services first announced their RDS for PostgreSQL service in November 2013 during the AWS re:Invent conference to an overwhelming ovation by the programmers in attendance.

Pricing Comparison

Before I get too far into the features, let’s cover the pricing differences up front. Of course, both services have areas with different value propositions for productivity and maintenance that go beyond these direct costs. However, it’s worth it to understand the basic costs so you can weigh those values against your needs later.

Heroku PostgreSQL has the simplest pricing. The rates and what you get for them are very clearly set at a simple per-month rate that includes the database, storage, data transfer, I/O, backups, SLA, and any other features built into the pricing tier.

With RDS for PostgreSQL, pricing is broken down into smaller units of individual resource usage. That means there are more factors involved in estimating the price, so it’s a little tougher to draw an exact comparison to Heroku PostgreSQL. You have the price per hour for the instance type (higher if it’s a multiple availability zone instance, cheaper if you pay an upfront cost to reserve the instance for one to three years); storage cost and storage class (both single and multi AZ); provisioned IOPS rate; backup storage; and data transfer… then there are a whole lot of special cases to consider.
Also, keep in mind that you get one year free of the cheapest plan when you sign up.

Here is a comparison of an RDS plan to the Heroku Premium 4 plan:

Heroku Premium 4: $1,200/month

- 15 GB RAM
- 512 GB storage
- 500 connections
- High Availability
- Max 15 minutes downtime/month
- 1 week rollback
- Point in time recovery
- Encryption at rest
- Continuous protection (offsite Write-Ahead-Log)

RDS for PostgreSQL: $1,156/month on demand or $756/month 1 year reserved

- db.m3.xlarge Multi-AZ at $0.780/hr ($580)
- 4 vCPU, 15 GB RAM
- Encryption at rest
- 512 GB provisioned (SSD) at $0.250/GB ($128)
- 2,000 provisioned IOPS at $0.20/IOPS ($400)
- Estimated backup storage in excess of free for 1 week rollback, 512 GB at $0.095/GB ($48)
- Data transfer estimated at $0 for most use cases
- 22 minutes downtime/month (based on AWS RDS SLA 99.95% uptime)

Now, here are the caveats with such a comparison:

- Heroku isn’t disclosing the number of CPUs associated with their plans.
- Heroku’s High Availability is equivalent to AWS RDS Multi-AZ. In both setups, a replica is maintained in a different availability zone specifically for the purpose of automatic failover in the event of an outage.
- With Heroku, your storage is fully allocated, and you do not pay for IOPS. As such, we don’t know what the limits are for IOPS, but they are very high performance databases. I allocated the minimum IOPS that AWS would allow for 512 GB, which was 2,000. We could go as high as 5,000 IOPS, which would increase the price by $600/month.
- The AWS RDS backups may cost nothing depending on how much of the provisioned storage is actually being used. Backup storage is free up to the level of provisioned storage, and backups are generally smaller, incremental, and do not include the significant space used by indexes. This estimate was based on the seven days of storage needed to allow for one week rollback.
- AWS RDS storage can be scaled up on the fly, so your specific needs for RAM versus storage could create a wildly different pricing pattern. This comparison is aiming to draw an equivalent.
- AWS only charges for data transfers out of your availability zone (not including multi-AZ transfers), so transfer rates will not apply in most cases.

Clear as mud.

Setup Complexity

Heroku PostgreSQL setup is dead simple. Whenever you create a PostgreSQL project, a free dev plan is already created with it with a connection waiting. Upgrading the database simply gives you a new connection string with a set username, password, hostname, and database identifier that are all randomly generated by their system. The database connection must be secure but is accessible anywhere on the internet, including directly from your home computer. You can also choose whether to deploy it in the US East region or in the European region.

RDS for PostgreSQL setup is slightly more involved; you must select the various options outlined in the pricing section, including the instance type, whether or not it should be Multi-AZ, whether to enable encryption at rest, type of storage, how much to provision, IOPS to provision (if any), backup retention period, whether or not to enable automatic minor version upgrades, selection of backup and maintenance windows, database identifier, name, port, master user and password, which availability zone you want it to be created in and the selection of your VPC group and subnet group, and your database configuration.

Obviously, RDS gives you significantly more control over the details. Depending on your point of view, that could be good or bad. The database configuration, for example, has a set of defaults for each database version for each instance type. You can take these defaults and make modifications to them with your own custom settings and then save those as your own parameter group to assign to this and any future databases that you may choose to create.
The initial setup time can be slightly more involved because of the various factors like VPC, subnet groups, and public accessibility. However, once these have been defined the first time for your account, everything gets much closer to a point-and-click experience.

Host Locations, Regional Restrictions

Heroku operates with the AWS US East Region (us-east-1) and Europe (eu-west-1). This also means that your database will be restricted to these regions. Availability Zones are managed internally. If you choose to use Heroku PostgreSQL with something hosted in a different AWS region than those two, you should expect more latency between database requests and transfer rates may apply. Likewise, if you wish to use AWS RDS for PostgreSQL with a Heroku application, just ensure that it is set up in the appropriate region.

Security and Access Considerations

Within Heroku PostgreSQL, you’re given a randomized username with a randomized password and a randomized database name that must be connected to over SSL. Their network (as well as Amazon’s) has built-in protections against scanners that could potentially brute-force access such a database. That is fairly secure. The downside is that anybody who needs access to the database and has the connection information can do so from anywhere in the world. This is more of a Human Resources-level risk from departed programmers on a project than anything else, but it is something to be aware of nonetheless. Swapping out the database credentials after having a programmer leave the team will generally alleviate this concern.

On the other hand, AWS RDS for PostgreSQL has a much more comprehensive security policy. The ability to set and define a VPC and private subnet groups will allow you to restrict database access to only the servers and people who need it. You have the ability to create as many database users with various permission levels as you like in order to more easily manage multiple users or applications accessing the database with different permission levels, while providing a log trail. Thanks to VPC, even if somebody did have the connection information, they still couldn’t access the database without being able to get inside the VPC.

For stricter (although more complex) security, RDS wins hands down. Depending on complexity, team, and the development state of your application, this level of security paranoia may not yet make sense and could be more of a headache than you want to manage. You can also configure it with the same access rules used by Heroku PostgreSQL.

Backup/Restore/Upgrade

Both platforms offer very similar options for backup and restore. Both have scheduled backups, point-in-time recovery, restoration to a new copy, and the ability to create snapshots.

Upgrades are more involved. On both platforms, major version upgrades will involve some downtime, which can’t be avoided. Heroku provides three options that all involve some manual steps to complete: copying data, promoting an upgraded follower, or using the pg:upgrade command for an in-place upgrade of larger databases. The pg:upgrade option most closely resembles the upgrade process on RDS.

With RDS, you select the Modify option for your instance and change the version. It will create pre- and post-snapshots around the in-place upgrade while maintaining the exact same connection string. RDS will allow you to schedule the database upgrade automatically within your set maintenance window.

Heroku PostgreSQL will automatically apply minor upgrades and security patches, while RDS allows you to choose whether or not you want them to do that automatically within your maintenance window. Both are fairly straightforward processes, although the RDS process is a little more hands-off in this case.
Feature/Extension Availability

As of this writing, AWS RDS for PostgreSQL has versions 9.3.1–9.3.6 and 9.4.1, while Heroku PostgreSQL has 9.1, 9.2, 9.3, and 9.4. Minor version upgrades are automatic with Heroku, so the point releases are unnecessary. Heroku PostgreSQL has been around longer and because of that has more legacy versions available for their existing users. RDS launched with 9.3 and does not appear to have any intention to support older versions.

In addition to all of the functionality built into PostgreSQL, there’s a constantly growing set of extensions.

Both platforms have these extensions in common: hstore, citext, ltree, isn, cube, dict_int, unaccent, PostGIS, dblink, earthdistance, fuzzystrmatch, intarray, pg_stat_statements, pgcrypto, pg_trgm, tablefunc, uuid-ossp, pgrowlocks, btree_gist, PL/pgSQL, PL/Tcl, PL/Perl, PL/V8

Available on Heroku PostgreSQL: pgstattuple

Available on AWS RDS for PostgreSQL: postgres_fdw, chkpass, intagg, tsearch2, sslinfo

Here are the full lists for both Heroku PostgreSQL and AWS RDS for PostgreSQL.

Scaling Options

“Scaling” is a tricky word with databases because it means different things depending on the needs of your application. Scaling for writes vs. reads is based on low intensity and high volume (web traffic) compared to low volume and high intensity (analytics).

The most common scaling case on the web is scaling for read traffic. Both Heroku and RDS address this need with the ability to create read replicas. RDS calls them read replicas and Heroku calls them followers, but they’re essentially the same thing: a copy of the database, receiving live updates via the write-ahead-log over the wire to allow you to spread read traffic over multiple servers. This is commonly referred to as horizontal scaling. To create read replicas on either platform is a point-and-click operation.

Vertical scaling refers to increasing or decreasing the power of the hardware of your database in place. AWS and Heroku each handle this scenario differently. Heroku instructs users to create a follower of the newly desired database class and then promote it to the primary database once it’s caught up, destroying the original afterwards. Your application will need to update its database connection information to use the new database.

If your RDS database is a Multi-AZ database, then the failover database will be upgraded first. Once ready, the connection will automatically fail over to that instance while the primary is then upgraded, switching back to the primary afterwards. Without Multi-AZ, you can do the upgrade in place, but downtime will vary depending on the size of the database. Your other option is to create a read replica with the newly desired stats and then promote it to primary when it is ready, just as Heroku recommends.

To scale beyond the standard vertical and horizontal options for something that can handle distributed write scaling, neither option is a particularly good fit. It will probably be necessary to either manage your own Postgres-XC installation or restructure your application to isolate the write-heavy traffic into a more use-case specific data source.

Monitoring

AWS RDS for PostgreSQL comes with all of the standard AWS monitoring options via CloudWatch. CloudWatch provides extensive metrics with tracked history, along with a granular ability to set up alerts via email or SNS notifications (basically webhooks). These are great for integrating with tools like PagerDuty.

Heroku PostgreSQL monitoring relies more on logs and command line tools. Their pg-extras command line tool will show current information about what’s going on in the database, including bloat, blocking queries, cache and index hit ratios, identification of unused indexes, and the ability to kill specific queries. These tools, while not involving the stat tracking over time that you get from CloudWatch, provide extremely valuable insights into what’s going on with your database that you don’t come close to getting from RDS.
You can see more examples of pg-extras on GitHub. These types of insights are invaluable in tuning your application and database to avoid the problems you’d need a monitor to catch in the first place.

Other historical data is available in the logs, although Heroku recommends trying out Librato (which can work with any PostgreSQL database but has a Heroku plugin available for automatic configuration). Additionally, free New Relic plans will provide a wealth of insight into what’s going on with your application and database.

While CloudWatch provides more detailed insight as to what’s going on within the machine, Heroku uses the metrics seen within pg-extras to monitor and notify you of the various problems they see that require correction on your end. If data corruption happens, Heroku identifies and fixes it. Security problems, they’ll handle it. A DBA or a DevOps position will care significantly more about the CloudWatch metrics. Heroku PostgreSQL tries to focus on making sure you don’t have to worry about it.

Dataclips

One bonus feature that you get from Heroku PostgreSQL is Dataclips. Dataclips are basically a method for storing and sharing read-only queries among your team for the sake of reporting without having to grant access to every person who may need to see them. Just type in a query and view the results right there on the page. The queries are version controlled; if your team is passing them around and tweaking them, you’ll be able to see the changes over time.

In my personal experience, I’ve found dataclips to be a lifesaver, specifically for working with non-programmer teams. When business or support staff need information on sales, fraud, user behavior, account activity, or anything else we happen to have in there, I’ve always had the ability to write up a query to get at the information. Before dataclips, this meant that I needed to write up the query, save it somewhere, usually export the result set to a CSV or spreadsheet, and then email it to whomever was requesting it. Eventually, this becomes a routine activity that you’re having to handle at every request.

Enter dataclips. Now I can take that query and just send the random hashed link over to whoever requested the information. If they want more up-to-date information the next day, week, or month, they need only refresh the page. I write the query, then never hear that request again. That is a developer time-saver right there. You can save them and name them, as well as manage more strict access if need be.

Summary and Recommendation

Overall, AWS RDS for PostgreSQL will usually be cheaper and more tightly tailorable to exactly what your application’s needs are. You’ll have much more granular control over access, security, monitoring, alerts, geographic location, and maintenance plans.

With Heroku PostgreSQL, you’ll pay a little bit more on a simplified pricing structure, although all of your development databases will be free. You won’t be able to control a lot of the details that RDS gives you access to, but that’s partially by design so that you don’t have to deal with managing those details. With Heroku, you’ll get insights directly into how your database is performing and using the internal resources to help you catch, tune, and improve your setup before it becomes a problem.

If I had to choose, I’d probably go with Heroku and Heroku PostgreSQL as a startup while I focused on actually getting my application developed and getting customers in the door. The value proposition of saving time to focus on business goals so we can build a revenue stream would be of the greatest importance. Then when things grew to a point that the database was no longer changing as much, it might make sense to start migrating things over to RDS as we focus on locking things down to focus on stability, long-term maintenance, and security.

In the end, it really boils down to what costs you more: time or infrastructure. If time costs you more, go with Heroku PostgreSQL. If infrastructure costs you more, go with RDS. Having both platforms living within the AWS datacenters makes switching between the two a lot easier as your needs change.

June 22, 2015

by Moritz Plassnig

· 3,940 Views

Get Back Up and Try Again: Retrying in Python

I don't often write about the tools I use for my daily software development tasks. I recently realized that I really should share my workflows and weapons of choice more often.

One thing I have a hard time enduring during Python code reviews is people writing utility code that is not directly tied to the core of their business. To me, this looks like time wasted maintaining code that should be reused from elsewhere.

So today I'd like to start with retrying, a Python package that you can use to… retry anything.

It's OK to fail

Often in computing, you have to deal with external resources. That means accessing resources you don't control: resources that can fail, start flapping, or become unreachable or unavailable.

Most applications don't deal with that at all and explode in flight, leaving a skeptical user in front of the computer. A lot of software engineers refuse to deal with failure and don't bother handling this kind of scenario in their code.

In the best case, applications handle only the simple case where the external system is out of order. They log something and inform the user that they should try again later.

In this cloud computing era, we tend to design software components with service-oriented architecture in mind. That means having a lot of different services talking to each other over the network. And we all know that networks tend to fail, and distributed systems too.

Writing software in which failure is part of normal operation is a terrific idea.

Retrying

To help applications handle these potential failures, you need a plan. Leaving the user with the burden of "trying again later" is rarely a good choice. Therefore, most of the time you want your application to retry.

Retrying an action is a full strategy on its own, with a lot of options. You can retry only under certain conditions, with the tries spaced by time (e.g. every second), limited by a number of attempts (e.g. retry 3 times and abort), chosen according to the problem encountered, or any combination of those.

For all of that, I use the retrying library, which you can easily retrieve from PyPI.

retrying provides a decorator called retry that you can use on top of any function or method in Python to make it retry in case of failure. By default, retry calls your function endlessly until it returns rather than raising an error.

    import random
    from retrying import retry

    @retry
    def pick_one():
        if random.randint(0, 10) != 1:
            raise Exception("1 was not picked")

This will execute the function pick_one until 1 is returned by random.randint.

retry accepts a few arguments, such as the minimum and maximum delays to use, which can also be randomized. Randomizing the delay is a good strategy to avoid detectable patterns or congestion. Moreover, it supports exponential delay, which can be used to implement exponential backoff, a good solution for retrying tasks while really avoiding congestion. It's especially handy for background tasks.

    @retry(wait_exponential_multiplier=1000, wait_exponential_max=10000)
    def wait_exponential_1000():
        print("Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards")
        raise Exception("Retry!")

You can mix that with a maximum delay, which gives you a good strategy to retry for a while and then fail anyway:

    # Stop retrying after 30 seconds anyway
    >>> @retry(wait_exponential_multiplier=1000, wait_exponential_max=10000, stop_max_delay=30000)
    ... def wait_exponential_1000():
    ...     print("Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards")
    ...     raise Exception("Retry!")
    ...
    >>> wait_exponential_1000()
    Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards
    Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards
    Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards
    Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards
    Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards
    Wait 2^x * 1000 milliseconds between each retry, up to 10 seconds, then 10 seconds afterwards
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python2.7/site-packages/retrying.py", line 49, in wrapped_f
        return Retrying(*dargs, **dkw).call(f, *args, **kw)
      File "/usr/local/lib/python2.7/site-packages/retrying.py", line 212, in call
        raise attempt.get()
      File "/usr/local/lib/python2.7/site-packages/retrying.py", line 247, in get
        six.reraise(self.value[0], self.value[1], self.value[2])
      File "/usr/local/lib/python2.7/site-packages/retrying.py", line 200, in call
        attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
      File "<stdin>", line 4, in wait_exponential_1000
    Exception: Retry!

A pattern I use very often is retrying only on some exception types. You can specify a function that decides which exceptions should trigger a retry and which should not.

    def retry_on_ioerror(exc):
        return isinstance(exc, IOError)

    @retry(retry_on_exception=retry_on_ioerror)
    def read_file():
        with open("myfile", "r") as f:
            return f.read()

retry will call the function passed as retry_on_exception with the raised exception as its first argument. It's up to that function to return a boolean indicating whether a retry should be performed. In the example above, the file read is retried only if an IOError occurs; if any other exception type is raised, no retry is performed.

The same pattern can be implemented using the keyword argument retry_on_result, where you provide a function that analyses the result and decides whether to retry based on it.

    def retry_if_file_empty(result):
        return len(result) <= 0

    @retry(retry_on_result=retry_if_file_empty)
    def read_file():
        with open("myfile", "r") as f:
            return f.read()

This example will read the file until it stops being empty. If the file does not exist, an IOError is raised, and the default behavior of retrying on all exceptions kicks in, so the retry is performed.

That's it! retrying is a really good, small library that you should leverage rather than implementing your own half-baked solution!
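To make the mechanics behind the decorator concrete, here is a minimal standard-library sketch of capped exponential backoff with an exception filter. It is not how the retrying package is implemented; the helper name retry_with_backoff, its parameters, and the flaky_read function are all hypothetical and exist only for illustration.

```python
import functools
import time


def retry_with_backoff(max_attempts=5, base_delay=0.01, max_delay=0.1,
                       retry_on=(Exception,), sleep=time.sleep):
    """Hypothetical helper: retry a function with capped exponential backoff."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # The delay doubles on each attempt and is capped at max_delay.
                    sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
        return wrapper
    return decorator


calls = []


# The sleep function is replaced with a no-op so the example runs instantly.
@retry_with_backoff(max_attempts=4, retry_on=(IOError,), sleep=lambda seconds: None)
def flaky_read():
    calls.append(1)
    if len(calls) < 3:
        raise IOError("resource not ready")  # fails on the first two calls
    return "payload"


result = flaky_read()
```

Exceptions that are not listed in retry_on propagate immediately, which mirrors the idea behind retry_on_exception. In real code, a maintained library is usually the better choice, as the post argues.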

June 21, 2015

by Julien Danjou

· 3,652 Views

R: dplyr - Removing Empty Rows

I’m still working my way through the exercises in Think Bayes, and in Chapter 6 I needed to do some cleaning of the data in a CSV file containing information about the Price is Right.

I downloaded the file using wget:

    wget http://www.greenteapress.com/thinkbayes/showcases.2011.csv

And then loaded it into R and explored the first few rows using dplyr:

    library(dplyr)
    df2011 = read.csv("~/projects/rLearning/showcases.2011.csv")

    > df2011 %>% head(10)
                 X Sep..19 Sep..20 Sep..21 Sep..22 Sep..23 Sep..26 Sep..27 Sep..28 Sep..29 Sep..30 Oct..3
    1                5631K   5632K   5633K   5634K   5635K   5641K   5642K   5643K   5644K   5645K  5681K
    2
    3   Showcase 1   50969   21901   32815   44432   24273   30554   20963   28941   25851   28800  37703
    4   Showcase 2   45429   34061   53186   31428   22320   24337   41373   45437   41125   36319  38752
    5
    ...

As you can see, we have some empty rows which we want to get rid of to ease future processing. I couldn’t find an easy way to filter those out directly, but what we can do instead is have empty values converted to NA and then filter those.

First we need to tell read.csv to treat empty values as NA:

    df2011 = read.csv("~/projects/rLearning/showcases.2011.csv", na.strings = c("", "NA"))

And now we can filter them out using na.omit. Note that na.omit drops every row that contains at least one NA, which is why row 1, whose first column is empty, disappears along with the completely empty rows:

    df2011 = df2011 %>% na.omit()

    > df2011 %>% head(5)
                 X Sep..19 Sep..20 Sep..21 Sep..22 Sep..23 Sep..26 Sep..27 Sep..28 Sep..29 Sep..30 Oct..3
    3   Showcase 1   50969   21901   32815   44432   24273   30554   20963   28941   25851   28800  37703
    4   Showcase 2   45429   34061   53186   31428   22320   24337   41373   45437   41125   36319  38752
    6        Bid 1   42000   14000   32000   27000   18750   27222   25000   35000   22500   21300  21567
    7        Bid 2   34000   59900   45000   38000   23000   18525   32000   45000   32000   27500  23800
    9 Difference 1    8969    7901     815   17432    5523    3332   -4037   -6059    3351    7500  16136
    ...

Much better!
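For readers who want to see the filtering rule spelled out, here is a rough Python sketch of the same idea using only the standard library: treat empty cells as missing and keep only rows where every cell has a value. The sample data and the function name drop_incomplete_rows are made up for illustration; they are not taken from the showcases file.

```python
import csv
import io

# Hypothetical sample shaped like the showcases file: blank separator rows
# plus one row whose first cell is empty.
raw = """X,Sep. 19,Sep. 20
,5631K,5632K
,,
Showcase 1,50969,21901
,,
Bid 1,42000,14000
"""


def drop_incomplete_rows(text):
    """Return the header and only those rows in which every cell is non-empty."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row for row in reader if row and all(cell.strip() for cell in row)]
    return header, rows


header, rows = drop_incomplete_rows(raw)
```

As with na.omit, a single empty cell is enough to drop a row, so this rule is stricter than removing only the rows that are entirely empty.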

June 21, 2015

by Mark Needham

· 58,890 Views · 1 Like

Java RegEx: How to Replace All With Pre-processing on a Captured Group

Need to replace all occurrences of a pattern in a text using a captured group? Something like this in Java works nicely:

    String html = "myurl\n" +
                  "myurl2\n" +
                  "myurl3";
    html = html.replaceAll("id=(\\w+)'?", "productId=$1'");

Here I swapped the query parameter name from "id" to "productId" on all the links that matched my criteria. But what happens if I need to pre-process the captured ID value before replacing it? Let's say I now want to do a lookup and transform the ID value into something else.

This extra requirement leads us to dig deeper into the Java RegEx package. Here is what I came up with:

    import java.util.regex.*;
    ...
    public String replaceAndLookupIds(String html) {
        StringBuffer newHtml = new StringBuffer();
        Pattern p = Pattern.compile("id=(\\w+)'?");
        Matcher m = p.matcher(html);
        while (m.find()) {
            String id = m.group(1);
            String newId = lookup(id);
            String rep = "productId=" + newId + "'";
            // quoteReplacement keeps any '$' or '\' in the looked-up value
            // from being interpreted as a group reference.
            m.appendReplacement(newHtml, Matcher.quoteReplacement(rep));
        }
        m.appendTail(newHtml);
        return newHtml.toString();
    }
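For comparison, the same "pre-process the captured group" idea can be sketched in Python, where re.sub accepts a function as the replacement. The PRODUCT_IDS table and the sample links below are hypothetical stand-ins for the post's lookup() helper and HTML input.

```python
import re

# Hypothetical lookup table standing in for the post's lookup() helper.
PRODUCT_IDS = {"1123": "p-9001", "2123": "p-9002"}


def replace_and_lookup_ids(html):
    def swap(match):
        old_id = match.group(1)
        new_id = PRODUCT_IDS.get(old_id, old_id)  # fall back to the original ID
        return "productId=" + new_id + "'"
    # The callable is invoked once per match and its return value is used literally.
    return re.sub(r"id=(\w+)'?", swap, html)


result = replace_and_lookup_ids(
    "<a href='myurl?id=1123'>one</a> <a href='myurl2?id=7'>two</a>"
)
```

Because the callback's return value is inserted as-is, no escaping step comparable to Matcher.quoteReplacement is needed here.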

June 17, 2015

by Zemian Deng

· 14,062 Views · 1 Like

Purpose of ThreadLocal in Java and When to Use ThreadLocal

ThreadLocal is a simple way to have per-thread data that cannot be accessed concurrently by other threads, without requiring great effort or design compromises.
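As a rough illustration of per-thread data, here is a small sketch using Python's threading.local, which plays a role similar to Java's ThreadLocal. It is an analogue rather than the Java API itself, and the worker function and names are invented for the example.

```python
import threading

context = threading.local()  # each thread sees its own attributes on this object
seen = {}


def worker(name):
    context.user = name          # visible only to the current thread
    seen[name] = context.user    # record what this thread observed


threads = [threading.Thread(target=worker, args=(n,)) for n in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never assigned context.user, so it has no such attribute.
main_has_user = hasattr(context, "user")
```

Each thread reads back only the value it stored itself, without any locking around the context object.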

June 7, 2015

by Santosh Singh

· 21,544 Views · 3 Likes

Top 80 Thread- Java Interview Questions and Answers (Part 2)

PART 1 > THREADS - Top 80 interview questions and answers (detailed explanation with programs)

Question 61.

    class MyRunnable implements Runnable {
        public void run() {
            for (int i = 0; i < 3; i++) {
                System.out.println("i=" + i + " ,ThreadName=" + Thread.currentThread().getName());
            }
        }
    }

    public class MyClass {
        public static void main(String... args) {
            MyRunnable runnable = new MyRunnable();
            System.out.println("start main() method");
            Thread thread1 = new Thread(runnable);
            Thread thread2 = new Thread(runnable);
            thread1.start();
            thread2.start();
            System.out.println("end main() method");
        }
    }

Answer. Thread behaviour is unpredictable because the execution of threads depends on the thread scheduler. "start main() method" will be printed first, but after that we cannot guarantee the order of thread1, thread2 and the main thread; they might run simultaneously or sequentially, so the position of "end main() method" in the output is not guaranteed. One possible output:

    /*OUTPUT
    start main() method
    end main() method
    i=0 ,ThreadName=Thread-0
    i=0 ,ThreadName=Thread-1
    i=1 ,ThreadName=Thread-0
    i=2 ,ThreadName=Thread-0
    i=1 ,ThreadName=Thread-1
    i=2 ,ThreadName=Thread-1
    */

Question 62.

    class MyRunnable implements Runnable {
        public void run() {
            for (int i = 0; i < 3; i++) {
                System.out.println("i=" + i + " ,ThreadName=" + Thread.currentThread().getName());
            }
        }
    }

    public class MyClass {
        public static void main(String... args) throws InterruptedException {
            System.out.println("In main() method");
            MyRunnable runnable = new MyRunnable();
            Thread thread1 = new Thread(runnable);
            Thread thread2 = new Thread(runnable);
            thread1.start();
            thread1.join();
            thread2.start();
            thread2.join();
            System.out.println("end main() method");
        }
    }

Answer. We use the join() method to ensure that all threads started from main end in the order in which they were started, and that main ends last. In other words, join() waits for the thread it is called on to die.

    /*OUTPUT
    In main() method
    i=0 ,ThreadName=Thread-0
    i=1 ,ThreadName=Thread-0
    i=2 ,ThreadName=Thread-0
    i=0 ,ThreadName=Thread-1
    i=1 ,ThreadName=Thread-1
    i=2 ,ThreadName=Thread-1
    end main() method
    */

Question 63.

    class MyRunnable implements Runnable {
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    Thread.sleep(1000);
                    System.out.println("x");
                }
            } catch (InterruptedException e) {
                System.out.println(Thread.currentThread().getName() + " ENDED");
            }
        }
    }

    public class MyClass {
        public static void main(String args[]) throws Exception {
            MyRunnable obj = new MyRunnable();
            Thread t = new Thread(obj, "Thread-1");
            t.start();
            System.out.println("press enter");
            System.in.read();
            t.interrupt();
        }
    }

Answer. "press enter" will be printed first, then Thread-1 will keep printing x until enter is pressed. Once enter is pressed, "Thread-1 ENDED" will be printed. System.in.read() causes the main thread to go from the running state to the waiting state (the thread waits for user input).

    /* OUTPUT
    press enter
    x
    x
    x
    x
    Thread-1 ENDED
    */

Question 64.

    class MyRunnable implements Runnable {
        public void run() {
            synchronized (this) {
                System.out.println("1 ");
                try {
                    this.wait();
                    System.out.println("2 ");
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    public class MyClass {
        public static void main(String[] args) {
            MyRunnable myRunnable = new MyRunnable();
            Thread thread1 = new Thread(myRunnable, "Thread-1");
            thread1.start();
        }
    }

Answer. The thread acquires the lock on the myRunnable object, so 1 is printed, but notify() is never called, so 2 will never be printed. The thread waits forever. This kind of hang, sometimes loosely described as a deadlock, is called a frozen process.

    /*OUTPUT
    1
    */

Question 65.

    import java.util.ArrayList;

    /* Producer is producing. Producer will allow consumer to
     * consume only when all products have been produced (i.e. when production is over).
     */
    class Producer implements Runnable {
        ArrayList<Integer> sharedQueue;

        Producer() {
            sharedQueue = new ArrayList<Integer>();
        }

        @Override
        public void run() {
            synchronized (this) {
                for (int i = 1; i <= 3; i++) { // Producer will produce 3 products
                    sharedQueue.add(i);
                    System.out.println("Producer is still Producing, Produced : " + i);
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) { e.printStackTrace(); }
                }
                System.out.println("Production is over, consumer can consume.");
                this.notify();
            }
        }
    }

    class Consumer extends Thread {
        Producer prod;

        Consumer(Producer obj) {
            prod = obj;
        }

        public void run() {
            synchronized (this.prod) {
                System.out.println("Consumer waiting for production to get over.");
                try {
                    this.prod.wait();
                } catch (InterruptedException e) { e.printStackTrace(); }
            }
            int productSize = this.prod.sharedQueue.size();
            for(int i=0;i

Q61- Q80
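One pitfall worth spelling out for the wait()/notify() examples above: a thread that starts waiting after the notification has already been sent will wait forever, just like the frozen thread in Question 64. A common remedy is to pair the wait with a shared flag that is re-checked under the lock. The sketch below shows that idea with Python's threading.Condition; it is an analogue of the Java pattern, and all names in it are invented for the example.

```python
import threading

condition = threading.Condition()
shared_queue = []
production_over = False   # the flag that the consumer actually waits on
consumed = []


def producer():
    global production_over
    with condition:
        for i in range(1, 4):
            shared_queue.append(i)
        production_over = True
        condition.notify_all()


def consumer():
    with condition:
        # wait_for re-checks the flag while holding the lock, so a
        # notification sent before this thread began waiting is not lost.
        condition.wait_for(lambda: production_over)
        consumed.extend(shared_queue)


p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start()   # the producer is deliberately started first
c.start()
p.join()
c.join()
```

In Java, the corresponding idiom is to call wait() inside a while loop that tests such a flag, rather than calling wait() unconditionally.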

June 6, 2015

by Ankit Mittal

· 13,729 Views · 3 Likes

Python: Joining Multiple Generators/Iterators

In my previous blog post I described how I’d refactored some scraping code I’ve been working on to use iterators and ended up with a function which returned a generator containing all the events for one BBC live text match:

    match_id = "32683310"
    events = extract_events("data/raw/%s" % (match_id))

    >>> print type(events)
    <type 'generator'>

The next thing I wanted to do was get the events for multiple matches, which meant I needed to glue together multiple generators into one big generator.

itertools’ chain function does exactly what we want:

    itertools.chain(*iterables)
    Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. Used for treating consecutive sequences as a single sequence.

First let’s try it out on a collection of range generators:

    import itertools
    gens = [(n*2 for n in range(0, 3)), (n*2 for n in range(4,7))]

    >>> gens
    [<generator object <genexpr> at 0x10ff3b140>, <generator object <genexpr> at 0x10ff7d870>]

    output = itertools.chain()
    for gen in gens:
        output = itertools.chain(output, gen)

Now if we iterate through ‘output’ we’d expect to see the multiples of 2 produced by both generators, i.e. 0 to 4 and then 8 to 12:

    >>> for item in output:
    ...     print item
    ...
    0
    2
    4
    8
    10
    12

Exactly as we expected! Our scraping code looks like this once we plug the chaining in:

    matches = ["32683310", "32683303", "32384894", "31816155"]

    raw_events = itertools.chain()
    for match_id in matches:
        raw_events = itertools.chain(raw_events, extract_events("data/raw/%s" % (match_id)))

‘raw_events’ is now a single iterator that we can iterate through to process the events for all matches.
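The loop that repeatedly wraps the previous chain can also be written in one step with itertools.chain.from_iterable, which takes an iterable of iterables and stays lazy. Here is a small sketch; the extract_events function below is a hypothetical stand-in that yields fake events, not the scraping function from the post.

```python
import itertools


def extract_events(match_id):
    """Hypothetical stand-in for the post's extract_events: yields fake events."""
    for minute in (1, 2):
        yield (match_id, minute)


matches = ["32683310", "32683303"]

# The generator expression is consumed lazily, one match at a time.
raw_events = itertools.chain.from_iterable(extract_events(m) for m in matches)
events = list(raw_events)
```

This avoids building a deeply nested chain of chains when the list of matches grows long.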

June 4, 2015

by Mark Needham

· 6,619 Views · 1 Like

Easy SQLite on Android with RxJava

Whenever I consider using an ORM library on my Android projects, I always end up abandoning the idea and rolling my own layer instead, for a few reasons:

- My database models have never reached the level of complexity that ORMs help with.
- Every ounce of performance counts on Android, and I can’t help but fear that the generated SQL will not be as optimized as it should be.

Recently, I started using a pretty simple design pattern that relies on RxJava to offer what I think is a fairly simple way of managing database access.

Easy reads

One of the important design principles on Android is to never perform I/O on the main thread, and this obviously applies to database access. RxJava turns out to be a great fit for this problem.

I usually create one Java class per table, and these tables are then managed by my SQLiteOpenHelper. With this new approach, I decided to extend my use of the helper and make it the only point of access to anything that needs to read or write to my SQL tables.

Let’s consider a simple example: a USERS table managed by the UserTable class:

    // UserTable.java
    List<User> getUsers(SQLiteDatabase db, String userId) {
        // select * from users where _id = {userId}
    }

The problem with this method is that if you’re not careful, you will call it on the main thread, so it’s up to the caller to make sure they are always invoking this method on a background thread (and then to post their UI update back on the main thread, if they are updating the UI). Instead of relying on managing yet another thread pool or, worse, using AsyncTask, we are going to rely on RxJava to take care of the threading model for us.

Let’s rewrite this method to return a callable instead:

    // UserTable.java
    Callable<List<User>> getUsers(final SQLiteDatabase db, final String userId) {
        return new Callable<List<User>>() {
            @Override
            public List<User> call() {
                // select * from users where _id = {userId}
            }
        };
    }

In effect, we simply refactored our method to return a lazy result, which makes it possible for the database helper to turn this result into an Observable:

    // MySqliteOpenHelper.java
    Observable<List<User>> getUsers(String userId) {
        return makeObservable(mUserTable.getUsers(getReadableDatabase(), userId))
            .subscribeOn(Schedulers.io());
    }

Notice that on top of turning the lazy result into an Observable, the helper forces the subscription to happen on a background thread (the IO scheduler here, since we’re accessing the database). This guarantees that callers don’t have to worry about ever blocking the main thread.

Finally, the makeObservable method is pretty straightforward (and completely generic):

    // MySqliteOpenHelper.java
    private static <T> Observable<T> makeObservable(final Callable<T> func) {
        return Observable.create(
            new Observable.OnSubscribe<T>() {
                @Override
                public void call(Subscriber<? super T> subscriber) {
                    try {
                        subscriber.onNext(func.call());
                        subscriber.onCompleted();
                    } catch (Exception ex) {
                        Log.e(TAG, "Error reading from the database", ex);
                        subscriber.onError(ex);
                    }
                }
            });
    }

At this point, all our database reads have become observables that guarantee that the queries run on a background thread. Accessing the database is now pretty standard Rx code:

    // DisplayUsersFragment.java
    @Inject
    MySqliteOpenHelper mDbHelper;

    // ...

    mDbHelper.getUsers(userId)
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(new Action1<List<User>>() {
            @Override
            public void call(List<User> users) {
                // Update our UI with the users
            }
        });

And if you don’t need to update your UI with the results, just observe on a background thread.
Since your database layer is now returning observables, it’s trivial to compose and transform these results as they come in. For example, you might decide that your ContactTable is a low-level class that should not know anything about your model (the User class) and that it should instead return only low-level objects (maybe a Cursor or ContentValues). Then you can use Rx to map these low-level values into your model classes for an even cleaner separation of layers.

Two additional remarks:

- Your Table Java classes should contain no public methods: only package-protected methods (which are accessed exclusively by your Helper, located in the same package) and private methods. No other classes should ever access these Table classes directly.
- This approach is extremely compatible with dependency injection: it’s trivial to have both your database helper and your individual tables injected (additional bonus: with Dagger 2, your tables can have their own component, since the database helper is the only reference needed to instantiate them).

This is a very simple design pattern that has scaled remarkably well for our projects while fully enabling the power of RxJava. I also started extending this layer to provide a flexible update notification mechanism for list view adapters (not unlike what SQLBrite offers), but this will be for a future post.

This is still a work in progress, so feedback is welcome!

June 4, 2015

by Cedric Beust

· 16,252 Views

Deploying Web Application Using Vagrant

In this article, we will deploy a Spring web application in Tomcat 7 on an Ubuntu 12.04 VM, created and provisioned using Vagrant.

As an initial step, download the Vagrant package specific to your operating system and install it on your machine. Then create a folder on a drive; for me it's "workspace", created on the D drive (D:/workspace). Now open a command prompt, go to the "workspace" folder and execute the command below to clone the Git repository to the local drive:

    git clone https://github.com/arpitaggarwal/hello-spring.git

Once the clone has completed, execute the command:

    vagrant init

The above command will create a Vagrant file (Vagrantfile). Replace its content with the following:

    Vagrant.configure(2) do |config|
      # A standard Ubuntu 12.04 LTS 32-bit box
      # For more boxes, you can look at https://atlas.hashicorp.com/boxes/search
      config.vm.box = "hashicorp/precise32"
      config.vm.provision "shell", path: "vagrant_provision.sh"
      # Create a private network, which allows host-only access to the machine
      # using a specific IP.
      config.vm.network "private_network", ip: "192.168.33.10"
    end

Then create a vagrant_provision.sh file in the same directory and copy the contents below into it:

    #!/usr/bin/env bash
    sudo apt-get update
    echo "Installing Apache.."
    sudo apt-get install -y apache2
    echo "Installing Tomcat.."
    sudo apt-get install -y tomcat7
    echo "Installing Tomcat7 docs.."
    sudo apt-get install -y tomcat7-docs
    echo "Installing Tomcat7 administration webapps.."
    sudo apt-get install -y tomcat7-admin
    echo "Installing Tomcat7 examples webapps.."
    sudo apt-get install -y tomcat7-examples
    echo "Installing Git.."
    sudo apt-get install -y git
    echo "Installing Maven.."
    sudo apt-get install -y maven
    echo "Installing Java 7.."
    sudo apt-get install -y software-properties-common python-software-properties
    echo oracle-java7-installer shared/accepted-oracle-license-v1-1 select true | sudo /usr/bin/debconf-set-selections
    sudo add-apt-repository ppa:webupd8team/java -y
    sudo apt-get update
    sudo apt-get install -y oracle-java7-installer
    echo "Setting environment variables for Java 7.."
    sudo apt-get install -y oracle-java7-set-default

Then execute the command:

    vagrant up

The above command will bring our VM up and also start the Apache and Tomcat instances (which we installed in vagrant_provision.sh). To check that Apache is running, open the URL http://192.168.33.10, and to check that Tomcat is running, open http://192.168.33.10:8080.

Now we will log into our newly created VM, build our project (cloned from Git) and copy it to the Tomcat deployment directory. First, open a terminal on the VM by executing the command below:

    vagrant ssh

Next, check where our project is by executing the commands below:

    cd /
    ls

Now we can see a folder named "vagrant". This is the folder which is linked to our host machine, and inside it we will see our cloned project. Go inside and build the project by executing the commands below:

    cd vagrant
    cd hello-spring
    mvn clean install

This generates a target folder with our .war file inside it. Now copy this .war to the Tomcat 7 deployment directory after going back to the root directory, using the commands below:

    cd /
    sudo cp /vagrant/hello-spring/target/hello-spring.war /var/lib/tomcat7/webapps/

Our application is now copied to the Tomcat deployment directory and we are almost ready to open the URL and see it running. Before that, we have to change the Java version used by Tomcat, as our application is compiled with Java 1.7 and Tomcat is using 1.6. It's easy; just execute the commands below:

    cd /etc/default
    sudo nano tomcat7

Search for JAVA_HOME, uncomment it and edit it as below:

    JAVA_HOME=/usr/lib/jvm/java-7-oracle

Now just restart the Tomcat instance, using the command below:

    sudo service tomcat7 restart

Open the URL http://192.168.33.10:8080/hello-spring in the host browser and we will see the welcome page of our Spring application.

June 4, 2015

by Arpit Aggarwal

· 29,459 Views · 5 Likes

Ecosystem of Hadoop Animal Zoo

Hadoop is best known for MapReduce and its distributed file system (HDFS). More recently, other productivity tools developed on top of these have come to form a complete Hadoop ecosystem. Most of the projects are hosted under the Apache Software Foundation. The Hadoop ecosystem projects are listed below.

Hadoop Common

A set of components and interfaces for distributed file systems and I/O (serialization, Java RPC, persistent data structures).

http://hadoop.apache.org/

HDFS

A distributed file system that runs on large clusters of commodity hardware. The Hadoop Distributed File System (HDFS) was renamed from NDFS. It is a scalable data store that holds semi-structured, unstructured and structured data.

http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/hdfsuserguide.html
http://wiki.apache.org/hadoop/hdfs

MapReduce

MapReduce is the distributed, parallel computing programming model for Hadoop, inspired by Google's MapReduce research paper. Hadoop includes an implementation of the MapReduce programming model. In MapReduce there are two phases, not surprisingly map and reduce. To be precise, between the map and reduce phases there is another phase called sort and shuffle. The JobTracker on the NameNode machine manages the other cluster nodes. MapReduce programs can be written in Java. If you like SQL or other non-Java languages, you are still in luck: you can use a utility called Hadoop Streaming.

http://wiki.apache.org/hadoop/hadoopmapreduce

Hadoop Streaming

A utility that enables MapReduce code in many languages such as C, Perl, Python, C++ and Bash. Examples include a Python mapper and an AWK reducer.

http://hadoop.apache.org/docs/r1.2.1/streaming.html

Avro

A serialization system for efficient, cross-language RPC and persistent data storage. Avro is a framework for performing remote procedure calls and data serialization. In the context of Hadoop, it can be used to pass data from one program or language to another, e.g. from C to Pig.
It is particularly suited for use with scripting languages such as Pig, because data is always stored with its schema in Avro.

http://avro.apache.org/

Apache Thrift

Apache Thrift allows you to define data types and service interfaces in a simple definition file. Taking that file as input, the compiler generates code to be used to easily build RPC clients and servers that communicate seamlessly across programming languages. Instead of writing a load of boilerplate code to serialize and transport your objects and invoke remote methods, you can get right down to business.

http://thrift.apache.org/

Hive and Hue

If you like SQL, you will be delighted to hear that you can write SQL and have Hive convert it to a MapReduce job. But you don't get a full ANSI SQL environment. Hue gives you a browser-based graphical interface to do your Hive work. Hue features a file browser for HDFS, a job browser for MapReduce/YARN, an HBase browser, and query editors for Hive, Pig, Cloudera Impala and Sqoop2. It also ships with an Oozie application for creating and monitoring workflows, a ZooKeeper browser and an SDK.

Pig

A high-level data flow programming language and execution environment for MapReduce coding. The Pig language is called Pig Latin. You may find the naming conventions somewhat unconventional, but you get incredible price-performance and high availability.

https://pig.apache.org/

Jaql

Jaql is a functional, declarative programming language designed especially for working with large volumes of structured, semi-structured and unstructured data. As its name implies, a primary use of Jaql is to handle data stored as JSON documents, but Jaql can work on various types of data. For example, it can support XML, comma-separated values (CSV) data and flat files. A "SQL within Jaql" capability lets programmers work with structured SQL data while employing a JSON data model that's less restrictive than its Structured Query Language counterparts.

1. Jaql in Google Code
2. What is Jaql? by IBM

Sqoop

Sqoop provides bi-directional data transfer between Hadoop (HDFS) and your favorite relational database. For example, you might be storing your app data in a relational store such as Oracle; if you now want to scale your application with Hadoop, you can migrate the Oracle database data to Hadoop HDFS using Sqoop.

http://sqoop.apache.org/

Oozie

Manages Hadoop workflows. It doesn't replace your scheduler or BPM tooling, but it provides if-then-else branching and control for Hadoop jobs.

https://oozie.apache.org/

ZooKeeper

A distributed, highly available coordination service. ZooKeeper provides primitives such as distributed locks that can be used for building highly scalable applications. It is used to manage synchronization for the cluster.

http://zookeeper.apache.org/

HBase

Based on Google's Bigtable, HBase "is an open-source, distributed, versioned, column-oriented store" that sits on top of HDFS. It is a super scalable key-value store that works very much like a persistent hash map (Python developers can think of a dictionary). It is not a conventional relational database; it is a distributed, column-oriented database. HBase uses HDFS for its underlying storage. It supports both batch-style computations using MapReduce and point queries for random reads.

https://hbase.apache.org/

Cassandra

A column-oriented NoSQL data store which offers scalability and high availability without compromising on performance. It is a perfect platform for commodity hardware and cloud infrastructure. Cassandra's data model offers the convenience of column indexes with the performance of log-structured updates, strong support for denormalization and materialized views, and powerful built-in caching.

http://cassandra.apache.org/

Flume

A real-time loader for streaming your data into Hadoop. It stores data in HDFS and HBase. Flume "channels" data between "sources" and "sinks", and its data harvesting can be either scheduled or event-driven.
Possible sources for Flume include Avro, files, and system logs, and possible sinks include HDFS and HBase. http://flume.apache.org/

Mahout
Machine learning for Hadoop, used for predictive analytics and other advanced analysis. There are currently four main groups of algorithms in Mahout:
recommendations, a.k.a. collaborative filtering
classification, a.k.a. categorization
clustering
frequent itemset mining, a.k.a. parallel frequent pattern mining
Mahout is not simply a collection of pre-existing algorithms. Many machine learning algorithms are intrinsically non-scalable; that is, given the types of operations they perform, they cannot be executed as a set of parallel processes. Algorithms in the Mahout library belong to the subset that can be executed in a distributed fashion. http://en.wikipedia.org/wiki/list_of_machine_learning_algorithms https://www.coursera.org/course/machlearning https://mahout.apache.org/

FUSE
Makes HDFS look like a regular file system, so that you can use ls, rm, cd, etc. directly on HDFS data.

Whirr
Apache Whirr is a set of libraries for running cloud services. Whirr provides:
a cloud-neutral way to run services, so you don't have to worry about the idiosyncrasies of each provider;
a common service API, where the details of provisioning are particular to the service;
smart defaults for services, so you can get a properly configured system running quickly while still being able to override settings as needed.
You can also use Whirr as a command line tool for deploying clusters. https://whirr.apache.org/

Giraph
An open source graph processing API, like Pregel from Google. https://giraph.apache.org/

Chukwa
Chukwa, an incubator project on Apache, is a data collection and analysis system built on top of HDFS and MapReduce. Tailored for collecting logs and other data from distributed monitoring systems, Chukwa provides a workflow that allows for incremental data collection, processing and storage in Hadoop.
It is included in the Apache Hadoop distribution as an independent module. https://chukwa.apache.org/

Drill
Apache Drill, an incubator project on Apache, is an open-source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Drill is the open source version of Google's Dremel system, which is available as an IaaS service called Google BigQuery. One explicitly stated design goal is that Drill should be able to scale to 10,000 servers or more and process petabytes of data and trillions of records in seconds. http://incubator.apache.org/drill/

Impala (Cloudera)
Released by Cloudera, Impala is an open-source project which, like Apache Drill, was inspired by Google's paper on Dremel; the purpose of both is to facilitate real-time querying of data in HDFS or HBase. Impala uses an SQL-like language that, though similar to HiveQL, is currently more limited than HiveQL. Because Impala relies on the Hive metastore, Hive must be installed on a cluster in order for Impala to work. The secret behind Impala's speed is that it "circumvents MapReduce to directly access the data through a specialized distributed query engine that is very similar to those found in commercial parallel RDBMSs." (source: Cloudera) http://www.cloudera.com/content/cloudera/en/products-and-services/cdh/impala.html http://training.cloudera.com/elearning/impala/

June 3, 2015

by Umashankar Ankuri

Top 80 Thread- Java Interview Questions and Answers (Part 1)

Question 1. What is a Thread in Java?
Answer.
Threads make the best possible use of the CPU and so enable multiprocessing. Multithreading reduces CPU idle time, which improves application performance.
Threads are lightweight processes.
The Thread class belongs to the java.lang package.
We can create multiple threads in Java. Even if we don't create any thread, at least one thread exists, i.e. the main thread.
Multiple threads can run in parallel in Java.
Each thread has its own stack.
Advantage of threads: suppose one thread needs 10 minutes to complete a certain task. Ten threads working at the same time could complete that task in about 1 minute, provided the work can be split evenly and enough CPU cores are available, because threads can run in parallel.

Question 2. What is the difference between a Process and a Thread in Java?
Answer.
One process can have multiple threads; threads are subdivisions of a process. One or more threads run in the context of a process. Threads can execute any part of the process, and the same part of the process can be executed by multiple threads.
Processes have their own copy of the data segment of the parent process, while threads have direct access to the data segment of their process.
Processes have their own address space, while threads share the address space of the process that created them.
Process creation is expensive (we might need to copy the whole parent process), but a thread can be created easily.
Processes can easily communicate with their child processes, but interprocess communication in general is difficult. Threads, on the other hand, can easily communicate with other threads of the same process using the wait() and notify() methods.
Within a process, all threads share system resources such as heap memory, while each thread has its own stack.
A change made to a process does not affect its child processes, but a change made to a thread can affect the behaviour of the other threads of the process.
Example to see where threads are created on different processes and on the same process.

Question 3. How to implement Threads in Java?
Answer. This is a very basic threading question.
Threads can be created in two ways: by implementing the java.lang.Runnable interface, or by extending the java.lang.Thread class and overriding its run() method. A Thread object has its own variables and methods and lives and dies on the heap, but a thread of execution is an individual lightweight process that has its own call stack.
Thread creation by implementing the java.lang.Runnable interface:
1) Create an object of the class that implements the Runnable interface: MyRunnable runnable=new MyRunnable();
2) Then create a Thread object by calling its constructor and passing a reference to the Runnable, i.e. the runnable object: Thread thread=new Thread(runnable);

Question 4. Does each Thread have its own stack, and if yes, how? (Important)
Answer. Yes, threads have their own stacks. This is a very interesting question, where the interviewer checks your basic knowledge of how threads internally maintain their own stacks. Each thread gets its own call stack, which holds the frames and local variables of the methods it is executing, while objects on the heap are shared between threads.

Question 5. Should we implement the Runnable interface or extend the Thread class? What are the differences between implementing Runnable and extending Thread?
Answer. You should extend Thread only when you are looking to modify run() and other methods as well. If you are looking to modify only the run() method, implementing Runnable is the best option (the Runnable interface has only one abstract method, i.e. run()).
Differences between implementing the Runnable interface and extending the Thread class:
Multiple inheritance is not allowed in Java: when we implement the Runnable interface we can still extend another class, but if we extend the Thread class we cannot extend any other class, because Java does not allow multiple inheritance of classes. The same work is done either way, but when implementing Runnable we are still left with the option of extending some other class. So it's better to implement Runnable.
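The multiple-inheritance point can be illustrated with a minimal sketch. The class names ReportJob, ReportTask and RunnableDemo are hypothetical and chosen only for this illustration; the sketch simply assumes we already have a base class we must extend and still want the work to run on its own thread.

```java
public class RunnableDemo {

    // Hypothetical base class, standing in for any class we already need to extend.
    static class ReportJob {
        protected final String name;
        ReportJob(String name) { this.name = name; }
    }

    // Extends ReportJob and is still runnable on a thread.
    // This would not be possible if the class had to extend Thread.
    static class ReportTask extends ReportJob implements Runnable {
        private String ranOn; // name of the thread that executed run()

        ReportTask(String name) { super(name); }

        @Override
        public void run() {
            ranOn = Thread.currentThread().getName();
        }

        String ranOn() { return ranOn; }
    }

    // Starts the task on a new thread, waits for it to finish and
    // returns the name of the thread that actually executed run().
    static String runOnNewThread(ReportTask task, String threadName) {
        Thread thread = new Thread(task, threadName);
        thread.start();
        try {
            thread.join(); // join() also makes the write to ranOn visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return task.ranOn();
    }

    public static void main(String[] args) {
        System.out.println(runOnNewThread(new ReportTask("incomes"), "worker-1"));
    }
}
```

The task keeps its own superclass and only supplies run(); the Thread object is created separately and handed the task.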
Thread safety: when we implement the Runnable interface, the same object is shared among multiple threads (so access to its state must be synchronized), but when we extend the Thread class each thread is associated with a new object.
Inheritance (implementing Runnable is a lightweight operation): when we extend Thread, all Thread class features are inherited unnecessarily, but when we implement Runnable no extra features are inherited, as Runnable consists of only one abstract method, i.e. run(). So implementing Runnable is a lightweight operation.
Coding to an interface: Java itself recommends coding to interfaces, so we should implement Runnable rather than extend Thread. Note also that the Thread class itself implements Runnable.
Don't extend unless you want to modify the fundamental behaviour of the class: extend Thread only when you are looking to modify run() and other methods as well. We must not extend the Thread class unless we are looking to modify the fundamental behaviour of Thread.
Flexibility in code when we implement Runnable: when we extend Thread, first of all every thread feature is inherited and our class becomes a direct subclass of Thread, so whatever we do happens in the Thread class. When we implement Runnable, we create a new thread and pass the runnable object as a parameter, or we can pass the runnable object to an ExecutorService, and much more. So we have more options when we implement Runnable and our code becomes more flexible.
ExecutorService: if we implement Runnable, we can start multiple threads on the same runnable object with an ExecutorService (because a Runnable object can be started with new threads), but not when we extend Thread (because a thread can be started only once).

Question 6. How can you say Thread behaviour is unpredictable? (Important)
Answer.
The answer is quite simple: thread behaviour is unpredictable because the execution of threads depends on the thread scheduler, and the thread scheduler may be implemented differently on different platforms such as Windows, Unix, etc. The same threading program may produce different output in subsequent executions, even on the same platform. To demonstrate this, we create 2 threads on the same Runnable object, put a for loop in the run() method and start both threads. There is no guarantee which thread will complete first; both threads enter the for loop in no fixed order.

Question 7. When are threads not lightweight processes in Java?
Answer. Threads are lightweight processes only when threads of the same process are executing concurrently. If threads of different processes are executing concurrently, then threads are heavyweight, because switching between them requires a full process context switch.

Question 8. How can you ensure that all threads started from main end in the order in which they started, and that main ends last? (Important)
Answer. Interviewers want to test the interviewee's knowledge of Thread methods, so this is the time to prove your point by answering correctly. We can use the join() method to ensure that all threads started from main end in the order in which they started and that main ends last. In other words, join() waits for this thread to die. Calling join() internally calls join(0).
DETAILED DESCRIPTION : Join() method - ensure all threads that started from main must end in order in which they started and also main should end in last. Types of join() method with programs - 10 salient features of join.

Question 9. What is the difference between starting a thread with the run() and start() methods? (Important)
Answer. This is quite an interesting question; it might confuse you a bit and at times make you wonder whether there really is any difference between starting a thread with run() and with start().
When you call start(), a new thread of execution is created and run() is then called internally on it, so run() is ultimately executed by the newly created thread. When you call run() directly, no new thread is started; the calling thread (for example main) executes run() itself.

Question 10. What is the significance of the volatile keyword? (Important)
Answer. Java allows threads to access shared variables. As a rule, to ensure that shared variables are consistently updated, a thread should ensure that it has exclusive use of such variables by obtaining a lock that enforces mutual exclusion for those shared variables. If a field is declared volatile, the Java memory model ensures that all threads see a consistent value for the variable.
A few small questions >
Q. Can we have volatile methods in Java? No, volatile is a keyword that can be used only with variables.
Q. Can we have synchronized variables in Java? No, synchronized can be used only with methods (in the method declaration) and with blocks.

Question 11. Differences between the synchronized and volatile keywords in Java? (Important)
Answer. This is a very important question from an interview perspective.
volatile can be used as a keyword on a variable; we cannot use volatile in a method declaration.
volatile void method1(){} //it's illegal, compilation error.
synchronized can be used in a method declaration, or we can create synchronized blocks (in both cases the thread acquires the lock on an object's monitor). Variables cannot be synchronized.
Synchronized method: synchronized void method2(){} //legal
Synchronized block: void method2(){ synchronized (this) { //code inside synchronized block. } }
Synchronized variable (illegal): synchronized int i; //it's illegal, compilation error.
volatile does not acquire any lock on a variable or object, but synchronized acquires a lock for the method or block in which it is used.
Volatile variables are not cached: every read and write goes to main memory. Variables used inside a synchronized method or block may be cached while the lock is held and are synchronized with main memory when the lock is acquired and released.
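The two keywords can be seen side by side in a small sketch. It is only an illustration under stated assumptions: the class VolatileVsSynchronized, its fields and the runDemo helper are hypothetical names, not part of the original post. A volatile flag is used to signal a spinning thread without any lock, while a synchronized method protects a counter that two threads update.

```java
public class VolatileVsSynchronized {

    // volatile: a write by one thread is visible to reads in other threads; no lock is taken.
    private volatile boolean running = true;

    // Plain field, only ever touched inside synchronized methods.
    private int counter = 0;

    // Only one thread at a time can execute this method on a given object.
    synchronized void increment() { counter++; }

    synchronized int counter() { return counter; }

    void stop() { running = false; }

    // Runs two incrementing threads plus one thread that spins on the volatile flag,
    // and returns the final counter value.
    static int runDemo(int incrementsPerThread) {
        VolatileVsSynchronized shared = new VolatileVsSynchronized();
        Runnable adder = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                shared.increment();
            }
        };
        Thread spinner = new Thread(() -> {
            while (shared.running) {
                Thread.yield(); // keep polling the volatile flag
            }
        });
        Thread a = new Thread(adder);
        Thread b = new Thread(adder);
        spinner.start();
        a.start();
        b.start();
        try {
            a.join();
            b.join();
            shared.stop();   // volatile write: the spinner sees it and leaves its loop
            spinner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.counter();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10_000)); // two threads x 10,000 synchronized increments
    }
}
```

Making counter volatile instead of synchronizing increment() would not be enough here, because counter++ is a read followed by a write and volatile does not make that pair atomic.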
Using volatile will never create a deadlock in a program, as volatile never obtains any kind of lock. But if synchronization is not done properly, we might end up creating a deadlock in the program.
Synchronization may cost us performance, as one thread might be waiting for another thread to release the lock on an object. volatile is generally much cheaper, because no lock is involved.
DETAILED DESCRIPTION : Differences between synchronized and volatile keyword in detail with programs.

Question 12. Can you start a Thread again?
Answer. No, we cannot start a thread again; doing so throws the runtime exception java.lang.IllegalThreadStateException. The reason is that once the run() method has been executed by the thread, it goes into the dead state.
Let's take an example: trying to start a thread again by calling start() on it (which internally is going to call run()) is somewhat like asking a dead man to wake up and run. After completing his life, a person goes to the dead state.

Question 13. What is a race condition in multithreading and how can we solve it? (Important)
Answer. This is a very important question that forms the core of multithreading; you should be able to explain race conditions in detail. A race condition occurs when more than one thread tries to access the same resource without synchronization.
We can solve a race condition by using either a synchronized block or a synchronized method. When no two threads can access the same resource at the same time, the phenomenon is called mutual exclusion.
A few sub-questions >
What if two threads try to read the same resource without synchronization? When two threads only read the same resource without synchronization, and no thread is modifying it, it never creates any problem.
What if two threads try to write to the same resource without synchronization? When two threads try to write to the same resource without synchronization, it creates synchronization problems.

Question 14. How do threads communicate with each other?
Answer.
This is a must-know question for every interviewee; you will most probably face it in almost every interview. Threads can communicate with each other by using the wait(), notify() and notifyAll() methods.

Question 15. Why are wait(), notify() and notifyAll() in the Object class and not in the Thread class? (Important)
Answer.
Every object has a monitor, and acquiring that monitor allows a thread to hold the lock on the object. Monitors belong to objects, not to threads of execution.
wait(), notify() and notifyAll() are called on objects only > When a thread calls wait() on an object, it releases that object's monitor and waits until another thread calls notify() or notifyAll() on the same object. When a thread calls notify() on an object, it wakes up one of the threads waiting on that object's monitor; notifyAll() wakes up all of them. This shows that wait(), notify() and notifyAll() are called on objects only.
Now, the straightforward question that comes to mind is how a thread acquires the object lock by acquiring the object monitor. Let's try to understand this basic concept in detail.
Having wait(), notify() and notifyAll() in the Object class allows all the threads working on that object to communicate with each other.
Multiple threads can exist on the same object, but only one thread can hold the object monitor at a time. As a result, a thread can notify the other threads of the same object that the lock is available now. Putting these methods in Thread would not make sense, because multiple threads exist on an object, not the other way around (i.e. multiple objects do not exist on a thread).
Now let's discuss a hypothetical scenario: what would happen if the Thread class contained the wait(), notify() and notifyAll() methods? Having these methods would mean that the Thread class must also have its own monitor. Every thread having its own monitor would create a few problems:
>Thread communication problem.
>Synchronization on objects won't be possible: because an object has a monitor, one object can have multiple threads, and a thread holds the lock on the object by holding the object monitor. But if each thread had its own monitor, we would have no way of achieving synchronization.
>Inconsistency in the state of the object (because synchronization won't be possible).

Question 16. Is it important to acquire the object lock before calling wait(), notify() and notifyAll()?
Answer. Yes, it is mandatory to acquire the object lock before calling these methods on an object. As discussed above, wait(), notify() and notifyAll() are always called from a synchronized block or method, and as soon as a thread enters the synchronized block it acquires the object lock (by holding the object monitor). If we call these methods without acquiring the object lock, i.e. from outside a synchronized block, then java.lang.IllegalMonitorStateException is thrown at runtime.
The wait() method needs to be enclosed in a try-catch block (or the exception must be declared), because it throws a checked exception, i.e. InterruptedException.

Question 17. How can you solve the consumer producer problem by using the wait() and notify() methods? (Important)
Answer. Here comes the time to answer a very important question from an interview perspective. Interviewers tend to check how sound you are in inter-thread communication, because to solve this problem we have to use synchronized blocks, wait() and notify() very cautiously. If you misplace a synchronized block or any of the methods, your program may go horribly wrong. So, before going into this question, I recommend you first understand how to use synchronized blocks and the wait() and notify() methods.
Key points we need to ensure before programming:
>The producer will produce a total of 10 products and cannot hold more than 2 unconsumed products at a time; it has to wait until products are consumed by the consumer.
Example> when sharedQueue's size is 2, wait for the consumer to consume (the consumer will consume by calling the remove(0) method on sharedQueue, reducing sharedQueue's size). As soon as the size is less than 2, the producer will start producing.
>The consumer can consume only when there are some products to consume.
Example> when sharedQueue's size is 0, wait for the producer to produce (the producer will produce by calling the add() method on sharedQueue, increasing sharedQueue's size). As soon as the size is greater than 0, the consumer will start consuming.
Explanation of logic >
We will create a sharedQueue that will be shared between the producer and the consumer. We will then start the consumer and producer threads.
Note: the order in which the threads are started does not matter (because the rest of the code takes care of synchronization and the key points mentioned above).
First we will start consumerThread >
consumerThread.start();
consumerThread will enter the run method and call the consume() method. There it will check sharedQueue's size.
-if the size is equal to 0, the producer hasn't produced any product yet, so wait for the producer to produce by using the piece of code below:
synchronized (sharedQueue) { while (sharedQueue.size() == 0) { sharedQueue.wait(); } }
-if the size is greater than 0, the consumer will start consuming by using the piece of code below:
synchronized (sharedQueue) { Thread.sleep((long)(Math.random() * 2000)); System.out.println("consumed : "+ sharedQueue.remove(0)); sharedQueue.notify(); }
Then we will start producerThread >
producerThread.start();
producerThread will enter the run method and call the produce() method. There it will check sharedQueue's size.
-if the size is equal to 2 (i.e. the maximum number of products sharedQueue can hold at a time), wait for the consumer to consume by using the piece of code below:
synchronized (sharedQueue) { while (sharedQueue.size() == maxSize) { //maxSize is 2
sharedQueue.wait(); } }
-if the size is less than 2, the producer will start producing by using the piece of code below:
synchronized (sharedQueue) { System.out.println("Produced : " + i); sharedQueue.add(i); Thread.sleep((long)(Math.random() * 1000)); sharedQueue.notify(); }
DETAILED DESCRIPTION with program : Solve Consumer Producer problem by using wait() and notify() methods in multithreading.

Question 18. How to solve the Consumer Producer problem without using the wait() and notify() methods, where the consumer can consume only when production is over?
Answer. In this problem, the producer allows the consumer to consume only when 10 products have been produced (i.e. when production is over). We approach it by keeping a boolean variable productionInProcess, initially set to true; later, when production is over, we set it to false.

Question 19. How can you solve the consumer producer pattern by using BlockingQueue? (Important)
Answer. Now it's time to gear up for a question that will most probably follow the previous one, i.e. how to solve the consumer producer problem using wait() and notify(). You might wonder why interviewers are so interested in solving the consumer producer problem with BlockingQueue. The answer is that they want to know how strong your knowledge of the Java concurrency APIs is. This API uses the consumer producer pattern in a very optimized manner; BlockingQueue is designed in such a manner that it offers us the best performance. BlockingQueue is an interface, and we will use its implementation class LinkedBlockingQueue.
Key methods for solving the consumer producer pattern are >
put(i); //used by producer to put/produce in sharedQueue.
take(); //used by consumer to take/consume from sharedQueue.

Question 20. What is deadlock in multithreading? Write a program to form a deadlock in multithreading and also show how to solve the deadlock situation. What measures should you take to avoid deadlock? (Important)
Answer. This is a very important question from an interview perspective.
What makes this question important is that it checks the interviewee's capability of creating and detecting deadlock. If you can write code that forms a deadlock, then I am sure you are well capable of solving that deadlock as well. If not, later in this post we will learn how to solve deadlock too.
The first question that comes to mind is: what is deadlock in a multithreaded program? Deadlock is a situation where two threads are each waiting for the other to release the lock it holds on a resource.
How the deadlock can be formed: Thread-1 acquires the lock on String.class and then calls sleep(), which gives Thread-2 the chance to execute immediately after Thread-1 has acquired the lock on String.class. Thread-2 acquires the lock on Object.class, calls sleep(), and then waits for Thread-1 to release the lock on String.class.
Conclusion: Thread-1 is now waiting for Thread-2 to release the lock on Object.class, Thread-2 is waiting for Thread-1 to release the lock on String.class, and a deadlock is formed.
//Code called by Thread-1
public void run() {
    synchronized (String.class) {
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        synchronized (Object.class) { }
    }
}
//Code called by Thread-2
public void run() {
    synchronized (Object.class) {
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        synchronized (String.class) { }
    }
}
Here comes the important part, how the deadlock formed above can be solved: both threads acquire the locks in the same order. Thread-1 acquires the lock on String.class and then calls sleep(), which gives Thread-2 the chance to execute immediately after Thread-1 has acquired the lock on String.class. Thread-2 tries to acquire the lock on String.class, but the lock is held by Thread-1. Meanwhile, Thread-1 completes successfully. As Thread-1 has completed, it releases the lock on String.class; Thread-2 can now acquire the lock on String.class and complete successfully without any deadlock.
Conclusion: No deadlock is formed.
//Code called by Thread-1
public void run() {
    synchronized (String.class) {
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        synchronized (Object.class) { }
    }
}
//Code called by Thread-2 (same lock order as Thread-1)
public void run() {
    synchronized (String.class) {
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        synchronized (Object.class) { }
    }
}
A few important measures to avoid deadlock >
Lock specific member variables of a class rather than locking the whole class: we must try to lock specific member variables of a class rather than locking the whole class.
Use the join() method: if possible, try to use join(). It may prevent us from taking full advantage of the multithreading environment, because threads will start and end sequentially, but it can be handy in avoiding deadlocks.
If possible, avoid using nested synchronized blocks.

Question 21. Have you ever generated or analyzed thread dumps? (Important)
Answer. Answering this question will show your in-depth knowledge of threads. Every experienced developer must know how to generate thread dumps.
VisualVM is the most popular way to generate a thread dump and is widely used by developers. I recommend that every developer understand this tool to become a master of multithreading. It helps us analyze thread performance, thread states, CPU consumed by threads, garbage collection and much more. For detailed information see Generating and analyzing Thread Dumps using VisualVM - step by step detail to setup VisualVM with screenshots.
jstack is a very easy way to generate a thread dump and is also widely used by developers. To create thread dumps with it we do not need to download any jar or extra software. For detailed information see Generating and analyzing Thread Dumps using JSTACK - step by step detail to setup JSTACK with screenshots.

Question 22. What is the life cycle of a Thread? Explain thread states. (Important)
Answer.
Thread states, or the thread life cycle, is a very basic question; before going deep into concepts we must understand the thread life cycle.
A thread has the following states >
New
Runnable
Running
Waiting/blocked/sleeping
Terminated (Dead)
Thread states in detail >
New: when an instance of a thread is created using the new operator, it is in the new state. The start() method has not been invoked on the thread yet, so the thread is not yet eligible to run.
Runnable: when the start() method is called on the thread, it enters the runnable state.
Running: the thread scheduler selects the thread to go from the runnable to the running state. In the running state the thread starts executing by entering the run() method.
Waiting/blocked/sleeping: in this state the thread is still alive, but currently it is not eligible to run.
> How can a thread go from the running to the waiting state? By calling the wait() method, a thread goes from running to waiting. In the waiting state it waits for other threads to release the object monitor/lock.
> How can a thread go from the running to the sleeping state? By calling the sleep() method, a thread goes from running to sleeping. In the sleeping state it waits for the sleep time to be over.
Terminated (Dead): a thread is considered dead when its run() method completes.

Question 23. Are you aware of preemptive scheduling and time slicing?
Answer. In preemptive scheduling, the highest priority thread executes until it enters the waiting or dead state.
In time slicing, a thread executes for a certain predefined time and then enters the runnable pool. The thread can then enter the running state again when selected by the thread scheduler.

Question 24. What are daemon threads?
Answer. Daemon threads are low priority threads that run intermittently in the background, for example to perform garbage collection.
12 salient features of daemon threads >
Daemon threads typically do their work in the background, when the CPU would otherwise be idle.
Daemon threads are service oriented threads; they serve all other threads.
The JVM's own daemon threads are created before user threads are created and die after all user threads have died.
Daemon threads usually run at low priority, e.g. 1 (MIN_PRIORITY), although a daemon thread created from user code inherits the priority of the thread that created it.
User created threads are non-daemon threads by default.
The JVM can exit when only daemon threads exist in the system.
We can use the isDaemon() method to check whether a thread is a daemon thread or not.
We can use the setDaemon(boolean on) method to make any user thread a daemon thread.
If setDaemon(boolean on) is called on a thread after calling start(), then IllegalThreadStateException is thrown.
You may like to see how daemon threads work; for that you can use VisualVM or jstack. I have provided thread dumps over there which show daemon threads that were intermittently running in the background.
Some of the daemon threads which intermittently run in the background are >
"RMI TCP Connection(3)-10.175.2.71" daemon
"RMI TCP Connection(idle)" daemon
"RMI Scheduler(0)" daemon
"C2 CompilerThread1" daemon
"GC task thread#0 (ParallelGC)"

Question 25. Why are the suspend() and resume() methods deprecated?
Answer. The suspend() method is deadlock prone. If the target thread holds a lock on an object when it is suspended, no thread can lock this object until the target thread is resumed. If the thread that would resume the target thread attempts to lock this monitor prior to calling resume(), the result is a deadlock. These deadlocks are generally called frozen processes.
The suspend() method puts a thread from the running to the waiting state, and the thread can go from waiting to runnable only when resume() is called on it. It is a deprecated method.
The resume() method is only used together with suspend(), which is why it is deprecated as well.

Question 26. Why is the destroy() method deprecated?
Answer. This question again checks your in-depth knowledge of thread methods: the destroy() method is deadlock prone.
If the target thread holds a lock on an object when it is destroyed, no thread can lock this object any more (the deadlocks formed are similar to those formed when suspend() and resume() are used improperly). These deadlocks are generally called frozen processes.
Additionally, you must know that calling destroy() on a thread throws NoSuchMethodError, because the method was never implemented.
The destroy() method was intended to put a thread from the running to the dead state.

Question 27. As the stop() method is deprecated, how can we terminate or stop an infinitely running thread in Java? (Important)
Answer. This is a very interesting question where the interviewee's thread basics are tested. Interviewers want to check your knowledge of the main thread and of threads invoked by the main thread.
We will address the problem by creating a new thread, started from the main thread, which runs infinitely until a certain condition is satisfied.
An infinitely running thread can be stopped using a boolean variable.
An infinitely running thread can be stopped using the interrupt() method.
Let's understand why the stop() method is deprecated: stopping a thread with Thread.stop() causes it to release all of the monitors that it has locked. If any of the objects previously protected by these monitors were in an inconsistent state, the damaged objects become visible to other threads, which might lead to unpredictable behaviour.

Question 28. What is the significance of the yield() method, and what state does it put a thread in?
Answer. yield() is a native method; its implementation in Java 6 changed compared to its implementation in Java 5. As the method is native, its implementation is provided by the JVM.
In Java 5, the yield() method internally used to call sleep(), giving all other threads of the same or higher priority a chance to execute before the yielded thread by giving up the allocated CPU for a time gap of 15 milliseconds.
In Java 6, however, calling yield() gives a hint to the thread scheduler that the current thread is willing to yield its current use of a processor. The thread scheduler is free to ignore this hint, so sometimes you may not notice any difference in output even after calling yield().

Salient features of the yield() method:
Definition: yield() gives a hint to the thread scheduler that the current thread is willing to yield its current use of a processor. The thread scheduler is free to ignore this hint.
Thread state: when yield() is called, the thread goes from the running state to the runnable state, not to the waiting state. The thread is eligible to run but not running, and could be picked by the scheduler at any time.
Waiting time: yield() stops the thread for an unpredictable time.
Static method: yield() is a static method, so calling Thread.yield() causes the currently executing thread to yield.
Native method: the implementation of yield() is provided by the JVM. Its declaration in java.lang.Thread is:
    public static native void yield();
Synchronized block: a thread need not acquire an object lock before calling yield(), i.e. yield() can be called from outside a synchronized block.

Question 29. What is the significance of the sleep() method, and what state does it put a thread in?

Answer. sleep() is a native method; its implementation is provided by the JVM.

Salient features of the sleep() method:
Definition: sleep() causes the current thread to sleep for the specified number of milliseconds (the time passed to sleep() as a parameter). For example, Thread.sleep(10) causes the currently executing thread to sleep for 10 milliseconds.
Thread state: when sleep() is called, the thread goes from the running state to the (timed) waiting state and returns to the runnable state when the sleep time is up.
Exception: sleep() throws the checked exception InterruptedException, so the caller must either catch it or declare it.
Waiting time: sleep() has two overloads.
sleep(long millis) - causes the currently executing thread to sleep for the specified number of milliseconds.
    public static native void sleep(long millis) throws InterruptedException;
sleep(long millis, int nanos) - causes the currently executing thread to sleep for the specified number of milliseconds plus the specified number of nanoseconds. This overload is not native.
    public static void sleep(long millis, int nanos) throws InterruptedException;
Static method: sleep() is a static method that causes the currently executing thread to sleep.
Belongs to which class: sleep() belongs to the java.lang.Thread class.
Synchronized block: a thread need not acquire an object lock before calling sleep(), i.e. sleep() can be called from outside a synchronized block.

Question 30. Difference between wait() and sleep()? (Important)

Answer.
Should be called from a synchronized block: wait() must always be called from a synchronized block or method, i.e. the calling thread must hold the monitor of the object on which wait() is called. sleep() can be called from outside a synchronized block, because it does not need any object monitor.
IllegalMonitorStateException: if wait() is called without holding the object lock, IllegalMonitorStateException is thrown at runtime. sleep() never throws this exception.
Belongs to which class: wait() belongs to the java.lang.Object class, but sleep() belongs to the java.lang.Thread class.
Called on object or thread: wait() is called on an object, whereas sleep() is a static method of Thread and always acts on the currently executing thread.
Thread state: when wait() is called on an object, the thread that held the object's monitor goes from the running state to the waiting state and can return to the runnable state only when notify() or notifyAll() is called on that object. Later the thread scheduler moves the thread from runnable to running.
When sleep() is called, the thread goes from the running state to the waiting state and returns to the runnable state when the sleep time is up.
When called from a synchronized block: when wait() is called, the thread releases the object lock. When sleep() is called from a synchronized block or method, the thread does not release the object lock.

Question 31. Differences and similarities between yield() and sleep()?

Answer. Differences between yield() and sleep():
Definition: yield() gives a hint to the thread scheduler that the current thread is willing to yield its current use of a processor; the thread scheduler is free to ignore this hint. sleep() causes the current thread to sleep for the specified number of milliseconds (the time passed to sleep() as a parameter). For example, Thread.sleep(10) causes the currently executing thread to sleep for 10 milliseconds.
Thread state: when sleep() is called, the thread goes from the running state to the waiting state and returns to the runnable state when the sleep time is up. When yield() is called, the thread goes from the running state to the runnable state, not to the waiting state. The thread is eligible to run but not running, and could be picked by the scheduler at any time.
Exception: yield() does not throw any checked exception. The caller of sleep() must catch or declare the checked exception InterruptedException.
Waiting time: yield() stops the thread for an unpredictable time that depends on the thread scheduler. sleep() has two overloads:
sleep(long millis) - causes the currently executing thread to sleep for the specified number of milliseconds.
sleep(long millis, int nanos) - causes the currently executing thread to sleep for the specified number of milliseconds plus the specified number of nanoseconds.

Similarities between yield() and sleep():
> yield() and sleep() both belong to the java.lang.Thread class.
> yield() and sleep() can both be called from outside a synchronized block.
> yield() and sleep() are both called on the Thread class (they are static and act on the current thread), not on objects.

Question 32. Mention some guidelines for writing thread-safe code. What are the most important points to take care of in multithreaded programs?

Answer. In a multithreaded environment it is very important to write thread-safe code; thread-unsafe code can be a major threat to your application. Overall this is a revision of what we have learned so far, i.e. writing thread-safe code and avoiding any kind of deadlock.

If a method is exposed in a multithreaded environment and is not synchronized (thread unsafe), it might lead to a race condition, so we should try to use synchronized blocks and synchronized methods. Multiple threads may exist on the same object but only one thread of that object can enter a synchronized method at a time, though threads on different objects can enter the same method at the same time.

Even static variables are not thread safe. They are used in static methods, and if static methods are not synchronized then threads on the same or different objects can enter the method concurrently. Multiple threads may exist on the same or different objects of a class but only one thread can enter a static synchronized method at a time, so we should consider making static methods synchronized.

If possible, try to use volatile variables. If a field is declared volatile, all threads see a consistent value for the variable. Volatile variables can at times be used as an alternative to synchronized methods, though volatile guarantees only visibility, not the atomicity of compound operations such as i++.

Final variables are thread safe because once assigned a reference to an object they cannot point to another object. In the example below, s points to a String object.

    public class MyClass {
        final String s = new String("a");

        void method() {
            s = "b"; // compilation error: s cannot point to a new reference
        }
    }

If a final variable holds a primitive value, it cannot be assigned another value.
    public class MyClass {
        final int i = 0;

        void method() {
            i = 0; // compilation error: i cannot be assigned a new value
        }
    }

Usage of local variables: if possible, try to use local variables. Local variables are thread safe because every thread has its own stack, i.e. every thread has its own local variables and pushes them onto its own stack.

    public class MyClass {
        void method() {
            int i = 0; // local variable, thread safe
        }
    }

Using thread-safe collections: rather than ArrayList we can use Vector, and in place of HashMap we can use ConcurrentHashMap or Hashtable.

We should use VisualVM or jstack to detect problems such as deadlocks and to measure the time taken by threads to complete in multithreaded programs.

Using ThreadLocal: ThreadLocal is a class which provides thread-local variables. Every thread has its own ThreadLocal value, which makes the ThreadLocal value thread safe as well.

Where possible, prefer immutable classes such as String over mutable ones such as StringBuffer. Any change to a String produces a new String.

Question 33. How can a thread enter the waiting, sleeping and blocked states, and how can it return to the runnable state?

Answer. This is a very frequently asked interview question that tests your knowledge of thread states, and it is important for developers to understand these state transitions in depth. The transitions are explained below through a few sub-questions.

> How can a thread go from the running to the waiting state?
By calling wait(), a thread goes from the running to the waiting state. In the waiting state it waits for another thread to call notify() or notifyAll() on the same object.

> How can a thread return from the waiting to the runnable state?
Once notify() or notifyAll() is called and the object monitor/lock becomes available again, the thread reacquires it and can return to the runnable state.

> How can a thread go from the running to the sleeping state?
By calling sleep(), a thread goes from the running to the sleeping state.
In the sleeping state it waits for the sleep time to elapse.

> How can a thread return from the sleeping to the runnable state?
Once the specified sleep time is up, the thread can return to the runnable state.

The (deprecated) suspend() method can be used to put a thread in the waiting state, and resume() is the only way to put such a thread back in the runnable state.

A thread may also go from the running to the waiting state if it is waiting for some I/O operation to take place. Once input is available, the thread may return to the runnable state.

> When a thread is in the running state, yield() can make it go to the runnable state.

Question 34. Difference between the notify() and notifyAll() methods. Can you write code to prove your point?

Answer. You have probably heard about the differences between notify() and notifyAll() in theory, but have you written a program that shows them? First, a brief description of what the two methods do.

notify() - wakes up a single thread that is waiting on this object's monitor. If any threads are waiting on this object, one of them is chosen to be awakened. The choice is arbitrary and occurs at the discretion of the implementation. A thread waits on an object's monitor by calling one of the wait methods. The awakened thread will not be able to proceed until the current thread relinquishes the lock on this object.
    public final native void notify();

notifyAll() - wakes up all threads that are waiting on this object's monitor. A thread waits on an object's monitor by calling one of the wait methods. The awakened threads will not be able to proceed until the current thread relinquishes the lock on this object.
    public final native void notifyAll();

Question 35. Does a thread release the object lock when sleep() is called?

Answer. No. When sleep() is called, the thread does not release the object lock and goes from the running to the waiting state.
The thread waits for the sleep time to elapse, and once the sleep time is up it goes from the waiting to the runnable state.

Question 36. Does a thread release the object lock when wait() is called?

Answer. Yes. When wait() is called, the thread releases the object lock and goes from the running to the waiting state. The thread waits for another thread to call notify() or notifyAll() on the same object. Once either is called, it goes from the waiting to the runnable state and reacquires the object lock.

Question 37. What will happen if we don't override the run() method?

Answer. This question tests your basic knowledge of how the start() and run() methods work internally in the Thread API. When we call start() on a thread, a new thread is created and it internally calls run(). If we don't override run(), the new thread executes Thread's default run(), which does nothing when no Runnable target was supplied, so nothing visible happens.

    class MyThread extends Thread {
        // run() is not overridden
    }

    public class DontOverrideRun {
        public static void main(String[] args) {
            System.out.println("main has started.");
            MyThread thread1 = new MyThread();
            thread1.start();
            System.out.println("main has ended.");
        }
    }

    /* OUTPUT
    main has started.
    main has ended.
    */

As the output shows, we didn't override run(), so nothing happened when start() was called.

Question 38. What will happen if we override the start() method?

Answer. This question again tests your core Java knowledge: how overriding works at runtime, which method is called at runtime, and how start() and run() work internally in the Thread API. When we call the original start() on a thread, it internally calls run() on a newly created thread. So, if we override start(), no new thread is created and run() is not called unless we write code to call it.
    class MyThread extends Thread {
        @Override
        public void run() {
            System.out.println("in run() method");
        }

        @Override
        public void start() {
            System.out.println("In start() method");
        }
    }

    public class OverrideStartMethod {
        public static void main(String[] args) {
            System.out.println("main has started.");
            MyThread thread1 = new MyThread();
            thread1.start();
            System.out.println("main has ended.");
        }
    }

    /* OUTPUT
    main has started.
    In start() method
    main has ended.
    */

As the output shows, we overrode start() and didn't call run() from it, so run() was never called.

Question 39. Can we acquire a lock on a class? In what ways can you acquire a lock on a class?

Answer. Yes, a thread can acquire the lock on a class's class object in 2 ways.

1. By entering a synchronized block. Let's say there is a class MyClass. We can create a synchronized block, and the parameter passed to synchronized tells which class is to be locked. In the code below we synchronize on MyClass.

    synchronized (MyClass.class) {
        // thread has acquired lock on MyClass's class object
    }

2. By entering a static synchronized method.

    public static synchronized void method1() {
        // thread has acquired lock on the class object of the enclosing class
    }

As soon as the thread enters the static synchronized method, it acquires the lock on the class's class object. The thread releases the lock when it exits the static synchronized method.

Question 40. Difference between object lock and class lock?

Answer. This is a very important question from a multithreading point of view. We must understand the difference between object lock and class lock to answer interview and OCJP questions correctly.

Object lock: a thread can acquire an object lock by entering a synchronized block or by entering a synchronized method.
Class lock: a thread can acquire the lock on a class's class object by entering a synchronized block or by entering a static synchronized method.
Object lock: multiple threads may exist on the same object but only one thread of that object can enter a synchronized method at a time. Threads on different objects can enter the same method at the same time.
Class lock: multiple threads may exist on the same or different objects of a class but only one thread can enter a static synchronized method at a time.

Object lock: multiple objects of a class may exist and every object has its own lock.
Class lock: multiple objects of a class may exist but there is always exactly one class object lock available.

Acquiring an object lock by entering a synchronized block. Example: let's say there is a class MyClass, we have created an object of it, and the reference to that object is myClass. We can create a synchronized block, and the parameter passed to synchronized tells which object is to be locked. In the code below we synchronize on the object referenced by myClass.

    MyClass myClass = new MyClass();
    synchronized (myClass) {
    }

As soon as the thread enters the synchronized block, it acquires the object lock on the object referenced by myClass (by acquiring the object's monitor). The thread releases the lock when it exits the synchronized block.

Acquiring the lock on a class's class object by entering a synchronized block. Example: let's say there is a class MyClass. We can create a synchronized block, and the parameter passed to synchronized tells which class is to be locked. In the code below we synchronize on MyClass.

    synchronized (MyClass.class) {
    }

As soon as the thread enters the synchronized block, it acquires the lock on MyClass's class object. The thread releases the lock when it exits the synchronized block.

Acquiring an object lock by entering a synchronized method.

    public synchronized void method1() {
    }

As soon as the thread enters the synchronized method, it acquires the object lock. The thread releases the lock when it exits the synchronized method.

Acquiring the lock on a class's class object by entering a static synchronized method.

    public static synchronized void method1() {
    }

As soon as the thread enters the static synchronized method, it acquires the lock on the class's class object. The thread releases the lock when it exits the static synchronized method.
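The difference between the two locks can be observed directly with Thread.holdsLock(Object), which reports whether the current thread holds a given monitor. The following is only a minimal sketch: the class LockDemo and its method names are hypothetical and chosen for illustration, not taken from the original article.

```java
// Hypothetical demo class: the names LockDemo, instanceLocked, classLocked
// and blockLocked are made up for illustration.
class LockDemo {

    // Instance synchronized method: the caller holds the monitor of this object.
    synchronized boolean[] instanceLocked() {
        return new boolean[] {
            Thread.holdsLock(this),            // object lock is held
            Thread.holdsLock(LockDemo.class)   // class lock is not held
        };
    }

    // Static synchronized method: the caller holds the monitor of LockDemo.class.
    static synchronized boolean classLocked() {
        return Thread.holdsLock(LockDemo.class);
    }

    // Synchronized block: the caller holds the monitor of the object passed in.
    static boolean blockLocked(Object target) {
        synchronized (target) {
            return Thread.holdsLock(target);
        }
    }

    public static void main(String[] args) {
        LockDemo demo = new LockDemo();
        boolean[] held = demo.instanceLocked();
        System.out.println("instance method, object lock held: " + held[0]);
        System.out.println("instance method, class lock held:  " + held[1]);
        System.out.println("static method, class lock held:    " + classLocked());
        System.out.println("block on LockDemo.class held:      " + blockLocked(LockDemo.class));
        System.out.println("after exit, object lock held:      " + Thread.holdsLock(demo));
    }
}
```

Run as written, the five lines should report true, false, true, true and false: the object lock and the class lock are separate monitors, and each is released as soon as the synchronized method or block is left.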
A few tricky, situation-based questions follow.

Question 41. Suppose you have 2 threads (Thread-1 and Thread-2) on the same object. Thread-1 is in synchronized method1(). Can Thread-2 enter synchronized method2() at the same time?

Answer. No. While Thread-1 is in synchronized method1() it holds the lock on the object's monitor, and it releases that lock only when it exits synchronized method1(). So Thread-2 has to wait for Thread-1 to release the lock on the object's monitor before it can enter synchronized method2(). Likewise, Thread-2 cannot enter synchronized method1() while it is being executed by Thread-1; it has to wait for Thread-1 to release the lock first.

Question 42. Suppose you have 2 threads (Thread-1 and Thread-2) on the same object. Thread-1 is in static synchronized method1(). Can Thread-2 enter static synchronized method2() at the same time?

Answer. No. While Thread-1 is in static synchronized method1() it holds the lock on the class's class object, and it releases that lock only when it exits static synchronized method1(). So Thread-2 has to wait for Thread-1 to release the lock on the class's class object before it can enter static synchronized method2(). Likewise, Thread-2 cannot enter static synchronized method1() while it is being executed by Thread-1; it has to wait for Thread-1 to release the lock first.

Question 43. Suppose you have 2 threads (Thread-1 and Thread-2) on the same object. Thread-1 is in synchronized method1(). Can Thread-2 enter static synchronized method2() at the same time?
Answer. Yes. While Thread-1 is in synchronized method1() it holds the lock on the object's monitor, and Thread-2 can enter static synchronized method2() by acquiring the lock on the class's class object, which is a different lock.

Question 44. Suppose a thread is in a synchronized method. Can it enter another synchronized method from that method?

Answer. Yes. While the thread is in a synchronized method it holds the lock on the object's monitor, and because Java monitors are reentrant, it can use that same lock to enter another synchronized method of the same object.

Question 45. Suppose a thread is in a static synchronized method. Can it enter another static synchronized method from that method?

Answer. Yes. While the thread is in a static synchronized method it holds the lock on the class's class object, and using that lock it can enter another static synchronized method.

Question 46. Suppose a thread is in a static synchronized method. Can it enter a non-static synchronized method from that method?

Answer. Yes. While the thread is in a static synchronized method it holds the lock on the class's class object, and when it enters the synchronized method it acquires the lock on the object's monitor as well. So the thread now holds 2 locks (this is also called nested synchronization):
> the first on the class's class object;
> the second on the object's monitor (this lock is released when the thread exits the non-static method).

Question 47. Suppose a thread is in a synchronized method. Can it enter a static synchronized method from that method?

Answer. Yes. While the thread is in a synchronized method it holds the lock on the object's monitor, and when it enters the static synchronized method it acquires the lock on the class's class object as well.
So the thread now holds 2 locks (this is also called nested synchronization):
> the first on the object's monitor;
> the second on the class's class object (this lock is released when the thread exits the static method).

Question 48. Suppose you have 2 threads (Thread-1 on object1 and Thread-2 on object2). Thread-1 is in synchronized method1(). Can Thread-2 enter synchronized method2() at the same time?

Answer. Yes. While Thread-1 is in synchronized method1() it holds the lock on object1's monitor. Thread-2 acquires the lock on object2's monitor and enters synchronized method2(). Likewise, Thread-2 can even enter synchronized method1() while Thread-1 is executing it, because the threads are created on different objects.

Question 49. Suppose you have 2 threads (Thread-1 on object1 and Thread-2 on object2). Thread-1 is in static synchronized method1(). Can Thread-2 enter static synchronized method2() at the same time?

Answer. No. It might confuse you a bit that the threads are created on different objects, but don't forget that although multiple objects may exist, there is always exactly one class object lock. While Thread-1 is in static synchronized method1() it holds the lock on the class's class object, and it releases that lock only when it exits static synchronized method1(). So Thread-2 has to wait for Thread-1 to release the lock on the class's class object before it can enter static synchronized method2(). Likewise, Thread-2 cannot enter static synchronized method1() while it is being executed by Thread-1; it has to wait for Thread-1 to release the lock first.

Question 50. Difference between wait() and wait(long timeout). What are the thread states when these methods are called?

Answer.
wait(): when wait() is called on an object, it causes the current thread to wait until another thread invokes notify() or notifyAll() on this object.
wait(long timeout): causes the current thread to wait until either another thread invokes notify() or notifyAll() on this object, or the specified timeout has elapsed.

Thread state with wait(): the thread goes from the running to the waiting state. It waits for some other thread to call notify() so that it can enter the runnable state.
Thread state with wait(1000): the thread goes from the running to the (timed) waiting state. Even if notify() or notifyAll() is never called, the thread goes from waiting to runnable once the timeout has elapsed.

Question 51. How can you implement your own thread pool in Java?

Answer.

What is a ThreadPool? A ThreadPool is a pool of threads which reuses a fixed number of threads to execute tasks. At any point, at most nThreads threads will be actively processing tasks. If additional tasks are submitted when all threads are active, they wait in the queue until a thread is available. The ThreadPool implementation internally uses a LinkedBlockingQueue for adding and removing tasks. This answer uses the LinkedBlockingQueue provided by the Java API; a ThreadPool can also be built on a custom LinkedBlockingQueue.

Need/advantage of a ThreadPool? Instead of creating a new thread every time a task is executed, we create a ThreadPool which reuses a fixed number of threads for executing tasks. As threads are reused, we avoid the cost of creating a thread per task, which improves the performance of the application.

How does the ThreadPool work? We instantiate the ThreadPool, and in the ThreadPool's constructor nThreads threads are created and started.

    ThreadPool threadPool = new ThreadPool(2);

Here 2 threads are created and started in the ThreadPool. The threads then enter the run() method of the ThreadPoolsThread class and call take() on taskQueue.
If a task is available, the thread executes it by entering the run() method of the task (tasks always implement Runnable).

    public void run() {
        . . .
        while (true) {
            . . .
            Runnable runnable = taskQueue.take();
            runnable.run();
            . . .
        }
        . . .
    }

Otherwise it waits for tasks to become available.

When are tasks added? When the execute() method of ThreadPool is called, it internally calls put() on taskQueue to add the task.

    taskQueue.put(task);

Once tasks are available, waiting threads are notified that a task is available.

Question 52. What is the significance of using ThreadLocal?

Answer. This question tests your command of multithreading: can you really build a correct multithreaded application or not?

What is ThreadLocal? ThreadLocal is a class which provides thread-local variables. Every thread has its own ThreadLocal value, which makes the ThreadLocal value thread safe as well.

For how long does a thread hold its ThreadLocal value? A thread holds its ThreadLocal value until it enters the dead state.

Can one thread see another thread's ThreadLocal value? No, a thread can see only its own ThreadLocal value.

Are ThreadLocal variables thread safe, and why? Yes, ThreadLocal variables are thread safe, because every thread has its own ThreadLocal value and one thread can't see another thread's ThreadLocal value.

Applications of ThreadLocal? ThreadLocal is used by many web frameworks for maintaining some context-related value (for example session or request). In a single-threaded application, the same thread is assigned for every request made to the same action, so ThreadLocal values will be available in the next request as well. In a multithreaded application, a different thread is assigned for every request made to the same action, so ThreadLocal values will be different for every request. When threads are started at different times they might want to store the time at which they started, so a thread's start time can be stored in a ThreadLocal.
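The isolation described above can be sketched in a few lines. This is a minimal illustration under assumptions, not the article's own program: the class ThreadLocalDemo, the field NAME and the helper methods are hypothetical names.

```java
// Hypothetical demo class: ThreadLocalDemo, NAME and the helper methods
// are illustrative names only.
class ThreadLocalDemo {

    // One shared ThreadLocal instance; each thread sees only the value it set itself.
    private static final ThreadLocal<String> NAME = new ThreadLocal<>();

    // Sets the value for the calling thread.
    static void setForCaller(String value) {
        NAME.set(value);
    }

    // Reads the value belonging to the calling thread.
    static String valueSeenByCaller() {
        return NAME.get();
    }

    // Starts a worker that sets its own value and returns what the worker read back.
    static String valueSeenByWorker(String workerValue) {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> {
            NAME.set(workerValue);   // affects the worker thread only
            seen[0] = NAME.get();
        });
        worker.start();
        try {
            worker.join();           // join() makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        setForCaller("main-value");
        System.out.println("worker sees: " + valueSeenByWorker("worker-value"));
        System.out.println("main sees:   " + valueSeenByCaller());
    }
}
```

The worker's call to set() does not disturb the value held by the main thread, even though both threads go through the same ThreadLocal instance.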
Creating a ThreadLocal >

    private ThreadLocal<String> threadLocal = new ThreadLocal<String>();

We create an instance of ThreadLocal. ThreadLocal is a generic class; String is used here to demonstrate it. All threads see the same instance of ThreadLocal, but a thread can see only the value which it set itself.

How a thread sets the value of a ThreadLocal >

    threadLocal.set(new Date().toString());

A thread sets the value of the ThreadLocal by calling set() on threadLocal.

How a thread gets the value of a ThreadLocal >

    threadLocal.get()

A thread gets the value of the ThreadLocal by calling get() on threadLocal.

Question 53. What is busy spin?

Answer.

What is busy spin? Busy spin is when one thread loops continuously while waiting for another thread to signal.

Performance point of view: busy spin is very bad for performance, because one thread keeps looping continuously (and consuming CPU) while waiting for another thread to signal.

Solution to busy spin: we must use sleep(), or wait() and notify(). Using wait() is the better option.

Why is using wait() and notify() a much better option to solve busy spin? Because when we use sleep(), the thread wakes up again and again after the specified sleep time until the boolean variable changes. With wait(), the thread wakes up only when it is notified through notify() or notifyAll(), and hence uses the CPU in the best possible manner.

Program: consumer-producer problem with busy spin > The consumer thread executes continuously (busy spin) in a while loop as long as productionInProcess is true. Once the producer thread has finished, it sets the boolean variable productionInProcess to false and the busy spin is over.

    while (productionInProcess) {
        System.out.println("BUSY SPIN - Consumer waiting for production to get over");
    }

Question 54. Can a constructor be synchronized?

Answer. No, a constructor cannot be synchronized.
A constructor is used for instantiating an object, so while we are in the constructor the object is still under creation. Until the object is instantiated it does not need any synchronization. Enclosing a constructor in a synchronized block generates a compilation error, and using synchronized in the constructor definition also produces a compilation error:

    COMPILATION ERROR = Illegal modifier for the constructor in type ConstructorSynchronizeTest; only public, protected & private are permitted

We can, however, use a synchronized block inside a constructor.

Question 55. Can you find out whether a thread holds the lock on an object or not?

Answer. The static method Thread.holdsLock(object) can be used to find out whether the current thread holds the lock on the monitor of the specified object. It returns true if the current thread holds that lock.

Question 56. What do you mean by thread starvation?

Answer. Thread starvation happens when a thread does not get enough CPU for its execution. It may happen in the following scenarios:
> Low-priority threads get less CPU (time for execution) than high-priority threads. A lower-priority thread may starve waiting to get enough CPU to perform its calculations.
> In a deadlock, two threads wait for each other to release the locks they hold on resources. Both threads starve for CPU.
> A thread might be waiting indefinitely on an object's monitor (after calling wait()) because no other thread calls notify()/notifyAll() on that object. In that case the thread starves for CPU.
> A thread might be waiting indefinitely on an object's monitor (after calling wait()) while notify() repeatedly awakens some other threads. In that case, too, the thread starves for CPU.

Question 57. What is the addShutdownHook method in Java?

Answer. addShutdownHook method in Java > The addShutdownHook method registers a new virtual-machine shutdown hook.
A shutdown hook is an initialized but unstarted thread. When the JVM begins its shutdown, it starts all registered shutdown hooks in some unspecified order and lets them run concurrently.

When does the JVM (Java virtual machine) shut down? > When the last non-daemon thread finishes, or when System.exit is called.

Once the JVM's shutdown has begun, a new shutdown hook cannot be registered, nor can a previously registered hook be de-registered. Any attempt to do either of these operations causes an IllegalStateException.

Question 58. How can you handle an uncaught runtime exception generated in the run() method?

Answer. We can use the setDefaultUncaughtExceptionHandler method, which handles uncaught unchecked (runtime) exceptions generated in the run() method.

setDefaultUncaughtExceptionHandler method features >
setDefaultUncaughtExceptionHandler sets the default handler which is called when a thread terminates due to an uncaught unchecked (runtime) exception.
setDefaultUncaughtExceptionHandler is a static method, so we can call Thread.setDefaultUncaughtExceptionHandler directly to set the default handler.
It gives us a place to log or otherwise react to the failure. The thread that threw the exception still terminates.

Defining the setDefaultUncaughtExceptionHandler method >

    Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
        public void uncaughtException(Thread thread, Throwable throwable) {
            System.out.println(thread.getName() + " has thrown " + throwable);
        }
    });

Question 59. What is ThreadGroup in Java, what is the default priority of a newly created ThreadGroup, and what are some important ThreadGroup methods?

Answer. When a program starts, the JVM creates a ThreadGroup named main.
Unless specified otherwise, all newly created threads become members of the main thread group. The main ThreadGroup starts with a maximum priority of 10 (Thread.MAX_PRIORITY), and a newly created group inherits the maximum priority of its parent.

ThreadGroup important methods >

getName() returns the name of the ThreadGroup.
activeGroupCount() returns the count of active groups in the ThreadGroup.
activeCount() returns the count of active threads in the ThreadGroup.
list() prints information about the ThreadGroup.
getMaxPriority() returns the maximum priority of the ThreadGroup.
setMaxPriority(int pri) sets the maximum priority of the ThreadGroup.

Question 60. What are thread priorities?

Answer. Thread priority ranges from 1 to 10, where 1 is the minimum priority and 10 is the maximum priority. The Thread class provides public static final int constants for setting thread priority.

    /* The minimum priority that a thread can have. */
    public final static int MIN_PRIORITY = 1;

    /* The default priority that is assigned to a thread. */
    public final static int NORM_PRIORITY = 5;

    /* The maximum priority that a thread can have. */
    public final static int MAX_PRIORITY = 10;

A thread with MAX_PRIORITY is likely to get more CPU than low priority threads, but occasionally a low priority thread might get more CPU, because the thread scheduler works at the discretion of the implementation and thread behaviour cannot be predicted.

A thread with MIN_PRIORITY is likely to get less CPU than high priority threads, but occasionally a high priority thread might get less CPU, for the same reason.

The setPriority() method is used for changing the priority of a thread. The getPriority() method returns the thread's priority.
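
To make the interplay between setPriority() and setMaxPriority() concrete, here is a minimal sketch. The class name PriorityDemo, the helper effectivePriority and the group and thread names are hypothetical, chosen only for illustration. It assumes the code runs in a thread whose own group still allows the full 1 to 10 range, which is the usual case for a program started from main.

```java
class PriorityDemo {

    // Creates a group capped at groupMax, asks a thread in that group for the
    // requested priority, and reports the priority the thread actually gets.
    // The thread is never started; only its priority setting is inspected.
    static int effectivePriority(int groupMax, int requested) {
        ThreadGroup group = new ThreadGroup("demo-group"); // hypothetical name
        group.setMaxPriority(groupMax);
        Thread thread = new Thread(group, () -> { }, "demo-thread");
        // setPriority silently caps the value at the group's maximum priority.
        thread.setPriority(requested);
        return thread.getPriority();
    }

    public static void main(String[] args) {
        System.out.println("MIN_PRIORITY  = " + Thread.MIN_PRIORITY);
        System.out.println("NORM_PRIORITY = " + Thread.NORM_PRIORITY);
        System.out.println("MAX_PRIORITY  = " + Thread.MAX_PRIORITY);

        // Requesting MAX_PRIORITY inside a group capped at 4.
        System.out.println("capped request  -> " + effectivePriority(4, Thread.MAX_PRIORITY));
        // Requesting a priority below the cap is honoured as-is.
        System.out.println("allowed request -> " + effectivePriority(4, 2));
    }
}
```

The point of the helper is that a thread never ends up with a priority above its group's maximum, so a priority passed to setPriority() is a request rather than a guarantee.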

May 29, 2015

by Ankit Mittal

· 338,545 Views · 38 Likes

Four Ways to Quickly Test Swift Code

As developers, we are always looking for a better, faster way of doing things. Whenever I am learning a new language that typically runs in an IDE, I begin to look for ways to test code snippets through either the Terminal on Mac or the command prompt on Windows. Swift is no exception. As I've been working more and more with this language, I've uncovered four ways to quickly test Swift code that are not only great for your day-to-day job, but can also be used to collaborate and help others learn this new language.

#1 : REPL (Read-Eval-Print-Loop)

Xcode's debugger includes an interactive version of the Swift language, known as the REPL (Read-Eval-Print-Loop). This allows you to try out the Swift language within LLDB in Xcode's console, or from Terminal. If you have Xcode 6.1 or higher, you can simply open your terminal and type:

    swift

You can also invoke it with the following commands on earlier versions of Xcode 6:

    xcrun swift
    lldb --repl

This is great for quick code snippets that you might want to try without launching Xcode.

#2 : Swift playgrounds

Swift playgrounds are a way to compile and run Swift code live as you type. The results of each line are presented in a timeline as they execute, and variables can be inspected at any point. Playgrounds are typically created as a standalone project, but they can be created within an existing Xcode project as well. There are plenty of sample playgrounds out there, and you are free to use mine to get started. The timeline provides a visual look at arrays, for loops and more.

The obvious reason to use Swift playgrounds is the rich editor that includes syntax highlighting, code completion and more. The disadvantage is that you have to open Xcode in order to do so.

#3 : Using an Online Editor

SwiftStub has become one of the most popular ways to compile and run Swift code on the fly without requiring a Mac. All you need is a web browser open to SwiftStub and off you go. It includes the functionality that you would expect, such as custom URLs and uploading or saving a playground, but it also supports team collaboration. You can easily add people to your current Swift project and even add audio and group chat if necessary.

#4 : Using iTerm2 with Guard-shell

This is my preferred environment, but it is geared towards power users who don't mind spending a few extra minutes setting it up. Don't worry if you have never done this before, as I'll walk you through the process step by step.

I prefer to use iTerm2. Think of it as a replacement for the Terminal app on Mac. In the words of the authors, “iTerm2 brings the terminal into the modern age with features you never knew you always wanted.” I've been using it for a couple of months and couldn't agree more. We are also going to use Guard-shell to automatically run shell commands when watched files are modified. In this case, we'll be watching files with the .swift extension.

Once you have these applications downloaded, you only need to remember a few commands to get started…

Within iTerm2, press ⌘D to get a vertical split and ⇧⌘D for a horizontal split. Navigate to your home directory and type:

    vim Gemfile

Once you are inside the Gemfile, you will need to switch to “Insert” mode. Type the following, and when you are finished press “esc” and then type :w to save the file. Type :x to save and exit vim.

    source 'https://rubygems.org'
    gem 'guard-shell'

You will now have a file named Gemfile and it is time to install the gem. Simply type:

    bundle install

You should then see the following:

    Fetching gem metadata from https://rubygems.org/............
    Fetching version metadata from https://rubygems.org/..
    Resolving dependencies...
    Using hitimes 1.2.2
    Using timers 4.0.1
    Using celluloid 0.16.0
    Installing coderay 1.1.0
    Using ffi 1.9.8
    Installing formatador 0.2.5
    Using rb-fsevent 0.9.4
    Using rb-inotify 0.9.5
    Using listen 2.9.0
    Installing lumberjack 1.0.9
    Installing nenv 0.2.0
    Installing shellany 0.0.1
    Installing notiffany 0.0.6
    Installing method_source 0.8.2
    Installing slop 3.6.0
    Installing pry 0.10.1
    Installing thor 0.19.1
    Installing guard 2.12.5
    Installing guard-compat 1.2.1
    Installing guard-shell 0.7.1
    Using bundler 1.8.5
    Bundle complete! 1 Gemfile dependency, 21 gems now installed.

Now would be a good time to create a directory that you want guard-shell to monitor for changed .swift files. I created a folder called Swift, then ran the following command:

    bundle exec guard init shell

A new file called Guardfile will be created in that folder. Now type vim Guardfile, enter the following lines and save the file the same way you did before.

    guard :shell do
      watch(/(.*)\.swift$/) do |m|
        puts
        puts
        puts
        puts "Running #{m[0]}"
        puts `swift #{m[0]}`
      end
    end

Finally type:

    bundle exec guard

If everything worked successfully, Guard-shell will inform you that it is watching a folder. Switch over to your left-hand panel, make sure you are in the folder that Guard is watching, type “vim test.swift” and enter the following Swift code:

    var first = "hello"
    var second = "world"
    println("\(first) \(second)")

Use :w to save the file and see the output in the right-hand panel.

Wrap-up

Hopefully you can find a solution that works for your development process out of the four options that I presented today. I assume that, since you are interested in testing Swift code snippets, you are building Swift apps as well. You may be interested in my article on how to build a task app in Swift. In addition, Telerik provides several powerful UI components for iOS such as Charts, Calendar, ListView and more.

Thanks for reading and sound off in the comments below with your ideal environment.

May 27, 2015

by Michael Crump

· 9,076 Views

Converting to/from Unix Timestamp in C#

A few days ago, Visual Studio 2015 RC was released. Among the many updates to .NET Framework 4.6 with this release, we now have some new utility methods allowing conversion to/from Unix timestamps. Although these were added primarily to enable more cross-platform support in .NET Core, Unix timestamps are also sometimes useful in a Windows environment. For instance, Unix timestamps are often used to facilitate Redis sorted sets where the score is a DateTime (since the score can only be a double).

Unix timestamp conversion before .NET 4.6

Until now, you had to implement conversions to/from Unix time yourself. That actually isn't hard to do. By definition, Unix time is the number of seconds since 1st January 1970, 00:00:00 UTC. Thus we can convert from a local DateTime to Unix time as follows:

    var dateTime = new DateTime(2015, 05, 24, 10, 2, 0, DateTimeKind.Local);
    var epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
    var unixDateTime = (dateTime.ToUniversalTime() - epoch).TotalSeconds;

We can convert back to a local DateTime by adding the elapsed seconds to the epoch:

    var timeSpan = TimeSpan.FromSeconds(unixDateTime);
    var localDateTime = epoch.Add(timeSpan).ToLocalTime();

Unix timestamp conversion in .NET 4.6

Quoting the Visual Studio 2015 RC release notes:

New methods have been added to support converting DateTime to or from Unix time. The following APIs have been added to DateTimeOffset:

    static DateTimeOffset FromUnixTimeSeconds(long seconds)
    static DateTimeOffset FromUnixTimeMilliseconds(long milliseconds)
    long ToUnixTimeSeconds()
    long ToUnixTimeMilliseconds()

So .NET 4.6 gives us some new methods, but to use them, you'll first have to convert from DateTime to DateTimeOffset. First, make sure you're targeting the right version of the .NET Framework. You can then use the new methods:

    var dateTime = new DateTime(2015, 05, 24, 10, 2, 0, DateTimeKind.Local);
    var dateTimeOffset = new DateTimeOffset(dateTime);
    var unixDateTime = dateTimeOffset.ToUnixTimeSeconds();

…and to change back…

    var localDateTimeOffset = DateTimeOffset.FromUnixTimeSeconds(unixDateTime)
        .DateTime.ToLocalTime();

May 26, 2015

by Daniel D'agostino

· 94,788 Views · 1 Like

How To Set Up a Tomcat, Apache and mod_jk Cluster

In this article I will go through a common set-up for a small production environment: a single-tier, load-balanced application server cluster.

Overview

A high-level overview of what we will be doing:

Downloading and installing Apache HTTP server and mod_jk
Downloading Tomcat
Downloading Java
Configuring two local Tomcat servers
Clustering the two Tomcat servers
Configuring Apache to use mod_jk to forward requests to Tomcat
Deploying an application to the Tomcat servers that tests our set-up

Introduction

What is Apache? Apache is an HTTP server. What is mod_jk? It is an Apache module that allows AJP communication between Apache and a back-end application server like Tomcat. I am running this on Ubuntu 14.04 LTS installed on a dual-boot PC with Windows 7.

Download Apache2

We are going to use Ubuntu's APT package maintenance system to obtain and install Apache2.

    sudo apt-get install apache2

This will install in /etc/apache2

Download and install mod_jk

The mod_jk module is not included in the Apache2 download, so it must be obtained and installed separately. The installation requires that the mod_jk module is visible to Apache and configured so that Apache knows where to look for it and what to do with the requests you want to proxy.

    sudo apt-get install libapache2-mod-jk

This will install in /etc/libapache2-mod-jk. Two files are also added to the /etc/apache2/mods-available folder.

Downloading and installing Tomcat 8

At the time of writing, Tomcat 8 does not have a package in APT, so you must download the binaries from the Tomcat website, http://tomcat.apache.org/download-80.cgi. Select the appropriate binary distribution and extract it as follows.

    tar xvzf apache-tomcat-8.0.5.tar.gz

We need two copies of the Tomcat server to be load balanced. I created two directories in the /opt/ location, /opt/tomcat-server1/ and /opt/tomcat-server2/, and copied Tomcat into each one.
Download and install Java

Download Java from APT as follows:

    sudo apt-get install openjdk-7-jdk

and set JAVA_HOME in .bashrc:

    vim ~/.bashrc
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

Configure two local Tomcat servers

We will edit only the server.xml of the server2 installation of Tomcat. We need to change its port numbers to avoid conflicts with server1 (the workers.properties file below uses AJP port 9009 for server2), and we comment out the HTTP Connector, as we only want the web application to be accessible through the load balancer. Here is my server2 Tomcat server.xml configuration.

Configure mod_jk

Load balancing is configured in the workers.properties file, located in /etc/libapache2-mod-jk/, where workers represent actual or virtual workers. We will define two actual workers, which map to the Tomcat servers, and two virtual workers.

In the worker.list property I have defined two virtual workers, status and loadbalancer. I will refer to these later in the Apache configuration.

Workers for each server have been defined using values from the server.xml configuration files. I used the port values of the AJP connectors and I have included an lbfactor that sets the preference that the load balancer will show for that server.

Finally we define the virtual workers. The loadbalancer worker is set to type lb, and the workers that represent the Tomcat servers are listed in its balance_workers property. The status worker only needs to be set to type status.

    worker.list=loadbalancer,status

    worker.server1.port=8009
    worker.server1.host=localhost
    worker.server1.type=ajp13

    worker.server2.port=9009
    worker.server2.host=localhost
    worker.server2.type=ajp13

    worker.server1.lbfactor=1
    worker.server2.lbfactor=1

    worker.loadbalancer.type=lb
    worker.loadbalancer.balance_workers=server1,server2

    worker.status.type=status

Ensure that you remove any other worker configurations that are not being used.
Configure Apache Web Server to forward requests

You will need to add the following to the Apache configuration located in /etc/apache2/sites-enabled/000-default.conf

    JkMount /status status
    JkMount /* loadbalancer

Verify the installation

To test that everything has been configured correctly we need to deploy an application. A sample application that has been used for years to test such configurations is the ClusterJSP sample application. You can find it by searching the web or on the JBoss site. Now deploy the war to the webapps folder on both servers and start each server using its start-up script, for example /opt/tomcat-server1/bin/startup.sh. Go to http://localhost/clusterjsp/HaJsp.jsp and you should see the page show HttpSession information.

Now let's look at the mod_jk status page: http://localhost/status. This page shows information about the load balancer worker and the workers it is balancing. If everything is working, the worker error state will show OK, or OK/IDLE if the workers are not currently balancing load.

Things to try out

Enable sticky sessions: configure jvmRoute in the server.xml configuration.

Further reading

Loadbalancing with mod_jk and Apache
Working with mod_jk
Connecting Apache's Web Server to Multiple Instances of Tomcat
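
Before opening the status page, it can help to confirm that both Tomcat instances are actually accepting connections on the AJP ports named in workers.properties (8009 and 9009). The following is a minimal sketch of such a check using only the Java standard library. The class name AjpPortCheck and the helper isListening are hypothetical names chosen for illustration, and the check only proves that something accepts TCP connections on the port, not that it speaks AJP.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class AjpPortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing usable on that port.
            return false;
        }
    }

    public static void main(String[] args) {
        // Ports taken from the worker.server1 and worker.server2 entries.
        int[] ajpPorts = {8009, 9009};
        for (int port : ajpPorts) {
            String state = isListening("localhost", port, 1000) ? "listening" : "not reachable";
            System.out.println("AJP port " + port + ": " + state);
        }
    }
}
```

If one of the ports is reported as not reachable, the corresponding Tomcat is probably not started or its AJP connector port in server.xml does not match workers.properties.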

May 19, 2015

by Alex Theedom

· 10,841 Views · 1 Like