The Latest Tools Topics

8 New & Beefed Up NetBeans Keyboard Shortcuts!
My colleague Tom McGinn recently published 10 Time Savers in NetBeans, an excellent overview of tips that everyone using NetBeans should read. For those of us who are keyboard oriented (i.e., we hate the mouse because it breaks our workflow), here is an appendix to Tom's article, listing the very newest keyboard shortcuts available in NetBeans IDE 7. Most are new or changed in NetBeans IDE 7.2, while the last one was introduced in NetBeans IDE 7.1 and is unchanged since then.

Alt + Scroll Up/Down. Increase/decrease the font size. Font resizing, first introduced in NetBeans IDE 7.1, was initially done by holding down the Ctrl key while scrolling. However, this conflicted with other actions, as explained in issue 212484. Following discussions with the NetBeans community, the Ctrl key was replaced by the Alt key. Now, when you hold down the Alt key and scroll up/down in any file, the font size increases/decreases.

Alt + Shift + Page Up/Down. Quickly move entire code elements, i.e., statements and class members, up or down. Suppose the cursor is on the first line of a method. If you were to use the standard Alt + Shift + Down, only that line would move down, which would immediately create an error in the editor because the class can no longer be compiled. However, if you use Alt + Shift + Page Down instead (i.e., Page Down instead of simply Down), the entire method moves down and appears below the method that was previously below it. In other words, the move action now has semantic knowledge of the code being moved, as well as of the code around it. A similarly semantic-aware keyboard shortcut, though not new in NetBeans IDE 7, is Alt + Shift + Period, which expands the selection from the currently selected word to the next semantic level, e.g., method or class, and back again via Alt + Shift + Comma.

Alt + Backspace. Quickly remove the enclosing parts of a nested statement. Put the cursor within the word "for", "if", or "else", for example, then press Alt + Backspace and a popup is displayed. Move down or up in the list and then press Enter to confirm.

Ctrl + Shift + M. Add/remove a bookmark to/from the Bookmarks Window. As always, you can set bookmarks in your code. Now, however, in addition to being marked in the file, the bookmarks are added to the new Bookmarks Window (available via Window | Navigating | Bookmarks). From the Bookmarks Window, you can jump across all your files and all your projects so that you can, for example, create a task-oriented track through your projects. In a related change, when you use Ctrl + Shift + Period or Ctrl + Shift + Comma to move back and forward between bookmarks, a handy popup appears, listing all the bookmarks found within the Bookmarks Window.

Alt + Shift + F. The "Reformat" action can now be used across multiple projects, packages, and classes, for the first time. When you use the familiar Alt + Shift + F while multiple nodes are selected in the Projects window, all files within the selected nodes are reformatted at the same time.

Ctrl + Z. The "Undo" keyboard shortcut now applies to refactorings too, for the first time. In previous releases, you needed a separate action for undoing refactorings. It was hidden in the Refactoring menu, hard to find, and therefore rarely used, because typically you wouldn't know it existed.

Ctrl + Space. Once you have pressed Ctrl-F or Ctrl-H to open the Find bar or the Search bar at the bottom of any file (not only Java source files, but also HTML files, for example), you can press Ctrl + Space to use code completion within the Find field. This is a quick way to get an overview of the items you could be trying to find. In the Debugger, the same is true in the New Breakpoint dialog.

Ctrl + Shift + R. Change the cursor to a block selector. Then put the cursor at the top left of the block and Shift+Click its bottom right corner. You now have a block that you can manipulate: move, cut, copy, or delete it, or change all the lines within it. Start typing while a block is selected and all the lines change simultaneously. This applies to all file types, not only Java source files.

A final tip: if you go to the Help menu in NetBeans IDE, you'll find the updated NetBeans IDE Keyboard Shortcut Card in PDF format, ideal for printing out and hanging in a nice frame in your workspace!
August 19, 2012
by Geertjan Wielenga
· 54,822 Views
How to Migrate Drupal to Azure Web Sites
DrupalCon Munich is next week, and I am lucky enough to be going. As part of preparing for the conference, I thought it would be worthwhile to see just how easy (or difficult) it would be to migrate an existing Drupal site to Windows Azure Web Sites. So, in this post, I'll do just that.

Fortunately, because Windows Azure Web Sites supports both PHP and MySQL, the migration process is relatively straightforward. And, because Drupal and PHP run on any platform, the process I'll describe should work for moving Drupal to Windows Azure Web Sites regardless of what platform you are moving from. Of course, Drupal installations can vary widely, so YMMV. I tested the instructions below on a relatively small (and simple) Drupal installation running on CentOS 5. (Unfortunately, I won't be using Drush since it isn't supported on Windows Azure Web Sites.) If you are considering moving a large and complex Drupal application, you may want to consider moving to Windows Azure Cloud Services (more information about that here: Migrating a Drupal Site from LAMP to Windows Azure).

Before getting started, it's worth noting that Windows Azure Web Sites lets you run up to 10 web sites for free in a multitenant environment. And, you can seamlessly upgrade to private, reserved VM instances as your traffic grows. To sign up, try the Windows Azure 90-day free trial.

1. Create a Windows Azure Web Site and MySQL database

There is a step-by-step tutorial on http://www.windowsazure.com that walks you through creating a new website and a MySQL database, so I'll refer you there to get started: Create a PHP-MySQL Windows Azure web site and deploy using Git. If you intend to use Git to publish your Drupal site, then go ahead and follow the instructions for setting up a Git repository. Make sure to follow the instructions in the Get remote MySQL connection information section, as you will need that information later. You can ignore the remainder of the tutorial for the purposes of deploying your Drupal site, but if you are new to Windows Azure Web Sites (and to Git), you might find the additional reading informative.

OK, now you have a new website with a MySQL database, you have your MySQL database connection information, and you have (optionally) created a remote Git repository and made note of the Git deployment instructions. Now you are ready to copy your database to MySQL in Windows Azure Web Sites.

2. Copy database to MySQL in Windows Azure Web Sites

I'm sure there is more than one way to copy your Drupal database, but I found the mysqldump tool to be effective and easy to use. To copy from a local machine to Windows Azure Web Sites, here's the command I used:

    mysqldump -u local_username --password=local_password drupal | mysql -h remote_host -u remote_username --password=remote_password remote_db_name

You will, of course, have to provide the username and password for your existing Drupal database, and you will have to provide the hostname, username, password, and database name for the MySQL database you created in step 1. This information is available in the connection string that you should have noted in step 1. It should look something like this:

    Database=remote_db_name;Data Source=remote_host;User Id=remote_username;Password=remote_password

Depending on the size of your database, the copying process could take several minutes. Now your Drupal database is live in Windows Azure Web Sites. Before you deploy your Drupal code, you need to modify it so it can connect to the new database.

3. Modify database connection info in settings.php

Here, you will again need your new database connection information. Open the /drupal/sites/default/settings.php file in your favorite text editor, and replace the values of 'database', 'username', 'password', and 'host' in the $databases array with the correct values for your new database. When you are finished, you should have something similar to this:

    $databases = array (
      'default' => array (
        'default' => array (
          'database' => 'remote_db_name',
          'username' => 'remote_username',
          'password' => 'remote_password',
          'host' => 'remote_host',
          'port' => '',
          'driver' => 'mysql',
          'prefix' => '',
        ),
      ),
    );

Be sure to save the settings.php file. Then you are ready to deploy.

4. Deploy Drupal code using Git or FTP

The last step is to deploy your code to Windows Azure Web Sites using Git or FTP.

If you are using FTP, you can get the FTP hostname and username from your website's dashboard. Then, use your favorite FTP client to upload your Drupal files to the /site/wwwroot folder of the remote site.

If you are using Git, you need to set up a Git repository in Windows Azure Web Sites (steps for this are in the tutorial mentioned earlier). You will also need Git installed on your local machine. Then, just follow the instructions provided after you created the repository.

One note about using Git here: depending on your Git settings and your .gitignore file (a hidden file and a sibling to the .git folder created in your local root directory when you executed git init), some files in your Drupal application may be ignored. In my case, all the files in the sites directory were ignored. If this happens, you will want to edit the .gitignore file so that these files aren't ignored, and then redeploy.

After you have deployed Drupal to Windows Azure Web Sites, you can continue to deploy updates via Git or FTP.

Related information

If you are looking for more information about Windows Azure Web Sites, these posts might be helpful:

  • Windows Azure Websites - A PHP Perspective
  • Windows Azure Websites, Web Roles, and VMs - When to use which
  • Configuring PHP in Windows Azure Websites with .user.ini Files

One last thing you might consider, depending on your site, is using the Windows Azure Integration Module to store and serve your site's media files.
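The connection string from step 2 maps directly onto the settings.php values from step 3. As a minimal sketch of that mapping, assuming the string keeps the `Key=Value;Key=Value` layout shown above, the following Python helper splits it into the four Drupal keys. The function name `parse_connection_string` is hypothetical and not part of Drupal or Azure, and a password containing a semicolon would need extra handling.

```python
def parse_connection_string(conn: str) -> dict:
    """Split a 'Key=Value;Key=Value' connection string into Drupal-style settings."""
    # Connection-string keys on the left, $databases keys from settings.php on the right.
    key_map = {
        "Database": "database",
        "Data Source": "host",
        "User Id": "username",
        "Password": "password",
    }
    settings = {}
    for part in conn.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        key, _, value = part.partition("=")  # split on the first '=' only
        key = key.strip()
        if key in key_map:
            settings[key_map[key]] = value
    return settings


if __name__ == "__main__":
    example = ("Database=remote_db_name;Data Source=remote_host;"
               "User Id=remote_username;Password=remote_password")
    # Print the values in the shape they take inside the $databases array.
    for name, value in parse_connection_string(example).items():
        print(f"'{name}' => '{value}',")
```

Printing the values this way makes it easy to compare them by eye with the array in settings.php before deploying.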
August 19, 2012
by Brian Swan
· 10,214 Views
JaCoCo Jenkins Plugin
In my post about JaCoCo and Maven I wrote about the problems of using the JaCoCo Maven plugin in multimodule Maven projects: it produces one report for each module separately instead of one report for all modules. I also showed how this can be fixed using the JaCoCo Ant Task. In this post we are going to see how to use the JaCoCo Jenkins plugin to achieve the same goal as the Ant Task and get overall code coverage statistics for all modules.

The first step is installing the JaCoCo Jenkins plugin. Go to Jenkins -> Manage Jenkins -> Plugin Manager -> Available and find JaCoCo Plugin.

The next step, if it is not done already, is configuring the JaCoCo Maven plugin in the parent pom:

    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <version>${jacoco.version}</version>
      <executions>
        <execution>
          <goals>
            <goal>prepare-agent</goal>
          </goals>
        </execution>
        <execution>
          <id>report</id>
          <phase>prepare-package</phase>
          <goals>
            <goal>report</goal>
          </goals>
        </execution>
      </executions>
    </plugin>

Finally, a post-build action must be configured on the job responsible for packaging the application. Note that with the previous pom file, reports are generated just before the package goal is executed.

Go to Configure -> Post-build Actions -> Add post-build action -> Record JaCoCo coverage report. Then we have to set the folders or files containing the JaCoCo XML reports, which with the previous pom are found at **/target/site/jacoco/jacoco*.xml, and also set the thresholds at which we consider a build healthy in terms of coverage.

Then we can save the job configuration and run the build.

After the project is built, a new report called code coverage trend will appear just under the test result trend graph, where we can see the code coverage of all project modules.

From the left menu, you can enter Coverage Report and see the code coverage of each module separately.

Furthermore, the Jenkins main page gives you a quick overview of a job when you mouse over its weather icon.

Keep in mind that this approach for merging code coverage files will only work if you are using Jenkins as your CI system. The Ant Task is a more generic solution and can also be used with the JaCoCo Jenkins plugin.

We Keep Learning, Alex.
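To make the idea of one overall figure for all modules concrete, here is a rough Python sketch of the aggregation. It is not how the Jenkins plugin is implemented. It assumes each module has written a JaCoCo XML report under the `**/target/site/jacoco/jacoco*.xml` pattern mentioned above, and that each report carries report-level `<counter type="LINE" missed="..." covered="..."/>` elements. The function name `overall_line_coverage` is made up for this illustration.

```python
import glob
import os
import xml.etree.ElementTree as ET


def overall_line_coverage(root_dir: str) -> float:
    """Sum the report-level LINE counters of every module's JaCoCo XML report."""
    pattern = os.path.join(root_dir, "**", "target", "site", "jacoco", "jacoco*.xml")
    missed = covered = 0
    for path in glob.glob(pattern, recursive=True):
        report = ET.parse(path).getroot()
        # Only direct children of <report>: the nested package and class
        # counters repeat the same lines and would be counted twice.
        for counter in report.findall("counter"):
            if counter.get("type") == "LINE":
                missed += int(counter.get("missed"))
                covered += int(counter.get("covered"))
    total = missed + covered
    # Return a percentage; an empty project is reported as 0.0.
    return 100.0 * covered / total if total else 0.0


if __name__ == "__main__":
    print(f"Overall line coverage: {overall_line_coverage('.'):.1f}%")
```

Summing missed and covered lines before dividing weights each module by its size, which is what a single multimodule report would do. Averaging the per-module percentages would give a different, less meaningful number.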
August 14, 2012
by Alex Soto
· 58,506 Views · 4 Likes
How to do a Production Hotfix
Situation

It's Thursday/Friday evening, the daily version / master branch was deemed too risky to install, and you decide to wait until Sunday/Monday to deploy to production. Then a new critical bug is found in production. We do not want to install the fix on top of all the other changes, because of the risk factor.

What do we do?

Develop the fix on top of the production branch, on our local machine, git push, and deploy the fix without all the other changes.

How can I do this?

My example uses a Play Framework service, but that's immaterial.

Review the situation:

    gitk --all

Suppose the latest version deployed in prod is 1.2.3, and master has some commits after that. Check out this version:

    git checkout 1.2.3

Create a new branch for this hotfix:

    git checkout -b 1.2.3_hotfix1

Fix the bug locally, and commit. Test it locally. Then push the branch:

    git push origin 1.2.3_hotfix1

On the production machine (fetch, not pull!):

    git fetch
    sudo service play stop
    git checkout 1.2.3_hotfix1
    sudo service play start

Test on production.

Merge the fix back to master:

    git checkout master
    git merge 1.2.3_hotfix1
    git push

Clean up the local branch:

    git branch -d 1.2.3_hotfix1

(Note: the branch will still be saved on origin, so you're not losing any information by deleting it locally.)
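Because the same sequence repeats for every hotfix, it can help to write it down once as data. The Python sketch below only builds the command lists for a given version and runs nothing. The function name `hotfix_plan`, its parameters, and the explicit `git push origin <branch>` form are assumptions made for this illustration, while the branch naming and the `play` service name follow the example above.

```python
def hotfix_plan(version: str, number: int = 1, service: str = "play") -> dict:
    """Return the hotfix commands grouped by where they are run."""
    branch = f"{version}_hotfix{number}"
    return {
        # On the developer machine: branch off the deployed version.
        "developer": [
            f"git checkout {version}",
            f"git checkout -b {branch}",
            # (fix the bug, commit, and test locally before pushing)
            f"git push origin {branch}",
        ],
        # On the production machine: fetch only, then switch to the hotfix branch.
        "production": [
            "git fetch",
            f"sudo service {service} stop",
            f"git checkout {branch}",
            f"sudo service {service} start",
        ],
        # Back on the developer machine: bring the fix into master and tidy up.
        "merge_back": [
            "git checkout master",
            f"git merge {branch}",
            "git push",
            f"git branch -d {branch}",
        ],
    }


if __name__ == "__main__":
    for stage, commands in hotfix_plan("1.2.3").items():
        print(f"# {stage}")
        for command in commands:
            print(command)
```

Keeping the plan as plain strings means it can be reviewed or pasted into a runbook before anyone touches production.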
August 14, 2012
by Ron Gross
· 12,063 Views
Gradle Plugin for NetBeans IDE 7.2
I (https://github.com/kelemen) have been using Maven for Java development, and I still do. My main reason for using Maven is its great NetBeans IDE support, which means I don't need to maintain IDE project files separately. As much as I like this support in the Maven world, I feel the limits of Maven every day I use it. When I first saw the Gradle project, I knew that it offered the least I had always wanted from a build tool. So I started to look for NetBeans IDE support for Gradle. To my sadness, there is only support for Eclipse and IDEA. Aside from the fact that I prefer to use NetBeans IDE, I felt the IDE support for Gradle to be limited (although the last time I read about Gradle support in IDEA, it seemed promising).

Not long ago, I came across Geertjan's plugin and I felt that writing such a plugin is possible without enormous effort. So I downloaded his sources and started to analyze them and rewrite the plugin, so that it works with most Gradle scripts.

There are many new features available in my version, such as these:

  • slow tasks are done in a background thread
  • source paths are retrieved from the model
  • "test single"

Screenshots (images not included): project menus, project dependencies, project debug test.

However, I removed subprojects, and now each project needs to be opened manually. This is more efficient if you don't plan to edit all the subprojects. The drawback is that you cannot open projects without a build.gradle.

The main problem with Gradle daemon performance is that on the first project load, Gradle downloads every single dependency. After that, I found the performance acceptable (especially after I implemented caching of already loaded projects). I have tested it with a relatively large project (>60 subprojects, lots of Java code): it took about 2 minutes to load, which seems OK to me for such an enormous project. Other than the project loading, the performance depends on NetBeans, which is good.

How to try it

There is currently no compiled version of the plugin, so you have to compile it yourself if you want to try it. You can clone/download the sources from here: https://github.com/kelemen/netbeans-gradle-project

After downloading the sources, open the project in NetBeans IDE. There are generally two approaches you could consider:

  • Generate the .nbm file (by choosing "Create NBM" in the project's popup) and install the plugin as you would any other third-party plugin.
  • "Run" the project. This is the safest thing to do because it starts a new NetBeans instance with a brand new user directory (in the build folder). This way your own NetBeans installation will be unaffected.

Help I would appreciate

I'm pretty new to the NetBeans APIs (i.e., this is my first time using them), so someone might help me with the project dependencies (possibly Mavenize the plugin). And, if it is possible, allow the plugin to rely on a user-specified installation of Gradle (there is some risk in this because Gradle does not seem to be very backward compatible). If you happen to know the Project API in NetBeans well, that could prove really helpful, so that I don't need to spend days figuring out how things need to be done in the API.

How to contact me

You can contact me through my GitHub account: https://github.com/kelemen
August 12, 2012
by Attila Kelemen
· 11,268 Views
Build Flow Jenkins Plugin
With the advent of Continuous Integration and Continuous Delivery, our builds are split into different steps that form the deployment pipeline. These steps can include compiling and running fast tests, running slow tests, running automated acceptance tests, or releasing the application, to cite a few.

Most of us are using Jenkins/Hudson to implement Continuous Integration/Delivery, and we manage job orchestration by combining Jenkins plugins like build pipeline, parameterized-build, join or downstream-ext. We have to configure all of them, which spreads the orchestration logic across the configuration of multiple jobs and makes the system configuration very complex to maintain.

Build Flow enables us to define an upper-level flow item to manage job orchestration and link-up rules, using a dedicated DSL. Let's see a very simple example.

The first step is installing the plugin. Go to Jenkins -> Manage Jenkins -> Plugin Manager -> Available and find the CloudBees Build Flow plugin.

Then you can go to Jenkins -> New Job and you will see a new kind of job called Build Flow. In this example we are going to name it build-all-yy.

Now you only have to program, using the flow DSL, how this job should orchestrate the other jobs. In the "Define build flow using flow DSL" text input you can specify the sequence of commands to execute.

In the current example I have already created two jobs, one executing the clean compile goal (job name yy-compile) and the other executing the javadoc goal (job name yy-javadoc). I know that this deployment pipeline is not realistic for a true environment, but for now it is enough.

We want the javadoc job to run after the project is compiled. To configure this we don't have to create any upstream or downstream actions. We simply add the following lines in the DSL text area:

    build("yy-compile");
    build("yy-javadoc");

Save and execute the build-all-yy job, and both projects will be built sequentially.

Now suppose that we add a third job called yy-sonar, which runs the sonar goal to generate a Sonar code quality report. In this case it seems obvious that, after the project is compiled, the javadoc and code quality jobs can run in parallel. So the script is changed to:

    build("yy-compile")
    parallel (
        { build("yy-javadoc") },
        { build("yy-sonar") }
    )

This plugin also supports more operations, like retry (similar in behaviour to the retry-failed-job plugin) or guard-rescue, which works mostly like a try+finally block. You can also create parameterized builds, access the build execution, or print to the Jenkins console. The next example prints the build number of the yy-compile job execution:

    b = build("yy-compile")
    out.println b.build.number

Finally, you can also get a quick graphical overview of the execution in the Status section. It could certainly be improved, but for now it is acceptable and can be used without any problem.

The Build Flow plugin is in its early stages; in fact it is only at version 0.4. But it is a plugin to consider in the future, and I think it is good to know that it exists. Moreover, it is being developed by the CloudBees folks, which suggests it will be well supported on Jenkins.

We Keep Learning. Alex.

Warning: In order to run parallel tasks with the plugin, Anonymous users must have Read Job access (Jenkins -> Manage Jenkins -> Configure System). There is already an issue filed in Jira to fix this problem.
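For readers who do not know the flow DSL, the semantics of "compile first, then javadoc and sonar in parallel" can be imitated in plain Python. This is only an analogy and not the plugin's implementation: `build` here is a stand-in that records the job name instead of triggering Jenkins, and `run_flow` is a made-up name.

```python
from concurrent.futures import ThreadPoolExecutor


def build(job: str, log: list) -> str:
    """Stand-in for triggering a job; a real flow would call Jenkins here."""
    log.append(job)
    return job


def run_flow() -> list:
    """Run yy-compile alone, then yy-javadoc and yy-sonar side by side."""
    log = []
    build("yy-compile", log)  # sequential step: finishes before anything else starts
    with ThreadPoolExecutor() as pool:
        # Both jobs are submitted before either result is awaited.
        futures = [pool.submit(build, job, log) for job in ("yy-javadoc", "yy-sonar")]
        for future in futures:
            future.result()  # re-raises any exception from the job
    return log


if __name__ == "__main__":
    print(run_flow())
```

The compile job is always first in the log, while the order of the two parallel jobs is not guaranteed, which mirrors what the `parallel` block promises.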
August 2, 2012
by Alex Soto
· 37,654 Views · 1 Like
Bringing Order to Your Jenkins Jobs
Once you've been working with Jenkins and uberSVN for a while, you may find yourself in a situation where you have several jobs that need to run in a specific order, for example:

  • Job 1 and Job 3 can run simultaneously.
  • BUT Job 2 should only start when Job 1 and Job 3 have finished running.
  • AND Job 4 should only start when Job 2 has finished.

How can you implement this complicated setup? This is where Jenkins' 'Advanced Project Options' and build triggers come in handy. In this tutorial, we'll walk through the different options for scheduling jobs using Jenkins and uberSVN, the free ALM platform for Apache Subversion.

Note: this tutorial assumes you have already created a job and configured it to automatically poll your Subversion repository.

1) Open the Jenkins tab of your uberSVN installation and select a job.
2) Click the 'Configure' option from the left-hand menu.
3) In the 'Advanced Project Options' tab, select the 'Advanced…' button.
4) This contains two options that are useful for ordering your jobs:

  • Block build when upstream project is building: blocks builds when a dependency is in the queue, or building. Note that these dependencies include both direct and transitive dependencies.
  • Block build when downstream project is building: blocks builds when a child of the project is in the queue, or building. This applies to both direct and transitive children.

If these options don't meet your needs, you can explicitly name a project (or projects) that must be built before your job is allowed to run. To set this:

1) Scroll down to the 'Build triggers' tab on the configure page.
2) Select the 'Build after other projects are built' checkbox. This will bring up a text box where you can list any number of projects.

Utilized properly, the build triggers and advanced project options should allow you to organize your jobs into a schedule.

Tip: if you need even more control over your build schedule, there are plenty of scheduling plugins available. To add plugins to Jenkins, simply:

1) Open the 'Manage Jenkins' screen.
2) Click the 'Manage Plugins' link.
3) Open the 'Available' tab and select the appropriate plugins from the list.
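The ordering rules from the Job 1 to Job 4 example amount to a small dependency graph. As a sketch of how the resulting schedule can be worked out (outside Jenkins, purely for illustration), Python's standard `graphlib` module can group the jobs into waves that may run at the same time. The function name `build_waves` is an assumption of this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_waves(dependencies: dict) -> list:
    """Group jobs into waves; every job in a wave can run simultaneously."""
    sorter = TopologicalSorter(dependencies)  # maps job -> jobs it must wait for
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose prerequisites are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so the next one unlocks
    return waves


if __name__ == "__main__":
    # Job 2 waits for Job 1 and Job 3; Job 4 waits for Job 2.
    deps = {"Job 2": {"Job 1", "Job 3"}, "Job 4": {"Job 2"}}
    for number, wave in enumerate(build_waves(deps), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

For the example above this yields three waves: Job 1 with Job 3, then Job 2, then Job 4. In Jenkins terms, each dependency corresponds to a 'Build after other projects are built' entry.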
July 28, 2012
by Jessica Thornsby
· 21,041 Views
Set up a Nightly Build Process with Jenkins, SVN and Nexus
We wanted to set up a nightly integration build with our projects so that we could run unit and integration tests on the latest version of our applications and their underlying libraries. We have a number of libraries that are shared across multiple projects, and we wanted this build to run every night and use the latest versions of those libraries, even if our applications had a specific release version defined in their Maven POM file. In this way we would be alerted early if someone added a change to one of the dependency libraries that could break an application when the developer upgraded the dependent library in a future version of the application.

Updating Versions Nightly

Both the crossdock-shared and messaging-shared libraries depend on the siestaframework library. The crossdock web service and crossdockmessaging applications both depend on the crossdock-shared and messaging-shared libraries. Because of this dependency structure, we wanted the siestaframework library built first. The crossdock-shared and messaging-shared libraries could be built in parallel, but we didn't want the builds for the crossdock web service and crossdockmessaging applications to begin until all the libraries had finished building.

We also wanted the nightly build to tag Subversion with the build date as well as upload the artifact to our Nexus "nightly build" repository. The resulting artifact would look something like siestaframework-20120720.jar.

Also, as I mentioned, the crossdockmessaging app may specify in its POM file that it depends on version 5.0.4 of the siestaframework library. For the purposes of the nightly build, however, we wanted it to use the freshly built siestaframework-nightly-20120720.jar version of the library.

The first problem to tackle was getting the current date into the project's version number. For this I started with the Jenkins ZenTimestamp plugin. With this plugin the format of Jenkins' BUILD_ID timestamp can be changed. I used it to specify the format yyyyMMdd for the timestamp.

The next step was to get the timestamp into the version number of the project. I was able to accomplish this by using the Maven Versions plugin. One thing the Versions plugin can do is dynamically override the version number in the POM file at build time. The snippet from the siestaframework POM file is below.

    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>versions-maven-plugin</artifactId>
      <version>1.3.1</version>
    </plugin>

At this point the Jenkins job can be configured to invoke the "versions:set" goal, passing in the new version string to use. The ${BUILD_ID} Jenkins variable will hold the newly formatted date string. This will produce an artifact with the name siestaframework-nightly-20120720.jar.

Uploading Artifacts to a Nightly Repository

Since this job needed to upload the artifact to a different repository from the release repository defined in our project POM files, the "altDeploymentRepository" property was used to pass in the location of the nightly repository. The deployment portion of the siestaframework job specifies the location of the nightly repository, where ${lynden_nightly_repo} is a Jenkins variable containing the nightly repo URL.

Tagging Subversion

Finally, the Jenkins Subversion Tagging plugin was used to tag SVN if the project was successfully built. The plugin provides a post-build action for the job.

Dynamically Updating Dependencies

Now that the main project is set up, the dependent projects are set up in a similar way, but they need to be configured to use the siestaframework-nightly-20120720 version of the dependency rather than whatever version they currently have specified in their POM file. This can be accomplished by changing the POM to use a property for the version number of the dependency. For example, if the snippet below was the original POM file:

    <dependency>
      <groupId>com.lynden</groupId>
      <artifactId>siestaframework</artifactId>
      <version>5.0.1</version>
    </dependency>

changing it to the following would allow the siestaframework version to be set dynamically:

    <properties>
      <siesta.version>5.0.1</siesta.version>
    </properties>

    <dependency>
      <groupId>com.lynden</groupId>
      <artifactId>siestaframework</artifactId>
      <version>${siesta.version}</version>
    </dependency>

This version can then be overridden by the Jenkins job, as is done in the Jenkins configuration for the crossdock-shared build.

Enforcing Build Order

The final step in this process is setting up a structure to enforce the build order of the projects. The dependencies are set up in such a way that siestaframework needs to be built first, and the crossdock-shared and messaging-shared libraries can be built concurrently once siestaframework finishes. The crossdock web service and crossdockmessaging application jobs can be run concurrently, too, but not until after both shared libraries have finished.

Setting up the crossdock-shared and messaging-shared jobs to be built after siestaframework finishes is pretty straightforward. In the Jenkins job configuration for both shared libraries, a build trigger on the siestaframework job is added.

To satisfy the requirement that the apps build only after all libraries have built, I enlisted the help of the Join plugin. The Join plugin can be used to execute a job once all "downstream" jobs have completed. What does this mean exactly? The crossdock-shared and messaging-shared jobs are "downstream" from the siestaframework job. Once both of these jobs complete, a join trigger can be used to start other jobs.

In this case, rather than having the join trigger kick off the app jobs directly, I created a dummy join job. In this way, as we add more application builds, we don't need to keep modifying the siestaframework job with each new application job. To implement this, siestaframework has a new post-build action whose join trigger starts join-build, a Jenkins job I configured that does not do anything when executed.

Then our crossdock web service and crossdockmessaging applications define their builds to trigger as soon as join-build has completed.

In this way we are able to run builds each night that update to the latest version of our dependencies, tag SVN, and archive the binaries to Nexus. I'd love to hear feedback from anyone who is handling nightly builds via Jenkins, and how they have handled the configuration and build issues.
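The date-stamped naming scheme described above is easy to get subtly wrong (for example, by confusing the month and minute pattern letters), so here is a small Python sketch of it. The helper names `nightly_version`, `artifact_name` and `versions_set_args` are hypothetical. The `-DnewVersion` parameter is the one the Versions plugin's `versions:set` goal accepts, but how the job in this article actually passed it is an assumption.

```python
from datetime import date


def nightly_version(build_date: date) -> str:
    """Version string such as 'nightly-20120720' (year, month, day)."""
    return "nightly-" + build_date.strftime("%Y%m%d")


def artifact_name(artifact_id: str, build_date: date) -> str:
    """File name Maven would produce for the given artifact and build date."""
    return f"{artifact_id}-{nightly_version(build_date)}.jar"


def versions_set_args(build_date: date) -> list:
    """Maven arguments that override the POM version for this build only."""
    return ["versions:set", f"-DnewVersion={nightly_version(build_date)}"]


if __name__ == "__main__":
    today = date.today()
    print(artifact_name("siestaframework", today))
    print("mvn " + " ".join(versions_set_args(today)))
```

Deriving the artifact name and the Maven arguments from the same function keeps the producing job and the consuming jobs in agreement about the version string.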
July 25, 2012
by Rob Terpilowski
· 22,825 Views
Hadoop Hive Web Interface
I’ve been playing with Hive recently and liking what I’ve found. In theory at least it provides a very nice, simple way of getting into analysing large data sets. To make it even easier to show other people what you’re up to Hive has a nascent web interface with a little documentation on the wiki On the one hand it’s rather simple at this point, but that should be easily enought to prettify given a bit of time. The bigger problem was getting it working in the first place. What follows worked for me using the latest cloudera packages on debian testing. I’m assuming you already have Hive and Hadoop installed, the basic packages worked fine for me here. Next up you’ll need the JDK (not just the JRE) as their is some compilation that will go on the first time you run the web interface. apt-get install ant sun-java6-jdk Next up I had to modify the installed /etc/hive/conf/hive-site.xml file as follows: I changed this: hive.metastore.uris file:///var/lib/hivevar/metastore/metadb/ Comma separated list of URIs of metastore servers. The first server that can be connected to will be used. To this. Note the hivevar path doesn’t exist so I’m not sure if this was a typo in the source. hive.metastore.uris file:///var/lib/hive/var/metastore/metadb/ Comma separated list of URIs of metastore servers. The first server that can be connected to will be used. I also change the following section regarding the metastore name: javax.jdo.option.ConnectionURL jdbc:derby:;databaseName=/var/lib/hive/metastore/${user.name}_db;create=true JDBC connect string for a JDBC metastore To this, with a fixed name. When using the above confirguration the file was actually called ${user.name} rather than my username being subsituted in. Elsewhere this seems to work fine. javax.jdo.option.ConnectionURL jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true JDBC connect string for a JDBC metastore I’m not convinced the above two changes are needed but have left them here just in case. 
The main tricky part is making sure a load of environment variables are correctly set. The following worked for me:

    export ANT_LIB=/usr/share/ant/lib
    export HIVE_HOME=/usr/lib/hive
    export HADOOP_HOME=/usr/lib/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin
    export JAVA_HOME=/usr/lib/jvm/java-6-sun

All being well, that should allow you to run the hive command with the web interface like so:

    hive --service hwi

That should bring up a webserver on port 9999 where you should see something similar to the screenshot above.
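Since most failures at this stage come down to a missing or empty variable, a quick check before launching can save some head scratching. The following is a minimal Python sketch and not part of Hive: the function name `missing_hive_env` is made up for illustration, and the list of required names is simply the set exported above.

```python
import os

# Variables exported in the steps above; adjust if your layout differs.
REQUIRED_VARS = ("ANT_LIB", "HIVE_HOME", "HADOOP_HOME", "JAVA_HOME")


def missing_hive_env(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_hive_env()
    if missing:
        print("Set these before running 'hive --service hwi': " + ", ".join(missing))
    else:
        print("All required variables are set.")
```

It only checks that the variables exist, not that the paths they point to are valid, which would be a natural next step.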
July 25, 2012
by Gareth Rushgrove
· 16,786 Views · 1 Like
WebSockets vs. SignalR: Why You Should Not Have To Care
Introduction

When the web was founded, it was designed as a system to distribute and update data. If you look at the HTTP protocol you can clearly see these origins. It has methods to GET, PUT, POST or DELETE data, but it is always the client (and if we ignore REST, this means in most cases the browser) that takes the initiative. However, the web is changing. It has evolved from a pure data distribution system to an application distribution system. Today, the magic word of IT vendors for this is the ‘cloud’, but in fact this shift to an application distribution system started some years ago. In order to start massively replacing traditional desktop applications, the web needs a new trick: two-way communication between browser and server. The back end must be able to update parts of a web page without user initiative. A definitive technical solution is underway in the form of web sockets, but do we really need to wait until this technology is broadly supported by the browsers of most users? In my opinion, the answer should be a clear NO.

Where SignalR comes in

When there are limitations, people become creative and are able to circumvent the issues blocking them from developing great products. Many popular web applications and sites are already capable of updating their content dynamically, yet they do not rely on web sockets. How is this possible? Because they rely on patterns such as ‘long polling’. With long polling, the browser sends a request for information to the web server with a very long timeout. The web server does not immediately send data back to the browser, but waits until it has data to send back. When the client receives data from the server, it immediately sends a new request to the server.
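The request cycle just described can be sketched in a few lines. This is a toy illustration in Python and not SignalR code: `LongPollServer`, `handle_poll` and `poll_until` are hypothetical names, and an in-process queue stands in for the HTTP connection that a real server would hold open.

```python
import queue
import threading


class LongPollServer:
    """Toy stand-in for a web server that holds requests open."""

    def __init__(self):
        self._pending = queue.Queue()

    def publish(self, message):
        """Called by the back end when it has something new for the client."""
        self._pending.put(message)

    def handle_poll(self, timeout):
        """Hold the request until data arrives or the timeout expires."""
        try:
            return self._pending.get(timeout=timeout)
        except queue.Empty:
            return None  # timed out with nothing to send


def poll_until(server, expected_count, timeout=1.0):
    """Client loop: as soon as one request returns, issue the next one."""
    received = []
    while len(received) < expected_count:
        message = server.handle_poll(timeout)
        if message is not None:
            received.append(message)
    return received


if __name__ == "__main__":
    server = LongPollServer()
    # Simulate the back end pushing two updates a little later.
    threading.Timer(0.1, server.publish, args=("price changed",)).start()
    threading.Timer(0.2, server.publish, args=("new comment",)).start()
    print(poll_until(server, expected_count=2))
```

The client never sleeps between requests: each response, or timeout, is followed straight away by the next request, which is what creates the impression of a server push.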
This long polling pattern and similar patterns give the user the illusion that a persistent two-way connection exists between the browser and the web server, but they cause some unnecessary hard work for application developers, and this is where SignalR comes in for .NET developers. It abstracts the long polling pattern and gives application developers the same illusion as their end users: a persistent two-way connection between browser and web server. SignalR takes care of all the details and allows developers to focus on their most important task: building a great application for users. Because SignalR abstracts the underlying communication protocol, it can support both WebSockets and fallback patterns such as long polling. This makes an upgrade to WebSockets fairly easy when your organization adopts Windows Server 2012 and your users have moved to modern browsers such as Internet Explorer 10 or recent versions of Firefox and Google Chrome.

Conclusion

As I already wanted to emphasize in the title, comparing WebSockets with SignalR is pointless. Yes, WebSockets is technically superior and will probably give you some extra performance on the server side. But SignalR makes it possible to start developing the web applications of tomorrow today. When WebSockets become broadly available, SignalR will make it possible for you to move away from long polling without a lot of impact on your code.
July 21, 2012
by Pieter De Rycke
· 42,467 Views
How to Autoscale MySQL on Amazon EC2
Autoscaling your webserver tier is typically straightforward. Image your Apache server with source code or without, then sync down files from S3 upon spinup. Roll that image into the autoscale configuration and you’re all set. With the database tier though, things can be a bit tricky. The typical configuration we see is to have a single master database where your application writes. But scaling out or horizontally on Amazon EC2 should be as easy as adding more slaves, right? Why not automate that process? Below we’ve set out to answer some of the questions you’re likely to face when setting up slaves against your master. We’ve included instructions on building an AMI that automatically spins up as a slave. Fancy!

How can I autoscale my database tier?

Build an auto-starting MySQL slave against your master. Configure those to spin up. Amazon’s autoscaling loadbalancer is one option; another is a roll-your-own solution that monitors thresholds on your servers and spins up or drops off slaves as necessary.

Does an AWS snapshot capture subvolume data or just the SIZE of the attached volume?

If you have an attached EBS volume and you create a new AMI off of that, you will capture the entire root volume, plus your attached volume data. We find this a great way to create an auto-building slave in the cloud.

How do I freeze MySQL during AWS snapshot?

    mysql> flush tables with read lock;
    mysql> system xfs_freeze -f /data

At this point you can use the Amazon web console, ylastic, or the ec2-create-image API call to create the image from the command line. When the server you are imaging restarts, as it will do by default, it will start with the /data partition unfrozen and MySQL’s tables unlocked again. Voila! If you’re not using xfs for your /data filesystem, you should be. It’s fast! The xfsprogs docs seem to indicate this may also work with foreign filesystems. Check the docs for details.

How do I build an AMI mysql slave that autoconnects to master?
Install the mysql_serverid script below.
Configure mysql to use your /data EBS mount.
Set all your my.cnf settings, including server_id.
Configure the instance as a slave in the normal way. When using GRANT to create the ‘rep’ user on the master, specify the host with a subnet wildcard, for example ’10.20.%’. That will subsequently allow any 10.20.x.y server to connect and replicate.
Point the slave at the master.
When all is running properly, edit the my.cnf file and remove server_id. Don’t restart mysql.
Freeze the filesystem as described above.
Use the Amazon console, ylastic or an API call to create your new image.
Test it of course, to make sure it spins up, sets server_id and connects to the master. Make a change in the test schema, and verify that it propagates to all slaves.

How do I set server_id uniquely?

As you hopefully already know, in a MySQL replication environment each node requires a unique server_id setting. In my Amazon Machine Images, I want the server to start up and, if it doesn’t find server_id in the /etc/my.cnf file, to add it there, correctly! Is that so much to ask? Here’s what I did. Fire up your editor of choice and drop in this bit of code:

    #!/bin/sh
    if grep -q "server_id" /etc/my.cnf
    then
      : # do nothing - it's already set
    else
      # extract numeric component from hostname - should be internal IP in Amazon environment
      export server_id=`echo $HOSTNAME | sed 's/[^0-9]*//g'`
      echo "server_id=$server_id" >> /etc/my.cnf
      # restart mysql
      /etc/init.d/mysql restart
    fi

Save that snippet at /root/mysql_serverid. Also be sure to make it executable:

    $ chmod +x /root/mysql_serverid

Then just append it to your /etc/rc.local file with an editor or echo:

    $ echo "/root/mysql_serverid" >> /etc/rc.local

Assuming your my.cnf file does *NOT* contain the server_id setting when you re-image, it’ll set this automagically each time you spin up a new server off of that AMI. Nice!

Can you easily slave off of a slave? How?
It’s not terribly different from slaving off of a normal master.

A. First enable slave updates. The setting is not dynamic, so if you don’t already have it set, you’ll have to restart your slave.

    log_slave_updates=true

B. Get an initial snapshot of your slave data. You can do that the locking way:

    mysql> flush tables with read lock;
    mysql> show master status\G;
    mysql> system mysqldump -A > full_slave_dump.mysql
    mysql> unlock tables;

You may also choose to use Percona’s excellent xtrabackup utility to create hot backups without locking any tables. We are very lucky to have an open-source tool like this at our disposal. MySQL Enterprise Backup from Oracle Corp can also do this.

C. On the new slave, seed the database with the dump created above.

    $ mysql < full_slave_dump.mysql

D. Now point your new slave to the original slave.

    mysql> change master to master_user='rep', master_password='rep', master_host='192.168.0.1', master_log_file='server-bin-log.000004', master_log_pos=399;
    mysql> start slave;
    mysql> show slave status\G;

Slave master is set as an IP address. Is there another way?

It’s possible to use hostnames in MySQL replication, however it’s not recommended. Why? Because of the wacky world of DNS. Suffice it to say MySQL has to do a lot of work to resolve those names into IP addresses. A hiccup in DNS can potentially interrupt all MySQL services, as sessions will fail to authenticate. To avoid this problem do two things:

A. Set this parameter in my.cnf:

    skip_name_resolve = true

B. Remove entries in the mysql.user table where the hostname is not an IP address. Those entries will be invalid for authentication after setting the above parameter.

Doesn’t RDS take care of all of this for me?

RDS is Amazon’s Relational Database Service, which is built on MySQL. Amazon’s RDS solution presents MySQL as a service, which brings certain benefits to administrators and startups:

Simpler administration. Nuts and bolts are handled for you.
Push-button replication.
No more struggling with the nuances and issues of MySQL’s replication management.

Simplicity of administration of course has its downsides. Depending on your environment, these may or may not be dealbreakers.

No access to the slow query log. This is huge. The single best tool for troubleshooting slow database response is this log file. Queries are a large part of keeping a relational database server healthy and happy, and without this facility, you are severely limited.

Locked-in downtime window. When you sign up for RDS, you must define a thirty-minute maintenance window. This is a weekly window during which your instance *COULD* be unavailable. When you host yourself, you may not require as much downtime at all, especially if you’re using master-master MySQL and a zero-downtime configuration.

Can’t use Percona Server to host your MySQL data. You won’t be able to do this in RDS. Percona Server is a high-performance distribution of MySQL which typically rolls in serious performance tweaks and updates before they make it to the community edition. Well worth the effort to consider it.

No access to filesystem, server metrics & command line. Again, for troubleshooting problems these are crucial. Gathering data about what’s really happening on the server is how you begin to diagnose and troubleshoot a server stall or pileup.

You are beholden to Amazon’s support services if things go awry. That’s because you won’t have access to the raw iron to diagnose and troubleshoot things yourself. Want to call in an outside consultant to help you debug or troubleshoot? You’ll have your hands tied without access to the underlying server.

You can’t replicate to a non-RDS database. Have your own datacenter connected to Amazon via VPC? Want to replicate to a cloud server? RDS won’t fit the bill. You’ll have to roll your own, as we’ve described above. And if you want to replicate to an alternate cloud provider, again RDS won’t work for you.
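Returning to the server_id step from the roll-your-own approach: the sed expression there keeps only the digits of the hostname. The Python sketch below mirrors that logic so its edge cases are easier to see. The hostname format `ip-10-20-30-40` is an assumption about what instances report, and the helper names are made up for illustration.

```python
import re

# MySQL stores server_id as an unsigned 32-bit integer.
MAX_SERVER_ID = 2**32 - 1


def server_id_from_hostname(hostname):
    """Mirror the sed expression s/[^0-9]*//g: keep only the digits."""
    digits = re.sub(r"[^0-9]", "", hostname)
    if not digits:
        raise ValueError("hostname contains no digits: %r" % hostname)
    return int(digits)


def is_valid_server_id(server_id):
    """True if the value is non-zero and fits the 32-bit range."""
    return 0 < server_id <= MAX_SERVER_ID


def has_collision(hostnames):
    """True if two different hostnames map to the same server_id."""
    ids = [server_id_from_hostname(name) for name in hostnames]
    return len(set(ids)) != len(ids)


if __name__ == "__main__":
    print(server_id_from_hostname("ip-10-20-30-40"))
    print(has_collision(["ip-10-1-23-4", "ip-10-12-3-4"]))
    print(is_valid_server_id(server_id_from_hostname("ip-10-200-130-140")))
```

Two things stand out: dropping the separators means different addresses can collapse to the same number, and addresses with several three-digit octets can exceed the 32-bit limit. If either matters in your subnet, a scheme based on only the last two octets may be a safer choice.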
July 20, 2012
by Sean Hull
· 18,485 Views
Spring Data - Apache Hadoop
Spring for Apache Hadoop is a Spring project that supports writing applications which benefit from the integration of the Spring Framework and Hadoop. This post describes how to use Spring Data Apache Hadoop in an Amazon EC2 environment using the "Hello World" equivalent of Hadoop programming: a Wordcount application.

1./ Launch an Amazon Web Services EC2 instance.
- Navigate to the AWS EC2 Console ("https://console.aws.amazon.com/ec2/home").
- Select Launch Instance, then Classic Wizard, and click on Continue. My test environment was a "Basic Amazon Linux AMI 2011.09" 32-bit, instance type: Micro (t1.micro, 613 MB), security group quick-start-1, which enables ssh to be used for login. Select your existing key pair (or create a new one). Obviously you can select another AMI and instance type depending on your favourite flavour. (Should you opt for a Windows 2008 based instance, you also need to have cygwin installed as an additional Hadoop prerequisite besides the Java JDK and ssh, see the "Install Apache Hadoop" section.)

2./ Download Apache Hadoop. As of writing this article, 1.0.0 is the latest stable version of Apache Hadoop, and that is what was used for testing purposes. I downloaded hadoop-1.0.0.tar.gz and copied it into the /home/ec2-user directory using the pscp command from my PC running Windows:

    c:\downloads>pscp -i mykey.ppk hadoop-1.0.0.tar.gz ec2-user@ec2-ipaddress-region-compute.amazonaws.com:/home/ec2-user

(The computer name above, ec2-ipaddress-region-compute.amazonaws.com, can be found on the AWS EC2 console, Instance Description, public DNS field.)

3./ Install Apache Hadoop. As prerequisites, you need to have Java JDK 1.6 and ssh installed, see the Apache Single-Node Setup Guide. (ssh is automatically installed with the Basic Amazon AMI.)
Then install hadoop itself:

    $ cd ~   # change directory to ec2-user home (/home/ec2-user)
    $ tar xvzf hadoop-1.0.0.tar.gz
    $ ln -s hadoop-1.0.0 hadoop
    $ cd hadoop/conf
    $ vi hadoop-env.sh   # edit as below
    export JAVA_HOME=/opt/jdk1.6.0_29
    $ vi core-site.xml   # edit as below - this defines the namenode to be running on localhost and listening on port 9000
    <property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property>
    $ vi hdfs-site.xml   # edit as below - this sets the file system replication factor to 1 (in a production environment it is supposed to be 3 by default)
    <property><name>dfs.replication</name><value>1</value></property>
    $ vi mapred-site.xml   # edit as below - this defines the jobtracker to be running on localhost and listening on port 9001
    <property><name>mapred.job.tracker</name><value>localhost:9001</value></property>
    $ cd ~/hadoop
    $ bin/hadoop namenode -format
    $ bin/start-all.sh

At this stage all Hadoop daemons are running in pseudo-distributed mode. You can verify it by running:

    $ ps -ef | grep java

You should see 5 java processes: namenode, secondarynamenode, datanode, jobtracker and tasktracker.

4./ Install Spring Data Hadoop. Download the Spring Data Hadoop package from the SpringSource community download site. As of writing this article, the latest stable version is spring-data-hadoop-1.0.0.M1.zip.

    $ cd ~
    $ unzip spring-data-hadoop-1.0.0.M1.zip
    $ ln -s spring-data-hadoop-1.0.0.M1 spring-data-hadoop

5./ Build and run the Spring Data Hadoop Wordcount example.

    $ cd spring-data-hadoop/spring-data-hadoop-1.0.0.M1/samples/wordcount

Spring Data Hadoop uses gradle as its build tool. Check the build.gradle build file. The original version packaged in the archive does not compile; it complains about thrift, version 0.2.0, and jdo2-api, version 2.3-ec. Add the datanucleus.org maven repository to the build.gradle file to support jdo2-api (http://www.datanucleus.org/downloads/maven2/). Unfortunately, there seems to be no maven repo for thrift 0.2.0. You should download the thrift-0.2.0.jar and thrift-0.2.0.pom files, e.g. from this repo: "http://people.apache.org/~rawson/repo", and then add them to the local maven repo.
    $ mvn install:install-file -DgroupId=org.apache.thrift -DartifactId=thrift -Dversion=0.2.0 -Dfile=thrift-0.2.0.jar -Dpackaging=jar
    $ vi build.gradle   # modify the build file to refer to the datanucleus maven repo for jdo2-api and the local repo for thrift

    repositories {
        // Public Spring artefacts
        mavenCentral()
        maven { url "http://repo.springsource.org/libs-release" }
        maven { url "http://repo.springsource.org/libs-milestone" }
        maven { url "http://repo.springsource.org/libs-snapshot" }
        maven { url "http://www.datanucleus.org/downloads/maven2/" }
        maven { url "file:///home/ec2-user/.m2/repository" }
    }

I also modified the META-INF/spring/context.xml file in order to run hadoop file system commands manually:

    $ cd /home/ec2-user/spring-data-hadoop/spring-data-hadoop-1.0.0.M1/samples/wordcount/src/main/resources
    $ vi META-INF/spring/context.xml   # remove clean-script and also the dependency on it for JobRunner

    xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:context="http://www.springframework.org/schema/context"
    xmlns:hdp="http://www.springframework.org/schema/hadoop"
    xmlns:p="http://www.springframework.org/schema/p"
    xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">
    fs.default.name=${hd.fs}

Copy the sample file, nietzsche-chapter-1.txt, to the Hadoop file system (/user/ec2-user/input directory):

    $ cd src/main/resources/data
    $ hadoop fs -mkdir /user/ec2-user/input
    $ hadoop fs -put nietzsche-chapter-1.txt /user/ec2-user/input/data
    $ cd ../../../..
    # go back to samples/wordcount directory
    $ ../gradlew

Verify the result:

    $ hadoop fs -cat /user/ec2-user/output/part-r-00000 | more
    "AWAY 1
    "BY 1
    "Beyond 1
    "By 2
    "Cheers 1
    "DE 1
    "Everywhere 1
    "FROM" 1
    "Flatterers 1
    "Freedom 1
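To make clear what the job computes, and why tokens such as "By keep their leading quote, here is a small single-process Python sketch of the same idea. It is not the Spring or Hadoop code: it assumes the sample tokenizes on whitespace only, as the classic WordCount example does, and the function names are made up for illustration.

```python
from collections import Counter


def word_count(text):
    """Split on whitespace and count each token, punctuation included."""
    return Counter(text.split())


def format_output(counts):
    """Render 'word<TAB>count' lines sorted by key, like a reducer output file."""
    return ["%s\t%d" % (word, counts[word]) for word in sorted(counts)]


if __name__ == "__main__":
    sample = '"By the way" he said "By chance" "AWAY'
    for line in format_output(word_count(sample)):
        print(line)
```

Because nothing strips punctuation, a quoted word and its unquoted form are counted as different keys, which matches the shape of the output listed above.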
July 19, 2012
by Istvan Szegedi
· 11,895 Views
The Most Pressed Keys in Various Programming Languages
I switch between programming languages quite a bit, and I have often wondered what happens when you have to deal with the different syntaxes: does the syntax allow you to be more expressive, or faster at coding, in one language or another? I don't really know about that, but what I do know is which keys are pressed when writing in different programming languages. This might be interesting for people who are deciding which programming language to pick; here is a post on the answer to the age-old question of: which programming language should I learn?

As far as I can tell, languages with a wider spread across the keyboard are usually the syntaxes we associate with ugly languages (ugly to read and code), e.g. shell and Perl. You might argue that the variable names being used will alter the results, but as most programming languages have naming conventions, we can assume a decent spread for variable names. I don't offer conclusions, I just poorly lay out the facts. The heat map does miss out on things like Shift and Caps Lock, e.g. in Perl with the dollar sign ($). Whitespace (tabs and spaces) hasn't been taken into consideration, which would have been a cool thing to see. The data used to gather this information was spread amongst various popular GitHub projects.

[Keyboard heat maps, one per language: JavaScript, shell, Java, C, C++, Ruby, Python, PHP, Perl, Objective-C, Lisp]

The Lisp code here was written by Paul Graham.

References: heatmap.js, http://www.patrick-wied.at/projects/heatmap-keyboard/
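The counting behind such a heat map is simple to reproduce. The sketch below is a minimal Python illustration and not the code used for the post: it counts characters rather than physical keys, so, as noted above, '$' is not mapped to Shift plus 4. The function names are made up for illustration.

```python
from collections import Counter


def key_frequencies(source, include_whitespace=False):
    """Count how often each character appears in a piece of source code."""
    if include_whitespace:
        return Counter(source)
    return Counter(c for c in source if not c.isspace())


def top_keys(source, n=5):
    """Return the n most frequent non-whitespace characters with their counts."""
    return key_frequencies(source).most_common(n)


if __name__ == "__main__":
    perl_line = 'my $total = $price * $count;'
    print(top_keys(perl_line, 3))
```

Feeding the counts for a whole repository into a keyboard layout then gives the per-key intensity; passing `include_whitespace=True` would add the tab and space data that the original heat maps leave out.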
July 12, 2012
by Mahdi Yusuf
· 39,208 Views
Introduction to Apache Bigtop, for Packaging and Testing Hadoop
Ah!! The name is everywhere, carried with the wind. Apache Hadoop!! The BIG DATA crunching platform! We all know how alien it can be at the start too! Phew!! :o

From my personal experience: nearly 11 months ago I was trying to install HBase and faced a few issues. The problem was version compatibility, e.g. "HBase some x.version" with "Hadoop some y.version". This is a real issue because you will never know which version of one package blends well with another unless someone has tested it. That testing again depends on the environment where it was set up, which could be another issue. There was a pressing demand for the management of distributions, and then came an open source project which attempts to create a fully integrated and tested Big Data management distribution: "Apache Bigtop".

Goals of Apache Bigtop:
- Packaging
- Deployment
- Integration testing
of all the sub-projects of Hadoop. This project aims at the system as a whole, rather than the individual projects.

I love the way Doug Cutting put it in the keynote back then, wherein he expressed the similarity between Hadoop and the Linux kernel, and the corresponding similarity between the big stack of Hadoop (Hive, HBase, Pig, Avro, etc.) and fully operational operating systems with their distributions (RedHat, Ubuntu, Fedora, Debian, etc.). This is an awesome analogy! :)

Life is made easy with Bigtop: Bigtop Hadoop distribution artifacts won't make you feel that you live in an alien world! After installing, you will get a chance to blend a Hadoop cluster, in any mode, with its sub-projects. It's all for you to garnish next! :)

Setup of Bigtop and installing Hadoop: It's time to welcome all your packages home. [I also mean /home/..] ;) I've tested on Ubuntu 11.04, and here goes a quick and easy installation process.

Step 1: Installing the GNU Privacy Guard key, which apt uses to verify the packages in the repository.
    wget -O- http://www.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/GPG-KEY-bigtop | sudo apt-key add -

Step 2: Get the repo file from the link http://www.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/ubuntu/bigtop.list

    sudo wget -O /etc/apt/sources.list.d/bigtop.list http://www.apache.org/dist/incubator/bigtop/bigtop-0.3.0-incubating/repos/ubuntu/bigtop.list
    sudo gedit /etc/apt/sources.list.d/bigtop.list

Uncomment the mirror link nearest to you. The first link worked for me.

    deb http://apache.01link.hk/incubator/bigtop/stable/repos/ubuntu/ bigtop contrib

Step 3: Updating the apt cache

    sudo apt-get update

Step 4: Checking in the artifacts

    sudo apt-cache search hadoop

Image: Search in the apt cache

Step 5: Set your JAVA_HOME, and add the same export line to ~/.bashrc

    export JAVA_HOME=path_to_your_Java

Step 6: Installing the complete Hadoop stack

    sudo apt-get install hadoop\*

Running Hadoop:

Step 1: Formatting the namenode

    sudo -u hdfs hadoop namenode -format

Image: Formatting the namenode

Step 2: Starting the Namenode, Datanode, Jobtracker and Tasktracker of Hadoop

    for i in hadoop-namenode hadoop-datanode hadoop-jobtracker hadoop-tasktracker ; do sudo service $i start ; done

Now the cluster is up and running.

Image: Start all the services

Step 3: Creating a new directory in HDFS. Here bigtop is the directory name, and ownership is given to the current user ($USER).

    sudo -u hdfs hadoop fs -mkdir /user/bigtop
    sudo -u hdfs hadoop fs -chown $USER /user/bigtop

Image: Create a directory in HDFS

Step 4: List the directories in the file system

    hadoop fs -lsr /

Image: HDFS directories

Step 5: Running a sample pi example

    hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 10 1000

Image: Running a sample program

Job completed! Enjoy your cluster! :) We shall see what more blending could be done with Hadoop (with Hive, HBase, etc.) in the next post! Until then, Happy Learning!! :):)
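For anyone curious what the two numbers in `pi 10 1000` mean, they are the number of map tasks and the number of samples per map. The Python sketch below shows the general idea on a single machine. It is only an illustration under assumptions: it uses plain pseudo-random sampling, whereas the bundled Hadoop example uses, as far as I understand, a quasi-Monte Carlo variant, and `estimate_pi` is a made-up name.

```python
import random


def estimate_pi(num_maps, samples_per_map, seed=0):
    """Monte Carlo estimate of pi, split into batches like 'pi 10 1000'."""
    rng = random.Random(seed)
    inside = 0
    for _map_task in range(num_maps):          # each batch plays the role of one map task
        for _sample in range(samples_per_map):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:            # point falls inside the quarter circle
                inside += 1
    total = num_maps * samples_per_map          # the 'reduce' step combines all counts
    return 4.0 * inside / total


if __name__ == "__main__":
    print(estimate_pi(10, 1000))
```

With only 10,000 samples the estimate is rough, which is why the job output is close to, but not exactly, 3.14159. Raising either argument trades run time for accuracy.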
July 9, 2012
by Swathi Venkatachala
· 10,932 Views
How to Find the Most Connected Neo4j Node Using Cypher
As I mentioned in another post about a month ago, I’ve been playing around with a neo4j graph in which people are linked to each other by a ‘colleagues’ relationship. One thing I wanted to do was work out which node is the most connected on the graph, which would tell me who’s worked with the most people. I started off with the following cypher query:

    query = " START n = node(*)"
    query << " MATCH n-[r:colleagues]->c"
    query << " WHERE n.type? = 'person' and has(n.name)"
    query << " RETURN n.name, count(r) AS connections"
    query << " ORDER BY connections DESC"

I can then execute that via the neo4j console or through irb using the neography gem like so:

    > require 'rubygems'
    > require 'neography'
    > neo = Neography::Rest.new(:port => 7476)
    > neo.execute_query query
    # cut for brevity
    {"data"=>[["Carlos Villela", 283], ["Mark Needham", 221]], "columns"=>["n.name", "connections"]}

That shows me each person and the number of people they’ve worked with, but I wanted to be able to see the most connected person in each office. Each person is assigned to an office while they’re working out of that office, but people tend to move around so they’ll have links to multiple offices. I put ‘start_date’ and ‘end_date’ properties on the ‘member_of’ relationship, and we can work out a person’s current office by finding the ‘member_of’ relationship which doesn’t have an end date defined:

    query = " START n = node(*)"
    query << " MATCH n-[r:colleagues]->c, n-[r2:member_of]->office"
    query << " WHERE n.type? = 'person' and has(n.name) and not(has(r2.end_date))"
    query << " RETURN n.name, count(r) AS connections, office.name"
    query << " ORDER BY connections DESC"

And now our results look more like this:

    {"data"=>[["Carlos Villela", 283, "Porto Alegre - Brazil"], ["Mark Needham", 221, "London - UK South"]], "columns"=>["n.name", "connections", "office.name"]}

If we want to restrict that to return just the people for a specific office we can do that as well:

    query = " START n = node(*)"
    query << " MATCH n-[r:colleagues]->c, n-[r2:member_of]->office"
    query << " WHERE n.type? = 'person' and has(n.name) and (not(has(r2.end_date))) and office.name = 'London - UK South'"
    query << " RETURN n.name, count(r) AS connections, office.name"
    query << " ORDER BY connections DESC"

    {"data"=>[["Mark Needham", 221, "London - UK South"]], "columns"=>["n.name", "connections", "office.name"]}

In the current version of cypher we need to put brackets around the not expression, otherwise it will apply the not to the rest of the where clause. Another way to get around that would be to put the not part of the where clause at the end of the line.

While I am able to work out the most connected person by using these queries, I’m not sure that it actually tells you who the most connected person is, because it’s heavily biased towards people who have worked on big teams. Some ways to try and account for this are to bias the connectivity in favour of those you have worked with for longer, and also to give less weight to big teams, since you’re less likely to have a strong connection with everyone than you might in a smaller team. I haven’t got onto that yet though!
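The aggregation those queries perform can also be written out by hand, which makes the "count, then group by current office" logic explicit. This is a plain Python sketch over in-memory data and not a replacement for cypher: the edge list and office mapping are invented sample data, and the function names are made up for illustration.

```python
from collections import Counter


def connection_counts(colleague_edges):
    """Count outgoing 'colleagues' relationships per person."""
    return Counter(person for person, _colleague in colleague_edges)


def most_connected_by_office(colleague_edges, current_office):
    """For each office, pick the person with the most colleague relationships."""
    counts = connection_counts(colleague_edges)
    best = {}
    for person, office in current_office.items():
        score = counts.get(person, 0)
        if office not in best or score > best[office][1]:
            best[office] = (person, score)
    return best


if __name__ == "__main__":
    edges = [("Mark", "Carlos"), ("Mark", "Ann"), ("Carlos", "Mark"),
             ("Carlos", "Ann"), ("Carlos", "Bob"), ("Ann", "Mark")]
    offices = {"Mark": "London", "Ann": "London", "Carlos": "Porto Alegre"}
    print(most_connected_by_office(edges, offices))
```

The weighting ideas mentioned at the end would slot into `connection_counts`: instead of adding 1 per relationship, add a value that grows with time worked together or shrinks with team size.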
June 26, 2012
by Mark Needham
· 13,493 Views
Fast Index Creation with InnoDB
InnoDB has been able to build indexes by sort since the InnoDB Plugin for MySQL 5.1, which is a lot faster than building them through insertion; for tables much larger than memory and large uncorrelated indexes you might be looking at a 10x difference or more. Yet for some reason the InnoDB team chose a very small (just 1MB), hard-coded buffer for this operation, which means almost any such index build has to use excessive sort merge passes, significantly slowing down the index build process. Mark Callaghan and the Facebook team fixed this in their tree back in early 2011 by adding the innodb_merge_sort_block_size variable. I was thinking this small patch would be merged into MySQL 5.5 promptly, yet it has not happened to date. Here is an example of the gains you can expect (courtesy of Alexey Kopytov), using a 1 million row Sysbench table.

    Buffer Length | alter table sbtest add key(c)
    1MB           | 34 sec
    8MB           | 26 sec
    100MB         | 21 sec
    128MB         | 17 sec
    REBUILD       | 37 sec

REBUILD in this table uses "fast_index_creation=0", which disables fast index creation in Percona Server and forces the complete table to be rebuilt instead. Looking at this data we can see that even for such a small table it is possible to improve index creation time 2x by using a large buffer. We can also see a substantial improvement just from increasing it from 1MB to 8MB, which might be a sensible default, as even small systems should be able to allocate 8MB to do an alter table. You may be wondering why in this case the table rebuild is so close in performance to building the index by sort with a small buffer. This comes from building an index on a long character field holding very short values: InnoDB uses fixed-size records for the sort space, which results in a lot more work than you would otherwise need. Having some optimization to better deal with this case would also be nice. The table also fit completely in the buffer pool in this case, which means the table rebuild could be done fast too. Results are from Percona Server 5.5.24.
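The "2x" figure follows directly from the timings in the table. As a quick worked check, the Python sketch below divides the 1MB baseline by each measured time; the numbers are copied from the table above and the function name is made up for illustration.

```python
# Timings copied from the table above (seconds for ALTER TABLE ... ADD KEY).
TIMINGS_SEC = {"1MB": 34, "8MB": 26, "100MB": 21, "128MB": 17}


def speedup_over_baseline(timings, baseline="1MB"):
    """Return how many times faster each buffer size is than the baseline."""
    base = timings[baseline]
    return {size: round(base / secs, 2) for size, secs in timings.items()}


if __name__ == "__main__":
    for size, factor in speedup_over_baseline(TIMINGS_SEC).items():
        print("%-6s %.2fx" % (size, factor))
```

Going from 1MB to 8MB already gives roughly a 1.3x improvement, and 128MB reaches 2x, which is the basis for suggesting 8MB as a more sensible default.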
June 19, 2012
by Peter Zaitsev
· 4,501 Views
NetBeans IDE 7.2 Introduces TestNG
One of the advantages of code generation is the ability to see how a specific language feature or framework is used. As I discussed in the post NetBeans 7.2 beta: Faster and More Helpful, NetBeans 7.2 beta provides TestNG integration. I did not elaborate further in that post, beyond a single reference to that feature, because I wanted to devote this post to the subject. I use this post to demonstrate how NetBeans 7.2 can help a developer new to TestNG start using this alternative (to JUnit) test framework. NetBeans 7.2's New File wizard makes it easier to create an empty TestNG test case. This is demonstrated in the following screen snapshots, which are kicked off by using New File | Unit Tests (note that "New File" is available under the "File" drop-down menu or by right-clicking in the Projects window). Running the TestNG test case creation as shown above leads to the following generated test code.

TestNGDemo.java (Generated by NetBeans 7.2)

    package dustin.examples;

    import org.testng.annotations.AfterMethod;
    import org.testng.annotations.AfterClass;
    import org.testng.annotations.BeforeMethod;
    import org.testng.annotations.BeforeClass;
    import org.testng.annotations.Test;
    import org.testng.Assert;

    /**
     *
     * @author Dustin
     */
    public class TestNGDemo {
        public TestNGDemo() { }

        @BeforeClass
        public void setUpClass() { }

        @AfterClass
        public void tearDownClass() { }

        @BeforeMethod
        public void setUp() { }

        @AfterMethod
        public void tearDown() { }

        // TODO add test methods here.
        // The methods must be annotated with annotation @Test. For example:
        //
        // @Test
        // public void hello() {}
    }

The test generated by NetBeans 7.2 includes comments indicating how test methods are added and annotated (similar to modern versions of JUnit). The generated code also shows some annotations for overall test case set up and tear down and for per-test set up and tear down (annotations are similar to JUnit's).
NetBeans identifies import statements that are not yet used at this point (import org.testng.annotations.Test; and import org.testng.Assert;), but are likely to be used and so have been included in the generated code. I can add a test method easily to this generated test case. The following code snippet is a test method using TestNG. testIntegerArithmeticMultiplyIntegers() @Test public void testIntegerArithmeticMultiplyIntegers() { final IntegerArithmetic instance = new IntegerArithmetic(); final int[] integers = {4, 5, 6}; final int expectedProduct = 2 * 3 * 4 * 5 * 6; final int product = instance.multiplyIntegers(2, 3, integers); assertEquals(product, expectedProduct); } This, of course, looks very similar to the JUnit equivalent I used against the same IntegerArithmetic class that I used for testing illustrations in the posts Improving On assertEquals with JUnit and Hamcrest and JUnit's Built-in Hamcrest Core Matcher Support. The following screen snapshot shows the output in NetBeans 7.2 beta from right-clicking on the test case class and selecting "Run File" (Shift+F6). The text output of the TestNG run provided in the NetBeans 7.2 beta is reproduced next. 
[TestNG] Running:
  Command line suite

[VerboseTestNG] RUNNING: Suite: "Command line test" containing "1" Tests (config: null)
[VerboseTestNG] INVOKING CONFIGURATION: "Command line test" - @BeforeClass dustin.examples.TestNGDemo.setUpClass()
[VerboseTestNG] PASSED CONFIGURATION: "Command line test" - @BeforeClass dustin.examples.TestNGDemo.setUpClass() finished in 33 ms
[VerboseTestNG] INVOKING CONFIGURATION: "Command line test" - @BeforeMethod dustin.examples.TestNGDemo.setUp()
[VerboseTestNG] PASSED CONFIGURATION: "Command line test" - @BeforeMethod dustin.examples.TestNGDemo.setUp() finished in 2 ms
[VerboseTestNG] INVOKING: "Command line test" - dustin.examples.TestNGDemo.testIntegerArithmeticMultiplyIntegers()
[VerboseTestNG] PASSED: "Command line test" - dustin.examples.TestNGDemo.testIntegerArithmeticMultiplyIntegers() finished in 12 ms
[VerboseTestNG] INVOKING CONFIGURATION: "Command line test" - @AfterMethod dustin.examples.TestNGDemo.tearDown()
[VerboseTestNG] PASSED CONFIGURATION: "Command line test" - @AfterMethod dustin.examples.TestNGDemo.tearDown() finished in 1 ms
[VerboseTestNG] INVOKING CONFIGURATION: "Command line test" - @AfterClass dustin.examples.TestNGDemo.tearDownClass()
[VerboseTestNG] PASSED CONFIGURATION: "Command line test" - @AfterClass dustin.examples.TestNGDemo.tearDownClass() finished in 1 ms
[VerboseTestNG]
[VerboseTestNG] ===============================================
[VerboseTestNG] Command line test
[VerboseTestNG] Tests run: 1, Failures: 0, Skips: 0
[VerboseTestNG] ===============================================
===============================================
Command line suite
Total tests run: 1, Failures: 0, Skips: 0
===============================================
Deleting directory C:\Users\Dustin\AppData\Local\Temp\dustin.examples.TestNGDemo
test:
BUILD SUCCESSFUL (total time: 2 seconds)

The above example shows how easy it is to start using TestNG, especially if one is moving to TestNG from JUnit and is using NetBeans 7.2 beta. Of course, there is much more to TestNG than this, but learning a new framework is typically most difficult at the very beginning and NetBeans 7.2 gets one off to a fast start.
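The IntegerArithmetic class exercised by the test is not shown in this excerpt. As a rough sketch only, assuming multiplyIntegers takes two required ints followed by varargs and returns their product (a signature inferred from the test's call, not confirmed by the post), a stand-in with a plain Java check that mirrors the TestNG assertion might look like this. The class name IntegerArithmeticSketch is hypothetical.

```java
/** Hypothetical stand-in for the class under test; the real IntegerArithmetic is not shown in the post. */
final class IntegerArithmeticSketch {

    /**
     * Multiplies the two required factors and any additional ones.
     * The signature is an assumption inferred from instance.multiplyIntegers(2, 3, integers).
     */
    int multiplyIntegers(final int first, final int second, final int... remaining) {
        int product = first * second;
        for (final int factor : remaining) {
            product *= factor;
        }
        return product;
    }

    public static void main(final String[] args) {
        final IntegerArithmeticSketch instance = new IntegerArithmeticSketch();
        final int[] integers = {4, 5, 6};
        final int expectedProduct = 2 * 3 * 4 * 5 * 6;
        final int product = instance.multiplyIntegers(2, 3, integers);
        // Same comparison the TestNG assertEquals(product, expectedProduct) performs.
        if (product != expectedProduct) {
            throw new AssertionError("expected " + expectedProduct + " but got " + product);
        }
        System.out.println("product = " + product); // prints "product = 720"
    }
}
```

Passing the int[] where varargs are expected works because Java accepts an array in the varargs position, which is why the test can hand over {4, 5, 6} directly.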
June 11, 2012
by Dustin Marx
· 21,513 Views · 1 Like
Killing IntelliJ Launched Processes
I often use IntelliJ to run applications, and on occasion things go wrong. For example, a thread that won't terminate can cause a running application to become unstoppable via the IntelliJ UI. Usually when this happens I end up running ps aux | grep java and following up with a kill -9 for each process that looks like it might be the one I'm looking for. On good days there are only a few processes; however, things get more complicated when I have several to look through. Last week I noticed that the command used to launch the process is printed in the Console window and, more importantly, that idea.launcher.port is part of that command: e.g. idea.launcher.port=7538. Assuming the port is unique, or even almost unique, it's much easier to ps aux | grep for than java.
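The same lookup can be sketched in Java itself. This is only an illustration, assuming Java 9 or later for the ProcessHandle API (which postdates the post) and a platform where the JVM can read other processes' command lines; on some systems info().commandLine() is empty or truncated. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that finds processes by their idea.launcher.port value. */
final class LauncherPortFinder {

    /** Returns true when one token of the command line ends with idea.launcher.port=<port>. */
    static boolean hasLauncherPort(final String commandLine, final int port) {
        final String needle = "idea.launcher.port=" + port;
        // Compare whole tokens so that port 7538 does not also match 75380.
        for (final String token : commandLine.trim().split("\\s+")) {
            if (token.endsWith(needle)) {
                return true;
            }
        }
        return false;
    }

    /** Lists the visible processes whose command line carries the given launcher port. */
    static List<ProcessHandle> findByLauncherPort(final int port) {
        return ProcessHandle.allProcesses()
                .filter(p -> p.info().commandLine()
                        .map(cmd -> hasLauncherPort(cmd, port))
                        .orElse(false))
                .collect(Collectors.toList());
    }

    public static void main(final String[] args) {
        if (args.length != 1) {
            System.err.println("usage: LauncherPortFinder <port>");
            return;
        }
        final int port = Integer.parseInt(args[0]);
        for (final ProcessHandle process : findByLauncherPort(port)) {
            // Only reports the match; process.destroyForcibly() would be the rough analogue of kill -9.
            System.out.println(process.pid() + " " + process.info().commandLine().orElse(""));
        }
    }
}
```

The sketch deliberately prints matches instead of killing them, so the PID can be checked before anything is terminated.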
June 7, 2012
by Jay Fields
· 16,437 Views · 2 Likes
Get TeamCity Artifacts Using HTTP, Ant, Gradle and Maven
In how many ways can you retrieve TeamCity artifacts? There are plenty to choose from! If you’re in the world of Java build tools, you can use a plain HTTP request, Ant + Ivy, Gradle, or Maven to download and use binaries produced by TeamCity build configurations. How? Read on.

Build Configuration “id”

Before you retrieve the artifacts of any build configuration, you need to know its "id", which is visible in the browser when the corresponding configuration is browsed. Let’s take the IntelliJ IDEA Community Edition project hosted at teamcity.jetbrains.com as an example. Its “Community Dist” build configuration provides a number of artifacts which we’re going to play with. And as can be seen on the screenshot below, its "id" is "bt343".

HTTP

Anonymous HTTP access is probably the easiest way to fetch TeamCity artifacts. The URL to do so is:

http://server/guestAuth/repository/download///

For this request to work, 3 parameters need to be specified:

  • btN: Build configuration "id", as mentioned above.
  • buildNumber: Build number or one of the predefined constants: "lastSuccessful", "lastPinned", or "lastFinished". For example, you can download periodic IDEA builds from the last successful TeamCity execution.
  • artifactName: Name of the artifact, like "ideaIC-118.SNAPSHOT.win.zip". It can also take the form "artifactName!archivePath" for reading an archive’s content, like IDEA’s build file. You can get a list of all artifacts produced in a certain build by requesting a special "teamcity-ivy.xml" artifact generated by TeamCity.

Ant + Ivy

All artifacts published to TeamCity are accompanied by a "teamcity-ivy.xml" Ivy descriptor, effectively making TeamCity an Ivy repository. The code below downloads "core/annotations.jar" from the IDEA distribution to the "download/ivy" directory: "ivyconf.xml" "ivy.xml" "build.xml"

Gradle

As with the Ivy example above, it is fairly easy to retrieve TeamCity artifacts with Gradle due to its built-in Ivy support.
In addition to downloading the same jar file to "download/gradle" directory with a custom Gradle task let’s use it as "compile" dependency for our Java class, importing IDEA’s @NotNull annotation:

"Test.java"

    import org.jetbrains.annotations.NotNull;

    public class Test {
        private final String data;

        public Test ( @NotNull String data ){
            this.data = data;
        }
    }

"build.gradle"

    apply plugin: 'java'

    repositories {
        ivy {
            ivyPattern 'http://teamcity.jetbrains.com/guestAuth/repository/download/[module]/[revision]/teamcity-ivy.xml'
            artifactPattern 'http://teamcity.jetbrains.com/guestAuth/repository/download/[module]/[revision]/[artifact](.[ext])'
        }
    }

    dependencies {
        compile ( 'org:bt343:lastSuccessful' ){
            artifact {
                name = 'core/annotations'
                type = 'jar'
            }
        }
    }

    task copyJar( type: Copy ) {
        from configurations.compile
        into "${ project.projectDir }/download/gradle"
    }

Maven

The best way to use Maven with TeamCity is by setting up an Artifactory repository manager and its TeamCity plugin. This way artifacts produced by your builds are nicely deployed to Artifactory and can be served from there as from any other remote Maven repository. However, you can still use TeamCity artifacts in Maven without any additional setups. "ivy-maven-plugin" bridges two worlds allowing you to plug Ivy resolvers into Maven’s runtime environment, download dependencies required and add them to corresponding "compile" or "test" scopes. Let’s compile the same Java source from the Gradle example but using Maven this time.
"pom.xml"

4.0.0 com.test maven jar 0.1-SNAPSHOT [${project.groupId}:${project.artifactId}:${project.version}] Ivy Maven plugin example com.github.goldin ivy-maven-plugin 0.2.5 get-ivy-artifacts ivy initialize ${project.basedir}/ivyconf.xml ${project.basedir}/ivy.xml ${project.basedir}/download/maven compile

When this plugin runs, it resolves the IDEA annotations artifact using the same "ivyconf.xml" and "ivy.xml" files we’ve seen previously, copies it to the "download/maven" directory, and adds it to the "compile" scope so our Java sources can compile.

GitHub Project

All the examples demonstrated are available in my GitHub project. Feel free to clone and run it:

    git clone git://github.com/evgeny-goldin/teamcity-download-examples.git
    cd teamcity-download-examples
    chmod +x run.sh dist/ant/bin/ant gradlew dist/maven/bin/mvn
    ./run.sh

Resources

The links below can provide you with more details:

  • TeamCity – Patterns For Accessing Build Artifacts
  • TeamCity – Accessing Server by HTTP
  • TeamCity – Configuring Artifact Dependencies Using Ant Build Script
  • Gradle – Ivy repositories
  • "ivy-maven-plugin"

That’s it, you’ve seen it: TeamCity artifacts are perfectly accessible in any of 4 ways: direct HTTP access, Ant + Ivy, Gradle, or Maven. Which one do you use? Let me know!
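For the plain HTTP route described earlier, the download URL can be assembled from the three parameters. The sketch below assumes the path layout server/guestAuth/repository/download/btN/buildNumber/artifactName, which matches the artifactPattern used in the Gradle example; the class and method names are hypothetical, and no URL encoding of the artifact name is attempted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for anonymous (guestAuth) TeamCity artifact downloads. */
final class TeamCityArtifactUrl {

    /**
     * Builds the guest download URL.
     * Layout assumed from the Gradle artifactPattern: .../download/[module]/[revision]/[artifact]
     */
    static String guestDownloadUrl(final String server,
                                   final String buildTypeId,
                                   final String buildNumber,
                                   final String artifactName) {
        return "http://" + server + "/guestAuth/repository/download/"
                + buildTypeId + "/" + buildNumber + "/" + artifactName;
    }

    /** Streams the artifact at the given URL into the target file, replacing any existing copy. */
    static void download(final String url, final Path target) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(final String[] args) {
        // Only prints the URL; calling download(...) would perform the actual network request.
        System.out.println(
                guestDownloadUrl("teamcity.jetbrains.com", "bt343", "lastSuccessful", "teamcity-ivy.xml"));
    }
}
```

Requesting "teamcity-ivy.xml" first is a convenient starting point, since the article notes that it lists every artifact produced by the build.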
June 5, 2012
by Evgeny Goldin
· 11,282 Views
Connection Pooling in a Java Web Application with Tomcat and NetBeans IDE
After my article Connection Pooling in a Java Web Application with Glassfish and NetBeans IDE, here are the instructions for Tomcat.

Requirements

  • NetBeans IDE (this tutorial uses NetBeans 7)
  • Tomcat (this tutorial uses the Tomcat 7 that is bundled with NetBeans)
  • MySQL database
  • MySQL Java Driver

Steps

Assuming your MySQL database is ready, connect to it and create a database. Let's call it connpool:

    mysql> create database connpool;

Now we create and populate the table from which we will fetch the data:

    mysql> use connpool;
    mysql> create table data(id int(5) not null unique auto_increment, name varchar(255) not null);
    mysql> insert into data(name) values("Fred Flintstone"), ("Pink Panther"), ("Wayne Cramp"), ("Johnny Bravo"), ("Spongebob Squarepants");

That is it for the database part. We now create our web application. In NetBeans IDE, click File → New Project... Select Java Web → Web Application: Click Next and give the project the name TomPool. Click Next. Choose Tomcat as the server and, since we are not going to use any frameworks, click Finish. The project will be created and the start page, index.jsp, opened for us in the IDE.

Now we create the connection pooling parameters. In the Projects window, expand Configuration Files and open "context.xml". You will see that the IDE has added this code for us: Delete the last line: and then add the following to the context.xml file. I have explained the sections along the way. Make sure you edit your MySQL username and password appropriately:

Next, expand the Web Pages node, right-click WEB-INF → New → Other → XML → XML Document. Click Next and type web for the File Name. Click Next and choose Well-Formed Document, then Finish. You will now have the file "web.xml": Delete everything in the file and paste this code:

MySQL Test App DB Connection connpool javax.sql.DataSource Container

That is it for the connection pool. We now edit our code to make use of it.
Edit index.jsp by adding this code just after the initial comments but before Edit the section of the page: Data in my Connection Pooled Database

Now, we test the connection pool by running the application:

If you want to have the one connection pool used in multiple applications, you need to edit the following two files:

1. /conf/web.xml Just before the closing tag, add the code

DB Connection connpool javax.sql.DataSource Container

2. /conf/context.xml Just before the closing tag, add the code

Now you can use the pool without editing XML files in each of your applications. Just use the sample code as given in index.jsp.

That's it, folks!
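The page code from index.jsp did not survive in this copy, so as a hedged illustration only, here is how a pooled DataSource is commonly looked up through JNDI and queried from plain Java inside the container. It assumes the resource is registered under the name connpool (read off the flattened web.xml text above; your context.xml may use a different name or a prefix) and reuses the data table created earlier. The class and method names are hypothetical and this is not the author's JSP code.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

/** Hypothetical helper showing a JNDI lookup of a container-managed connection pool. */
final class ConnPoolSketch {

    /** Resources declared via resource-ref are conventionally bound under java:comp/env/. */
    static String jndiName(final String resRefName) {
        return "java:comp/env/" + resRefName;
    }

    /** Looks up the pool; this only succeeds inside a container such as Tomcat that binds the resource. */
    static DataSource lookupPool(final String resRefName) throws NamingException {
        return (DataSource) new InitialContext().lookup(jndiName(resRefName));
    }

    /** Reads the name column of the data table, returning the connection to the pool when done. */
    static List<String> fetchNames(final DataSource pool) throws SQLException {
        final List<String> names = new ArrayList<>();
        try (Connection connection = pool.getConnection();
             Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery("select id, name from data")) {
            while (rows.next()) {
                names.add(rows.getString("name"));
            }
        }
        return names;
    }
}
```

Inside a servlet, fetchNames(lookupPool("connpool")) would return the five rows inserted earlier, provided the resource name assumption holds. Closing the Connection in try-with-resources does not tear it down; with a pooled DataSource it hands the connection back to the pool.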
May 23, 2012
by Arthur Buliva
· 70,296 Views · 2 Likes