Mining API Mapping for Language Migration
One of the ongoing trends in the .NET community throughout its existence has been porting well-known, useful Java projects to the .NET Framework. The main reason is that open source has a longer and stronger tradition in the Java community, and the .NET community has wanted to get its hands on the rich tools and libraries created for Java as quickly as possible, without spending time recreating them.
Besides, software projects, teams, and companies have been trying to migrate from Java to .NET. This has motivated many companies and open source projects to write code converters that take Java source code and produce equivalent code in a .NET language such as C# or Visual Basic. This is feasible mainly because of the many similarities between the two platforms, their underlying structure, and their APIs. Although very good products and tools have been released for this purpose, real-world code always has some problems that must be fixed manually; the value of these tools lies in reducing the amount of work that has to be done by hand.
This problem has encouraged some computer scientists to work on improving the quality of code conversion between languages. The outcome of their work was published at ICSE 2010 as a paper entitled Mining API Mapping for Language Migration, which we discussed recently at our department.
The paper introduces the idea of using source code that was previously translated from Java to .NET to build a mapping between the APIs of the two platforms, which resembles a learning process for the system. Later, when translating code from Java to .NET, the system can use this mapping to convert the source code with fewer problems. The authors call their approach Mining API Mapping (MAM), and it consists of three main steps:
- Aligning the two versions of the code on the two platforms
- Mining the API mapping between classes
- Mining the API mapping between methods
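To make the three steps more concrete, here is a minimal sketch in Python. It is not the algorithm from the paper: the name-similarity alignment, the voting scheme, the assumption that aligned methods call their APIs in the same order, and the toy `java_project` / `csharp_project` data are all my own simplifications for illustration.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive name similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(java_names, csharp_names, threshold=0.8):
    """Step 1: pair each Java name with its most similar C# name."""
    pairs = []
    for j in java_names:
        best = max(csharp_names, key=lambda c: similarity(j, c), default=None)
        if best is not None and similarity(j, best) >= threshold:
            pairs.append((j, best))
    return pairs


def owner(call):
    """Return the class part of a 'Class.member' string."""
    return call.rsplit(".", 1)[0]


def mine(java_project, csharp_project):
    """Steps 2 and 3: vote for class and method mappings over aligned code."""
    class_votes = defaultdict(Counter)
    method_votes = defaultdict(Counter)
    for j_cls, c_cls in align(java_project, csharp_project):
        j_methods, c_methods = java_project[j_cls], csharp_project[c_cls]
        for j_m, c_m in align(j_methods, c_methods):
            # Simplifying assumption: aligned methods call APIs in the same order.
            for j_call, c_call in zip(j_methods[j_m], c_methods[c_m]):
                class_votes[owner(j_call)][owner(c_call)] += 1
                method_votes[j_call][c_call] += 1
    class_map = {j: v.most_common(1)[0][0] for j, v in class_votes.items()}
    method_map = {j: v.most_common(1)[0][0] for j, v in method_votes.items()}
    return class_map, method_map


# Hypothetical toy input: class -> method -> API members used, in call order.
java_project = {
    "IndexFiles": {
        "indexDocs": ["java.io.File.exists", "java.io.File.getName"],
        "main": ["java.io.File.exists"],
    }
}
csharp_project = {
    "IndexFiles": {
        "IndexDocs": ["System.IO.FileInfo.Exists", "System.IO.FileInfo.Name"],
        "Main": ["System.IO.FileInfo.Exists"],
    }
}

class_map, method_map = mine(java_project, csharp_project)
print(class_map)
print(method_map)
```

A real implementation would have to work on parsed source code rather than hand-written call lists, and it could not rely on calls appearing in the same order on both sides.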
The authors apply this to a simple piece of code to exemplify the approach and then report their evaluation results as numbers and percentages that show some improvements. To feed their system, they use well-known projects that were previously converted from Java to .NET, such as Hibernate/NHibernate, Lucene/Lucene.NET, and Log4j/Log4net. With the API mappings obtained from this training, they applied their approach to a few projects and compared the quality of the translated code with the output of Java2CSharp, a tool they claim is one of the best available for this purpose.
While this approach can make some improvements in this field, there are some challenges to the technique and its evaluation. The experiments are done on projects that were themselves converted to the .NET platform using automated tools. To my knowledge, Lucene.NET and Log4net were both ported to .NET with heavy use of automated conversion tools, which can affect the reliability of the experiment. Also, such an approach can only be applied to projects that don't depend on third-party code, because it cannot create mappings for such dependencies. This limits the number of scenarios in which the technique can be applied. Furthermore, the metrics used to compare the results aren't good. The best metric used is the difference in the number of compiler errors between the new tool and Java2CSharp, but in my experience one of the most common problems in such conversions is code that runs but doesn't produce the expected output. Additionally, this technique is very limited in scope and can't be generalized to conversions between other languages. The only case where it can work is between Java and .NET, and I doubt it would work with the same quality even in the reverse direction.
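As an illustration of code that runs but gives the wrong output, consider Java's `String.substring(beginIndex, endIndex)` and C#'s `String.Substring(startIndex, length)`: the names match, but the second argument means something different. The sketch below emulates both in Python; the helper functions and the sample string are my own, and the "naive" translation is a hypothetical name-only mapping, not the output of any particular tool.

```python
def java_substring(s, begin, end):
    """Emulates Java's String.substring(beginIndex, endIndex)."""
    return s[begin:end]


def csharp_substring(s, start, length):
    """Emulates C#'s String.Substring(startIndex, length)."""
    return s[start:start + length]


text = "log4net-1.2"

# Original Java behavior: characters from index 3 up to (not including) 7.
expected = java_substring(text, 3, 7)

# Hypothetical naive translation: same method name, arguments passed through.
# This would compile and run, so it would not show up as a compiler error.
naive = csharp_substring(text, 3, 7)

# Correct translation: the second argument must become a length.
correct = csharp_substring(text, 3, 7 - 3)

print(expected, naive, correct)
```

Counting compiler errors would rate the naive and the correct translation the same, which is why I think a behavioral check such as running the original project's tests would be a more convincing metric.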
All in all, this approach is a good technique for improving the quality of automated code conversion between Java and C# in specific cases, and it can be adopted by different tools.
Published at DZone with permission of Keyvan Nayyeri. See the original article here.