Foreign Characters for the Eclipse Build System
Having a problem with Eclipse and building files with foreign characters in the file name? If you are developing software, then read and follow this advice:
“Do not use foreign characters in file names, paths, or anything else!”
What I mean by ‘foreign characters’ are things like éöüàäü, or simply anything outside the 7-bit ASCII table, even if it is part of the Windows-1252 code page and allowed by the file system of your operating system (e.g. Windows).
Or in other words: only use these characters for file or directory names:
Following that advice will keep you out of a lot of trouble, because many tool chains simply do not handle anything else well. You might be able to use spaces in file names, but to stay on the safe side: don’t use them.
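A quick way to check a project against this rule is a small script. This is only a sketch: the ‘safe’ set below (ASCII letters, digits, underscore, hyphen, and dot) is my own conservative assumption, and the function name and sample paths are made up for illustration.

```python
import re

# Assumed "safe" set: ASCII letters, digits, underscore, hyphen and dot.
# This is an illustrative choice, not a list taken from the article.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def unsafe_parts(path):
    """Return the components of a relative path that contain other characters."""
    # Split on both forward and backward slashes and skip empty pieces.
    parts = [p for p in re.split(r"[\\/]+", path) if p]
    return [p for p in parts if not SAFE_NAME.fullmatch(p)]


if __name__ == "__main__":
    # Hypothetical sample paths.
    for candidate in ["src/main.c", "src/grüezi.c", "my project/main.c"]:
        print(candidate, "->", unsafe_parts(candidate) or "ok")
```

The check is meant for relative paths: a drive prefix such as ‘C:’ would be reported because of the colon.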
If you follow that rule, you are fine and you can stop reading this article now.
Eclipse and Foreign Characters
You are still reading? So if you have foreign characters in your file or directory name, then here is how to work around at least some of the Eclipse (and Windows!) issues around it.
Eclipse itself deals pretty well with foreign characters. The issue is with Windows and the command prompt/DOS shell.
The issue shows up in Eclipse, e.g. in Texas Instruments Code Composer Studio v5. Having a file name with an umlaut fails the build.
Obviously, the ‘ü’ character of the source file is handled properly by Eclipse, but not by the build (make) system, which runs command-line tools at the DOS/cmd level.
It is the same thing with CodeWarrior and ARM GCC: the compiler is using a wrong file name.
Inspecting the make files shows that things are OK there.
So something is going wrong when calling make and the compiler. It looks like a wrong character code translation is happening between Eclipse and the command prompt (DOS command line) level, and that the code pages do not match on my machine.
It turns out that it is all about ‘code pages’: how ASCII or Windows-1252 code pages are handled at the DOS/command prompt level on Windows. A Microsoft tech note explains code pages.
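To see why a code page mismatch produces a wrong file name, here is a minimal sketch. It assumes the name is written as Windows-1252 and read back with OEM code page 850. The 850 is an assumption for illustration (the actual code page of the command prompt depends on the machine), and the file name is made up.

```python
def misread_name(name, written_as="cp1252", read_as="cp850"):
    """Encode a name with one code page and decode the bytes with another."""
    raw = name.encode(written_as)
    return raw.decode(read_as)


if __name__ == "__main__":
    name = "grüezi.c"  # hypothetical file name
    print(name.encode("cp1252"))  # b'gr\xfcezi.c'
    print(misread_name(name))     # gr³ezi.c
```

The byte 0xFC means ‘ü’ in Windows-1252 but ‘³’ in code page 850, so a tool reading with the other code page looks for a file that does not exist.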
How do I find out which code page cmd.exe uses? An excellent article on the topic shows that the command chcp (for ‘change code page’), run without arguments, shows the active code page.
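If the active code page has to be checked from a script, something like the following could work. It is a sketch under assumptions: the helper names are mine, and the parser simply takes the first number in the chcp output (for example ‘Active code page: 850’ on an English Windows), since the surrounding text depends on the system language.

```python
import re
import subprocess
import sys


def parse_code_page(chcp_output):
    """Extract the code page number from the text printed by chcp."""
    match = re.search(r"(\d+)", chcp_output)
    if match is None:
        raise ValueError(f"no code page number in {chcp_output!r}")
    return int(match.group(1))


def active_code_page():
    """Ask cmd.exe for its active code page (Windows only)."""
    result = subprocess.run(
        ["cmd", "/c", "chcp"],
        capture_output=True, text=True, errors="replace", check=True,
    )
    return parse_code_page(result.stdout)


if __name__ == "__main__":
    if sys.platform == "win32":
        print(active_code_page())
    else:
        # Not on Windows: parse a sample line instead.
        print(parse_code_page("Active code page: 850"))  # 850
```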
Eclipse Code Page
But what encoding does Eclipse use? Presumably it is the code page set by the Java environment. I find the setting under the menu Window > Preferences.
For me it shows the default Windows code page 1252. It is possible to change the default code page of Eclipse (for the workspace) using the drop-down box.
For CodeWarrior, it is possible to use an Eclipse command line argument to define the code page. This is set in the ‘cwide.ini’ file inside the Eclipse installation folder. Trying to fix my problem, I added a line for it to that file and restarted Eclipse.
I say ‘trying’ because it fixed the error message reported back by the compiler, but the build still failed.
Well, there must be something more. So I decided to revert my change in the cwide.ini file, and asked around for thoughts and help. And yes, someone came to the rescue and explained what is happening (sluvy: thank you, thank you, thank you!).
The thing is that GNU make and even the compiler/linker internally call their own programs and batch files, invisible to me. And it looks like these executables are likely using a different code page, thus failing the build. But sluvy has found a fix, which requires a Windows registry change.
The trick is to permanently set the code page used by the Windows command processor (DOS shell, cmd.exe) using a small ‘AutoRun’ command. Whenever something uses the command processor, it will execute my command, which sets the code page to the same one I’m using in Eclipse.
For this, I run regedit.exe and navigate to the registry key of the command processor.
There I use the context menu to add a new multi-string value.
Note: in case I already have that value, I do not need to add it, of course.
I name the new value ‘AutoRun’ and assign the command ‘chcp 1252’ to change the code page.
To make that ‘AutoRun’ command invisible, I can use ‘@chcp 1252>nul’.
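The same registry change could also be scripted. The following is only a sketch under assumptions: the help text of cmd.exe (cmd /?) documents an AutoRun value of type REG_SZ/REG_EXPAND_SZ under ‘Software\Microsoft\Command Processor’ in both HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER, and the sketch picks the per-user hive and a plain string value. The function names are mine.

```python
# Key documented by `cmd /?`; using the per-user hive is an assumption here.
KEY_PATH = r"Software\Microsoft\Command Processor"


def autorun_command(code_page, quiet=True):
    """Build the command to store in the AutoRun value."""
    command = f"chcp {code_page}"
    # '@' hides the command itself, '>nul' hides its output.
    return f"@{command}>nul" if quiet else command


def set_autorun(code_page):
    """Write the AutoRun value for the current user (Windows only)."""
    import winreg  # standard library, only available on Windows

    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, KEY_PATH) as key:
        winreg.SetValueEx(
            key, "AutoRun", 0, winreg.REG_SZ, autorun_command(code_page)
        )


if __name__ == "__main__":
    # Only show the command; call set_autorun(1252) deliberately on Windows.
    print(autorun_command(1252))  # @chcp 1252>nul
```

Note that set_autorun overwrites an existing AutoRun value, so it is worth checking in regedit first whether one is already present.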
Building With Foreign Characters
Now it is time to try it out.
After changing the code page, it is advisable to regenerate all the make files and do a clean build (menu Project > Clean).
And indeed: my project now compiles properly in Eclipse/CodeWarrior.
I learned a lot about Windows and code pages. And as always: it looks like the shortcomings of the past (7-bit ASCII code, etc.) echo into our world today, making things fail. Luckily there is a way to overcome this, if necessary: it is possible to change the code page of the Windows command processor (cmd.exe) with a registry change if it does not match the code page used by Eclipse.
But it reinforces my rule even more: “Do not use foreign characters in file names”. Simply sticking with normal characters and letters will avoid a lot of trouble.
Happy code paging
Published at DZone with permission of Erich Styger, DZone MVB. See the original article here.