MSVC9 is now supported by CppCMS
MSVC 2008 has successfully compiled the CppCMS 1.x.x branch and run several examples. So it looks like Windows with MSVC will be one of the officially supported platforms. (Note: the 0.0.x stable branch will not support Windows other than through Cygwin.)
The biggest problems I had with MSVC were the lack of support for C99 and some POSIX functionality, like snprintf, cstdint or gmtime_r… but with a couple of ifdefs it was mostly solved.
To be honest, Windows development can be quite unpleasant, especially with the lack of basic, widely available libraries like zlib. But in the end I must admit that it is possible to work with the Microsoft compiler; I could even admit that it would be a very fine compiler if it supported C99.
Also, the more time I spend supporting the Windows platform, the more I understand: gcc is not as polished under Windows as it is under Unix platforms. That makes development of cross-platform software even harder.
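As an illustration of the kind of ifdef involved, here is a minimal sketch of a wrapper around gmtime_r. The name portable_gmtime is hypothetical and is not taken from the CppCMS sources; the sketch only distinguishes MSVC (which provides gmtime_s) from POSIX systems, and other Windows toolchains may need their own branch.

```cpp
#include <time.h>

// Hypothetical helper (not from CppCMS): converts a time_t to broken-down
// UTC time in a thread-safe way on both MSVC and POSIX systems.
inline bool portable_gmtime(time_t value, struct tm &out)
{
#if defined(_MSC_VER)
    // MSVC has no gmtime_r; gmtime_s takes the arguments in reverse order
    // and returns zero on success.
    return gmtime_s(&out, &value) == 0;
#else
    // POSIX: gmtime_r returns a pointer to the result, or null on failure.
    return gmtime_r(&value, &out) != 0;
#endif
}
```

Calling portable_gmtime(0, result) should fill result with midnight on 1 January 1970, whichever branch is compiled.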
CppCMS 1.x.x moves to CMake
No, I don’t think that CMake is better than autotools. In fact I still think that CMake is total "crap". It has a terrible cache policy and broken configure_file support. It has crappy documentation and many broken configuration tools like CheckTypeSize… and much more.
But it supports MSVC (which I may consider supporting in the future) and has better Windows support… So I announce that the next version of CppCMS will use CMake (and it is already used in the re-factoring branch).
The autotools build system is no longer supported and will be removed from the CppCMS 1.x.x branch, because I do not really like supporting two build systems.
I hope CppCMS users will understand this terrible move.
Introducing Boost.Locale
After a long period of hesitation I understood that standard C++ locale facets are a no-go, and started developing localization tools based on ICU that work in a C++-friendly way. Thus, Boost.Locale was born. It has just been announced to the Boost community for preliminary review.
Boost.Locale provides case conversion and folding, Unicode normalization, collation, numeric, currency, and date-time formatting, message formatting, boundary analysis and code-page conversions in a C++-aware way.
For example, in order to display a currency value it is enough to write this:
cout << as::currency << 123.34 << endl;
And a currency value like "$123.34" would be formatted. Spelling a number?
cout << as::spellout << 1024 << endl;
Very simple. And much more. This library will be the base for CppCMS localization support, and I hope it will also be accepted into Boost at some point in the future.
I've tested this library with:
- Linux GCC 4.1, 4.3 with ICU 3.6 and 3.8
- Windows MSVC-9 (VC 2008), with ICU 4.2
- Windows MinGW with ICU 4.2
- Windows Cygwin with ICU 3.8
Documentation
- Full tutorials: http://cppcms.sourceforge.net/boost_locale/docs/
- Doxygen reference: http://cppcms.sourceforge.net/boost_locale/docs/doxy/html/
Source Code
The source code is available from the SVN repository:
svn co https://cppcms.svn.sourceforge.net/svnroot/cppcms/boost_locale/trunk
Building
You need CMake 2.4 or above and ICU 3.6 or above (4.2 recommended). So check out the code and run:
cmake /path/to/boost_locale/libs/locale
make
Input and comments are welcome.
Localization in 2009 and the broken standard of C++
There are many goodies in the upcoming C++0x standard. Both the core language and the standard libraries have been significantly improved.
However, there is one important part of the library that remains broken – localization.
Let’s write a simple program that prints a number to a file in C++:
#include <iostream>
#include <fstream>
#include <locale>
int main()
{
    // Set the global locale to the system default
    std::locale::global(std::locale(""));
    // Open file "number.txt"; the stream picks up the global locale
    std::ofstream number("number.txt");
    // Write a number to the file; it is closed when main returns
    number << 13456 << std::endl;
}
And in C:
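#include <stdio.h>
#include <locale.h>
int main()
{
    /* Use the locale from the environment */
    setlocale(LC_ALL, "");
    FILE *f = fopen("number.txt", "w");
    if (!f)
        return 1;
    /* The ' flag asks for locale-specific digit grouping; the value is an int, so use %d */
    fprintf(f, "%'d\n", 13456);
    fclose(f);
    return 0;
}
Let's run both programs with the en_US.UTF-8 locale and observe the following number in the output file: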
13,456
Now let's run these programs with the Russian locale: LC_ALL=ru_RU.UTF-8 ./a.out. The C version gives us, as expected:
13 456
While the C++ version produces:
13<?>456
Incorrect UTF-8 output text! What happened? What is the difference between the C library and the C++ library, which use the same locale database?
According to the locale, the thousands separator in Russian is U+2002 (EN SPACE), a code point that requires more than one byte in UTF-8 encoding. But let’s take a look at the C++ number formatting provider, std::numpunct. We can see that the member function thousands_sep returns a single character. In the C locale definition, on the other hand, the thousands separator is represented as a string, so there is no single-character limitation as in the C++ standard class.
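The single-character limitation is visible directly in the facet's interface. The following is a minimal sketch, assuming a hypothetical facet comma_grouping and helper format_grouped that are defined here only for illustration: do_thousands_sep has to return exactly one char, so a separator such as EN SPACE, which takes three bytes in UTF-8, cannot be expressed through a char-based numpunct.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: groups digits by three with a comma.
struct comma_grouping : std::numpunct<char> {
protected:
    // The return type is a single char; there is no room for a
    // multi-byte UTF-8 sequence such as "\xE2\x80\x82" (U+2002).
    virtual char do_thousands_sep() const { return ','; }
    // "\3" means: repeat groups of three digits.
    virtual std::string do_grouping() const { return "\3"; }
};

// Hypothetical helper: formats an integer using the facet above.
std::string format_grouped(long value)
{
    std::ostringstream out;
    // The locale takes ownership of the facet and deletes it when done.
    out.imbue(std::locale(std::locale::classic(), new comma_grouping));
    out << value;
    return out.str();
}
```

With this facet, format_grouped(13456) yields "13,456" regardless of which system locales are installed, which makes the mechanism easy to experiment with.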
This was just one simple and easily reproducible problem with C++ standard locale facets. There are many more:
- std::time_get is not symmetric with std::time_put (as strftime/strptime are in C) and does not allow easy parsing of times with AM/PM marks.
- std::ctype is very simplistic, assuming that toupper/tolower can be done on a per-character basis (case conversion may change the number of characters and it is context dependent).
- std::collate does not support collation strength (case sensitive or insensitive).
- There is no way to specify a timezone different from the global timezone in time formatting and parsing.
- Time formatting/parsing always assumes Gregorian calendar.
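The per-character limitation of case conversion can be sketched without any special locale installed. The helper per_char_upper below is hypothetical and simply applies std::toupper to every byte under the default "C" locale, the way a naive character-by-character conversion would. Given the UTF-8 text "straße", it upper-cases the ASCII letters but leaves the two bytes of "ß" untouched, whereas a full Unicode case mapping would produce "STRASSE".

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: upper-cases a string one char at a time.
// A one-to-one mapping like this can never turn one character into two.
std::string per_char_upper(std::string text)
{
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        // Cast to unsigned char first: passing a negative char value
        // to std::toupper is undefined behaviour.
        unsigned char c = static_cast<unsigned char>(text[i]);
        text[i] = static_cast<char>(std::toupper(c));
    }
    return text;
}
```

When writing the input as a string literal, the bytes of "ß" (0xC3 0x9F) are best kept in a separate literal from the following "e", for example "stra\xC3\x9F" "e", so that the hex escape does not swallow the next letter.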
It's very frustrating that in 2009 such annoying, easily reproducible bugs exist and make localization facilities totally useless in certain locales.
All the work I have recently done on localization support in the CppCMS framework has convinced me of an important decision: ICU will be a mandatory dependency and will provide most of the localization facilities by default, because native C++ localization is a no-go…
The question is: "Will the C++0x committee revisit localization support in C++0x?"
Message to Blog Readers
This web site will be down for a couple of days. Sorry for the inconvenience.