A site for solving at least some of your technical problems...
A site for solving at least some of your technical problems...
Trying to create a PostgreSQL Database with UNICODE does not always work. I ran in a problem with a RedHat installation...
There can be two problems:
Now a day, the second case is quite unlikely and it means that you won't be able to use a UNICODE encoding unless you recompile the database system. Make sure you use the ./configure --multi-byte encoding for that purpose.
In the first case, however, it could be that your cluster was initialized in SQL_ASCII. This is annoying but it does not prevent you from creating database in UNICODE (or UTF-8).
I found the solution on an Italian website. Although I don't really read Italian that well, the solution was in regular SQL language.
The problematic command line is this one:
CREATE DATABASE mydb WITH ENCODING = 'UTF-8';
When I try to run that statement, I get the following error:
ERROR: encoding UTF8 does not match locale en_US DETAIL: The chosen LC_CTYPE setting requires encoding LATIN1.
Which I understood, at first, as "The database does not support UNICODE"... Hmmm...
Later I found the Italian solution but tried from the wrong template—I always create new databases from template1!
CREATE DATABASE mydb WITH ENCODING = 'UTF-8' LC_CTYPE = 'en_US.utf8' LC_COLLATE = 'en_US.utf8';
Here I got it! The LC_CTYPE and LC_COLLATE are PostgreSQL variables, not shell environment variables. I was wondering since the shell did not show those variables...
But I then got the following error:
ERROR: new encoding (UTF8) is incompatible with the encoding of the template database (LATIN1) HINT: Use the same encoding as in the template database, or use template0 as template.
That last error was followed by a great hint though! Use template0 instead of your current template (which was template1). Oh! Yeah! I did not read the Italian fix from top to bottom... Surprised?!
CREATE DATABASE mydb WITH ENCODING = 'UTF-8' LC_CTYPE = 'en_US.utf8' LC_COLLATE = 'en_US.utf8' TEMPLATE template0;
Voilà! That worked.
And yes, to my point of view, all databases should just use UTF-8, although that is not always 100% welcome when some binary data is to be saved in the database and no conversion should occur on that data... But any text should by default be viewed as UTF-8 so all text works.