Working with Unicode in Oracle

I have been working mostly with “normal” character set databases.
The last project however uses Unicode database (AL32UTF8) in 11.2
I ran into this strange behaviour (until I knew better):

create table test(field varchar2(10));
Table created.
SQL> insert into testi values (‘öäåÖÄÅ’);
insert into testi values (‘öäåÖÄÅ’)
*
ERROR at line 1:
ORA-12899: value too large for column “”TEST”.”FIELD” (actual: 18,
maximum: 10)
SQL> insert into test values (‘öä’);
1 row created.
So – I created a field for 10 characters (or so I thought!) and tried to store only 6 characters and failed.
What an odd thing.
Closer look tells that each character actually uses 3 bytes each:

SQL> select dump(field) from test;
DUMP(KENTTA)
——————————————————————————–
Typ=1 Len=6: 239,191,189,239,191,189
1 row selected.
All right, so it seems that there is actually space for only 10 BYTES – not characters…
After 5 seconds of furious googling the reason is clear and my world shattered, again.
The VARCHAR2(10) REALLY means VARCHAR2(10 BYTE).
What I should have always used is VARCHAR2(10 CHAR).
A new test:

create table test2 (field varchar2(10 char));
=>
SQL> insert into test2 values(‘öäåÖÄÅ’);
1 row created.
How embarrassing after so many years of working with Oracle

Tags: , ,

This entry was posted on Monday, March 8th, 2010 at 21:42 and is filed under Uncategorized. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

 

Leave a Reply