Re: [XaraXtreme-dev] XML notes



Alex Bligh wrote:



--On 10 May 2006 15:22 +0100 Phil Martin <phil@xxxxxxxx> wrote:

* Remember that libxml2 uses UTF-8 encoding in memory no matter what the
program's native encoding or the encoding of the original doc. So you
have to be careful to do correct conversion to and from UTF-8 when
passing string parameters to it. I used wxString for this because it has
useful encoding conversion functions.


So I am guessing it is impossible to reliably encode non-UTF-8 in XML -
using this library anyway? Can you confirm that's the case? If so,
things like saving a text story as XML would be inherently lossy. Fine
if so, but we should know.

Alex

No, it's only the in-memory storage that uses UTF-8, and remember that UTF-8 can encode all Unicode characters.

libxml2 can load and save in any encoding you like, so long as you have the appropriate encoding handlers installed. The default handlers include UTF-16 (big and little endian), ISO-Latin-1, ASCII, HTML and more, depending on platform and support libraries.

So there's no reason to lose data if you load one encoding and save to another.
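
To illustrate the principle, here is a minimal sketch that uses only Python's standard codecs, not libxml2 itself. The transcode() helper and the sample strings are made up for the example. It decodes from a source encoding, holds the text as UTF-8 (standing in for the in-memory form), and re-encodes to the target.

```python
# Sketch only: Python's built-in codecs stand in for libxml2's encoding
# handlers. transcode() is a hypothetical helper, not a libxml2 call.

def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode from `source`, hold the text as UTF-8, re-encode as `target`."""
    in_memory = data.decode(source).encode("utf-8")  # internal UTF-8 form
    return in_memory.decode("utf-8").encode(target)

# Text with characters outside both ASCII and ISO-Latin-1.
text = "caf\u00e9 \u2260 \u65e5\u672c\u8a9e"

# UTF-16 in, UTF-8 in memory, UTF-16 back out: nothing is lost.
utf16_in = text.encode("utf-16")
utf16_out = transcode(utf16_in, "utf-16", "utf-16")
lossless = utf16_out.decode("utf-16") == text

# Load one encoding (ISO-Latin-1) and save as another (UTF-16).
latin1_in = "caf\u00e9".encode("iso-8859-1")
latin1_as_utf16 = transcode(latin1_in, "iso-8859-1", "utf-16")
```

The one direction this sketch does not cover is saving to an encoding narrower than the text needs, such as plain ASCII for the first string. That is a limit of the target encoding, not of the UTF-8 in-memory form.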

BTW: libxml2 records the encoding that a document had when it was loaded and will save back in the same encoding unless you specify otherwise.

Phil