RE: [XaraXtreme-dev] Discussion of string portablility problems



I'm not close to the code, but here are some points:-

- First, we can drop support for non-Unicode builds if that will help
keep the code simpler (and that applies to LX and XarLib).

- Generally I prefer the option of using the standard printf functions
rather than trying to replace them with our own or trying to build
layers on top, since this is presumably what most incoming Linux
developers will be used to working with. So they won't have to learn how
to call our own special functions. If they do have to learn something
new that's another barrier to new developers becoming productive.

- I presume with the wx functions that Linux developers who are used to
specifying %ls for wide character strings can still do so and won't have
to remember to use %s all the time which they won't find natural.

- Gerry - is the reason you prefer the wx function route because it
requires the least code changes from where we are now? In other words is
it the fastest way forward?

Thanks

Neil

> -----Original Message-----
> From: owner-dev@xxxxxxxxxxxxxxxx [mailto:owner-dev@xxxxxxxxxxxxxxxx]
> On Behalf Of Phil Martin
> Sent: 03 April 2006 17:12
> To: dev@xxxxxxxxxxxxxx
> Subject: Re: [XaraXtreme-dev] Discussion of string portablility problems
> 
> Personally I would be very happy to drop support for non-Unicode
> builds - it's an albatross.
> 
> I think your camStr plan sounds like a good one - lets us use %s
> cleanly and creates a clean string-handling API. The downside, I
> suppose, is greater reliance on wxWidgets but that's a theoretical
> problem more than a real one.
> 
> Phil
> 
> Gerry Iles wrote:
> 
> > There is no single solution that will solve all these issues, as they
> > stem from the code starting out as TCHAR-based but only ever being
> > built in non-Unicode mode.
> >
> > One way of removing some of the complexity and handling (some of)
> > these issues would be for us to drop support for non-Unicode builds.
> > This would allow the direct use of %ls to mean a string parameter,
> > though the use of %s to write a non-Unicode string into a Unicode one
> > would not work on Windows and any code that relied on this would need
> > to change. This would remove the need for the _T macro, though we
> > would still have issues in some of our macros if they are called with
> > concatenated strings. We would then also make the interface to XarLib
> > support only wide character strings, making it much easier for the
> > code to be shared between XaraLX and XarLib. I was thinking that this
> > would be a sensible route to take, but it would still need a lot of
> > code to be changed (a script could replace most of the _T() and
> > tchar-type functions, but there would be quite a lot of other code to
> > change) and it does nothing to address the difference between our own
> > functions and printf (e.g. you would have to use %ls in printf-type
> > calls and %s in MakeMsg ones).
> >
> > wxWidgets defines its own versions of all the string functions (e.g.
> > wxStrcpy) and the wxchar.h file has very complex conditionals in it to
> > support all sorts of platforms. It actually implements a layer over
> > all the printf-type functions on Linux to make them behave like the
> > MSVC ones as that is "more useful for us". We could define our own set
> > of string functions (e.g. camStrcpy) for use by the kernel (or Oil
> > code that could be easily ported away from wx) that the wxOil layer
> > simply defines as mapping to the wx equivalents. This would enable us
> > to go back to the original meaning of %s always meaning a TCHAR
> > string. XarLib could then share the code from XaraLX and would either
> > just use the wxchar parts of wxWidgets (probably best for now) or
> > define its own wrapper for Linux. I actually now think this would be a
> > better option than dropping non-Unicode support, as it standardises
> > our %s usage, removes the PERCENT_S macros and should only require
> > some search-and-replaces and a list of "#define camStrcpy wxStrcpy"
> > type defines. This would also considerably simplify compatdef.h.
> >
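A minimal sketch of the kind of mapping header described above, for the case where XarLib defines its own wrapper instead of using wx. Only camStrcpy is named in the thread; CAM_UNICODE, camChar, camT, camStrlen, camStrcmp and cam_copy_name are hypothetical names used for illustration. In the wxOil layer the same names would presumably be defined as the wx equivalents (e.g. "#define camStrcpy wxStrcpy") rather than as the C runtime functions used here.

```c
#include <string.h>
#include <wchar.h>

/* CAM_UNICODE is an assumed build flag, not one taken from the XaraLX tree. */
#if defined(CAM_UNICODE)
typedef wchar_t camChar;
#define camT(s)    L##s
#define camStrcpy  wcscpy
#define camStrlen  wcslen
#define camStrcmp  wcscmp
#else
typedef char camChar;
#define camT(s)    s
#define camStrcpy  strcpy
#define camStrlen  strlen
#define camStrcmp  strcmp
#endif

/* Example of shared code written only against the cam* names, so it does
   not care which character width the build selected. The caller must
   supply a destination buffer large enough for src. */
static size_t cam_copy_name(camChar *dst, const camChar *src)
{
    camStrcpy(dst, src);
    return camStrlen(dst);
}
```

Code written this way compiles unchanged in either mode; only the one header knows whether the underlying functions come from wxWidgets or from the C runtime.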
> > Does anyone have any other suggestions or other comments on any of
> > this?
> >
> > Gerry
> >
> >
> > ----------------------------------------------------------------------
> >
> > *From:* owner-dev@xxxxxxxxxxxxxxxx [mailto:owner-dev@xxxxxxxxxxxxxxxx]
> > *On Behalf Of *Gerry Iles
> > *Sent:* 03 April 2006 13:03
> > *To:* dev@xxxxxxxxxxxxxx
> > *Subject:* [XaraXtreme-dev] Discussion of string portablility problems
> >
> > I am currently working on creating an open-source version of the
> > XarLib library (see http://www.xara.com/support/docs/webformat/spec/)
> > that will build on Linux, Mac and Windows. This library really needs
> > to build in both Unicode and non-Unicode versions and shouldn't
> > require wxWidgets. The library consists of the Xar format loading and
> > saving code from XaraLX with a small wrapper around it providing a
> > simple interface. While trying to update some of the XaraLX files from
> > the latest version I have come across several problems that really
> > need to be sorted out properly...
> >
> > Use of _T(), # and ##
> >
> > I went into this in my message of 29/03 but I've looked into it a bit
> > more and the MS compiler reports an error if you try to concatenate a
> > narrow and a wide string. GCC behaves differently, e.g.:
> >
> > char testa[256] = "Hello" "world";
> > char testb[256] = "Hello" L"world";
> > char testc[256] = L"Hello" "world";
> > char testd[256] = L"Hello" L"world";
> > wchar_t wtesta[256] = "Hello" "world";
> > wchar_t wtestb[256] = "Hello" L"world";
> > wchar_t wtestc[256] = L"Hello" "world";
> > wchar_t wtestd[256] = L"Hello" L"world";
> > TCHAR ttesta[256] = "Hello" "world";
> > TCHAR ttestb[256] = "Hello" L"world";
> > TCHAR ttestc[256] = L"Hello" "world";
> > TCHAR ttestd[256] = L"Hello" L"world";
> >
> > MSVC gives the following errors (in a Unicode build):
> >
> > (2) : error C2308: concatenating mismatched wide strings
> >
> > (3) : error C2308: concatenating mismatched wide strings
> >
> > (3) : error C2440: 'initializing' : cannot convert from 'const
> > unsigned short [8]' to 'char [256]'
> >
> > There is no context in which this conversion is possible
> >
> > (4) : error C2440: 'initializing' : cannot convert from 'const
> > unsigned short [11]' to 'char [256]'
> >
> > There is no context in which this conversion is possible
> >
> > (5) : error C2440: 'initializing' : cannot convert from 'const char
> > [11]' to 'wchar_t [256]'
> >
> > There is no context in which this conversion is possible
> >
> > (6) : error C2308: concatenating mismatched wide strings
> >
> > (6) : error C2440: 'initializing' : cannot convert from 'const char
> > [17]' to 'wchar_t [256]'
> >
> > There is no context in which this conversion is possible
> >
> > (7) : error C2308: concatenating mismatched wide strings
> >
> > (9) : error C2440: 'initializing' : cannot convert from 'const char
> > [11]' to 'TCHAR [256]'
> >
> > There is no context in which this conversion is possible
> >
> > (10) : error C2308: concatenating mismatched wide strings
> >
> > (10) : error C2440: 'initializing' : cannot convert from 'const char
> > [17]' to 'TCHAR [256]'
> >
> > There is no context in which this conversion is possible
> >
> > (11) : error C2308: concatenating mismatched wide strings
> >
> > Concatenation of the mismatched strings gives an error but then
> > continues compilation using the type of the first sub-string.
> >
> > GCC gives the following:
> >
> > 2:error: char-array initialised from wide string
> >
> > 3:error: char-array initialised from wide string
> >
> > 4:error: char-array initialised from wide string
> >
> > 5:error: int-array initialised from non-wide string
> >
> > 9:error: int-array initialised from non-wide string
> >
> > Also, removing the lines that fail to compile shows that the other
> > lines all produce a fully wide string (e.g. lines 10, 11 and 12 all
> > produce a TCHAR string saying "Helloworld").
> >
> > So, it appears that when concatenating strings, if any of the
> > sub-strings is wide then GCC promotes all of the sub-strings to wide
> > ones. This explains why our macros that do this compile on Linux but
> > not on MSVC.
> >
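One way to sidestep the compiler difference described above is to make every piece of a concatenation explicitly wide before the compiler sees it. The sketch below assumes a Unicode build; CAM_WIDEN, CAM_WIDEN2, PRODUCT_NAME and product_title are hypothetical names, not ones from the XaraLX source. For reference, C99 and C++11 both specify that a narrow literal adjacent to a wide one is treated as wide, which matches the GCC results above, but older compilers may still reject the mix.

```c
#include <wchar.h>

/* Two-level macro so that an argument which is itself a macro is expanded
   before the L prefix is pasted on. Both names are hypothetical. */
#define CAM_WIDEN2(s) L##s
#define CAM_WIDEN(s)  CAM_WIDEN2(s)

#define PRODUCT_NAME "XaraLX"   /* narrow literal, for illustration only */

/* Both pieces are wide literals by the time they are concatenated, so no
   compiler has to decide how to treat a narrow/wide mix. */
static const wchar_t product_title[] = CAM_WIDEN(PRODUCT_NAME) L" document";
```

Note that this only widens a single literal: if the macro argument itself expands to several adjacent literals, only the first one receives the L prefix, which is the situation the thread describes for macros called with concatenated strings.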
> > Use of %s in C runtime printf type functions
> >
> > It appears as though on Linux %s always means a narrow char string,
> > even in the wide version of the function. To pass a wide string you
> > must specify %ls. However, on Windows, MS have "helpfully" changed the
> > wide string version (e.g. swprintf) so that %s means a TCHAR pointer
> > and to force a narrow string you have to use %hs (you can still use
> > %ls to force a wide string). This can be handled by the PERCENT_S
> > macro (with new PERCENT_Sa and PERCENT_Sw macros for forcing to narrow
> > or wide) but there are quite a lot of calls to printf-type functions
> > that pass a %s and need to be changed to work correctly (e.g. at
> > present they will not work correctly on Linux because they use %s but
> > pass a TCHAR*). Also, there are quite a few places where narrow
> > strings are used deliberately, and this requires that the PERCENT_S
> > macros are not used (as they are defined as wide strings on Unicode
> > builds).
> >
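To make the PERCENT_S idea concrete, here is a sketch of how such macros could be defined for a Unicode build. The macro names come from the thread, but these definitions are an assumption based on the behaviour described above, not a copy of the XaraLX ones, and describe_file is a hypothetical helper. It also assumes a swprintf that takes a buffer size, as the standard one does.

```c
#include <stdio.h>
#include <wchar.h>

/* Assumed definitions for a Unicode build (TCHAR == wchar_t). */
#if defined(_WIN32)
#define PERCENT_S   L"%s"    /* MSVC wide printf: %s is a wide string   */
#define PERCENT_Sa  L"%hs"   /* force narrow                            */
#define PERCENT_Sw  L"%ls"   /* force wide                              */
#else
#define PERCENT_S   L"%ls"   /* standard printf: wide needs %ls         */
#define PERCENT_Sa  L"%s"    /* %s is always a narrow string            */
#define PERCENT_Sw  L"%ls"
#endif

/* Formats a wide file name and a narrow tag into a wide buffer.
   Returns the number of wide characters written, or a negative value
   on error. */
int describe_file(wchar_t *out, size_t cap,
                  const wchar_t *name, const char *tag)
{
    return swprintf(out, cap,
                    L"file " PERCENT_S L" (" PERCENT_Sa L")",
                    name, tag);
}
```

The cost is visible in the call: every format string has to be assembled from fragments, which is what the thread is trying to get away from.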
> > Other printf type functions
> >
> > XaraLX also implements some of its own printf-type functions
> > (_MakeMsg, CCvsprintf etc.). These only handle a subset of the
> > standard % codes, all treat %s as a TCHAR* and don't allow the use of
> > different width strings.
> >
> > Other problematic string usage
> >
> > There are various places in the XaraLX code where narrow string
> > functions are specifically used but are passed TCHAR* as parameters.
> > Presumably this sort of thing is being fixed when the files are
> > initially ported.
> >
> > I think the best solution for this would be for all of our
> > printf-type functions to behave the same and preferably not require
> > the use of PERCENT_S-type macros, but this would involve updating our
> > own functions (e.g. to support floating point) and also making the
> > standard printf-type functions on Linux treat %s as TCHAR* (either by
> > a wrapper that rewrites the format string or by writing our own
> > version). Alternatively, we would have to carefully document exactly
> > what each does support and how they must be used to be portable.
> >
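As a rough illustration of the "wrapper that rewrites the format string" option, the sketch below turns each bare %s in a wide format string into %ls, so that a Unicode build could keep writing %s for TCHAR strings on Linux. The names cam_widen_percent_s and put_char are hypothetical, and this is a simplified parser: it skips flags, width and precision, leaves %% and already-qualified conversions such as %ls alone, and makes no attempt to translate the MSVC-only %hs.

```c
#include <stddef.h>
#include <wchar.h>

/* Appends c to out if there is room (always leaving space for the
   terminator) and counts it either way. */
static void put_char(wchar_t *out, size_t cap, size_t *n, wchar_t c)
{
    if (*n + 1 < cap)
        out[*n] = c;
    ++*n;
}

/* Copies fmt to out, inserting 'l' before every unqualified 's'
   conversion. Returns the length the full result needs, excluding the
   terminator; the output is truncated if cap is too small. */
size_t cam_widen_percent_s(const wchar_t *fmt, wchar_t *out, size_t cap)
{
    size_t n = 0;

    while (*fmt) {
        wchar_t c = *fmt++;
        put_char(out, cap, &n, c);
        if (c != L'%')
            continue;
        if (*fmt == L'%') {                 /* literal "%%" */
            put_char(out, cap, &n, *fmt++);
            continue;
        }
        /* Skip over flags, field width and precision. */
        while (*fmt && wcschr(L"-+ #0123456789.*", *fmt))
            put_char(out, cap, &n, *fmt++);
        if (*fmt == L's')                   /* bare %s becomes %ls */
            put_char(out, cap, &n, L'l');
        /* The conversion character itself is copied by the main loop. */
    }
    if (cap > 0)
        out[n < cap ? n : cap - 1] = L'\0';
    return n;
}
```

A wrapper around vswprintf could run the caller's format through this function first and then hand the rewritten string to the C runtime. Whether that is preferable to writing a full replacement depends on how many non-standard format codes the existing calls rely on.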
> > Gerry
> >