Encoding for C-struct syntax #22
Hi,

The interpretation of unsigned/signed char as an array of numbers instead of UTF-8 strings was introduced after a suggestion in the following issue:

I think this is a nice "hack" to tell pycstruct which type of char array you would like to use, so I think pragmas in the source code are unnecessary.

I'm not sure what you mean by "a single char should be decoded like an array". A single char is decoded as an int8, and an "unsigned char" is decoded as a uint8. Note that there is no type called 'byte' in the standard C language; the standard type for a byte is 'char'.

I do agree that it would be good to also support older encoding schemes for legacy systems (note that UTF-8 is more or less standard nowadays). To support this I suggest:
Oops. I have probably mixed some things up in my mind. In fact I saw that a
was deserialized as an int, while my C header documentation was expecting a char like
That is why I think it makes sense to decode it, but I understand it's not so easy if there is no dedicated type for strings/values. A set of extra configuration options, as you suggest, sounds like a good way to specify how to handle a few cases, such as which encoding to use or how to handle a single char.
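To make the single-char ambiguity concrete, here is a minimal sketch using Python's standard struct module rather than pycstruct itself. The byte value 0xE9 and the variable names are illustrative assumptions, not taken from the header discussed above. It shows that the same byte can be read as a signed number, an unsigned number, or a one-byte character, depending only on the requested type.

```python
import struct

# One raw byte as it might appear in a packed C struct (value chosen for illustration).
raw = b"\xe9"

# Interpreted as a C "signed char": an 8-bit signed integer.
as_int8 = struct.unpack("b", raw)[0]

# Interpreted as a C "unsigned char": an 8-bit unsigned integer.
as_uint8 = struct.unpack("B", raw)[0]

# Interpreted as a C "char" holding a character: a bytes object of length 1,
# which still needs an encoding before it becomes text.
as_char = struct.unpack("c", raw)[0]
as_text = as_char.decode("latin1")

print(as_int8, as_uint8, as_char, as_text)
```

Nothing in the byte itself says which reading is intended, which is why some extra hint (type choice or configuration) is needed.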
Sorry to spam, but this makes me think a lot :-)
I have some proposals about the way encoding is handled with the C syntax.

Regarding signed char / unsigned char: as I saw in the documentation (Parse source code files), it is probably not very good C programming style to specify the sign of a char. For that there is the byte type, which is an int8. I think char should only be used for characters.

I also saw that the library uses utf-8 as the default. That is probably not a good idea: the encoding could be latin1, or many other things. It is also possible that the encoding stays unknown until the structure is read.

Thinking about that, and following the way packing is handled, maybe this could be generalized for encoding, for example with a dedicated pragma.
That way it could be switched in the middle of the description if needed.
That said, I don't have such problems for now.
So these are just proposals.
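As an illustration of why a fixed utf-8 default can be a problem, here is a minimal sketch in plain Python (no pycstruct API is used). The helper decode_char_array and the sample bytes are hypothetical, chosen only to show that the same char array decodes fine as latin1 but fails as utf-8.

```python
# An 8-byte C char array holding "café" in latin1, padded with NUL bytes.
raw = b"caf\xe9\x00\x00\x00\x00"


def decode_char_array(data: bytes, encoding: str) -> str:
    """Decode a NUL-terminated char array using the given encoding."""
    # Keep only the bytes before the first NUL terminator.
    text = data.split(b"\x00", 1)[0]
    return text.decode(encoding)


# latin1 maps every byte to a character, so this always succeeds.
latin1_text = decode_char_array(raw, "latin1")

# 0xE9 starts a multi-byte sequence in utf-8, and no continuation bytes follow,
# so decoding the same data as utf-8 raises an error.
try:
    utf8_text = decode_char_array(raw, "utf-8")
except UnicodeDecodeError:
    utf8_text = None

print(latin1_text, utf8_text)
```

A per-struct or per-field encoding setting would let the caller pick the latin1 path here instead of hitting the utf-8 error.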