cstring null terminated but with a fixed length? #23

LiamKarlMitchell · 2017-06-16T23:54:08Z

How would I make a cstring with a fixed length?

Something like char name[13]; where the app ensures that at most 12 characters are ever stored in or copied from it, and the last byte is always NULL.

The data I'm reading/writing is fixed-length.

Do I need to add my own custom type for this?
Using count and countType causes validation errors.

Example:

Buffer data
31 33 33 37 00 00 00 00

I would expect it to read up to the first NULL terminator, giving this string:
'1337'

But the max length of the cstring would be 7 characters, with a total byte length of 8.
1 byte at the end is reserved for the final NULL terminator in case all count-1 characters are set.

31 00 31 00 00 00 00 00
This would be read as the string '1'; the other data is ignored because the string is null-terminated.
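For illustration, here is a minimal sketch of the behaviour described above, written as plain Node.js helpers. The names readFixedCString and writeFixedCString are hypothetical and not part of ProtoDef, and treating a field without any NULL byte as an error is an assumption based on "last byte is always NULL".

```javascript
// Hypothetical helpers (not ProtoDef API): a null-terminated string
// stored in a field that always occupies `length` bytes.
function readFixedCString (buffer, offset, length) {
  if (offset + length > buffer.length) {
    throw new RangeError('field extends past the end of the buffer')
  }
  const field = buffer.subarray(offset, offset + length)
  const end = field.indexOf(0) // index of the first NULL byte, -1 if none
  if (end === -1) {
    // Assumption: a field with no terminator is treated as invalid.
    throw new Error('missing NULL terminator in fixed-length cstring')
  }
  // Bytes after the first NULL are ignored, but still count towards size.
  return { value: field.toString('utf8', 0, end), size: length }
}

function writeFixedCString (value, buffer, offset, length) {
  const bytes = Buffer.from(value, 'utf8')
  if (bytes.length > length - 1) {
    throw new RangeError('string does not fit (one byte is reserved for NULL)')
  }
  if (offset + length > buffer.length) {
    throw new RangeError('field extends past the end of the buffer')
  }
  buffer.fill(0, offset, offset + length) // zero padding plus terminator
  bytes.copy(buffer, offset)
  return offset + length // next write offset
}

const first = readFixedCString(Buffer.from('3133333700000000', 'hex'), 0, 8)
const second = readFixedCString(Buffer.from('3100310000000000', 'hex'), 0, 8)
console.log(first.value, second.value) // prints: 1337 1
```

Both reads report a size of 8, so whatever follows the field is parsed from the same offset regardless of how long the string is.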

I'll code my own for now as I already did wide strings in another project.

Thanks

roblabla · 2017-06-17T10:19:34Z

I've been meaning to add a generic "fixed-length" type that would look like

["fixed", {
  "length": 8,
  "type": "cstring"
}]

But I never got around to actually writing it. The idea is that it would limit what the underlying type "sees" to the length. If the underlying type throws a PartialReadError, then there is a protocol error, and we give up.

This is useful for things like restBuffer, etc...
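A rough sketch of how such a wrapper could behave, assuming inner readers follow a (buffer, offset) -> { value, size } convention. Everything here is hypothetical: readFixedWindow, readCStringInner and WindowOverrunError are illustrative names, and WindowOverrunError merely stands in for the PartialReadError mentioned above.

```javascript
// Stand-in for a "ran out of data" error; purely illustrative.
class WindowOverrunError extends Error {}

// Illustrative inner reader: a plain null-terminated string.
function readCStringInner (buffer, offset) {
  const end = buffer.indexOf(0, offset)
  if (end === -1) throw new WindowOverrunError('cstring has no terminator')
  return { value: buffer.toString('utf8', offset, end), size: end - offset + 1 }
}

// Generic wrapper: the inner reader only "sees" `length` bytes.
function readFixedWindow (buffer, offset, length, readInner) {
  if (offset + length > buffer.length) {
    throw new RangeError('not enough bytes for the fixed field')
  }
  const window = buffer.subarray(offset, offset + length)
  // If the inner type needs more than the window holds, it throws here,
  // which is surfaced to the caller as a protocol error.
  const { value } = readInner(window, 0)
  // The field always occupies `length` bytes, whatever the inner type used.
  return { value, size: length }
}

// 8-byte field holding "1337", followed by one unrelated byte (0x2a).
const packet = Buffer.from('31333337000000002a', 'hex')
const field = readFixedWindow(packet, 0, 8, readCStringInner)
console.log(field.value, field.size, packet[field.size]) // prints: 1337 8 42
```

Because the window is a slice, a terminator that only appears after the 8th byte is invisible to the inner reader, so an overlong string fails instead of silently reading into the next field.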

Saiv46 · 2020-12-30T11:32:39Z

@LiamKarlMitchell I'll ask you one thing - why? Protocols are used to serialize data with as few bytes as possible.

roblabla · 2020-12-30T12:51:31Z

A null-terminated string takes fewer bytes to encode than a length-prefixed one for long strings (e.g. anything over 128/256 bytes would require at least a 2-byte prefix, but only a single-byte null terminator).
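To make the arithmetic concrete, here is a small sketch comparing the framing overhead. It assumes the length prefix is a base-128 varint (7 payload bits per byte), which is presumably where the 128 threshold comes from; the 256 threshold would correspond to a fixed one-byte prefix instead. The function names are made up for this example.

```javascript
// Number of bytes a base-128 varint needs to encode the non-negative integer n.
function varintSize (n) {
  let size = 1
  while (n >= 0x80) {
    n = Math.floor(n / 128) // drop 7 bits per extra byte
    size++
  }
  return size
}

// Framing overhead (bytes beyond the payload itself) for each scheme.
function framingOverhead (payloadLength) {
  return {
    nullTerminated: 1, // always a single terminator byte
    varintPrefixed: varintSize(payloadLength)
  }
}

console.log(framingOverhead(127)) // { nullTerminated: 1, varintPrefixed: 1 }
console.log(framingOverhead(128)) // { nullTerminated: 1, varintPrefixed: 2 }
```

The terminator cost stays constant, while the prefix grows by one byte each time the length crosses a power of 128.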

Also, ProtoDef is meant to describe existing protocols, and I know of a handful that use character-termination instead of length prefixes. As a general rule, having a very generic "field-delimited array" type on top of which we could build cstring would be great.

Somewhat related to this is the notion of substreams that I had for a long long while when working on this. The idea being that we could limit parsers to only take values until a certain predicate is hit. Then we could specify cstring as:

["limited", {
  "endByte": 0,
  "subtype": "string"
}]

Where "string" is a byte sequence that exhausts the current stream (on that topic, string should probably take an encoding parameter). limited would create a new stream that lasts until the endByte byte is seen. Of course, this is not very well thought-out, but it's just to provide an idea of what it could look like. Length-prefixed streams could also be provided for a similar effect, and could be used to recreate the pstring type.

This would provide a new fundamental type that could be used to generate new classes of complex types to parse protocols (\n-delimited strings for text-oriented protocols come to mind).
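As a very rough sketch of the read side of that idea: the function below carves out a substream that ends at endByte and hands it to an inner reader that exhausts whatever it is given. readLimited and readWholeString are hypothetical names, and using a Buffer slice as the "stream" is a simplifying assumption, not how ProtoDef is structured.

```javascript
// Inner "string" type: consumes the entire stream it is handed.
function readWholeString (stream) {
  return stream.toString('utf8')
}

// Creates a substream lasting until `endByte` and parses it with `readInner`.
function readLimited (buffer, offset, endByte, readInner) {
  const end = buffer.indexOf(endByte, offset)
  if (end === -1) throw new Error('endByte not found before end of buffer')
  const substream = buffer.subarray(offset, end) // excludes the end byte
  // size includes the end byte so the caller can skip past it.
  return { value: readInner(substream), size: end - offset + 1 }
}

// cstring: limited by a 0 byte.
const c = readLimited(Buffer.from('hello\0world\0'), 0, 0x00, readWholeString)
// Text-oriented protocol: limited by a newline.
const line = readLimited(Buffer.from('GET /\nHost: x\n'), 0, 0x0a, readWholeString)
console.log(c, line)
// prints: { value: 'hello', size: 6 } { value: 'GET /', size: 6 }
```

The same wrapper covers both cases; only the end byte changes, which is the appeal of treating it as a fundamental type.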

LiamKarlMitchell · 2020-12-30T13:41:47Z

@Saiv46 I'm implementing communication using an existing protocol that is not mindful of saving space. As a nicety, to avoid having to read the bytes and then trim the zero bytes from the string in my own code, it would be nice if this could be implemented as part of the definition.

Saiv46 · 2020-12-30T17:04:28Z

> The idea being that we could limit parsers to only take values until a certain predicate is hit.

@roblabla Good idea, but it doesn't work.

ProtoDef is usually used to serialize user data, so this kind of datatype would lead to unexpected behaviour.

For example, how will this

["limited", {
  "endByte": 0,
  "subtype": ["container", [
    { "name": "index", "type": "u8" },
    { "name": "value", "type": "string" }
  ]]
}]

work with { index: 0, value: "\00\00" }? The result will have 4 bytes, but only 1 will be read.

roblabla · 2020-12-30T20:12:08Z

I expect serialization to throw on such inputs. The idea being that when the subtype writes to the current stream, limited would see the 0 and raise an error. But yes, I concede it is not straightforward to implement.

Also, it could be decided that invalid inputs are allowed to produce gibberish. Obviously not my favorite choice, but nothing says all inputs to a given type have to produce a valid bytestream. In such a world, we'd simply warn users not to put 0s in their limited inputs.
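The throwing variant could look roughly like the sketch below. It is simplified on purpose: the inner value is serialized to a temporary buffer first and then scanned for endByte, rather than intercepting writes on a live stream. writeLimitedChecked and writeIndexValue are hypothetical names that mirror the container from the example earlier in the thread.

```javascript
// Inner writer for the container: a u8 index followed by a string
// that fills the rest of the stream.
function writeIndexValue (obj) {
  return Buffer.concat([
    Buffer.from([obj.index]),
    Buffer.from(obj.value, 'utf8')
  ])
}

// Serializes the inner value, rejects it if it contains endByte,
// then appends endByte as the terminator.
function writeLimitedChecked (obj, endByte, writeInner) {
  const body = writeInner(obj)
  if (body.includes(endByte)) {
    throw new Error('serialized value contains the end byte')
  }
  return Buffer.concat([body, Buffer.from([endByte])])
}

console.log(writeLimitedChecked({ index: 1, value: 'ab' }, 0, writeIndexValue).toString('hex'))
// prints: 01616200

try {
  writeLimitedChecked({ index: 0, value: '\0\0' }, 0, writeIndexValue)
} catch (err) {
  console.log(err.message) // prints: serialized value contains the end byte
}
```

Dropping the includes check gives the "gibberish allowed" alternative: the write succeeds, and the reader stops at the first 0 it meets.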
