Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QCheck2.Gen: enforce naming consistency #223

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

vch9
Copy link
Contributor

@vch9 vch9 commented Jan 17, 2022

Partially close #162 (the optional parameters are not adressed)

As QCheck2 is already released, I went with a "deprecated" phase, this can be back ported to QCheck1 as well.

The generator naming convention is basically <type>_suffixes where _suffixes is optional.

@@ -481,8 +481,8 @@ module Function = struct
Gen.(quad (* string -> int -> string *)
(fun2 ~print:Print.string Observable.string Observable.int (small_string ~gen:char))
(small_string ~gen:char)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also switch to string_small? (two occurrences)

@@ -420,7 +420,7 @@ module Function = struct
Test.make ~name:"fail_pred_map_commute" ~count:100 ~long_factor:100
~print:Print.(triple (list int) Fn.print Fn.print)
Gen.(triple
(small_list small_int)
(small_list nat_small)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use list_small?


let failing =
Test.make ~name:"should_fail_sort_id" ~count:10 ~print:Print.(list int)
Gen.(small_list small_int) (fun l -> l = List.sort compare l)
Gen.(small_list nat_small) (fun l -> l = List.sort compare l)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use list_small?

let small_int = small_nat

let small_signed_int : int t = fun st ->
let int_small_signed : int t = fun st ->
if RS.bool st
then small_nat st
else (small_nat >|= Int.neg) st
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use nat_small? (two occurrences)

@@ -685,11 +697,14 @@ module Gen = struct

let string_printable = string_size ~gen:printable nat

let small_string ?gen st = string_size ?gen small_nat st
let string_small ?gen st = string_size ?gen small_nat st
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use nat_small?


let small_list gen = list_size small_nat gen
let list_small gen = list_size small_nat gen
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here


let small_array gen = array_size small_nat gen
let array_small gen = array_size small_nat gen
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here


@since NEXT_RELEASE
*)

val small_string : ?gen:char t -> string t
(** Builds a string generator, length is {!small_nat}.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nat_small?

@@ -319,11 +394,22 @@ module Gen : sig

@since 0.11 *)

val string_small : ?gen:char t -> string t
(** Builds a string generator, length is {!small_nat}.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nat_small

@jmid
Copy link
Collaborator

jmid commented Jan 18, 2022

Thanks for taking a stab at this! Overall this LGTM 😀

I spotted a few occurrences of small_nat in the interface and in the implementation.
It would make sense to switch to the new names internally in the module, IMO.

There's also one remaining in line 62 of the interface in the documentation example at the top:

Gen.(list small_nat)

I'm guessing there are probably more that I missed.
Could I ask you to take an extra pass over both files to make sure we've caught (most of) them? 🙏‍

Looking at the new interface from a distance we still have a bit of inconsistency regarding signed generators:

  val int : int t
  val pint : ?origin : int -> int t
  val int_neg : int t
  val int_small_signed : int t
  val int_small_corners : unit -> int t
  val int_pos_corners : int list
  val int_corners : int list

compared to:

  val float : float t
  val float_p : float t
  val float_n : float t

One option would be to rename

  • pint to int_pos
  • float_p to float_pos
  • float_n to float_neg

but if the majority prefers _p and _nsuffixes I'm also fine with that - as long as it is consistent! 😅😉

The int_small_signed is a legacy from QCheck IIRC. Since we are renaming for consistency nat to me indicates non-negative and int indicates both positive and negative numbers. As such, while we are at it I would favor a rename

  • int_small_signed to int_small

@c-cube ?

Finally, there's the "optional parameters that aren't so optional" that you didn't intend to address in this PR:
We should do so as part of rewamping the QCheck2 API IMO - but perhaps as a separate PR (before a release, please),
as there are still generators with a single not-so-optional parameter remaining in the revised interface.

@vch9
Copy link
Contributor Author

vch9 commented Jan 18, 2022

Thanks for taking a stab at this! Overall this LGTM grinning

I spotted a few occurrences of small_nat in the interface and in the implementation. It would make sense to switch to the new names internally in the module, IMO.

There's also one remaining in line 62 of the interface in the documentation example at the top:

Gen.(list small_nat)

I'm guessing there are probably more that I missed. Could I ask you to take an extra pass over both files to make sure we've caught (most of) them? 🙏‍

There is most certainly other occurences, thanks for noticing (some of) them. I will correct them and have a second look.

Looking at the new interface from a distance we still have a bit of inconsistency regarding signed generators:

  val int : int t
  val pint : ?origin : int -> int t
  val int_neg : int t
  val int_small_signed : int t
  val int_small_corners : unit -> int t
  val int_pos_corners : int list
  val int_corners : int list

compared to:

  val float : float t
  val float_p : float t
  val float_n : float t

One option would be to rename

* `pint` to `int_pos`

* `float_p` to `float_pos`

* `float_n` to `float_neg`

but if the majority prefers _p and _nsuffixes I'm also fine with that - as long as it is consistent! sweat_smilewink

I also do prefer _pos to be honest.

The int_small_signed is a legacy from QCheck IIRC. Since we are renaming for consistency nat to me indicates non-negative and int indicates both positive and negative numbers. As such, while we are at it I would favor a rename

* `int_small_signed` to `int_small`

@c-cube ?

Finally, there's the "optional parameters that aren't so optional" that you didn't intend to address in this PR: We should do so as part of rewamping the QCheck2 API IMO - but perhaps as a separate PR (before a release, please), as there are still generators with a single not-so-optional parameter remaining in the revised interface.

I think the way to go for optional parameters would be like @sir4ur0n mentioned:

n my opinion, whenever it makes sense, we should provide a generator that does not take any additional input (make it dead-easy to get started on QCheck) so I'd go with val small_string : string t by default.
Then we should also provide another function that takes a mandatory - labeled or not, this depends if there are other arguments - generator.

But for instance with opt, I did not know what was more convenient as a suffix:

val opt : 'a t -> 'a option t
val opt_ratio : ratio:float -> 'a t -> 'a option t

For other types such as list, string, _of can be an appropriate suffix I think.

@vch9
Copy link
Contributor Author

vch9 commented Jan 18, 2022

I think _pos is more convenient: we have a lot of suffixes in the module. For some it's obvious (e.g. _neg) but its more complicated for _big for instance.

@vch9
Copy link
Contributor Author

vch9 commented Jan 18, 2022

Regarding make_primitive I think we could change to:

val make_primitive : ?shrink : ('a -> 'a Seq.t) -> (Random.State.t -> 'a) -> 'a t

where ?(shrink = fun _ -> Seq.empty)

@jmid
Copy link
Collaborator

jmid commented Jan 18, 2022

Finally, there's the "optional parameters that aren't so optional" that you didn't intend to address in this PR: We should do so as part of rewamping the QCheck2 API IMO - but perhaps as a separate PR (before a release, please), as there are still generators with a single not-so-optional parameter remaining in the revised interface.

I think the way to go for optional parameters would be like @sir4ur0n mentioned:

n my opinion, whenever it makes sense, we should provide a generator that does not take any additional input (make it dead-easy to get started on QCheck) so I'd go with val small_string : string t by default.
Then we should also provide another function that takes a mandatory - labeled or not, this depends if there are other arguments - generator.

But for instance with opt, I did not know what was more convenient as a suffix:

val opt : 'a t -> 'a option t
val opt_ratio : ratio:float -> 'a t -> 'a option t

I like opt_ratio - it is a very saying name! 😀
I also agree with @sir4ur0n that providing simple, non-parameterized generators is a good idea to lower the barrier to entry.

I guess a second implicit design principle is to try to align generator names with OCaml's type names and type constructors:
unit, bool, char, sttring, ... int list, char array
For the latter we should stick with the prefixed generator names list int, array char, ...
If we follow this principle, we should rename opt to option though... 🤔

For other types such as list, string, _of can be an appropriate suffix I think.

To keep alignment with the types and type constructors list and array should be kept as is,
as the user has to provide an element type (generator).
For string both the above design principles suggest to keep an un-parameterized version.

I agree about the _of suffix for, e.g., for string!
As I mentioned in #162 (comment)
this makes for very readable generators in test specifications like Gen.(string_of printable), Gen.(string_of numeral), ... We should add additional char generators, like letter, ascii, latin1`, ... (but that is for another issue/PR! 😀)

@vch9 vch9 changed the title QCheck2.Gen: enforce naming consistency WIP: QCheck2.Gen: enforce naming consistency Jan 18, 2022
@vch9
Copy link
Contributor Author

vch9 commented Jan 18, 2022

While addressing the suffixes and optional parameters (except make_primitive yet) I also looked for occurrences with git grep <old_gen>. I might have removed the vast majority but its a error-prone process. Once we have released NEXT_RELEASE we will be able to totally wipe the deprecated generators and the compilation will help us.

@vch9 vch9 changed the title WIP: QCheck2.Gen: enforce naming consistency QCheck2.Gen: enforce naming consistency Jan 18, 2022
@jmid
Copy link
Collaborator

jmid commented Jan 18, 2022

Regarding make_primitive I think we could change to:

val make_primitive : ?shrink : ('a -> 'a Seq.t) -> (Random.State.t -> 'a) -> 'a t

where ?(shrink = fun _ -> Seq.empty)

So making shrink optional and removing the gen-label? 🤔

The QCheck parallel is probably QCheck.make that has a similar interface: a bunch of optional parameters (shrink, print, ...) and one unlabeled required one (the Gen.t generator).

  • On the one hand it will be more consistent with the interface of the remaining generator combinators.
  • On the other hand code of a custom QCheck2 generator could be a bit harder to read, e.g., if it passes two large arguments ("what is this second argument?").

An alternative would be to have an optional shrink and a required+labeled gen.
I don't feel strongly for either though. @c-cube ?

val nat_origin : int -> int t
(** Generates non-strictly positive integers uniformly ([0] included).

Shrinks towards [origin] if specified, otherwise towards [0].
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nat_origin b] shrinks towards [b].

Shrinks towards [origin] if specified, otherwise towards [0]. *)
Shrinks towards [origin] if specified, otherwise towards [0].

@deprecated use {!int_origin} *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use {!nat_origin}

@c-cube
Copy link
Owner

c-cube commented Jan 18, 2022

not very fond of required+labeled in this case, but why not . Remember you also need a unit at the end to keep the optional parameter optional.

For int_small_signed, yes, remove _signed. It should be in the docstring but the nat vs int split is clear enough I think.

let p = RS.float st 1. in
if p < (1. -. ratio)
then Tree.pure None
else Tree.opt (gen st)

let option gen = option_ratio ~ratio:0.85 gen

let opt ?(ratio = 0.85) = option_ratio ~ratio
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this compile without the dreaded "this-optional-argument-is-not-so-option" warning/error?
(I realize it is curried, so that option_ratio will expect a later gen argument)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not have the warning but I will double check. But I think the gen argument should prevent the warning yes.

@@ -399,14 +412,16 @@ module Gen = struct
let right = RS.bits st in
left lor middle lor right

let pint ?(origin : int = 0) : int t = fun st ->
let nat_origin origin : int t = fun st ->
let x = pint_raw st in
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest renaming pint_raw to nat_pos_raw while we are at it.
I know it isn't visible from the outside, however using the naming principles internally should make reading the implementation easier for ourselves and others going forward.


let int_bound (n : int) : int t =
if n < 0 then invalid_arg "Gen.int_bound";
fun st ->
if n <= (1 lsl 30) - 2
then Tree.make_primitive (fun a () -> Shrink.int_towards 0 a ()) (RS.int st (n + 1))
else Tree.map (fun r -> r mod (n + 1)) (pint st)
else Tree.map (fun r -> r mod (n + 1)) (nat st)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think suspect this renaming introduces a bug: the else branch handles n bigger than 30-bit,
which will then need the old pint (now: nat_origin) to stitch together 3 calls.
Using plain nat outputs max 10.000 though. A statistics test should reveal the change in distribution!
This would be a good opportunity to add it 😀

then pint ~origin:0 >|= (fun n -> - n - 1)
else pint ~origin:0
then nat_origin 0 >|= (fun n -> - n - 1)
else nat_origin 0

let int_bound (n : int) : int t =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, should this really be called nat_bound? 😬

@jmid
Copy link
Collaborator

jmid commented Jan 18, 2022

Hm. I'm starting to realize that this has opened a can of worms and wonder if the design is going in the direction we intend it to...

In a sense, having nat-generators is short, user-friendly to write, and clear but it also goes against consistently using a type-prefix. I'm not sure replacing nat with int_pos in these will be better though:

  val nat_origin : int -> int t
  val nat_small : int t
  val nat : int t
  val nat_big : int t

I noticed that

  • nat is non-uniform and at most 10.000 - while int is uniform over all OCaml ints.
    (one should use nat_origin 0 to get a behavior mirroring int)
  • there's no int_big

when comparing to these:

  val int : int t
  val int_neg : int t
  val int_small : int t
  val int_small_corners : unit -> int t
  val int_range : ?origin:int -> int -> int -> int t
  val int_pos_corners : int list
  val int_corners : int list

So we end up having int_neg - but no int_pos - so we should rename int_pos_corners to nat_corners I guess?
Furthermore, it seems we are missing a generator of int_pos/nat_(small_)corners that utilizes int_pos_corners... 😬🤷‍♂️

@vch9
Copy link
Contributor Author

vch9 commented Jan 19, 2022

I noticed that

  • nat is non-uniform and at most 10.000 - while int is uniform over all OCaml ints.
    (one should use nat_origin 0 to get a behavior mirroring int)
  • there's no int_big

This should be addressed yes, in this MR or in a subsequent one. If you want to add these new generators feel free to commit on my branch.

So we end up having int_neg - but no int_pos - so we should rename int_pos_corners to nat_corners I guess?

Would it make sense to have int_pos which is just let int_pos = nat where we specify in the documentation to look for nat_* generators for more specific positive integers?

Furthermore, it seems we are missing a generator of int_pos/nat_(small_)corners that utilizes int_pos_corners...

It should also be addressed yes.

@vch9
Copy link
Contributor Author

vch9 commented Mar 9, 2022

I really wish I can revive this PR, however, I'm still busy and I will be for a while. Once I get the time that'll be the first thing I'll do :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

QCheck2.Gen design considerations
3 participants