Cyrillic characters don't work #93

Open
husainshabbir opened this issue Jan 14, 2020 · 7 comments

@husainshabbir

Regular expressions with Cyrillic characters (e.g. [А-Я]{1,5}[а-я]{5,10}) don't work in the latest version. The last version in which they worked is 0.4.6.

This reproduces the issue: https://codesandbox.io/s/randexp-cyrillic-issue-2kcou

@fent
Owner

fent commented Jan 17, 2020

The default range for sets includes only printable ASCII characters: https://github.com/fent/randexp.js#default-range

You can change it with something like the following:

RandExp.prototype.defaultRange.add(0, 65535);

or on an individual instance:

let randexp = new RandExp(/regex/);
randexp.defaultRange.add(0, 65535);

defaultRange was added so that the "any" character set (.) wouldn't generate characters most randexp users wouldn't expect. However, it's applied to all sets, even custom sets (/[a-f]/) and negated sets (/[^\D]/).

Whether it should be applied to custom sets is debatable; it does seem like unexpected behavior.
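To make the cause concrete, here is a minimal sketch that checks where the two ranges from the original report fall relative to printable ASCII (code points 32 to 126). The helper names `rangeBounds` and `overlapsAscii` are hypothetical and are not part of randexp; the sketch only inspects code points and does not call the library.

```javascript
// Printable ASCII bounds (space through tilde).
const ASCII_LOW = 32;
const ASCII_HIGH = 126;

// Hypothetical helper: code point bounds of a character range such as [А-Я].
function rangeBounds(first, last) {
  return [first.codePointAt(0), last.codePointAt(0)];
}

// Hypothetical helper: true if [low, high] shares any code point with printable ASCII.
function overlapsAscii([low, high]) {
  return low <= ASCII_HIGH && high >= ASCII_LOW;
}

const upper = rangeBounds('\u0410', '\u042F'); // Cyrillic capitals А-Я
const lower = rangeBounds('\u0430', '\u044F'); // Cyrillic small letters а-я

console.log(upper, overlapsAscii(upper)); // [ 1040, 1071 ] false
console.log(lower, overlapsAscii(lower)); // [ 1072, 1103 ] false
```

Neither range shares a code point with printable ASCII, so intersecting them with the default range leaves nothing to pick from. If widening to the full 0 to 65535 is more than needed, a narrower call along the lines of randexp.defaultRange.add(1040, 1103) should cover just these letters, assuming add(from, to) behaves as in the snippets above.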

@michaelficarra

michaelficarra commented Jan 17, 2020

I can understand the default range being used for any "open" sets, such as . and negated character classes, but "closed" sets should not be restricted by the default range in my opinion. I would consider the current behaviour a bug.

@1valdis

1valdis commented Mar 11, 2020

IMO this should be left as it is. People only need a minute of time to check the docs to understand what's going on.

It makes no sense if I explicitly specify a character range on Randexp and then see that my string does not follow the range I specified. A regular expression may come from anywhere; a Randexp instance is what I control and use, and I want my generated string to stay within its range.

fent added the feature label Apr 7, 2020
@fent
Owner

fent commented Apr 7, 2020

I'm leaning towards @michaelficarra's view: the default range should be respected for predefined sets, but custom non-negated sets like the one in the OP (e.g. [А-Я]{1,5}[а-я]{5,10}) could ignore the default range.

@1valdis

1valdis commented Apr 8, 2020

Then why is defaultRange even needed, if some constructs in the regexp could "override" it? As it stands, I'm sure that the generated string will have characters in the defined range only, no matter what's in the regexp. So for me, letting the regexp override the range feels more unintuitive than the OP's issue.

@michaelficarra

@1valdis That's ridiculous. If defaultRange is restricted to a through f, and I provide the regexp x, should it not produce anything? How about [x]? Or [xyz]? Or [x-z]? defaultRange should only affect "open" sets like . or [^a].
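The two positions in this thread can be compared with a toy model. This is a sketch under stated assumptions, not randexp's implementation: sets are plain arrays of characters, the `open` flag is a hypothetical marker for . and negated classes, and the function names `restrictAll` and `restrictOpenOnly` are made up for illustration.

```javascript
// Expand an inclusive character range into an array of characters.
function chars(first, last) {
  const out = [];
  for (let c = first.charCodeAt(0); c <= last.charCodeAt(0); c++) {
    out.push(String.fromCharCode(c));
  }
  return out;
}

// Policy 1 (behaviour described in this thread): every set is
// intersected with the default range.
function restrictAll(set, defaultRange) {
  return set.chars.filter((ch) => defaultRange.includes(ch));
}

// Policy 2 (the proposal): only "open" sets are intersected;
// "closed" sets are used exactly as written.
function restrictOpenOnly(set, defaultRange) {
  return set.open ? restrictAll(set, defaultRange) : set.chars;
}

const defaultRange = chars('a', 'f');

// Models [x-z], a closed set.
const closedSet = { chars: chars('x', 'z'), open: false };
// Models [^a], an open set, over a small a..z universe for illustration only.
const openSet = { chars: chars('b', 'z'), open: true };

console.log(restrictAll(closedSet, defaultRange));      // []
console.log(restrictOpenOnly(closedSet, defaultRange)); // [ 'x', 'y', 'z' ]
console.log(restrictOpenOnly(openSet, defaultRange));   // [ 'b', 'c', 'd', 'e', 'f' ]
```

Under the first policy [x-z] has no candidates left once the default range is a through f, which is the situation the questions above point at. Under the second policy the closed set is untouched, while the open set is still confined to the default range.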

@1valdis

1valdis commented Apr 9, 2020

@michaelficarra if it was restricted by someone to a-f, then it was done on purpose. For a-f the purpose is easy to guess: letters for hexadecimal numbers. If a y or z slips through, it's going to blow up. The regexp itself is not always something you write into code and control. The randexp.js instance, however, is.
I believe explaining that as "the default range of generated characters applies to the whole regexp" is also simpler and more consistent than "the default range applies only to 'open' sets and negated groups, but not to predefined ranges".
And I don't understand what the problem is with one line of code, randexp.defaultRange.add(0, 65535);, if you want Chinese, Russian and others.
