french tokenizer small change
Lluís Padró committed Mar 31, 2016
1 parent 79581b7 commit f2ba4d0
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion data/fr/tokenizer.dat
@@ -32,7 +32,7 @@ MAILS 0 {ALPHANUM}+([\._]{ALPHANUM}+)*@{ALPHANUM}+([\._]{ALPHANUM}+)*
LON_FR 1 (l'on){NOALPHANUM} CI
APOSTR_FR 1 ((qu|[cdlmtsjn])')({ALPHA}) CI
URLS1 0 ((mailto:|(news|http|https|ftp|ftps)://)[\w\.\-/]+|^(www(\.[\w\-/]+)+))
-URLS2 1 ([\w\.\-/]+\.(com|org|net))[\s]
+URLS2 1 ([\w\.\-/]+\.(com|org|net|fr))[\s]
KEEP_COMPOUNDS 0 {ALPHA}+(['_\-\+]{ALPHA}+)+
*ABREVIATIONS1 0 (({ALPHA}+\.)+)(?!\.\.)
WORD 0 {ALPHANUM}+[\+]*
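
The effect of the URLS2 change can be checked outside the tokenizer with a minimal Python sketch. It assumes that a rule is matched at the start of the remaining input and that the `1` after the rule name selects capture group 1 as the token; `first_token` is a hypothetical helper written for this illustration, not part of FreeLing, and Python's `re` engine stands in for the regex engine the tokenizer actually uses.

```python
import re

# URLS2 before and after this commit, copied from data/fr/tokenizer.dat.
OLD_URLS2 = re.compile(r"([\w\.\-/]+\.(com|org|net))[\s]")
NEW_URLS2 = re.compile(r"([\w\.\-/]+\.(com|org|net|fr))[\s]")


def first_token(pattern, text):
    """Hypothetical helper: match at the start of text and return group 1.

    Returns None when the rule does not apply at this position.
    """
    match = pattern.match(text)
    return match.group(1) if match else None


# A bare .fr host name is only recognised by the new rule.
print(first_token(OLD_URLS2, "lemonde.fr est un journal"))
print(first_token(NEW_URLS2, "lemonde.fr est un journal"))

# Existing .com/.org/.net behaviour is unchanged.
print(first_token(OLD_URLS2, "example.com est un site"))
print(first_token(NEW_URLS2, "example.com est un site"))
```

Because both versions end in `[\s]`, a host name at the very end of the input, with no trailing whitespace, is not matched by either of them in this sketch.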