<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Demystifying Regular Expressions</title>
<link rel="stylesheet" href="styles/style.css">
<link href="https://fonts.googleapis.com/css?family=Muli&display=swap" rel="stylesheet">
<link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/styles/vs2015.min.css">
<script src="//cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/highlight.min.js"></script>
</head>
<body>
<div>
<h1>Demystifying Regular Expressions</h1>
<p class="subtitle">Emily Rautenberg - UX Engineer, Comcast NBCUniversal</p>
<p class="subtitle">LibertyJS - Friday, October 25, 2019</p>
<p class="subtitle"><a href="https://github.com/erautenberg/libertyJS-regexp" target="_blank">https://github.com/erautenberg/libertyJS-regexp</a></p>
<img src="./qr_code.png" id="qrcode" alt="QR code linking to the GitHub repository" />
<h2>RegEx</h2>
<p>A Regular Expression, or RegEx, is a pattern that can be used to describe a specific subset of text. Regular expressions use various special character codes to denote sets of characters and shortcuts to generate a searchable pattern. This tutorial will specifically cover how to utilize regular expressions in JavaScript.</p>
<div class="content">
<section>
<h2>Helpful Resources</h2>
<ul>
<li><a href="https://www.debuggex.com/" target="_blank">https://www.debuggex.com/</a></li>
<li><a href="https://javascript.info/regexp-introduction" target="_blank">https://javascript.info/regexp-introduction</a></li>
<li><a href="https://eloquentjavascript.net/09_regexp.html" target="_blank">https://eloquentjavascript.net/09_regexp.html</a></li>
</ul>
</section>
<section>
<h2>Creating a Regular Expression</h2>
<p>A RegEx can be created with the JavaScript "RegExp" constructor, which takes the pattern as a string.</p>
<pre><code class="javascript" id='create-long'></code></pre>
<p>A RegEx can also be created as a literal, using '/' to start and end the pattern, similar to how strings use quotes or backticks.</p>
<pre><code class="javascript" id='create-short'></code></pre>
<p>Let's check out what this looks like on <a href="https://www.debuggex.com/r/WX61ZDcEIakJo2vE" target="_blank">Debuggex</a></p>
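<p>As a minimal sketch of the two forms side by side (the pattern "cat" and the sample strings are illustrative assumptions, not the workshop's own example):</p>

```javascript
// Two equivalent ways to build the same pattern.
const longForm = new RegExp("cat"); // constructor: pattern passed as a string
const shortForm = /cat/;            // literal: pattern between forward slashes

console.log(longForm.source === shortForm.source); // true
console.log(shortForm.test("concatenate"));        // true

// The constructor is handy when the pattern is only known at runtime.
const word = "dog";
const dynamic = new RegExp(word, "i"); // second argument holds the flags
console.log(dynamic.test("Hot DOG"));  // true
```

<p>The literal form is usually easier to read; the constructor is mainly useful when the pattern has to be assembled from a variable.</p>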
</section>
<section>
<h2>Methods of RegEx and Strings</h2>
<p>There are 6 JavaScript methods we will be using throughout this workshop:</p>
<pre><code class="javascript" id='methods-explanation'></code></pre>
<p>Let's test them out on the same regular expression and string as before.</p>
<pre><code class="javascript" id='methods-test'></code></pre>
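<p>For a quick side-by-side, here is a sketch of six commonly used RegExp and String methods. Which six the workshop covers is an assumption here, and the pattern and string are illustrative.</p>

```javascript
const pattern = /o/;        // illustrative pattern
const text = "Hello World"; // illustrative string

const tested = pattern.test(text);           // true: is there any match?
const found = pattern.exec(text);            // match details, or null if none
const matched = text.match(pattern);         // same details as exec() when there is no g flag
const index = text.search(pattern);          // 4: index of the first match
const replaced = text.replace(pattern, "0"); // "Hell0 World": first match only
const pieces = text.split(pattern);          // ["Hell", " W", "rld"]

console.log(tested, found[0], found.index, matched[0], index);
console.log(replaced, pieces);
```

<p>Note that <b>test</b> and <b>exec</b> are called on the regular expression, while <b>match</b>, <b>search</b>, <b>replace</b>, and <b>split</b> are called on the string.</p>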
</section>
<section>
<h2>Flags</h2>
<p>There are 5 JavaScript-specific flags we will be using in this workshop:</p>
<pre><code class="javascript" id="flags-explanation"></code></pre>
<p>Let's take the following regular expressions and strings.</p>
<pre><code class="javascript" id="flags-test"></code></pre>
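<p>As a rough illustration, the sketch below shows three widely used flags (g, i, and m). Whether these are among the five covered here is an assumption, and the strings are made up for the example.</p>

```javascript
const sentence = "Cat, cat, CAT";

const first = sentence.match(/cat/);     // no flags: first case-sensitive match only
const all = sentence.match(/cat/g);      // g: every case-sensitive match
const anyCase = sentence.match(/cat/gi); // g + i: every match, ignoring case

console.log(first[0]); // "cat"
console.log(all);      // ["cat"]
console.log(anyCase);  // ["Cat", "cat", "CAT"]

// m: ^ and $ also match at the start and end of each line.
const lines = "one\ntwo";
const withoutM = /^two$/.test(lines); // false
const withM = /^two$/m.test(lines);   // true
console.log(withoutM, withM);
```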
</section>
<section>
<h2>Anchors</h2>
<p>The special characters ^ and $ are called anchors, and match at the beginning and end of the text, respectively.</p>
<pre><code class="javascript" id="anchors"></code></pre>
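<p>As a separate, minimal sketch of the same idea (the phrase below is an illustrative assumption), each anchor pins the match to one end of the string:</p>

```javascript
const phrase = "first things first";

const atStart = /^first/.exec(phrase); // can only match at the very beginning
const atEnd = /first$/.exec(phrase);   // can only match at the very end

console.log(atStart.index); // 0
console.log(atEnd.index);   // 13

// Using both anchors requires the whole string to fit the pattern.
const wholePhrase = /^first$/.test(phrase);  // false
const wholeWord = /^first$/.test("first");   // true
console.log(wholePhrase, wholeWord);
```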
<p>We can see which instance of "Fourth" is selected using <a href="https://www.debuggex.com/r/ydLadC5rdlaKuVxL" target="_blank">Debuggex</a>.</p>
</section>
<section>
<h2>Sets and Ranges</h2>
<p>It's also possible to query from a set of characters.</p>
<pre><code class="javascript" id="sets-set"></code></pre>
<p>You can also create custom ranges by placing a hyphen between two characters inside the brackets.</p>
<pre><code class="javascript" id="sets-range"></code></pre>
<p><a href="https://www.debuggex.com/r/vP_gD0e1rHzNsyDF" target="_blank">Debuggex</a> also diagrams these sets for you.</p>
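<p>A short sketch of sets, ranges, and negated sets, using made-up patterns and strings:</p>

```javascript
// A set matches any ONE of the characters inside the brackets.
const grayOrGrey = /gr[ae]y/;
console.log(grayOrGrey.test("gray")); // true
console.log(grayOrGrey.test("grey")); // true
console.log(grayOrGrey.test("groy")); // false

// Ranges can be combined inside a single set.
const hexChars = "x9zB".match(/[0-9a-fA-F]/g);
console.log(hexChars); // ["9", "B"]

// A leading ^ inside the brackets negates the set.
const nonDigits = "a1b2".match(/[^0-9]/g);
console.log(nonDigits); // ["a", "b"]
```

<p>Note that ^ means "not" only as the first character inside brackets; outside of brackets it is the start anchor.</p>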
</section>
<section>
<h2>Character Sets</h2>
<p>There are many different character codes that can be used as shorthands for groupings of characters.</p>
<pre><code class="javascript" id="characters-explanation"></code></pre>
<p>Let's take the following regular expressions and strings.</p>
<pre><code class="javascript" id="characters-test"></code></pre>
<p>There is a "cheatsheet" of all these character classes on <a href="https://www.debuggex.com/#cheatsheet" target="_blank">Debuggex</a> as well.</p>
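<p>As an illustration, here is a sketch of a few common shorthand classes (\d, \w, \s, \W, and the dot) applied to a made-up string:</p>

```javascript
const sample = "Room 42, Floor 7";

const digits = sample.match(/\d/g);    // digits: ["4", "2", "7"]
const wordChars = sample.match(/\w/g); // letters, digits, and underscore
const spaces = sample.match(/\s/g);    // whitespace
const nonWord = sample.match(/\W/g);   // uppercase letter = inverse of \w

console.log(digits);           // ["4", "2", "7"]
console.log(wordChars.length); // 12
console.log(spaces.length);    // 3
console.log(nonWord.length);   // 4 (three spaces and one comma)

// The dot matches any single character except a line break.
const dotLetter = /a.c/.test("abc");   // true
const dotNewline = /a.c/.test("a\nc"); // false
console.log(dotLetter, dotNewline);
```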
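<p>There is one more special sequence called a "word boundary" (\b), which also comes with an inverse (\B). Unlike the character classes above, it matches a position rather than a character: the spot where a word character sits next to a non-word character or the edge of the string. This is used to specify there should be a separation between your match and the rest of the string.</p>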
<pre><code class="javascript" id="characters-boundary"></code></pre>
<pre><code class="javascript" id="characters-boundary-test"></code></pre>
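<p>A small sketch of \b and \B, assuming an illustrative string in which "cat" appears both as a whole word and inside a longer word:</p>

```javascript
const animals = "cat concatenate";

// \b on both sides: "cat" must stand alone as a word.
const wholeWord = /\bcat\b/.exec(animals);
console.log(wholeWord.index); // 0

// \B on both sides: "cat" must be surrounded by other word characters.
const insideWord = /\Bcat\B/.exec(animals);
console.log(insideWord.index); // 7

// Without boundaries, the global search finds both occurrences.
const everyCat = animals.match(/cat/g);
console.log(everyCat.length); // 2
```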
</section>
<section>
<h2>Escaping Special Characters</h2>
<p>As you have probably noticed by now, RegEx uses special characters to declare specific behavior. If the pattern you are looking for actually contains one of these special characters, you will have to escape it with a backslash, '\', much like in HTML, JavaScript, etc.</p>
<pre><code class="javascript" id="escaping"></code></pre>
<p>It might be difficult to see, but the way <a href="https://www.debuggex.com/r/SWFr20JUWzdZiaFq" target="_blank">Debuggex</a> differentiates between the RegEx special characters and the actual character is with blue versus black text.</p>
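<p>To make the difference concrete, here is a sketch using a made-up price string, where both the dollar sign and the dot are special characters:</p>

```javascript
const price = "Total: $5.00";

// Unescaped, the dot matches ANY character, so this is too permissive.
const looseDot = /5.00/.test("5x00");   // true
// Escaped, the dot only matches a literal period.
const strictDot = /5\.00/.test("5x00"); // false

// The dollar sign must also be escaped, or it acts as the end anchor.
const literalPrice = /\$5\.00/.test(price); // true

// Inside a string passed to RegExp, the backslash itself needs escaping.
const fromString = new RegExp("\\$5\\.00").test(price); // true

console.log(looseDot, strictDot, literalPrice, fromString);
```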
</section>
<section>
<h2>Lengths</h2>
<p>It's also possible to set how many of a certain character code are expected.</p>
<pre><code class="javascript" id="lengths"></code></pre>
<p><a href="https://www.debuggex.com/r/ni-meD3Oxj6pfujj" target="_blank">Debuggex</a> adds some nice loops to its flow chart when you add lengths.</p>
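<p>A sketch of the three length forms (exact, bounded, and open-ended), using \d with anchors so the whole string has to fit; the inputs are illustrative:</p>

```javascript
const exactlyThree = /^\d{3}$/;  // exactly 3 digits
const twoToFour = /^\d{2,4}$/;   // between 2 and 4 digits
const atLeastTwo = /^\d{2,}$/;   // 2 or more digits

console.log(exactlyThree.test("123"));  // true
console.log(exactlyThree.test("1234")); // false

console.log(twoToFour.test("123"));     // true
console.log(twoToFour.test("12345"));   // false

console.log(atLeastTwo.test("1"));      // false
console.log(atLeastTwo.test("123456")); // true
```

<p>Without the ^ and $ anchors, these patterns would also succeed on longer strings, because the match only needs to be found somewhere inside the text.</p>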
</section>
<section>
<h2>Quantifiers</h2>
<p>There are also other special characters used to quantify occurrences without spelling out a range.</p>
<pre><code class="javascript" id="quantifiers-explanation"></code></pre>
<p>Consider the following strings and regular expressions.</p>
<pre><code class="javascript" id="quantifiers-test"></code></pre>
<p>Again, <a href="https://www.debuggex.com/r/uAYBBqscc7ei1-xe" target="_blank">Debuggex</a> provides some helpful loops to its diagram when you use these quantifiers.</p>
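<p>As a sketch, assuming the quantifiers in question are the usual ?, *, and + (the words below are illustrative):</p>

```javascript
// ? : zero or one of the preceding item
const optionalU = /colou?r/;
console.log(optionalU.test("color"));  // true
console.log(optionalU.test("colour")); // true

// * : zero or more of the preceding item
const zeroOrMore = /go*gle/;
console.log(zeroOrMore.test("ggle"));    // true
console.log(zeroOrMore.test("gooogle")); // true

// + : one or more of the preceding item
const oneOrMore = /go+gle/;
console.log(oneOrMore.test("ggle"));   // false
console.log(oneOrMore.test("google")); // true
```

<p>These are shorthands for lengths: ? behaves like {0,1}, * like {0,}, and + like {1,}.</p>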
</section>
<section>
<h2>Greedy vs. Lazy Quantifiers</h2>
<p>As you may have noticed in the previous example, the regular expression found matches for each quantifier. The results, however, were different for each quantifier. This is because quantifiers are "greedy." A greedy quantifier starts matching at a given position and consumes as many characters as it can. If the rest of the pattern then fails to match, it backtracks one position at a time until the whole pattern matches or no match is possible.</p>
<p>Consider this "greedy" regular expression against this string.</p>
<pre><code class="javascript" id="greedy"></code></pre>
<p>Let's see what gets highlighted in <a href="https://www.debuggex.com/r/qcaNgVXJjdQ-LCir" target="_blank">Debuggex</a>.</p>
<p>In order to avoid this behavior, you can make the quantifier "lazy" by adding a ? after it, which tells the expression to repeat the minimal number of times. A lazy quantifier tries the fewest repetitions first, and only if the rest of the pattern fails does it take one more repetition and try matching again.</p>
<pre><code class="javascript" id="lazy"></code></pre>
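<p>To see the two behaviors next to each other, here is a sketch on a made-up string containing two quoted words:</p>

```javascript
const quoted = 'say "hi" and "bye"';

// Greedy: .+ runs to the end of the string, then backtracks to the LAST quote.
const greedy = quoted.match(/".+"/)[0];
console.log(greedy); // "hi" and "bye"

// Lazy: .+? grows one character at a time and stops at the FIRST closing quote.
const lazy = quoted.match(/".+?"/)[0];
console.log(lazy); // "hi"
```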
<p>Note: There are also "positive lookaheads" and "negative lookaheads" that can be considered, but are outside of the scope of this tutorial. <a href="https://www.regular-expressions.info/lookaround.html" target="_blank">https://www.regular-expressions.info/lookaround.html</a></p>
</section>
<section>
<h2>Groups</h2>
<p>When trying to replace data using RegEx, it is possible to store the matched data in order to use it either later in the expression, or later in a string replace method, by using capturing groups.</p>
<p>This is my <b>favorite</b> part of RegEx!</p>
<pre><code class="javascript" id="groups-test"></code></pre>
<p>Of course, <a href="https://www.debuggex.com/r/ABnOtr-vYcfptlkW" target="_blank">Debuggex</a> gives you information about these groups.</p>
<p>If you use the <b>exec</b> function, you can view all of the matching groups, where 0 is the entire match and 1-N are the captured groups.</p>
<p>In a replacement string, the captured groups can be referred to as $1, $2, etc., where the number corresponds to the order in which the groups appear in the pattern. The entire match can be referenced with <b>$&amp;</b>.</p>
<pre><code class="javascript" id="groups-replace"></code></pre>
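<p>A compact sketch of capturing groups in <b>exec</b>, in <b>replace</b>, and as a backreference inside the pattern itself. The date string and the target format are illustrative assumptions.</p>

```javascript
const date = "2019-10-25";
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// exec: index 0 is the whole match, 1..N are the groups in order.
const parts = datePattern.exec(date);
console.log(parts[0], parts[1], parts[2], parts[3]); // 2019-10-25 2019 10 25

// replace: $1, $2, $3 refer to the groups; $& refers to the whole match.
const reordered = date.replace(datePattern, "$2/$3/$1");
console.log(reordered); // 10/25/2019

const bracketed = date.replace(/\d{4}/, "[$&]");
console.log(bracketed); // [2019]-10-25

// Inside the pattern, \1 refers back to whatever group 1 captured.
const doubledLetter = /(\w)\1/.test("hello"); // true, because of "ll"
console.log(doubledLetter);
```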
<p>Alternatively, you can also create "non-capturing" groups.</p>
<pre><code class="javascript" id="groups-noncapture"></code></pre>
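<p>As a sketch of the difference (the "hahaha!" string is made up), a non-capturing group starts with ?: and still groups the pattern, but does not take up a numbered slot:</p>

```javascript
// Capturing: the repeated group occupies slot 1, the "!" lands in slot 2.
const capturing = /(ha)+(!)/.exec("hahaha!");
console.log(capturing.length); // 3
console.log(capturing[1]);     // "ha"
console.log(capturing[2]);     // "!"

// Non-capturing: (?:ha) is still repeated by +, but is not stored,
// so the "!" moves up to slot 1.
const nonCapturing = /(?:ha)+(!)/.exec("hahaha!");
console.log(nonCapturing.length); // 2
console.log(nonCapturing[1]);     // "!"
```

<p>This keeps the $1, $2 numbering focused on the pieces you actually want to reuse.</p>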
</section>
<section>
<h2>Or (|)</h2>
<p>Groups can also contain logical ORs (|).</p>
<pre><code class="javascript" id="ors-explanation"></code></pre>
<p>When used outside of a group, keep in mind that the | treats everything before it as one alternative and everything after it as the other.</p>
<pre><code class="javascript" id="ors-test"></code></pre>
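<p>A separate sketch of how the scope of | changes with and without a group, using illustrative words:</p>

```javascript
// Inside a group: the choice is only between "cat" and "dog".
const grouped = /^(cat|dog)s$/;
console.log(grouped.test("cats"));     // true
console.log(grouped.test("dogs"));     // true
console.log(grouped.test("catapult")); // false

// Outside a group: the choice is between "^cat" and "dogs$".
const ungrouped = /^cat|dogs$/;
console.log(ungrouped.test("catapult")); // true (starts with "cat")
console.log(ungrouped.test("hotdogs"));  // true (ends with "dogs")
console.log(ungrouped.test("dog"));      // false
```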
<p>The <a href="https://www.debuggex.com/r/TRr8l0_GYh0LGPmp" target="_blank">Debuggex</a> diagram illustrates the difference between these really well.</p>
</section>
<section>
<h2>Exercise Questions</h2>
<h3>Prompt 1</h3>
<p>Write a regular expression to confirm a user has entered a valid username, given the following requirements:</p>
<ul>
<li>3-16 characters in length</li>
<li>case does not matter</li>
<li>contains no special characters other than _</li>
</ul>
<pre><code class="javascript" id="exercises-1"></code></pre>
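<p>One possible answer, offered as a sketch rather than the definitive solution. It assumes digits are allowed alongside letters and the underscore, which the requirements do not state explicitly.</p>

```javascript
// ^ and $ force the whole input to fit; {3,16} sets the length;
// the i flag makes case irrelevant.
const username = /^[a-z0-9_]{3,16}$/i;

console.log(username.test("emily_r"));      // true
console.log(username.test("Emily_R42"));    // true
console.log(username.test("ab"));           // false (too short)
console.log(username.test("bad-name"));     // false (hyphen not allowed)
console.log(username.test("a".repeat(17))); // false (too long)
```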
<h3>Prompt 2</h3>
<p>Extract only the full names from a string which have titles.</p>
<ul>
<li>Mr.</li>
<li>Mrs.</li>
<li>Ms.</li>
<li>Dr.</li>
</ul>
<pre><code class="javascript" id="exercises-2"></code></pre>
<h3>Prompt 3</h3>
<p>Now, using the answer from Prompt 2, let's rearrange these names into the format "Last, First Title".</p>
<pre><code class="javascript" id="exercises-3"></code></pre>
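<p>One way Prompts 2 and 3 could fit together, as a sketch. The roster string is hypothetical, and the pattern assumes each full name is exactly one first name and one last name made of word characters.</p>

```javascript
// Hypothetical input; "Jane Doe" has no title and should be skipped.
const roster = "Mr. John Smith, Jane Doe, Dr. Alice Jones, Mrs. Mary Brown";

// Group 1: title, group 2: first name, group 3: last name.
const titled = /(Mr|Mrs|Ms|Dr)\. (\w+) (\w+)/g;

// Prompt 2: pull out only the names that carry a title.
const titledNames = roster.match(titled);
console.log(titledNames); // ["Mr. John Smith", "Dr. Alice Jones", "Mrs. Mary Brown"]

// Prompt 3: reorder each name using the captured groups.
const rearranged = titledNames.map((name) => name.replace(titled, "$3, $2 $1."));
console.log(rearranged); // ["Smith, John Mr.", "Jones, Alice Dr.", "Brown, Mary Mrs."]
```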
<h3>Prompt 4</h3>
<p>Think of a voice assistant which is able to interpret various utterances all leading to the same result. How could you account for someone asking Alexa or Google to search for something on YouTube?</p>
<p>Chain together three different regular expressions to account for the following utterances:</p>
<ul>
<li>YouTube [QUERY]</li>
<li>Search YouTube for [QUERY]</li>
<li>Search/Find (for) [QUERY] on YouTube</li>
</ul>
<pre><code class="javascript" id="exercises-4"></code></pre>
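<p>A possible approach, sketched under a few assumptions: the helper name extractQuery is hypothetical, matching ignores case, and the patterns are tried in order with the first match winning.</p>

```javascript
// One pattern per utterance shape; group 1 always holds the query.
const utterancePatterns = [
  /^search youtube for (.+)$/i,                  // Search YouTube for [QUERY]
  /^(?:search|find)(?: for)? (.+) on youtube$/i, // Search/Find (for) [QUERY] on YouTube
  /^youtube (.+)$/i,                             // YouTube [QUERY]
];

// Hypothetical helper: returns the query, or null if nothing matches.
function extractQuery(utterance) {
  for (const pattern of utterancePatterns) {
    const result = pattern.exec(utterance.trim());
    if (result) return result[1];
  }
  return null;
}

console.log(extractQuery("YouTube funny cats"));         // "funny cats"
console.log(extractQuery("Search YouTube for recipes")); // "recipes"
console.log(extractQuery("Find for jazz on YouTube"));   // "jazz"
console.log(extractQuery("Search guitar on YouTube"));   // "guitar"
console.log(extractQuery("Play some music"));            // null
```

<p>Keeping three small patterns in a list tends to be easier to read and extend than one large expression covering every phrasing.</p>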
</section>
<section>
<h2>Thank you!</h2>
</section>
</div>
</div>
<script>hljs.initHighlightingOnLoad();</script>
<script src="main.js"></script>
</body>
</html>