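# JavaScript Regular Expressions

# Preface

MDN covers regular expressions in far more detail. This article is only a record of, and some digressions from, my own study of regular expressions; related content will be filled in gradually.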

# Methods

# test

The test method checks whether a string matches a regular expression and returns a boolean.

/hello/.test('hello_world') // true

Now consider a special case.

// Without the g flag
const reg = /hello/
reg.test('hello_world') // true
reg.test('say_hello') // true

// With the g flag
const globalReg = /hello/g
globalReg.test('hello_world') // true
globalReg.test('say_hello') // false
globalReg.test('hello_regexp') // true

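Every regular expression has a lastIndex property that specifies the index at which the next match starts. It is readable and writable, and it defaults to 0 (matching starts at the beginning of the string). Most of the time it has no effect: lastIndex is only used when the regular expression has the global flag g (or the sticky flag y) set.

The first regular expression above does not have global matching enabled, so lastIndex never takes effect: every test call starts matching from the beginning of the string, and each call returns true. With global matching enabled, the first test call succeeds and returns true, and lastIndex is updated to 5 in preparation for the next match. The second test call then starts at lastIndex 5 (that is, matching starts from ello), so it fails and returns false. After a failed match lastIndex is reset to 0, so the third test call returns true again.

So, to get consistent output with global matching enabled, you can set lastIndex yourself (4 means matching starts at hello).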

const reg = /hello/g
reg.test('hello_world') // true
reg.lastIndex = 4
reg.test('say_hello') // true
reg.lastIndex // 9
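
To make the lastIndex bookkeeping described above easier to follow, here is a minimal sketch that records lastIndex after every call. The helper name traceTest is made up for this illustration and is not part of any API.

```javascript
// traceTest is a hypothetical helper used only for this illustration.
// It runs reg.test on each input and records the result together with
// the value of lastIndex right after the call.
function traceTest(reg, inputs) {
  return inputs.map((input) => {
    const matched = reg.test(input)
    return { input, matched, lastIndex: reg.lastIndex }
  })
}

const trace = traceTest(/hello/g, ['hello_world', 'say_hello', 'hello_regexp'])
console.log(trace)
// [
//   { input: 'hello_world', matched: true, lastIndex: 5 },
//   { input: 'say_hello', matched: false, lastIndex: 0 },
//   { input: 'hello_regexp', matched: true, lastIndex: 5 }
// ]
```

If one regular expression has to be reused across unrelated strings, setting lastIndex to 0 before each call, or dropping the g flag, is probably the simplest way to avoid this statefulness.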

# exec

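The exec method performs a single search for a match in a string and returns a result array or null.

The array contains [0] (the full matched string), [1]...[n] (the captured groups), index (the index of the match in the original string), input (the original string), and groups (the named capture groups introduced in ES2018, or undefined when the pattern has none).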

const reg = /h(e)ll(o)/i
const str = 'Hello foo, hEllo bar, heLlo baz'
reg.exec(str) // ["Hello", "e", "o", index: 0, input: "Hello foo, hEllo bar, heLlo baz", groups: undefined]

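Note that exec behaves like test here: lastIndex only takes effect once global matching g (or sticky matching y) is enabled. With g enabled, exec can retrieve multiple matches from a single string by being called repeatedly. Also like test, when exec fails to match, lastIndex is reset to 0.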

const reg = /h(e)ll(o)/ig
const str = 'Hello foo, hEllo bar, heLlo baz'
let res = null

while (res = reg.exec(str)) {
  console.log(res)
  // ["Hello", "e", "o", index: 0, input: "Hello foo, hEllo bar, heLlo baz", groups: undefined]
  // ["hEllo", "E", "o", index: 11, input: "Hello foo, hEllo bar, heLlo baz", groups: undefined]
  // ["heLlo", "e", "o", index: 22, input: "Hello foo, hEllo bar, heLlo baz", groups: undefined]
}
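
The loop above can be wrapped into a small reusable function. This is only a sketch: collectMatches and the shape of the objects it returns are assumptions made for this illustration, not an existing API.

```javascript
// collectMatches is a hypothetical helper for this illustration.
// It requires a regular expression with the g flag; without it, lastIndex
// never advances and the loop below would never end.
function collectMatches(reg, str) {
  if (!reg.global) {
    throw new TypeError('collectMatches expects a regular expression with the g flag')
  }
  reg.lastIndex = 0 // start from the beginning regardless of earlier use
  const results = []
  let res = null
  while ((res = reg.exec(str)) !== null) {
    results.push({ match: res[0], groups: res.slice(1), index: res.index })
    // Guard against zero-length matches, which would not advance lastIndex.
    if (res[0] === '') reg.lastIndex++
  }
  return results
}

const found = collectMatches(/h(e)ll(o)/ig, 'Hello foo, hEllo bar, heLlo baz')
console.log(found.map((m) => m.match)) // ['Hello', 'hEllo', 'heLlo']
console.log(found.map((m) => m.index)) // [0, 11, 22]
```

Resetting lastIndex at the start is a design choice: it keeps the result independent of whatever the regular expression was used for before.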

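# String methods

Since ES6, four methods on String.prototype (search, split, replace, and match) internally call methods defined on RegExp.prototype. For example, String.prototype.search calls RegExp.prototype[Symbol.search].

In addition, if the argument is a plain object rather than a regular expression, it is implicitly converted to a string.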

const str = 'foo, bar, baz'
const reg = {
  toString() {
    return 'baz'
  }
}

str.search(reg) // 10
str.split(reg) // ["foo, bar, ", ""]
str.replace(reg, 'yes') // foo, bar, yes
str.match(reg) // ["baz", index: 10, input: "foo, bar, baz", groups: undefined]
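
The delegation can be observed with any object that defines a Symbol.search method, because String.prototype.search calls that method when it exists. The following is a minimal sketch; the object name firstDigit is made up for this illustration.

```javascript
// firstDigit is a hypothetical object for this illustration. It is not a
// regular expression, but it defines Symbol.search, so
// String.prototype.search delegates to it.
const firstDigit = {
  [Symbol.search](str) {
    for (let i = 0; i < str.length; i++) {
      if (str[i] >= '0' && str[i] <= '9') return i
    }
    return -1
  }
}

console.log('foo, bar, 42'.search(firstDigit)) // 10
console.log('foo, bar, baz'.search(firstDigit)) // -1

// Calling the RegExp.prototype method directly gives the same result
// as going through the string method.
const direct = RegExp.prototype[Symbol.search].call(/baz/, 'foo, bar, baz')
console.log(direct) // 10
```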

# search

The search method returns the index of the first match of a regular expression in a string, or -1 if there is no match.

'hello world'.search(/world/) // 6
'hello world'.search(/say/) // -1
'hello world'.search('llo') // 2

# split

The split method splits a string into an array.

The first argument is a string or a regular expression, and the second argument limits the length of the resulting array.

'say hello world'.split(/[er]/) // ["say h", "llo wo", "ld"]
'say hello world'.split(' ') // ["say", "hello", "world"]
'say hello world'.split(' ', 1) // ["say"]
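
A regular expression separator is mostly useful when the separator itself varies. As a small sketch (the sample string messy is made up for this illustration), compare splitting on a single space with splitting on any run of whitespace:

```javascript
const messy = 'say   hello \t world'

// A string separator splits on every single space, leaving empty pieces.
const bySpace = messy.split(' ')
console.log(bySpace) // ['say', '', '', 'hello', '\t', 'world']

// A regular expression can treat any run of whitespace as one separator.
const byWhitespace = messy.split(/\s+/)
console.log(byWhitespace) // ['say', 'hello', 'world']

// The limit argument works the same way with a regular expression.
const firstTwo = messy.split(/\s+/, 2)
console.log(firstTwo) // ['say', 'hello']
```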

# replace

The replace method substitutes part of a string with other text. The original string is left unchanged and a new string is returned.

'hello world'.replace('world', 'regexp') // hello regexp
'hello world'.replace(/[er]/g, 'm') // hmllo womld

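In addition, the second argument can contain some special replacement patterns.

$&: the matched substring
$`: the text before the match
$': the text after the match
$n: the content of the nth capture group
$$: a literal $
$<name>: the content of the named capture group name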

'hello_world'.replace(/world/, '$&') // hello_world
'hello_world'.replace(/world/, '$`') // hello_hello_
'hello_world'.replace(/world/, "$'") // hello_
'hello_world'.replace(/(world)/, '$1') // hello_world
'hello_world'.replace(/world/, '$$') // hello_$
'hello_world'.replace(/(?<key>world)/, '$<key>') // hello_world
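
The replacement patterns become more useful when they rearrange the captured text instead of echoing it back. The following sketch reorders a date; the sample values and variable names are assumptions made for this illustration.

```javascript
const date = '2024-01-31'

// Numbered groups: $1 is the year, $2 the month, $3 the day.
const byNumber = date.replace(/(\d{4})-(\d{2})-(\d{2})/, '$3/$2/$1')
console.log(byNumber) // 31/01/2024

// Named groups (ES2018): the same rewrite using $<name>.
const byName = date.replace(
  /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
  '$<day>/$<month>/$<year>'
)
console.log(byName) // 31/01/2024

// $` and $' insert the text before and after the match.
const context = 'hello_world'.replace(/_/, "[$`|$']")
console.log(context) // hello[hello|world]world
```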

# match

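The match method returns the result of matching a string against a regular expression.

Note that without global matching g, match returns the first match together with its capture groups (equivalent to exec). With global matching enabled, it returns all of the full matches only, without capture groups.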

const reg = /h(e)ll(o)/i
const rege = /h(e)ll(o)/ig
const str = 'Hello foo, hEllo bar, heLlo baz'
str.match(reg) // ["Hello", "e", "o", index: 0, input: "Hello foo, hEllo bar, heLlo baz", groups: undefined]
str.match(rege) // ["Hello", "hEllo", "heLlo"]
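
The two modes, and the case where nothing matches, can be compared side by side. This is a small sketch with variable names chosen for this illustration only.

```javascript
const text = 'Hello foo, hEllo bar, heLlo baz'

// Without g: match returns the same shape as exec.
const viaMatch = text.match(/h(e)ll(o)/i)
const viaExec = /h(e)ll(o)/i.exec(text)
console.log(viaMatch.index === viaExec.index) // true
console.log(viaMatch[1], viaMatch[2]) // e o

// With g: only the full matches are returned, capture groups are dropped.
const all = text.match(/h(e)ll(o)/ig)
console.log(all) // ['Hello', 'hEllo', 'heLlo']

// No match: null in both modes, so guard before using the result.
const none = text.match(/xyz/g)
console.log(none) // null
const count = (text.match(/xyz/g) || []).length
console.log(count) // 0
```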

# Flags

# i

Case-insensitive matching.

/hello/.test('Hello world') // false
/hello/i.test('Hello world') // true

The ignoreCase property indicates whether the i flag is set.

const reg = /a/i
reg.ignoreCase // true

# g

Global matching.

'hello'.replace(/l/, 'm') // hemlo
'hello'.replace(/l/g, 'm') // hemmo

The global property indicates whether the g flag is set.

const reg = /a/g
reg.global // true

# m

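Multiline matching: ^ and $ match at the start and end of each line, not only at the start and end of the whole string.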

const str = 'hello \nworld'

/^world/.test(str) // false
/^world/m.test(str) // true

Here world sits at the start of the second line, so it is matched once multiline matching m is specified.
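
Combined with g, the m flag makes it possible to pick out something from every line. A short sketch, with the sample string lines made up for this illustration:

```javascript
const lines = 'hello\nworld\nhello again'

// Without m, ^ only matches at the very start of the string.
console.log(lines.match(/^\w+/g)) // ['hello']

// With m, ^ also matches right after each line break.
console.log(lines.match(/^\w+/gm)) // ['hello', 'world', 'hello']

// $ behaves the same way for line ends.
console.log(lines.match(/\w+$/gm)) // ['hello', 'world', 'again']
```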

The multiline property indicates whether the m flag is set.

const reg = /a/m
reg.multiline // true

# s

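By default, . in a regular expression does not match line terminators such as \n. ES2018 introduced the s flag (dotAll), which lets . match any single character.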

const str = 'hello\nworld'

/hello.world/.test(str) // false
/hello.world/s.test(str) // true
/hello[^]world/.test(str) // true
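
Besides \n, the dot also skips the other line terminator characters unless s is set. The following sketch checks all four; the array name terminators is chosen for this illustration.

```javascript
// The four line terminator characters in JavaScript source and strings.
const terminators = ['\n', '\r', '\u2028', '\u2029']

// Without s, the dot refuses to match any of them.
const withoutS = terminators.map((ch) => /a.b/.test('a' + ch + 'b'))
console.log(withoutS) // [false, false, false, false]

// With s, the dot matches every one of them.
const withS = terminators.map((ch) => /a.b/s.test('a' + ch + 'b'))
console.log(withS) // [true, true, true, true]
```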

The dotAll property indicates whether the s flag is set.

const reg = /a/s
reg.dotAll // true

# u

Unicode mode, used to correctly handle characters whose code point is greater than 0xFFFF.

/^.$/.test('𠮷') // false
/^.$/u.test('𠮷') // true
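
The reason for the difference is that such a character is stored as two UTF-16 code units. A brief sketch of what the regular expression sees with and without u (the variable name ch is chosen for this illustration):

```javascript
const ch = '𠮷' // U+20BB7, stored as two UTF-16 code units

console.log(ch.length) // 2

// Without u, each code unit counts as one character for the dot.
console.log(/^.{2}$/.test(ch)) // true

// With u, the pair is treated as a single character.
console.log(/^.$/u.test(ch)) // true

// \u{...} code point escapes are only recognised with the u flag.
console.log(/\u{20BB7}/u.test(ch)) // true
console.log(/\u{61}/u.test('a')) // true
console.log(/\u{61}/.test('a')) // false: without u this means the letter u repeated 61 times
```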

The unicode property indicates whether the u flag is set.

const reg = /a/u
reg.unicode // true

# y

Sticky mode: each match must start exactly at the beginning of the remaining string, that is, at lastIndex.

const str = 'aaa_aa_a'
const g = /a+/g
const y = /a+/y

g.exec(str) // ["aaa"]
g.exec(str) // ["aa"]

y.exec(str) // ["aaa"]
y.exec(str) // null

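In the sticky example above, lastIndex is 3 after the first exec call. For the second exec call the remaining string is _aa_a, and sticky mode requires the match to begin at the very start of that remaining string (as if the pattern were anchored at lastIndex, rather than free to search ahead as with g). The match therefore fails, null is returned, and lastIndex is reset to 0.

If the pattern also consumes the separator, consecutive sticky matches succeed.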

const str = 'aaa_aa_a'
const y = /a+_/y

y.exec(str) // ["aaa_"]
y.exec(str) // ["aa_"]
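
Because a sticky match can never skip over input, the y flag is a natural fit for scanning a string token by token. The following is a minimal sketch under stated assumptions: tokenize, the token types, and the tiny arithmetic grammar are all made up for this illustration.

```javascript
// tokenize is a hypothetical helper for this illustration. Each pattern
// uses the y flag, so it can only match exactly at lastIndex and never
// skips ahead over unrecognised input.
function tokenize(input) {
  const patterns = [
    { type: 'number', reg: /\d+/y },
    { type: 'operator', reg: /[+\-*\/]/y },
    { type: 'space', reg: /\s+/y }
  ]
  const tokens = []
  let pos = 0
  while (pos < input.length) {
    let matched = false
    for (const { type, reg } of patterns) {
      reg.lastIndex = pos // try this pattern at the current position only
      const res = reg.exec(input)
      if (res !== null) {
        if (type !== 'space') tokens.push({ type, value: res[0] })
        pos = reg.lastIndex
        matched = true
        break
      }
    }
    if (!matched) {
      throw new SyntaxError('Unexpected character at index ' + pos)
    }
  }
  return tokens
}

const tokens = tokenize('12 + 34*5')
console.log(tokens.map((t) => t.value)) // ['12', '+', '34', '*', '5']
```

With g instead of y, each pattern would search ahead and could silently jump over characters that belong to no token, so the error branch would not reliably fire.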

The sticky property indicates whether the y flag is set.

const reg = /a/y
reg.sticky // true
