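1. Introduction
Regular expressions come up constantly in everyday programming. They process text by matching it against rules built from ordinary characters and a small set of special characters, and C# supports them through the System.Text.RegularExpressions namespace. This article walks through the regular expression metacharacters available in C#.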
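2. Overview of metacharacters
Metacharacters are characters that carry a special meaning inside a regular expression. The most commonly used ones are listed below. The snippets assume that using System; and using System.Text.RegularExpressions; are in scope.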
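2.1. .
The dot matches any single character except a newline (\n). For example, the pattern "he..o" matches "hello", "he12o", and "he~!o". Each dot stands for exactly one character, so a string such as "he123o", with three characters between "he" and "o", does not match.
string pattern = "he..o";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("hello");
Match match2 = regex.Match("he12o");   // exactly two characters between "he" and "o"
Match match3 = regex.Match("he~!o");   // punctuation counts as well
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output: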
match1: True
match2: True
match3: True
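The newline exception can be seen directly. The following is a minimal sketch, not part of the original examples: the class name DotDemo, the helper DotMatches, and the pattern "a.b" are made up for illustration. It uses RegexOptions.Singleline, the .NET option that lets the dot match \n as well.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotDemo
{
    // Hypothetical helper: true when "a.b" matches somewhere in the input.
    public static bool DotMatches(string input, RegexOptions options)
    {
        return Regex.IsMatch(input, "a.b", options);
    }

    public static void Main()
    {
        Console.WriteLine(DotMatches("a-b", RegexOptions.None));        // True
        Console.WriteLine(DotMatches("a\nb", RegexOptions.None));       // False: dot skips \n
        Console.WriteLine(DotMatches("a\nb", RegexOptions.Singleline)); // True
    }
}
```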
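2.2. \w
\w matches a single word character: a letter, a digit, or the underscore. In .NET this includes Unicode letters and digits by default; it is equivalent to [a-zA-Z0-9_] only when RegexOptions.ECMAScript is set.
string pattern = @"\w";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("A");   // letter
Match match2 = regex.Match("3");   // digit
Match match3 = regex.Match("_");   // underscore
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output: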
match1: True
match2: True
match3: True
2.3. \d
\d matches a single decimal digit. By default .NET accepts any Unicode decimal digit; \d is equivalent to [0-9] only when RegexOptions.ECMAScript is set.
string pattern = @"\d";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("1");
Match match2 = regex.Match("5");
Match match3 = regex.Match("9");
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output:
match1: True
match2: True
match3: True
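The difference between the default Unicode behavior of \w and \d and the ASCII-only behavior under RegexOptions.ECMAScript can be checked with a short sketch. The class ClassScopeDemo, the helper IsSingle, and the two sample characters are chosen here for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ClassScopeDemo
{
    // Hypothetical helper: true when the whole input is one character
    // matched by the given character class under the given options.
    public static bool IsSingle(string input, string charClass, RegexOptions options)
    {
        return Regex.IsMatch(input, "^" + charClass + "$", options);
    }

    public static void Main()
    {
        string arabicIndicThree = "\u0663"; // Arabic-Indic digit three
        string eAcute = "\u00E9";           // Latin small letter e with acute

        Console.WriteLine(IsSingle(arabicIndicThree, @"\d", RegexOptions.None));       // True
        Console.WriteLine(IsSingle(arabicIndicThree, @"\d", RegexOptions.ECMAScript)); // False
        Console.WriteLine(IsSingle(eAcute, @"\w", RegexOptions.None));                 // True
        Console.WriteLine(IsSingle(eAcute, @"\w", RegexOptions.ECMAScript));           // False
    }
}
```

When the input should be restricted to ASCII digits, writing [0-9] explicitly is a simple alternative to setting the option.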
2.4. \s
\s matches a single whitespace character, including the space, the tab, and the newline.
string pattern = @"\s";
Regex regex = new Regex(pattern);
Match match1 = regex.Match(" ");    // space
Match match2 = regex.Match("\t");   // tab
Match match3 = regex.Match("\n");   // newline
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output:
match1: True
match2: True
match3: True
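2.5. \b
\b matches a word boundary: the position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string. It consumes no characters, and the non-word side can be a space, a comma, or any other punctuation.
string pattern = @"\bhello\b";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("hello, world");            // boundary before the comma
Match match2 = regex.Match("say hello to everyone");   // boundaries at the spaces
Match match3 = regex.Match("hellom");                  // no boundary between "o" and "m"
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output: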
match1: True
match2: True
match3: False
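A common use of \b is counting whole-word occurrences. The sketch below is one possible way to do that; the class WordBoundaryDemo and the helper CountWholeWord are hypothetical names, and Regex.Escape is used so that the searched word is treated literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Hypothetical helper: counts occurrences of word that are delimited
    // by word boundaries on both sides.
    public static int CountWholeWord(string input, string word)
    {
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        // Punctuation and the end of the string both act as boundaries.
        Console.WriteLine(CountWholeWord("hello,world; say-hello", "hello")); // 2
        // "hello" embedded in a longer word is not counted.
        Console.WriteLine(CountWholeWord("othello hellom", "hello"));         // 0
    }
}
```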
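2.6. ^
^ matches the start of the input string by default, and the start of every line when RegexOptions.Multiline is set. For example, the pattern "^hello" matches strings that begin with "hello".
string pattern = "^hello";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("hello, world");
Match match2 = regex.Match("say hello to everyone");   // "hello" is not at the start
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Output: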
match1: True
match2: False
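2.7. $
$ matches the end of the input string (or the position just before a final newline) by default, and the end of every line when RegexOptions.Multiline is set. For example, the pattern "world$" matches strings that end with "world".
string pattern = "world$";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("hello, world");
Match match2 = regex.Match("say hello to everyone world");
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Output: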
match1: True
match2: True
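Whether ^ and $ refer to the whole string or to individual lines depends on RegexOptions.Multiline. The following sketch illustrates the difference on a three-line input; the class AnchorDemo and its two helpers are names invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Hypothetical helpers that apply the patterns from sections 2.6 and 2.7.
    public static bool HelloAtStart(string input, RegexOptions options)
    {
        return Regex.IsMatch(input, "^hello", options);
    }

    public static bool WorldAtEnd(string input, RegexOptions options)
    {
        return Regex.IsMatch(input, "world$", options);
    }

    public static void Main()
    {
        string text = "say hi\nhello world\nbye";

        // Default: anchors refer to the whole string.
        Console.WriteLine(HelloAtStart(text, RegexOptions.None));      // False
        Console.WriteLine(WorldAtEnd(text, RegexOptions.None));        // False

        // Multiline: anchors refer to each line.
        Console.WriteLine(HelloAtStart(text, RegexOptions.Multiline)); // True
        Console.WriteLine(WorldAtEnd(text, RegexOptions.Multiline));   // True
    }
}
```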
2.8. []
[] defines a character class and matches any single character listed inside it. For example, the pattern "[bc]at" matches "bat" and "cat".
string pattern = "[bc]at";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("bat");
Match match2 = regex.Match("cat");
Match match3 = regex.Match("rat");   // "r" is not in the class
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output:
match1: True
match2: True
match3: False
2.9. [^]
[^] defines a negated character class and matches any single character that is not listed inside it. For example, the pattern "[^bc]at" matches "eat" and "fat", but not "bat" or "cat".
string pattern = "[^bc]at";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("eat");
Match match2 = regex.Match("fat");
Match match3 = regex.Match("bat");   // "b" is excluded by the class
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output:
match1: True
match2: True
match3: False
2.10. ()
Parentheses group a subexpression and capture the text it matches; inside a group, | separates alternatives. For example, the pattern "hi (world|everyone)" matches "hi world" and "hi everyone".
string pattern = @"hi (world|everyone)";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("hi world");
Match match2 = regex.Match("hi everyone");
Match match3 = regex.Match("hi Tom");   // "Tom" is neither alternative
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output:
match1: True
match2: True
match3: False
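Besides grouping, parentheses capture the matched text, which can be read back through Match.Groups. The sketch below reuses the pattern from this section; the class GroupDemo and the helper GreetedName are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns the text captured by group 1,
    // or an empty string when the pattern does not match.
    public static string GreetedName(string input)
    {
        Match match = Regex.Match(input, @"hi (world|everyone)");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(GreetedName("hi world"));       // world
        Console.WriteLine(GreetedName("oh hi everyone")); // everyone
        Console.WriteLine(GreetedName("hi Tom") == "");   // True (no match)
    }
}
```

Groups[0] always holds the whole match, so numbered capture groups start at index 1.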
3. Combining metacharacters
Combining metacharacters makes it possible to express much more powerful matching rules.
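3.1. \w+ and \d+
\w+ matches one or more word characters (letters, digits, or underscores), and \d+ matches one or more digits. Because \w also matches digits and the underscore, \w+\d+ matches both inputs below: for "world_456", \w+ first consumes the whole string and then backtracks to "world_45" so that \d+ can match "6".
string pattern = @"\w+\d+";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("hello123");
Match match2 = regex.Match("world_456");   // the underscore is a word character
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Output:
match1: True
match2: True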
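How the input is divided between \w+ and \d+ becomes visible once each part is wrapped in a capture group. The following sketch is for illustration only; the class BacktrackDemo and the helper Split are hypothetical, and the pattern is the one from this section with parentheses added.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BacktrackDemo
{
    // Hypothetical helper: returns what (\w+) and (\d+) captured,
    // joined by '|', or an empty string when there is no match.
    public static string Split(string input)
    {
        Match match = Regex.Match(input, @"(\w+)(\d+)");
        if (!match.Success)
        {
            return string.Empty;
        }
        return match.Groups[1].Value + "|" + match.Groups[2].Value;
    }

    public static void Main()
    {
        // The greedy \w+ keeps as much as it can and gives back
        // only the single digit that \d+ needs.
        Console.WriteLine(Split("hello123"));  // hello12|3
        Console.WriteLine(Split("world_456")); // world_45|6
    }
}
```

To require letters followed by digits with nothing else in between, a narrower pattern such as [a-zA-Z]+\d+ would be one option.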
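3.2. ^\d+ and \.\d+$
^\d+ matches a string that starts with one or more digits, and \.\d+$ matches a string that ends with a decimal point followed by one or more digits. Together they match a decimal number with any number of digits on either side of the point.
string pattern = @"^\d+\.\d+$";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("1.23");
Match match2 = regex.Match("12.345");
Match match3 = regex.Match("123.4567");   // \d+ places no upper limit on the digit count
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output:
match1: True
match2: True
match3: True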
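3.3. (.*) and (?<!\\)\$
(.*) matches zero or more of any character except a newline and captures them. (?<!\\)\$ matches a literal dollar sign that is not immediately preceded by a backslash; (?<!...) is a negative lookbehind.
// Assumed pattern, assembled from the two pieces described above and anchored at both ends.
string pattern = @"^(.*)(?<!\\)\$$";
Regex regex = new Regex(pattern);
Match match1 = regex.Match("hello, world$");
Match match2 = regex.Match("say goodbye\\$");   // the string is: say goodbye\$
Match match3 = regex.Match("Tom and Jerry");    // no trailing dollar sign
Console.WriteLine("match1: " + match1.Success);
Console.WriteLine("match2: " + match2.Success);
Console.WriteLine("match3: " + match3.Success);
Output: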
match1: True
match2: False
match3: False
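4. Conclusion
This article covered the most commonly used regular expression metacharacters in C# and demonstrated each of them with sample code. Knowing these metacharacters makes text processing more flexible and helps speed up development.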