
Regular Expression Study Notes

August 3, 2019


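Use of parentheses: capturing groups

  • Purpose: store the string matched by the subexpression inside the parentheses in the match result, so it can be accessed after matching completes.
  • Form: ordinary parentheses, "(…)".

Example: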

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa9 {

    public static void main(String[] args) {
        String email = "webmaster@itcast.net";

        String regex = "(\\w+)@([\\w.]+)";

        Pattern p = Pattern.compile(regex);

        Matcher m = p.matcher(email);

        if (m.find()) {
            System.out.println("email add is:\t" + m.group(0)); // group 0 is the match of the whole regular expression
            System.out.println("username is:\t" + m.group(1)); // match of the first pair of parentheses
            System.out.println("hostname is:\t" + m.group(2)); // match of the second pair of parentheses
        }

    }

}

Output:

email add is:   webmaster@itcast.net
username is:    webmaster
hostname is:    itcast.net

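Note: groups are numbered in the order in which their opening parentheses appear, starting from 1. Group 0 is the match of the whole regular expression.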

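Anchors, part 2

  • "^"
    Matches the start of a line (this can change with the matching mode). With no matching mode set (the default), it matches the start of the whole string, the same as \A.
  • "$"
    Matches the end of a line (this can change with the matching mode). With no matching mode set (the default), it matches the end of the whole string, the same as \Z.
  • "\A" matches the start of the whole string.
  • "\Z" matches the end of the whole string (in Java, before a final line terminator if there is one).
  • To make "^" and "$" match the start or end of each logical line inside a longer string, change the matching mode. The post "Basic Regular Expression Rules Explained 02" explains how to use matching modes.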
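The mode change mentioned in the last bullet can be sketched as follows. This is a minimal sketch, assuming that Java's Pattern.MULTILINE flag is the matching mode meant; the class name MultilineAnchorDemo and the helper countMatches are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineAnchorDemo {

    // Hypothetical helper: counts how often the regex is found in the text under the given flags.
    public static int countMatches(String regex, int flags, String text) {
        Matcher m = Pattern.compile(regex, flags).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "start one\nstart two\nstart three";

        // Default mode: ^ only matches at the very beginning of the input.
        System.out.println(countMatches("^start", 0, text));                   // 1
        // MULTILINE mode: ^ also matches right after each line terminator.
        System.out.println(countMatches("^start", Pattern.MULTILINE, text));   // 3
        // \A is not affected by the mode; it still means the start of the input.
        System.out.println(countMatches("\\Astart", Pattern.MULTILINE, text)); // 1
    }
}
```

The example that follows keeps the default mode, so "^" and "\A" behave the same way there.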

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa16 {

    public static void main(String[] args) {
        String[] strings = new String[] { "start ", " start  ", " end ", " end" };

        String[] regexes = new String[] { "^start", "\\Astart", "end$", "end\\Z"};

        for (String str : strings) {
            for (String regex : regexes) {
                Pattern p = Pattern.compile(regex);
                Matcher m = p.matcher(str);
                if(m.find()) {
                    System.out.println("\"" + str
                            + "\" can be matched with regex \"" + regex
                            + "\"");
                }
                else {
                    System.out.println("\"" + str
                            + "\" can not be matched with regex \"" + regex
                            + "\"");
                }
            }
            System.out.println("");
        }

    }
}

Output:

"start " can be matched with regex "^start"
"start " can be matched with regex "\Astart"
"start " can not be matched with regex "end$"
"start " can not be matched with regex "end\Z"

" start  " can not be matched with regex "^start"
" start  " can not be matched with regex "\Astart"
" start  " can not be matched with regex "end$"
" start  " can not be matched with regex "end\Z"

" end " can not be matched with regex "^start"
" end " can not be matched with regex "\Astart"
" end " can not be matched with regex "end$"
" end " can not be matched with regex "end\Z"

" end" can not be matched with regex "^start"
" end" can not be matched with regex "\Astart"
" end" can be matched with regex "end$"
" end" can be matched with regex "end\Z"


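Negated character classes

  • Purpose: specify the characters that are not allowed at a position.
  • Form: written as "[^…]", listing the disallowed characters inside the square brackets.
  • A negated character class must still match one character.

Example: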

public class pa2{
    public static void main(String args[]){
        String stas[]={"1","8","狼","l","!","!"," ","    ","\n"};// the items after "l" are, in order: an English exclamation mark,
        // a Chinese exclamation mark, a space, a tab, and a newline
        String regex="[^123]";

        for(String i:stas){
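            System.out.println("Character ["+i+"] match result: ["+i.matches(regex)+"]");
        }
    }
}

Output:

Character [1] match result: [false]
Character [8] match result: [true]
Character [狼] match result: [true]
Character [l] match result: [true]
Character [!] match result: [true]
Character [!] match result: [true]
Character [ ] match result: [true]
Character [   ] match result: [true]
Character [
] match result: [true]

So a negated character class excludes the listed characters and matches any other single character.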
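The point that a negated character class must still match one character is easy to overlook, so here is a small sketch of it. The class name NegatedClassDemo, the helper matchesWhole, and the sample words are chosen only for illustration.

```java
public class NegatedClassDemo {

    // Hypothetical helper: true when the whole string matches the regex.
    public static boolean matchesWhole(String s, String regex) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        // "[^u]" means "one character that is not u", not "no u here".
        System.out.println(matchesWhole("Iraqi", ".*q[^u]")); // true: "i" follows the q
        System.out.println(matchesWhole("Iraq", ".*q[^u]"));  // false: nothing follows the q
        // An empty string has no character for [^123] to consume.
        System.out.println(matchesWhole("", "[^123]"));       // false
    }
}
```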


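Quantifiers

  • Purpose: limit how many times the preceding character may occur
  • Forms:
  • "*": the preceding character may occur zero or more times
  • "+": the preceding character may occur one or more times
  • "?": the preceding character may occur at most once, i.e. zero times or once

    Example: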

public class pa5 {

    public static void main(String[] args) {
        String[] strings = new String[] { "", "a", "aa", "aaa"};

        String regex = "a*";
        String regex2 = "a?";
        String regex3 = "a+";

        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex
                        + "\"");
            }
        }

        System.out.println("");

        for (String str : strings) {
            if (str.matches(regex2)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex2
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex2
                        + "\"");
            }
        }

        System.out.println("");

        for (String str : strings) {
            if (str.matches(regex3)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex3
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex3
                        + "\"");
            }
        }
    }

}

Output:

"" can be matched with regex "a*"
"a" can be matched with regex "a*"
"aa" can be matched with regex "a*"
"aaa" can be matched with regex "a*"

"" can be matched with regex "a?"
"a" can be matched with regex "a?"
"aa" can not be matched with regex "a?"
"aaa" can not be matched with regex "a?"

"" can not be matched with regex "a+"
"a" can be matched with regex "a+"
"aa" can be matched with regex "a+"
"aaa" can be matched with regex "a+"


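This article is fairly long; readers can use the table of contents above to read selectively. If anything is unclear, feel free to ask!

         
Today I studied regular expressions. From earlier study and from recent work with JavaScript and the like, I found that regular expressions are really important, convenient, and practical, so I decided to learn them properly. I found the "30-Minute Regular Expression Tutorial" and started reading; by now I have a rough picture. I plan to read it again and write down the points to watch for along with my own tests, as a record.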

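Character class shorthands

  • For commonly used character classes, regular expressions provide shorthands that express them conveniently.
  • \d is equivalent to [0-9]
  • \D is equivalent to [^0-9]
  • \w is equivalent to [0-9a-zA-Z_]
  • \W is equivalent to [^0-9a-zA-Z_]
  • \s matches whitespace characters (carriage return \r, line feed \n, tab, space)
  • \S matches non-whitespace characters

Example: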

public class pa3 {

    public static void main(String[] args) {

        String digitChar = "\\d";
        String noDigitChar = "\\D";
        String wordChar = "\\w";
        String noWordChar = "\\W";
        String spaceChar = "\\s";
        String noSpaceShar = "\\S";

        String[] strs = new String[] { "0", "3", "8", "9", "a", "z", "E", "G",
                " ", "\t", "\r","\n","!","!","狼" };

        for (String s : strs) {
            if (regexMatch(s, digitChar)) {
                System.out.println("\"" + digitChar + "\" can match \"" + s
                        + "\"");
            } else {
                System.out.println("\"" + digitChar + "\" can not match \"" + s
                        + "\"");
            }
        }
        System.out.println("");

        for (String s : strs) {
            if (regexMatch(s, noDigitChar)) {
                System.out.println("\"" + noDigitChar + "\" can match \"" + s
                        + "\"");
            } else {
                System.out.println("\"" + noDigitChar + "\" can not match \""
                        + s + "\"");
            }
        }
        System.out.println("");

        for (String s : strs) {

            if (regexMatch(s, wordChar)) {
                System.out.println("\"" + wordChar + "\" can match \"" + s
                        + "\"");
            } else {
                System.out.println("\"" + wordChar + "\" can not match \"" + s
                        + "\"");
            }
        }
        System.out.println("");

        for (String s : strs) {
            if (regexMatch(s, noWordChar)) {
                System.out.println("\"" + noWordChar + "\" can match \"" + s
                        + "\"");
            } else {
                System.out.println("\"" + noWordChar + "\" can not match \""
                        + s + "\"");
            }
        }
        System.out.println("");

        for (String s : strs) {
            if (regexMatch(s, spaceChar)) {
                System.out.println("\"" + spaceChar + "\" can match \"" + s
                        + "\"");
            } else {
                System.out.println("\"" + spaceChar + "\" can not match \"" + s
                        + "\"");
            }
        }

        System.out.println("");

        for (String s : strs) {
            if (regexMatch(s, noSpaceShar)) {
                System.out.println("\"" + noSpaceShar + "\" can match \"" + s
                        + "\"");
            } else {
                System.out.println("\"" + noSpaceShar + "\" can not match \""
                        + s + "\"");
            }
        }

    }

    public static boolean regexMatch(String s, String regex) {
        return s.matches(regex);
    }

}

Output:

"\d" can match "0"
"\d" can match "3"
"\d" can match "8"
"\d" can match "9"
"\d" can not match "a"
"\d" can not match "z"
"\d" can not match "E"
"\d" can not match "G"
"\d" can not match " "
"\d" can not match "    "
"\d" can not match "
"\d" can not match "
"
"\d" can not match "!"
"\d" can not match "!"
"\d" can not match "狼"

"\D" can not match "0"
"\D" can not match "3"
"\D" can not match "8"
"\D" can not match "9"
"\D" can match "a"
"\D" can match "z"
"\D" can match "E"
"\D" can match "G"
"\D" can match " "
"\D" can match "        "
"\D" can match "
"\D" can match "
"
"\D" can match "!"
"\D" can match "!"
"\D" can match "狼"

"\w" can match "0"
"\w" can match "3"
"\w" can match "8"
"\w" can match "9"
"\w" can match "a"
"\w" can match "z"
"\w" can match "E"
"\w" can match "G"
"\w" can not match " "
"\w" can not match "    "
"\w" can not match "
"\w" can not match "
"
"\w" can not match "!"
"\w" can not match "!"
"\w" can not match "狼"

"\W" can not match "0"
"\W" can not match "3"
"\W" can not match "8"
"\W" can not match "9"
"\W" can not match "a"
"\W" can not match "z"
"\W" can not match "E"
"\W" can not match "G"
"\W" can match " "
"\W" can match "        "
"\W" can match "
"\W" can match "
"
"\W" can match "!"
"\W" can match "!"
"\W" can match "狼"

"\s" can not match "0"
"\s" can not match "3"
"\s" can not match "8"
"\s" can not match "9"
"\s" can not match "a"
"\s" can not match "z"
"\s" can not match "E"
"\s" can not match "G"
"\s" can match " "
"\s" can match "        "
"\s" can match "
"\s" can match "
"
"\s" can not match "!"
"\s" can not match "!"
"\s" can not match "狼"

"\S" can match "0"
"\S" can match "3"
"\S" can match "8"
"\S" can match "9"
"\S" can match "a"
"\S" can match "z"
"\S" can match "E"
"\S" can match "G"
"\S" can not match " "
"\S" can not match "    "
"\S" can not match "
"\S" can not match "
"
"\S" can match "!"
"\S" can match "!"
"\S" can match "狼"

Positive lookbehind takes the form (?<=…).

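Character classes

  • One of the most basic constructs in regular expressions.
  • Purpose: specify which characters may appear at a position.
  • Form: written as "[…]", listing the characters (or shorthand classes) inside the square brackets.
  • The characters in the brackets are the candidates for that position. For example, [123] matches if the current position holds 1, 2, or 3, but it matches exactly one character.

Example:
Checking for a decimal digit.

public class pa1{
    public static void main(String args[]){
        String sta="4";
        String regex="[0123456789]";

        System.out.println("This example checks whether a character is a decimal digit");

        // matches() returns true when the whole string is a single decimal digit
        if(sta.matches(regex)){
            System.out.println("It is a decimal digit");
        }else{
            System.out.println("It is not a decimal digit");
        }
    }
}

Output:

This example checks whether a character is a decimal digit
It is a decimal digit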

             So [.?!] matches a single character that is ., ?, or !.

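Limits of quantifiers and the use of parentheses

  • A quantifier only governs how many times the preceding character or character class occurs (a single-character element).
  • To specify how many times a string occurs, use parentheses "()": put the string inside the parentheses and add the quantifier after the closing parenthesis.

Example: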

public class pa7 {

    public static void main(String[] args) {
        String[] strings = new String[] { "ac", "acc", "accc", "acac", "acacac"};

        String regex = "ac+";
        String regex2 = "(ac)+";

        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex
                        + "\"");
            }
        }

        System.out.println("");

        for (String str : strings) {
            if (str.matches(regex2)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex2
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex2
                        + "\"");
            }
        }

    }

}

Output:

"ac" can be matched with regex "ac+"
"acc" can be matched with regex "ac+"
"accc" can be matched with regex "ac+"
"acac" can not be matched with regex "ac+"
"acacac" can not be matched with regex "ac+"

"ac" can be matched with regex "(ac)+"
"acc" can not be matched with regex "(ac)+"
"accc" can not be matched with regex "(ac)+"
"acac" can be matched with regex "(ac)+"
"acacac" can be matched with regex "(ac)+"

Because anchors are not flexible enough when judging positions, lookaround was introduced.

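The hyphen

This is the hyphen "-".
Writing the decimal digits as [0123456789], as in the earlier example, is cumbersome, so a hyphen can be used to shorten it. A hyphen can only abbreviate a contiguous range with no gaps, so [0-9] is enough. [0-789] also works (the range 0-7 followed by 8 and 9); the point is that the hyphen is only a shorthand.

You can test it:

public class pa1{
    public static void main(String args[]){
        String sta="4";
        String regex="[0-9]";

        System.out.println("This example checks whether a character is a decimal digit");

        // matches() returns true when the whole string is a single decimal digit
        if(sta.matches(regex)){
            System.out.println("It is a decimal digit");
        }else{
            System.out.println("It is not a decimal digit");
        }
    }
}

Output:

This example checks whether a character is a decimal digit
It is a decimal digit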

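Notes

  • Inside a character class, a hyphen denotes a range only when it appears between two characters. At the start or end of the class it stands for the single literal character '-', i.e. the hyphen itself.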
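The note above can be checked with a short sketch. The class name HyphenInClassDemo and the helper matchesWhole are made up for illustration; the character classes themselves are plain Java regex syntax.

```java
public class HyphenInClassDemo {

    // Hypothetical helper: true when the whole string matches the regex.
    public static boolean matchesWhole(String s, String regex) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        // Between two characters the hyphen forms a range: a, b, or c.
        System.out.println(matchesWhole("b", "[a-c]")); // true
        System.out.println(matchesWhole("-", "[a-c]")); // false
        // At the start or end of the class the hyphen is a literal character.
        System.out.println(matchesWhole("-", "[-ac]")); // true
        System.out.println(matchesWhole("-", "[ac-]")); // true
        System.out.println(matchesWhole("b", "[ac-]")); // false: no range here, only a, c, and -
    }
}
```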

        

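Interval quantifiers

  • Purpose: specify exactly how many times a character occurs
  • Forms:
  • {min,max}: at least min times and at most max times
  • {min,}: at least min times, with no upper limit
  • {number}: exactly number times
  • "*" is equivalent to {0,}
  • "+" is equivalent to {1,}
  • "?" is equivalent to {0,1}

Example: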

public class pa6 {

    public static void main(String[] args) {
        String[] strings = new String[] { "", "a", "aa", "aaa", "aaaa", "aaaaa" };

        String regex = "a{2,4}";
        String regex2 = "a{2,}";
        String regex3 = "a{3}";

        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex + "\"");
            }
        }

        System.out.println("");

        for (String str : strings) {
            if (str.matches(regex2)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex2 + "\"");
            } else {
                System.out
                        .println("\"" + str
                                + "\" can not be matched with regex \""
                                + regex2 + "\"");
            }
        }

        System.out.println("");

        for (String str : strings) {
            if (str.matches(regex3)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex3 + "\"");
            } else {
                System.out
                        .println("\"" + str
                                + "\" can not be matched with regex \""
                                + regex3 + "\"");
            }
        }

    }

}

Output:

"" can not be matched with regex "a{2,4}"
"a" can not be matched with regex "a{2,4}"
"aa" can be matched with regex "a{2,4}"
"aaa" can be matched with regex "a{2,4}"
"aaaa" can be matched with regex "a{2,4}"
"aaaaa" can not be matched with regex "a{2,4}"

"" can not be matched with regex "a{2,}"
"a" can not be matched with regex "a{2,}"
"aa" can be matched with regex "a{2,}"
"aaa" can be matched with regex "a{2,}"
"aaaa" can be matched with regex "a{2,}"
"aaaaa" can be matched with regex "a{2,}"

"" can not be matched with regex "a{3}"
"a" can not be matched with regex "a{3}"
"aa" can not be matched with regex "a{3}"
"aaa" can be matched with regex "a{3}"
"aaaa" can not be matched with regex "a{3}"
"aaaaa" can not be matched with regex "a{3}"

Lookaround
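These notes only name lookaround, so here is a minimal sketch of two common forms, positive lookbehind (?<=…) and negative lookahead (?!…). The class name LookaroundDemo, the helper findAll, and the sample sentences are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundDemo {

    // Hypothetical helper: collects every match of the regex in the text.
    public static List<String> findAll(String regex, String text) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Positive lookbehind: digits that are directly preceded by a dollar sign.
        // The dollar sign is checked but is not part of the match.
        System.out.println(findAll("(?<=\\$)\\d+", "cost $30, 25 items, $7 tip")); // [30, 7]

        // Negative lookahead: a run of digits that is not followed by a digit or a percent sign.
        System.out.println(findAll("\\d+(?![\\d%])", "50% of 80 users")); // [80]
    }
}
```

Lookaround checks what surrounds a position without consuming it, which is why the dollar sign and the percent sign never show up in the results.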

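Notes on capturing groups

  • Whenever plain parentheses are used, a capturing group exists.
  • Capturing groups are numbered from left to right by the position of their opening parenthesis, starting from 1; this also holds for nested parentheses.
  • If a capturing group is followed by a quantifier, the group holds the text matched by the last iteration of its subexpression.

Example: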

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa10 {

    public static void main(String[] args) {

        explainGroupNo();

        System.out.println("");

        explainGroupQuantifier();       

    }

    public static void explainGroupNo() {
        String email = "webmaster@itcast.net";

        String regex = "((\\w+)@([\\w.]+))";

        Pattern p = Pattern.compile(regex);

        Matcher m = p.matcher(email);

        if (m.find()) {
            System.out.println("match result:\t" + m.group(0));
            System.out.println("group No.1 is:\t" + m.group(1));
            System.out.println("group No.2 is:\t" + m.group(2));
            System.out.println("group No.3 is:\t" + m.group(3));
        }
    }

    public static void explainGroupQuantifier() {
        String email = "webmaster@itcast.net";

        String regex = "(\\w)+@([\\w.])+";

        Pattern p = Pattern.compile(regex);

        Matcher m = p.matcher(email);

        if (m.find()) {
            System.out.println("match result:\t" + m.group(0));
            System.out.println("group No.1 is:\t" + m.group(1));
            System.out.println("group No.2 is:\t" + m.group(2));
        }
    }

}

Output:

match result:   webmaster@itcast.net
group No.1 is:  webmaster@itcast.net
group No.2 is:  webmaster
group No.3 is:  itcast.net

match result:   webmaster@itcast.net
group No.1 is:  r
group No.2 is:  t

Matching modes: used to change how certain constructs match.

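Use of parentheses: backreferences

  • Purpose: at some point in the expression, dynamically repeat the text matched by an earlier subexpression.
  • Form: "\1", where 1 is the number of the capturing group.

Example: checking that an HTML opening tag and closing tag agree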

public class pa12 {

    public static void main(String[] args) {

        String[] strings = new String[] { "<h1>good</h1>", "<h1>bad</h2>"};

        String regex = "<(\\w+)>[^<]+</\\1>";

        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex
                        + "\"");
            }
        }

    }

}

Output:

"<h1>good</h1>" can be matched with regex "<(\w+)>[^<]+</\1>"
"<h1>bad</h2>" can not be matched with regex "<(\w+)>[^<]+</\1>"

Example: removing a repeated word

public class pa13 {

    public static void main(String[] args) {

        String dupWords = "word word";

        String dupWordRegex = "(\\w+)\\s+(\\1)";

        System.out.println("Before replace:\t" + dupWords);
        System.out.println("After replace:\t" + dupWords.replaceAll(dupWordRegex, "$1")); // in the replacement string, $1 refers to the text captured by group 1

    }

}

Output:

Before replace: word word
After replace:  word

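By this point readers should have a reasonable grounding. Have a look at this article:
it stresses that in a regular expression, simply using plain parentheses creates a stored capturing group.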


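A special shorthand: the dot

  • The dot "." is a special character class shorthand that matches almost any character.
  • "\." matches the dot itself.
  • Inside a character class, [.] also matches only the dot itself.
  • Note: the dot does not match the line feed or the carriage return.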
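The last note describes the default behaviour. As a minimal sketch, assuming Java's Pattern.DOTALL flag (or the inline form (?s)) is used, the dot can be made to match line terminators as well. The class name DotAllDemo and the helper matchesWhole are made up for illustration.

```java
import java.util.regex.Pattern;

public class DotAllDemo {

    // Hypothetical helper: true when the whole string matches the regex under the given flags.
    public static boolean matchesWhole(String regex, int flags, String s) {
        return Pattern.compile(regex, flags).matcher(s).matches();
    }

    public static void main(String[] args) {
        // Default mode: the dot does not match a line feed.
        System.out.println(matchesWhole(".", 0, "\n"));                    // false
        System.out.println(matchesWhole("a.b", 0, "a\nb"));                // false
        // DOTALL mode: the dot matches line terminators too.
        System.out.println(matchesWhole(".", Pattern.DOTALL, "\n"));       // true
        System.out.println(matchesWhole("a.b", Pattern.DOTALL, "a\nb"));   // true
        // The inline flag (?s) switches the same mode on inside the pattern.
        System.out.println(matchesWhole("(?s).", 0, "\r"));                // true
    }
}
```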

Example:

public class pa4 {

    public static void main(String[] args) {

        String[] strings = new String[] { "a", "A", "0", "$", "(", ".","\n","\r"};

        String normalDot = ".";
        String escapedDot = "\\.";
        String characterClassDot = "[.]";

        for (String s : strings) {
            if (regexMatch(s, normalDot)) {
                System.out.println("\"" + s + "\" can be matched with regex \""
                        + normalDot + "\"");
            } else {
                System.out.println("\"" + s
                        + "\" can not be matched with regex \"" + normalDot + "\"");
            }
        }

        System.out.println("");

        for (String s : strings) {
            if (regexMatch(s, escapedDot)) {
                System.out.println("\"" + s + "\" can be matched with regex \""
                        + escapedDot + "\"");
            } else {
                System.out.println("\"" + s
                        + "\" can not be matched with regex \"" + escapedDot + "\"");
            }
        }

        System.out.println("");

        for (String s : strings) {
            if (regexMatch(s, characterClassDot)) {
                System.out.println("\"" + s + "\" can be matched with regex \""
                        + characterClassDot + "\"");
            } else {
                System.out.println("\"" + s
                        + "\" can not be matched with regex \"" + characterClassDot + "\"");
            }
        }

        System.out.println("");

    }

    public static boolean regexMatch(String s, String regex) {
        return s.matches(regex);
    }

}

Output:

"a" can be matched with regex "."
"A" can be matched with regex "."
"0" can be matched with regex "."
"$" can be matched with regex "."
"(" can be matched with regex "."
"." can be matched with regex "."
"
" can not be matched with regex "."
" can not be matched with regex "."

"a" can not be matched with regex "\."
"A" can not be matched with regex "\."
"0" can not be matched with regex "\."
"$" can not be matched with regex "\."
"(" can not be matched with regex "\."
"." can be matched with regex "\."
"
" can not be matched with regex "\."
" can not be matched with regex "\."

"a" can not be matched with regex "[.]"
"A" can not be matched with regex "[.]"
"0" can not be matched with regex "[.]"
"$" can not be matched with regex "[.]"
"(" can not be matched with regex "[.]"
"." can be matched with regex "[.]"
"
" can not be matched with regex "[.]"
" can not be matched with regex "[.]"


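Use of parentheses: alternation

  • A character class can only express the single characters that may appear at a position, not the strings that may appear there.
  • Purpose: express one or more strings that may appear at a position.
  • Forms:
  • "(…|…)", with the alternative strings on either side of the vertical bar
  • "(…|…|…|…)"

Example: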

public class pa8 {

    public static void main(String[] args) {


        String[] strings = new String[] { "this", "that", "thit"};

        String regex = "th[ia][st]";

        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex
                        + "\"");
            }
        }

    }

}

Output:

"this" can be matched with regex "th[ia][st]"
"that" can be matched with regex "th[ia][st]"
"thit" can be matched with regex "th[ia][st]"

But what if we do not want to match the misspelled word thit?
See this example:

public class pa8 {

    public static void main(String[] args) {


        String[] strings = new String[] { "this", "that", "thit"};

        String regex = "(this|that)"; // this can also be written th(is|at); factoring out the common prefix can make matching more efficient

        for (String str : strings) {
            if (str.matches(regex)) {
                System.out.println("\"" + str
                        + "\" can be matched with regex \"" + regex
                        + "\"");
            } else {
                System.out.println("\"" + str
                        + "\" can not be matched with regex \"" + regex
                        + "\"");
            }
        }

    }

}

Output:

"this" can be matched with regex "(this|that)"
"that" can be matched with regex "(this|that)"
"thit" can not be matched with regex "(this|that)"

      

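Anchors

  • Purpose: specify the position of a match.
  • Form: "\b", the word-boundary anchor.
  • A word boundary marks the border between a word and a symbol. Note: "symbols" here include line feeds, spaces, tabs, and carriage returns, but are not limited to those few whitespace characters. Chinese and English punctuation, such as the English and the Chinese exclamation mark, counts as well, as does any other punctuation you know.

Example: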

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa15 {

    public static void main(String[] args) {

        String[] strings = new String[] {
                "This sentence contain word cat",
                "This sentence contain word \"cat\"",
                "This sentence contain word vacation",
                "This sentence contain word \"cate\"",
        };

        String regex = "\\bcat\\b";

        for(String str : strings) {
            System.out.println("Checking sentence:\t" + str);
            Pattern p = Pattern.compile(regex);
            Matcher m = p.matcher(str);

            if(m.find()) {
                System.out.println("Found word \"cat\"!");
            }
            else {
                System.out.println("Can not found word \"cat\"!");
            }

            System.out.println("");

        }

    }

}

Output:

Checking sentence:      This sentence contain word cat
Found word "cat"!

Checking sentence:      This sentence contain word "cat"
Found word "cat"!

Checking sentence:      This sentence contain word vacation
Can not found word "cat"!

Checking sentence:      This sentence contain word "cate"
Can not found word "cat"!

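Notes:

  • \b is the word-boundary anchor. It requires a word character on one side and, on the other, a non-word character or the start or end of the string.
  • Word characters usually means English letters and digits; depending on the regex engine and its version, Chinese characters may be treated as word characters too.
  • Non-word characters are usually punctuation marks and whitespace characters.
  • If these three notes are unclear, see the blog post on the differences between \b and \B and the points to watch for ("\b和\B的使用区别及注意事项").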


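Non-capturing parentheses

  • If the regular expression is complex, or the text to process is long, capturing groups reduce efficiency.
  • Purpose: group part of the expression without storing the text it matches in the result.
  • Form: "(?:…)".

Example: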

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class pa11 {

    public static void main(String[] args) {
        String email = "webmaster@itcast.net";

        String regex = "(?:webmaster|admin)@([\\w.]+)";

        Pattern p = Pattern.compile(regex);

        Matcher m = p.matcher(email);

        if (m.find()) {
            System.out.println("match result:\t" + m.group(0));
            System.out.println("group No.1 is:\t" + m.group(1));
        }

    }
}

Output:

match result:   webmaster@itcast.net
group No.1 is:  itcast.net

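Plain parentheses always create a capturing group and store what it captured, but (?:…) stores nothing. So asking for group 1 prints the content of the second pair of parentheses rather than the first, because the first pair does not keep what it matched. Group 0 is still the match of the whole regular expression.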
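The effect on group numbering can also be seen by counting groups. This is a small sketch using Matcher.groupCount(); the class name GroupCountDemo and the helpers captureCount and firstGroup are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountDemo {

    // Hypothetical helper: number of capturing groups in the pattern (group 0 is not counted).
    public static int captureCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Hypothetical helper: group 1 of the first match, or null when nothing matches.
    public static String firstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String email = "webmaster@itcast.net";
        String capturing = "(webmaster|admin)@([\\w.]+)";
        String nonCapturing = "(?:webmaster|admin)@([\\w.]+)";

        System.out.println(captureCount(capturing));         // 2
        System.out.println(captureCount(nonCapturing));      // 1
        System.out.println(firstGroup(capturing, email));    // webmaster
        System.out.println(firstGroup(nonCapturing, email)); // itcast.net
    }
}
```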

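Recap of the main points:

  • Alternation takes the form (…|…) or (…|…|…).
  • $ matches the end of the whole string.
  • [^aeiou] matches any character other than the letters a, e, i, o, and u.
  • The dot itself is matched with \.
  • Matching mode X: comment mode.
  • Matching mode S: dot-all mode (in this mode . also matches line breaks).
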
         
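I have been working hard at learning a lot of things these past few days, and because I am teaching myself I have taken many detours. When I hit a harder topic I always get anxious, and there are far too many things that can hurt study efficiency. A quiet environment, freedom, and a relaxed mood really matter. Take this afternoon: my head was a mess, I kept wanting to do one thing and then another, and in the end I got nothing done all afternoon. What a pain!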

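Further notes:

  • Matching mode M: multi-line mode (changes what ^ and $ match).
  • ^ matches the start of the whole string.
  • Anchors specify the position of a match.
  • Boundary conditions: the word-boundary anchor \b (in engines where word characters are ASCII-only, it only applies to English letters and digits and does not support Chinese characters).
  • A quantifier only governs the character or character class before it; to repeat a string, wrap it in (…) and add the quantifier after it.
  • . matches almost any character (except the line feed \n).
  • Negative lookahead takes the form (?!…).
  • [\S] matches non-whitespace characters.
  • [\w] is equivalent to [0-9a-zA-Z_].
  • [\s] matches whitespace characters (carriage return, tab, line feed, space).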

 

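If you want to search for a metacharacter itself, such as . or *, you run into a problem: you cannot write it directly, because it will be interpreted with its special meaning. In that case you use \ to cancel the special meaning, so you should write \. and \*. To search for \ itself, you likewise write \\.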
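In Java source code each regex backslash has to be doubled inside a string literal. The following is a minimal sketch of the escaping rules above; the class name EscapeDemo and the helper matchesWhole are made up for illustration, while Pattern.quote is the standard library method for escaping a whole literal.

```java
import java.util.regex.Pattern;

public class EscapeDemo {

    // Hypothetical helper: true when the whole string matches the regex.
    public static boolean matchesWhole(String s, String regex) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        // Unescaped, the dot matches any character; escaped, only a literal dot.
        System.out.println(matchesWhole("axb", "a.b"));    // true
        System.out.println(matchesWhole("axb", "a\\.b"));  // false
        System.out.println(matchesWhole("a.b", "a\\.b"));  // true
        // \* matches a literal asterisk.
        System.out.println(matchesWhole("2*3", "2\\*3"));  // true
        // The regex \\ (written "\\\\" in Java source) matches one backslash.
        System.out.println(matchesWhole("\\", "\\\\"));    // true
        // Unescaped, + is a quantifier, so the literal text "1+1=2" does not match itself.
        System.out.println(matchesWhole("1+1=2", "1+1=2"));                // false
        // Pattern.quote escapes the whole string so it is matched literally.
        System.out.println(matchesWhole("1+1=2", Pattern.quote("1+1=2"))); // true
    }
}
```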


Now let's look at an example: \(?0\d{2}[) -]?\d{8}
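Read piece by piece, this expression allows an optional opening parenthesis, a 0 followed by two digits, an optional ")", space, or "-", and then eight digits. Here is a minimal sketch that tries it on a few strings; the class name PhoneRegexDemo, the helper isPhone, and the sample numbers are made up for illustration.

```java
import java.util.regex.Pattern;

public class PhoneRegexDemo {

    // The expression from the text, with each backslash doubled for a Java string literal.
    private static final Pattern PHONE = Pattern.compile("\\(?0\\d{2}[) -]?\\d{8}");

    // Hypothetical helper: true when the whole string matches the expression.
    public static boolean isPhone(String s) {
        return PHONE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPhone("(010)88886666")); // true
        System.out.println(isPhone("022-22334455"));  // true
        System.out.println(isPhone("02912345678"));   // true
        System.out.println(isPhone("12345678"));      // false: the area code is missing
        // The expression does not require the parentheses to be balanced,
        // so these malformed strings are accepted as well.
        System.out.println(isPhone("010)12345678"));  // true
        System.out.println(isPhone("(022-87654321")); // true
    }
}
```

The last two cases show a limitation of the expression: each piece of punctuation is optional on its own, so nothing ties the opening parenthesis to the closing one.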


