PHP Regular Expression Anchors
Description
Anchors in a regular expression identify positions, such as the start or end of the string or a word boundary, at which a match is allowed to occur.
Anchor List
Here's a list of the anchors you can use within a regular expression, together with a few related character classes:
Anchor | Meaning |
---|---|
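^ | Matches at the start of the string (or of each line in multi-line mode) |
$ | Matches at the end of the string or just before a newline at the end of the string (or at the end of each line in multi-line mode) |
\b | Matches at a word boundary (between a \w character and a \W character) |
\B | Matches anywhere except at a word boundary |
\A | Matches only at the start of the string |
\z | Matches only at the very end of the string |
\Z | Matches at the end of the string or just before a newline at the end of the string |
\G | Matches at the starting offset character position, as passed to the preg_match() function |
\s | Matches any whitespace character (a character class, not an anchor) |
\S | Matches any non-whitespace character (a character class, not an anchor) |
[\s\S] | Matches any single character, including a newline |
[\s\S]* | Matches any sequence of characters, including the empty string |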
An anchor doesn't itself match any characters; it merely ensures that the pattern appears at a specified point in the target string.
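As a rough illustration of the point that anchors consume no characters, the sketch below captures a match together with its offset. The subject string and variable names are made up for this example.

```php
<?php
// PREG_OFFSET_CAPTURE makes each match an array of [matched text, byte offset].
preg_match('/\bthere\b/', 'hi there', $m, PREG_OFFSET_CAPTURE);

$matchedText = $m[0][0]; // the text consumed by the pattern
$matchOffset = $m[0][1]; // where the match starts in the subject

// The two \b anchors add nothing to the matched text:
// only the five characters of "there" are consumed.
echo $matchedText, " at offset ", $matchOffset, ", length ", strlen($matchedText), "\n";
```

The anchors only constrain where "there" may appear; the captured text is the same as it would be for the pattern /there/.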
\A and \z are similar to ^ and $. The difference is that in multi-line mode (the m pattern modifier), ^ and $ also match at the beginning and end of each line of a multi-line string.
\A and \z only match at the beginning and end of the target string, respectively, regardless of the mode.
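The difference can be seen in a small sketch like the following, which assumes a two-line sample string and uses the m pattern modifier to switch on multi-line mode. The variable names are illustrative only.

```php
<?php
$text = "first line\nsecond line";

// Without the m modifier, ^ only matches at the very start of the string.
$plainCaret = preg_match('/^second/', $text);

// With the m modifier, ^ also matches right after each newline.
$multiCaret = preg_match('/^second/m', $text);

// \A ignores the m modifier and still only matches at the start of the string.
$multiA = preg_match('/\Asecond/m', $text);

// In multi-line mode, $ matches before the newline that ends the first line,
// whereas \z still requires the very end of the string.
$multiDollar = preg_match('/first line$/m', $text);
$multiZ      = preg_match('/first line\z/m', $text);

echo $plainCaret, $multiCaret, $multiA, $multiDollar, $multiZ, "\n"; // Displays "01010"
```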
\Z is useful when reading lines from a file that may or may not have a newline character at the end.
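A minimal sketch of that situation, assuming two hypothetical lines that differ only in their trailing newline:

```php
<?php
$withNewline    = "last line\n"; // e.g. a line read from a file, newline kept
$withoutNewline = "last line";   // the same text without the newline

// \z demands the very end of the string, so the trailing newline gets in the way.
$strictWithNewline = preg_match('/line\z/', $withNewline);

// \Z tolerates a single newline at the end of the string.
$lenientWithNewline = preg_match('/line\Z/', $withNewline);

// Both anchors agree when there is no trailing newline.
$strictWithout  = preg_match('/line\z/', $withoutNewline);
$lenientWithout = preg_match('/line\Z/', $withoutNewline);

echo $strictWithNewline, $lenientWithNewline, $strictWithout, $lenientWithout, "\n"; // Displays "0111"
```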
\b and \B are handy when searching text for complete words.
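The \G entry in the anchor list is easiest to understand alongside the offset argument of preg_match(). The following is only a sketch with a made-up subject string; it compares an unanchored search with one pinned to the starting offset.

```php
<?php
$subject = 'abcdef';

// Searching starts at offset 3, and "def" begins exactly there, so \G is satisfied.
$atOffset = preg_match('/\Gdef/', $subject, $m, 0, 3);

// Searching starts at offset 2 ("c"); \G forbids skipping ahead to "def".
$beforeOffset = preg_match('/\Gdef/', $subject, $m, 0, 2);

// Without \G, the engine is free to scan forward from offset 2 and find "def".
$unanchored = preg_match('/def/', $subject, $m, 0, 2);

echo $atOffset, $beforeOffset, $unanchored, "\n"; // Displays "101"
```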
Example 1
The caret (^) symbol specifies that the rest of the pattern will only match at the start of the string. Similarly, you can use the dollar ($) symbol to anchor a pattern to the end of the string:
<?PHP
echo preg_match( "/^(PHP|Java)/", "Java PHP" ); // Displays "1": the string starts with "Java"
echo "\n";
echo preg_match( "/(PHP|Java)$/", "Java PHP" ); // Displays "1": the string ends with "PHP"
?>
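One common use of these two anchors is checking how a string starts or ends. The sketch below assumes a hypothetical URL and file name chosen purely for illustration.

```php
<?php
$url    = 'https://example.com/index.php'; // hypothetical URL
$backup = 'index.php.bak';                 // hypothetical file name

// ^ pins the pattern to the start of the string.
// The # delimiter avoids having to escape the slashes.
$startsWithHttps = preg_match('#^https://#', $url);

// $ pins the pattern to the end of the string.
$endsWithPhp = preg_match('/\.php$/', $url);

// ".php" appears here too, but not at the end, so the anchored pattern fails.
$backupEndsWithPhp = preg_match('/\.php$/', $backup);

echo $startsWithHttps, $endsWithPhp, $backupEndsWithPhp, "\n"; // Displays "110"
```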
Example 2
By combining the two anchors, you can ensure that a string contains only the desired pattern, nothing more:
<?PHP
echo preg_match( "/^Hello, \w+$/", "Hello, world" ); // Displays "1"
echo preg_match( "/^Hello, \w+$/", "Hello, world!" ); // Displays "0"
?>
The second match fails because the target string contains a non-word character (!) between the pattern and the end of the string.
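One caveat worth sketching: by default, $ also matches just before a newline at the end of the string, so a pattern framed with ^ and $ still accepts a trailing newline. If that matters, \A and \z can be used instead. The subject strings below are assumptions made for the demonstration.

```php
<?php
$clean    = "Hello, world";
$trailing = "Hello, world\n"; // same text with a newline appended

// ^ ... $ accepts the trailing newline, because $ may match just before it.
$dollarClean    = preg_match('/^Hello, \w+$/', $clean);
$dollarTrailing = preg_match('/^Hello, \w+$/', $trailing);

// \A ... \z insists that nothing at all follows the pattern.
$strictClean    = preg_match('/\AHello, \w+\z/', $clean);
$strictTrailing = preg_match('/\AHello, \w+\z/', $trailing);

echo $dollarClean, $dollarTrailing, $strictClean, $strictTrailing, "\n"; // Displays "1110"
```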
Example 3
/oo\b/ will match "foo", "moo", and "boo" because the "oo" is at the end of the word.
The \B anchor matches at positions that are not on the edge of a word.
/oo\B/ will match "fool", "wool", and "pool" but not "foo" or "moo".
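These two patterns can be tried against the same word list with preg_grep(), which returns the array elements that match a pattern. This is a small sketch; the array and variable names are illustrative.

```php
<?php
$words = ['foo', 'moo', 'boo', 'fool', 'wool', 'pool'];

// Keep the words in which "oo" is followed by a word boundary (the end of the word).
$atWordEnd = array_values(preg_grep('/oo\b/', $words));

// Keep the words in which "oo" is followed by another word character.
$insideWord = array_values(preg_grep('/oo\B/', $words));

echo implode(', ', $atWordEnd), "\n";  // Displays "foo, moo, boo"
echo implode(', ', $insideWord), "\n"; // Displays "fool, wool, pool"
```

array_values() is used only to renumber the result, since preg_grep() preserves the original array keys.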
When using \b , the beginning or end of the string is considered a word boundary:
<?PHP
echo preg_match( "/over/", "hover menu item" ); // Displays "1"
echo preg_match( "/\bover\b/", "My hover Menu" ); // Displays "0"
echo preg_match( "/\bover\b/", "Show is over " ); // Displays "1"
echo preg_match( "/\bover\b/", "over and under" ); // Displays "1"
?>
By using the \b anchor along with alternatives within a subexpression, you can match a complete token such as a date:
<?PHP
echo preg_match( "/\b(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\/\d{1,2}\/(\d{2}|\d{4})\b/", "jul/15/2006" ); // Displays "1"
?>
The (\d{2}|\d{4})\b expression means: match either two digits or four digits, followed by a word boundary (which includes the end of the string).
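To see what the two \b anchors contribute, the same pattern can be run against a few near-miss inputs. The test strings and variable names below are assumptions added for this sketch.

```php
<?php
$months  = 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec';
$bounded = '/\b(' . $months . ')\/\d{1,2}\/(\d{2}|\d{4})\b/';
$loose   = '/(' . $months . ')\/\d{1,2}\/(\d{2}|\d{4})/'; // same pattern without \b

// Two-digit and four-digit years both end at a word boundary.
$twoDigitYear  = preg_match($bounded, 'jul/15/06');
$fourDigitYear = preg_match($bounded, 'jul/15/2006', $m);
$capturedYear  = $m[2]; // the year captured by the second group

// A six-digit "year" has no boundary after two or four digits.
$tooManyDigits = preg_match($bounded, 'jul/15/200612');

// A letter directly before the month removes the leading word boundary.
$gluedPrefix = preg_match($bounded, 'xjul/15/2006');

// Without \b, the over-long year is accepted because "20" alone satisfies \d{2}.
$looseTooMany = preg_match($loose, 'jul/15/200612');

echo $twoDigitYear, $fourDigitYear, $tooManyDigits, $gluedPrefix, $looseTooMany, "\n"; // Displays "11001"
```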
Example 4
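The following pattern matches exactly 10 non-whitespace characters, followed by one whitespace character, followed by 4 non-whitespace characters. Because the pattern is not anchored, the trailing "!" in the string does not prevent a match.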
<?PHP
$string = "java2s.com demo!";
echo preg_match("/[\S]{10}[\s]{1}[\S]{4}/", $string); // Displays "1"
?>
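If the whole string must have exactly that shape, the pattern can be wrapped in ^ and $. The sketch below uses an equivalent shorthand for the pattern (\S{10}\s\S{4}) and a few made-up subject strings to compare the anchored and unanchored forms.

```php
<?php
$loose  = '/\S{10}\s\S{4}/';   // may match anywhere inside the string
$strict = '/^\S{10}\s\S{4}$/'; // the whole string must fit the pattern

// "demo!" has five characters, so the strict pattern rejects the original string.
$looseOriginal  = preg_match($loose, 'java2s.com demo!');
$strictOriginal = preg_match($strict, 'java2s.com demo!');

// With exactly four characters after the space, both patterns match.
$strictExact = preg_match($strict, 'java2s.com demo');

// Extra leading characters are ignored by the loose pattern but not by the strict one.
$loosePrefixed  = preg_match($loose, 'xxjava2s.com demo');
$strictPrefixed = preg_match($strict, 'xxjava2s.com demo');

echo $looseOriginal, $strictOriginal, $strictExact, $loosePrefixed, $strictPrefixed, "\n"; // Displays "10110"
```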