Java regular expression syntax quick reference - Java Regular Expressions


Single characters

Syntax Matches
x character x.
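\p punctuation character p, matched literally (here p stands for any punctuation character; the backslash removes its special meaning).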
\\ backslash character.
\n newline character \u000A.
\t tab character \u0009.
\r carriage return character \u000D.
\f form feed character \u000C.
\e escape character \u001B.
\a bell character \u0007.
\uhhhh Unicode character with hexadecimal code hhhh.
\xhh character with hexadecimal code hh.
\0n character with octal code n.
\0nn character with octal code nn.
\0nnn character with octal code nnn, in which nnn <= 377.
\cx control character ^x.
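
The escapes above are regex syntax, so in Java source each backslash has to be doubled inside a string literal. The following is a minimal sketch of that; the class name EscapeDemo and the helper matches are made up for illustration and are not part of any API.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // The regex engine sees one backslash for every "\\" in the literal.
        System.out.println(matches("\\t", "\t"));        // regex tab escape vs. a real tab
        System.out.println(matches("\\x41", "A"));       // hexadecimal code 41 is 'A'
        System.out.println(matches("\\u0041", "A"));     // Unicode code 0041, decoded by the regex engine
        System.out.println(matches("\\0101", "A"));      // octal code 101 is 'A'
        System.out.println(matches("\\cA", "\u0001"));   // control character ^A
        System.out.println(matches("\\.", "."));         // escaped punctuation is literal
        System.out.println(matches("\\.", "a"));         // so it no longer matches other characters
        System.out.println(matches("\\\\", "\\"));       // an escaped backslash matches one backslash
    }
}
```

Every call prints true except the escaped dot against "a", which prints false.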

Character classes

Syntax Matches
[...] Any one of the characters between the brackets.
[^...] Any one character not between the brackets.
[a-z0-9] The character range: a character between (inclusive) a and z or 0 and 9.
[0-9[a-fA-F]] The union of classes: same as [0-9a-fA-F].
[a-z&&[aeiou]] The intersection of classes: same as [aeiou].
[a-z&&[^aeiou]] Subtraction: the characters a through z, except for the vowels.
. Any character, except a line terminator. If the DOTALL flag is set, it matches any character, including line terminators.
\d An ASCII digit: [0-9].
\D Anything but an ASCII digit: [^\d].
\s ASCII whitespace: [ \t\n\f\r\x0B].
\S Anything but ASCII whitespace: [^\s].
\w An ASCII word character: [a-zA-Z0-9_].
\W Anything but an ASCII word character: [^\w].
\p{group} Any character in the named group.
\P{group} Any character not in the named group.
\p{Lower} An ASCII lowercase letter: [a-z].
\p{Upper} An ASCII uppercase letter: [A-Z].
\p{ASCII} Any ASCII character: [\x00-\x7f].
\p{Alpha} An ASCII letter: [a-zA-Z].
\p{Digit} An ASCII digit: [0-9].
\p{XDigit} A hexadecimal digit: [0-9a-fA-F].
\p{Alnum} ASCII letter or digit: [\p{Alpha}\p{Digit}].
\p{Punct} ASCII punctuation: one of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~.
\p{Graph} A visible ASCII character: [\p{Alnum}\p{Punct}].
\p{Print} A printable ASCII character: [\p{Graph}\x20], that is, \p{Graph} plus the space character.
\p{Blank} An ASCII space or tab: [ \t].
\p{Space} ASCII whitespace: [ \t\n\f\r\x0b].
\p{Cntrl} An ASCII control character: [\x00-\x1f\x7f].
\p{category} Any character in the named Unicode category. One-letter codes include L for letter, N for number, S for symbol, Z for separator, and P for punctuation. Two-letter codes represent subcategories, such as Lu for uppercase letter, Nd for decimal digit, Sc for currency symbol, Sm for math symbol, and Zs for space separator.
\p{block} Any character in the named Unicode block. In Java regular expressions, block names begin with "In", followed by mixed-case capitalization of the Unicode block name, without spaces or underscores. For example: \p{InOgham} or \p{InMathematicalOperators}.
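
To see how the class syntax above behaves, here is a small sketch that filters a string down to the characters a given class accepts. CharClassDemo and keep are hypothetical names chosen for this example. It assumes default flags, under which \p{Upper} stays ASCII-only while \p{Lu} follows the Unicode category.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: concatenates every match of charClass found in input.
    static String keep(String charClass, String input) {
        Matcher m = Pattern.compile(charClass).matcher(input);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Subtraction: lowercase letters except the vowels.
        System.out.println(keep("[a-z&&[^aeiou]]", "regular"));   // rglr
        // Union: same set as [0-9a-fA-F].
        System.out.println(keep("[0-9[a-fA-F]]", "0x1F, go"));    // 01F
        // ASCII punctuation includes the underscore.
        System.out.println(keep("\\p{Punct}", "a-b_c!"));         // -_!
        // The string below is 'A', capital E with acute accent, 'b'.
        String mixed = "A\u00C9b";
        System.out.println(keep("\\p{Upper}", mixed).length());   // 1: ASCII uppercase only
        System.out.println(keep("\\p{Lu}", mixed).length());      // 2: Unicode uppercase letters
        // Unicode block: Greek small alpha sits in the Greek block.
        System.out.println(keep("\\p{InGreek}", "a\u03B1b").length()); // 1
    }
}
```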

Sequences, alternatives, groups, and references

Syntax Matches
xy Match x followed by y.
x|y Match x or y.
(...) Group subexpression within parentheses into a single unit that can be used with *, +, ?, |, and so on.
(?:...) Grouping only. Group subexpression as with ( ), but do not capture the text that matched.
\n Backreference, where n is a group number: match the same characters that capturing group number n most recently matched.
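
As a rough illustration of capturing groups, non-capturing groups, and backreferences, the sketch below finds a doubled word and counts capturing groups. The names GroupDemo, firstDoubledWord, and captureCount are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Returns the first word that is immediately repeated, or null if there is none.
    static String firstDoubledWord(String text) {
        // (\w+) captures a word as group 1; \1 requires the same text again.
        Matcher m = Pattern.compile("\\b(\\w+) \\1\\b").matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Returns the number of capturing groups in regex; (?:...) groups are not counted.
    static int captureCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(firstDoubledWord("it was was fine"));  // was
        System.out.println(firstDoubledWord("this is fine"));     // null
        System.out.println(captureCount("(a)(?:b)(c)"));          // 2
        // Alternation inside a group: matches "cats" or "dogs".
        System.out.println(Pattern.matches("(?:cat|dog)s", "dogs")); // true
    }
}
```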

Repetition

Syntax Matches
x? Zero or one occurrence of x; i.e., x is optional.
x* Zero or more occurrences of x.
x+ One or more occurrences of x.
x{n} Exactly n occurrences of x.
x{n,} n or more occurrences of x.
x{n,m} At least n, and at most m occurrences of x.
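
The quantifiers above can be checked with whole-string matches. This is only a sketch; RepetitionDemo and its matches helper are illustrative names.

```java
import java.util.regex.Pattern;

class RepetitionDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("colou?r", "color"));   // true: the u is optional
        System.out.println(matches("colou?r", "colouur")); // false: at most one u
        System.out.println(matches("ab*", "a"));           // true: zero b's allowed
        System.out.println(matches("ab+", "a"));           // false: at least one b required
        System.out.println(matches("\\d{3}", "123"));      // true: exactly three digits
        System.out.println(matches("\\d{3}", "1234"));     // false: four digits
        System.out.println(matches("\\d{2,}", "12345"));   // true: two or more digits
        System.out.println(matches("a{2,4}", "aaaaa"));    // false: more than four a's
    }
}
```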

Anchors

Syntax Matches
^ The beginning of the input string or, if the MULTILINE flag is specified, the beginning of the string or of any line within the string.
$ The end of the input string or, if the MULTILINE flag is specified, the end of the string or of any line within the string.
\b A word boundary.
\B A position in the string that is not a word boundary.
\A The beginning of the input string. Like ^, but never matches the beginning of a new line, regardless of what flags are set.
\Z The end of the input string, ignoring any trailing line terminator.
\z The end of the input string, including any line terminator.
\G The end of the previous match.
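(?=x) A positive look-ahead assertion: succeeds if x matches at the current position, without consuming any text.
(?!x) A negative look-ahead assertion: succeeds if x does not match at the current position.
(?<=x) A positive look-behind assertion: succeeds if x matches the text ending at the current position.
(?<!x) A negative look-behind assertion: succeeds if x does not match the text ending at the current position.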
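
Anchors and look-around assertions match positions rather than characters, which is easiest to see by counting matches. The sketch below does that; AnchorDemo and count are hypothetical names, and a flags value of 0 means no flags are set.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: counts how often regex (compiled with flags) is found in input.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String lines = "one\ntwo\nthree";
        System.out.println(count("^\\w+", 0, lines));                  // 1: start of input only
        System.out.println(count("^\\w+", Pattern.MULTILINE, lines));  // 3: start of every line
        System.out.println(count("\\bcat\\b", 0, "cat concat cat"));   // 2: whole words only
        // Look-ahead: digits followed by " USD", without consuming " USD".
        System.out.println(count("\\d+(?= USD)", 0, "10 USD, 20 EUR, 30 USD")); // 2
        // Look-behind: digits preceded by a dollar sign.
        System.out.println(count("(?<=\\$)\\d+", 0, "$5 and 7"));      // 1
        // Uppercase Z ignores a trailing line terminator; lowercase z does not.
        System.out.println(count("end\\Z", 0, "end\n"));               // 1
        System.out.println(count("end\\z", 0, "end\n"));               // 0
    }
}
```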

Miscellaneous

Syntax Matches
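(?>x) Match x independently of the rest of the expression: once x has matched, the matcher does not backtrack into it, even if that causes the rest of the expression to fail.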
(?onflags-offflags) Don't match anything, but turn on the flags specified by onflags, and turn off the flags specified by offflags.
(?onflags-offflags:x) Match x, applying the specified flags to this subexpression only. This is a noncapturing group, such as (?:...), with the addition of flags.
\Q Don't match anything, but quote all subsequent pattern text until \E.
\E Don't match anything; terminate a quote started with \Q.
#comment If the COMMENTS flag is set, pattern text between a # and the end of the line is considered a comment and is ignored. Whitespace in the pattern is also ignored in this mode.
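
The constructs in this table can be tried out as follows. This is a sketch under the assumption that whole-string matching is wanted; MiscDemo and its matches helpers are illustrative names, while Pattern.COMMENTS is the flag constant in java.util.regex.Pattern.

```java
import java.util.regex.Pattern;

class MiscDemo {
    // Hypothetical helper: true if the whole input matches regex compiled with flags.
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    // Same as above with no flags set.
    static boolean matches(String regex, String input) {
        return matches(regex, 0, input);
    }

    public static void main(String[] args) {
        // Inline flag: case-insensitive from this point on.
        System.out.println(matches("(?i)hello", "HELLO"));      // true
        // Flag scoped to one non-capturing group: only the b is case-insensitive.
        System.out.println(matches("a(?i:b)c", "aBc"));         // true
        System.out.println(matches("a(?i:b)c", "ABc"));         // false
        // Quoting: the plus sign is literal between the quote markers.
        System.out.println(matches("\\Q1+1=2\\E", "1+1=2"));    // true
        System.out.println(matches("1+1=2", "1+1=2"));          // false: the plus is a quantifier here
        // Comments mode: whitespace and the text after # are ignored.
        System.out.println(matches("\\d+  # digits\n-\\d+", Pattern.COMMENTS, "12-34")); // true
        // Atomic group: a+ keeps all three a's, so the following a cannot match.
        System.out.println(matches("(?>a+)ab", "aaab"));        // false
        System.out.println(matches("a+ab", "aaab"));            // true: backtracking frees one a
    }
}
```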
