Unicode character classes (XML Schema pattern)






Unicode Character Class   Includes 
C                           Other characters (non-letters, non-marks, non-numbers, non-punctuation, non-symbols, non-separators)  
Cc                           Control characters 
Cf                           Format characters 
Cn                           Unassigned code points 
Co                           Private use characters 
L                           Letters 
Ll                           Lowercase letters 
Lm                           Modifier letters 
Lo                           Other letters 
Lt                           Titlecase letters 
Lu                           Uppercase letters 
M                           Marks 
Mc                           Spacing combining marks 
Me                           Enclosing marks 
Mn                           Non-spacing marks 
N                           Numbers 
Nd                           Decimal digits 
Nl                           Letter numbers (for example, Roman numerals) 
No                           Other numbers 
P                           Punctuation 
Pc                           Connector punctuation 
Pd                           Dashes 
Pe                           Closing punctuation 
Pf                           Final quotes (may behave like Ps or Pe) 
Pi                           Initial quotes (may behave like Ps or Pe) 
Po                           Other forms of punctuation 
Ps                           Opening punctuation 
S                           Symbols 
Sc                           Currency symbols 
Sk                           Modifier symbols 
Sm                           Mathematical symbols 
So                           Other symbols 
Z                           Separators 
Zl                           Line separators 
Zp                           Paragraph separators 
Zs                           Spaces
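
In an XML Schema pattern, these class names are used inside a category escape: \p{Lu} matches one character from the class and \P{Lu} matches one character outside it. As a rough way to check which class a given character falls in, the sketch below uses Python's standard unicodedata module. The helper name in_class and the sample characters are illustrative only, and Python's bundled Unicode tables may differ in version from those used by a given schema processor.

```python
import unicodedata


def in_class(ch: str, cls: str) -> bool:
    """Return True if the general category of ch falls in cls.

    cls may be a one-letter class such as 'L' (all letters) or a
    two-letter class such as 'Lu' (uppercase letters only).
    in_class is a hypothetical helper, not part of any XML library.
    """
    return unicodedata.category(ch).startswith(cls)


# Illustrative sample characters mapped to their two-letter class.
samples = ["A", "a", "5", " ", "$", "-", "_"]
categories = {ch: unicodedata.category(ch) for ch in samples}

for ch, cat in categories.items():
    print(repr(ch), cat)
```

A one-letter class is the union of its two-letter subclasses, so in_class("A", "L") and in_class("A", "Lu") both hold, while in_class("a", "Lu") does not.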







