Appendix

Regular Expression Syntax

Below is a quick reference for the most common regular expression constructs supported by IxoraRMS. The full supported syntax is that of the java.util.regex.Pattern class, as described in the Java documentation.


Characters
  • x The character x 
  • \\ The backslash character 
  • \0n The character with octal value 0n (0 <= n <= 7) 
  • \0nn The character with octal value 0nn (0 <= n <= 7) 
  • \0mnn The character with octal value 0mnn (0 <= m <= 3, 0 <= n <= 7) 
  • \xhh The character with hexadecimal value 0xhh 
  • \uhhhh The character with hexadecimal value 0xhhhh 
  • \t The tab character ('\u0009') 
  • \n The newline (line feed) character ('\u000A') 
  • \r The carriage-return character ('\u000D') 
  • \f The form-feed character ('\u000C') 
  • \a The alert (bell) character ('\u0007') 
  • \e The escape character ('\u001B') 
  • \cx The control character corresponding to x 
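A few of the escapes above can be tried directly against the Pattern class. This is a minimal sketch; the class name EscapeDemo and the helper matches are made up for illustration and are not part of IxoraRMS. Each backslash is doubled only because the expressions are written as Java string literals here.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class EscapeDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\x41", "A"));     // two-digit hexadecimal form
        System.out.println(matches("\\u0041", "A"));   // four-digit hexadecimal form
        System.out.println(matches("\\0101", "A"));    // octal form
        System.out.println(matches("a\\tb", "a\tb"));  // tab character
        System.out.println(matches("\\\\", "\\"));     // a single backslash
    }
}
```

All five lines print true.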


Character classes
  • [abc] a, b, or c (simple class) 
  • [^abc] Any character except a, b, or c (negation) 
  • [a-zA-Z] a through z or A through Z, inclusive (range) 
  • [a-d[m-p]] a through d, or m through p: [a-dm-p] (union) 
  • [a-z&&[def]] d, e, or f (intersection) 
  • [a-z&&[^bc]] a through z, except for b and c: [ad-z] (subtraction) 
  • [a-z&&[^m-p]] a through z, and not m through p: [a-lq-z] (subtraction) 
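The union, intersection and subtraction forms listed above can be checked with a short Java sketch. The class name CharClassDemo and its helper are illustrative assumptions, not IxoraRMS code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class CharClassDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[a-d[m-p]]", "n"));       // union: true
        System.out.println(matches("[a-z&&[def]]", "e"));     // intersection: true
        System.out.println(matches("[a-z&&[^bc]]", "b"));     // subtraction: false
        System.out.println(matches("[a-z&&[^m-p]]+", "aqz")); // subtraction: true
    }
}
```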


Predefined character classes
  • . Any character (may or may not match line terminators) 
  • \d A digit: [0-9] 
  • \D A non-digit: [^0-9] 
  • \s A whitespace character: [ \t\n\x0B\f\r] 
  • \S A non-whitespace character: [^\s] 
  • \w A word character: [a-zA-Z_0-9] 
  • \W A non-word character: [^\w] 
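As a rough illustration of the predefined classes, the sketch below matches a word followed by whitespace and digits. It also shows why "." is described as "may or may not match line terminators": in Java's Pattern it matches them only when the DOTALL flag is set. Whether IxoraRMS sets this flag is not stated here, so treat the second helper as an assumption. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class PredefinedClassDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Same check, but with DOTALL so that '.' also matches line terminators.
    static boolean matchesDotAll(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("\\w+\\s\\d+", "cpu 42")); // true
        System.out.println(matches("\\D+", "cpu42"));         // false: input contains digits
        System.out.println(matches(".", "\n"));               // false by default
        System.out.println(matchesDotAll(".", "\n"));         // true with DOTALL
    }
}
```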


POSIX character classes (US-ASCII only)
  • \p{Lower} A lower-case alphabetic character: [a-z] 
  • \p{Upper} An upper-case alphabetic character: [A-Z] 
  • \p{ASCII} All ASCII: [\x00-\x7F] 
  • \p{Alpha} An alphabetic character: [\p{Lower}\p{Upper}] 
  • \p{Digit} A decimal digit: [0-9] 
  • \p{Alnum} An alphanumeric character: [\p{Alpha}\p{Digit}] 
  • \p{Punct} Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ 
  • \p{Graph} A visible character: [\p{Alnum}\p{Punct}] 
  • \p{Print} A printable character: [\p{Graph}] 
  • \p{Blank} A space or a tab: [ \t] 
  • \p{Cntrl} A control character: [\x00-\x1F\x7F] 
  • \p{XDigit} A hexadecimal digit: [0-9a-fA-F] 
  • \p{Space} A whitespace character: [ \t\n\x0B\f\r] 

Classes for Unicode blocks and categories
  • \p{InGreek} A character in the Greek block (simple block) 
  • \p{Lu} An uppercase letter (simple category) 
  • \p{Sc} A currency symbol 
  • \P{InGreek} Any character except one in the Greek block (negation) 
  • [\p{L}&&[^\p{Lu}]]  Any letter except an uppercase letter (subtraction) 
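The difference between the US-ASCII POSIX classes and the Unicode classes is easiest to see with a non-ASCII letter. The sketch below uses the letter É (written as a Unicode escape), a Greek alpha and a dollar sign. The class name UnicodeClassDemo is an assumption made for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class UnicodeClassDemo {
    // True when the entire input matches the regular expression.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String eAcute = "\u00C9"; // upper-case E with acute accent
        String alpha = "\u03B1";  // Greek small letter alpha

        System.out.println(matches("\\p{Upper}", "A"));             // true
        System.out.println(matches("\\p{Upper}", eAcute));          // false: POSIX class is US-ASCII only
        System.out.println(matches("\\p{Lu}", eAcute));             // true: Unicode upper-case letter
        System.out.println(matches("\\p{InGreek}", alpha));         // true
        System.out.println(matches("\\P{InGreek}", alpha));         // false: negated block
        System.out.println(matches("[\\p{L}&&[^\\p{Lu}]]", alpha)); // true: a letter, not upper-case
        System.out.println(matches("\\p{Sc}", "$"));                // true: currency symbol
    }
}
```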


Boundary matchers
  • ^ The beginning of a line 
  • $ The end of a line 
  • \b A word boundary 
  • \B A non-word boundary 
  • \A The beginning of the input 
  • \G The end of the previous match 
  • \Z The end of the input but for the final terminator, if any 
  • \z The end of the input 
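In Java's Pattern, ^ and $ match at line boundaries only when the MULTILINE flag is set; by default they match at the beginning and end of the whole input. Whether IxoraRMS enables that flag is not stated here, so the sketch below shows both cases. The class BoundaryDemo and its count helper are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class BoundaryDemo {
    // Counts the non-overlapping matches of the pattern in the input.
    static int count(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "alpha\nbeta";
        System.out.println(count(Pattern.compile("^\\w+$"), text));                    // 0
        System.out.println(count(Pattern.compile("^\\w+$", Pattern.MULTILINE), text)); // 2
        System.out.println(count(Pattern.compile("\\bcat\\b"), "concat cat"));         // 1
        System.out.println(count(Pattern.compile("end\\Z"), "the end\n"));             // 1
        System.out.println(count(Pattern.compile("end\\z"), "the end\n"));             // 0
    }
}
```

The last two lines show the difference between \Z, which ignores a final line terminator, and \z, which does not.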


Greedy quantifiers
  • X? X, once or not at all 
  • X* X, zero or more times 
  • X+ X, one or more times 
  • X{n} X, exactly n times 
  • X{n,} X, at least n times 
  • X{n,m} X, at least n but not more than m times 
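"Greedy" means that a quantifier takes as much of the input as it can while still letting the overall expression match. A small sketch, using a hypothetical firstMatch helper that returns the first substring matched:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class QuantifierDemo {
    // Returns the first substring matched by the regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("colou?r", "color"));   // color
        System.out.println(firstMatch("ab*", "ac"));          // a
        System.out.println(firstMatch("ab+", "ac"));          // null
        System.out.println(firstMatch("\\d{2,3}", "12345"));  // 123
        System.out.println(firstMatch("<.+>", "<a><b>"));     // <a><b>
    }
}
```

In the last case the greedy .+ runs to the final ">" rather than stopping at the first one.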


Capturing groups
Capturing groups are created by enclosing parts of the regular expression in parentheses (). Groups are numbered from left to right by the position of their opening parenthesis. The string matched by a capturing group can be referenced later with $n tags, where $1 .. $n stand for capturing groups 1 to n.
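Java's own replacement methods use the same $n notation, so the behaviour can be sketched outside IxoraRMS. The example below is only an illustration with made-up names (GroupDemo, swap, group); where exactly IxoraRMS accepts $n tags depends on the attribute being configured.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of IxoraRMS.
class GroupDemo {
    // Rewrites every "name=value" pair as "value:name" using $1 and $2.
    static String swap(String input) {
        return input.replaceAll("(\\w+)=(\\d+)", "$2:$1");
    }

    // Returns the text captured by group n of the first match, or null if no match.
    static String group(String regex, String input, int n) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(n) : null;
    }

    public static void main(String[] args) {
        System.out.println(swap("cpu=42 mem=7"));                     // 42:cpu 7:mem
        System.out.println(group("(\\d+)-(\\d+)", "range 10-20", 1)); // 10
        System.out.println(group("(\\d+)-(\\d+)", "range 10-20", 2)); // 20
    }
}
```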


Formatting Syntax

The <format> attributes in IxoraRMS accept the standard Java syntax for number and date formatting (DecimalFormat and SimpleDateFormat). For full details, refer to the Java documentation.

Formatting tokens for numbers (each entry lists the symbol, where it may appear in the pattern, and its meaning):
  • 0  Number  Digit  
  • #  Number  Digit, zero shows as absent  
  • .  Number  Decimal separator or monetary decimal separator  
  • -  Number  Minus sign  
  • ,  Number  Grouping separator  
  • E  Number  Separates mantissa and exponent in scientific notation. Need not be quoted in prefix or suffix.  
  • ;  Subpattern boundary  Separates positive and negative subpatterns  
  • %  Prefix or suffix  Multiply by 100 and show as percentage  
  • \u2030  Prefix or suffix  Multiply by 1000 and show as per mille  
  • \u00A4  Prefix or suffix  Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If present in a pattern, the monetary decimal separator is used instead of the decimal separator.  
  • '  Prefix or suffix  Used to quote special characters in a prefix or suffix, for example, "'#'#" formats 123 to "#123". To create a single quote itself, use two in a row: "# o''clock".  
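To see how these tokens combine, the sketch below applies a few patterns with DecimalFormat directly. The class NumberFormatDemo and its format helper are hypothetical. US symbols are passed explicitly so that the decimal and grouping separators do not depend on the machine's default locale, which is an assumption made for reproducibility and not necessarily what IxoraRMS does.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class, not part of IxoraRMS.
class NumberFormatDemo {
    // Formats a value with the given pattern, using US symbols so the
    // output does not depend on the default locale.
    static String format(String pattern, double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format("#,##0.00", 1234.5));             // 1,234.50
        System.out.println(format("000", 7));                       // 007
        System.out.println(format("0.0%", 0.256));                  // 25.6%
        System.out.println(format("#,##0.00;(#,##0.00)", -1234.5)); // (1,234.50)
        System.out.println(format("'#'#", 123));                    // #123
    }
}
```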

Formatting tokens for dates:
  • G  Era designator
  • y  Year
  • M  Month in year
  • w  Week in year
  • W  Week in month
  • D  Day in year
  • d  Day in month
  • F  Day of week in month
  • E  Day in week
  • a  Am/pm marker
  • H  Hour in day (0-23)
  • k  Hour in day (1-24)
  • K  Hour in am/pm (0-11)
  • h  Hour in am/pm (1-12)
  • m  Minute in hour
  • s  Second in minute
  • S  Millisecond
  • z  Time zone (General)
  • Z  Time zone (RFC 822)
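
The date tokens can be combined in the same way. The following sketch formats a fixed instant (the Unix epoch) with SimpleDateFormat. The class DateFormatDemo and its helper are hypothetical, and the UTC time zone and US locale are fixed here only to make the output reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of IxoraRMS.
class DateFormatDemo {
    // Formats the given epoch milliseconds in UTC with US month and day names.
    static String format(String pattern, long epochMillis) {
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.US);
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(format("yyyy-MM-dd HH:mm:ss Z", 0L)); // 1970-01-01 00:00:00 +0000
        System.out.println(format("EEE, d MMM yyyy", 0L));       // Thu, 1 Jan 1970
        System.out.println(format("yyyy-DDD", 0L));              // 1970-001
    }
}
```

Repeating a letter changes the width or style of the field: for example, MM gives a two-digit month while MMM gives its abbreviated name.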