Oracle
10g Regular Expression Support
Regular
Expression Support
It is a method for
simple and complex patterns for searching and manipulating a text. You can
search, extract, format, and manipulate a text in the database. At the
beginning, it appears that the syntax is not very intuitive but by second look
it may look easy. The technique more reflects as UNIX style regular expressions.
Interfaces: Oracle Regular Expressions are implemented by the
following functions available in SQL and PL/SQL.
-
REGEXP_LIKE,
-
REGEXP_REPLACE,
-
REGEXP_INSTR, and
-
REGEXP_SUBSTR
Metacharacters: The following is a list of supported Oracle
metacharacters use in Oracle Regular Expressions.
Syntax
|
Description
|
Classification
|
|
Match any
character
|
Dot
|
a?
|
Match "a"
zero or one time
|
Quantifier
|
a*
|
Match "a"
zero or more time
|
Quantifier
|
a+
|
Match "a"
one or more time
|
Quantifier
|
a|b
|
Match either
"a" or "b"
|
Alternation
|
a{m}
|
Match "a"
exactly m times
|
Quantifier
|
a{m,}
|
Match "a"
at least m times
|
Quantifier
|
a{m,n}
|
Match "a"
between m and n times
|
Quantifier
|
[abc]
|
Match either
"a" or "b" or "c"
|
Bracket
Expression
|
(")
|
Group an
expression
|
Subexpression
|
\n
|
Match nth
subexpression
|
Backreference
|
[:cc:]
|
Match
character class in bracket expression
|
Character
Class
|
[.ce.]
|
Match
collation element in bracket expression
|
Collation
Element
|
[=ec=]
|
Match
equivalence class in bracket expression
|
Equivalence
Class
|
NOTE:
It is important to
realize that the results of a particular regular expression query could be
different under a different locale such as changing NLS_SORT, Case and Accent
Sensitivity, etc.
Character Classes: They
are sensitive to the underlying character set such as the [:lower:] character
class.
The
following is a list of Oracle supports character classes, based on character
class definitions in NLS classification data:
Character
Class Syntax
|
Meaning
|
[:alnum:]
|
All
alphanumeric characters
|
[:alpha:]
|
All
alphabetic characters
|
[:blank:]
|
All blank
space characters.
|
[:cntrl:]
|
All control
characters (nonprinting)
|
[:digit:]
|
All numeric
digits
|
[:graph:]
|
All [:punct:],
[:upper:], [:lower:], and [:digit:] characters.
|
[:lower:]
|
All lowercase
alphabetic characters
|
[:print:]
|
All printable
characters
|
[:punct:]
|
All
punctuation characters
|
[:space:]
|
All space
characters (nonprinting)
|
[:upper:]
|
All uppercase
alphabetic characters
|
[:xdigit:]
|
All valid
hexadecimal characters
|
Example:
Consider a simple query to convert the "McLean" city name to a more readable
format. You should look for any instance for a lower case letter immediately
followed by an upper case letter. Your query should record these two letters in
backreferences by using subexpressions, then replaces the first one, followed by
a space, then followed by the second letter:
SQL>
SELECT
REGEXP_REPLACE("McLean",
"([[:lower:]])([[:upper:]])", "\1 \2") as "City"
FROM dual;
Equuivalence
Class: It allows searching for all characters that have a common base
letter.
Example:
SQL>
SELECT
REGEXP_SUBSTR
("iSelfSchooling NOT ISelfSchooling", "[[=i=]]SelfSchooling") as
name
FROM dual;
How to use REGULAR
EXPRESSIONS in Oracle?
Keep this in your
mind that these functions support CHAR, VARCHAR2, CLOB, NCHAR, NVARCHAR, and
NCLOB datatypes.
REGEXP_LIKE
function
It returns a
Boolean indicating whether the pattern matched or not.
Example:
Consider to write an expression that could search for common inflections of the
verb "try". The following regular expression will match try, trying, tried,
and tries.
SQL>
SELECT
REGEXP_LIKE ("We are trying to make the subject easier.",
"tr(y(ing)? | (ied) | (ies))") as REGEXT_SAMPLE
FROM dual;
REGEXP_SUBSTR
function
It returns the
actual data that matches the specified pattern.
Example:
Consider to write an expression that could return the "trying" specified
pattern.
SQL>
SELECT
REGEXP_SUBSTR ("We are trying to make the subject easier.",
"tr(y(ing)? | (ied) | (ies))") as REGEXT_SAMPLE
FROM dual;
REGEXP_INSTR
function
It returns the
character position of either the beginging or end of the match.
Example:
Consider to write an expression that could return the position of "trying"
specified pattern.
SQL>
SELECT
REGEXP_INSTR ("We are trying to make the subject easier.",
"tr(y(ing)? | (ied) | (ies))") as REGEXT_SAMPLE
FROM dual;
REGEXP_REPLACE
function
It looks for an
occurrence of a regular expression and replaces it with the contents of a
supplied text literal.
Example:
SQL>
SELECT
REGEXP_REPLACE ("We are trying to make the subject easier.",
".", " for you." ) as REGEXT_SAMPLE
FROM dual;
Or, query a list of
all employees" name that hired between 1996 and 1999.
SQL>
SELECT ename FROM emp
WHERE REGEXP_REPLACE
(TO_CHAR(hire_date, "YYYY"), "^199[6-9]$");
You used "^" to
indicate that the beginning of the line has to be 199, and [-] with $ to specify
range of valid characters.
Occurance
in the REGEXP functions
All fuctions take
an occurrence that specifies you require the nth matching expression in
REGEXP_SUBSTR and REGEXP_INSTR, the default for which is 1.
Example:
Consider extracting the 3th field being the Oracle system idenfication in a
column.
SQL>
SELECT
REGEXP_SUBSTR("system/password@myhost:1521:mysid",
"[^:]+", 1, 3) as "SID name"
FROM dual;
You also can the
REGEXP function in the DDL statement. Consider to configure a column to allow
only alphabetical characters within a VARCHAR2 column.
SQL>
CREATE TABLE mytest (c1 VARCHAR2(20),
CHECK (REGEXP_LIKE(c1, "^[[:alpha:]]+$)));
|