Welcome, Guest

Oracle Regular Expression Support - REGEXP_SUBSTR

Options

Previous Topic

Next Topic

	codeling 1606 - 6684
	@2015-12-16 09:08:57

REGEXP_SUBSTR extends the functionality of the SUBSTR function by letting you search a string for a regular expression pattern. It is also similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself. This function is useful if you need the contents of a match string but not its position in the source string.

Syntax :

REGEXP_SUBSTR(source_char, pattern
              [, position
                 [, occurrence
                    [, match_param
                       [, subexpr
                       ]
                    ]
                 ]
              ]
             )

Name	Description
source_char	source_char is a character expression that serves as the search value. It is commonly a character column and can be of any of the data types CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB or NCLOB.
pattern	pattern is the regular expression. It is usually a text literal and can be of any of the data types CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the data type of pattern is different from the data type of source_char, then Oracle Database converts pattern to the data type of source_char.
position	position is a positive integer indicating the character of source_char where Oracle should begin the search. The default is 1, meaning that Oracle begins the search at the first character of source_char.
occurrence	occurrence is a positive integer indicating which occurrence of pattern in source_char Oracle should search for. The default is 1, meaning that Oracle searches for the first occurrence of pattern. If occurrence is greater than 1, then the database searches for the second occurrence beginning with the first character following the first occurrence of pattern, and so forth. If occurrence is greater than 1, then the database searches for the second occurrence beginning with the first character following the first occurrence of pattern, and so forth.
match_param	match_parameter is a text literal that lets you change the default matching behavior of the function. This argument affects only the matching process and has no effect on replace_string. You can specify one or more of the following values for match_parameter:
subexpr	For a pattern with subexpressions, subexpr is a nonnegative integer from 0 to 9 indicating which subexpression in pattern is to be returned by the function.

Characters

Character Class	Description
^	Anchor the expression to the start of a line
$	Anchor the expression to the end of a line

Equivalence Classes

Character Class	Description
= =	Oracle supports the equivalence classes through the POSIX '[==]' syntax. A base letter and all of its accented versions constitute an equivalence class. For example, the equivalence class '[=a=]' matches ä and â. The equivalence classes are valid only inside the bracketed expression

Match Options

Character Class	Description
c	Case sensitive matching
i	Case insensitive matching
m	Treat source string as multi-line activating Anchor chars
n	Allow the period (.) to match any newline character

Posix Characters

Character Class	Description
[:alnum:]	Alphanumeric characters
[:alpha:]	Alphabetic characters
[:blank:]	Blank Space Characters
[:cntrl:]	Control characters (nonprinting)
[:digit:]	Numeric digits
[:graph:]	Any [:punct:], [:upper:], [:lower:], and [:digit:] chars
[:lower:]	Lowercase alphabetic characters
[:print:]	Printable characters
[:punct:]	Punctuation characters
[:space:]	Space characters (nonprinting), such as carriage return, newline, vertical tab, and form feed
[:upper:]	Uppercase alphabetic characters
[:xdigit:]	Hexidecimal characters

Quantifier Characters

Character Class	Description
*	Match 0 or more times
?	Match 0 or 1 time
+	Match 1 or more times
{m}	Match exactly m times
{m,}	Match at least m times
{m, n}	Match at least m times but no more than n times
\n	Cause the previous expression to be repeated n times

Alternative Matching And Grouping Characters

Character Class	Description
\|	Separates alternates, often used with grouping operator ()
( )	Groups subexpression into a unit for alternations, for quantifiers, or for backreferencing (see "Backreferences" section)
[char]	Indicates a character list; most metacharacters inside a character list are understood as literals, with the exception of character classes, and the ^ and - metacharacters

@2015-12-16 09:13:26

The following example parse the last name and first from the full name string:

select

regexp_substr('Bond,Jame', '[^,]+', 1, 1) last_name, -- first occurrence of non-comma substring

regexp_substr('Bond,Jame', '[^,]+', 1, 2) first_name -- second occurrence of non-comma substring

from dual;

output:

LAST_NAME FIRST_NAME Bond Jame

@2015-12-16 09:27:35

The following example examines the string, looking for http:// followed by a substring of one or more alphanumeric characters and optionally, a period (.). Oracle searches for a minimum of two occurrences of this substring between http:// and either a slash (/) or the end of the string.

SELECT
  REGEXP_SUBSTR('http://www.example.com/products',
                'http://([[:alnum:]]+\.?)+{2,}/?') "Host_Url"
  FROM DUAL;

HOST_URL
----------------------
http://www.example.com/

@2015-12-16 14:28:39

Select the second substring of letters that ends with "thy."

SELECT REGEXP_SUBSTR('healthy, wealthy, and wise','\w+thy',1,2);

REGEXP_SUBSTR

---------------

wealthy

Previous Topic

Next Topic

Comments

You must Sign In to comment on this topic.

Print this topic

Watch this topic

Normal

Threaded