web analytics

Oracle Regular Expression Support - REGEXP_SUBSTR

Options

codeling 1602 - 6666
@2015-12-16 09:08:57

REGEXP_SUBSTR extends the functionality of the SUBSTR function by letting you search a string for a regular expression pattern. It is also similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself. This function is useful if you need the contents of a match string but not its position in the source string.

Syntax :

REGEXP_SUBSTR(source_char, pattern
              [, position
                 [, occurrence
                    [, match_param
                       [, subexpr
                       ]
                    ]
                 ]
              ]
             )

 

Name Description
source_char source_char is a character expression that serves as the search value. It is commonly a character column and can be of any of the data types  CHAR,  VARCHAR2,  NCHAR,  NVARCHAR2, CLOB  or NCLOB.
pattern pattern is the regular expression. It is usually a text literal and can be of any of the data types CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the data type of pattern is different from the data type of source_char, then Oracle Database converts pattern to the data type of source_char.
position

position is a positive integer indicating the character of source_char where Oracle should begin the search. The default is 1, meaning that Oracle begins the search at the first character of source_char.

occurrence occurrence is a positive integer indicating which occurrence of pattern in source_char Oracle should search for. The default is 1, meaning that Oracle searches for the first occurrence of pattern. If occurrence is greater than 1, then the database searches for the second occurrence beginning with the first character following the first occurrence of pattern, and so forth. If occurrence is greater than 1, then the database searches for the second occurrence beginning with the first character following the first occurrence of pattern, and so forth.
match_param match_parameter is a text literal that lets you change the default matching behavior of the function. This argument affects only the matching process and has no effect on replace_string. You can specify one or more of the following values for match_parameter:
subexpr For a pattern with subexpressions, subexpr is a nonnegative integer from 0 to 9 indicating which subexpression in pattern is to be returned by the function.

 

Characters
Character Class Description
^ Anchor the expression to the start of a line
$ Anchor the expression to the end of a line

Equivalence Classes
Character Class Description
= = Oracle supports the equivalence classes through the POSIX '[==]' syntax. A base letter and all of its accented versions constitute an equivalence class. For example, the equivalence class '[=a=]' matches ä and â. The equivalence classes are valid only inside the bracketed expression

Match Options
Character Class Description
c Case sensitive matching
i Case insensitive matching
m Treat source string as multi-line activating Anchor chars
n Allow the period (.) to match any newline character

Posix Characters
Character Class Description
[:alnum:] Alphanumeric characters
[:alpha:] Alphabetic characters
[:blank:] Blank Space Characters
[:cntrl:] Control characters (nonprinting)
[:digit:] Numeric digits
[:graph:] Any [:punct:], [:upper:], [:lower:], and [:digit:] chars
[:lower:] Lowercase alphabetic characters
[:print:] Printable characters
[:punct:] Punctuation characters
[:space:] Space characters (nonprinting), such as carriage return, newline, vertical tab, and form feed
[:upper:] Uppercase alphabetic characters
[:xdigit:] Hexidecimal characters

Quantifier Characters
Character Class Description
* Match 0 or more times
? Match 0 or 1 time
+ Match 1 or more times
{m} Match exactly m times
{m,} Match at least m times
{m, n} Match at least m times but no more than n times
\n Cause the previous expression to be repeated n times

Alternative Matching And Grouping Characters
Character Class Description
| Separates alternates, often used with grouping operator ()
( ) Groups subexpression into a unit for alternations, for quantifiers, or for backreferencing (see "Backreferences" section)
[char] Indicates a character list; most metacharacters inside a character list are understood as literals, with the exception of character classes, and the ^ and - metacharacters

 

@2015-12-16 09:13:26

The following example parse the last name and first from the full name string:

select

     regexp_substr('Bond,Jame', '[^,]+', 1, 1) last_name,  -- first occurrence of non-comma substring

     regexp_substr('Bond,Jame', '[^,]+', 1, 2) first_name  -- second occurrence of non-comma substring

from dual;

 

output:

LAST_NAME    FIRST_NAME
Bond         Jame

  

@2015-12-16 09:27:35

The following example examines the string, looking for http:// followed by a substring of one or more alphanumeric characters and optionally, a period (.). Oracle searches for a minimum of two occurrences of this substring between http:// and either a slash (/) or the end of the string.

SELECT
  REGEXP_SUBSTR('http://www.example.com/products',
                'http://([[:alnum:]]+\.?)+{2,}/?') "Host_Url"
  FROM DUAL;

HOST_URL
----------------------
http://www.example.com/

 

@2015-12-16 14:28:39

Select the second substring of letters that ends with "thy."

SELECT REGEXP_SUBSTR('healthy, wealthy, and wise','\w+thy',1,2);

REGEXP_SUBSTR

---------------

wealthy

Comments

You must Sign In to comment on this topic.


© 2024 Digcode.com