Prabhath Kota: Regular Expressions Modifers

Sep 13, 2012

Regular Expressions Modifers

/i    => ignore case
/g    => global match
/s => single line mode
/m   => multi line mode
/x => free-spacing mode
/o => One-time pattern compilation

$text = "foo\nfoot\nroot";

/s => singile-line mode trates the whole as a single line including \n as well, it has only one start (^) and end ($)
/m => multi-line mode treats the string $text as 3 lines with each line starting with ^ and $

/s, /m Example:

$text = "foo\nfoot\nroot";

$text =~ /^foo/g;           # matches only the first foo

$text =~ /^foo/gm;          # matches both foo

$text =~ /f.*t/g;           # matches only foot

$text =~ /f.*t/gs;          # matches foo\nfoot\nroot

$text =~ /f.*?t/gs;         # matches foo\nfoot
    here \s is the modifier, so it treats the whole as only one string (it won't bother about \n)
    .* is greedy operator
    .*? restricts the greediness till the first occurance

$text =~ /^foot.*root$/g;   # doesn't match
    its understandable

$text =~ /^foot.*root$/gm; # doesn't match
    here \m is the modifier, so it treats the string as
    foo
    foot
    root
no where it has the foot.*root, so it didn't match

$text =~ /^foot.*root$/gs; # doesn't match
    here \s is the modifier, so it treats the whole as only one string (it won't bother about \n)
    the string is not starting with foot

$text =~ /^foot.*root$/gms; # matches foot\nroot
    Carefully observe here we have both modifiers \m and \s
    foo    (using \m it splitted)
    foot\nroot (using \s it matched the string as required)


/o modifier (One time compilation) - Compiled regular expression
When using a regular expression containing an interpolated Perl variable that you are confident will not change during the execution of the program, a standard speed-optimization technique is to add the /o modifier to the regex pattern.
This compiles the regular expression once, for the entire lifetime of the script, rather than every time the pattern is executed

e.g.,

@list = qw/prabhath 100 lakshmi 200 500/;
my $pattern = '^\d+$'; #Only digit validation
                        #This will compile only once, if you are confident that the regex will not change, you can go for it
foreach my $each (@list) {
    if ($each=~/$pattern/o) {
        print "\n Only Digits Match : " . $each;
    }
}

Output:
Only Digits Match : 100
Only Digits Match : 200
Only Digits Match : 500

/x modifier - Free Spacing Mode

m/\w+:(\s+\w+)\s*\d+/;       # A word, colon, space, word, space, digits.

m/\w+: (\s+ \w+) \s* \d+/x; # A word, colon, space, word, space, digits.

m{
    \w+:                     # Match a word and a colon.

    (                        # (begin group)
         \s+                 # Match one or more spaces.
         \w+                 # Match another word.
    )                        # (end group)
    \s*                      # Match zero or more spaces.
    \d+                      # Match some digits
}x;


qr// - Compiling a pattern

    $string = "people of this town";

    $pattern = '^peo';
    $re = qr/$pattern/;

    if($string =~ /$re/) {
        print "Matched Pattern, string starts with p";
    } else {
        print "String does'nt start with p";
    }

Result:
    Matched Pattern, string starts with p

Prabhath Kota

Sep 13, 2012

Regular Expressions Modifers

No comments:

Post a Comment