2008-05-01

Examples on LoadRunner Regular Expressions

I'm going to show and explain how to use Regular Expressions in LoadRunner.

Introduction:
This article summarizes the LoadRunner Regular Expressions challenge and its results. I have also added code for matching RegExp patterns and subpatterns.
All LoadRunner Regular Expressions functions are shown with examples.


Outline:
  1. How to check whether a RegExp pattern matches a text or not
  2. How to get matched strings (RegExp patterns and subpatterns)

How to check whether a RegExp pattern matches a text or not


I thank Charlie Weiblen and Tim Koopmans for the solution. I modified it slightly.
So, here it is:
  1. Download and unpack the Binaries and Developer files for PCRE (Perl Compatible Regular Expressions).
    These and other files are available on the Pcre for Windows page.

  2. Unzip downloaded archives into c:\pcre
  3. Comment out the include for the stdlib.h file in:
    • C:\pcre\include\pcre.h
    • C:\pcre\include\pcreposix.h
  4. In your LoadRunner script, add to globals.h:
    • #include "c:\\pcre\\include\\pcre.h"
    • #include "c:\\pcre\\include\\pcreposix.h"
  5. Add the match() function to vuser_init section:
    //////////////////////////////////////////////////////////////////////////
    /// 'match' function matches a 'pattern' against a given 'subject'
    /// It returns 1 for a match, or 0 for a non-match / error
    int match(const char *subject, const char *pattern)
    {
        int rc;     // Returned code
        regex_t re; // Compiled regexp pattern

        lr_load_dll("c:\\pcre\\bin\\pcre3.dll");

        if (regcomp(&re, pattern, 0) != 0)
            return 0; // Report error

        rc = regexec(&re, subject, 0, NULL, 0);
        regfree(&re);

        if (rc != 0)
            return 0; // Report error
        else
            return 1;
    }

  6. Let's run a sample LoadRunner script and check the result:
    As you can see, the match() function works correctly. Using match(), you can check whether a RegExp pattern matches a text or not.

    This is helpful when you need to verify in LoadRunner that a text (RegExp pattern) appears on a downloaded page.

    I tested the match() function with different patterns and subject strings:

    #  Subject string          Pattern     Result of match()  Is correct result?
    1  abcdef                  b(c(.*))e   1                  Yes
    2  abcdef                  b(z(.*))e   0                  Yes
    3  2008                    \\d{2,5}    1                  Yes
    4  2008                    \\d{5}      0                  Yes
    5  abc 1st of May 2008xyz  \\d.*\\d    1                  Yes

    Note: Since LoadRunner uses the ANSI C language, do not forget to double the backslashes (\\) in string literals. For example, to match any digit character (0-9), write the pattern as "\\d".

    The match() function is simple enough, but it only searches: it cannot extract matched subpatterns from the text. For example, suppose we have to extract the name of the month from these strings:
    • "abc 1st of May 2008xyz"
    • "abc 25th of February 2031"
    • etc
    We can use the following pattern:
    • \d.+([A-Z]\w+)\s+\d{4}

    The name of the month will be matched by the subpattern ([A-Z]\w+). How do we extract the found text? You can use the matchex() function for that. Let's discuss it in detail...
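
The results in the table above can also be checked outside LoadRunner. The following is a minimal sketch, assuming a system that provides the POSIX <regex.h> header in place of the PCRE wrapper. The name match_posix is hypothetical, and because plain POSIX syntax has no \d, the digit patterns are written as [0-9] here.

```c
#include <regex.h>
#include <stddef.h>

/* Same contract as match(): 1 for a match, 0 for a non-match or error.
   Assumption: system POSIX <regex.h>, not pcreposix.h, so REG_EXTENDED
   is required for grouping with () and for {m,n} repetition. */
int match_posix(const char *subject, const char *pattern)
{
    regex_t re; /* Compiled regexp pattern */
    int rc;     /* Returned code */

    /* REG_NOSUB: only a yes/no answer is needed, no offsets */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;

    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re); /* Always release the compiled pattern */

    return rc == 0 ? 1 : 0;
}
```

Called with the five rows of the table (with \\d replaced by [0-9]), this sketch returns 1, 0, 1, 0 and 1.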

How to get matched strings (RegExp patterns and subpatterns)

To get the matched (found) strings, we have to update our match() function.
That's why I created the matchex() ('match' EXtended) function.
  1. Add the matchex() function to the vuser_init section:
    //////////////////////////////////////////////////////////////////////////
    /// 'matchex' (EXtended) function matches a 'pattern' against a given 'subject'
    /// It returns number of matches:
    /// 0 - for a non-match or error
    /// 1 and more - for successful matches
    int matchex(const char *subject, const char *pattern, int nmatch, regmatch_t *pmatch)
    {
        int rc;     // Returned code
        regex_t re; // Compiled regexp pattern

        lr_load_dll("c:\\pcre\\bin\\pcre3.dll");

        if (regcomp(&re, pattern, 0) != 0)
            return 0; // Report error

        rc = regexec(&re, subject, nmatch, pmatch, 0);
        regfree(&re); // Release memory used for the compiled pattern

        if (rc != 0)
            return 0; // Report non-match or error

        // Get total number of matched patterns and subpatterns
        for (rc = 0; rc < nmatch; rc++)
            if (pmatch[rc].rm_so == -1)
                break;

        return rc;
    }

  2. Let's run a sample LoadRunner script and check the result:
    The matchex() function returns the number of matched patterns/subpatterns and fills an array with information about each matched substring.


    What is the information about each matched substring?


    It contains the offset (rm_so) of the first character of each substring and the offset (rm_eo) of the first character after the end of each substring.

    Note1: The 0th element of the array relates to the entire portion of the string that was matched.
    Note2: Subsequent elements of the array relate to the capturing subpatterns of the regular expression.
    Note3: Unused entries in the array have both structure members set to -1.

    Let's investigate it with an example. The replay log shows the offsets of the matched substrings:
    • Action.c(7): Matched 3 patterns
    • Action.c(10): Start offset: 1, End offset: 6
    • Action.c(10): Start offset: 2, End offset: 5
    • Action.c(10): Start offset: 3, End offset: 5

    Start offset: 1 and End offset: 6 match the substring "bcdef".
    Note4: The end offset points to the first character after the end of the current substring. That's why the character "g" (with index 6) is not part of the matched string.

    As I wrote in Note1, "bcdef" is the entire portion of the string that was matched.
    The other items in the array relate to matched subpatterns.


    What is a subpattern in a Regular Expression?

    It is a part of the RegExp pattern surrounded by parentheses - "(" and ")".

    It's easy to work out the order of subpatterns. Just look through your pattern from left to right. Each opening parenthesis you find starts the next subpattern.
    Subpatterns can be nested.

    So, the other captured subpatterns are:
    • Start offset: 2, End offset: 5 match the substring "cde".
      Note: the current subpattern is "([acqz](.*))".
    • Start offset: 3, End offset: 5 match the substring "de".
      Note: the current subpattern is "(.*)".

    As you can see - this is not so difficult. :)
    Regular Expressions can be very powerful and useful in LoadRunner.
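
The offsets from the replay log can be turned into substrings with a small helper. The following is a minimal sketch, assuming the system POSIX <regex.h> in place of the PCRE wrapper. The names copy_group and demo_offsets are hypothetical, and the subject "abcdefg" and pattern "b([acqz](.*))f" are assumptions chosen only because they reproduce the offsets 1-6, 2-5 and 3-5 shown in the log.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Copies capture group 'index' into 'out' and always NUL-terminates it.
   Returns the number of characters copied, or -1 if the group is unused
   (rm_so == -1) or does not fit into the buffer. */
int copy_group(const char *subject, const regmatch_t *pmatch, int index,
               char *out, size_t out_size)
{
    size_t len;

    if (pmatch[index].rm_so == -1)
        return -1;

    /* rm_eo is the first character AFTER the substring */
    len = (size_t)(pmatch[index].rm_eo - pmatch[index].rm_so);
    if (len + 1 > out_size)
        return -1;

    memcpy(out, subject + pmatch[index].rm_so, len);
    out[len] = '\0';
    return (int)len;
}

/* Hypothetical subject and pattern (assumptions, see the text above).
   Group 1 starts at the first "(", group 2 at the second "(". */
int demo_offsets(regmatch_t pmatch[3])
{
    regex_t re;
    int rc;

    if (regcomp(&re, "b([acqz](.*))f", REG_EXTENDED) != 0)
        return 0;

    rc = regexec(&re, "abcdefg", 3, pmatch, 0);
    regfree(&re);

    return rc == 0;
}
```

After demo_offsets() fills the array, copy_group() yields "bcdef" for index 0, "cde" for index 1 and "de" for index 2.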

Another example:

Let's practise with the example I mentioned earlier.
We have to extract the name of the month from these strings:
  • "abc 1st of May 2008xyz"
  • "abc 25th of February 2031"
  • etc
We can use the following pattern:
  • \d.+([A-Z]\w+)\s+\d{4}
The name of the month will be matched by the subpattern ([A-Z]\w+).

The LoadRunner script for this example captures and prints the names of the months.
Note: Pay attention that I use arr[1] to get the info about the substring.
As you remember, arr[0] contains info about the entire matched pattern, while arr[1], arr[2], and so on contain info about the matched subpatterns.
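
The extraction step can be sketched in standalone C as follows. This is a minimal sketch, assuming the system POSIX <regex.h> in place of the PCRE wrapper: the name extract_month is hypothetical, and the pattern is a POSIX spelling of \d.+([A-Z]\w+)\s+\d{4}, with [A-Za-z] standing in for \w. Note that the result is terminated explicitly, because strncpy() alone does not add a terminating '\0' when it copies only part of a string.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Extracts the month name (capture group 1) from 'subject' into 'out'.
   Returns 1 on success; 0 on non-match, error, or too-small buffer. */
int extract_month(const char *subject, char *out, size_t out_size)
{
    /* Assumption: POSIX equivalent of \d.+([A-Z]\w+)\s+\d{4} */
    const char *pattern = "[0-9].+([A-Z][A-Za-z]+)[[:space:]]+[0-9]{4}";
    regmatch_t arr[2]; /* arr[0]: whole match, arr[1]: month name */
    regex_t re;
    size_t len;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;

    rc = regexec(&re, subject, 2, arr, 0);
    regfree(&re);

    if (rc != 0 || arr[1].rm_so == -1)
        return 0;

    len = (size_t)(arr[1].rm_eo - arr[1].rm_so);
    if (len + 1 > out_size)
        return 0;

    memcpy(out, subject + arr[1].rm_so, len);
    out[len] = '\0'; /* Terminate explicitly */
    return 1;
}
```

For the two subject strings above, this sketch stores "May" and "February" in the output buffer.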


Summary:
I've explained and demonstrated how to use Regular Expressions (RegExp) in LoadRunner.
I hope this knowledge will help you to create advanced LoadRunner scripts.





--
Thank you, my readers!
Dmitry Motevich

25 comments:

  1. Good stuff Dmitry, this is very useful. Icing on the cake would be to have some search & replace functionality as well, but happy with where it's at for the time being...

    This will be particularly useful in matching/correlating binary info for Flex/AMF load testing ...

  2. Your site is very resourceful. Thank you.

  3. hi Dmitry u r providing an good stuff.its is excellent.i have a query that is not related regular expressions.my question is how can i collect the data 4m xml file & send that data to loadrunner parameter.can u pls help me regarding this.

  4. 2Sumathi:
    You can use lr_xml_extract, lr_xml_find, lr_xml_get_values etc functions to work with XML in LoadRunner.

  5. Hello Dmitry, lr_xml_extract, lr_xml_find, lr_xml_get_values etc functions are available only on SOA protocol. I am looking for a way to have these functions working in AJAX or other protocols. Thanks, 'eijuito from gmail.com

  6. any chance you could post this code snippet (currently it is a gif)
    Action5()
    {
        regmatch_t arr[20];
        char month[50];
        char *subject1 = "abc 1st of May 2008xyz";
        char *subject2 = "abc 25th of February 2031";
        char *pattern = "\\d.+([A-Z]\\w+)\\s+\\d{4}";

        matchex (subject1, pattern, 20, arr);
        strncpy (month,subject1 + arr[1].rm_so, arr[1].rm_eo - arr[1].rm_so);
        lr_output_message("Month1 is: %s", month);

        return 0;
    }

  7. I get a
    vuser_init.c(44): Error: C interpreter run time error: vuser_init.c (44): Error -- memory violation : Exception ACCESS_VIOLATION received.

    I noticed that matchex()
    uses:
    pcre_free(&re);
    and
    match() uses:
    regfree(&re);

    I changed matchex to use:
    regfree(&re);

    and that seemed to eliminate the error... I am missing something?

  8. hi,
    can u explain in
    the first example
    rc = regexec(&re, subject,0, NULL, 0);
    if u replace 0 with any value showing an error
    and in second example
    rc = regexec(&re, subject, 20, pmatch, 0);
    its taking 20 as its value
    and not showing an error

  9. 2Celso,
    You can use xml_* functions in web protocols (HTTP/HTML, Click & Script).
    I do not see any problems in using xml_* functions in web protocols.
    What troubles did you face?

  10. 2wasque,
    I've rechecked the source code - it's correct.
    So, it's difficult to say why you got an exception. If I were in your place, I would start debugging to find the reason.

  11. 2m.a,
    The third parameter of the regexec function is the number of items in the output array.
    The fourth parameter is the address of this array.

    So, when you change
    rc = regexec(&re, subject,0, NULL, 0);
    to
    rc = regexec(&re, subject,20, NULL, 0);

    you specify that the output array contains 20 items, but its address is still == NULL.

    That's why the error occurs.

  12. Hi Dmitry,
    Thanks for your efforts on this. I realize that you answered this question, but I'm also receiving an error at the pcre_free(&re) line. Here's the complete error: vuser_init.c(18): Error: C interpreter run time error: vuser_init.c (18): Error -- memory violation : Exception ACCESS_VIOLATION received.
    vuser_init.c(18): Notify: CCI trace: vuser_init.c(18): unknown_fun(0x011f0314 " ÏÓ")
    .
    vuser_init.c(18): Notify: CCI trace: Action.c(10): matchex(0x00cc02b8 "abc 1st of May 2008xyz", 0x00cc0287 "\d.+([A-Z]\w+)\s+\d{4}", 20, 0x011f0020)
    .
    vuser_init.c(18): Notify: CCI trace: Compiled_code(0): Action()
    .

    What is interesting is that LoadRunner is saying that the function is unknown.

    Any ideas on what the problem may be?
    Thank you

  13. Did you change something in my script?

  14. lowkhe,
    Did you change something in my script?

  15. Nothing was changed. I did a copy/paste - thanks again:

    //////////////////////////////////////////////////////////////////////////
    /// 'matchex' (EXtended) function matches a 'pattern' against a given 'subject'
    /// It returns number of matches:
    /// 0 - for a non-match or error
    /// 1 and more - for successful matches
    int matchex(const char *subject, const char *pattern, int nmatch, regmatch_t *pmatch)
    {
    int rc; // Returned code

    regex_t re; // Compiled regexp pattern

    lr_load_dll("c:\\pcre\\bin\\pcre3.dll");

    if (regcomp(&re, pattern, 0) != 0)
    return 0; // Report error

    rc = regexec(&re, subject, nmatch, pmatch, 0);
    pcre_free(&re); // Release memory used for the compiled pattern

    if (rc < 0)
    return 0; // Report error

    // Get total number of matched patterns and subpatterns
    for (rc = 0; rc < nmatch; rc++)
    if (pmatch[rc].rm_so == -1)
    break;

    return rc;
    }
    vuser_init()
    {
    return 0;
    }

  16. Since you didn't send the code from Action(), I think that it is blank.
    Am I right?
    If not, then why didn't you provide all required info?

  17. I thought you needed the matchex function as this had the most complexity. Here's the Action():
    Action()
    {

    regmatch_t arr[20];
    char month[50];
    char *subject1 = "abc 1st of May 2008xyz";
    char *subject2 = "abc 25th of February 2031";
    char *pattern = "\\d.+([A-Z]\\w+)\\s+\\d{4}";

    matchex (subject1, pattern, 20, arr);
    strncpy (month,subject1 + arr[1].rm_so, arr[1].rm_eo - arr[1].rm_so);
    lr_output_message("Month1 is: %s", month);

    return 0;
    }

  18. Lowkhe,
    I've rechecked the code. It works correctly on my computer.
    Could you send your LR script to my email?

  19. userSession value=98117.6678817335fAtzQcHpftVzzzzHDAciVpfzfcf>

    In the above HTML code, I need to capture the value for usersession..

    Now tel me what shud my regular expression shud b...??

  20. Game Freak (September 8),
    For example:
    =(.*)

  21. Hi,
    I am getting the following error in matchex. If I comment out the line

    pcre_free(&re); // Release memory used for the compiled pattern

    then it works fine, but I would think this may cause a memory leak?

    Any ideas.

    Virtual User Script started
    Starting action vuser_init.
    Web Turbo Replay of LoadRunner 8.1.0 for WINXP; WebReplay81 build 5495 [MsgId: MMSG-27143]
    Run-Time Settings file: "C:\projects\qpg\scripts\8.1.1\Copy of COMB_VPTF_811_updated_v3\\default.cfg" [MsgId: MMSG-27141]
    vuser_init.c(129): Error: C interpreter run time error: vuser_init.c (129): Error -- memory violation : Exception ACCESS_VIOLATION received.
    vuser_init.c(129): Notify: CCI trace: vuser_init.c(129): unknown_fun(0x02bb0184 "œ~ò")
    .
    vuser_init.c(129): Notify: CCI trace: vuser_init.c(53): matchex(0x029209f0 "abc 1st of May 2008xyz", 0x029209bf "\d.+([A-Z]\w+)\s+\d{4}", 20, 0x02bb0020)
    .
    vuser_init.c(129): Notify: CCI trace: Compiled_code(0): vuser_init()
    .
    Action was aborted.

  22. to Mohhamed (September 24),
    Do not comment the line :)

  23. Dmitry,

    Can't seem to successfully pass the value from a web_reg_save_param to matchex as it expects a constant character type.

  24. Excellent post! I followed all the steps you indicated but I'm getting the following errors. Any ideas? I did copy all appropriate code to Vuser_init and Action file. Any help will be appreciated.

    In file included from globals.h:9,
    from c:\program files\mercury\loadrunner\scripts\regex1\\combined_RegEx1.c:2:
    C:\QTP\pcre\include\pcre.h:88: stdlib.h: No such file or directory
    In file included from globals.h:9,
    from c:\program files\mercury\loadrunner\scripts\regex1\\combined_RegEx1.c:2:
    C:\\pcre\\include\\pcre.h:88: stdlib.h: No such file or directory
    In file included from globals.h:10,
    from c:\program files\mercury\loadrunner\scripts\regex1\\combined_RegEx1.c:2:
    C:\\pcre\\\include\\pcreposix.h:45: stdlib.h: No such file or directory
