Extracting Data with Regex & Fill In The Value Using VAR

rapidacenikolayccg
Posts: 2
Joined: Thu Jul 04, 2019 12:14 pm

Extracting Data with Regex & Fill In The Value Using VAR

Post by rapidacenikolayccg » Thu Jul 04, 2019 12:27 pm

I have been trying for several hours now and couldn't get this regex to work.

It works on regex101.com. I just want to extract the price from an HTML source page.

Code: Select all

<td class="next-table-cell" style="text-align: left;">
<div class="next-table-cell-wrapper">
<div class="myinput-wrap">
<span class="next-input next-input-single next-input-medium  " style="width: 100%;">
<input type="text" min="0" placeholder="<=15.9" value="" height="100%">
</span>

<td class="next-table-cell" style="text-align: left;">
<div class="next-table-cell-wrapper">
<div class="myinput-wrap">
<span class="next-input next-input-single next-input-medium  " style="width: 100%;">
<input type="text" min="0" placeholder="<=19.93" value="" height="100%">
</span>
First, I want to extract all of these values (multiple matches) into a .txt file.

Then I want to use SET !LOOP or VAR to enter these values into boxes on the same webpage.

I want to extract 19.93 from this line,

Code: Select all

<input type="text" min="0" placeholder="<=19.93" value="" height="100%">
I use

Code: Select all

\=(\d+\.\d{1,2})
I tried these and they don't work:

Code: Select all

SEARCH SOURCE=REGEXP:"target:\=(\d+\.\d{1,2})"    EXTRACT=$1

Code: Select all

SEARCH SOURCE=REGEXP:"\=(\d+\.\d{1,2})"    EXTRACT=$1
None of these show any result either, not even [Undefined]:

Code: Select all

SEARCH SOURCE=REGEXP:"1"    EXTRACT=$1

Code: Select all

SEARCH SOURCE=REGEXP:"a"    EXTRACT=$1

Code: Select all

SEARCH SOURCE=REGEXP:"*"    EXTRACT=$1


Can anyone shed some light on this? Is regexp no longer supported?

Anyway, I downgraded to 9.0.3.

I think I will also try another way: save the page source and use CMD to filter out the result.
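
A minimal sketch of that fallback, assuming the page source has been saved locally and that the real placeholder attribute holds the bare number (for example placeholder="<=19.93"). It uses Python rather than CMD tools, and the names extract_prices and save_prices are hypothetical:

```python
import re
from pathlib import Path

# The pattern from the post: "=" followed by a number with 1-2 decimal places.
PRICE_PATTERN = re.compile(r"\=(\d+\.\d{1,2})")


def extract_prices(html):
    """Return every captured price as a string, in document order."""
    return PRICE_PATTERN.findall(html)


def save_prices(prices, path):
    """Write one price per line so a later step can read them back by line."""
    Path(path).write_text("\n".join(prices) + "\n", encoding="utf-8")


# Sample markup, assumed to mirror the two input elements on the saved page.
SAMPLE_HTML = """
<input type="text" min="0" placeholder="<=15.9" value="" height="100%">
<input type="text" min="0" placeholder="<=19.93" value="" height="100%">
"""

prices = extract_prices(SAMPLE_HTML)
```

Calling save_prices(prices, "prices.txt") would then give a file with one value per line. The pattern only requires an "=" directly in front of the number, so on a full page it may also pick up unrelated attributes that happen to look like that.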

Thank you.
chivracq
Posts: 8698
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting Data with Regex & Fill In The Value Using VAR

Post by chivracq » Thu Jul 04, 2019 1:37 pm

CIM for me to read and do any "Thinking", read my Sig... :mrgreen:
Even if hum, I think I already have "a" Solution from a diagonal look through your Post..., and maybe 2... (but I never use 'REGEX'...)

Oh...!, FCIM actually, I see some "v9.0.3" lost somewhere in your Post... => Post your FCI at the very top of your OP...
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...
chivracq
Posts: 8698
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting Data with Regex & Fill In The Value Using VAR

Post by chivracq » Fri Jul 05, 2019 12:38 am

And..., any "News" about your FCI...? It shouldn't be that "complicated" to mention 3 Versions about your Environment, I would think... :o

And hum, if you are already using v9.0.3 for FF, or just "downgraded" to that Version like you mention, you would actually be better off "downgrading" again to v8.9.7 for FF, which will also work on your FF Version and is much more stable and less buggy than v9.0.3... :idea:

And OK, I had a look at your Post, and yep, 'SEARCH' + 'REGEX' is always pretty cumbersome and complex, in my Opinion... The "way to go", or at least much easier to implement, is "normally" to use 'EXTRACT=HTM' on your Element (or any Containing Element that can easily be identified), and then using 'EVAL()' + 2x 'split()' to isolate the Data you want to keep... I've produced many Examples if you search the Forum a bit... :idea:
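
The 2x 'split()' idea can be sketched roughly as follows. In iMacros the string logic would run as JavaScript inside 'EVAL()' on the extracted HTML; the Python below only mirrors that logic. The function name and the two marker strings are assumptions based on the snippet in the first post, with the placeholder assumed to hold the bare number:

```python
def price_from_input_html(element_html):
    """Isolate the text between 'placeholder="<=' and the next double quote."""
    # 1st split: keep what follows the marker that precedes the price.
    after_marker = element_html.split('placeholder="<=')[1]
    # 2nd split: keep what comes before the closing quote of the attribute.
    return after_marker.split('"')[0]


# Assumed shape of what an HTM extraction of the INPUT element would return.
extracted = '<input type="text" min="0" placeholder="<=19.93" value="" height="100%">'
price = price_from_input_html(extracted)
```

If the marker is missing from the extracted HTML, the first split returns a single piece and the [1] lookup fails, so a real macro would want a check for that case.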

And maybe even simpler, if an 'EXTRACT=TXT' is already able to extract your Data directly from this 'INPUT' Element, and the "Difficulty" resides in locating/identifying that 'INPUT' Element a bit uniquely and reliably, to then use (Double) 'Relative Positioning'. That will even be easier than the first Method... :idea:
rapidacenikolayccg
Posts: 2
Joined: Thu Jul 04, 2019 12:14 pm

Re: Extracting Data with Regex & Fill In The Value Using VAR

Post by rapidacenikolayccg » Fri Jul 05, 2019 8:10 am

My bad, I'm using Windows 7 32 Bit, FF v52, iMacro 9.0.3

I downgraded to this version, since the new iMacro limits a lot of features.

Oh, I see. At first I thought of using Regex because I got to know it superficially some years ago, so I gave it a try and it didn't work.

Thank you for your advice. I will take a look at the 2 commands you mentioned that could also extract the data I need aside from Regex.
chivracq
Posts: 8698
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting Data with Regex & Fill In The Value Using VAR

Post by chivracq » Fri Jul 05, 2019 1:54 pm

Oh, good, we have your FCI...! :D (Correct Name/Spelling is "iMacros" btw...)
And again, you can better use v8.9.7 instead of v9.0.3 for FF... The 'EXTRACT' Mechanism for example was buggy in v9.0.3 when extracting in a Table, don't be "surprised" if you get some Double Output...

Yeah well, 'REGEX' is quite powerful and can certainly "do the job" but I never use(d) it myself because I find it too "complex", and I never bothered to dig into it, ah-ah...! But if you are already "fluent" and confident with 'REGEX', then tja, why not, but I prefer my Method(s)... The only Case where 'REGEX' might be more "efficient" is for Negative/Exclusive Search... And you can still use 'REGEX' in the 'EVAL()' if you want, instead of "my" 2x 'split()', for the 1st Method with 'EXTRACT=HTM'...

If the 'EXTRACT=TXT' is able to extract your Data, then the 2nd Method with x 'Relative Positioning' will be the easiest, but I can't tell from the mini-Source (truncated and altered) you've provided on just 2 Cells in a Table. I would need a "larger Picture" of the HTML Structure of that Page, and best is if you provide the URL, if you can't work it out by yourself...