How to grab variable in source code to extract? (<script type="text/javascript"> var abc = <payload>)

keepcrawling
Posts: 2
Joined: Thu Aug 13, 2020 6:38 pm

How to grab variable in source code to extract? (<script type="text/javascript"> var abc = <payload>)

Post by keepcrawling » Thu Aug 13, 2020 7:29 pm

Hi there,

first off, let me include some information:
VERSION BUILD=12.6.505.4525, Trial Version on Windows 10 64-bit, English with Browser IE11. Demo scripts and another script that I built today work well.

I am trying to build a macro that will run through a series of weekends for a car rental company and will store the advertised price in a csv file. I will then later be able to look at the data, mark which weekends I find interesting and - that will be a future project - use a second macro to book those weekends.

The company's website is nice enough to include all the data I really want in one "object" up top. "Object", because here is where it gets tricky for me: I don't know what to call that, let alone how to grab it. It is not something that drives the display on the website, and I can't seem to find the right "find" option for the TAG command. However, I wonder whether it would not be much more elegant to go via XPath, if that is possible here.

How would I go about
1) for beginners, and to understand how this works, grabbing everything inside the variable sxux?
2) to really get what I need, grabbing the part marked in green, i.e. the entire string in quotation marks right after "prc_wc" (or, put differently, everything between prc_wc and prc_pp)?
Scrshot 2020-08-13_21h10_16.png
Below is the code of the snippet I am trying to get.

Code: Select all

<script type="text/javascript">
var sxux = {"user_id":"","user_email":"","agia":"","uci":"xxx","uda":"16.06.2021","uti":"12:00","rci":"40272","rda":"19.06.2021","rti":"09:00","uliso":"DE","lor":"3","days_til_begin":"306","ctyp":"P","grp":"",
"class_name":"","offers_extrema":"MCMN|29.68|XXAX|209.52|P|53|DEUF3000|P|49|DER1E000","loginstate":"Public","sx_res_tpl":"offerselect","wakz":"EUR","prpd":"","sim_external":"0",
"layout":"standard","view":"","prl":"","rType":"P","rValue":"","insu":"E","fir":"60","posl":"DE","offerposl":"FR","total":"","total_gross":"","prc_wc":"S1:A|S2:A|S3:A|M1:A|M2:A|M3:B|L1:A|L2:B|L3:A",
"u_d":"desktop","prc_eq":"MCMN:42.99:128.98|ECMR:43.66:130.98|CCMR:48.66:145.98|CDMR:49.66:148.98|CLMR:52.66:157.97|CWMR:53.66:160.97|IDMR:55.66:166.97|CLAR:56.32:168.97|
CPMR:56.66:169.97|CWAR:57.32:171.97|ILMR:57.99:173.97|IVMR:57.99:173.97|IWMR:58.32:174.97|IDAR:59.65:178.96|IFMR:59.65:178.96|SDMR:60.65:181.96|ILAR:61.99:185.96|
CPAR:62.32:186.96|IWAR:62.65:187.96|SWMR:62.32:186.96|SSMR:63.65:190.96|SFMR:64.65:193.96|FDMR:66.65:199.96|FWMR:68.32:204.95|FDAR:71.33:213.99|FWAR:73.33:219.98|
PDAR:78.99:236.98|PSAR:81.66:244.97|PWAR:82.32:246.97|LDAR:83.32:249.97|PFMR:83.99:251.97|LWAR:83.99:251.97|LSAR:88.66:265.97|SVMR:88.99:266.96|LFAR:92.32:276.96|S
VAR:98:293.99|CCAN::|SSAX::|FCAR::|FWAX::|XDAR:131.33:393.99|XFAR:132.99:398.98|LCAR::|XSAR:135.33:405.98|LPAN::|LWAX::|LFAE::|LDAN::|LWMR::|LFAJ::|XCAN::|XJAN::|XVAN::
|PXBR::|LFAN::|XLAN::|XWAR::|XFAN::|XCAR::|XJAR::|XXAX::","prc_pp":"","prc_poa":"","resn":"0","sproducts":"Car;MCMN,Car;ECMR,Car;CCMR,Car;CDMR,Car;CLMR,Car;CWMR,Car;IDMR,Car;
CLAR,Car;CPMR,Car;CWAR,Car;ILMR,Car;IVMR,Car;IWMR,Car;IDAR,Car;IFMR,Car;SDMR,Car;ILAR,Car;CPAR,Car;IWAR,Car;SWMR,Car;SSMR,Car;SFMR,Car;FDMR,Car;FWMR,Car;FDAR,Car;FWAR,Car;
PDAR,Car;PSAR,Car;PWAR,Car;LDAR,Car;PFMR,Car;LWAR,Car;LSAR,Car;SVMR,Car;LFAR,Car;SVAR,Car;CCAN,Car;SSAX,Car;FCAR,Car;FWAX,Car;XDAR,Car;XFAR,Car;LCAR,Car;XSAR,Car;LPAN,Car;
LWAX,Car;LFAE,Car;LDAN,Car;LWMR,Car;LFAJ,Car;XCAN,Car;XJAN,Car;XVAN,Car;PXBR,Car;LFAN,Car;XLAN,Car;XWAR,Car;XFAN,Car;XCAR,Car;XJAR,Car;XXAX","product_name":"","pn":
"Reservation-Pkw-Offerselect","lg":"fr","ibe":"PKW|Default","vat":"VAT_Y|19","osl":"79|61|18","wor":"Wednesday","delcol":"DEL_notset|COL_notset","pm":"","cpc":"","fdr":"N","ic":"insu:0",
"bepc":"","br":"",
"grp_p":"","ex_p":"","ec_f":"","ec":""}
</script>
Here are two options I have tried, but to be honest, just piecing together things from random posts I find:

Code: Select all

TAG POS=1 TYPE=INPUT ATTR=CLASS:sxux EXTRACT=TXT
TAG POS=1 TYPE=VAR ATTR=TXT:"sxux" EXTRACT=TXT 
Thanks a bunch!
chivracq
Posts: 9425
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: How to grab variable in source code to extract? (<script type="text/javascript"> var abc = <payload>)

Post by chivracq » Thu Aug 13, 2020 8:13 pm

Hum..., Compliment for the Good Quality of your first Post on the Forum... :D

OK, yep-yep, you can extract the whole Content of that <Script> Object, iMacros treats it like any other "Standard" HTML Object, for example with something like:

Code: Select all

TAG POS=1 TYPE=SCRIPT ATTR=TXT:*sxux* EXTRACT=TXT
Then using 'EVAL()' + 'search()' or 'match()' + 'REGEXP', you can further isolate each inner Var and its Value, but I prefer 'split()', with for example:

Code: Select all

VERSION BUILD=8820413 RECORDER=FX
SET !EXTRACT_TEST_POPUP NO
TAB T=1

TAG POS=1 TYPE=SCRIPT ATTR=TXT:*sxux* EXTRACT=TXT

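' For a Test without the live Page, uncomment the next Line to load the Text from the Clipboard:
'SET !EXTRACT {{!CLIPBOARD}}

' Name of the sxux Key whose Value should be returned:
SET sxux_Var "uda"
' Split at the Key Name, take the Text after it, split that at each Double Quote; Piece [2] is the Value:
SET Var_Content EVAL("var s='{{!EXTRACT}}', v='{{sxux_Var}}', x,y,z; x=s.split(v); y=x[1]; z=y.split('\"'); z[2];")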

PROMPT {{!EXTRACT}}<BR><BR>sxux_Var:<SP>_{{sxux_Var}}_<BR>Var_Content:<SP>_{{Var_Content}}_
... And yep indeed, it works directly, for example for the "uda" Var, => which returns "16.06.2021" (without the Double Quotes)...

(Tested on iMacros for FF v8.8.2, PM v26.3.3, Win10_x64.)
+ Using my Clipboard for my Test as I obviously cannot do the 'EXTRACT'...
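
For anyone who wants to try the same 'split()' chain outside iMacros, here is a minimal sketch in plain JavaScript. The function name `extractValue` and the shortened `sample` text are made up for illustration; only the two 'split()' steps are taken from the macro above, and the sample assumes the extracted text has the same layout as the snippet in the first post.

```javascript
// Hypothetical helper mirroring the EVAL() above; the name is not part of iMacros.
function extractValue(text, key) {
  // Everything after the first occurrence of the key name.
  const afterKey = text.split(key)[1];
  // Cutting at every double quote gives: ['', ':', '<value>', ',', ...]
  const pieces = afterKey.split('"');
  return pieces[2];
}

// Shortened stand-in for the extracted <script> text (assumption, not the full page data).
const sample = 'var sxux = {"uci":"xxx","uda":"16.06.2021","uti":"12:00","prc_wc":"S1:A|S2:A","prc_pp":""}';

console.log(extractValue(sample, 'uda'));    // 16.06.2021
console.log(extractValue(sample, 'prc_wc')); // S1:A|S2:A
```

The index 2 works because the text right after the key name starts with `":"`, so the first two pieces are an empty string and a colon.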

EDIT:
Oh...!, and with your "prc_wc" Var instead of the "uda" Var I had first chosen, the 'PROMPT' then gives for the 2 Vars:

Code: Select all

sxux_Var: _prc_wc_
Var_Content: _S1:A|S2:A|S3:A|M1:A|M2:A|M3:B|L1:A|L2:B|L3:A_
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...

Post by chivracq » Thu Aug 13, 2020 8:44 pm

Hum, and if you like 'REGEXP'... :shock: , you have a Solution with 'match()' in this Thread... (from their parallel Thread on SOF, the Thread on our Forum is a bit "Low Quality" as the User preferred to "argue" with me about nearly everything... :| ):
- Re: How to extract nested informations using XPATH?

... But I find my Solution with 'split()' much-much simpler...! :P
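
For comparison, a 'match()' version could look like the sketch below. This is not the solution from the linked thread (which is not reproduced here), just one possible pattern in plain JavaScript. The names `valueByRegex` and `sample` are illustrative, and the pattern assumes that values never contain an escaped double quote.

```javascript
// Illustrative only: pull the value of one key with a regular expression.
function valueByRegex(text, key) {
  // Escape characters in the key that have a special meaning in a RegExp.
  const safeKey = key.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Pattern: "key":"<anything except a double quote>"
  const m = text.match(new RegExp('"' + safeKey + '":"([^"]*)"'));
  return m ? m[1] : null;
}

// Shortened stand-in for the extracted text (assumption).
const sample = '{"uda":"16.06.2021","prc_wc":"S1:A|S2:A|S3:A","prc_pp":""}';

console.log(valueByRegex(sample, 'prc_wc')); // S1:A|S2:A|S3:A
console.log(valueByRegex(sample, 'nope'));   // null
```

Inside an iMacros 'EVAL()' the double quotes and backslashes in such a pattern would probably need extra escaping, which is part of why the 'split()' route is less fiddly.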

Post by chivracq » Mon Aug 17, 2020 11:56 am

And...!?, still no Feedback/Follow-up, 4 Days later...? :o

Rather "strange" that nearly each time I give a Compliment on Quality and write a Solution/Script for a (New) User, (which I very rarely do), then the User simply doesn't follow up anymore... I don't get it...!? :roll: :( :?

Post by keepcrawling » Mon Aug 17, 2020 4:01 pm

Sorry, yes, I feel very bad for not getting back earlier. Had a cramped few days, then didn't have the password to the forum on my other computer, and just didn't get back to it. My bad, I know how frustrating that is.

Thank you very much for the impressively quick answer last week. It was spot on (of course) and does the job very nicely.

I am still working my way through understanding how the split actually works (x,y, z), but that is standard JS syntax I assume and should be sortable through Google.

Again, sorry for not giving feedback earlier!

Post by chivracq » Mon Aug 17, 2020 4:57 pm

Oh, good, you are still "alive", ah-ah...! :wink:
Then OK, good-good then, and Thanks for your Feedback... :D
(And forget my mini-"Rant", I'm also a bit "busy" and "stressed" at the moment, moving between 2 places, and I often have very limited Access to Internet, but I still try to help on the Forum, especially when I see a "Quality" Thread, ah-ah...! :P )

>>>

Yeah well, I find the JS 'split()' Command very-very powerful and I use it and probably misuse it a lot, as it can do what 'search()' + 'SEARCH' + 'match()' + (Global) 'replace()' + 'indexOf()' + 'substr()' + 'substring()' + 'count()' + ... can do, all in just one Command, and without any 'REGEXP', and (usually) without having to take care of escaping Double Quotes and Special Chars...
(I did escape "directly" the 'Double Quote' that I used for "z=y.split('\"');" and it was working, the Escape is possibly not even needed, but I didn't test/try...)
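
Two of those stand-ins, sketched in plain JavaScript outside iMacros: 'split()' plus 'join()' as a Global 'replace()' without any 'REGEXP', and 'split()' as a substitute for 'indexOf()'. The variable names are illustrative and the string is a shortened piece of the "prc_wc" value.

```javascript
// Shortened piece of the "prc_wc" value from the thread.
const prcWc = 'S1:A|S2:A|S3:A';

// Global replace without a RegExp: cut at every '|' and glue back with ', '.
const replaced = prcWc.split('|').join(', ');
console.log(replaced); // S1:A, S2:A, S3:A

// Position of the first match = length of the piece in front of it
// (only meaningful when the substring actually occurs in the text).
const position = prcWc.split('S2')[0].length;
console.log(position === prcWc.indexOf('S2')); // true
```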

Then well, with "x,y,z", I use a very simple-basic-beginners JS Syntax, that JS Gurus probably couldn't stop laughing at :oops: , but it works very well with iMacros in 'EVAL()' as it allows to "slowly" build the 'EVAL()' towards the Result that you want, and every (intermediary) Step can easily be followed and debugged (using the 'PROMPT' Command).

Code: Select all

SET Var_Content EVAL("var s='{{!EXTRACT}}', v='{{sxux_Var}}', x,y,z; x=s.split(v); y=x[1]; z=y.split('\"'); z[2];")
=> I actually should have used 4 Vars:

Code: Select all

SET Var_Content EVAL("var s='{{!EXTRACT}}', v='{{sxux_Var}}', w,x,y,z; w=s.split(v); x=w[1]; y=x.split('\"'); z=y[2]; z;")
... And by simply changing the Final (Return) "z" into "w"/"x"/"y", you can check in the 'PROMPT' what every Step is "doing" one by one until you get the Final Result that you expect... Same with "finding" the correct "1" + "2" Numbers to use for "w[n]" and "y[n]"... :idea:
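
To make the four Steps visible, here is the same chain in plain JavaScript with the intermediate values written next to each line. The shortened text in `s` is an assumption standing in for the real extracted text; the variable names follow the 4-Var version above.

```javascript
// Shortened stand-in for {{!EXTRACT}} (assumption: same layout as the posted snippet).
const s = 'var sxux = {"uci":"xxx","uda":"16.06.2021","uti":"12:00"}';
const v = 'uda';

// w = ['var sxux = {"uci":"xxx","', '":"16.06.2021","uti":"12:00"}']
const w = s.split(v);
// x = '":"16.06.2021","uti":"12:00"}'  (the part after the key name)
const x = w[1];
// y = ['', ':', '16.06.2021', ',', 'uti', ':', '12:00', '}']
const y = x.split('"');
// z = '16.06.2021'  (the third piece is the value)
const z = y[2];

console.log(z); // 16.06.2021
```

Reading the arrays this way also shows where the "1" and "2" come from: index 1 is the text after the key name, index 2 is the first piece that sits between a pair of double quotes.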

This one would also work:

Code: Select all

SET Var_Content EVAL("var s='{{!EXTRACT}}', v='{{sxux_Var}}', x,y,z; x=s.split(v); y=x[1].split('\"'); z=y[2]; z;")
And once you've found that your 'EVAL()' works, then it's possible to compact/simplify it into for example:

Code: Select all

SET Var_Content EVAL("var s='{{!EXTRACT}}', v='{{sxux_Var}}'; var z=s.split(v)[1].split('\"')[2]; z;")
(All "new" Statements not tested..., Typo(s) always possible...)
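
One caveat worth knowing, sketched in plain JavaScript: splitting on the bare key name can hit an earlier occurrence of the same letters. In the snippet from the first post, "ic" also occurs inside "Public", so a lookup of the "ic" key would land in the wrong place. Including the surrounding `"` and `:` in the delimiter avoids that. The function names and the shortened `sample` are illustrative, and the delimiter assumes the `"key":"value"` layout with no spaces, as in the post.

```javascript
// Bare key name as delimiter, as in the EVAL() statements above.
function naiveValue(text, key) {
  return text.split(key)[1].split('"')[2];
}

// Delimiter "key":" including the quotes, so only the real key matches.
function quotedValue(text, key) {
  const parts = text.split('"' + key + '":"');
  if (parts.length < 2) return null; // key not present
  return parts[1].split('"')[0];     // text up to the closing double quote
}

// Shortened stand-in built from keys that appear in the posted snippet.
const sample = '{"loginstate":"Public","sx_res_tpl":"offerselect","fdr":"N","ic":"insu:0","bepc":""}';

console.log(naiveValue(sample, 'ic'));  // sx_res_tpl  (wrong: matched the "ic" in "Public")
console.log(quotedValue(sample, 'ic')); // insu:0
```

For keys like "uda" or "prc_wc" the bare name is enough, since those letters do not occur earlier in the posted text.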

>>>

PS: 'count()' doesn't exist, but 'split()' also provides that Func...! 8)
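
A short sketch of that counting trick in plain JavaScript: n occurrences of a substring cut the text into n + 1 pieces, so the count is the number of pieces minus one. The function name `countOccurrences` is made up for illustration, and the search string must not be empty.

```javascript
// Count occurrences of a (non-empty) substring via split().
function countOccurrences(text, needle) {
  return text.split(needle).length - 1;
}

// The "prc_wc" value from the thread.
const prcWc = 'S1:A|S2:A|S3:A|M1:A|M2:A|M3:B|L1:A|L2:B|L3:A';

console.log(countOccurrences(prcWc, ':A')); // 7
console.log(countOccurrences(prcWc, '|'));  // 8
```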