Extracting data of JSON

kevnad
Posts: 25
Joined: Wed Mar 02, 2011 3:22 pm

Extracting data of JSON

Post by kevnad » Tue Sep 18, 2018 2:41 pm

Hey all,

I'm trying to extract part of a JSON web page to find a specific part in it.

What I want to use is this command:

SET N EVAL("var json2 = JSON.parse('{{!EXTRACT}}');")

But I received this error:
JScript statement in EVAL contains the following error: Variable 'JSON' has not been declared. Line 6: SET N EVAL("var json2 = JSON.parse('{{!EXTRACT}}');")

I am using iMacros Browser 12 on Windows Server 2012 R2.

Most of the examples I have seen of the JSON.parse command seem to use iMacros for FF, so I'm wondering if it's only available there.

Thanks

For now, my script is very simple.

URL GOTO=D:\iMacroApp\Scenarios\extract.json
TAG POS=1 TYPE=BODY ATTR=* EXTRACT=TXT
SET N EVAL("var json2 = JSON.parse('{{!EXTRACT}}');")
PROMPT {{N}}
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Tue Sep 18, 2018 3:16 pm

Yep, normal... 'JSON.parse()' is a Method of the 'JSON' Object, and that Object is not "available" to 'EVAL()' here: as your Error Message shows, 'EVAL()' runs your Statement as JScript, and that Engine apparently doesn't declare 'JSON'...
I think that only with an (iMacros) '.js' Script in iMacros for FF (up to v8.9.7 or v9.0.3 for FF; '.js' Scripts are not supported anymore in v10.0.2 for FF) can you "include" "external" JS Scripts/Libraries (a 'json' Library for example, or your own JS Library) in your '.js' Script to make their Methods available from the '.js' Script directly (but still not from 'EVAL()')...

In your Case, you "extract" the whole Content of this 'extract.json' File which is stored as a String in the '!EXTRACT' Var, and you then need to "reconstruct" yourself the Functionality of this 'parse()' Command/Method in "pure" JS for 'EVAL()' to understand it...

If you can't work it out by yourself, post the Content of this 'extract.json' File in your Thread (using the ']CODE[' Meta-Tags for Formatting) or upload it to your Thread (zipped - Max 256Kb), and mention exactly what Result you expect from the 'EVAL()' and I can have a look... 8)

Hum, and some possible "Workaround" could be to use some Online 'json' Parser which I guess could "do the job" for you if you don't want to "recode" that 'parse()' Method... (Will be much slower though...) :idea:
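
Coming back to the idea of rebuilding the 'parse()' functionality in plain JS: the following is only a minimal sketch, assuming the value you need is a simple string property, written as "key":"value" with no spaces and no escaped quotes. 'hasNativeJson' and 'readStringValue' are hypothetical helper names (not iMacros functions), and the sample text and its example.com URL are made up. The code sticks to old-style JavaScript ('var', 'indexOf', 'substring') on the assumption that the engine behind 'EVAL()' is an older one.

```javascript
// Hypothetical helpers for illustration; not part of iMacros.

// True when the engine provides a usable JSON.parse.
function hasNativeJson() {
  return typeof JSON === 'object' && JSON !== null && typeof JSON.parse === 'function';
}

// Return the string value that follows "key": in raw JSON text, or null.
// Assumption: the value is a plain string without escaped quotes.
function readStringValue(text, key) {
  var marker = '"' + key + '":';
  var start = text.indexOf(marker);
  if (start === -1) { return null; }
  var open = text.indexOf('"', start + marker.length);
  if (open === -1) { return null; }
  var close = text.indexOf('"', open + 1);
  if (close === -1) { return null; }
  return text.substring(open + 1, close);
}

// Made-up sample data, shaped like the "links" part of a JSON file.
var sample = '{"links":[{"rel":"nav","href":"https://example.com/detail"}]}';
var href = readStringValue(sample, 'href');
```

Inside a macro, the body of such a function would have to be squeezed into one 'EVAL()' string, so in practice a shorter expression is usually easier to maintain.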
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
kevnad
Posts: 25
Joined: Wed Mar 02, 2011 3:22 pm

Re: Extracting data of JSON

Post by kevnad » Tue Sep 18, 2018 3:54 pm

thanks for the reply chivracq

What I need to do is to get a tag in the JSON file to go to another URL.

Like I need to find this:

"descriptions":[
   "xxxxxx-EOP",
   "",
   "C.D. DU VIEUX-LONGUEUIL"
],

and then the first HREF after and get the link there:

"links":[
   {
      "rel":"nav",
      "href":"https://xxxxxx.xxxx.com/sommaire-perso/ ... ae4/detail"
   }
]

Any idea?

I have included the file in the post.

Thanks
Attachments
extract.zip
(1.86 KiB) Downloaded 21 times
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Tue Sep 18, 2018 4:44 pm

kevnad wrote:thanks for the reply chivracq

What I need to do is to get a tag in the JSON file to go to another URL.

Like I need to find this:

Code:

               "descriptions":[  
                  "xxxxxx-EOP",
                  "",
                  "C.D. DU VIEUX-LONGUEUIL"
               ],
and then the first HREF after and get the link there:

Code:

               "links":[  
                  {  
                     "rel":"nav",
                     "href":"https://xxxxxx.xxxx.com/sommaire-perso/sommaire-aiguilleur/EOP/COP_PFIC_ff6ab11307eec070d93f2906b088ab8158e35fed3c12d96e45f97c2965546ae4/detail"
                  }
               ]
Any idea?

I have included the file in the post.

Thanks
OK, if I understood correctly, my first Try works directly...!:

Code:

VERSION BUILD=8820413 RECORDER=FX
SET !EXTRACT_TEST_POPUP NO
TAB T=1
'URL GOTO=file:///D:/TEMP/iMacros/Temp/_Forum%20Cases/kevnad/extract.json

SET Descr_1 "xxxxxx-EOP"
SET Descr_2 "C.D. DU VIEUX-LONGUEUIL"

TAG POS=1 TYPE=PRE ATTR=TXT:{<SP>"messages":[<SP>],<SP>"detention":{<SP>"messages":[* EXTRACT=TXT

SET URL_nav EVAL("var s='{{!EXTRACT}}', d1='{{Descr_1}}', d2='{{Descr_2}}', x,y,z; x=s.split(d1)[1].split(d2); y=x[1].split('http'); z='http'+y[1].split('\"')[0]; z;")
PROMPT URL_nav:<BR>_{{URL_nav}}_
... => Which gives this Result in the 'PROMPT' (the Underscores are meant as "Delimiters" to make sure there are no extra Spaces or Soft Returns...):

Code:

URL_nav:
_https://xxxxxx.xxxx.com/sommaire-perso/sommaire-aiguilleur/EOP/COP_PFIC_ff6ab11307eec070d93f2906b088ab8158e35fed3c12d96e45f97c2965546ae4/detail_
(Tested on iMacros for FF v8.8.2, Pale Moon v26.3.3 (=FF47), Win10_x64.)

I used some kind of "Relative" Splitting with the 2 'Descr_1' and 'Descr_2' Vars (specified at the beginning of the Script for "easy Access"), as the Content of 'Descr_2' (=> "C.D. DU VIEUX-LONGUEUIL") seems not to be unique, and you don't want the first Occurrence... :wink:
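
To see why the "relative" splitting matters, here is a small sketch that compares splitting on 'Descr_2' alone with splitting on 'Descr_1' first. The sample text, its "items" key, the "aaaaaa-XYZ" entry and the example.com URLs are all made up for illustration, and 'firstUrlIn' is a hypothetical helper name.

```javascript
// Made-up sample: two entries share the same second description.
var sample =
  '{"items":[' +
  '{"descriptions":["aaaaaa-XYZ","","C.D. DU VIEUX-LONGUEUIL"],' +
  '"links":[{"rel":"nav","href":"https://example.com/first/detail"}]},' +
  '{"descriptions":["xxxxxx-EOP","","C.D. DU VIEUX-LONGUEUIL"],' +
  '"links":[{"rel":"nav","href":"https://example.com/second/detail"}]}' +
  ']}';

var d1 = 'xxxxxx-EOP';
var d2 = 'C.D. DU VIEUX-LONGUEUIL';

// Hypothetical helper: take the text after the first "http" and cut it
// at the next double quote, as the last two splits of the macro do.
function firstUrlIn(text) {
  return 'http' + text.split('http')[1].split('"')[0];
}

// Splitting on d2 alone lands in the first entry.
var naive = firstUrlIn(sample.split(d2)[1]);

// Splitting on d1 first, then d2, lands in the wanted entry.
var relative = firstUrlIn(sample.split(d1)[1].split(d2)[1]);
```

With this sample, 'naive' ends up as the first entry's link and 'relative' as the second entry's link, which is the one that follows "xxxxxx-EOP".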
kevnad
Posts: 25
Joined: Wed Mar 02, 2011 3:22 pm

Re: Extracting data of JSON

Post by kevnad » Tue Sep 18, 2018 5:30 pm

Wow thanks a lot !

That looks perfect!

Still need to understand what you have done :)

But it will keep me going!!
kevnad
Posts: 25
Joined: Wed Mar 02, 2011 3:22 pm

Re: Extracting data of JSON

Post by kevnad » Tue Sep 18, 2018 5:53 pm

Any idea why I get this error when going to the URL?

Error -1350: Error loading page. The protocol is not known and no pluggable protocols have been entered that match. Line 40: URL GOTO={{URL_nav}}


I use this command once I have the URL:

URL GOTO={{URL_nav}}

I don't know if it is site-related or something else.

thanks
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Tue Sep 18, 2018 6:00 pm

kevnad wrote:Wow thanks a lot !

That looks perfect!

Still need to understand what you have done :)

But it will keep me going!!
Ah-ah...! It's not very "complicated" though... :wink:

I can "decompose" the 'EVAL()' Statement into more Sub-Statements, then it's maybe easier to follow...:

Code:

SET URL_nav EVAL("var s='{{!EXTRACT}}', d1='{{Descr_1}}', d2='{{Descr_2}}', x,y,z; x=s.split(d1)[1].split(d2); y=x[1].split('http'); z='http'+y[1].split('\"')[0]; z;")
=> ... becomes:

Code:

SET URL_nav EVAL("var s='{{!EXTRACT}}', d1='{{Descr_1}}', d2='{{Descr_2}}', a,b,c,d,e,z; a=s.split(d1); b=a[1].split(d2); c=b[1].split('http'); d=c[1].split('\"'); e=d[0]; z='http'+e; z;")
=> And you simply "follow" a=>b=>c=>d=>e and =>z that I always use for the "final" Result..., where I each time reuse the previous Expression...
And if something goes wrong, you can easily debug the whole 'EVAL()' by simply changing the last 'z' to any intermediary Var, to check one by one that each of them already returns what you expect...

And on the other hand, it's also possible to "compact" the whole 'EVAL()' into:

Code:

SET URL_nav EVAL("var s='{{!EXTRACT}}', d1='{{Descr_1}}', d2='{{Descr_2}}'; var z='http'+s.split(d1)[1].split(d2)[1].split('http')[1].split('\"')[0]; z;")
All 3 'EVAL()' Statements will probably work just as well on your "original" 'EXTRACT' on the 'BODY' Element btw... (All 3 'HTML' + 'BODY' + 'PRE' Elements usually have the same Content in '.TXT' Files... (Your (local) '.json' File is seen as a Text File by the Browser (and iMacros)...))
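
The debugging advice above (return an intermediate Var instead of 'z') can also be sketched as a function that walks the same a=>b=>c=>d=>e=>z chain and reports the first step that finds nothing. 'traceSplit' is a hypothetical name, the messages are made up, and this is only an illustration to try outside iMacros, not something 'EVAL()' provides.

```javascript
// Hypothetical debugging helper; not an iMacros function.
// Returns the URL on success, or a short message naming the failing step.
function traceSplit(s, d1, d2) {
  var a = s.split(d1);
  if (a.length < 2) { return 'failed at a: d1 not found'; }
  var b = a[1].split(d2);
  if (b.length < 2) { return 'failed at b: d2 not found after d1'; }
  var c = b[1].split('http');
  if (c.length < 2) { return 'failed at c: no http after d2'; }
  var d = c[1].split('"');
  var e = d[0];
  var z = 'http' + e;
  return z;
}

// Made-up inputs to show a success and two failures.
var ok = traceSplit('A x B "href":"https://example.com/p" end', 'A', 'B');
var noD1 = traceSplit('nothing here', 'A', 'B');
var noD2 = traceSplit('A only', 'A', 'B');
```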
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Tue Sep 18, 2018 6:08 pm

kevnad wrote:Any idea why I get this error when going to the URL?

Code:

Error -1350: Error loading page. The protocol is not known and no pluggable protocols have been entered that match. Line 40: URL GOTO={{URL_nav}}
I use this command once I have the URL:

Code:

URL GOTO={{URL_nav}}
I don't know if it is site-related or something else.

thanks
Hum, well..., if you've made sure to adapt the "SET Descr_1 "xxxxxx-EOP"" to your "real" Content, and you do get some "https://xxx" in the 'PROMPT' for the URL (and no "__undefined__"), then "it" should work... :?
kevnad
Posts: 25
Joined: Wed Mar 02, 2011 3:22 pm

Re: Extracting data of JSON

Post by kevnad » Tue Sep 18, 2018 6:56 pm

Yes the URL is OK

Must be something with the site

Thanks for your help!
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Tue Sep 18, 2018 7:13 pm

Hum, strange... :?

You don't get "extra" "http://" by any chance (=> "http://https://xxx") if the Domain Name doesn't start with "www"...?
"We"'ve had similar Cases in the past with different Versions of iMacros, especially with some 'about:xxx' and 'view-source:' URL's where some "http://" would get automatically added to the "raw" URL by iMacros or the Browser... Hum, came from iMacros itself in one Case I remember on FF...
Although, hum..., iMB v12.0 has now been around for more than 1 year, so I think that if this were the case, it would already have been reported...
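
One way to check the points raised above before handing the value to 'URL GOTO' is to classify the string first. This is only a sketch: 'describeUrl' and its four labels are hypothetical, and the sample URLs in the usage lines are made up.

```javascript
// Hypothetical check; not an iMacros feature.
// Classifies a URL string so an odd value is noticed before navigation.
function describeUrl(u) {
  if (typeof u !== 'string' || u === '' || u === 'undefined') { return 'empty'; }
  if (/^https?:\/\/https?:\/\//i.test(u)) { return 'doubled protocol'; }
  if (/^https?:\/\//i.test(u)) { return 'looks fine'; }
  return 'unexpected protocol';
}

// Made-up sample values.
var fine = describeUrl('https://example.com/x');
var doubled = describeUrl('http://https://example.com/x');
var other = describeUrl('myapp://open/detail');
var empty = describeUrl('undefined');
```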
kevnad
Posts: 25
Joined: Wed Mar 02, 2011 3:22 pm

Re: Extracting data of JSON

Post by kevnad » Tue Sep 18, 2018 7:16 pm

Well, the page is actually displayed correctly, but I still get the error.

It might be something weird in the page itself.

This is a mobile site for the mobile apps that we test with iMacros, so it might have a few glitches when used in the iMacros Browser.

I'll have people that know the site inside out tomorrow to look at that error and see if they find anything.

thanks
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Tue Sep 18, 2018 7:18 pm

OK... (I'll be interested to hear some Follow-up... :wink: )
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Tue Sep 18, 2018 7:41 pm

Hum..., if it's a Mobile Site, it depends a bit on how the "Mobile Detection" is done, if any...! Either the Site lives at "m.xxx" or "mobile.xxx" for "everybody", or it tries to detect the Client/Browser User Agent and automatically serve the Desktop or Mobile Version. iMB broadcasts its own "iMacros Browser" UA..., which you can modify if needed, using the iMacros '!USERAGENT' Var... :idea:
kevnad
Posts: 25
Joined: Wed Mar 02, 2011 3:22 pm

Re: Extracting data of JSON

Post by kevnad » Wed Sep 19, 2018 6:05 pm

Basically, it's because of links that require the mobile app and are not web links.

When using Internet Explorer, it asks "do you want to allow this website to open an app on your computer" for the link.

I need to put SET !ERRORIGNORE YES before the GOTO and SET !ERRORIGNORE NO after it, and validate the result in the page afterwards instead.
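
The order of those lines can be sketched with a small helper that only assembles the macro text. 'guardedGoto' is a hypothetical name, and the final 'TAG' line is simply reused from the first post as one possible starting point for validating the page; which element to check is an assumption that depends on the site.

```javascript
// Hypothetical helper: builds macro text only, it does not run anything.
function guardedGoto(urlVar) {
  return [
    'SET !ERRORIGNORE YES',                   // ignore the navigation error
    'URL GOTO={{' + urlVar + '}}',
    'SET !ERRORIGNORE NO',                    // report errors again
    'TAG POS=1 TYPE=BODY ATTR=* EXTRACT=TXT'  // assumption: extract text to validate afterwards
  ].join('\n');
}

var macroText = guardedGoto('URL_nav');
```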
chivracq
Posts: 7730
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting data of JSON

Post by chivracq » Wed Sep 19, 2018 8:01 pm

OK, sounds a bit "logical"..., and yep, '!ERRORIGNORE' can do "Miracles", ah-ah...!
(Most of my own Scripts run in "full" '!ERRORIGNORE=YES' anyway, I find it much more "reliable". I usually only switch off '!ERRORIGNORE' when I need to "debug" a Script...; in 80% of Cases the Runtime Errors then already give the "Solution" for why something is failing and why the Script doesn't behave as expected... :wink: )

Did you understand a bit otherwise how I "constructed" your 'EVAL()' Statement(s) to "isolate" the Data you wanted to keep as final Result...?

Hum, and in your "Case" with some missing App to open some Links, you might need to "tune" (= shorten) '!TIMEOUT_PAGE'; otherwise, when loading the Page, iMacros might wait the full 60 Sec before proceeding to the next Line in your Script..., "thinking" the Page is still loading..., which for iMacros (and the Browser) will never finish... :idea:

It's possible in such Cases to "tweak" both '!TIMEOUT_xxx' Settings ('!TIMEOUT_PAGE' + '!TIMEOUT_STEP'). Say your Page usually loads in 5 Sec, but in some (rare) Cases could take up to 20 Sec: you then specify '!TIMEOUT_PAGE=5' and '!TIMEOUT_STEP=15' for the first/next 'TAG' Statement, which allows an extra 15 Sec for the Page to keep loading until that 'TAG' Element can be found. After that, you lower '!TIMEOUT_STEP' back to "1" or "0" if you expect some 'TAG' Statement(s) not to find their Elements and want your Script to keep running reasonably quickly... But it's all a question of "Tuning"...! It depends on the Site/Server/(Content on the) Page and your Connection... :wink:
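
The arithmetic behind the 5 Sec / 15 Sec example can be written out as a small sketch. 'timeoutSettings' is a hypothetical helper that only produces the two 'SET' lines as text; the split into "typical" and "worst case" load times follows the example above, and the right values remain a matter of tuning.

```javascript
// Hypothetical helper: derives the two settings from a typical and a
// worst-case page load time, both in seconds. Builds text only.
function timeoutSettings(typicalSec, worstSec) {
  // Extra waiting time left to the first TAG after the page timeout.
  var extra = Math.max(worstSec - typicalSec, 0);
  return [
    'SET !TIMEOUT_PAGE ' + typicalSec,
    'SET !TIMEOUT_STEP ' + extra
  ];
}

var settings = timeoutSettings(5, 20);
```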
Last edited by chivracq on Thu Sep 20, 2018 1:47 pm, edited 1 time in total.