Extracting piece of JSON from code

by robodoom on Wed Nov 30, 2016 2:33 pm

Hey guys

This really is my last resort; I have tried every combination I could think of to get this little script to work.

I am trying to extract the part of the JSON that sits between the square brackets [], including the brackets themselves.
To work out what the regex should look like, I tried regexr.com.

The code I'm working with:
Code: Select all
<script>
   var profitChartData = [{"date":2011,"profit":570000,"income":4578000},{"date":2012,"profit":50442000,"income":127168000},{"date":2013,"profit":72790000,"income":216381000},{"date":2014,"profit":18135000,"income":211685000},{"date":2015,"profit":60730000,"income":457759000}];
</script>

What regexr suggests as a solution:
Code: Select all
profitChartData\s\=\s([^]+);


Of course, it doesn't work...

This is what my non-working script looks like at the moment:
Code: Select all
VERSION BUILD=9030808 RECORDER=FX
TAB T=1

URL GOTO=URL
WAIT SECONDS=5
SEARCH SOURCE=REGEXP:"profitChartData\s\=\s([^]+)" EXTRACT="$1"


info:
firefox version: 50.0
imacros for firefox: 9.0.3
windows7 64bit
robodoom
 
Posts: 1
Joined: Sun Sep 27, 2015 3:29 pm

Re: Extracting piece of JSON from code

by chivracq on Wed Nov 30, 2016 9:14 pm

Ah-ah...! Nice to see that you registered on the Forum more than a year ago and only today needed to ask a question; you are doing well with iMacros, I reckon... :D

OK, I never understood how 'SEARCH' and REGEXP work either (well, I never actually tried to understand!), because I find it much easier to use 'EVAL()' with two 'split()' calls on the 'EXTRACT=HTM' data of an HTML element containing the data you are interested in. I've posted "my Method" a few times already on the Forum, by the way. You can even do it at the "high" 'HTML' or 'BODY' levels, but the extract can then take several seconds. It is better and quicker to isolate the "smallest/lowest" element, like:
Code: Select all
SET !EXTRACT_TEST_POPUP NO
SET !EXTRACT NULL
TAG POS=1 TYPE=SCRIPT ATTR=TXT:*profitChartData* EXTRACT=HTM
SET profitChartData EVAL("var s='{{!EXTRACT}}'; var x,y,z; y=s.split('['); z=y[1].split(']'); '['+z[0]+']';")
PROMPT {{profitChartData}}

Not tested, as you didn't provide the URL or an HTML "Save As" of the page, and I cannot really test locally because of all the spaces and double quotes, but it should work. If it doesn't with 'POS=1', then try 'POS=2' etc., in case "profitChartData" is mentioned/reused in several scripts on your page. But you were going for '$1', so I guess that's the first occurrence, and 'POS=1' should then be correct, I reckon...
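The two-step split() idea can be tried outside iMacros first. The following is a minimal sketch in plain JavaScript, not part of the macro: the helper name extractBetweenBrackets and the shortened sample string are made up for illustration. It puts the brackets back around the result, since the question asked for them, and it assumes the data contains no nested square brackets (the second split() would cut at the first ']').

```javascript
// Hypothetical helper (not an iMacros function): takes everything after the
// first '[' and cuts it at the next ']', like the two split() calls in the macro.
function extractBetweenBrackets(source) {
  const afterOpen = source.split('[')[1];
  if (afterOpen === undefined) {
    return null; // no '[' found in the source
  }
  const inner = afterOpen.split(']')[0];
  // split() drops the delimiters, so add the brackets back.
  return '[' + inner + ']';
}

// Shortened sample of the script content from the question.
const sample =
  'var profitChartData = [{"date":2011,"profit":570000,"income":4578000},' +
  '{"date":2012,"profit":50442000,"income":127168000}];';

const jsonText = extractBetweenBrackets(sample);
const rows = JSON.parse(jsonText); // throws if the extracted text is not valid JSON
console.log(rows.length, rows[0].profit); // logs: 2 570000
```

Running JSON.parse() on the result is a cheap way to confirm that the cut landed on the right brackets before wiring the expression into EVAL().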
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 
Posts: 5568
Joined: Sat Apr 13, 2013 6:07 am
Location: Amsterdam (NL)

Re: Extracting piece of JSON from code

by chivracq on Thu Dec 01, 2016 9:12 am

And hum, a mini-remark: your Thread will probably get moved to the 'Data Extraction' Sub-Forum, as it has nothing specific to 'iMacros for FF'. Try to select the "correct" Sub-Forum when you open a Thread... :idea: :roll:

Opening it in the 'iMacros for CR' Sub-Forum would have been legitimate if you were using iMacros for CR, because the 'SEARCH' command is not implemented on CR, according to the Wiki...
And that's where "my Method" comes in handy, as it "bypasses" the 'SEARCH' command and will work on CR as well...! 8)

Some other Advanced User (@iimfun) will probably/hopefully also give you the correct REGEXP syntax when he notices the Thread, so that your original solution with 'SEARCH' works as well... That's his "Specialty", ah-ah...! REGEXP is definitely not my "piece of cake", beurk...! :wink:
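For reference, here is a minimal sketch of a tighter pattern, written in plain JavaScript rather than as a 'SEARCH' command; whether 'SEARCH' accepts it unchanged has not been verified here. One possible weakness of the regexr suggestion is that '[^]' is a JavaScript-only way of writing "any character, including newlines", and with a greedy '+' it would run on to the last ';' in the whole page source. The names chartPattern and extractProfitChartData and the sample string are made up for illustration.

```javascript
// Capture from the opening '[' lazily up to the first '];' after it.
// [\s\S] matches any character, newlines included.
const chartPattern = /profitChartData\s*=\s*(\[[\s\S]*?\]);/;

// Hypothetical helper: returns the bracketed JSON text, or null if absent.
function extractProfitChartData(pageSource) {
  const match = chartPattern.exec(pageSource);
  return match ? match[1] : null;
}

// Shortened sample, with a second statement after the array to show
// that the lazy quantifier stops at the first '];'.
const chartSource =
  'var profitChartData = [{"date":2011,"profit":570000,"income":4578000},' +
  '{"date":2012,"profit":50442000,"income":127168000}];\n' +
  'var somethingElse = 1;';

const chartJson = extractProfitChartData(chartSource);
const chartRows = JSON.parse(chartJson);
console.log(chartRows.length, chartRows[1].date); // logs: 2 2012
```

Like the split() approach, this assumes the array itself contains no '];' sequence and no nested square brackets that end in one.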

Re: Extracting piece of JSON from code

by chivracq on Wed Dec 07, 2016 10:42 pm

Hum..., 7 days after posting my "Solution", and having checked this Thread a few times since, I am a bit disappointed not to see any follow-up here, while I know, @OP, that you checked the Forum after my 2 replies and must have seen them... :shock:

OK, fair enough..., never mind, and enjoy "my Solution" (which I am pretty confident must work..., and I rarely write "ready to use" scripts). But don't be surprised if I don't try to help you anymore in the future if you ever have some other problem, as I only help Users who use the Forum "a bit correctly", and a neat follow-up on (all) their Threads is one of my "Criteria", oops...! :roll:

