Extracting piece of JSON from code

robodoom
Posts: 1
Joined: Sun Sep 27, 2015 10:29 pm

Extracting piece of JSON from code

Post by robodoom » Wed Nov 30, 2016 9:33 pm

Hey guys

This really is my last resort; I have tried all sorts of combinations to get this little script to work.

I am trying to extract the part of the JSON that sits between the square brackets [], brackets included.
To work out what the regex should look like, I tried regexr.com.

The code I'm working with:

<script>
	var profitChartData = [{"date":2011,"profit":570000,"income":4578000},{"date":2012,"profit":50442000,"income":127168000},{"date":2013,"profit":72790000,"income":216381000},{"date":2014,"profit":18135000,"income":211685000},{"date":2015,"profit":60730000,"income":457759000}];
</script>

What regexr suggests as a solution:

profitChartData\s\=\s([^]+);

Of course, it doesn't work...

This is what my non-working script looks like at the moment:

VERSION BUILD=9030808 RECORDER=FX
TAB T=1

URL GOTO=URL
WAIT SECONDS=5
SEARCH SOURCE=REGEXP:"profitChartData\s\=\s([^]+)" EXTRACT="$1"

Info:
firefox version: 50.0
imacros for firefox: 9.0.3
windows7 64bit
chivracq
Posts: 9004
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extracting piece of JSON from code

Post by chivracq » Thu Dec 01, 2016 4:14 am

Ah-ah...! Nice to see that you registered on the Forum more than a year ago and only today need to ask a question; you are doing well with iMacros, I reckon... :D

OK, I never understood how 'SEARCH' and REGEXP work either (well, I never really tried to!), because I find it much easier to use 'EVAL()' with two 'split()' calls on the 'EXTRACT=HTM' data of an HTML element containing the data you are interested in. I've posted "my Method" a few times on the Forum already, btw. You can even do it at the high 'HTML' or 'BODY' level, but the extract can then take quite a few seconds; it is better and quicker to isolate the smallest/lowest element, like:

SET !EXTRACT_TEST_POPUP NO
SET !EXTRACT NULL
TAG POS=1 TYPE=SCRIPT ATTR=TXT:*profitChartData* EXTRACT=HTM
' Take the text between the first [ and the first ], then put the brackets back
SET profitChartData EVAL("var s='{{!EXTRACT}}'; var y,z; y=s.split('['); z=y[1].split(']'); '['+z[0]+']';")
PROMPT {{profitChartData}}

Not tested, as you didn't provide the URL or an HTML "Save As" of the Page, and I cannot really test locally because of all the Spaces and Double Quotes, but it should work... If it doesn't with 'POS=1', try 'POS=2' etc., in case the "profitChartData" part is mentioned/reused in several Scripts on your Page. But you were going for '$1', so I guess you want the first Occurrence, and 'POS=1' should then be correct, I reckon...
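
For reference, the two-'split()' idea can also be sketched in plain JavaScript outside iMacros. This is only a minimal sketch under assumptions: the helper name `extractArrayText` and the shortened sample string are made up for illustration, and it has not been run inside 'EVAL()'. It only handles a flat array like the one in the first post (no nested square brackets).

```javascript
// Hypothetical helper mirroring the two split() calls used in the EVAL() line.
// Assumes the array text contains no nested '[' or ']' characters.
function extractArrayText(source) {
  const parts = source.split('[');       // parts[1] starts right after the first '['
  if (parts.length < 2) return null;     // no '[' in the source at all
  const inner = parts[1].split(']')[0];  // keep everything up to the first ']'
  return '[' + inner + ']';              // put the brackets back around the content
}

// Sample shaped like the <script> content from the first post (shortened to two records).
const scriptText =
  'var profitChartData = [{"date":2011,"profit":570000,"income":4578000},' +
  '{"date":2012,"profit":50442000,"income":127168000}];';

const arrayText = extractArrayText(scriptText);
const records = JSON.parse(arrayText);   // the extracted text is valid JSON
console.log(records.length, records[0].profit);
```

Because the brackets are put back, the result can be handed straight to `JSON.parse`, which is a quick way to check that the extraction did not cut the data short.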
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...

Re: Extracting piece of JSON from code

Post by chivracq » Thu Dec 01, 2016 4:12 pm

And hum, a mini-remark: your Thread will probably get moved to the 'Data Extraction' Sub-Forum, as it has nothing specific to 'iMacros for FF'. Try to select the "correct" Sub-Forum when you open a Thread... :idea: :roll:

Had you been using iMacros for CR, opening it in the 'iMacros for CR' Sub-Forum would have been legit, because the 'SEARCH' Command is not implemented on CR, according to the Wiki...
And that's where "my Method" comes in handy, as it bypasses the 'SEARCH' Command and will work on CR as well...! 8)

Some other Advanced User (@iimfun) will probably/hopefully also give you the correct REGEXP Syntax when he notices the Thread, so that your original Solution with 'SEARCH' works as well... That's his "Specialty", ah-ah...! REGEXP is definitely not my piece of cake, beurk...! :wink:
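
In the meantime, here is a hedged sketch of what a capturing pattern for this data could look like in plain JavaScript. The pattern, the function name `captureChartData` and the shortened sample source are assumptions for illustration; whether iMacros 'SEARCH' accepts the same pattern as written has not been checked here.

```javascript
// Capture group 1 is the array text, brackets included.
// [\s\S]*? matches any characters (newlines too) lazily, up to the first "];".
const pattern = /profitChartData\s*=\s*(\[[\s\S]*?\]);/;

// Hypothetical helper: returns the bracketed array text, or null if there is no match.
function captureChartData(html) {
  const match = pattern.exec(html);
  return match ? match[1] : null;
}

// Sample page source shaped like the snippet in the first post (shortened to two records).
const pageSource =
  '<script>\n\tvar profitChartData = ' +
  '[{"date":2014,"profit":18135000,"income":211685000},' +
  '{"date":2015,"profit":60730000,"income":457759000}];\n</script>';

const captured = captureChartData(pageSource);
console.log(captured);
```

The opening bracket is matched literally with `\[` so that the group starts at the array itself, and the trailing `;` stays outside the group so that it is not part of the extracted JSON.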

Re: Extracting piece of JSON from code

Post by chivracq » Thu Dec 08, 2016 5:42 am

Hum..., 7 days after posting my "Solution", and having checked this Thread a few times since, I am a bit disappointed not to see any Follow-up, while I know, @OP, that you visited the Forum after my 2 Replies and must have seen them... :shock:

OK, fair enough..., never mind, and enjoy "my Solution" (which I am pretty confident must work..., and I rarely write "ready to use" Scripts). But don't be surprised if I don't try to help you in the future should you ever have another problem, as I only help Users who use the Forum "a bit correctly", and a neat Follow-up on (all) their Threads is one of my "Criteria", oops...! :roll: