kevnad wrote:
chivracq wrote:
1- Do you still get the (full) Extract in the 'EXTRACT_TEST_POPUP'...?
2- Does "PROMPT _{{!EXTRACT}}_" work...?
3- Same Qt's with 'EXTRACT=HTM' instead of 'EXTRACT=TXT'...?
4- Does your 'PROMPT' on 'EVAL()' work using 'EXTRACT=HTM'...?:
Code:
TAG POS=1 TYPE=BODY ATTR=* EXTRACT=HTM
SET URL_nav EVAL("var s='{{!EXTRACT}}'; s;")
PROMPT URL_nav:<BR>_{{URL_nav}}_
5- & 6- Can you try a 'SAVEAS' both with 'EXTRACT=TXT' and '=HTM'...? (In order to bypass the 'PROMPT'...)
Example with 'EXTRACT=TXT':
Code:
TAG POS=1 TYPE=BODY ATTR=* EXTRACT=TXT
SET URL_nav EVAL("var s='{{!EXTRACT}}'; s;")
PROMPT URL_nav:<BR>_{{URL_nav}}_
SET !EXTRACT {{URL_nav}}
SAVEAS TYPE=EXTRACT FOLDER=* FILE=JSON_Test_{{!NOW:yyyy-mm-dd_hhhnn}}.txt
1- Yes, I have the full extract with the EXTRACT_TEST_POPUP
2- Yes, the prompt extract works fine
3- Same problem with EXTRACT=HTM as with EXTRACT=TXT
4- No, the EVAL does not work with EXTRACT HTM
5-6 The SAVEAS works fine.
Really, the issue is with the EVAL command.
I'm trying to use the SEARCH command with a REGEXP and the EXTRACT option. I'm still trying to figure out the regex to get everything up to the sondage section. I think it might work that way, since the ' character will then not be present in the extract.
I'll let you know how it goes!
Thanks
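One possible reading of why the ' character matters, sketched in plain JavaScript (the language that EVAL() runs). The sketch assumes the {{!EXTRACT}} placeholder is pasted as raw text into the script before it is parsed, which the thread does not confirm; `buildEvalSource` and `tryEval` are hypothetical helper names and the sample strings are made up.

```javascript
// Assumption: the macro engine replaces the placeholder with the raw extract,
// so the extract becomes part of the JavaScript source itself.
function buildEvalSource(extract) {
  return "var s='" + extract + "'; s;";
}

// Evaluate the generated source and report success or the error type.
function tryEval(source) {
  try {
    return { ok: true, value: eval(source) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// Hypothetical extracts: one without and one with an apostrophe.
const plain = tryEval(buildEvalSource('{"messages":[]}'));
const withQuote = tryEval(buildEvalSource("l'avis du sondage"));

console.log(plain);
console.log(withQuote);
```

Under that assumption, the apostrophe closes the single-quoted literal early and the rest of the text is parsed as code, which would explain why cutting the extract before the sondage section helps.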
Hum, if 5 & 6 are working, that means that the '!EXTRACT' "transiting" through 'EVAL()' is not the Pb..., and the "End_Of_Page" Trick I mentioned might/should work, I would think...
Ouf-ouf...!, 'SEARCH' + 'REGEX' is indeed a "good Idea", but hum, good luck, ah-ah...!, ['EXTRACT' + 'EVAL()' + 'split()'] is actually my "Workaround" that I find easier to use and more powerful than ['SEARCH' + 'REGEX'] that I find "a bit too complicated", ah-ah...!
kevnad wrote:
kevnad wrote:
I'm trying to use the SEARCH command with a REGEXP and the EXTRACT option. I'm still trying to figure out the regex to get everything up to the sondage section. I think it might work that way, since the ' character will then not be present in the extract.
I'll let you know how it goes!
Thanks
Think I got it
Code:
SEARCH SOURCE=REGEXP:"messages(.*La Caisse)" EXTRACT=$1
So I extract everything from "messages" (which is at the start) up to the string "La Caisse", which is at the start of the sondage section.
I will try to find something better than "La Caisse", because I believe that in the future it might be elsewhere in the page.
Thanks for your help! Greatly appreciated.
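To illustrate what the capture group in that pattern returns, here is a minimal JavaScript sketch of the same regular expression. The sample strings are hypothetical (the real JSON file is not shown in the thread), and whether SEARCH accepts the lazy `.*?` form is not verified here.

```javascript
// Hypothetical sample source, loosely shaped like the JSON in the thread.
const source =
  '{"messages":[],"detention":{"solde":"12"},"sondage":{"titre":"La Caisse vous invite"}}';

// Same pattern as in the SEARCH command; group 1 corresponds to EXTRACT=$1.
const match = source.match(/messages(.*La Caisse)/);
const extracted = match ? match[1] : null;
console.log(extracted);

// If the end marker occurs more than once, greedy .* runs to the LAST one,
// while lazy .*? stops at the FIRST one.
const twice = 'messages A La Caisse B La Caisse C';
const greedy = twice.match(/messages(.*La Caisse)/)[1];
const lazy = twice.match(/messages(.*?La Caisse)/)[1];
console.log(greedy);
console.log(lazy);
```

Note that group 1 starts after the word "messages" and includes the "La Caisse" marker itself, so the extract still ends with that text.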
OK, I had a bit "missed" this last Post that came onto a 3rd Page in the Thread and I implemented in the meantime my "End_Of_Page" Solution...:
Code:
VERSION BUILD=8820413 RECORDER=FX
SET !EXTRACT_TEST_POPUP NO
SET !ERRORIGNORE YES
TAB T=1
'URL GOTO=file:///D:/TEMP/iMacros/Temp/_Forum%20Cases/kevnad/extract.json
'URL GOTO=file:///D:/TEMP/iMacros/Temp/_Forum%20Cases/kevnad/extractv2.json
'Easy Access Vars:
SET Descr_1 "yyyyyy-EOP"
'SET Descr_1 "-EOP"
SET Descr_2 "C.D. DU VIEUX-LONGUEUIL"
SET End_Of_Page "sectionCartesPretsMarges"
'Extract the full Content of the JSON File:
'TAG POS=1 TYPE=PRE ATTR=TXT:{<SP>"messages":[<SP>],<SP>"detention":{<SP>"messages":[* EXTRACT=TXT
TAG POS=1 TYPE=BODY ATTR=* EXTRACT=TXT
'Truncate the Extract to remove the "sondage" Section which contains a _'_ that seems to be problematic with 'EVAL()':
'=> 2 Methods, using 'split()' or 'indexOf()':
SET Extract_Trunc_1 EVAL("var s='{{!EXTRACT}}', eop='{{End_Of_Page}}', x,y,z; x=s.split(eop); z=x[0]; z;")
'PROMPT Extract_Trunc_1:<BR><BR>_{{Extract_Trunc_1}}_
'>
SET Extract_Trunc_2 EVAL("var s='{{!EXTRACT}}', eop='{{End_Of_Page}}', x,y,z; x=s.indexOf(eop); z=s.substring(0,x); z;")
'PROMPT Extract_Trunc_2:<BR><BR>_{{Extract_Trunc_2}}_
'Isolate the 'URL_nav', different Methods...:
'*********************************************
'=> Directly on the Extract (which seems to be problematic with the "sondage" Section):
'SET URL_nav EVAL("var s='{{!EXTRACT}}', d1='{{Descr_1}}', d2='{{Descr_2}}', x,y,z; x=s.split(d1)[1].split(d2); y=x[1].split('http'); z='http'+y[1].split('\"')[0]; z;")
'SET URL_nav EVAL("var s='{{!EXTRACT}}', d1='{{Descr_1}}', d2='{{Descr_2}}', a,b,c,d,e,z; a=s.split(d1); b=a[1].split(d2); c=b[1].split('http'); d=c[1].split('\"'); e=d[0]; z='http'+e; z;")
'>
'=> Using the truncated Extract:
'SET URL_nav EVAL("var s='{{Extract_Trunc_2}}', d1='{{Descr_1}}', d2='{{Descr_2}}', x,y,z; x=s.split(d1)[1].split(d2); y=x[1].split('http'); z='http'+y[1].split('\"')[0]; z;")
SET URL_nav EVAL("var s='{{Extract_Trunc_2}}', d1='{{Descr_1}}', d2='{{Descr_2}}', a,b,c,d,e,z; a=s.split(d1); b=a[1].split(d2); c=b[1].split('http'); d=c[1].split('\"'); e=d[0]; z='http'+e; z;")
PROMPT URL_nav:<BR>_{{URL_nav}}_
(Tested on iMacros for FF v8.8.2, PM v26.3.3 (=FF47), Win10_x64.)
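The two truncation Methods from the macro ('split()' and 'indexOf()') can be compared in plain JavaScript as follows. This is only a sketch: the sample extract is hypothetical, only the marker name "sectionCartesPretsMarges" comes from the macro, and `truncSplit` / `truncIndexOf` are illustrative names.

```javascript
// Hypothetical extract; the text after the marker contains an apostrophe.
const extract =
  '{"messages":[],"url":"http://example.com/a","sectionCartesPretsMarges":{},"sondage":"l\'avis"}';
const eop = 'sectionCartesPretsMarges';

// Method 1: keep everything before the first occurrence of the marker.
function truncSplit(s, marker) {
  return s.split(marker)[0];
}

// Method 2: same idea, using the position of the marker.
function truncIndexOf(s, marker) {
  return s.substring(0, s.indexOf(marker));
}

const viaSplit = truncSplit(extract, eop);
const viaIndexOf = truncIndexOf(extract, eop);
console.log(viaSplit === viaIndexOf);

// The two differ when the marker is missing:
// split() keeps the whole string, indexOf() yields -1 and an empty result.
const missingSplit = truncSplit(extract, 'noSuchMarker');
const missingIndexOf = truncIndexOf(extract, 'noSuchMarker');
console.log(missingSplit.length, missingIndexOf.length);
```

So both give the same result while the marker is present, but if the page ever drops the marker, the 'split()' variant passes the untruncated text on and the 'indexOf()' variant passes on an empty string.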
You may want to re-enable the 2x 'PROMPT' about 'Extract_Trunc_[1|2]', and see if the 4th Method I chose is working for you... (All 4 work for me.)
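The chained 'split()' calls that isolate 'URL_nav' in the macro can be followed step by step in plain JavaScript. This is a sketch on made-up data: the sample text and URL are hypothetical, `isolateUrl` is an illustrative name, and the null checks are an addition that the macro itself does not have.

```javascript
// Hypothetical truncated extract containing both descriptors and a URL.
const truncated =
  'id:1 yyyyyy-EOP {"nom":"C.D. DU VIEUX-LONGUEUIL","lien":"http://example.com/nav?id=42"},';

// Walk the text: after d1, then after d2, then from the next "http"
// up to the closing double quote.
function isolateUrl(s, d1, d2) {
  const a = s.split(d1);
  if (a.length < 2) return null;        // first descriptor not found
  const b = a[1].split(d2);
  if (b.length < 2) return null;        // second descriptor not found
  const c = b[1].split('http');
  if (c.length < 2) return null;        // no URL after the descriptors
  const e = c[1].split('"')[0];         // text up to the closing quote
  return 'http' + e;
}

const urlNav = isolateUrl(truncated, 'yyyyyy-EOP', 'C.D. DU VIEUX-LONGUEUIL');
console.log(urlNav);

// With a missing descriptor the function returns null instead of throwing.
const none = isolateUrl('nothing here', 'yyyyyy-EOP', 'C.D. DU VIEUX-LONGUEUIL');
console.log(none);
```

Splitting on 'http' also covers an 'https' link, since the 's' stays in the part that gets the 'http' prefix put back.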
But your 'SEARCH' Solution is quite nice as well, (I'm impressed actually, ah-ah...!), and I reckon you could use like me the "sectionCartesPretsMarges" String that I used as the "End_Of_Page" instead of your "La Caisse".