Half of extracted data goes into 1 cell the rest is separate

by senor pengwin on Thu Aug 17, 2017 3:39 pm

Hey guys,

I need help with a couple things. I have created a macro to go into a webpage and download a table and create an excel file. Easy enough.

When the table has a small amount of information it puts all of the information into one cell as one long string. This is ok because I have created a VBA script to split the text and format it how I want.

The problem I am having is that when I download the table and it has a lot of information for a whole month it will put the first half of the report all into one cell and then the second half goes into individual rows.

The macro I have created can't work with both formats. Is there something I can do when I extract to either have it all go into one cell or have them all go into individual rows but not both?

Thanks
senor pengwin
 
Posts: 11
Joined: Thu Aug 17, 2017 3:33 pm

Re: Half of extracted data goes into 1 cell the rest is sepe

by chivracq on Thu Aug 17, 2017 4:41 pm

CIM...! :mrgreen: (Read my Sig...)
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 
Posts: 6490
Joined: Sat Apr 13, 2013 6:07 am
Location: Amsterdam (NL)

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Fri Aug 18, 2017 7:13 am

chivracq wrote:CIM...! :mrgreen: (Read my Sig...)


Sorry about that:

1. iMacros for Chrome 8.4.4
2. Google Chrome Version 60.0.3112.101
3. Windows 10 Home

Re: Half of extracted data goes into 1 cell the rest is sepe

by chivracq on Fri Aug 18, 2017 9:17 am

OK for FCI...

Well, Script and URL not posted, then it's difficult to know exactly what your Script is doing (or supposed to do)... Would be easier if you could post that Info, or adapt for example the 'Extract-Table.iim' Demo-Macro to demonstrate your Pb and what you want exactly... :idea:

But the general Idea would be to "manipulate" the Content of '!EXTRACT' directly using 'EVAL()' (and not with 'ADD' or the "automatic" Extract Mechanism) to include/reorganize specific Content or modify its Structure before doing the 'SAVEAS'.

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Fri Aug 18, 2017 9:59 am

Here is the script I am using. The website to get the information is a website that I need to use login credentials for, do you need to actually visit the site?

Here is my code. I have tried changing the Extract type to BODY or TABLE, but I get the same result.
Code: Select all
VERSION BUILD=844 RECORDER=CR
SET !ENCRYPTION NO
URL GOTO=https://eym.sicomasp.com/index.php
TAG POS=2 TYPE=SPAN ATTR=TXT:Custom<SP>User<SP>defineable<SP>report<SP>generator
FRAME F=3
TAG POS=1 TYPE=STRONG ATTR=TXT:Control<SP>Cash<SP>Summary
FRAME F=5
TAG POS=1 TYPE=SELECT ATTR=NAME:hierarchy_selector CONTENT=%5
TAG POS=2 TYPE=INS ATTR=TXT:
TAG POS=1 TYPE=SELECT FORM=NAME:form1 ATTR=NAME:range__date_selector__start__end CONTENT=%10
FRAME F=4
TAG POS=1 TYPE=A ATTR=TXT:Run<SP>Report
FRAME F=3
TAG POS=1 TYPE=HTML ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=mytable_{{!NOW:yymmdd_hhnnss}}.csv


This is what it looks like in the Excel file. You can see cell A1 has most of the data but after a certain point it cuts off and continues on individual rows:
Image

Re: Half of extracted data goes into 1 cell the rest is sepe

by chivracq on Fri Aug 18, 2017 11:02 am

Hum, Site is indeed behind L&P... Yep of course, if I can have a look at the Site/Page, I can provide some more precise Help/Advice and "play" with the Page myself...

Screenshot on 'Photobucket' doesn't contain any Data, you can better attach it directly to your Thread...

But OK, from your Script, I see you only do one Extract which is supposed to contain all the Data you want...
'TYPE=TABLE' would normally be more "precise" than 'HTML' or 'BODY' but the Data is displayed in a Frame, so it's possible that this Frame only contains this Table and then it will indeed make no Difference...
You can try posting the direct URL to that Frame, it's sometimes possible to access such direct URL's without the need to log in, depending on how the Web-Server is configured...

But like I already mentioned, you could use 'EVAL()' to enclose the complete '!EXTRACT' between Double Quotes at the beginning and end, which should save the whole Content into one single Cell in your '.CSV' to allow your '.VBA' Macro to be able to handle that Data...
Try something like this for example:
Code: Select all
FRAME F=3
TAG POS=1 TYPE=HTML ATTR=* EXTRACT=TXT
SET !EXTRACT EVAL("var s='{{!EXTRACT}}'; s='\"'+s+'\"'+'[EXTRACT]'; s;")
SAVEAS TYPE=EXTRACT FOLDER=* FILE=mytable_{{!NOW:yymmdd_hhnnss}}.csv
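
A note on why quoting alone can be fragile: in a CSV file, a value only stays in one cell if every double quote inside it is doubled before the whole value is wrapped in quotes. Separately, '{{!EXTRACT}}' is pasted into the JavaScript source of the 'EVAL()', so a single quote or a line break in the extracted text will probably end the string literal early. The sketch below is plain JavaScript run outside iMacros; 'csvQuoteField' is a hypothetical helper name, not an iMacros function, and it only illustrates the quoting rule.

```javascript
// Hypothetical helper (an assumption for illustration, not part of iMacros):
// quote one value so that it can be stored as a single CSV field.
function csvQuoteField(value) {
  // Double every embedded double quote, then wrap the whole value in quotes.
  return '"' + String(value).replace(/"/g, '""') + '"';
}

// Commas and line breaks are safe inside the quoted field;
// only the embedded double quotes need to be doubled.
console.log(csvQuoteField('Total "net" sales, Aug'));
```

Whether iMacros already applies this kind of quoting when it runs 'SAVEAS TYPE=EXTRACT' is not something this thread shows, so treat the helper as an illustration of the CSV rule only.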

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Fri Aug 18, 2017 12:32 pm

I tried this but it didn't work, it just put "" in the cell and the rest of the data went underneath. I will attach the full code with login and pw. You don't have privileges to make any changes, but I would like to take the login info down once you have been able to get to the report. Thanks for your help.

Code: Select all
VERSION BUILD=844 RECORDER=CR
SET !ENCRYPTION NO
URL GOTO=https://eym.sicomasp.com/index.php
WAIT SECONDS=1

TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:form1 ATTR=NAME:XXX_login_name CONTENT=
TAG POS=1 TYPE=INPUT:PASSWORD FORM=NAME:form1 ATTR=NAME:XXX_login_password CONTENT=
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:form1 ATTR=NAME:XXX_actionSUBMITLOGIN

TAG POS=2 TYPE=SPAN ATTR=TXT:Custom<SP>User<SP>defineable<SP>report<SP>generator
WAIT SECONDS=1
FRAME F=3
TAG POS=1 TYPE=STRONG ATTR=TXT:Control<SP>Cash<SP>Summary
WAIT SECONDS=1
FRAME F=5

TAG POS=1 TYPE=SELECT ATTR=NAME:hierarchy_selector CONTENT=%5
WAIT SECONDS=1
TAG POS=2 TYPE=INS ATTR=TXT:
WAIT SECONDS=1

FRAME F=5
TAG POS=1 TYPE=SELECT FORM=NAME:form1 ATTR=NAME:range__date_selector__start__end CONTENT=%10

WAIT SECONDS=1
FRAME F=4
TAG POS=1 TYPE=A ATTR=TXT:Run<SP>Report

WAIT SECONDS=2
FRAME F=3
TAG POS=1 TYPE=HTML ATTR=* EXTRACT=TXT
SET !EXTRACT EVAL("var s='{{!EXTRACT}}'; s='\"'+s+'\"'+'[EXTRACT]'; s;")
SAVEAS TYPE=EXTRACT FOLDER=* FILE=mytable_{{!NOW:yymmdd_hhnnss}}.csv
Last edited by senor pengwin on Fri Aug 18, 2017 6:43 pm, edited 1 time in total.

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Fri Aug 18, 2017 3:32 pm

So I'm thinking the issue is that there are too many characters in the string. It's over 32,500 characters and I'm pretty sure that's close to the limit for a cell, so the rest of the information gets separated. What can I do in this case? I've tried using the EVAL() function to split the info, shown below, but it didn't work :/ Any suggestions?

Code: Select all
SET !VAR1 EVAL("var s=\"{{!EXTRACT}}\"; s.split(' ');")
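
For reference, Excel holds at most 32,767 characters in one cell, so a string of over 32,500 characters is indeed close to that limit. Note also that 'split()' returns an array, not a string, so its result still has to be joined back into text before it is of much use. Below is a minimal sketch in plain JavaScript, run outside iMacros, of grouping the lines of a long report into pieces that each fit in one cell. 'chunkLines' is a hypothetical helper name, and the sketch assumes the lines are separated by '\n'.

```javascript
// Excel stores at most 32,767 characters in a single cell.
const EXCEL_CELL_LIMIT = 32767;

// Hypothetical helper (an assumption for illustration): group the lines of a
// long text into chunks whose length stays within the limit.
// A single line longer than the limit is kept whole in its own chunk.
function chunkLines(text, limit = EXCEL_CELL_LIMIT) {
  const chunks = [];
  let current = []; // lines collected for the chunk being built
  let length = 0;   // length of current.join("\n")
  for (const line of text.split("\n")) {
    // A line costs one extra character (the "\n") unless it starts a chunk.
    const added = current.length === 0 ? line.length : line.length + 1;
    if (current.length > 0 && length + added > limit) {
      chunks.push(current.join("\n"));
      current = [line];
      length = line.length;
    } else {
      current.push(line);
      length += added;
    }
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}

// Stand-in for a long report: 600 lines of 176 characters each.
const report = Array(600).fill("x".repeat(176)).join("\n");
console.log(chunkLines(report).map((chunk) => chunk.length));
```

Splitting on line boundaries rather than at a fixed character count keeps each report row intact, which should make any later splitting in VBA simpler.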

Re: Half of extracted data goes into 1 cell the rest is sepe

by chivracq on Fri Aug 18, 2017 5:11 pm

I've noted the L&P and checked once quickly that I could indeed log in before logging out directly, so you can remove the Credentials from the Script in your own Post... I'll have a look tomorrow, 02h at night for me now on a Friday evening, I want to relax a bit (or only answer Threads that I can quickly answer) and your Case may require quite some Testing, so it may take a while...

Hum, Size Limit is possible, there is indeed a Size Limit somewhere with iMacros for CR, there are 2 or 3 Threads related to that on the Forum..., I remember doing some Testing about that a while ago, but those Threads must be about 2 years old I think, I don't have an Env with CR since at least 1.5 years anymore... Or the Limitation can come from Excel as well, indeed...

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Fri Aug 18, 2017 6:41 pm

Whoops, double post
Last edited by senor pengwin on Fri Aug 18, 2017 6:42 pm, edited 1 time in total.

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Fri Aug 18, 2017 6:42 pm

No problem, thank you so much for your time. I'll spend the night trying to figure it out :mrgreen:

Re: Half of extracted data goes into 1 cell the rest is sepe

by chivracq on Sat Aug 19, 2017 2:46 pm

senor pengwin wrote:Whoops, double post

Yep, can happen..., and you can always edit+delete the last Post in a Thread as long as nobody (yourself or some other User) has posted after you, which means you would have been able to delete the 2nd Post... Now you cannot anymore as I am posting after you... But don't worry, not a big deal...

senor pengwin wrote:No problem, thank you so much for your time. I'll spend the night trying to figure it out :mrgreen:

OK, so I've been busy for quite a few hours having "fun" [sic...!] with your Report today, ah-ah...! :oops:

OK, "we" are hitting quite a few Limitations in this "Case", ah-ah...!

But first, hum..., I saw on the previous Page, next to the 'Create Report' that you had a 'Save Report' Button, I didn't check it (yet) content-wise, but isn't it actually more or less what you want...?
Or doing a 'SAVEAS TYPE=HTML' or similar on the Page itself...?

OK, now we analyze a bit the HTML Structure of the Page...:
The Report is presented on the Page in a 'FRAME' (F=3="wrapper_main"), that's OK, and this Frame contains a "main" 'TABLE' that contains all the Data, from "EYMKING - Control Cash Summary" on the first Line/Row till "MICHIGAN <=> 1573571.48 3235.98 1576807.46 22450 141.00 94639.78 483347.24 3151.79 1191161.04 6071.83 38566.54 12235.50 312.03 4512.36 0.00 0.00 0.00 4630.43 485 5.96 <=> Page: 1/1" for the last 3 Lines/Rows, about 600 Rows further, at the end of the Page/Frame/Table.
That's "OK" as well, and iMacros has a Mechanism to extract a whole Table with just one 'TAG'+'EXTRACT' Statement, except that the Table on this Site contains the whole Data I referred to in only ONE Cell ('TD' HTML Element)...!
But iMacros expects a 'TABLE' to be made of Rows ('TR') + Cells ('TD') within a same Row, not one single Cell...!
And it goes even "worse", ah-ah...! Every Line of this Report is one 'DIV' (=> about 600 'DIV''s for 19 days) and each Cell within a Row is a 'SPAN' Element, and some "fake" nested Tables are used as a kind of "Separator" to visually present the Data a bit nicely...

The "final Result" is actually nice visually, but iMacros cannot do much with it...!:
If you extract the Data in that Frame at the 'HTML'/'BODY'/'TABLE'/'TD' Level and save it (=106Kb of Data for 600 Rows) to a '.CSV', iMacros takes for granted it is just one Cell and when opening your '.CSV' in Excel, Excel expects only one Cell as well...
But then we hit a few Data Size Limits..., the 'EXTRACT' Mechanism + '!EXTRACT' Var + 'SAVEAS' have no Pb with the 106Kb of Data, but 'EVAL()' seems to have one (with Strings), so we cannot "manipulate" the Content of '!EXTRACT' before doing the 'SAVEAS', and Excel has one as well for the Size of Data in one single Cell...

So iMacros can extract 'TABLE''s in one Statement, or it can extract at the 'DIV' or 'SPAN' Level..., but for 'DIV' (= each Line/Row in the Report), that means looping a Script 600 times, and probably about 1000 times (=> 1000 Rows) at the end of a month.
OK, I gave it a try, and it takes about 30 Sec to extract 100 Lines/Rows at the 'DIV' Level (=> 3 Min for 600 Rows, 5 Min for 1000 Rows at the end of a month), but then, yep...!, you get the Data of the Report displayed in Excel in one Col with 600 Rows.
I used the following Script:
Code: Select all
VERSION BUILD=8820413 RECORDER=FX
'SET !ERRORIGNORE YES
TAB T=1
'URL GOTO=https://eym.sicomasp.com/index.php

FRAME NAME="wrapper_main"
TAG POS=3 TYPE=TABLE ATTR=*
'TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R{{!LOOP}} TYPE=DIV ATTR=TXT:* EXTRACT=TXT
'PROMPT {{!EXTRACT}}
SAVEAS TYPE=EXTRACT FOLDER=* FILE=S-Pengwin_{{!NOW:yymmdd_hh_DIV}}.csv

I guess your '.VBA' Script will then be able to further separate the Data Cell by Cell for each Row...

Other possibility/ies is/are to split the Logic of that Macro into 2 Macros, one for the first 4 or 5 Lines in order to "neatly" save that Data especially the Table Header Cell by Cell, and then a 2nd Macro starting from Line/Row 5 or 6 to save the "real" Data, either extracting the Data at the 'DIV' Level (= Row by Row), or even at the 'SPAN' Level (= Cell by Cell), will take a bit longer than 30 Sec per 100 Rows..., or still extract at the 'DIV' Level (=> one full Row in one Statement) and separate each 'SPAN'/Cell with 'EVAL()' before each 'SAVEAS'...
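
The last option (splitting one extracted 'DIV' row into its cells) could look roughly like the sketch below, written as plain JavaScript run outside iMacros. It assumes the cells of a row arrive separated by whitespace and that no cell contains a space itself, which will not hold for text labels; 'splitRow' is a hypothetical helper name. '[EXTRACT]' is the marker iMacros places between extracted values. The sample figures are the first four values of the report's last row.

```javascript
// Marker iMacros places between extracted values in !EXTRACT.
const EXTRACT_SEPARATOR = "[EXTRACT]";

// Hypothetical helper (an assumption for illustration): turn the text of one
// report row into separate values joined by the extract marker.
function splitRow(rowText) {
  // Assumes cells are separated by whitespace and contain no spaces themselves.
  const cells = rowText.trim().split(/\s+/);
  return cells.join(EXTRACT_SEPARATOR);
}

console.log(splitRow("1573571.48 3235.98 1576807.46 22450"));
```

Inside a macro the same expression would have to be passed through 'EVAL()' for each row, which runs into the escaping concerns of substituting '{{!EXTRACT}}' into a JavaScript string.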

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Sat Aug 19, 2017 4:38 pm

You're a genius!
but...
I tried the code and it doesn't appear that the loop is working. I only get "EYMKING - Control Cash Summary" in the first cell and nothing else. The excel appears immediately. What am I doing wrong?

The amount of time it takes wouldn't be an issue, I plan on running this macro on a schedule early in the morning.

Thanks again

EDIT: OK, I think I just don't know how to use iMacros. I found the play loop button; does it actually need to replay the whole macro every time? Also, do I need to manually tell it how many times to loop?
Last edited by senor pengwin on Sat Aug 19, 2017 4:59 pm, edited 1 time in total.

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Sat Aug 19, 2017 4:56 pm

chivracq wrote:But first, hum..., I saw on the previous Page, next to the 'Create Report' that you had a 'Save Report' Button, I didn't check it (yet) content-wise, but isn't it actually more or less what you want...?
Or doing a 'SAVEAS TYPE=HTML' or similar on the Page itself...?


Save report is just to save a "style" of report. If you click on .csv and hit run report it will download an Excel file. The problem I have with this is that I don't know how to choose the file name and location, I don't want it to just get lost in the downloads folder.

Is there a way I can use iMacros to export the report but choose the file name and location?

Re: Half of extracted data goes into 1 cell the rest is sepe

by senor pengwin on Tue Aug 22, 2017 1:09 pm

Just wanted to update for future reference. My solution to this was to export the report and use ONDOWNLOAD. Duh :roll: