Extract table

by cww7 on Thu Jul 20, 2017 5:18 am

Hi,

I'm working on a new project: from a CSV datasource, I fill out a form with LOOP, which returns tables.
I would like to extract these tables cell by cell, but the number of rows in each table is not known in advance.
Currently, I repeat the following code 9 times (max !VAR):

Code:
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=* FILE=+FICHIERpro_{{!NOW:yyyymmdd}}

SET !TIMEOUT_STEP 1
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=* FILE=+FICHIERpro_{{!NOW:yyyymmdd}}
...


Sometimes there are more than 9 rows, and I waste a lot of time extracting what I want.
How can I avoid repeating the code (ENDOFPAGE? LOOP?)?

Any help would be appreciated.

FF 54.0.1
iMacros 9.0.3
Windows 7
cww7
 
Posts: 2
Joined: Thu Jul 20, 2017 4:44 am

Re: Extract table

by chivracq on Thu Jul 20, 2017 7:09 am

Your Thread Title is a bit vague: most Threads in the 'Data Extraction' Sub-Forum are about "Extract Table", so it would be nice if you could make it a bit more specific...

Well, instead of extracting all Rows/Cells Cell by Cell with Relative Positioning, why don't you extract the whole Table directly with just one 'EXTRACT' at the 'TYPE=TABLE' Level (instead of your 'TYPE=TD' Statements)...?
Code:
TAG POS=1 TYPE=TABLE ATTR=TXT:* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=+FICHIERpro_{{!NOW:yyyymmdd}}

And you can always use 'EVAL()' if you don't want to include the Table Header in the 'SAVEAS' each time...

But if the problem in your current Script is the time spent waiting for Rows that are not present, and you don't mind all the '#EANF#'s, simply shorten '!TIMEOUT_STEP' (to "0"). Then it no longer matters if you repeat your Block of Code even 15 times instead of 9, in order to cover the maximum possible number of Rows...
And if you don't want to save those non-existing Rows, you can use 'EVAL()' for a "Conditional 'SAVEAS'"...
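
The "Conditional 'SAVEAS'" idea comes down to a test on the extracted string. Here is a minimal JavaScript sketch of that test, assuming that the values of consecutive extracts arrive joined by "[EXTRACT]" and that a missing element yields "#EANF#". The name rowHasData is hypothetical, and inside a macro the same expression would have to be adapted to fit into 'EVAL()'.

```javascript
// Hypothetical helper: decide whether an extracted row is worth saving.
// Assumption: iMacros joins the values of consecutive EXTRACT steps with
// "[EXTRACT]" and puts "#EANF#" in place of every element it could not find.
function rowHasData(extract) {
  if (!extract) return false;
  return extract.split("[EXTRACT]").some(function (cell) {
    var text = cell.trim();
    // A cell counts as data only if it is neither empty nor the marker.
    return text !== "" && text !== "#EANF#";
  });
}
```

A row that consists only of '#EANF#' markers gives false, so the 'SAVEAS' for that block can be skipped.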

I guess looping your Script and extracting only one Row per Loop (until there are no more Rows) is not really an Option, because you would need to include the Form Filling part in the Loop. But you could use a '.js' Script and split the Logic of your current Script into 2 Macros: one for the Form Filling, and one for extracting one Row in a Loop until there is no more Data.
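
The 2-Macro split could be organized roughly as in the sketch below. It is only an outline under assumptions: the real form-filling and anchor 'TAG' lines are not shown in the thread, so placeholders stand in for them, and data.csv, extractRows, processLines, FILL_FORM and buildRowMacro are made-up names. It also assumes the scripting functions iimPlay, iimSet, iimGetLastExtract and iimDisplay behave as they usually do in iMacros for Firefox (negative return code on error, {{line}} filled in by iimSet), which should be checked against your version.

```javascript
// Sketch of a .js driver that splits the job into two macros.
// Hypothetical names: extractRows, processLines, FILL_FORM, buildRowMacro.

// Inner loop: play the row macro for row 1, 2, 3... and stop at the first
// row whose extract is empty or starts with "#EANF#" (element not found).
function extractRows(api, buildRowMacro, maxRows) {
  var rows = [];
  for (var row = 1; row <= maxRows; row++) {
    var code = api.play(buildRowMacro(row));
    var extract = api.lastExtract();
    if (code < 0 || !extract || extract.indexOf("#EANF#") === 0) break;
    rows.push(extract);
  }
  return rows;
}

// Outer loop: one form submission per datasource line, then all its rows.
function processLines(api, fillFormMacro, buildRowMacro, firstLine, lastLine, maxRows) {
  var results = [];
  for (var line = firstLine; line <= lastLine; line++) {
    api.set("line", String(line));             // meant to be used as {{line}}
    if (api.play(fillFormMacro) < 0) continue; // skip lines whose form step failed
    results.push({ line: line, rows: extractRows(api, buildRowMacro, maxRows) });
  }
  return results;
}

// Inside iMacros the scripting functions are globals; anywhere else this
// part is skipped, so the two functions above can be tried out on their own.
if (typeof iimPlay === "function") {
  var FILL_FORM =
    "CODE:SET !DATASOURCE data.csv\n" +        // assumed file name
    "SET !DATASOURCE_LINE {{line}}\n" +
    "' TAG commands that fill out and submit the form go here\n";
  var buildRowMacro = function (row) {
    return "CODE:SET !TIMEOUT_STEP 0\n" +
      "' TAG on the anchor for row " + row + ", then the TD extracts, go here\n";
  };
  var iimApi = { play: iimPlay, set: iimSet, lastExtract: iimGetLastExtract };
  var allResults = processLines(iimApi, FILL_FORM, buildRowMacro, 1, 3, 50);
  iimDisplay("Lines processed: " + allResults.length);
}
```

Passing the scripting functions in as one object keeps the two loops independent of the browser. The limits 1, 3 and 50 are arbitrary example values for the datasource lines and the maximum number of Rows.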
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 
Posts: 6474
Joined: Sat Apr 13, 2013 6:07 am
Location: Amsterdam (NL)

Re: Extract table

by cww7 on Fri Jul 21, 2017 6:12 am

Thanks for your response.

chivracq wrote:
instead of extracting all Rows/Cells Cell by Cell with Relative Positioning, why don't you extract the whole Table directly

I need to extract cell by cell because I append the corresponding value from the datasource to each extracted line in the CSV file.

chivracq wrote:
simply shorten '!TIMEOUT_STEP' (to "0")

Cool! This is faster!

chivracq wrote:
you could use a '.js' Script and split the Logic of your current Script into 2 Macros

This is the best way, but I'm not familiar with JS. I don't know how to organize a script that includes 2 macros. Do you have an example?

Re: Extract table

by chivracq on Sat Jul 22, 2017 1:54 pm

cww7 wrote:
chivracq wrote: instead of extracting all Rows/Cells Cell by Cell with Relative Positioning, why don't you extract the whole Table directly
I need to extract cell by cell because I add the corresponding value from the datasource to each line extracted in the csv file.

Well, this is not visible from the excerpt of your Script that you posted (even the 'TAG' on the Anchor is missing), so I could not guess it...
But like I said earlier, you can always use 'EVAL()' to "manipulate" the Content of the Extract in order to add the Data from your '!COLn'('s) to it...
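
To give an idea of that kind of manipulation, here is a small JavaScript sketch that prefixes every Row of a whole-Table Extract with the datasource value that produced it. It assumes that Cells are separated by "#NEXT#" and Rows by "#NEWLINE#" in the extracted string; if the raw Extract in your version looks different, pass other separators. The name tableToCsv is hypothetical, and the logic would need to be condensed to fit into an 'EVAL()' or be run from a '.js' Script.

```javascript
// Hypothetical helper: turn a whole-table extract into CSV lines, each one
// prefixed with the datasource value (e.g. the content of {{!COL1}}).
// Assumption: cells are separated by "#NEXT#" and rows by "#NEWLINE#";
// check the raw extract in your iMacros version and adjust if needed.
function tableToCsv(extract, sourceValue, cellSep, rowSep) {
  cellSep = cellSep || "#NEXT#";
  rowSep = rowSep || "#NEWLINE#";
  // Quote every field and double any embedded quote, as CSV expects.
  function quote(value) {
    return '"' + String(value).replace(/"/g, '""') + '"';
  }
  return extract
    .split(rowSep)
    .filter(function (row) { return row.trim() !== ""; })
    .map(function (row) {
      return [sourceValue].concat(row.split(cellSep)).map(quote).join(",");
    })
    .join("\n");
}
```

With this approach a single 'TYPE=TABLE' Extract covers any number of Rows, and the datasource value still ends up on every line of the output.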

cww7 wrote:
chivracq wrote: simply shorten '!TIMEOUT_STEP' (to "0")
Cool! This is faster!

Yep...!

cww7 wrote:
chivracq wrote: you could use a '.js' Script and split the Logic of your current Script into 2 Macros
This is the best way, but I'm not familiar with JS. I don't know how to organize a script that includes 2 macros. Do you have an example?

"Best way", hum, yep maybe, the "Standard" way anyway, but I don't use '.js' Scripts myself, so I would use the Extract at the 'TABLE' Level, together with 'EVAL()' for any (conditional) Data Manipulation...
For Examples of '.js' Scripts, there is a Demo-Macro for that, and if you check a few Threads in the 'iMacros for FF' Sub-Forum, you'll find many Examples. You can also search the Forum for "nested loops javascript", for example, and you'll find several Script Examples...