Extract from specific cell/row from CSV

Discussions and Tech Support related to website data extraction, screen scraping and data mining using iMacros.
Forum rules
iMacros EOL - Attention!

The renewal maintenance has officially ended for Progress iMacros effective November 20, 2023 and all versions of iMacros are now considered EOL (End-of-Life). The iMacros products will no longer be supported by Progress (aside from customer license issues), and these forums will also no longer be moderated from the Progress side.

Thank you again for your business and support.

Sincerely,
The Progress Team

Before asking a question or reporting an issue:
1. Please review the list of FAQ's.
2. Use the search box (at the top of each forum page) to see if a similar problem or question has already been addressed.
3. Try searching the iMacros Wiki - it contains the complete iMacros reference as well as plenty of samples and tutorials.
4. We can respond much faster to your posts if you include the following information: CLICK HERE FOR IMPORTANT INFORMATION TO INCLUDE IN YOUR POST
Post Reply
raykay84
Posts: 9
Joined: Tue Aug 11, 2020 11:54 pm

Extract from specific cell/row from CSV

Post by raykay84 » Wed Aug 12, 2020 12:03 am

Hi,
i am new and currently using the freeware plugin for Chrome but i am considering buying the Personal Edition license since i am really interested in extract and use data from Excel.

I am using VERSION BUILD=1005 RECORDER=CR
Bowser Chrome 84
iMacros for Chrome
Windows 10 Professional 64-bit Operating system

I checked the demo projects and also searched this forum however i am not very good with programming and alot of the contents are really not easy to understand for me.
What i want to achieve, is to fill out a webform based on data i have locally saved on my harddrive. The data is somewhere in the Excel file so i cannot use the loop model where it seemingly just flies through the document and prints everyhting into a webform. What i need is instructions how to define and extract content from a specific line / column combination and put it into a specific field of the webform.

Can anybody help me with the code for this?
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract from specific cell/row from CSV

Post by chivracq » Wed Aug 12, 2020 12:45 am

raykay84 wrote:
Wed Aug 12, 2020 12:03 am
Hi,
i am new and currently using the freeware plugin for Chrome but i am considering buying the Personal Edition license since i am really interested in extract and use data from Excel.

I am using

Code: Select all

VERSION BUILD=1005 RECORDER=CR
Bowser Chrome 84
iMacros for Chrome
Windows 10 Professional 64-bit Operating system
I checked the demo projects and also searched this forum however i am not very good with programming and alot of the contents are really not easy to understand for me.
What i want to achieve, is to fill out a webform based on data i have locally saved on my harddrive. The data is somewhere in the Excel file so i cannot use the loop model where it seemingly just flies through the document and prints everyhting into a webform. What i need is instructions how to define and extract content from a specific line / column combination and put it into a specific field of the webform.

Can anybody help me with the code for this?

Hum..., Compliment for the Good Quality for your first Post on the Forum... :)

Your Thread will though probably get moved to the 'General' Sub-Forum, the Terms "Extract"/"Extracting"/"Extraction" and the 'Data Extraction' Sub-Forum are meant for "extracting" Data from a Web-Site, not for using/pulling/retrieving/fetching Input-Data from a (Local) '.csv' DataSource that you control/maintain yourself...

>>>

But OK, yep-yep, what you want is completely possible with iMacros, and you'll need indeed the 'PE' Version ('Personal Edition') as using a Local '.csv' File as a DataSource is only supported in the 'PE' Version (and not in the 'Free' Version)...

Your Cells in your Excel/'.csv' File are identified by the Coordinates of Row+Column, and iMacros can do the same, with '!DATASOURCE_LINE' for the "Row" and '{{!COL[n]}}' for "Col_n" in your Spreadsheet. 8)

And yep indeed, it is possible to loop through your '.csv' DataSource to fetch some Data from all Rows using the Built-in '!LOOP' Var (=> with "SET !DATASOURCE_LINE {{!LOOP}}"), but you can also hard-code the Coordinates and the specific Row that you want to use, => for example if you want to use Data from Row_3/Col_7:

Code: Select all

SET !DATASOURCE_LINE 3
PROMPT Cell[3-7]:<SP>_{{!COL7}}_
>>>

Another Technique, and which would actually also work with the 'Free' Version, is to save your Excel File from Excel as an '.html' File, that can then be opened from the Browser, and therefore also from iMacros (in a 2nd Tab maybe) that will handle it as a "Standard" Web-Page, with all the Data displaid as a 'TABLE' HTML Object, with Rows as 'UL' Elements, and Cells as 'LI' Elements, where you could then use the 'EXTRACT' Mechanism to "extract" the Data that you want to use as Input Data. :idea:

=> So for example, if you have 10 Cols in your Table, then 'Cell[3-7]' could be accessed from:

Code: Select all

TAG POS=27 TYPE=LI ATTR=* EXTRACT=TXT
PROMPT Cell[3-7]:<SP>_{{!EXTRACT}}_
And if you have a Header in your Excel Sheet and "consider" 'Cell[3-7]' to be on the 3rd Row of Data, then "POS=27" would become "POS=37"...

A "Sub-Technique"/"Trick" is otherwise to add an extra Col at the complete Left Side of your Sheet in Excel, => your Sheet/Table will now contain 11 Cols, and you fill in those Cells with "Row_1" + "Row_2" + "Row_3" etc..., and using 'Relative Positioning', you can then even more easily access the same 'Cell[3-7]' with:

Code: Select all

TAG POS=1 TYPE=LI ATTR=TXT:"Row_3"
TAG POS=R7 TYPE=LI ATTR=* EXTRACT=TXT
PROMPT Cell[3-7]:<SP>_{{!EXTRACT}}_
>>>

The same Technique(s) can also be used with a 'GoogleSheet' that also gets "seen" by the Browser (and iMacros) as a Web-Page... :idea:

>>>

Hum..., and as I mentioned some (Workaround) Technique(s) using the 'EXTRACT' Mechanism, the Thread can now "stay" in the 'Data Extraction' Sub-Forum, ah-ah...! :P
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
raykay84
Posts: 9
Joined: Tue Aug 11, 2020 11:54 pm

Re: Extract from specific cell/row from CSV

Post by raykay84 » Wed Aug 12, 2020 5:14 am

Wow thank your for your really helpful answer! And not just that you also provived different alternatives to get to my goal that is thank your very much!

I am a step further now having following code:

Code: Select all

VERSION BUILD=1005 RECORDER=CR
TAB T=1
SET !EXTRACT_TEST_POPUP NO
TAG POS=23 TYPE=LI ATTR=* EXTRACT=TXT
SET !VAR1 {{!EXTRACT}} 
SET !EXTRACT NULL


TAB T=2  
TAG POS=1 TYPE=INPUT:TEXT FORM=ID:demo ATTR=name:City CONTENT={{!VAR1}
in following Setup
Tab1: https://docs.google.com/spreadsheets/d/ ... sp=sharing
Tab2: http://demo.imacros.net/Automate/AutoDataEntry

Goal: bring the value "2.88" (in cell 2 - 3) from the table on Tab1 into the field "City" in the form of Tab2.

However the only thing being transferred is "#EANF#" so it seems i cannot address the google spreadsheet cells. I also tried the suggestion of row+1, row+2 within the first column to have a relative access but this did not work either or i did it wrong.

I also tried using different sources for the first Tab like HTML from CSV, Online HTML Table generator etc. but nothin worked. Regarding saving the Excel as HTML i noticed that there are no LU and LI Tags like you mention in your description but rather TR and TD Tags?

Do you have an idea what i am doing wrong?
Oh and another question, when running the macro it always waits for 6 seconds whenever it reaches the tag-TAG. Do you know why it is doing that and how i can disable this?

Thanks again for your great support! :)
raykay84
Posts: 9
Joined: Tue Aug 11, 2020 11:54 pm

Re: Extract from specific cell/row from CSV

Post by raykay84 » Wed Aug 12, 2020 10:08 am

I found a CSV to HTML tool in the demo section which i used to generate following html page

Code: Select all

<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252"><link rel="stylesheet" type="text/css" href="cid:css-8f45673c-8d46-4b9f-bbd1-843fac984d9c@mhtml.blink" />
   
   
 </head>
 <body>
    Select CSV file:
    <input type="file" id="datasource">
    <div>Table with loaded CSV file:</div>
    <table id="table" class="table"><tr class="row" id="row_1"><td id="cell_1-1" class="cell">"Ticker</td><td id="cell_1-2" class="cell"> Name"</td><td id="cell_1-3" class="cell">Zuletzt</td><td id="cell_1-4" class="cell">Vorbörsliche Veränderung %</td><td id="cell_1-5" class="cell">Vorbörsliches Volumen</td><td id="cell_1-6" class="cell">Vorbörsliches High</td></tr><tr class="row" id="row_2"><td id="cell_2-1" class="cell">BOOT</td><td id="cell_2-2" class="cell">24.97</td><td id="cell_2-3" class="cell">16.6737378</td><td id="cell_2-4" class="cell">610</td><td id="cell_2-5" class="cell">27.5</td></tr><tr class="row" id="row_3"><td id="cell_3-1" class="cell">HSTO</td><td id="cell_3-2" class="cell">2.88</td><td id="cell_3-3" class="cell">13.79310345</td><td id="cell_3-4" class="cell">1140</td><td id="cell_3-5" class="cell">3.4</td></tr><tr class="row" id="row_4"><td id="cell_4-1" class="cell">SCOR</td><td id="cell_4-2" class="cell">3.21</td><td id="cell_4-3" class="cell">13.41853035</td><td id="cell_4-4" class="cell">600</td><td id="cell_4-5" class="cell">3.55</td></tr><tr class="row" id="row_5"><td id="cell_5-1" class="cell">APEI</td><td id="cell_5-2" class="cell">36.34</td><td id="cell_5-3" class="cell">12.1474359</td><td id="cell_5-4" class="cell">750</td><td id="cell_5-5" class="cell">40.77</td></tr><tr class="row" id="row_6"><td id="cell_6-1" class="cell">MSTR</td><td id="cell_6-2" class="cell">134.89</td><td id="cell_6-3" class="cell">11.77802945</td><td id="cell_6-4" class="cell">500</td><td id="cell_6-5" class="cell">138.18</td></tr><tr class="row" id="row_7"><td id="cell_7-1" class="cell">BJRI</td><td id="cell_7-2" class="cell">24.1</td><td id="cell_7-3" class="cell">11.63080895</td><td id="cell_7-4" class="cell">1000</td><td id="cell_7-5" class="cell">25.943</td></tr><tr class="row" id="row_8"><td id="cell_8-1" class="cell">ORGO</td><td id="cell_8-2" class="cell">4.05</td><td id="cell_8-3" class="cell">9.60779221</td><td id="cell_8-4" class="cell">700</td><td id="cell_8-5" class="cell">4.28</td></tr></table>
 
 </body></html>
I want to adress single cells in there but my code is not working:

Code: Select all

VERSION BUILD=1005 RECORDER=CR
TAB T=1
SET !EXTRACT_TEST_POPUP NO
TAG POS=7 TYPE=TD ATTR=* EXTRACT=TXT
SET !VAR1 {{!EXTRACT}} 

TAB T=2  
TAG POS=1 TYPE=INPUT:TEXT FORM=ID:demo ATTR=name:City CONTENT={{!VAR1}}
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract from specific cell/row from CSV

Post by chivracq » Wed Aug 12, 2020 2:09 pm

Oh...!?, looks like the Cells are 'TD' Elements, and not 'LI' Elements like I thought..., then from your 'GoogleSheet', if you can use the "HSTO" Cell as 'Anchor' for 'Relative Positioning':

Code: Select all

VERSION BUILD=8820413 RECORDER=FX
TAB T=1
'URL GOTO=https://docs.google.com/spreadsheets/d/1oMlXd-rZWhDXkI5eHA8enAz_m93EJROM0d2aXbZXGM0/edit?usp=sharing#

TAG POS=1 TYPE=TD ATTR=TXT:HSTO
'TAG POS=1 TYPE=TD ATTR=TXT:2.88
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
Or you can use the "3" in the Left Margin as 'Anchor':

Code: Select all

VERSION BUILD=8820413 RECORDER=FX
TAB T=1
'URL GOTO=https://docs.google.com/spreadsheets/d/1oMlXd-rZWhDXkI5eHA8enAz_m93EJROM0d2aXbZXGM0/edit?usp=sharing#

TAG POS=1 TYPE=DIV ATTR=TXT:3
'TAG POS=1 TYPE=TD ATTR=TXT:2.88
TAG POS=R2 TYPE=TD ATTR=TXT:* EXTRACT=TXT
(Tested on iMacros for FF v8.8.2, PM v26.3.3, Win10.)

And from your '.csv' to '.html' File, the Cells (also as 'TD' Elements) actually contain the Coordinates of the Cell in their ID then I would expect the following to work:

Code: Select all

TAG POS=1 TYPE=TD ATTR=ID:"cell_3-2" EXTRACT=TXT
(Not tested...)
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
raykay84
Posts: 9
Joined: Tue Aug 11, 2020 11:54 pm

Re: Extract from specific cell/row from CSV

Post by raykay84 » Wed Aug 12, 2020 7:00 pm

Thank you for the explanation. I really dont know where i am going wrong regarding the google spreedsheet.

My code is:

Code: Select all

VERSION BUILD=10021450
TAB T=1
'URL GOTO=https://docs.google.com/spreadsheets/d/1oMlXd-rZWhDXkI5eHA8enAz_m93EJROM0d2aXbZXGM0/edit?usp=sharing#

TAG POS=1 TYPE=TD ATTR=TXT:HSTO
'TAG POS=1 TYPE=TD ATTR=TXT:2.88
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
SET !VAR1 {{!EXTRACT}} 

TAB T=2  
'URL GOTO=http://demo.imacros.net/Automate/AutoDataEntry
TAG POS=1 TYPE=INPUT:TEXT FORM=ID:demo ATTR=ID:fname CONTENT={{!VAR1}}

The error i am getting is

Code: Select all

RuntimeError: element TD specified by TXT:HSTO was not found, line: 5
Since we are both using Firefox and the "read-only" mode of the spreadsheet it must be the iMacros Version Build you have. Yours is stating VERSION BUILD=8820413 RECORDER=FX whereas my is only VERSION BUILD=10021450. I downloaded the plugin from the Firefox store maybe you have another source?
Any other idea?
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract from specific cell/row from CSV

Post by chivracq » Wed Aug 12, 2020 8:42 pm

raykay84 wrote:
Wed Aug 12, 2020 7:00 pm
Thank you for the explanation. I really dont know where i am going wrong regarding the google spreedsheet.

My code is:

Code: Select all

VERSION BUILD=10021450
TAB T=1
'URL GOTO=https://docs.google.com/spreadsheets/d/1oMlXd-rZWhDXkI5eHA8enAz_m93EJROM0d2aXbZXGM0/edit?usp=sharing#

TAG POS=1 TYPE=TD ATTR=TXT:HSTO
'TAG POS=1 TYPE=TD ATTR=TXT:2.88
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
SET !VAR1 {{!EXTRACT}} 

TAB T=2  
'URL GOTO=http://demo.imacros.net/Automate/AutoDataEntry
TAG POS=1 TYPE=INPUT:TEXT FORM=ID:demo ATTR=ID:fname CONTENT={{!VAR1}}

The error i am getting is

Code: Select all

RuntimeError: element TD specified by TXT:HSTO was not found, line: 5
Since we are both using Firefox and the "read-only" mode of the spreadsheet it must be the iMacros Version Build you have. Yours is stating VERSION BUILD=8820413 RECORDER=FX whereas my is only VERSION BUILD=10021450. I downloaded the plugin from the Firefox store maybe you have another source?
Any other idea?

Oh...!, yep indeed, I can reproduce "your" Behaviour... (in my FCI with v8.8.2 for FF + PM v26.3.3). :shock:

I had actually stopped loading the Page (manually) after maybe 1 Min as the Page never "finished" loading, but this time I left it loading for about 3 Min and it eventually managed to load more or less completely, even if I got some Warning that some Fonts had not been loaded...

I can see a 'TABLE' Element from the Source but the Site is dynamically adding a lot of JS to the Page, and modifying the HTML Structure and iMacros then only sees one main Containing 'DIV', which is indeed the "pain-in-the-ass" Behaviour I "know" for GoogleSheets from some previous Threads for which I had already done some Testing in the past... Grrr...!

"My" Script though still works if I refresh the Page manually and stop it loading after a few Seconds, once the Data/Table becomes visible.

Same with the 'EVENT' Mode, if I record clicking on the 2 "HSTO" + "2.88" Cells and I only gave a few Seconds to the Browser to load the Page, then I get the following Recording/Structure:

Code: Select all

EVENT TYPE=CLICK SELECTOR="#\\34 31745926-grid-table-container>DIV>DIV>TABLE>TBODY>TR:nth-of-type(3)>TD" BUTTON=0
EVENT TYPE=CLICK SELECTOR="#\\34 31745926-grid-table-container>DIV>DIV>TABLE>TBODY>TR:nth-of-type(3)>TD:nth-of-type(2)" BUTTON=0
... with the 'TABLE' and the 'TD' Elements...

But if let the Page loading "longer", then I get for both Cells the same Recording:

Code: Select all

EVENT TYPE=CLICK SELECTOR="#\\34 31745926-scrollable>DIV:nth-of-type(2)" BUTTON=0
EVENT TYPE=CLICK SELECTOR="#\\34 31745926-scrollable>DIV:nth-of-type(2)" BUTTON=0
GoogleSheets are always difficult to handle with iMacros... I have a few "Tricks" but pfff..., this is going to be a bit cumbersome, I'm afraid..., if they ueberhaupt still work...
And you also have one Thread on the Forum where the User had managed to get the Functionality they needed using 'XPATH', let me see if I can find it... [...] Yep, found it, => here...
The whole Thread is quite interesting to read actually... :idea:

And I can see the same HTML Structure in their (the 2nd User in that Thread) 'TAG XPATH' Statement like what I get with the 'EVENT' Mode and only a few Seconds Loading Time for the Page, but I'm afraid this probably won't work anymore if the Page/Site loads "completely"... :oops:

>>>

Then pfff..., if it's a complete hassle to get it to work, maybe the 2nd Option with the Local '.html' File will probably be easier and more straightforward... :idea:
... Or still getting the 'PE' Version, then you can use your '.csv' directly as DataSource... I had mentioned all other Options as Alternatives, but the '!DATASOURCE' Command is still the "Main"/"Official" Way to go...
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
raykay84
Posts: 9
Joined: Tue Aug 11, 2020 11:54 pm

Re: Extract from specific cell/row from CSV

Post by raykay84 » Wed Aug 12, 2020 10:15 pm

Thank you very much, you was completely right. I put my table into a simple table generator and I could use the code without an issue.
I also saw how much effort you put into this forum as a whole and must say all of us who seeking support are very happy there are fellows like you.
Keep it up!
Post Reply