Very Simple Data Scrape Issue

Ausweb
Posts: 4
Joined: Fri Aug 12, 2011 4:38 am

Very Simple Data Scrape Issue

Post by Ausweb » Fri Aug 12, 2011 4:54 am

Hi all.

I've been looking for a useful tool for data mining, and I hope I have found one in iMacros! So far so good, apart from the fact that I'm having difficulty with my very first "macro". I would appreciate any help :D

I am having issues with extracting the data from a LOOP into a CSV. The code is as follows:
VERSION BUILD=7401598
TAB T=1
SET !EXTRACT_TEST_POPUP NO
TAB CLOSEALLOTHERS
URL GOTO=xxxxx (URL REMOVED DUE TO BEING A PRIVATE LINK)
TAG POS={{!LOOP}} TYPE=TD FORM=ID:form1 ATTR=TXT:2011/12*
TAG POS=20 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=22 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=34 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=38 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=42 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=8 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=66 TYPE=TD FORM=ID:form1 ATTR=CLASS:labelBold EXTRACT=TXT
BACK
WAIT SECONDS=1
SAVEAS TYPE=EXTRACT FOLDER=C:\iMacrosExports\ FILE=Extract_{{!NOW:ddmmyy_hhnnss}}.csv
The problem is that it's saving multiple CSV files. I believe this is because I don't know how to place the "save extracted data" command outside the loop. Ultimately, I would like it to save each set of extracted data into ONE CSV file, with each set on a new line / row.

Essentially, I would like it to do the following:
LOOP {
GO TO PRODUCT
EXTRACT DATA
}
Save data as ONE CSV file with each 'result set' as a NEW LINE or ROW.
Any help would be greatly appreciated.
Ausweb
Posts: 4
Joined: Fri Aug 12, 2011 4:38 am

Re: Very Simple Data Scrape Issue

Post by Ausweb » Fri Aug 12, 2011 5:02 am

Just to clarify....

I want the macro to loop through product pages, pulling particular info, and exporting them into a single CSV file with each "product" being a new line / row.
MattBell7
Posts: 627
Joined: Thu Nov 26, 2009 11:07 am
Location: United Kingdom

Re: Very Simple Data Scrape Issue

Post by MattBell7 » Fri Aug 12, 2011 9:11 am

It's doing that because each time you run it, this value changes:
Extract_{{!NOW:ddmmyy_hhnnss}}

Either use just Extract.csv, or even Extract_{{!NOW:ddmmyy}}.csv so you get the date, but this second one will create a new file each day.
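
As a rough sketch of the two options above: only the FILE= value differs from the SAVEAS line in the original macro, and the folder is taken from that macro. Lines starting with ' are comments.

```
' Option 1: fixed file name, so every run writes to the same file
SAVEAS TYPE=EXTRACT FOLDER=C:\iMacrosExports\ FILE=Extract.csv

' Option 2 (use instead of option 1): date only, so a new file starts each day
' SAVEAS TYPE=EXTRACT FOLDER=C:\iMacrosExports\ FILE=Extract_{{!NOW:ddmmyy}}.csv
```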
Ausweb
Posts: 4
Joined: Fri Aug 12, 2011 4:38 am

Re: Very Simple Data Scrape Issue

Post by Ausweb » Fri Aug 12, 2011 1:14 pm

Hi.

Thanks for the help. I'm still having the issue of getting each set of extracted data on a new row/line in the output file.

Any further help appreciated.
MattBell7
Posts: 627
Joined: Thu Nov 26, 2009 11:07 am
Location: United Kingdom

Re: Very Simple Data Scrape Issue

Post by MattBell7 » Fri Aug 12, 2011 1:27 pm

So you want each individual field on its own line, or each loop on its own line?
Ausweb
Posts: 4
Joined: Fri Aug 12, 2011 4:38 am

Re: Very Simple Data Scrape Issue

Post by Ausweb » Sat Aug 13, 2011 11:32 am

MattBell7 wrote: so you want each individual field on its own line? or each loop on its own line?
Hi Matt,

As an example, the macro might scrape product name, product description, price and colour. I have tested the script another way, and my end result was a single-row CSV, e.g.:

product1, example description 2, $1, red, product2, example description 2, $6, yellow ....etc

I would like the final export to be a CSV in which each "page" scrape (i.e. one iteration of the loop) is its own row, e.g.:

product1, example description 2, $1, red,
product2, example description 2, $6, yellow
....etc

Many thanks!
MattBell7
Posts: 627
Joined: Thu Nov 26, 2009 11:07 am
Location: United Kingdom

Re: Very Simple Data Scrape Issue

Post by MattBell7 » Sat Aug 13, 2011 1:21 pm

Right, I understand you now. In short, it should be...

SAVEAS TYPE=EXTRACT should always append to a new line. If it isn't, it could be a bug in the latest iMacros build. Try rolling back to v7.36 and see whether that resolves it. If it does, raise a ticket with the support team to get it fixed.
Tom, Tech Support
Posts: 3834
Joined: Mon May 31, 2010 4:59 pm

Re: Very Simple Data Scrape Issue

Post by Tom, Tech Support » Tue Aug 16, 2011 9:30 am

Using SAVEAS TYPE=EXTRACT writes a new row to the output file each time it is called. You can run this simple macro as a test (works in 7.40 also):

SET !EXTRACT_TEST_POPUP NO
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://www.iopus.com/imacros/demo/v6/extract1
TAG POS={{!LOOP}} TYPE=A ATTR=HREF:http://www.iopus.com/imacros/demo/v6/extract1/*
TAG POS=1 TYPE=P ATTR=CLASS:heading2 EXTRACT=TXT
TAG POS=2 TYPE=NOBR ATTR=TXT:* EXTRACT=TXT
TAG POS=3 TYPE=NOBR ATTR=TXT:* EXTRACT=TXT
TAG POS=4 TYPE=NOBR ATTR=TXT:* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=Jobs.csv
Just Play (Loop) it with a Max. value of 3.
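
Applied to the macro from the first post, the same pattern might look like the sketch below. It assumes the only change needed is a fixed file name (Extract.csv is just an illustrative choice). It has not been run against the private page, so the POS values and the URL placeholder are copied unchanged from the original.

```
VERSION BUILD=7401598
TAB T=1
SET !EXTRACT_TEST_POPUP NO
TAB CLOSEALLOTHERS
' Placeholder: the real URL was removed from the original post
URL GOTO=xxxxx
' Pick the Nth matching product cell, where N is the current loop counter
TAG POS={{!LOOP}} TYPE=TD FORM=ID:form1 ATTR=TXT:2011/12*
' Extract the fields for this product (positions as in the original macro)
TAG POS=20 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=22 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=34 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=38 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=42 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=8 TYPE=TD FORM=ID:form1 ATTR=TXT:* EXTRACT=TXT
TAG POS=66 TYPE=TD FORM=ID:form1 ATTR=CLASS:labelBold EXTRACT=TXT
BACK
WAIT SECONDS=1
' Assumed fix: a file name that does not change between loop iterations
SAVEAS TYPE=EXTRACT FOLDER=C:\iMacrosExports\ FILE=Extract.csv
```

Played with Play (Loop) in the same way, each pass should then add one row to the same file, since the file name no longer changes from one iteration to the next.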
Regards,

Tom, iMacros Support