Extract with LOOP

Post by Flint » Wed Oct 12, 2005 3:30 pm

I'm using the Scripting Interface and want to extract the text of all links on a page whose href values match the same pattern. When I run the macro below through the Windows Scripting Interface, I only get the first link (there should be 5). If I uncomment the last line, I get the 1st and 3rd links, so I know the wildcard pattern is right.

I'm waiting on my license (you have the PO). I don't know whether this is a restriction of the 30-day demo or whether I'm doing something wrong.

Thanks,
Flint

VERSION BUILD=4310722
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://192.168.4.38/
SIZE X=1296 Y=871
URL GOTO=http://192.168.4.38/services.html
SIZE X=1296 Y=871
FRAME F=5
TAG POS=1 TYPE=H3 ATTR=TXT:*UNI<SP>parameters*
EXTRACT POS={{!LOOP}} TYPE=TXT ATTR=<A<SP>*href="/hn_show?*&r=uni"*>*
'EXTRACT POS=3 TYPE=TXT ATTR=<A<SP>*href="/hn_show?*&r=uni"*>*

Post by Tech Support » Thu Oct 13, 2005 9:16 am

The macro itself is correct. But the {{!LOOP}} variable is only filled automatically when

(a) the macro is started with the LOOP button (as opposed to the PLAY button), or

(b) the "-loop" command line switch is used ( http://www.iopus.com/iim/help/command_line.htm )

Played any other way, it keeps its starting value of 1, which is why you only get the first link.

You can also use the "-loop" switch with the Scripting Interface (e.g. iimInit ("-loop 15")), but the recommended method is to write a loop around the iimPlay command and fill your own variable. This approach gives you much more control; for example, you can check the iimPlay return code after every pass.

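Code:

' iim1 is assumed to be the Scripting Interface object, created earlier in the script
i = iim1.iimInit ()
For m = 1 To 15
   ' Pass the loop counter to the macro as the variable "myloop"
   i = iim1.iimSet ("-var_myloop", CStr (m))
   ' i holds the iimPlay return code for this pass
   i = iim1.iimPlay ("yourmacro")
Next
i = iim1.iimExit ()

In the macro you can now use the "myloop" variable that you created in the script: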

Code:

EXTRACT POS={{myloop}} TYPE=TXT ATTR=<A<SP>*href="/hn_show?*&r=uni"*>*
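
To illustrate the point about checking the iimPlay return code after every pass, here is a minimal sketch. It rests on two assumptions that are not stated in this thread: that the Scripting Interface object is created with the ProgID "InternetMacros.iim" (adjust this to whatever your installation registers), and that a negative return code means the macro failed, for example because there is no link at the requested position. "yourmacro" is the placeholder name used above.

```vbscript
Option Explicit
Dim iim1, i, m

' Assumption: ProgID of the Scripting Interface; adjust to your installation.
Set iim1 = CreateObject("InternetMacros.iim")
i = iim1.iimInit()

For m = 1 To 15
   ' Hand the current position to the macro as {{myloop}}
   i = iim1.iimSet("-var_myloop", CStr(m))
   i = iim1.iimPlay("yourmacro")

   ' Assumption: a negative return code signals an error,
   ' e.g. no matching link at position m. Stop looping in that case.
   If i < 0 Then
      WScript.Echo "Stopped at pass " & m & ", return code " & i
      Exit For
   End If
Next

i = iim1.iimExit()
```

With only 5 matching links on the page, a loop written this way would be expected to stop early instead of running all 15 passes, provided a failed extraction is actually reported as an error by your version.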


Mike

Post by Tech Support » Thu Oct 13, 2005 9:24 am

Note: If extraction speed is important, you can split your macro into two parts, so that the page is loaded only once rather than on every pass of the loop:

macro1: (navigates to the page)

Code:

VERSION BUILD=4310722
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://192.168.4.38/services.html
SIZE X=1296 Y=871
FRAME F=5
TAG POS=1 TYPE=H3 ATTR=TXT:*UNI<SP>parameters*

macro2: (actual extraction)

Code:

FRAME F=5
EXTRACT POS={{myloop}} TYPE=TXT ATTR=<A<SP>*href="/hn_show?*&r=uni"*>*

and use this script:

Code:

i = iim1.iimInit ()
' Navigate once, outside the loop
i = iim1.iimPlay ("macro1")
For m = 1 To 15
   i = iim1.iimSet ("-var_myloop", CStr (m))
   ' Only the short extraction macro runs on each pass
   i = iim1.iimPlay ("macro2")
Next
i = iim1.iimExit ()

PS: To get the extracted text, you need to call iimGetLastExtract; I omitted this from the sample scripts for clarity.
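
For completeness, here is a rough sketch of the two-macro script with the iimGetLastExtract call put back in. Treat it as an illustration only: the ProgID "InternetMacros.iim", the variable names linkText and results, and the rule that a negative return code means failure are my assumptions, not something confirmed in this thread.

```vbscript
Option Explicit
Dim iim1, i, m, linkText, results

' Assumption: ProgID of the Scripting Interface; adjust to your installation.
Set iim1 = CreateObject("InternetMacros.iim")
results = ""

i = iim1.iimInit()

' Navigate to the page once
i = iim1.iimPlay("macro1")

If i >= 0 Then
   For m = 1 To 15
      i = iim1.iimSet("-var_myloop", CStr(m))
      i = iim1.iimPlay("macro2")

      ' Assumption: a negative return code means nothing was extracted
      If i < 0 Then Exit For

      ' Fetch the text extracted by the macro that just ran
      linkText = iim1.iimGetLastExtract()
      results = results & m & ": " & linkText & vbCrLf
   Next
End If

i = iim1.iimExit()

' Show everything that was collected, one link text per line
WScript.Echo results
```

Reading the extract immediately after each iimPlay call matters here, since each new run of macro2 can be expected to replace the previous extraction result.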