Extraction Bug

Post by msalperen » Mon Aug 28, 2006 4:12 pm

Hello,

When I try to extract text that is already wrapped in double curly brackets, such as {{mytext}}, iOpus fails to handle it because it treats the target text as a variable.

I think this is an unnoticed bug. Here is an example macro; it gives an error on line 8:

VERSION BUILD=4310722     
TAB T=1     
TAB CLOSEALLOTHERS   
URL GOTO=http://www.editthis.info/wikiky/Deneme23?title=Deneme&action=edit     
SIZE X=1004 Y=723    
URL GOTO=http://www.editthis.info/wikiky/Deneme     
TAG POS=1 TYPE=A ATTR=TXT:Düzenle   
EXTRACT POS=1 TYPE=TXT ATTR=<TEXTAREA<SP>accessKey=,<SP>tabIndex=1<SP>name=wpTextbox1<SP>rows=25<SP>cols=80>*
SET !VAR1 {{!EXTRACT}}
TAG POS=1 TYPE=A ATTR=TXT:son<SP>değişiklikler   
TAG POS=1 TYPE=A ATTR=TXT:Deneme23   
TAG POS=1 TYPE=A ATTR=TXT:Düzenle   
TAG POS=1 TYPE=TEXTAREA FORM=NAME:editform ATTR=NAME:wpTextbox1 CONTENT={{!VAR1}} 
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:editform ATTR=NAME:wpSave&&VALUE:Save<SP>page  
'Comment: New page loaded

Note: You can edit that wiki page for testing. I am keeping the troublesome text intact so you can test against it. This issue is not new; I mentioned it months ago but did not give a clear example. I hope this helps.

This is the permanent URL: http://www.editthis.info/wikiky/Deneme? ... oldid=5243

Post by Tech Support » Mon Aug 28, 2006 4:49 pm

Hi,

The issue is fixed, and the fix is available in V5.2. With the new version you can use the #NOVAR# escape sequence.

Example:
TAG TYPE=INPUT:TEXT FORM=NAME:TestForm ATTR=NAME:Name CONTENT=Hello#NOVAR#{{World}}

This will fill in the following text: Hello{{World}}

(Note that the {{...}} are not replaced)

Daniel Kerr
iOpus Support
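
To make the effect of #NOVAR# in the example above concrete, here is a minimal Python sketch. It is only a toy model and not iMacros code: the names expand and protect are hypothetical, and it assumes that #NOVAR# switches off {{...}} replacement for everything after it in the value, which the example suggests but the post does not state.

```python
import re

NOVAR = "#NOVAR#"
# Matches {{name}} placeholders (toy model of macro variable syntax).
_VAR = re.compile(r"\{\{(.*?)\}\}")


def expand(value, variables):
    """Replace {{name}} with variables[name] in the part before #NOVAR#.

    Assumption: everything after the first #NOVAR# is passed through
    literally and the marker itself is dropped.
    """
    head, _marker, tail = value.partition(NOVAR)
    head = _VAR.sub(lambda m: variables[m.group(1)], head)
    return head + tail


def protect(extracted):
    """Prefix extracted text with #NOVAR# so its {{...}} stays literal."""
    return NOVAR + extracted


print(expand("Hello#NOVAR#{{World}}", {}))
print(expand(protect("{{mytext}}"), {}))
```

Under that assumption, the first call reproduces the "Hello{{World}}" result from the post, and the second shows why prefixing an extracted string with #NOVAR# would keep {{mytext}} from being looked up as a variable.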

Post by msalperen » Mon Aug 28, 2006 5:59 pm

Hello, I tried every combination of parameters but couldn't solve the issue. Would you please show how to apply it to the example above?

Post by Tech Support » Tue Aug 29, 2006 12:11 pm

Hi,

To do this, you need to get #NOVAR# into the extracted string before it is used as form content. This can be done with a simple script. You should split the macro into two:

novar.iim

VERSION BUILD=4310722     
TAB T=1     
TAB CLOSEALLOTHERS   
'URL GOTO=http://www.editthis.info/wikiky/Deneme23?title=Deneme&action=edit     
SIZE X=1004 Y=723   
URL GOTO=http://www.editthis.info/wikiky/Deneme     
TAG POS=1 TYPE=A ATTR=TXT:Düzenle   
EXTRACT POS=1 TYPE=TXT ATTR=<TEXTAREA<SP>accessKey=,<SP>tabIndex=1<SP>name=wpTextbox1<SP>rows=25<SP>cols=80>*

novar2.iim

WINCLICK X=65 Y=241 CONTENT=      
WAIT SECONDS=10
TAG POS=1 TYPE=A ATTR=TXT:Deneme23   
TAG POS=1 TYPE=A ATTR=TXT:Düzenle   
TAG POS=1 TYPE=TEXTAREA FORM=NAME:editform ATTR=NAME:wpTextbox1 CONTENT={{input}}
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:editform ATTR=NAME:wpSave&&VALUE:Save<SP>page 
'Comment: New page loaded

and call them from a script:

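Option Explicit
Dim iim1
Dim inputstring, i
' Create the iMacros scripting interface object and start the browser
Set iim1 = CreateObject("InternetMacros.iim")
i = iim1.iimInit
' Run the first macro, which ends with the EXTRACT command
i = iim1.iimPlay("novar")
' Prefix the extracted text with #NOVAR# so that {{...}} in it is not treated as a variable
inputstring = "#NOVAR#" & iim1.iimGetLastExtract()
' Show the resulting string (debugging aid; remove for unattended runs)
MsgBox inputstring
' Pass the string to the second macro, where it is available as {{input}}
i = iim1.iimSet("-var_input", inputstring)
i = iim1.iimPlay("novar2")
WScript.Quit(0)
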
Daniel Kerr
iOpus Support