Problem with data = iim1.iimGetLastExtract()

facteury
Posts: 21
Joined: Tue Nov 29, 2005 3:09 pm

Post by facteury » Tue Nov 29, 2005 3:14 pm

I asked this question on tech support, but since I'm not getting any helpful answer from it, I'll post my problem here:

It seems that data = iim1.iimGetLastExtract() doesn't work properly in my version. I just installed the latest version.

I have a set of macros that worked very well in v.4 but no longer work with the current version. When a macro extracts data from a long text, it now extracts only the first 1012-1016 characters instead of the entire text. This is NOT a TAG problem.

When I run the macro in IIM using a simple variable, it works fine. But when I do it via a VBScript, using the command data = iim1.iimGetLastExtract(), I get the aforementioned problem.

How_to_duplicate:
Here are the macros and vb created for a simple test. See how the second variable will not extract all the text:

macro Test1:
VERSION BUILD=4010333
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://how2pedia.com/articles/v1/bmw-parts
SIZE X=876 Y=623
'Get the values
EXTRACT POS=1 TYPE=TXT ATTR=<P<SP>id=title>*
EXTRACT POS=1 TYPE=TXT ATTR=<P<SP>id=body>*

macro Test2:
'Now fill them in a form. This is only one example. You could use it also as part of link
URL GOTO=http://www.iopus.com/iim/demo/v4/f1/form.asp
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:TestForm ATTR=NAME:Name CONTENT=***Extract<SP>and<SP>Fill<SP>Demo***
'
'The variables are part of the CONTENT tag. You could use them as part of link
TAG POS=1 TYPE=TEXTAREA FORM=NAME:TestForm ATTR=NAME:Remarks CONTENT=Extraction<SP>results:<BR><BR>One<SP>dollar<SP>costs<SP>{{title}}<SP>EURO,<SP>{{body}}

vb script for test:
'Note: EXTRACT requires Internet Explorer 6.0 or better to be installed

Msgbox ("This example shows how to use a data EXTRACTION and then fill the extracted data into a new form. It uses two macros for this purpose.")

dim body, bod

set iim1= CreateObject ("InternetMacros.iim")

iret = iim1.iimInit ()

'--- Extraction starts ---
iret = iim1.iimDisplay("Extract Data")
iplay = iim1.iimPlay("test1")
data = iim1.iimGetLastExtract()

If iplay < 0 Then
    s = "The following error occurred: " + vbCrLf + vbCrLf + data
    MsgBox s
    WScript.Quit(0)
End If
bod= Split(data, "[EXTRACT]")
title = bod(0)
body = bod(1)

'--- Extraction done ---

'--- Submission starts ---


iret = iim1.iimSet ("-var_body", body)
iret = iim1.iimSet ("-var_title", title)
iplay = iim1.iimPlay("test2")

If iplay < 0 Then
    s = "An error occurred"
    MsgBox s
End If


Thanks for your help!
Yan Muckle
Tech Support
Posts: 4948
Joined: Tue Sep 20, 2005 7:25 pm

Post by Tech Support » Tue Nov 29, 2005 9:23 pm

Hi,
I replied to your problem with a question yesterday and received your answer today. As you stated in your reply, the problem seems to occur only with very specific websites and not in general.

Today we could not do any further tests as the "problem" website seems to be down (http://how2pedia.com/articles/v1/bmw-parts). I only get the following error message when testing:
Database Error: Unable to connect to your database. Your database appears to be turned off or the database connection settings in your config file are not correct. Please contact your hosting provider if the problem persists.
Once the site is up again, can you please post the following information here: (1) the data (website extract) as it should look and (2) the data you get via the VBS interface.

I ask these questions because in our tests yesterday the website seemed to extract OK via the interface.
Ann

Post by facteury » Wed Nov 30, 2005 12:09 am

Hi Ann,

The site is up now.

Here is what the vbs extracts:
-----------start
If you are looking for parts for your BMW M3, X5 or Z4, you want the best. Whether you are needing to replace the brakes, hoses, lights or even the tell tale emblem you need to know where to look for factory direct parts at affordable prices.

1- The best place to find any type of part for your BMW is, of course through a BMW dealer. They will either have the parts needed in stock or be able to order them for you with short delivery times. They will be more expensive than other outlets, but you will know that the parts they are selling you are genuine BMW auto parts, not something that is generic.

2- If you choose to purchase your BMW parts via a dealership, be sure to take advantage of the knowledge they can provide. The mechanics at a dealership can be the best people to ask basic questions of. They will be able to give you insight as to what your vehicle actually needs and may even tell you how to go about replacing it.

3- If paying dealership prices is not what you had in mind, the next best plac

------------stop

Notice that this extract contains 1012 characters. By testing different articles using the same format and macros, I saw that the extraction always cuts after 1012-1014 characters, regardless of the length of the article.

The full body text should be extracted, like this:
-----------start
If you are looking for parts for your BMW M3, X5 or Z4, you want the best. Whether you are needing to replace the brakes, hoses, lights or even the tell tale emblem you need to know where to look for factory direct parts at affordable prices.

1- The best place to find any type of part for your BMW is, of course through a BMW dealer. They will either have the parts needed in stock or be able to order them for you with short delivery times. They will be more expensive than other outlets, but you will know that the parts they are selling you are genuine BMW auto parts, not something that is generic.

2- If you choose to purchase your BMW parts via a dealership, be sure to take advantage of the knowledge they can provide. The mechanics at a dealership can be the best people to ask basic questions of. They will be able to give you insight as to what your vehicle actually needs and may even tell you how to go about replacing it.

3- If paying dealership prices is not what you had in mind, the next best place to look for BMW parts is online. There is a vast array of online stores, which specialize in genuine, generic, new and used parts for your BMW. The prices are usually less than purchasing through a dealer, however, keep in mind that these stores may lack in knowledgeable sales staff.

4- While you are surfing the web for the best prices on the parts you need, whether they be side mirrors or tail light covers, you might also be wise to look up the "how-to guides" that tell you how to install them. There is a lot of information on the web in regards to BMW and BMW auto parts, their uses and their replacement.

5- As long as you are looking for parts for your BMW, you may want to check out the selection of accessories that are also available. These include items such as shift knob replacements and custom door handle parts, to name a few. The best way to begin a search for accessories is online or through catalogues.
------------stop

As I said, I had no problem extracting that kind of data with vb6 under v.4.3. And if I try to extract just by running the macro itself, in v.5.01, it extracts fine.
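
The character counts quoted above can be checked with a few lines of script. The following is only a minimal sketch: it assumes the fields in the extract string are separated by [EXTRACT], as in the script in the first post, and ReportFieldLengths and the sample string are illustrative names and values, not part of iMacros.

```vbscript
Option Explicit

' Hypothetical helper (not part of iMacros): lists the character count of
' each field in an extract string whose fields are separated by "[EXTRACT]".
Function ReportFieldLengths(data)
    Dim fields, i, report
    fields = Split(data, "[EXTRACT]")
    report = ""
    For i = 0 To UBound(fields)
        report = report & "Field " & i & ": " & Len(fields(i)) & " characters" & vbCrLf
    Next
    ReportFieldLengths = report
End Function

' Sample value standing in for the string returned by iim1.iimGetLastExtract()
Dim sample
sample = "5 Tips to Buying BMW Auto Parts[EXTRACT]If you are looking for parts..."
WScript.Echo ReportFieldLengths(sample)
```

Running it on the real return value right after iimGetLastExtract(), and again just before the value is handed to the next macro, would show at which step the text gets shorter.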

Thanks

Yan

Post by facteury » Thu Dec 01, 2005 8:09 pm

Were you able to replicate the problem? Do you have any clue? I need to get this resolved.

Thanks

Yan

Post by Tech Support » Fri Dec 02, 2005 1:32 am

Strange, the extraction seems to work for us. Can you please test this on a different PC to rule out PC or setup related issues?

Update: I posted details of our tests below. If you still encounter problems, please let us know.
Frank
Last edited by Tech Support on Sat Dec 03, 2005 10:57 pm, edited 2 times in total.

Post by Tech Support » Sat Dec 03, 2005 10:51 pm

Here is the extraction result we get:

5 Tips to Buying BMW Auto Parts[EXTRACT]If you are looking for parts for your BMW M3, X5 or Z4, you want the best. Whether you are needing to replace the brakes, hoses, lights or even the tell tale emblem you need to know where to look for factory direct parts at affordable prices.

1- The best place to find any type of part for your BMW is, of course through a BMW dealer. They will either have the parts needed in stock or be able to order them for you with short delivery times. They will be more expensive than other outlets, but you will know that the parts they are selling you are genuine BMW auto parts, not something that is generic.

2- If you choose to purchase your BMW parts via a dealership, be sure to take advantage of the knowledge they can provide. The mechanics at a dealership can be the best people to ask basic questions of. They will be able to give you insight as to what your vehicle actually needs and may even tell you how to go about replacing it.

3- If paying dealership prices is not what you had in mind, the next best place to look for BMW parts is online. There is a vast array of online stores, which specialize in genuine, generic, new and used parts for your BMW. The prices are usually less than purchasing through a dealer, however, keep in mind that these stores may lack in knowledgeable sales staff.

4- While you are surfing the web for the best prices on the parts you need, whether they be side mirrors or tail light covers, you might also be wise to look up the "how-to guides" that tell you how to install them. There is a lot of information on the web in regards to BMW and BMW auto parts, their uses and their replacement.

5- As long as you are looking for parts for your BMW, you may want to check out the selection of accessories that are also available. These include items such as shift knob replacements and custom door handle parts, to name a few. The best way to begin a search for accessories is online or through catalogues. [EXTRACT]
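
To compare a complete result like this one with a truncated one, a small helper can report where two strings first differ. This is a rough sketch only; FirstDifference is a hypothetical function written for illustration, and the two sample strings are shortened stand-ins for the real extracts.

```vbscript
Option Explicit

' Hypothetical helper: returns the 1-based position of the first character
' at which a and b differ, or 0 if the two strings are identical.
' If one string is a prefix of the other, the result is the position just
' past the end of the shorter string.
Function FirstDifference(a, b)
    Dim i, shorter
    If Len(a) < Len(b) Then shorter = Len(a) Else shorter = Len(b)
    For i = 1 To shorter
        If Mid(a, i, 1) <> Mid(b, i, 1) Then
            FirstDifference = i
            Exit Function
        End If
    Next
    If Len(a) = Len(b) Then
        FirstDifference = 0
    Else
        FirstDifference = shorter + 1
    End If
End Function

' Shortened sample strings: a complete text and a cut-off copy of it
WScript.Echo FirstDifference("the next best place to look", "the next best plac")
```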

Post by Tech Support » Sat Dec 03, 2005 10:53 pm

And here is the VBS test file we used:

Code: Select all

Dim MyArray
Dim objFileSystem, objOutputFile
Dim strOutputFile
Dim pos


Const OPEN_FILE_FOR_APPENDING = 8

MsgBox "This script demonstrates how to extract data from a web site and store this information in a text file (CSV Format, CSV = Comma separated values). It uses the macro <wsh-extract-jobs.iim>." + vbCrLf + VbCrLf + "Tip: This script has the same function as <extract-2-database.vbs> but stores the data in a test file instead of a database."

' generate a filename based on the script name
strOutputFile = "./extracted-data.txt"

Set objFileSystem = CreateObject("Scripting.fileSystemObject")

Set objOutputFile = objFileSystem.CreateTextFile(strOutputFile, TRUE)

set iim1= CreateObject ("InternetMacros.iim")
iret = iim1.iimInit()
iret = iim1.iimDisplay("Test 1")
iplay = iim1.iimPlay("test1")

data = iim1.iimGetLastExtract()
errortext = iim1.iimGetLastError()   

objOutputFile.WriteLine(data)

if iplay < 0 Then MsgBox errortext

iret = iim1.iimExit
objOutputFile.Close
Set objFileSystem = Nothing

MsgBox "The data is stored in the file <extracted-data.txt>. The script is completed."

WScript.Quit(0)
Test macro:

Code: Select all

VERSION BUILD=4010333 
TAB T=1 
TAB CLOSEALLOTHERS 
URL GOTO=http://how2pedia.com/articles/v1/bmw-parts 
SIZE X=876 Y=623 
'Get the values 
EXTRACT POS=1 TYPE=TXT ATTR=<P<SP>id=title>* 
EXTRACT POS=1 TYPE=TXT ATTR=<P<SP>id=body>* 

Post by facteury » Mon Dec 05, 2005 7:00 pm

Hi,

Using the script you used, it works. The data gets extracted properly. Can you tell me what was wrong in the example I used (it is not the same as the real macro I use, which would be too long to post here)? I took it from one of your examples...

Please help me understand how to correct that.

Yan
Yan Muckle

Post by facteury » Mon Dec 05, 2005 7:43 pm

It appears the problem is not with the extraction of data, but with the insertion using a custom variable.

When I try your vbs, it works. The data gets extracted completely.

But when I modify your vbs to have it paste the result into your test web form, it does NOT work. Using a custom variable seems to mess up the results, and I didn't have this problem with v.4.

Here is your script, modified to paste the result into your web page. Run it and see how the data is not pasted in its entirety:
Dim MyArray
Dim objFileSystem, objOutputFile
Dim strOutputFile
Dim pos


Const OPEN_FILE_FOR_APPENDING = 8

MsgBox "This script demonstrates how to extract data from a web site and store this information in a text file (CSV Format, CSV = Comma separated values). It uses the macro <wsh-extract-jobs.iim>." + vbCrLf + VbCrLf + "Tip: This script has the same function as <extract-2-database.vbs> but stores the data in a test file instead of a database."

' generate a filename based on the script name
strOutputFile = "./extracted-data.txt"

Set objFileSystem = CreateObject("Scripting.fileSystemObject")

Set objOutputFile = objFileSystem.CreateTextFile(strOutputFile, TRUE)

set iim1= CreateObject ("InternetMacros.iim")
iret = iim1.iimInit()
iret = iim1.iimDisplay("Test 1")
iplay = iim1.iimPlay("test1")

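data = iim1.iimGetLastExtract()
errortext = iim1.iimGetLastError()
if iplay < 0 Then MsgBox errortext

' Pass the extracted text to the second macro as {{body}}
iret = iim1.iimSet ("-var_body", data)
iplay = iim1.iimPlay("test2")

' Fetch the error text again so that it refers to test2, not test1
errortext = iim1.iimGetLastError()
if iplay < 0 Then MsgBox errortext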


objOutputFile.Close
Set objFileSystem = Nothing

MsgBox "The data is stored in the file <extracted-data.txt>. The script is completed."
So the problem is likely with the use of:

iret = iim1.iimSet ("-var_body", data)
iplay = iim1.iimPlay("test2")

Note: the test2 macro used is:
'Now fill them in a form. This is only one example. You could use it also as part of link
URL GOTO=http://www.iopus.com/iim/demo/v4/f1/form.asp
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:TestForm ATTR=NAME:Name CONTENT=***Extract<SP>and<SP>Fill<SP>Demo***
'
'The variables are part of the CONTENT tag. You could use them as part of link
TAG POS=1 TYPE=TEXTAREA FORM=NAME:TestForm ATTR=NAME:Remarks CONTENT=Extraction<SP>results:<BR><BR>One<SP>dollar<SP>costs<SP>{{title}}<SP>EURO,<SP>{{body}}
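
A side note that is separate from the truncation itself: test2 references both {{title}} and {{body}}, while the modified script sets only -var_body, and sets it to the whole extract string. Below is a minimal sketch of splitting the string first, assuming the [EXTRACT] separator used in the first post. GetExtractField is a hypothetical helper, and the iimSet calls appear only as comments because they need iMacros installed.

```vbscript
Option Explicit

' Hypothetical helper: returns the field at the given 0-based index from an
' extract string, or an empty string when that field is missing.
Function GetExtractField(data, index)
    Dim fields
    fields = Split(data, "[EXTRACT]")
    If index >= 0 And index <= UBound(fields) Then
        GetExtractField = fields(index)
    Else
        GetExtractField = ""
    End If
End Function

Dim data, title, body
' Sample value standing in for the string returned by iim1.iimGetLastExtract()
data = "Some title[EXTRACT]Some body text[EXTRACT]"
title = GetExtractField(data, 0)
body = GetExtractField(data, 1)

' With iMacros installed, the two values would then be passed on separately:
'   iret = iim1.iimSet("-var_title", title)
'   iret = iim1.iimSet("-var_body", body)
WScript.Echo "title: " & title & vbCrLf & "body: " & body
```

Checking the index before reading the array avoids a "subscript out of range" error when the extract returns fewer fields than expected.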
Thanks for trying this asap...

Yan
Yan Muckle

Post by Tech Support » Mon Dec 05, 2005 10:26 pm

Thanks again for the details. Please download the patch discussed here: http://forum.imacros.net/viewtopic.php?t=274

Post by facteury » Tue Dec 06, 2005 12:37 am

Ok! I downloaded the patch and the insertion now works correctly.

Thanks, and make sure you include that in the next release...

Yan
Yan Muckle