extract tag position keeps on varying


by hamzajosh on Tue Dec 06, 2005 11:43 am

I need to extract the descriptions from these links. When I record a macro for this link:
http://www.sigmaaldrich.com/catalog/sea ... LUKA/54465
the generated code is:

VERSION BUILD=5010115
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://www.sigmaaldrich.com/catalog/search/ProductDetail/FLUKA/54465
SIZE X=801 Y=602
EXTRACT POS=2 TYPE=TXT ATTR=<SPAN<SP>class=sectionHeader>*
EXTRACT POS=12 TYPE=TXT ATTR=<DIV<SP>class=leftColumn>*
EXTRACT POS=10 TYPE=TXT ATTR=<DIV<SP>class=rightColumn>*

When i change the link to
http://www.sigmaaldrich.com/catalog/sea ... LUKA/95209 via the loop, the macro no longer gets the description; it picks up some other data instead. How do I write the last two lines so that only the description is picked up? Please help ASAP. Hamza Josh

by Tech Support on Wed Dec 07, 2005 7:55 am

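Your extraction anchors are well chosen. In addition, you can make the extraction more robust against website changes by using a relative extraction anchor (EXTRACT POS=R1 ...). A TAG command first selects a fixed reference element, and the extraction position is then counted from that element instead of from the top of the page.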

This macro extracts the description on both pages correctly:
VERSION BUILD=5010115
TAB T=1
TAB CLOSEALLOTHERS
'URL GOTO=http://www.sigmaaldrich.com/catalog/search/ProductDetail/FLUKA/54465
URL GOTO=http://www.sigmaaldrich.com/catalog/search/ProductDetail/FLUKA/95209
TAG POS=1 TYPE=SPAN ATTR=TXT:Descriptions*
EXTRACT POS=R1 TYPE=TXT ATTR=<DIV<SP>class=rightColumn>*
Last edited by Tech Support on Wed Dec 07, 2005 8:07 am, edited 2 times in total.
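
To illustrate why the absolute position picked up the wrong data while the relative anchor did not, here is a minimal sketch in Python. It is not iMacros code and does not reproduce how iMacros works internally; it is a toy model under stated assumptions. The two HTML snippets, the class `ColumnCollector`, and the functions `extract_absolute` and `extract_relative` are all hypothetical and invented for illustration. Only the class names `sectionHeader` and `rightColumn` and the anchor text `Descriptions` are taken from the macros above.

```python
from html.parser import HTMLParser

# Hypothetical toy pages: page B has one more rightColumn block before the
# description than page A, which shifts every absolute position by one.
PAGE_A = """
<span class="sectionHeader">Properties</span>
<div class="rightColumn">98%</div>
<span class="sectionHeader">Descriptions</span>
<div class="rightColumn">Description A</div>
"""
PAGE_B = """
<span class="sectionHeader">Properties</span>
<div class="rightColumn">99%</div>
<div class="rightColumn">liquid</div>
<span class="sectionHeader">Descriptions</span>
<div class="rightColumn">Description B</div>
"""


class ColumnCollector(HTMLParser):
    """Record section headers and rightColumn blocks in document order."""

    def __init__(self):
        super().__init__()
        self.items = []    # entries of the form [kind, text]
        self._open = None  # entry currently collecting text, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls == "sectionHeader":
            self._open = ["header", ""]
            self.items.append(self._open)
        elif tag == "div" and cls == "rightColumn":
            self._open = ["right", ""]
            self.items.append(self._open)

    def handle_endtag(self, tag):
        # The toy pages have no nested elements, so any close ends collection.
        if tag in ("span", "div"):
            self._open = None

    def handle_data(self, data):
        if self._open is not None:
            self._open[1] += data


def parse(html):
    parser = ColumnCollector()
    parser.feed(html)
    parser.close()
    return parser.items


def extract_absolute(html, pos):
    """Return the pos-th rightColumn block, counted from the top of the page."""
    rights = [text.strip() for kind, text in parse(html) if kind == "right"]
    return rights[pos - 1]


def extract_relative(html, anchor_prefix, offset=1):
    """Return the offset-th rightColumn block after the anchoring header."""
    items = parse(html)
    for i, (kind, text) in enumerate(items):
        if kind == "header" and text.strip().startswith(anchor_prefix):
            after = [t.strip() for k, t in items[i + 1:] if k == "right"]
            return after[offset - 1]
    raise ValueError("anchor not found: " + anchor_prefix)


if __name__ == "__main__":
    for page in (PAGE_A, PAGE_B):
        print(extract_absolute(page, 2), "|", extract_relative(page, "Descriptions"))
```

In this toy model the absolute position 2 returns the description on page A but a property value on page B, whereas the lookup anchored on the "Descriptions" header returns the description on both. That mirrors the behavior reported in this thread: a fixed POS value shifts when a page has more or fewer matching blocks above the description.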

by Tech Support on Wed Dec 07, 2005 8:04 am

More information about relative extraction is available at http://forum.iopus.com/viewtopic.php?p=1029
