The main problem is that my iMacros TAG command extracts both good data and bad data.
The webpage I'm extracting data from is:
http://www.google.co.uk/search?q=double ... WIBQ&gbv=2
* the good data is: the results in 1st position in Google, 2nd, etc.
** the bad data is: the entries from the Shopping results (Limelight Beds Furniture Limelight, etc.)
The very same TAG command that extracts the good data also extracts the bad data.
There might be a way around this though, and maybe with your help I can get it to work.
What could help me ignore the bad data is the HTML code.
Below you can see the HTML for both good and bad data.
You'll see that both of them look similar, BUT there is one thing that makes the difference: the HTML tag <h3 class="r">, which is only seen in the good data.
HTML code for the good data:
<h3 class="r">
<a href="http://www.website.co.uk/" class=l onmousedown="return rwt">
Double Beds Frames
</a>
</h3>
HTML code for the bad data:
<a href="http://www.website.co.uk/" class=l onmousedown="return rwt">
Double Beds Frames
</a>
The tag that makes the difference (missing from the bad data):
<h3 class="r">
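To pin down the logic I'm after, here is a minimal sketch in Python rather than iMacros. It only assumes the markup looks like the two snippets above; the sample URLs and the names OrganicLinkParser and extract_links are made up for illustration. It keeps a link only when the <a class=l> sits inside an <h3 class="r">.

```python
from html.parser import HTMLParser

# Hypothetical sample modelled on the two snippets above; the URLs are invented
# so that the good and the bad link can be told apart.
SAMPLE_HTML = """
<h3 class="r">
<a href="http://www.good-example.co.uk/" class=l onmousedown="return rwt">
Double Beds Frames
</a>
</h3>
<a href="http://www.shopping-example.co.uk/" class=l onmousedown="return rwt">
Double Beds Frames
</a>
"""


class OrganicLinkParser(HTMLParser):
    """Collect hrefs of <a class=l> tags, noting which sit inside <h3 class="r">."""

    def __init__(self):
        super().__init__()
        self.in_result_heading = False  # True while inside an <h3 class="r">
        self.all_links = []             # every <a class=l>, good and bad
        self.organic_links = []         # only those wrapped in <h3 class="r">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h3" and attrs.get("class") == "r":
            self.in_result_heading = True
        elif tag == "a" and attrs.get("class") == "l" and "href" in attrs:
            self.all_links.append(attrs["href"])
            if self.in_result_heading:
                self.organic_links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_result_heading = False


def extract_links(html):
    """Return (all class=l links, links inside <h3 class="r">) in page order."""
    parser = OrganicLinkParser()
    parser.feed(html)
    return parser.all_links, parser.organic_links


if __name__ == "__main__":
    all_links, organic = extract_links(SAMPLE_HTML)
    print("class=l only:", all_links)   # good and bad links mixed together
    print("inside h3.r: ", organic)     # good links only, in ranking order
```

The second list is in page order, so its first entry would be the 1st position, the next one the 2nd, and so on. What I need is the iMacros equivalent of that filter.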
Bottom line: I need to extract the website URL from Google's 1st position, then the 2nd, 3rd, etc., which is easy if we're using the iMacros browser.
The iMacros browser would return:
TAG POS=1 TYPE=A ATTR=CLASS:l&&TXT:*&&HREF:* EXTRACT=HREF
The problem is that it also extracts data that I don't need, namely the URLs from the Shopping results. (I badly need to skip the Shopping results, so that iMacros won't see them.)
First I tried:
TAG POS=1 TYPE=H3 ATTR=CLASS:r EXTRACT=HREF
Second I tried:
TAG POS=1 TYPE=H3 ATTR=CLASS:r&&TXT:*&&HREF:* EXTRACT=TXT
Third I also tried:
TAG POS=1 TYPE=H3 ATTR=CLASS:r&&CLASS:l&&TXT:*&&HREF:* EXTRACT=TXT
I'd appreciate some help with this.