URL Extract Problem

Discussions and Tech Support specific to the iMacros Firefox add-on.
iorgu
Posts: 6
Joined: Thu Sep 17, 2009 6:33 pm

URL Extract Problem

Post by iorgu » Thu Sep 17, 2009 6:38 pm

Hi guys,

I have a little problem.

Here is my code:

Code: Select all

VERSION BUILD=6240709 RECORDER=FX
TAB T=1
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
TAG POS=1 TYPE=DIV ATTR=CLASS:posting_thumbnail&&HREF: EXTRACT=HREF
SAVEAS TYPE=EXTRACT FOLDER=C:\ FILE=gum-test.CSV
TAG POS=2 TYPE=DIV ATTR=CLASS:posting_thumbnail&&A:* EXTRACT=HREF
SAVEAS TYPE=EXTRACT FOLDER=C:\ FILE=gum-test.CSV
And here is the HTML code:

Code: Select all

<div class="posting_thumbnail"><a href="http://www.gumtree.com/london/66/45343066.html"><img width="58" height="43" src="http://is.gumtree.com/ad_image/live/listingthumb/271199318.jpg"/></a></div>
or

Code: Select all

<div class="posting_row1"><h5><a href="/london/66/45343066.html">Single rooms only 3 Mins Walk to Plaistow Tube- 8 mins walk to Stratford Tube-All Bills WiFi Cleaner</a></h5></div>
Can you guys show me how to extract the link?
http://www.gumtree.com/london/66/45343066.html

Thanks
Last edited by iorgu on Fri Sep 18, 2009 8:23 am, edited 1 time in total.
Marcia, Tech Support
Posts: 1095
Joined: Thu Jan 29, 2009 1:10 pm

Re: URL Extract Problem

Post by Marcia, Tech Support » Thu Sep 17, 2009 8:55 pm

Hello,

Have you tried something like:

Code: Select all

TAG POS=1 TYPE=A ATTR=TXT:Single<SP>rooms<SP>only<SP>3<SP>Mins* EXTRACT=HREF
Regards,

Marcia
iorgu
Posts: 6
Joined: Thu Sep 17, 2009 6:33 pm

Re: URL Extract Problem

Post by iorgu » Fri Sep 18, 2009 8:22 am

Thanks for the quick reply, but there are 40 different links on that page.

Code: Select all

http://www.gumtree.com/london/2478_1.html
I cannot extract them through the TXT attribute.

There must be another way to do this.
josephconlin
Posts: 190
Joined: Wed Aug 06, 2008 2:38 am

Re: URL Extract Problem

Post by josephconlin » Fri Sep 18, 2009 4:04 pm

Having read through this, here is what I think is happening.

Code: Select all

TAG POS=1 TYPE=DIV ATTR=CLASS:posting_thumbnail&&HREF: EXTRACT=HREF
You are trying to extract the HREF attribute of the div tag. As you can see from the HTML you posted, the div tag doesn't have an href attribute.

Code: Select all

<div class="posting_thumbnail">...</div>
It sounds to me like what you want to do is locate the div tag by the class attribute, then find the link inside of the div tag and get the href attribute from that. If that is what you are trying to do, you can accomplish that by using relative positioning. See http://wiki.imacros.net/TAG_parameters_ ... ositioning for more information.
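To make the idea concrete outside of iMacros, here is a minimal Python sketch of the same lookup: match the div by its class, then take the href of the first anchor that follows it. This is only an illustration of the logic, not how iMacros implements relative positioning; the class name `FirstLinkAfterDiv` is a hypothetical helper, and the HTML is the snippet posted above.

```python
from html.parser import HTMLParser

# The thumbnail snippet from the original post.
HTML = ('<div class="posting_thumbnail">'
        '<a href="http://www.gumtree.com/london/66/45343066.html">'
        '<img width="58" height="43" '
        'src="http://is.gumtree.com/ad_image/live/listingthumb/271199318.jpg"/>'
        '</a></div>')


class FirstLinkAfterDiv(HTMLParser):
    """Hypothetical helper: find a div by class, then the first <a> after it."""

    def __init__(self, div_class):
        super().__init__()
        self.div_class = div_class
        self.div_attrs = None   # attributes of the matched div
        self.href = None        # href of the first anchor after that div

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and self.div_attrs is None:
            # Step 1: locate the div by its class attribute.
            if attrs.get("class") == self.div_class:
                self.div_attrs = attrs
        elif tag == "a" and self.div_attrs is not None and self.href is None:
            # Step 2: first anchor after the matched div.
            self.href = attrs.get("href")


parser = FirstLinkAfterDiv("posting_thumbnail")
parser.feed(HTML)
parser.close()

print("href" in parser.div_attrs)  # the div itself carries no href
print(parser.href)                 # the link lives on the nested anchor
```

The div's own attributes contain only `class`, which is why asking the div for HREF returns nothing, while the nested anchor holds the URL.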

Here's an example that might work.

Code: Select all

VERSION BUILD=6240709 RECORDER=FX
TAB T=1
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
TAG POS=1 TYPE=DIV ATTR=CLASS:posting_thumbnail EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=C:\ FILE=gum-test.CSV
TAG POS=R1 TYPE=A ATTR=HREF:* EXTRACT=HREF
SAVEAS TYPE=EXTRACT FOLDER=C:\ FILE=gum-test.CSV
This should do the following.
1) Change to tab 1.
2) Set !ERRORIGNORE variable
3) Set !EXTRACT_TEST_POPUP variable
4) Find the first div tag with a class attribute of posting_thumbnail and extract the text of the tag (in your example, there would be no text, but it keeps the macro from clicking the div tag, which may or may not be needed).
5) Save the extracted text (again, probably nothing) to the CSV file.
6) Go to the first anchor tag (<a ...>, i.e. a link) following the div tag we previously found and extract the value of its href attribute.
7) Save the extracted href to the CSV file.

Your output CSV file should look something like this:
""
"http://www.gumtree.com/london/66/45343066.html"

Also note that this does not deal with the second div tag example you posted, where the div tag has a class attribute of posting_row1.
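One detail of that second snippet: its href is site-relative (`/london/66/45343066.html`). If an extracted value ever comes back in that relative form, it can be resolved against the page it was taken from. A small Python sketch, assuming the listing page URL mentioned earlier in the thread is the base:

```python
from urllib.parse import urljoin

# Assumption: the links were extracted from this listing page.
BASE = "http://www.gumtree.com/london/2478_1.html"

# The site-relative href from the posting_row1 snippet.
relative_href = "/london/66/45343066.html"

# urljoin keeps the scheme and host of BASE and swaps in the new path.
absolute = urljoin(BASE, relative_href)
print(absolute)
```

An href that is already absolute passes through `urljoin` unchanged, so the same call is safe for both snippets.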

Hope this helps.
iorgu
Posts: 6
Joined: Thu Sep 17, 2009 6:33 pm

Re: URL Extract Problem

Post by iorgu » Fri Sep 18, 2009 8:51 pm

Thanks a lot josephconlin.

It's working smoothly.

I have a little question: how can I get rid of the blank lines between the links?

Code: Select all

http://www.gumtree.com/london/54/43000054.html

http://www.gumtree.com/london/54/43000054.html

http://www.gumtree.com/london/43/44555243.html
Thanks A Million :)
josephconlin
Posts: 190
Joined: Wed Aug 06, 2008 2:38 am

Re: URL Extract Problem

Post by josephconlin » Sun Sep 20, 2009 11:24 pm

Here's an example, edited from my previous example, that doesn't save the text extract.

Code: Select all

VERSION BUILD=6240709 RECORDER=FX
TAB T=1
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
TAG POS=1 TYPE=DIV ATTR=CLASS:posting_thumbnail EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R1 TYPE=A ATTR=HREF:* EXTRACT=HREF
SAVEAS TYPE=EXTRACT FOLDER=C:\ FILE=gum-test.CSV
As you can see, I changed one of the SAVEAS lines (the first one) to SET !EXTRACT NULL. This removes the empty extract from !EXTRACT.

If clicking on the DIV element does not change the page, this can be shortened to the following. That is most often the case, but a click on a div element sometimes triggers something, which is why I left the extract in above.

Code: Select all

VERSION BUILD=6240709 RECORDER=FX
TAB T=1
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
TAG POS=1 TYPE=DIV ATTR=CLASS:posting_thumbnail
TAG POS=R1 TYPE=A ATTR=HREF:* EXTRACT=HREF
SAVEAS TYPE=EXTRACT FOLDER=C:\ FILE=gum-test.CSV
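For a CSV file that was already written with the empty rows in it, the rows can also be dropped afterwards. Here is a minimal Python sketch; it assumes the file looks like the sample output shown earlier (one quoted value per line, with `""` for the empty extracts), and `drop_empty_rows` is a hypothetical helper name.

```python
import csv
import io

# Assumed file contents: empty extracts alternate with the extracted links.
raw = ('""\n'
       '"http://www.gumtree.com/london/54/43000054.html"\n'
       '""\n'
       '"http://www.gumtree.com/london/43/44555243.html"\n')


def drop_empty_rows(text):
    """Return the CSV rows that contain at least one non-empty field."""
    reader = csv.reader(io.StringIO(text))
    return [row for row in reader if any(field.strip() for field in row)]


rows = drop_empty_rows(raw)

# Write the remaining rows back out in the same fully quoted style.
out = io.StringIO()
csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)
cleaned = out.getvalue()
print(cleaned)
```

To work on a real file, the same function can be fed the text read from the file on disk instead of the `raw` string.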
Hope this helps.
iorgu
Posts: 6
Joined: Thu Sep 17, 2009 6:33 pm

Re: URL Extract Problem

Post by iorgu » Mon Sep 21, 2009 10:52 am

Thanks man,

you've helped me a lot.

Thank you Thank you Thank you Thank you :D
addy196
Posts: 3
Joined: Tue Aug 20, 2013 11:48 am

Re: URL Extract Problem

Post by addy196 » Tue Aug 20, 2013 5:36 pm

I am unable to upload an image on Gumtree. Can anyone help me with the code, please?

Link is https://my.gumtree.com/postad

Please Help