Yellow Pages macro Difficulty

macro2macro
Posts: 1
Joined: Sat Jul 17, 2010 5:30 pm

Re: Yellow Pages macro Difficulty

Post by macro2macro » Sat Jul 17, 2010 5:36 pm

I used the code, but the relative positioning does not work: every extracted field comes back blank. I am on version 7.0.0.795.
Tom, Tech Support
Posts: 3415
Joined: Mon May 31, 2010 4:59 pm

Re: Yellow Pages macro Difficulty

Post by Tom, Tech Support » Wed Jul 21, 2010 10:54 am

Hi macro2macro,

The macro is specific to version 6. Relative positioning has changed in version 7 and the macro code here would have to be reworked to be compatible with that version.

http://wiki.imacros.net/V7_Relative_positioning
Regards,

Tom, iMacros Support
secret
Posts: 3
Joined: Mon Oct 04, 2010 10:00 am

Re: Yellow Pages macro Difficulty

Post by secret » Mon Oct 04, 2010 10:17 am

Hi Tom,

I tried the code you posted on July 1 and would like to modify it for yellowpages.com.au. I changed ExtractEntry.iim as you can see below:

Code: Select all

SET !TIMEOUT_TAG 1
TAG POS={{Cnt}} TYPE=LI ATTR=CLASS:gold<SP>mappableListing<SP>listingContainer<SP>omnitureListing
SET !ENDOFPAGE {{!TAGSOURCEINDEX}}
TAG POS={{Cnt}} TYPE=DIV ATTR=CLASS:info&&TXT:*
TAG POS=R1 TYPE=SPAN ATTR=TXT:*&&ID:listing-name&&TXT:* EXTRACT=TXT
TAG POS={{Cnt}} TYPE=DIV ATTR=CLASS:info&&TXT:*
TAG POS=R1 TYPE=SPAN ATTR=CLASS:phoneNumber&&TXT:* EXTRACT=TXT
TAG POS={{Cnt}} TYPE=DIV ATTR=CLASS:info&&TXT:*
TAG POS=R1 TYPE=SPAN ATTR=CLASS:address&&TXT:* EXTRACT=TXT
TAG POS={{Cnt}} TYPE=DIV ATTR=CLASS:info&&TXT:*
TAG POS=R1 TYPE=A ATTR=TXT:website&&TXT:* EXTRACT=HREF
I tried it for an example page: http://www.yellowpages.com.au/search/li ... &x=40&y=12

Of course I changed the QryYP.iim accordingly but it doesn't work.

Maybe it is a very simple and noob question but I am totally new to iMacros and I cannot find my way. Can you help me out? What and how should I change?

Thanks a lot in advance,
secret

P.S.: Anyone's advice would be highly appreciated :)
secret
Posts: 3
Joined: Mon Oct 04, 2010 10:00 am

Re: Yellow Pages macro Difficulty

Post by secret » Wed Oct 06, 2010 1:35 pm

Okay, here is what I have found out so far:

Code: Select all

VERSION BUILD=6900210
TAB T=1     
TAB CLOSEALLOTHERS
SET !EXTRACT_TEST_POPUP NO     
TAG POS={{!LOOP}} TYPE=LI ATTR=CLASS:*<SP>listingContainer<SP>omnitureListing&&TXT:*
TAG POS={{!LOOP}} TYPE=SPAN ATTR=TXT:*&&ID:listing-name-* EXTRACT=TXT  
TAG POS={{!LOOP}} TYPE=SPAN ATTR=CLASS:address&&TXT:* EXTRACT=TXT  
TAG POS={{!LOOP}} TYPE=SPAN ATTR=CLASS:phoneNumber&&TXT:* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=C:\TEMP FILE=YP.csv
It still cannot be used with the VBScript; it only plays in Loop Mode. I also cannot find a way to extract the HREF from the "website" link.

Can anyone help me modify it so that it plays from the VBScript and extracts the HREF? I am completely lost.
Tom, Tech Support
Posts: 3415
Joined: Mon May 31, 2010 4:59 pm

Re: Yellow Pages macro Difficulty

Post by Tom, Tech Support » Fri Oct 08, 2010 1:40 pm

Hello secret,

Here is the solution modified to work with the Australian yellow pages site and iMacros 7.
YellowPagesAu.zip
All source files in one zip package
(1.98 KiB) Downloaded 676 times
YellowPagesAu.vbs:

Code: Select all

Option Explicit

Const MAX_EXTRACTED_ITEMS = 3        ' This value should match the number of EXTRACTs in ExtractListing.iim

Dim im, ret, cnt, i
Dim  macroPath, outputFolder, outputFile

macroPath = ""
outputFolder = "*"
outputFile = "YP.csv"

Set im = CreateObject ("iMacros")
CheckErr(im.iimInit())

' Query the yellow pages site
CheckErr(im.iimPlay(macroPath + "QryYP.iim"))

cnt = 1
Do 
    ' Extract the listing
    If Not ExtractListing(cnt) Then ' ExtractListing returns false if there are no more listings
        im.iimPlay("CODE:PROMPT All<SP>done!")
        Exit Do
    End if
    
    ' Set the variables in our Save macro with the extracted values
    For i = 1 to MAX_EXTRACTED_ITEMS
        CheckErr(im.iimSet("field" & CStr(i), Replace(im.iimGetLastExtract(i), "#EANF#", "")))
    Next
    
    ' Save the extracted items to the output file
    CheckErr(im.iimSet("outputFolder", outputFolder))
    CheckErr(im.iimSet("outputFile", outputFile))
    CheckErr(im.iimPlay(macroPath + "SaveListing.iim"))
    cnt = cnt + 1
Loop   

Function ExtractListing(ByRef cnt)
    ExtractListing = True
    
    CheckErr(im.iimSet("Cnt", cnt))
    ret = im.iimPlay(macroPath + "ExtractListing.iim")
    
    If ret = -1300 Then ' Couldn't find any more listings on this page
        ' Attempt to navigate to the next page of results
        ret = im.iimPlay(macroPath + "NextPage.iim")

        If ret = -1300 Then ' Couldn't find a Next Page link/button, so we must be done.
            ExtractListing = False
            Exit Function
        End If

        CheckErr(ret) ' Check for other errors attempting to click next button
        
        ' Extract the first listing on the new page
        cnt = 1
        ExtractListing = ExtractListing(cnt)
    End If
    
    CheckErr(ret) ' Check for other errors during extracting
End Function

Sub CheckErr(retCode)

    If retCode < 0 Then
        MsgBox im.iimGetLastError(), vbCritical, "Macro Error: " & retCode
        WScript.Quit()
    End If
    
End Sub
QryYP.iim:

Code: Select all

URL GOTO=http://www.yellowpages.com.au/search/listings?clue=thai+restaurant&locationClue=NSW&x=52&y=8
ExtractListing.iim:

Code: Select all

SET !TIMEOUT_STEP 1

' Set the end of the search boundary for relative positioning
TAG POS={{Cnt}} TYPE=UL ATTR=CLASS:primaryListingLinks
SET !ENDOFPAGE {{!TAGSOURCEINDEX}}

' Extract the company name
TAG POS={{Cnt}} TYPE=SPAN ATTR=ID:listing-name* EXTRACT=TXT

' Extract the address
TAG POS=R1 TYPE=SPAN ATTR=CLASS:address EXTRACT=TXT

' Extract the phone number
TAG POS=R1 TYPE=SPAN ATTR=CLASS:phoneNumber EXTRACT=TXT
SaveListing.iim:

Code: Select all

SET !EXTRACT {{field1}}
ADD !EXTRACT {{field2}}
ADD !EXTRACT {{field3}}

SAVEAS TYPE=EXTRACT FOLDER={{outputFolder}} FILE={{outputFile}}
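A hedged sketch of how a fourth value, such as the "website" link asked about earlier in this thread, might be carried through the same files. It assumes the link is an A element whose text matches "website" and that it sits after the phone number but before the UL used as the !ENDOFPAGE boundary. Neither assumption is confirmed in this thread, so the TAG line is hypothetical. MAX_EXTRACTED_ITEMS in YellowPagesAu.vbs would also have to change from 3 to 4, since it must match the number of EXTRACTs.

Code: Select all

' ExtractListing.iim: hypothetical fourth extract, appended after the phone number
TAG POS=R1 TYPE=A ATTR=TXT:website EXTRACT=HREF

' SaveListing.iim: write the extra field alongside the first three
SET !EXTRACT {{field1}}
ADD !EXTRACT {{field2}}
ADD !EXTRACT {{field3}}
ADD !EXTRACT {{field4}}

SAVEAS TYPE=EXTRACT FOLDER={{outputFolder}} FILE={{outputFile}}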
NextPage.iim:

Code: Select all

SET !TIMEOUT_STEP 1
TAG POS=1 TYPE=IMG ATTR=SRC:http://www.yellowpages.com.au/r/28.2/ui/standard/right.gif

' Wait for next page of results to display
WAIT SECONDS=2
SET !TIMEOUT_STEP 60
TAG POS=1 TYPE=UL ATTR=CLASS:primaryListingLinks
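The image URL in NextPage.iim contains what looks like a version segment (28.2), so the TAG may stop matching if the site changes that path. A possible variant, assuming the arrow image keeps the file name right.gif and that no other image on the page ends with the same path, uses the * wildcard in place of the full URL:

Code: Select all

' Match the next-page arrow by the end of its path rather than the full URL
TAG POS=1 TYPE=IMG ATTR=SRC:*/ui/standard/right.gif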
Regards,

Tom, iMacros Support
maxdoldan
Posts: 9
Joined: Wed Jun 16, 2010 5:33 pm

Re: Yellow Pages macro Difficulty

Post by maxdoldan » Fri Oct 08, 2010 5:46 pm

Hello Tom, I have migrated to version 7 and am trying to use the code you posted for the AU YP website, but it seems to use different positioning from the US version of the site. I tried modifying it but haven't been able to get it to work.
Could you let me know how to set the anchor for the relative positioning? I am running v7.04.0903.
Best regards.
Tom, Tech Support
Posts: 3415
Joined: Mon May 31, 2010 4:59 pm

Re: Yellow Pages macro Difficulty

Post by Tom, Tech Support » Fri Oct 08, 2010 8:00 pm

Here you go Max!
YellowPagesUS.zip
(2.01 KiB) Downloaded 1696 times
Regards,

Tom, iMacros Support
secret
Posts: 3
Joined: Mon Oct 04, 2010 10:00 am

Re: Yellow Pages macro Difficulty

Post by secret » Fri Oct 08, 2010 9:07 pm

Hello Tom,

Thanks a ton for your great help, you rescued my life. :)

Thank you really very much!

Have a beautiful day,

secret
LANWrench
Posts: 2
Joined: Mon Dec 13, 2010 5:39 am

Re: Yellow Pages macro Difficulty

Post by LANWrench » Mon Dec 13, 2010 5:44 am

I am evaluating the product and trying to get this code to work on yellowpages.com, but I am having issues.

I have even tried making my own but I think that will have to wait lol. Need to learn more.

So has anything changed at yellowpages.com that would cause this script not to work? I keep getting error 1300 and I assume that has something to do with the POS value?

FYI, I am trying this on the US yellowpages.com and used the script posted a couple posts up to do this.

Thanks in advance.
Tom, Tech Support
Posts: 3415
Joined: Mon May 31, 2010 4:59 pm

Re: Yellow Pages macro Difficulty

Post by Tom, Tech Support » Thu Dec 16, 2010 6:19 pm

Hello LANWrench,

I am not experiencing any difficulties with the US yellowpages script when run as-is. Can you explain when/where you are receiving that error and if you made any changes to the script or macros, including the query URL?
Regards,

Tom, iMacros Support
LANWrench
Posts: 2
Joined: Mon Dec 13, 2010 5:39 am

Re: Yellow Pages macro Difficulty

Post by LANWrench » Tue Jan 04, 2011 5:47 pm

I did in fact change the query URL.

If the query URL is changed, what do I need to do to make it work?
Tom, Tech Support
Posts: 3415
Joined: Mon May 31, 2010 4:59 pm

Re: Yellow Pages macro Difficulty

Post by Tom, Tech Support » Wed Jan 05, 2011 9:25 am

LANWrench,

If all you changed was the query string, you shouldn't have to change anything else to get it to work. I was just wondering if you had tried the script with the original query URL provided in the zip package, because that is what I recently tested it with and it was working.

Perhaps you can post the query URL you used and I can try it with that.
Regards,

Tom, iMacros Support
maxdoldan
Posts: 9
Joined: Wed Jun 16, 2010 5:33 pm

Re: Yellow Pages macro Difficulty

Post by maxdoldan » Fri Jul 29, 2011 6:59 pm

Greetings all. Tom, I've had no problems with the Yellow Pages listings, but now I'm trying to migrate the code to this site:
http://www.healthgrades.com/provider-se ... kin+cancer

For some reason it is not looping with the relative POS. I think I have an anchor problem, because the site uses different names for its buttons and properties. I've modified the ones I found, but something is still missing.
I couldn't find anything else to try, which is why I'm asking for your help.
There is also a more general issue with this type of macro: websites now show advertising popups on screen, and both YP and HG have them. Does that affect the scraping? Is there a way to click them away with code?
Best regards to everyone.
Max
Attachments
Healthgrades.rar
HealthGrades project files
(1.96 KiB) Downloaded 469 times
Tom, Tech Support
Posts: 3415
Joined: Mon May 31, 2010 4:59 pm

Re: Yellow Pages macro Difficulty

Post by Tom, Tech Support » Tue Aug 02, 2011 1:07 pm

Hello maxdoldan,

I replied to your support ticket.
Regards,

Tom, iMacros Support
dotnetpdr
Posts: 1
Joined: Mon Nov 28, 2011 7:56 pm

Re: Yellow Pages macro Difficulty

Post by dotnetpdr » Mon Nov 28, 2011 8:01 pm

Tom,

I now have this code in version 7.51.1734. How do I run it?
I made a few changes to it and am trying to get it working for the websites I want to use it with.

Please help,
Prashant :roll: