Incrementing Numbers in URLs

Post by jfj3rd » Tue Jun 08, 2010 7:32 pm

Hello everyone,

I've been working on getting Excel to pull data from a website, as shown in http://forum.imacros.net/_uploads/ExcelVBA-Part2.htm and the tutorials around it.

I've hit a roadblock. I understand how to do the data scraping, but I can't work out how to tell the script to go to the next page and scrape the same data there. Example:

http://www.something.com/1
http://www.something.com/2
http://www.something.com/3
http://www.something.com/4
...

I have tried everything I can think of, and everything I've come across on the Wiki and the forum, and the script still either breaks or repeats the scrape for http://www.something.com/1 without ever incrementing.

If it helps, I'm having the VBS in Excel load the Firefox iMacros plugin to run the macro.

Thanks in advance for any feedback. Even a pointer in the right direction or suggestions, rather than actual code, would be helpful, as I'm at a loss after wasting a few hours this morning.

- John
--

Warmest Wishes,

John F. Jones III

Re: Incrementing Numbers in URLs

Post by billbell52 » Tue Jun 08, 2010 8:02 pm

Can you post the code? It's hard to help without the details.

Re: Incrementing Numbers in URLs

Post by jfj3rd » Tue Jun 08, 2010 8:26 pm

Here is the current code I am working with, though I've probably butchered it a little throughout the day.

VBS Script

Code:

Private Sub CommandButton1_Click()

Dim iim1, iret, row, totalrows
Dim STATUS, SITETYPE, SITEID, WEBSITE, CUSTOMERNUMBER, HOSTDATE, CANCELDATE, SALEDATE, WSFU, FIRSTNAME, MIDDLENAME, LASTNAME, COMPANYNAME, ADDRESS, CITY, STATE, ZIP, EMAIL, DIRECTPHONE, DIRECTPHONEEXT, OFFICEPHONE, OFFICEPHONEEXT, FAX, CELLPHONE, CELLPHONEEXT, HOMEPHONE, HOMEPHONEEXT, PAGERPHONE, PAGERPHONEEXT, TOLLFREEPHONE, TOLLFREEPHONEEXT, COUNTRY

Set iim1 = CreateObject("imacros")
iret = iim1.iimInit("-fx", False, "", "", "", 150)

iret = iim1.iimInit
iret = iim1.iimDisplay("Submitting Data from Excel")

' --- Pass CRMID from Excel to the first iMacro and play that iMacro ---
row = 2
' Set the variable
iret = iim1.iimSet("-var_CRMID", Cells(row, 1).Value)
' Set the Display
iret = iim1.iimDisplay("Row# " + CStr(row))
' Run the Macro
iret = iim1.iimPlay("Scrape")

If iret < 0 Then
    MsgBox iim1.iimGetLastError()
End If

Cells(row, 2).Value = iim1.iimGetLastExtract(0)
Cells(row, 2).Value = iim1.iimGetLastExtract(1)
Cells(row, 2).Value = iim1.iimGetLastExtract(2)
Cells(row, 2).Value = iim1.iimGetLastExtract(3)
Cells(row, 2).Value = iim1.iimGetLastExtract(4)
Cells(row, 2).Value = iim1.iimGetLastExtract(5)
Cells(row, 2).Value = iim1.iimGetLastExtract(6)
Cells(row, 2).Value = iim1.iimGetLastExtract(7)
Cells(row, 2).Value = iim1.iimGetLastExtract(8)
Cells(row, 2).Value = iim1.iimGetLastExtract(9)
Cells(row, 2).Value = iim1.iimGetLastExtract(10)
Cells(row, 2).Value = iim1.iimGetLastExtract(11)
Cells(row, 2).Value = iim1.iimGetLastExtract(12)
Cells(row, 2).Value = iim1.iimGetLastExtract(13)
Cells(row, 2).Value = iim1.iimGetLastExtract(14)
Cells(row, 2).Value = iim1.iimGetLastExtract(15)
Cells(row, 2).Value = iim1.iimGetLastExtract(16)
Cells(row, 2).Value = iim1.iimGetLastExtract(17)
Cells(row, 2).Value = iim1.iimGetLastExtract(18)
Cells(row, 2).Value = iim1.iimGetLastExtract(19)
Cells(row, 2).Value = iim1.iimGetLastExtract(20)
Cells(row, 2).Value = iim1.iimGetLastExtract(21)
Cells(row, 2).Value = iim1.iimGetLastExtract(22)
Cells(row, 2).Value = iim1.iimGetLastExtract(23)
Cells(row, 2).Value = iim1.iimGetLastExtract(24)
Cells(row, 2).Value = iim1.iimGetLastExtract(25)
Cells(row, 2).Value = iim1.iimGetLastExtract(26)
Cells(row, 2).Value = iim1.iimGetLastExtract(27)
Cells(row, 2).Value = iim1.iimGetLastExtract(28)
Cells(row, 2).Value = iim1.iimGetLastExtract(29)
Cells(row, 2).Value = iim1.iimGetLastExtract(30)
Cells(row, 2).Value = iim1.iimGetLastExtract(31)

iret = iim1.iimSet("-var_STATUS", STATUS)
iret = iim1.iimSet("-var_SITETYPE", SITETYPE)
iret = iim1.iimSet("-var_SITEID", SITEID)
iret = iim1.iimSet("-var_WEBSITE", WEBSITE)
iret = iim1.iimSet("-var_CUSTOMERNUMBER", CUSTOMERNUMBER)
iret = iim1.iimSet("-var_HOSTDATE", HOSTDATE)
iret = iim1.iimSet("-var_CANCELDATE", CANCELDATE)
iret = iim1.iimSet("-var_SALEDATE", SALEDATE)
iret = iim1.iimSet("-var_WSFU", WSFU)
iret = iim1.iimSet("-var_FIRSTNAME", FIRSTNAME)
iret = iim1.iimSet("-var_MIDDLENAME", MIDDLENAME)
iret = iim1.iimSet("-var_LASTNAME", LASTNAME)
iret = iim1.iimSet("-var_COMPANYNAME", COMPANYNAME)
iret = iim1.iimSet("-var_ADDRESS", ADDRESS)
iret = iim1.iimSet("-var_CITY", CITY)
iret = iim1.iimSet("-var_STATE", STATE)
iret = iim1.iimSet("-var_ZIP", ZIP)
iret = iim1.iimSet("-var_EMAIL", EMAIL)
iret = iim1.iimSet("-var_DIRECTPHONE", DIRECTPHONE)
iret = iim1.iimSet("-var_DIRECTPHONEEXT", DIRECTPHONEEXT)
iret = iim1.iimSet("-var_OFFICEPHONE", OFFICEPHONE)
iret = iim1.iimSet("-var_OFFICEPHONEEXT", OFFICEPHONEEXT)
iret = iim1.iimSet("-var_FAX", FAX)
iret = iim1.iimSet("-var_CELLPHONE", CELLPHONE)
iret = iim1.iimSet("-var_CELLPHONEEXT", CELLPHONEEXT)
iret = iim1.iimSet("-var_HOMEPHONE", HOMEPHONE)
iret = iim1.iimSet("-var_HOMEPHONEEXT", HOMEPHONEEXT)
iret = iim1.iimSet("-var_PAGERPHONE", PAGERPHONE)
iret = iim1.iimSet("-var_PAGERPHONEEXT", PAGERPHONEEXT)
iret = iim1.iimSet("-var_TOLLFREEPHONE", TOLLFREEPHONE)
iret = iim1.iimSet("-var_TOLLFREEPHONEEXT", TOLLFREEPHONEEXT)
iret = iim1.iimSet("-var_COUNTRY", COUNTRY)

' Loop through CRMIDs and extract remaining data
totalrows = ActiveSheet.UsedRange.Rows.Count
For row = 3 To totalrows
' Set the variable
iret = iim1.iimSet("-var_CRMID", Cells(row, 1).Value)
' Set the Display
iret = iim1.iimDisplay("Row# " + CStr(row))
' Run the Macro
iret = iim1.iimPlay("Scrape2")
If iret < 0 Then
    MsgBox iim1.iimGetLastError()
End If
' Insert the Extracted Data
Cells(row, 2).Value = iim1.iimGetLastExtract(0)
Next row

iret = iim1.iimDisplay("Data Scrape Complete")
iret = iim1.iimExit

End Sub
Scrape.iim

Code:

VERSION BUILD=6600217 RECORDER=FX
SET !REPLAYSPEED FAST
TAB T=1
' Specify input file (if !COL variables are used, IIM automatically assumes a CSV format of the input file)
'CSV = Comma Separated Values in each line of the file
CMDLINE !DATASOURCE C:\Documents...Desktop\data.csv
'Number of columns in the CSV file. This must be accurate!
SET !DATASOURCE_COLUMNS 15
'Start at line 2 to skip the header in the file
SET !LOOP 2
'Increase the current position in the file with each loop 
SET !DATASOURCE_LINE {{!LOOP}}

URL GOTO=Http://www....com/1
Scrape2.iim

Code:

VERSION BUILD=6600217 RECORDER=FX
SET !REPLAYSPEED FAST


' ---------------- Begin Submission to Hotfrog.com --------------------     

TAG POS=1 TYPE=SELECT FORM=NAME:Form1 ATTR=ID:ddlStatusID EXTRACT=TXT 
TAG POS=1 TYPE=SELECT FORM=NAME:Form1 ATTR=ID:ddlSiteTypeID EXTRACT=TXT 
TAG POS=5 TYPE=TD ATTR=CLASS:itemrow2&&TXT:* EXTRACT=TXT  
TAG POS=7 TYPE=TD ATTR=CLASS:itemrow2&&TXT:* EXTRACT=TXT  
TAG POS=10 TYPE=TD ATTR=CLASS:itemrow2&&TXT:* EXTRACT=TXT  
TAG POS=14 TYPE=TD ATTR=CLASS:itemrow2&&TXT:* EXTRACT=TXT  
TAG POS=18 TYPE=TD ATTR=CLASS:itemrow2 EXTRACT=TXT  
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtSaleDate EXTRACT=TXT 
TAG POS=19 TYPE=TD ATTR=CLASS:itemrow2&&TXT:* EXTRACT=TXT  
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtFirstName EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtMiddleName EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtLastName EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtCompanyName EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtAddress EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtCity EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtState EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtZip EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtEmail EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtDirectPhone EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtDirectPhoneExt EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtOfficePhone EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtOfficePhoneExt EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtFax EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtCellPhone EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtCellPhoneExt EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtHomePhone EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtHomePhoneExt EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtPagerPhone EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtPagerPhoneExt EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtTollFreePhone EXTRACT=TXT 
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtTollFreePhoneExt EXTRACT=TXT
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:Form1 ATTR=ID:txtCountry EXTRACT=TXT

Re: Incrementing Numbers in URLs

Post by billbell52 » Tue Jun 08, 2010 11:08 pm

I am confused about what you are trying to do.

1. Scrape.iim does not appear to scrape any data.
2. You have this command many times, so you're putting every value in the same cell and overwriting the previous one:
Cells(row, 2).Value = iim1.iimGetLastExtract(0)
3. You have commands like this, but I do not see a macro that uses STATUS as a variable:
iret = iim1.iimSet("-var_STATUS", STATUS)
4. You create a lot of variables (STATUS, SITETYPE, SITEID, WEBSITE...), yet you never set their values.

I recommend you do this very incrementally. Write a few lines and test it out. Use the VBA debugger and single step. Examine the variables as you execute. You may want to read a book or go to an online tutorial on Excel VBA.
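
Point 2 above could be addressed with a loop that writes each extracted value to its own column. This is only a minimal, untested sketch: WriteExtractsToRow and extractCount are hypothetical names, and it assumes that iimGetLastExtract(n) with n >= 1 returns the n-th value extracted by the last iimPlay call (index 0 returning all values joined together).

```vba
' Hypothetical helper: copy every value extracted by the last iimPlay call
' into consecutive columns of one row, starting at column 2.
' extractCount is an assumption: it must match the number of
' EXTRACT=TXT commands in the macro that was played.
Sub WriteExtractsToRow(iim1 As Object, ByVal row As Long, ByVal extractCount As Long)
    Dim i As Long
    For i = 1 To extractCount
        ' Column 1 holds the CRMID, so extract i goes into column i + 1
        Cells(row, i + 1).Value = iim1.iimGetLastExtract(i)
    Next i
End Sub
```

With the macro body posted earlier, which has 32 TAG ... EXTRACT=TXT lines, the call would be WriteExtractsToRow iim1, row, 32, placed right after the iimPlay call for that row.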

Re: Incrementing Numbers in URLs

Post by jfj3rd » Tue Jun 08, 2010 11:22 pm

Fantastic! Scrape.iim should ALSO have the body of Scrape2.iim, which is what you're asking about. Not sure how that got cut off.

I basically followed the code example at http://forum.imacros.net/_uploads/ExcelVBA-Part2.htm when building most of the VBS section. The data scraped by Scrape.iim is then available through iim1.iimGetLastExtract(0), iim1.iimGetLastExtract(1), and so on.

I know the scraped data needs some work to come out correctly. I hadn't worked on that because I'm still trying to figure out how to increment the URL.

Any thoughts on that specifically?

- John

Re: Incrementing Numbers in URLs

Post by billbell52 » Wed Jun 09, 2010 2:48 am

Here it is. Take baby steps.

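Sub URLSet()

    Dim ws As Worksheet
    Dim xRow As Long
    Dim urlStr As String
    Dim iret

    ' Assumes iim1 was already created and initialized elsewhere
    ' (CreateObject("imacros") + iimInit) and is visible here,
    ' and that sheet URLList has a list of URLs in column 1
    xRow = 1
    Set ws = Worksheets("URLList")
    While ws.Cells(xRow, 1).Value <> ""
        ' Read the URL itself, not the result of the <> "" comparison
        urlStr = ws.Cells(xRow, 1).Value
        iret = iim1.iimSet("-var_URL1", urlStr)
        iret = iim1.iimPlay("GoToURL")
        ' ....
        xRow = xRow + 1
    Wend
End Sub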


GoToURL Macro
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO={{URL1}}
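
If the pages really are numbered 1, 2, 3, ... as in the first post, the URL can also be built from a counter instead of being read from a sheet. The Scrape.iim posted above hard-codes the address ending in /1, which would explain why page 1 is scraped every time. The following is a rough, untested sketch: ScrapeNumberedPages, BASE_URL and LAST_PAGE are made-up names and values, BASE_URL is the placeholder address from the first post, and the iimInit arguments are simply copied from the code posted earlier.

```vba
Sub ScrapeNumberedPages()
    ' Placeholder address from the first post; replace with the real site
    Const BASE_URL As String = "http://www.something.com/"
    ' Assumption: the number of the last page to visit
    Const LAST_PAGE As Long = 4

    Dim iim1 As Object
    Dim iret
    Dim page As Long

    Set iim1 = CreateObject("imacros")
    iret = iim1.iimInit("-fx", False, "", "", "", 150)

    For page = 1 To LAST_PAGE
        ' Build .../1, .../2, ... and pass it to the GoToURL macro as {{URL1}}
        iret = iim1.iimSet("-var_URL1", BASE_URL & CStr(page))
        iret = iim1.iimPlay("GoToURL")
        If iret < 0 Then
            MsgBox iim1.iimGetLastError()
            Exit For
        End If
        ' Play the scraping macro here and copy its extracts into the sheet
    Next page

    iret = iim1.iimExit
End Sub
```

The point of the design is that the counter lives in VBA and a fresh URL is handed to the macro on every pass, so the macro itself never needs to know which page it is on.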