Extracting Price Data


by rharris on Tue Oct 11, 2005 10:08 am

I've been using Internet Macros for about a month to extract data from various websites. I am currently working on a macro that visits movie theatre websites to extract pricing data. For illustration I will use Fandango.com. To get the pricing data I would create four different macros and run them independently:

a. Visit Fandango by looping over ZIP codes and save the pages.

b. After extracting the number of pages for each ZIP code, I would create a new CSV file with ZIP codes and page numbers (1, 2, 3, etc.). Using the newly created CSV file, I would loop over the ZIP/page combinations, again saving the pages.

c. I would extract the theatre names from the saved pages and create a third CSV file containing ZIP/page/theatre information.

d. I would loop over the new CSV file and save the pages that contain the pricing info (box office information).

This method will work, but it is fairly cumbersome. I would appreciate any insight on my naive approach.

Thanks!
rharris
 

by Tech Support on Tue Oct 11, 2005 2:32 pm

The steps themselves are OK, but with the Scripting Edition you can combine steps a-d into one loop over all the ZIP codes.

The main "trick" is to look for a "NEXT" link on each page. If there is such a link, you click on it and move to the next page. If there is no such link, the script knows that it is done with this specific ZIP code. While you do this, you can extract the pricing information at the same time.
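To make the "NEXT" link idea concrete, here is a minimal sketch in Python rather than in an actual Scripting Edition script. It does not use any iMacros commands: `Page`, `load_page`, `extract_all_pages` and the `FAKE_SITE` data are all hypothetical names and made-up sample values, and the page loader is an in-memory stand-in for whatever really fetches the page and extracts from it.

```python
from typing import Callable, Dict, List, NamedTuple, Optional


class Page(NamedTuple):
    """Hypothetical result of loading one results page."""
    prices: List[str]         # pricing rows extracted from this page
    next_url: Optional[str]   # target of the "NEXT" link, or None if absent


def extract_all_pages(
    start_url: str,
    load_page: Callable[[str], Page],
    max_pages: int = 100,
) -> List[str]:
    """Follow "NEXT" links from start_url, collecting prices on the way.

    The loop stops when a page has no "NEXT" link. max_pages is only a
    safety cap so a site that links in a circle cannot loop forever.
    """
    results: List[str] = []
    url: Optional[str] = start_url
    pages_seen = 0
    while url is not None and pages_seen < max_pages:
        page = load_page(url)
        results.extend(page.prices)   # extract while paging
        url = page.next_url           # None means this ZIP code is done
        pages_seen += 1
    return results


# Made-up sample data standing in for a real website (assumption).
FAKE_SITE: Dict[str, Page] = {
    "zip=10001&page=1": Page(["Theatre A: 9.50"], "zip=10001&page=2"),
    "zip=10001&page=2": Page(["Theatre B: 10.00"], None),
}

if __name__ == "__main__":
    print(extract_all_pages("zip=10001&page=1", FAKE_SITE.__getitem__))
```

Because the loader is passed in as a function, the same loop would work unchanged if `load_page` were replaced by a call that drives a real browser.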

Please see http://www.iopus.com/iim/help/faq_extract_pages.htm for an example.

This way you have one input file (ZIP codes) and one output file with the pricing information.
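As a rough illustration of the "one input file, one output file" layout, the following Python sketch reads ZIP codes from a CSV file and writes every extracted price to a single CSV file. It is not Scripting Edition code. The function names `run_all_zips` and `fake_scrape`, the column names, and the sample values are assumptions made up for this example, and `scrape_zip` stands in for whatever routine does the extraction for one ZIP code.

```python
import csv
import io
from typing import Callable, Iterable, List, TextIO, Tuple


def run_all_zips(
    zip_file: TextIO,
    out_file: TextIO,
    scrape_zip: Callable[[str], Iterable[Tuple[str, str]]],
) -> int:
    """Read one ZIP code per row, write one output row per price found.

    Returns the number of data rows written (the header is not counted).
    """
    reader = csv.reader(zip_file)
    writer = csv.writer(out_file)
    writer.writerow(["zip", "theatre", "price"])  # assumed column names
    rows_written = 0
    for record in reader:
        if not record or not record[0].strip():
            continue  # skip blank lines in the input file
        zip_code = record[0].strip()
        for theatre, price in scrape_zip(zip_code):
            writer.writerow([zip_code, theatre, price])
            rows_written += 1
    return rows_written


def fake_scrape(zip_code: str) -> List[Tuple[str, str]]:
    """Stand-in scraper returning made-up data for a single ZIP code."""
    if zip_code == "10001":
        return [("Theatre A", "9.50"), ("Theatre B", "10.00")]
    return []


if __name__ == "__main__":
    zips = io.StringIO("10001\n90210\n")  # in-memory stand-in for the input file
    out = io.StringIO()                   # in-memory stand-in for the output file
    print(run_all_zips(zips, out, fake_scrape))
    print(out.getvalue())
```

With real files you would pass objects opened with `open(..., newline="")` instead of the `io.StringIO` stand-ins, which is what the `csv` module documentation recommends.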

