Extracting Price Data

Posts: 1
Joined: Tue Oct 11, 2005 4:40 pm

Post by rharris » Tue Oct 11, 2005 5:08 pm

I've been using Internet Macros for about a month to extract data from various websites. I am currently working on a macro that visits movie theatre websites to extract pricing data. For illustration I will use Fandango.com. To get pricing data I would create four different macros and run them independently:

a. Visit Fandango, looping over zip codes, and save the pages.

b. After extracting the number of pages for each zip code, I would create a new csv file with zips and page numbers (1, 2, 3, etc.). Using the newly created csv, I would loop over the zip/page combinations, again saving pages.

c. I would extract the theatre names from the saved pages and create a third csv file containing zip/page/theatre information.

d. I would loop over the new csv file and save the pages that contain pricing info (box office information).

This method will work but it is fairly cumbersome. I would appreciate any insight on my naive approach.
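To make step b concrete, here is a rough sketch in Python (not iMacros macro code) of how the zip/page csv could be built once the page counts are known. The function names, the column names and the page counts are made up for illustration; real counts would come from the pages saved in step a.

```python
import csv
import io


def expand_zip_pages(page_counts):
    """Turn {zip: number_of_pages} into (zip, page) rows, pages numbered from 1."""
    rows = []
    for zip_code, n_pages in page_counts.items():
        for page in range(1, n_pages + 1):
            rows.append((zip_code, page))
    return rows


def write_rows(rows, handle):
    """Write a header line followed by one csv line per zip/page combination."""
    writer = csv.writer(handle)
    writer.writerow(["zip", "page"])
    writer.writerows(rows)


# Hypothetical page counts, standing in for what step a would extract.
page_counts = {"90210": 2, "10001": 3}
rows = expand_zip_pages(page_counts)

# An in-memory buffer stands in for the second csv file.
buffer = io.StringIO()
write_rows(rows, buffer)
```

The same expansion would be repeated in step c with a theatre column added, which is where the approach starts to feel cumbersome.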

Tech Support
Posts: 4948
Joined: Tue Sep 20, 2005 7:25 pm

Post by Tech Support » Tue Oct 11, 2005 9:32 pm

The steps themselves are OK, but with the Scripting Edition you can combine steps a-d into one loop over all the ZIP codes.

The main "trick" is to look for a "NEXT" link on each page. If there is such a link, you click on it and move to the next page. If there is no such link, the script knows that it is done with this specific ZIP code. While you do this, you can extract the pricing information at the same time.

Please see http://www.iopus.com/iim/help/faq_extract_pages.htm for an example.

This way you have one input file (ZIP codes) and one output file with the pricing information.
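The control flow of that single loop can be sketched as follows. This is plain Python rather than Scripting Edition code, and it only illustrates the idea: FAKE_SITE, fetch_page and collect_prices are hypothetical names, and the theatre names and prices are placeholder data standing in for what a real macro would extract from each page.

```python
import csv
import io

# Hypothetical in-memory stand-in for the website: (zip, page) -> page contents.
# "has_next" plays the role of the "NEXT" link being present on the page.
FAKE_SITE = {
    ("90210", 1): {"prices": [("Theatre A", "9.50")], "has_next": True},
    ("90210", 2): {"prices": [("Theatre B", "8.00")], "has_next": False},
    ("10001", 1): {"prices": [("Theatre C", "11.00")], "has_next": False},
}


def fetch_page(zip_code, page):
    """Placeholder for running the macro that loads one results page."""
    return FAKE_SITE[(zip_code, page)]


def collect_prices(zip_codes, fetch):
    """Loop over ZIP codes, following NEXT until a page has no such link."""
    rows = []
    for zip_code in zip_codes:
        page = 1
        while True:
            result = fetch(zip_code, page)
            # Extract pricing while paging, instead of in a later pass.
            for theatre, price in result["prices"]:
                rows.append((zip_code, theatre, price))
            if not result["has_next"]:
                break  # no NEXT link: done with this ZIP code
            page += 1
    return rows


# One input list (ZIP codes), one output (csv lines with pricing rows).
rows = collect_prices(["90210", "10001"], fetch_page)
output = io.StringIO()
csv.writer(output).writerows(rows)
```

Because the loop stops when the NEXT link is missing, the number of pages per ZIP code never has to be extracted or stored, so the intermediate csv files are no longer needed.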