Cannot extract from expedia

by IamMacros on Thu Oct 02, 2008 3:21 pm

Hello,

I am trying to collect flight prices on Expedia, say from London to Tokyo, for different dates.
The page from which I am trying to extract the minimum price for a direct or one-stop flight is here:
http://www.expedia.co.uk/pub/agent.dll? ... &eapid=0-3

I cannot get iMacros to extract the prices properly; it often extracts just random messages from the same page. Could you please let me know how I can fix it?
Thanks in advance.

Code:
VERSION BUILD=6070918 RECORDER=FX
TAB T=1
URL GOTO=http://www.expedia.co.uk/
TAG POS=1 TYPE=INPUT:RADIO FORM=ID:bunwiz ATTR=ID:rfli
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:bunwiz ATTR=ID:fret CONTENT=Tokio
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:bunwiz ATTR=ID:fdepdt CONTENT=22/10/2008
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:bunwiz ATTR=ID:fretdt CONTENT=08/11/2008
TAG POS=1 TYPE=SELECT FORM=NAME:bunwiz ATTR=ID:rad1 CONTENT=%2
TAG POS=1 TYPE=INPUT:SUBMIT FORM=ID:bunwiz ATTR=ID:sublink
'TAG POS=5 TYPE=B ATTR=TXT:* EXTRACT=TXT 
'TAG POS=10 TYPE=B ATTR=TXT:* EXTRACT=TXT 

Re: Cannot extract from expedia

by Tech Support on Fri Oct 03, 2008 4:04 am

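Please use relative positioning. In this case I think the airplane image can serve as a stable anchor, because the price is always directly to the right of it. With POS=R1, the second TAG command picks the first matching element that follows the anchor instead of counting from the top of the page.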
airplane anchor.png

This macro works:
Code:
URL GOTO=http://www.expedia.co.uk/pub/agent.dll?qscr=fexp&flag=q&city1=London+%28LON+%2D+All+Airports%29&citd1=Tokio&date1=25/10/2008&time1=362&date2=31/10/2008&time2=362&cAdu=1&cSen=&cChi=&cInf=&infs=1&tktt=&trpt=2&ecrc=&eccn=&qryt=1&load=1&rdct=1&rfrr=-13018&eapid=0-3
TAG POS=1 TYPE=IMG ATTR=SRC:http://www.expedia.co.uk/eta/fltitin.gif
TAG POS=R1 TYPE=B ATTR=TXT:* EXTRACT=TXT
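
To illustrate why the anchor helps, here is a minimal Python sketch of the same idea outside iMacros. It is only a simulation: the HTML snippet is hypothetical (not Expedia's real markup), and the names RelativeExtractor and extract_prices are made up for this example. It takes the first <b> element after each airplane image and ignores every other <b> on the page.

```python
from html.parser import HTMLParser

# Hypothetical markup, NOT Expedia's real page: each result row has the
# airplane icon followed by the price in a <b> tag, with unrelated <b>
# messages scattered in between.
SAMPLE_HTML = """
<b>Special offer!</b>
<div><img src="/eta/fltitin.gif"><b>GBP 512</b></div>
<b>Only 2 seats left</b>
<div><img src="/eta/fltitin.gif"><b>GBP 547</b></div>
"""


class RelativeExtractor(HTMLParser):
    """Collect the text of the first <b> that follows each anchor image."""

    def __init__(self, anchor_src):
        super().__init__()
        self.anchor_src = anchor_src
        self.armed = False      # True after an anchor, until the next <b>
        self.capturing = False  # True while inside the <b> being extracted
        self.results = []

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src") or ""
        if tag == "img" and src.endswith(self.anchor_src):
            self.armed = True
        elif tag == "b" and self.armed:
            self.armed = False
            self.capturing = True
            self.results.append("")

    def handle_data(self, data):
        if self.capturing:
            self.results[-1] += data

    def handle_endtag(self, tag):
        if tag == "b":
            self.capturing = False


def extract_prices(html, anchor_src="fltitin.gif"):
    """Return the text of the first <b> after every anchor image."""
    parser = RelativeExtractor(anchor_src)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]
```

Counting <b> elements from the top of this sample (as POS=5 or POS=10 does on the real page) would also hit the "Special offer!" and "Only 2 seats left" messages, and the count shifts whenever such a message appears or disappears. Anchoring on the image avoids that.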

If you need all prices on the page, step the POS attribute of the airplane image from 1 to 2, 3, 4, ... by setting the {{myloop}} variable from the Scripting Interface:

Code:
TAG POS={{myloop}} TYPE=IMG ATTR=SRC:http://www.expedia.co.uk/eta/fltitin.gif
TAG POS=R1 TYPE=B ATTR=TXT:* EXTRACT=TXT
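
As a rough sketch of what the {{myloop}} placeholder amounts to, the Python below expands the macro text for positions 1, 2, 3, ... It only builds the text; actually playing each macro and reading back the extracted value is done through the Scripting Interface and is not shown here. The helper names expand_macro and macros_for_positions are hypothetical.

```python
import re

# The looped macro from the post, with the {{myloop}} placeholder intact.
LOOP_MACRO = (
    "TAG POS={{myloop}} TYPE=IMG ATTR=SRC:http://www.expedia.co.uk/eta/fltitin.gif\n"
    "TAG POS=R1 TYPE=B ATTR=TXT:* EXTRACT=TXT\n"
)


def expand_macro(macro, variables):
    """Replace every {{name}} placeholder with the matching value."""
    def substitute(match):
        return str(variables[match.group(1)])
    return re.sub(r"\{\{(\w+)\}\}", substitute, macro)


def macros_for_positions(count):
    """Build one expanded macro per anchor position, from 1 up to count."""
    return [
        expand_macro(LOOP_MACRO, {"myloop": pos})
        for pos in range(1, count + 1)
    ]


if __name__ == "__main__":
    for text in macros_for_positions(3):
        print(text)
```

Only the anchor line changes between iterations; the POS=R1 line stays the same because it is always relative to whichever airplane image was tagged last.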

Re: Cannot extract from expedia

by IamMacros on Sun Oct 05, 2008 12:28 pm

Thanks a lot!
