Howto handle AJAX?

by nevermind on Mon Apr 17, 2006 4:13 pm

Hello,

I'm pretty new to iOpus and am testing the demo at the moment.

My current problem:
AJAX is increasingly used in today's web applications.
This means that filling out a form and submitting it no longer requires a page reload; new content can be displayed without reloading the whole site. Google, for example, already uses it.
Those who don't know what I mean can have a look at Wikipedia.
Here is an online example: http://onlyonline.co.uk/simpleajaxexample/showtime.html

This is my problem at the moment, too.
I'm filling out a form, and the information I want to extract is on the result page. Because the page isn't really reloaded, I can't simply save it: iMacros does not wait for the result content to appear within the page. My current solution is to set WAIT SECONDS=10, which is unsatisfactory. When the page needs just 2 seconds to load, it's a waste of time, and when it needs 20 seconds, I get no result page information at all.

Is there a safe and time-saving solution?

Thanks in advance.


PS: The application seems to be really good! :)

AJAX

by Tech Support on Tue Apr 18, 2006 2:02 pm

AJAX-based websites are no problem for iMacros.

Here are two possible solutions:

(1) Wait for the text to appear. In our case we wait for the words "Time now", which indicate that the page was updated. A simple TAG command is sufficient for this: if a TAG command cannot find its attribute on the web page, it goes into "LoadCheck" mode and waits up to 6 seconds (1/10 of the TIMEOUT value) for the words "Time now" to appear. Once the words are found on the page, the macro continues and extracts the time.

This demo macro shows the solution:

VERSION BUILD=5100314
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://onlyonline.co.uk/simpleajaxexample/showtime.html
TAG POS=1 TYPE=INPUT:BUTTON FORM=NAME:NoFormName ATTR=NAME:&&VALUE:Click*
TAG POS=1 TYPE=H2 ATTR=TXT:*Time<SP>now*
EXTRACT POS=1 TYPE=TXT ATTR=<H2<SP>id=lbtime>*

(2) Wait for an image to change or appear (in this example, the image of the words "Time now"). This can be done with the IMAGESEARCH command, which waits until it "sees" that the new data has arrived, just as a human user would. For more information about IMAGESEARCH, please see http://www.iopus.com/imacros/help/ir_fl ... xample.htm


Related forum post: Performance testing AJAX
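
The waiting behaviour described in solution (1) can be pictured as a simple polling loop. The Python below is only an illustration of that logic, not iMacros code and not how iMacros is implemented internally: `wait_for_text`, `get_page_text`, `page_timeout` and `poll_interval` are made-up names, and the 1/10 ratio is taken from the description above.

```python
import time
from typing import Callable


def wait_for_text(get_page_text: Callable[[], str], needle: str,
                  page_timeout: float = 60.0,
                  poll_interval: float = 0.5) -> bool:
    """Poll the page text until `needle` appears or the step timeout expires.

    `get_page_text` is a hypothetical callable returning the current page text.
    """
    # Assumption from the post above: the element wait is 1/10 of the
    # page timeout, i.e. 6 seconds for a 60 second timeout.
    step_timeout = page_timeout / 10
    deadline = time.monotonic() + step_timeout
    while True:
        if needle in get_page_text():
            return True   # text arrived, the macro would continue here
        if time.monotonic() >= deadline:
            return False  # gave up, the step would report an error
        time.sleep(poll_interval)
```

The point of the loop is that the wait ends as soon as the text arrives, so a fast response costs almost nothing, unlike a fixed WAIT SECONDS=10.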

updated TAG command?

by jstatham on Wed Jul 25, 2007 2:09 pm

I am working on a similar project, but I'm trying to extract a DIV (instead of an H2) that is initially blank. I can only get it to work using the WAIT command (not a big deal). However, since EXTRACT is no longer supported as of version 6.0, how do I get this to work with the TAG command?

TAG POS=1 TYPE=TXT ATTR=<div<SP>id=result>* EXTRACT=HTM

does not work.

by Tech Support on Wed Jul 25, 2007 4:31 pm

Version 6 still supports the EXTRACT command and the old V5 syntax inside the iMacros browser and IE.

However, we recommend using the new V6 syntax, which is easier to use, more powerful and works in the iMacros browser, IE and Firefox.

To answer your question, the correct syntax for the AJAX example is:
TAG POS=1 TYPE=H2 ATTR=ID:lbtime EXTRACT=TXT

So the complete macro looks like this:

VERSION BUILD=6000714
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://onlyonline.co.uk/simpleajaxexample/showtime.html
TAG POS=1 TYPE=INPUT:BUTTON FORM=NAME:NoFormName ATTR=NAME:&&VALUE:Click*
TAG POS=1 TYPE=H2 ATTR=TXT:*Time<SP>now*
TAG POS=1 TYPE=H2 ATTR=ID:lbtime EXTRACT=TXT


As you can see, the new TAG...EXTRACT command has the same syntax as the standard TAG command, which makes it easy to create data extraction commands.

In your case, I assume the TAG/EXTRACT command will look like this:

TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=TXT

If this does not work, please post a link or the complete HTML code. Or simply try to record a TAG command for this element. What does the recorded line look like?

by jstatham on Thu Jul 26, 2007 7:20 am

Yes, I've found that v6 works great with EXTRACT:

WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*

The above works; however, to eliminate the wait and keep up with the new methods, I want to use TAG:

TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=HTM

but it doesn't work. It goes to "Loading: x (174)s" which is apparently what it does when it can't find the tag it's looking for.

But hey, EXTRACT works, so I'm not going to fight it. However, I do have another problem, which shows up once the code is looped, that you may be able to help with.

I'm trying to scrape the main table after the dropdown is selected and the submit button is hit. My Macro is:

VERSION BUILD=6000524
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://www.shipmentlink.com/servlet/MSQ1_TariffController.do?func=RuleSearch
SIZE X=1092 Y=870
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=4
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
'TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=HTM
SAVEAS TYPE=EXTRACT FOLDER=c:\data FILE=evrgrn.htm
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=5
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
SAVEAS TYPE=EXTRACT FOLDER=c:\data FILE=evrgrn2.htm

The first page saves as expected, but for the second page it shows "Loading x (60)s" on step 11 before moving on. How do I avoid this 60s delay for each subsequent page?

by Tech Support on Mon Jul 30, 2007 7:46 am

The web browser does not receive a "web page complete" signal from the server for the second part of the web macro. As a workaround, please insert a page refresh command:

VERSION BUILD=6000524
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://www.shipmentlink.com/servlet/MSQ1_TariffController.do?func=RuleSearch
SIZE X=1092 Y=870
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=4
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
'TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=HTM
SAVEAS TYPE=EXTRACT FOLDER=c:\ FILE=evrgrn.htm
REFRESH
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=5
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
SAVEAS TYPE=EXTRACT FOLDER=c:\ FILE=evrgrn2.htm

Re: AJAX

by iMark on Sun Sep 19, 2010 3:33 am

Tech Support wrote:AJAX based websites are no problem for iMacros:



It doesn't work.
The script simply skips the line TAG POS=1 TYPE=H2 ATTR=TXT:*Time<SP>now* without waiting for the timeout.
I have set ERRORIGNORE to YES because I don't want the script to stop when an error occurs.

Re: Howto handle AJAX?

by Tech Support on Wed Sep 22, 2010 3:42 am

I see that the old AJAX demo site is no longer available. What site did you use to test?

Can you please post the URL of the web page and/or the macro that creates the problem?

Re: Howto handle AJAX?

by fwg on Mon Jun 25, 2012 8:25 am

What happens if there is no way to know what the new content will be but it will always appear in the same place?
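
One general approach when the new content is unknown but its location is fixed is to read that location before triggering the request, then poll until what you read differs from that snapshot. Here is a minimal Python sketch of the idea. It assumes a hypothetical `read_region` function that returns the current text of that spot (for example, the result of an extraction step). None of these names are part of iMacros.

```python
import time
from typing import Callable, Optional


def wait_for_change(read_region: Callable[[], str], before: str,
                    timeout: float = 6.0,
                    poll_interval: float = 0.5) -> Optional[str]:
    """Return the region's new content once it differs from `before`.

    Returns None if nothing changed within `timeout` seconds.
    `read_region` is a hypothetical callable that reads the fixed location.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = read_region()
        if current != before:
            return current  # content was replaced, whatever it now is
        time.sleep(poll_interval)
    return None
```

The snapshot has to be taken before the button is clicked: `before = read_region()`, then trigger the request, then call `wait_for_change(read_region, before)`. Note that this cannot detect a response that happens to be identical to the old content.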

Re: Howto handle AJAX?

by pedrocalvin on Wed Dec 05, 2012 8:38 am

Hello,

Thank you for all the information posted here. It has been very useful for me.

However, I am facing a different problem at the moment. The web page I am writing a macro for has an AJAX component that can take 3, 10 or 20+ minutes to process (the request goes into a queue, so the waiting time is quite random).

There is a string I can use for identifying when the process has been completed, which is:
TAG POS=1 TYPE=SPAN ATTR=TXT:*Finished*

However, this won't work well because the test times out after 6 seconds. As 6 seconds is not enough to process my AJAX request, it will always fail and stop the macro.

In case this test fails, is there any way to keep repeating it until it succeeds?

Thank you,
Pedro
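
The "keep repeating it until it succeeds" logic asked about here can be sketched as an outer retry loop around the short test. The Python below only illustrates that logic under stated assumptions: `check` is a hypothetical callable that runs the "Finished" test once and returns True or False, and `retry_until_finished`, `max_wait` and `pause` are made-up names, not iMacros features.

```python
import time
from typing import Callable


def retry_until_finished(check: Callable[[], bool],
                         max_wait: float = 30 * 60,
                         pause: float = 10.0) -> int:
    """Repeat `check` until it succeeds and return the number of attempts.

    Raises TimeoutError if another attempt would exceed `max_wait` seconds.
    """
    deadline = time.monotonic() + max_wait
    attempts = 0
    while True:
        attempts += 1
        if check():
            return attempts
        # Stop if pausing again would take us past the overall deadline.
        if time.monotonic() + pause > deadline:
            raise TimeoutError(f"not finished after {attempts} attempts")
        time.sleep(pause)
```

The overall limit (30 minutes in this sketch) matters: without it, a request that never finishes would keep the script looping forever.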

