Howto handle AJAX?

Post by nevermind » Mon Apr 17, 2006 11:13 pm

hello,

I'm pretty new to iOpus and am testing the demo at the moment.

My current problem:
AJAX is increasingly used in today's web applications.
This means that filling out and submitting a form no longer requires a page reload; new content can be displayed without reloading the whole site. Google, for example, already uses it.
Those who don't know what I mean may have a look at Wikipedia.
Here is an online example: http://onlyonline.co.uk/simpleajaxexample/showtime.html

And this is my problem at the moment, too.
I'm filling out a form, and the information I want to extract is on the result page. Because the page is not really reloaded, I can't easily save it: the macro does not wait for the result content to appear within the page. My current solution is to set WAIT SECONDS=10, which is unsatisfactory. When the page needs just 2 seconds to load, the rest is wasted time, and when it needs 20 seconds, I get no result page information at all.

Is there a safe and time-saving solution?

Thanks in advance.


PS: The application seems to be really good! :)

AJAX

Post by Tech Support » Tue Apr 18, 2006 9:02 pm

AJAX based websites are no problem for iMacros:

Here are two possible solutions:

(1) Wait for the text to appear. In our case we wait for the words "Time now" to appear. This indicates that the page was updated. A simple TAG command is sufficient for this: if a TAG command cannot find its attribute on the web page, it goes into "LoadCheck" mode and waits up to 6 seconds (1/10 of the TIMEOUT value) for the words "Time now" to appear. Once the words are found on the page, the macro continues and extracts the time.

This demo macro shows the solution:

VERSION BUILD=5100314
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://onlyonline.co.uk/simpleajaxexample/showtime.html
TAG POS=1 TYPE=INPUT:BUTTON FORM=NAME:NoFormName ATTR=NAME:&&VALUE:Click*
TAG POS=1 TYPE=H2 ATTR=TXT:*Time<SP>now*
EXTRACT POS=1 TYPE=TXT ATTR=<H2<SP>id=lbtime>*

(2) Wait for an image to change or appear (in this example, the image of the words "Time now"). This can be done with the IMAGESEARCH command, which waits until it "sees" that the new data has arrived, just as a human user would. For more information about IMAGESEARCH please see http://www.iopus.com/imacros/help/ir_fl ... xample.htm
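
The waiting behaviour described in solution (1) is essentially a polling loop with a timeout. As a rough illustration only (plain Python, not iMacros code and not a description of how iMacros is implemented internally), the sketch below keeps checking a page-reading callback until the expected text shows up or a deadline passes. The names `wait_for_text` and `read_page` and the 0.5-second polling interval are assumptions made up for this example; the 6-second default simply mirrors the figure quoted above.

```python
import time
from typing import Callable


def wait_for_text(
    read_page: Callable[[], str],
    needle: str,
    timeout: float = 6.0,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll read_page() until needle appears or timeout seconds elapse."""
    deadline = clock() + timeout
    while True:
        if needle in read_page():
            return True  # content arrived: continue immediately
        if clock() >= deadline:
            return False  # gave up after the timeout
        sleep(interval)


if __name__ == "__main__":
    # Hypothetical page whose result only shows up on the third read.
    snapshots = iter(["", "", "<h2 id=lbtime>Time now: 12:00</h2>"])
    found = wait_for_text(lambda: next(snapshots), "Time now", sleep=lambda _: None)
    print(found)
```

Compared with a fixed WAIT, a loop like this returns as soon as the content is there and only spends the full timeout when the content never arrives.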


Related forum post: Performance testing AJAX

updated TAG command?

Post by jstatham » Wed Jul 25, 2007 9:09 pm

I am working on a similar project, but I'm trying to extract a DIV, instead of an H2, that is initially blank. I can only get it to work using the WAIT command (not a big deal). However, since EXTRACT is no longer supported as of version 6.0, how do I get this to work with the TAG command?

TAG POS=1 TYPE=TXT ATTR=<div<SP>id=result>* EXTRACT=HTM

does not work.

Post by Tech Support » Wed Jul 25, 2007 11:31 pm

Version 6 still supports the EXTRACT command and the old V5 syntax inside the iMacros browser and IE.

However, we recommend using the new V6 syntax, which is easier to use, more powerful and works in the iMacros browser, IE and Firefox.

To answer your question, the correct syntax for the AJAX example is:
TAG POS=1 TYPE=H2 ATTR=ID:lbtime EXTRACT=TXT

So the complete macro looks like this:

VERSION BUILD=6000714
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://onlyonline.co.uk/simpleajaxexample/showtime.html
TAG POS=1 TYPE=INPUT:BUTTON FORM=NAME:NoFormName ATTR=NAME:&&VALUE:Click*
TAG POS=1 TYPE=H2 ATTR=TXT:*Time<SP>now*
TAG POS=1 TYPE=H2 ATTR=ID:lbtime EXTRACT=TXT

As you can see, the new TAG...EXTRACT command has the same syntax as the standard TAG command, which makes it easy to create data extraction commands.

In your case, I assume the TAG/EXTRACT command will look like this:

TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=TXT

If this does not work, please post a link or the complete HTML code. Or simply try to record a TAG command for this element. What does the recorded line look like?

Post by jstatham » Thu Jul 26, 2007 2:20 pm

Yes, I've found that v6 works great with EXTRACT:

WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*

The above works; however, to eliminate the wait and keep up with the new methods, I want to use TAG:

TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=HTM

but it doesn't work. It goes to "Loading: x (174)s" which is apparently what it does when it can't find the tag it's looking for.

But hey, the EXTRACT works, so I'm not gonna fight it. However, I do have another problem that shows up once the code is looped, which you may be able to help with.

I'm trying to scrape the main table after the dropdown is selected and the submit button is hit. My Macro is:

VERSION BUILD=6000524
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://www.shipmentlink.com/servlet/MSQ ... RuleSearch
SIZE X=1092 Y=870
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=4
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
'TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=HTM
SAVEAS TYPE=EXTRACT FOLDER=c:\data FILE=evrgrn.htm
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=5
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
SAVEAS TYPE=EXTRACT FOLDER=c:\data FILE=evrgrn2.htm

The first page saves as expected, but for the second page it shows "Loading x (60)s" on step 11 before moving on. How do I avoid this 60s delay for each subsequent page?

Post by Tech Support » Mon Jul 30, 2007 2:46 pm

The web browser does not receive a "web page complete" signal from the server for the second part of the web macro. As a workaround, please insert a page refresh command:

VERSION BUILD=6000524
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://www.shipmentlink.com/servlet/MSQ ... RuleSearch
SIZE X=1092 Y=870
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=4
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
'TAG POS=1 TYPE=DIV ATTR=ID:result EXTRACT=HTM
SAVEAS TYPE=EXTRACT FOLDER=c:\ FILE=evrgrn.htm
REFRESH
TAG POS=1 TYPE=SELECT FORM=NAME:frmRule ATTR=NAME:tariff CONTENT=5
TAG POS=1 TYPE=IMG ATTR=HREF:http://www.shipmentlink.com/tuf1/images/btn_Submit.gif
WAIT SECONDS=5
EXTRACT POS=1 TYPE=HTM ATTR=<div<SP>id=result>*
SAVEAS TYPE=EXTRACT FOLDER=c:\ FILE=evrgrn2.htm

Re: AJAX

Post by iMark » Sun Sep 19, 2010 10:33 am

Tech Support wrote: AJAX based websites are no problem for iMacros: [...] (1) Wait for the text to appear. In our case we wait for the words "Time now" to appear. [...]

It doesn't work.
The script simply skips the line TAG POS=1 TYPE=H2 ATTR=TXT:*Time<SP>now* without waiting for the timeout.
I have set ERRORIGNORE to YES because I don't want the script to stop when an error occurs.

Re: Howto handle AJAX?

Post by Tech Support » Wed Sep 22, 2010 10:42 am

I see that the old AJAX demo site is no longer available. What site did you use to test?

Can you please post the URL of the web page and/or the macro that creates the problem?

Re: Howto handle AJAX?

Post by fwg » Mon Jun 25, 2012 3:25 pm

What happens if there is no way to know what the new content will be but it will always appear in the same place?
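
When the new content is unknown but its location is fixed, one general approach is to record what that location contains before triggering the request and then wait until it differs. The Python sketch below only illustrates that idea; it is not iMacros syntax, and `wait_for_change`, `read_element`, and `trigger` are hypothetical names invented for this example.

```python
import time
from typing import Callable, Optional


def wait_for_change(
    read_element: Callable[[], str],
    trigger: Callable[[], None],
    timeout: float = 60.0,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the element's new content once it differs from the baseline.

    Returns None if nothing changed within timeout seconds.
    """
    before = read_element()  # baseline taken before the AJAX request
    trigger()                # e.g. click the submit button
    deadline = clock() + timeout
    while True:
        current = read_element()
        if current != before:
            return current
        if clock() >= deadline:
            return None
        sleep(interval)


if __name__ == "__main__":
    # Hypothetical element that is empty until the fourth read.
    reads = {"count": 0}

    def read_element() -> str:
        reads["count"] += 1
        return "" if reads["count"] < 4 else "result rows"

    print(wait_for_change(read_element, lambda: None, sleep=lambda _: None))
```

One limitation of this sketch: if the new content happens to be identical to the old content, no change is ever detected and the call runs into the timeout.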

Re: Howto handle AJAX?

Post by pedrocalvin » Wed Dec 05, 2012 3:38 pm

Hello,

Thank you for all the information posted here. It has been very useful for me.

However, I am facing a different problem at the moment. The web page I am writing a macro for has an AJAX component that can take 3, 10, or 20+ minutes to process (the request goes into a queue, so the waiting time is quite random).

There is a string I can use for identifying when the process has been completed, which is:
TAG POS=1 TYPE=SPAN ATTR=TXT:*Finished*

However, this won't work well because this test is performed after 6 seconds. As 6 seconds is not enough for processing my AJAX request, it will always fail and stop the processing of the macro.

In case this test fails, is there any way to keep repeating it until it succeeds?

Thank you,
Pedro
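
The "keep repeating the test until it succeeds" idea can be pictured as an outer retry loop wrapped around a short check, with an overall cap so the loop cannot run forever. The Python sketch below shows only that control flow; it is not iMacros code, and `retry_until_success`, `step`, the 30-minute cap, and the 10-second pause are all assumptions chosen for illustration.

```python
import time
from typing import Callable


def retry_until_success(
    step: Callable[[], bool],
    max_wait: float = 30 * 60,
    pause: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat step() until it returns True and return the attempt count.

    Raises TimeoutError if max_wait seconds pass without a success.
    """
    deadline = clock() + max_wait
    attempts = 0
    while True:
        attempts += 1
        if step():  # e.g. "is the word Finished on the page yet?"
            return attempts
        if clock() >= deadline:
            raise TimeoutError(f"step still failing after {attempts} attempts")
        sleep(pause)


if __name__ == "__main__":
    # Hypothetical check that fails twice before the job is finished.
    outcomes = iter([False, False, True])
    print(retry_until_success(lambda: next(outcomes), sleep=lambda _: None))
```

The important design point is that a single failed check is treated as "not yet" rather than as a fatal error; only the overall deadline ends the wait.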