iMacros + ASP.NET + IIS = Very Long Extraction Time

COSMOS
Posts: 20
Joined: Thu Jul 28, 2016 5:11 am

iMacros + ASP.NET + IIS = Very Long Extraction Time

Post by COSMOS » Tue Mar 28, 2017 11:41 am

Hello,

I have an ASP.NET 4.5 web app running on Windows Server 2008 R2 through IIS 7.5.
In this web app I have created a class library with a Store.cs class that contains all the iMacros commands.
I am referencing this class in my web app project. On the server I have installed iMacros Player V11.5.498.2403_x64.

This app allows a client to upload a list of products and to select, from a list of stores, which stores to search for those articles.
I am using multiple threads in my app in order to reduce the extraction time, because the user can select 10 stores, and if his list contains 50 articles,
that's a long search. So when the client clicks a button called "SearchOnline", I launch on the server 10 (or fewer) instances of iMacros Player, each
going to a different site and doing its stuff (testing if it landed on the right page, extracting the data, writing it into a db).
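
A minimal sketch of the "one worker per store, capped at 10" layout described above. It is written in Python purely for illustration (the real app is C#), and search_store is a hypothetical stand-in for one iMacros Player session; nothing here is taken from the actual Store.cs.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PLAYERS = 10  # the post launches 10 (or fewer) player instances


def search_store(store, articles):
    """Hypothetical stand-in for one player session working through one store."""
    return [(store, article) for article in articles]


def search_online(stores, articles):
    """Run one worker per selected store, never more than MAX_PLAYERS at once."""
    workers = max(1, min(MAX_PLAYERS, len(stores)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(search_store, s, articles) for s in stores]
        results = []
        for future in futures:
            # result() re-raises any exception from the worker thread
            results.extend(future.result())
    return results
```

Capping the pool size matters once several users press the button at the same time: without a cap, 10 users could mean up to 100 player instances competing for the same server.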

I developed this app on a Windows 7 x64 computer, where I have the same player installed.
The app runs perfectly on localhost. For a list of 10 articles and 5 stores, iMacros finds about 100 articles in about 7 minutes.
I know that might be a long time for some of you, but it's OK for our needs. I have added a few waits, so the time increased more than I wanted,
but I don't know how else to ensure that iMacros reaches the page that I want and extracts consistent data.
Without the waits, the page didn't load fully and iMacros jumped to the next command, causing a chain reaction and ultimately extracting the same
article information again and again. Perhaps accessing the page through a proxy or something and downloading it without the images would ensure a faster load time?

The problem is that when I put this application on the server, the same list takes about 18 minutes.
Now imagine having 10 users accessing this application at the same time :)

I'm trying to debug this on the server, but I don't have a lot of experience with this.
Which tools should I use to profile my server while the iMacros Player is running?
I have tried Performance Monitor and Debug Diag 2.
I have tried installing ANTS, but that only showed me the client part, up to the point where the user pressed the SearchOnline button and the iMacros threads started.

Ultimately what I want is to reduce this extraction time on the server:
-either with a profiling tool, to determine if there is something wrong with my code
-or by loading the page faster with a proxy or something that could filter out the images
(I know iMacros has a FILTER command, but when I tried it, it didn't work too well. Most stores still had their pictures on and I didn't see any improvement
in the page load time.)

Any suggestion is greatly appreciated! I have tried asking people on Stack Overflow and Reddit.
I have also contacted support as a paid customer and the answer was:

"..unfortunately I am not a web developer myself and know next to nothing about configuring or administering IIS or web applications running on that platform. I'm afraid that I will be of very little help in assisting you with troubleshooting this problem.."

I understand the support guy, but I don't know who else to ask or where to look for information.
Oh, and I should finish this by Friday :)) so if you have any info, please share.

Thank you !
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: iMacros + ASP.NET + IIS = Very Long Extraction Time

Post by chivracq » Tue Mar 28, 2017 1:44 pm

Hum..., very interesting Case, sorry, ah-ah...! I was going to send you towards TechSupport but you've already been there, OK... :oops:
Ah-ah...!, and funny to see that you beat me on the Length of your Post (meant as a Compliment...!), I didn't even dare to quote it, which I normally always systematically do as many Users tend to constantly edit their Posts as they "progress" or even delete their Posts/Threads once they've got their Answer and their Script working, which then renders such Threads useless for other Users searching the Forum with a similar Qt/Pb/Case...

I don't know anything about your whole Environment and the Server Side with iMacros as I only use the iMacros for FF Add-on, but hum..., I do a lot of Performance and Reliability Tuning with iMacros and my Macros and you may want to have a look at the following Thread where I explained a bit nearly all the Techniques that I use, among which 'FILTER' is indeed one of them (Item_1) and I gave a bit of "Background" Info on how 'FILTER' works that might help you use it in a "better" way than you maybe use it or tried to use it... :idea:
- Re: How do I make iMacros Firefox as fast as for Chrome?

Have a look as well at Items 5 and 6 about tuning the '!TIMEOUT_xxx' Settings and blocking (3rd Party) Scripts... :idea:
A Technique for example (that I didn't mention explicitly in that other Thread) to replace your hard-coded 'WAIT' Statements with "Conditional" 'WAIT's is to shorten '!TIMEOUT_PAGE' to an "Average Minimum" corresponding to when the Page has normally either fully loaded or loaded "enough" for the Elements you want to extract or click to be present on the Page but on the other hand to increase the '!TIMEOUT_STEP' Setting to compensate for the shorter '!TIMEOUT_PAGE'.

On "difficult" Sites/Pages, it is even possible to add several "fake" 'TAG' Statements (with this longer '!TIMEOUT_STEP') on a few Elements that should have appeared on the Page before you want your Macro to further proceed with its "real" Task(s), and even using 'EVAL()' (+ 'EXTRACT') to add a Conditional 'WAIT' and even a Conditional Reload of the Page. (I use the Term "Reload" and not "Refresh" because of the 'REFRESH' that cannot be used (easily) in an '.iim' Macro, you need to use 'URL GOTO' for that...) :idea:
For the Reload part, some of my Macros even dynamically "adjust" those 2 '!TIMEOUT_xxx' Settings using some on-the-fly Monitoring and Calculations using some Values collected by a few 'STOPWATCH' Commands placed at a few "strategic" places in my Macros to speed up or slow down the whole Execution (in order to always guarantee the highest Reliability possible) depending on my Connection Speed which can vary a lot in my case...
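
The general idea behind a "Conditional" wait, as opposed to a hard-coded one, can be sketched as a polling loop with a deadline. This is a language-neutral illustration in Python, not iMacros syntax and not the actual Macros referred to above; the `condition` callable (for example "is the expected Element on the Page yet?") is an assumption.

```python
import time


def wait_until(condition, timeout_s, poll_s=0.25,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True or `timeout_s` has elapsed.

    Returns True as soon as the condition holds, False on timeout.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

Compared with a fixed 3 second wait, this returns immediately on a fast page and only spends the full timeout on a slow or broken one, which is the same trade a longer step timeout makes against a shorter page timeout.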

Could you maybe post direct Links to your Threads on SOF and Reddit...?, seeing the Answers you got might give me some more "Ideas", ah-ah...!
I guess I could find them easily if I looked for them, but it will be easier for you...
I normally "see" all new Threads on SOF from some Notifications but I don't go and check those Threads very often and even less often (= rarely!) answer Threads on SOF because Users never mention their FCI and I don't like the Reputation System on SOF where I cannot even post a Comment...
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
COSMOS
Posts: 20
Joined: Thu Jul 28, 2016 5:11 am

Re: iMacros + ASP.NET + IIS = Very Long Extraction Time

Post by COSMOS » Wed Mar 29, 2017 8:24 am

Thank you for the reply and I'm sorry for making the previous post (and possibly this one) so long.

The links for my previous threads are:
SOF: https://stackoverflow.com/questions/428 ... et-web-app
RED: https://www.reddit.com/r/IIs/comments/5 ... e_problem/
Initially I posted about a different problem that I had with the server: I was losing the database connection when I tried to write into the database what iMacros was extracting. (I need to update those threads to let them know it's solved, but as you can see I didn't get much help there.)

I looked at the page you suggested and wrote down all of your tips:

-Use FILTER to remove images - not really doing anything for my application's performance, because the site only has 1 image per page, and the header/footer/sidebars are loaded from another HTML file through PHP or something, and it doesn't hide those, which make up most of the page's images. It's certainly useful for a site with hundreds of pictures, but not in this case, and I tend to avoid using it because, as you said, it's a bit buggy.

-Remove add-ons, revert to previous versions of FF, use Pale Moon - I use iMacros Player and the replayspeed is set to fast. I'm assuming the player does a better job than FF, CR or PM. Am I wrong here? Would I get better performance if I opened iMacros with fx/ie/cr instead of iimRobot.iimOpen("-runner -silent -v7", true, 300); ? How would I open Pale Moon from a script? Is it possible?

-Shrink down "!TIMEOUT_PAGE" and increase "!TIMEOUT_STEP" - This is an interesting idea, but how well would it work when you go to a site and search for an article? It finds it, you extract your data, but now you want to search for a new article starting from that page. You put your next article in the search bar, click the search button, and the page doesn't search for anything. The click didn't work. Or maybe the click works and the page doesn't fully load.
You are now on the same page you were on for article 1, but you actually want to be on the next page, for article 2. When iMacros finishes everything and you look at your final output, you see you have successfully extracted the same article 50 times and you just wasted half an hour for nothing. :)
What I want is a bulletproof method of reaching a page. Don't start any iMacros command until the page has FULLY loaded (sorry about the caps).
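
One way to guard against the "same article 50 times" failure, independent of how long the page takes, is to check that the page actually reflects the term that was just searched before accepting the extract. The sketch below is in Python for illustration only; run_search is a hypothetical callable standing in for "play the search macro and extract both the term shown on the results page and the data", and it assumes the store page echoes the search term somewhere extractable.

```python
def extract_verified(term, run_search, max_attempts=3):
    """Return extracted data only if the page echoes the term we searched for.

    `run_search(term)` is a hypothetical callable returning
    (term_shown_on_page, extracted_data). If the page still shows the
    previous article, the search is retried up to `max_attempts` times.
    """
    wanted = term.strip().lower()
    for _ in range(max_attempts):
        shown, data = run_search(term)
        if shown is not None and shown.strip().lower() == wanted:
            return data
    return None  # caller should log a failure rather than store stale data
```

With a check like this, a click that silently failed produces a retry or an explicit miss in the output instead of a duplicate row.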

I tried using !WAITPAGECOMPLETE but I was getting the same problems. The page didn't "completely load" and I was seeing the same articles extracted in the final output.

I also tried not starting the next search from the previous article's page, doing a URL GOTO="home page" before each new search instead.
That way iMacros doesn't find the tags from the first article's page. But this just adds more time, and I risk even more by relying on the URL GOTO again.

The only way I found that really works is the WAIT command. Make the page wait for 3 seconds. That is bulletproof most of the time.
The downside is the long extraction time. This is where I'm at. I accepted the fact that I had to wait 7 minutes on my PC, but it's a long stretch to wait 18 minutes on the server. Currently I'm waiting for a new server and I want to see if adding more resources would help.
But I'm also trying to host the application locally like a desktop version, which I can't do with imacros because I need to .NET Components and I only have EE ( :( ), so I'm forced to install an IIS Express on the client and run the application locally.

So as you can see, the problem can be solved in 2 ways:

PATH 1: From iMacros, by tweaking the TIMEOUTs and the WAITs, but that results in inconsistent extracted data, which I can't accept. I would rather wait than extract junk.
PATH 2: Try to play with different settings and configurations on the server in order to optimize CPU usage, bandwidth, etc.

If anyone has any other suggestions on either of those 2 paths, please let me know. :)

Thank you!

============================================
Quick update

I have partially solved this problem. It was the replayspeed that was causing the 18 min lag on the server.
It turns out that I had the replayspeed on fast locally, but on the server it was set to slow. (I must have played with it and forgotten about it.)
The funny thing is that now I'm constantly getting inconsistent data: while the replayspeed was set to slow, it was adding a 2 second delay between each iMacros command. Now it doesn't, and even with my 5 second wait I still get inconsistent extracts.
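
As a rough sanity check on the figures in this thread (7 minutes locally, 18 minutes on the server, 2 seconds added per command), the arithmetic below estimates how many commands the slowest player thread would have to run for the replay speed alone to explain the gap. It assumes the whole difference comes from that per-command delay, which the thread does not confirm.

```python
LOCAL_MINUTES = 7          # run time reported on the development PC
SERVER_MINUTES = 18        # run time reported on the server
DELAY_PER_COMMAND_S = 2    # delay reported for the slow replay speed

extra_seconds = (SERVER_MINUTES - LOCAL_MINUTES) * 60
commands_in_slowest_thread = extra_seconds / DELAY_PER_COMMAND_S

print(extra_seconds)               # 660
print(commands_in_slowest_thread)  # 330.0
```

Around 330 commands in one thread is plausible for a store searched for 10 articles, so the replay speed explanation is at least consistent with the numbers.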