Blocking Ads or Selecting alternate xpath element

papichan
Posts: 8
Joined: Sat Apr 30, 2016 3:28 pm

Blocking Ads or Selecting alternate xpath element

Post by papichan » Sat May 14, 2016 2:13 pm

iMacros Desktop v10, Windows 7

Intermittent Google ad serving is interrupting my script: when the ad is served, the index of the required XPath div changes from 6 to 7. Is there a way of selecting the alternate XPath if the first is not found (i.e. when the ad is served), or any way of blocking the ad? Filtering images doesn't work...

Thanks

Code: Select all

VERSION BUILD=10022823
TAB T=1
TAB CLOSEALLOTHERS
SET !DATASOURCE C:\File.csv
SET !LOOP 2
SET !DATASOURCE_LINE {{!LOOP}}
SET !TIMEOUT_STEP 1
SET !ERRORIGNORE YES
'CLEAR
URL GOTO=https://www.endole.co.uk/search/?search=Asset
FILTER TYPE=IMAGES STATUS=ON
TAG POS=1 TYPE=INPUT:TEXT FORM=ACTION:/search/ ATTR=NAME:search CONTENT={{!COL1}}
TAG POS=1 TYPE=INPUT:SUBMIT FORM=ACTION:/search/ ATTR=CLASS:search_button
WAIT SECONDS=1.206
'Alt XPath: /html/body/div[7]/div[1]/div[7]/div/div[1]/a
TAG XPATH=/html/body/div[7]/div[1]/div[6]/div/div[1]/a
WAIT SECONDS=0.06
TAG POS=1 TYPE=H1 ATTR=* EXTRACT=TXT
TAG POS=3 TYPE=H5 ATTR=* EXTRACT=TXT
TAG POS=5 TYPE=H5 ATTR=* EXTRACT=TXT
TAG POS=6 TYPE=H5 ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=C:\Users\ FILE=C:\File.csv
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Blocking Ads or Selecting alternate xpath element

Post by chivracq » Sat May 14, 2016 3:45 pm

Yep, use some Ad-Blocker to block the Ads, or first do a Check with 'EXTRACT=HREF' or '=TXT' on your Link (or, if the Ad is displayed, on the Link Position corresponding to the Ad), and then, using 'EVAL()', spit out a '!VAR1' with "6"/"7" to reuse for:

Code: Select all

'TAG XPATH=/html/body/div[7]/div[1]/div[6]/div/div[1]/a <!-- Alt Xpath -/html/body/div[7]/div[1]/div[7]/div/div[1]/a  -->
'TAG XPATH=/html/body/div[7]/div[1]/div[7]/div/div[1]/a
TAG XPATH=/html/body/div[7]/div[1]/div[{{!VAR1}}]/div/div[1]/a
>>>
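
A minimal sketch of the 'EVAL()' idea above, not a tested Solution. It assumes that the 'div[6]' Path matches nothing at all whenever the Ad is displayed, which the Thread does not confirm. '#EANF#' is the Value iMacros puts into the Extract when a 'TAG' finds no Element and '!ERRORIGNORE' is set to 'YES'.

```
'Sketch only: assumes the div[6] path finds nothing when the ad is shown
SET !ERRORIGNORE YES
SET !TIMEOUT_STEP 1
'Probe the usual position of the first result link
SET !EXTRACT NULL
TAG XPATH="/html/body/div[7]/div[1]/div[6]/div/div[1]/a" EXTRACT=HREF
'#EANF# means the probe found no element, so fall back to index 7
SET !VAR1 EVAL("('{{!EXTRACT}}' == '#EANF#') ? '7' : '6';")
'Reset the extract so the probe value is not written by a later SAVEAS
SET !EXTRACT NULL
'Click the link at whichever index was selected
TAG XPATH="/html/body/div[7]/div[1]/div[{{!VAR1}}]/div/div[1]/a"
```

If the Ad itself also carries a Link at that Position, the Test inside 'EVAL()' would have to compare the extracted HREF against something that tells a Result apart from an Ad, which needs real Extracts from the Page.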

Hum, and btw, you didn't finish your first/previous Thread completely neatly with Follow-up/sharing your Solution...
(I normally only help Users who "use the Forum a bit correctly" with FCI and neat Follow-up/sharing their Solution on all their Threads to make all Threads useful for all Users...)
And even in this Current Thread, it's still not clear to me if you are using IE10 or the iMB v10, even if it shouldn't play a role in this case... => FCIM... (Read my Sig...)
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
papichan
Posts: 8
Joined: Sat Apr 30, 2016 3:28 pm

Re: Blocking Ads or Selecting alternate xpath element

Post by papichan » Sat May 14, 2016 4:34 pm

IMB v10.0.2.2823 Win7 64
Do you know if it is possible to install AdBlock in iMB v10? I am having some trouble understanding the 2nd part of your answer...
papichan
Posts: 8
Joined: Sat Apr 30, 2016 3:28 pm

Re: Blocking Ads or Selecting alternate xpath element

Post by papichan » Sat May 14, 2016 4:45 pm

OK, ad blocking with iMacros for IE doesn't work, as the element is still there; you just can't see the ad.
papichan
Posts: 8
Joined: Sat Apr 30, 2016 3:28 pm

Re: Blocking Ads or Selecting alternate xpath element

Post by papichan » Sat May 14, 2016 5:40 pm

You simply put an additional line of code with the alt XPath below the first one. Because !ERRORIGNORE is set to YES, a TAG that finds nothing is skipped instead of stopping the macro, so the second line picks up the link when the first doesn't find it.
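
A minimal sketch of that fallback, using the two XPaths from the first post. It assumes the first path matches nothing when the ad is served, and that the second path matches nothing on the page reached once the first click has succeeded; neither has been checked here.

```
'Sketch only: two attempts, relying on !ERRORIGNORE to skip the one that fails
SET !ERRORIGNORE YES
SET !TIMEOUT_STEP 1
'Layout without the ad: the first result link sits in div[6]
TAG XPATH=/html/body/div[7]/div[1]/div[6]/div/div[1]/a
'Layout with the ad: the same link has moved to div[7]
TAG XPATH=/html/body/div[7]/div[1]/div[7]/div/div[1]/a
```

Both lines always run. If the second path happens to match something after the first click, it will be clicked too, which is the main risk with this approach.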
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Blocking Ads or Selecting alternate xpath element

Post by chivracq » Sat May 14, 2016 6:16 pm

papichan wrote:IMB v10.0.2.2823 Win7 64
Do you know if it is possible to install adblock into imb v10? I am having some trouble understanding the 2nd part of your answer...
papichan wrote:Ok ad blocking using imacros for IE doesn't work as the element is still there; you just can't see the ad.
Okay, we finally have your FCI, perfect...!

Hum..., I know the iMB is based on IE, and I actually thought I had understood it was not possible to install (IE) Add-ons/Plugins on it, but your 2nd Reply proves the contrary, OK, good to know, but I've never used iMB actually, so I didn't really know...

And hum, I never realized/thought about that, but it's good to know indeed that with AdBlock/ABP/ABE/uBlock, the Element being blocked is still there and only not displayed. That means that iMacros will indeed still "see" it, which can be important for determining the correct 'POS=n' Position...

Oh...! some more Reply while I was already replying to your first 2 Posts...:
papichan wrote:You simply put an additional line of code with the alt xpath below, because the error ignore is set to yes it runs that line if it doesn't find the first.
For the "2nd part" of my previous Reply: you seem to have found an easy Solution in the meantime with your next Reply, if I understand you correctly, even if I'm not completely convinced it will always work, and I would think "my" Solution would be more "reliable".

I was going to ask you to provide a few (3 at least, I would say) Valid Values for your '!COL1' Var, because I would need 3 different Results of the 'EXTRACT=HREF' + '=TXT' on your Link (6 Extracts), plus the similar 6 Extracts on 3 different (...?) Ads (unless you always get the same Ad, mention that, if it's the case...). Those would need to come from you as well, unless I get those (same...?) Ads too, but Google Ads can be tricky and based on IP Address and Country and Language etc...

With those 6+6=12 Extracts, then I could provide the '!VAR1' Computation I mentioned...
But if you are happy with your Solution, I guess this is not needed anymore... :D
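
A rough sketch of how those Extracts could be collected in one Run, assuming the Search Results Page is already loaded and that the two XPaths from the first Post are the Positions to compare. 'PROMPT' is only used here to display the combined Extract; it is not Part of the original Script.

```
'Sketch only: collect HREF and TXT at both candidate positions
SET !ERRORIGNORE YES
SET !TIMEOUT_STEP 1
SET !EXTRACT NULL
'Position 6: the first result, or the ad when one is served
TAG XPATH="/html/body/div[7]/div[1]/div[6]/div/div[1]/a" EXTRACT=HREF
TAG XPATH="/html/body/div[7]/div[1]/div[6]/div/div[1]/a" EXTRACT=TXT
'Position 7: where the first result moves to when the ad is served
TAG XPATH="/html/body/div[7]/div[1]/div[7]/div/div[1]/a" EXTRACT=HREF
TAG XPATH="/html/body/div[7]/div[1]/div[7]/div/div[1]/a" EXTRACT=TXT
'Show all four values; #EANF# marks a position where nothing was found
PROMPT {{!EXTRACT}}
```

Running this once per Search Term, with and without the Ad displayed, would give the Values needed to decide what the 'EVAL()' Test should look for.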
papichan
Posts: 8
Joined: Sat Apr 30, 2016 3:28 pm

Re: Blocking Ads or Selecting alternate xpath element

Post by papichan » Sat May 14, 2016 6:58 pm

I'd like to understand how to implement your solution, as it is probably more reliable and it will help me. The ad is served irrespective of IP: I used Russian and Korean VPNs and the position of the served ad was the same. OK, start here:
https://www.endole.co.uk/search/?search ... =companies
Without a search term that page won't load, so use these terms:
Panduit
Mediterranean Shipping Company (MSC)
Atkin & Co
NSL Services Group

For me these terms mean that the ad gets served directly below the first result, which is why it changes the layout. If these don't work maybe try to enter other words in the search box until the ad gets served below the first result.
Untaitled.png
Thanks for your help
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Blocking Ads or Selecting alternate xpath element

Post by chivracq » Sat May 14, 2016 7:18 pm

OK, have to go out now to some Concert, it's 21h on Saturday Evening, my Time (Amsterdam/NL), time to enjoy the WE a bit I would think...! I'll see if I have some time tomorrow or maybe even later on tonight, I'm usually quite "Creative" after a few Beers, ah-ah...! (I'm actually a DJ and Artist, and "Creativity" is all you need to do interesting Things with iMacros...!) Bump your Thread otherwise if you don't hear from me..., but pfff, I have a full Book to (finish) proof-read for a Friend as well, who's waiting for my Feedback, so I'm a bit busy IRL too..., and yours is not the only Thread I'm following up...