Blocking Ads or Selecting alternate xpath element


by papichan on Sat May 14, 2016 7:13 am

Imacros Desktop V10 Windows 7

Intermittent Google ad serving is interrupting my script: when the ad is served, the required XPath div index changes from 6 to 7. Is there a way of selecting the alternate XPath if the first is not found (i.e. when the ad is served), or any way of blocking the ad? Filtering images doesn't work...

Thanks

Code: Select all
VERSION BUILD=10022823
TAB T=1
TAB CLOSEALLOTHERS
SET !DATASOURCE C:\File.csv
SET !LOOP 2
SET !DATASOURCE_LINE {{!LOOP}}
SET !TIMEOUT_STEP 1
SET !ERRORIGNORE YES
'CLEAR
URL GOTO=https://www.endole.co.uk/search/?search=Asset
FILTER TYPE=IMAGES STATUS=ON
TAG POS=1 TYPE=INPUT:TEXT FORM=ACTION:/search/ ATTR=NAME:search CONTENT={{!COL1}}
TAG POS=1 TYPE=INPUT:SUBMIT FORM=ACTION:/search/ ATTR=CLASS:search_button
WAIT SECONDS=1.206
' Alt XPath when the ad is served: /html/body/div[7]/div[1]/div[7]/div/div[1]/a
TAG XPATH=/html/body/div[7]/div[1]/div[6]/div/div[1]/a
WAIT SECONDS=0.06
TAG POS=1 TYPE=H1 ATTR=* EXTRACT=TXT
TAG POS=3 TYPE=H5 ATTR=* EXTRACT=TXT
TAG POS=5 TYPE=H5 ATTR=* EXTRACT=TXT
TAG POS=6 TYPE=H5 ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=C:\Users\ FILE=C:\File.csv

Re: Blocking Ads or Selecting alternate xpath element

by chivracq on Sat May 14, 2016 8:45 am

Yep, use some Ad-Blocker to block the Ads, or first do a Check with 'EXTRACT=HREF' or '=TXT' on your Link (or, if the Ad is displayed, on the Link Position corresponding to the Ad), and then, using 'EVAL()', spit out a '!VAR1' with "6"/"7" to reuse for:
Code: Select all
'TAG XPATH=/html/body/div[7]/div[1]/div[6]/div/div[1]/a <!-- Alt Xpath -/html/body/div[7]/div[1]/div[7]/div/div[1]/a  -->
'TAG XPATH=/html/body/div[7]/div[1]/div[7]/div/div[1]/a
TAG XPATH=/html/body/div[7]/div[1]/div[{{!VAR1}}]/div/div[1]/a
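
A minimal sketch of how that '!VAR1' Check could look. It rests on assumptions that are not confirmed anywhere in this Thread: that the HREF of a real Result Link contains "endole.co.uk/", and that when the Ad is served the div[6] path either has no matching Link (the Extract then returns "#EANF#") or holds an off-site Link. The test string is hypothetical and needs adjusting once the real Extracts are known:

```
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
' Probe div[6]: extract the HREF of whatever Link sits there (if any)
SET !EXTRACT NULL
TAG XPATH=/html/body/div[7]/div[1]/div[6]/div/div[1]/a EXTRACT=HREF
' ASSUMPTION: an on-site HREF means the Result is at div[6]; anything else (including "#EANF#") means the Ad pushed it to div[7]
SET !VAR1 EVAL("var s=\"{{!EXTRACT}}\"; var n; if (s.indexOf(\"endole.co.uk/\") != -1) {n=\"6\";} else {n=\"7\";}; n;")
' Reset the Extract buffer so the probe value is not saved with the real data
SET !EXTRACT NULL
TAG XPATH=/html/body/div[7]/div[1]/div[{{!VAR1}}]/div/div[1]/a
```

The probe has to run after the search Results have loaded and before any other 'EXTRACT' in the Macro, because '!EXTRACT' accumulates all extracted Values until it is reset.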


>>>

Hum, and btw, you didn't finish your first/previous Thread completely neatly with Follow-up/sharing your Solution...
(I normally only help Users who "use the Forum a bit correctly" with FCI and neat Follow-up/sharing their Solution on all their Threads to make all Threads useful for all Users...)
And even in this Current Thread, it's still not clear to me if you are using IE10 or the iMB v10, even if it shouldn't play a role in this case... => FCIM... (Read my Sig...)
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...

Re: Blocking Ads or Selecting alternate xpath element

by papichan on Sat May 14, 2016 9:34 am

IMB v10.0.2.2823 Win7 64
Do you know if it is possible to install adblock into imb v10? I am having some trouble understanding the 2nd part of your answer...

Re: Blocking Ads or Selecting alternate xpath element

by papichan on Sat May 14, 2016 9:45 am

Ok ad blocking using imacros for IE doesn't work as the element is still there; you just can't see the ad.

Re: Blocking Ads or Selecting alternate xpath element

by papichan on Sat May 14, 2016 10:40 am

You simply put an additional line of code with the alt XPath below the first one. Because '!ERRORIGNORE' is set to YES, the macro doesn't stop when the first XPath isn't found and goes on to the next line.
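
Sketched against the macro from the first post, the fallback would look roughly like this. It relies on an assumption worth checking: when the ad is served, the div[6] path must not match any link, otherwise the first line would click that link instead of failing.

```
SET !ERRORIGNORE YES
SET !TIMEOUT_STEP 1
' Tried first: position of the first result when no ad is served
TAG XPATH=/html/body/div[7]/div[1]/div[6]/div/div[1]/a
' Fallback: position of the first result when the ad is served
TAG XPATH=/html/body/div[7]/div[1]/div[7]/div/div[1]/a
```

Both lines are always executed; a line that finds nothing is skipped after the '!TIMEOUT_STEP' wait. Once the first line has clicked through to the company page, the second is expected to find nothing there, which is an assumption about that page's layout.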

Re: Blocking Ads or Selecting alternate xpath element

by chivracq on Sat May 14, 2016 11:16 am

papichan wrote:IMB v10.0.2.2823 Win7 64

Do you know if it is possible to install adblock into imb v10? I am having some trouble understanding the 2nd part of your answer...

papichan wrote:Ok ad blocking using imacros for IE doesn't work as the element is still there; you just can't see the ad.

Okay, we finally have your FCI, perfect...!

Hum..., I know the iMB is based on IE, and I actually thought I had understood it was not possible to install (IE) Add-ons/Plugins on it, but your 2nd Reply proves the contrary, OK, good to know, but I've never used iMB actually, so I didn't really know...

And hum, I never realized/thought about that, but it's good to know indeed that with AdBlock/ABP/ABE/uBlock, the Element being blocked is still there but only not displayed, and that means that iMacros will indeed still "see" it, can be important indeed with determining the correct 'POS=n' Position...

Oh...! some more Reply while I was already replying to your first 2 Posts...:
papichan wrote:You simply put an additional line of code with the alt xpath below, because the error ignore is set to yes it runs that line if it doesn't find the first.

About the "2nd part" of my previous Reply: you seem to have found an easy Solution in the meantime with your next Reply, if I understand you correctly, even if I'm not completely convinced it will always work, and I would think "my" Solution would be more "reliable". I was going to ask you to provide a few (3 at least, I would say) Valid Values for your '!COL1' Var, because I would need 3 different Results of the 'EXTRACT=HREF' + '=TXT' on your Link. They would need to come from you, unless I get those (same...?) Ads as well, but Google Ads can be tricky and depend on IP Address, Country, Language, etc... I would also need the similar 6 Extracts on 3 different (...?) Ads (unless you always get the same Ad, mention that if it's the case...).

With those 6+6=12 Extracts, then I could provide the '!VAR1' Computation I mentioned...
But if you are happy with your Solution, I guess this is not needed anymore... :D

Re: Blocking Ads or Selecting alternate xpath element

by papichan on Sat May 14, 2016 11:58 am

I'd like to understand how to implement your solution as it is probably more reliable and it will help me. The ad is served irrespective of IP, I used Russian and Korean VPNs and the position of the served ad was the same. Ok, start here
https://www.endole.co.uk/search/?search=de+Poel&match=companies
Without a search term that page won't load, so use these terms:
Panduit
Mediterranean Shipping Company (MSC)
Atkin & Co
NSL Services Group

For me these terms mean that the ad gets served directly below the first result, which is why it changes the layout. If these don't work maybe try to enter other words in the search box until the ad gets served below the first result.

Untaitled.png

Thanks for your help

Re: Blocking Ads or Selecting alternate xpath element

by chivracq on Sat May 14, 2016 12:18 pm

OK, have to go out now to some Concert, it's 21h on Saturday Evening, my Time (Amsterdam/NL), time to enjoy the WE a bit I would think...!, I'll see if I have some time tomorrow or maybe even later on tonight, I'm usually quite "Creative" after a few Beers, ah-ah...! (I'm actually a DJ and Artist, and "Creativity" is all you need to do interesting Things with iMacros...!), bump your Thread otherwise if you don't hear from me..., but pfff, I have a full Book to (finish) proof-reading for a Friend as well who's waiting for my Feedback, so I'm a bit busy IRL as well..., and you are not the only Thread I'm following up as well...