How to increase TAG POS by 4 automatically?

PBNSurprise
Posts: 15
Joined: Sun Jun 19, 2016 12:10 am

How to increase TAG POS by 4 automatically?

Post by PBNSurprise » Tue May 22, 2018 6:47 pm

i'm trying to scrape a web page that has infinite scroll and i am almost able to do it perfectly... just one problem

this is the code that i have

Code: Select all

TAG POS=91 TYPE=SPAN ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
now the issue is i need TAG POS=91 to increase by 4 each time. so something like:

Code: Select all

TAG POS=91 TYPE=SPAN ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
TAG POS=95 TYPE=SPAN ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
next would be TAG POS=99, TAG POS=103.. and so on and so forth. i need to have that repeated several thousand times. the issue is, how do i create this? i cannot manually input changes to these values thousands of times.. so how can i set it up so imacros automatically replaces these numbers with the correct ones? i simply need the number to start at 91 and then add 4 each time. i am open to any other websites or tools that are able to accomplish this..
Last edited by PBNSurprise on Fri May 25, 2018 4:26 am, edited 1 time in total.
chivracq
Posts: 8636
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: How to increase TAG POS by 4 automatically?

Post by chivracq » Tue May 22, 2018 10:32 pm

Oh...!, wanted to reply to this Thread, but you never followed up on your previous Thread from 2 years ago :shock: , so you would first need to follow up/finish that older Thread "a bit correctly", and you can maybe bump this current one in 2 years again if you still need a Solution... :idea:

+ CIM again...! :mrgreen: (Read my Sig...)

(Applies to your first Thread as well + Needs to be finished "correctly" and useful for the Forum with a Solution...)

>>>

But OK, being nice, "i am open to any other websites or tools that is able to accomplish this.."
=> Very easy to do in iMacros if you "think" a little bit, + several similar Threads on the Forum if you search the Forum a little bit... (And therefore the "Use" for finishing a Thread "correctly" and with a Solution..., you get the Picture...!? :wink: )
Last edited by chivracq on Fri May 25, 2018 5:01 am, edited 1 time in total.
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...

Re: Simple question: how to increase TAG POS by 4 automatica

Post by PBNSurprise » Tue May 22, 2018 11:05 pm

hi i'm sorry but i am very new.. therefore i have difficulties understanding some certain things on this forum. it took me a while just to simply extract one line of text with imacros so i am very newbie..

i can tell you that i am using the imacros free add on for IE, the latest version. i believe i worded my issue as clearly as i can. i tried to search the forums for solutions but i cannot put into words what i am trying to accomplish... do you understand what my issue is and can you please offer me a solution? thank you

Re: How to increase TAG POS by 4 automatically?

Post by chivracq » Tue May 22, 2018 11:47 pm

Hum, "Newbie-Newbie", you've been using iMacros for 2 years already now..., I was already doing more "complicated" Things than what you want in this Thread 2 hours after I had discovered and installed iMacros for the first time...
And your Pb Description is perfect, I understand perfectly what you want exactly..., like you say, it's a "simple Qt", and the Solution is as simple... (Basic Maths for Kids in Elementary School, I guess...)

But OK, finish your 2 previous Threads a bit correctly by sharing how you solved them, and I will still help you with this one... :wink:

Hum, and "latest" about your FCI is always a bit vague...
=> iMacros for IE v12.0 (...?), IE11 (...?), OS...?
Last edited by chivracq on Fri May 25, 2018 5:02 am, edited 1 time in total.

Re: Simple question: how to increase TAG POS by 4 automatica

Post by PBNSurprise » Wed May 23, 2018 12:30 am

i went back and i updated my 2 previous threads as best as i could remember. i haven't been using imacros consistently.. i only used it a little bit 2 years ago. i'm using iMacros V12.0.0.151 for IE. can i please get assistance? i really would like to have this solved so i can test it and see if i can scrape a page.. i need to scrape several TAG POS so i would have to add four to each number several thousands of times. any help would be great, thanks

Re: How to increase TAG POS by 4 automatically?

Post by chivracq » Wed May 23, 2018 1:52 am

OK, good, I saw your 2 Updates... Good enough...

OK, concerning this one, the "Aim of the Game" is to find some Maths Equation that will be using the '!LOOP' Var to give the following:
- !LOOP=1 => MyLoop=91
- !LOOP=2 => MyLoop=95
- !LOOP=3 => MyLoop=99
... etc...
And with Start_Loop = 91.

You could start with "Start_Loop -4" (=> =87) and you use 4 times the 'ADD' Command to add '{{!LOOP}}' to 'MyLoop', that will work, but that's a bit cumbersome... :oops:

More "elegant" is to use 'EVAL()' to do directly the "x4" Multiplication instead of 4 times the same Addition...:

Code: Select all

SET Start_Loop 91
SET MyLoop EVAL("var stl='{{Start_Loop}}', d='{{!LOOP}}'; var z=((stl*1)-4)+(d*4); z;")
TAG POS={{MyLoop}} TYPE=SPAN ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
That's all...!
Not tested but I expect it to work...
The "*1" on 'Start_Loop' is probably/maybe not needed, it's just to make sure that iMacros and 'EVAL()' will treat your "91" as a Number and not as a String which sometimes happens...
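To check the arithmetic outside iMacros, here is a minimal JavaScript sketch of the same expression. The function names `tagPos` and `tagPosNaive` are made up for illustration and are not part of iMacros. It assumes, as described above, that `{{Start_Loop}}` and `{{!LOOP}}` arrive inside `EVAL()` as text. In the expression as written, the subtraction already coerces the string to a number, so the `*1` acts as a safety net. The second function shows what could happen in a variant where `+` meets the string first.

```javascript
// Hypothetical helper mirroring the EVAL() expression from the macro.
// Both arguments are strings, as they would be after {{...}} substitution.
function tagPos(startLoop, loop) {
  // "* 1" coerces the string to a number before any other arithmetic.
  return ((startLoop * 1) - 4) + (loop * 4);
}

// A variant without the coercion, where "+" sees the string first
// and concatenates instead of adding.
function tagPosNaive(startLoop, loop) {
  return startLoop + (loop * 4) - 4;
}

console.log(tagPos("91", "1")); // 91
console.log(tagPos("91", "2")); // 95
console.log(tagPos("91", "3")); // 99
console.log(tagPosNaive("91", "1")); // 910, because "91" + 4 is the string "914"
```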

Enjoy...! :wink:
Last edited by chivracq on Fri May 25, 2018 5:02 am, edited 1 time in total.

Re: Simple question: how to increase TAG POS by 4 automatica

Post by PBNSurprise » Wed May 23, 2018 3:46 am

thank you very much, it's working perfectly now! time to scrape thousands of data :D

once again thanks!

Re: Simple question: how to increase TAG POS by 4 automatica

Post by chivracq » Wed May 23, 2018 3:57 am

:D

Re: Simple question: how to increase TAG POS by 4 automatica

Post by PBNSurprise » Wed May 23, 2018 5:23 am

it appears that i have run into an issue. it seems that these numbers stop increasing at increments of +4 at one point, which causes the scraping to stop. is there any way that i can get in touch with you privately? i would like to show you what i am trying to do and see if you would be very kind to help me arrive at a smooth code on how to scrape? thanks

Re: How to increase TAG POS by 4 automatically?

Post by chivracq » Wed May 23, 2018 3:12 pm

Euh..., nope..., I don't do anything "privately", ah-ah...!, or only rarely when I "work" on a paid Project, or very rarely if the Data/Site is very "sensitive", but I usually only help via the Forum... :wink:

Yeah, well, "these numbers stop increasing at increments of +4 at one point" is a bit vague, I cannot find a Solution with that Info, you must describe exactly what it becomes then we can adapt the Maths Formula...

I actually already wanted to make the Formula more generic/configurable directly, but I thought it would be a bit "overkill", but here it is, and you can then adapt the Script easily with the 2 configurable Vars at the beginning...:

Code: Select all

SET POS_Start 91
SET Increment 4
'>
SET MyLoop EVAL("var ps='{{POS_Start}}', i='{{Increment}}', d='{{!LOOP}}'; var z=((ps*1)-(i*1))+(d*i); z;")
TAG POS={{MyLoop}} TYPE=SPAN ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
=> When your Script stops, you will then need to identify the last 'POS_Nb' that worked, the next 'POS_Nb' that should become 'POS_Start' in your Script, and the (new) Value for the 'Increment'.
And that part could even easily be automated for each Loop by "proactively" checking if the next Loop will still work with the same Settings, or aborting the Script "voluntarily" (with 'EVAL()' + 'MacroError()') and displaying the last working 'POS_Nb' and what next Value it should take (+ the next Value for the Increment if it changes as well) for you to adapt those 2 Vars in the Script manually before running it again...
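As a rough illustration of that "voluntary abort" idea, here is a minimal JavaScript sketch of the logic only. It is not an iMacros macro: the names `tagPos` and `checkExtract` are hypothetical, and a plain `throw` stands in for `MacroError()`. The one iMacros-specific assumption is that a failed extraction yields the text `#EANF#`.

```javascript
// Generic form of the formula: start and increment are configurable.
// All three arguments may be strings, as after {{...}} substitution.
function tagPos(posStart, increment, loop) {
  return ((posStart * 1) - (increment * 1)) + (loop * increment);
}

// Returns the position that was just used, or throws with the last
// working position so POS_Start and Increment can be adjusted by hand.
function checkExtract(extracted, posStart, increment, loop) {
  const pos = tagPos(posStart, increment, loop);
  if (extracted === "#EANF#") {
    // On the very first loop there is no earlier position to report.
    const lastGood = Number(loop) > 1 ? pos - increment * 1 : "none";
    throw new Error("POS=" + pos + " not found; last working POS=" + lastGood);
  }
  return pos;
}

console.log(checkExtract("some text", "91", "4", "2")); // 95
try {
  checkExtract("#EANF#", "91", "4", "3");
} catch (e) {
  console.log(e.message); // POS=99 not found; last working POS=95
}
```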

And if those 2 Values keep changing "a little bit too often", it would even be possible to make them "Dynamic" and automatically "detected"/computed by the Script..., which is pretty cool actually, ah-ah...!
Let me see if I can find it back, I had already "demonstrated" one Technique on how to do that... (I say "one" Technique, because there are several (that I use), many of my own Scripts are indeed "clever" enough to detect by themselves when stg is "changing" on a Site and to (try to) adapt themselves directly, ah-ah...!)
Yep..., found it...!:
- Re: hdwallpapers
=> If you have a look at the Script, it's about the 'AutoDetect' Functionality/Section and the Nb of Pix per Page..., the same Principle can (probably) be applied to your Site to determine dynamically the Increment. Though hum..., might be a little bit more complicated in your Case if the Increment is not constant on the whole Page, that would bring an extra "Layer" of inner/nested Logic, ah-ah...! Nice...!! :twisted:
And you had mentioned "thousands" of Items to scrape, if they are all on the same Page, you might hit some "Performance" Issue if the Page is very large and the Increment needs to be recalculated on each Loop... (... Hum..., solved already I think..., just got a few "Creative" Ideas on how to solve it in a more efficient way, ah-ah...!)
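One possible way to picture a dynamically detected Increment is sketched below in JavaScript. This is not the 'AutoDetect' technique from the linked Thread, whose details are not shown here. It only assumes that the wanted Items can be recognised somehow among all SPANs; the sample data, the `isName` test and both function names are made up.

```javascript
// spans: text of every SPAN on the page, in document order.
// isWanted: hypothetical test that recognises the items to extract.
function detectPositions(spans, isWanted) {
  const positions = [];
  spans.forEach((text, index) => {
    if (isWanted(text)) positions.push(index + 1); // TAG POS counts from 1
  });
  return positions;
}

// Gap between each wanted position and the one before it.
function increments(positions) {
  return positions.slice(1).map((pos, i) => pos - positions[i]);
}

// Made-up sample: names separated by filler spans, with the gap
// shrinking from 4 to 2 near the end.
const sample = ["Alice", "x", "y", "z", "Bob", "x", "y", "z", "Carol", "x", "Dave"];
const isName = (text) => text.length > 1;

const positions = detectPositions(sample, isName);
console.log(positions.join(", "));             // 1, 5, 9, 11
console.log(increments(positions).join(", ")); // 4, 4, 2
```

Computing the positions once and reusing the list avoids recalculating the Increment on each Loop, which is one way around the Performance concern.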

>>>

But hum..., my Answers and Solutions are based on your exact Qt(s) and the Info you provided... and the Solution/Implementation you had in mind... It is pretty cumbersome in my Opinion to have to increment your 'POS_Nb' by 4 or even "worse" if that Increment is "Dynamic" through the Page, to only want to scrape/tag/extract/save every 4th Item... :shock:
If you posted the URL of your Site, I could have a Look and would probably find a more efficient and simpler Implementation where you probably wouldn't even need to "play"/care with/about that Increment part... :idea:

>>>

Mini-Request:
Could you remove the "Simple question: " part in your Thread Title...? :?:

Those 2 Words are always useless anyway for a Thread Title on the Forum, as every Qt is "simple" if you know the Answer and if the "Asker" can already qualify their Qt as being "simple", then they should be able to find the Answer by themselves is the "Logic" I usually use, and I usually don't even bother to answer such Threads... :idea:

But more important for this Thread is that because of a 64 Char-Limit for Thread Titles, the Post Titles for all Replies in the Thread all get truncated and the Word/Term "automatically" which is much more important than "Simple" + "question" gets truncated as well, which means that any User searching the Forum with "automatically" as a Search Keyword won't find all those Posts as Search Results... :!:

["Thread Title" = The Title of your first Post in this Thread... => 'Edit' Button and you can edit it...]
Last edited by chivracq on Fri May 25, 2018 5:03 am, edited 1 time in total.

Re: Simple question: how to increase TAG POS by 4 automatica

Post by PBNSurprise » Fri May 25, 2018 4:27 am

i went ahead and edited the thread. i'm going to try these suggestions that you mentioned here and will reply back with an update. let's see if i can finally get this web scraping to work

thank you for the assistance

Re: How to increase TAG POS by 4 automatically?

Post by chivracq » Fri May 25, 2018 5:08 am

OK, good..., and good luck..., and Thanks for editing your Thread Title, perfect now... :D
(And I've edited all my Replies as well to reflect that better/shorter Title..., except one with just one Smiley, which didn't contain any valuable Info anyway, ah-ah...! :wink: )

And like I mentioned already, in my "Experience" you would "speed up" the "Process" if you posted the URL for your Site, I will very probably find a much easier/simpler Implementation than the one you are "struggling" with, ah-ah...!
But OK, don't worry, you are the "Boss", ah-ah...! :wink:

(And I normally nearly never write Scripts for other Users, I just try to make you understand how to write it yourself..., which is much more "rewarding" I think (for yourself) than getting your Script "served on a golden plate", ah-ah...!)

Re: How to increase TAG POS by 4 automatically?

Post by PBNSurprise » Fri May 25, 2018 7:33 am

hi there, so i will go into detail on what i am trying to do. basically, i would like to scrape users on facebook that have liked a facebook page. i believe the issue is that TAG POS is unique for each facebook page. for example, let's say we wanted to scrape this page: https://www.facebook.com/search/1215736 ... ?ref=about

you need to log into facebook to see the users. what i would like to do is scrape the first and last names and put them in 1 column in excel. the page has infinite scroll so as the scraping continues more results will automatically load. i tried to use event mode on imacros for firefox, but was struggling with that. scraping facebook usually is difficult because it is hard to identify which TAG POS are needed. it would be great if the script can be used on any facebook page, that would probably be very helpful for anyone who happens to come across this

Re: How to increase TAG POS by 4 automatically?

Post by chivracq » Sat May 26, 2018 4:39 pm

Ah, hum, OK, I usually don't like to help "too much" for Social Media...

But OK, this one is "easy", I would use the Profile Picture from each User which looks unique for each Result, and Relative Positioning on the Name, like for example...:

Code: Select all

VERSION BUILD=8820413 RECORDER=FX
SET !ERRORIGNORE YES
TAB T=1

SET Like_Search "Fazio's Cat Jewelry"
SET POS_Pix {{!LOOP}}

'TAG POS={{!LOOP}} TYPE=DIV ATTR=TXT:Likes<SP>{{Like_Search}}*

TAG POS=1 TYPE=DIV ATTR=TXT:Add<SP>FriendFriend<SP>request<SP>sentMore<SP>*
TAG POS=R-1 TYPE=* ATTR=* EXTRACT=TXT
TAG POS=R{{POS_Pix}} TYPE=IMG ATTR=SRC:https://*.fbcdn.net/*_n.jpg* EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R1 TYPE=A ATTR=HREF:*?ref=br_rs EXTRACT=TXT

PROMPT {{!EXTRACT}}
(Tested on iMacros for FF v8.8.2, Pale Moon v24.6.6 (=FF47), Win10_x64.)

Like you can see, I tried several "Things", that I left in the Script if that can give you some more "Ideas"...
Users with the 'Add Friend' Button, no 'Add Friend' Button and 'Follow' Button seem to "behave" differently, but this one seems to work for all 3 kinds...
The only "Condition" in this Case for this specific Page is that the first Result has the 'Add Friend' Button enabled as I use the main Containing 'DIV' for all Results as the first Anchor and you might need to adapt this Line if the first Result on a Page doesn't have this Button or only the 'Follow' Button, but you get the Idea... :wink:

Oh...!, and you may notice several 'EXTRACT' Commands, most are "Fake" and only meant to avoid clicking on a Link if any Element has a Link on it as most FB Content is often clickable..., there is only one "Real" 'EXTRACT' for the Name you want after the 'SET !EXTRACT NULL' Line...
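
To make the Relative Positioning idea a bit more concrete, here is a small JavaScript model of it. It is a simplification with made-up data and hypothetical names (`relativeTag`, `elements`), not how iMacros is implemented. It treats the Page as a flat list of Elements in document order, where `POS=R2` means "the 2nd matching Element after the Anchor" and a negative Value searches backwards.

```javascript
// Returns the index of the r-th matching element after the anchor
// (or before it, when r is negative), or -1 when there is none.
function relativeTag(elements, anchorIndex, r, matches) {
  const step = r > 0 ? 1 : -1;
  let remaining = Math.abs(r);
  for (let i = anchorIndex + step; i >= 0 && i < elements.length; i += step) {
    if (matches(elements[i]) && --remaining === 0) return i;
  }
  return -1;
}

// Made-up page: one container DIV, then a picture and a name link per user.
const elements = [
  { type: "DIV" },
  { type: "IMG" }, { type: "A", text: "Alice" },
  { type: "IMG" }, { type: "A", text: "Bob" },
];
const isImg = (el) => el.type === "IMG";
const isLink = (el) => el.type === "A";

// Loop 2: anchor on the container, jump to the 2nd picture,
// then take the 1st link after that picture.
const loop = 2;
const pictureIndex = relativeTag(elements, 0, loop, isImg);
const nameIndex = relativeTag(elements, pictureIndex, 1, isLink);
console.log(elements[nameIndex].text); // Bob
```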
Last edited by chivracq on Sun May 27, 2018 8:18 am, edited 2 times in total.

Re: How to increase TAG POS by 4 automatically?

Post by PBNSurprise » Sun May 27, 2018 8:07 am

chivracq wrote:
PBNSurprise wrote:hi there, so i will go into detail on what i am trying to do. basically, i would like to scrape users on facebook that have liked a facebook page. i believe the issue is that TAG POS is unique for each facebook page. for example, let's say we wanted to scrape this page: https://www.facebook.com/search/1215736 ... ?ref=about

you need to log into facebook to see the users. what i would like to do is scrape the first and last names and put them in 1 column in excel. the page has infinite scroll so as the scraping continues more results will automatically load. i tried to use event mode on imacros for firefox, but was struggling with that. scraping facebook usually is difficult because it is hard to identify which TAG POS are needed. it would be great if the script can be used on any facebook page, that would probably be very helpful for anyone or happens to come across this
Ah, hum, OK, I usually don't like to help "too much" for Social Media...

But OK, this one is "easy", I would use the Profile Picture from each User which looks unique for each Result, and Relative Positioning on the Name, like for example...:

Code: Select all

VERSION BUILD=8820413 RECORDER=FX
SET !ERRORIGNORE YES
TAB T=1

SET Like_Search "Fazio's Cat Jewelry"
SET POS_Pix {{!LOOP}}

'TAG POS={{!LOOP}} TYPE=DIV ATTR=TXT:Likes<SP>{{Like_Search}}*

TAG POS=1 TYPE=DIV ATTR=TXT:Add<SP>FriendFriend<SP>request<SP>sentMore<SP>*
TAG POS=R-1 TYPE=* ATTR=* EXTRACT=TXT
TAG POS=R{{POS_Pix}} TYPE=IMG ATTR=SRC:https://*.fbcdn.net/*_n.jpg* EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R1 TYPE=A ATTR=HREF:*?ref=br_rs EXTRACT=TXT

PROMPT {{!EXTRACT}}
(Tested on iMacros for FF v8.8.2, Pale Moon v24.6.6 (=FF47), Win10_x64.)

Like you can see, I tried several "Things", that I left in the Script if that can give you some more "Ideas"...
Users with the 'Add Friend' Button, no 'Add Friend' Button and 'Follow' Button seem to "behave" differently, but this one seems to work for all 3 kinds...
The only "Condition" in this Case for this specific Page is that the first Result has the 'Add Friend' Button enabled as I use the main Containing 'DIV' for all Results as the first Anchor and you might need to adapt this Line if the first Result on a Page doesn't have this Button or only the 'Follow' Button, but you get the Idea... :wink:

Oh...!, and you may notice several 'EXTRACT' Commands, most are "Fake" and only meant to avoid clicking on a Link if any Element has a Link on it as most FB Content is often clickable..., there is only one "Real" 'EXTRACT' for the Name you want after the 'SET !EXTRACT NULL' Line...
hey so thanks for the response, the script works nicely to scrape facebook users that like a page. i've tried it on a few pages and it continues to work. this is the code that i ended up using:

Code: Select all

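VERSION BUILD=8820413 RECORDER=FX
' Keep going when a TAG is not found, and suppress the extraction popup
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
TAB T=1

' Like_Search is only referenced by the commented-out TAG line below
SET Like_Search "Fazio's Cat Jewelry"
' The loop counter selects the Nth profile picture after the anchor
SET POS_Pix {{!LOOP}}

'TAG POS={{!LOOP}} TYPE=DIV ATTR=TXT:Likes<SP>{{Like_Search}}*

' Anchor on the containing DIV; the next two EXTRACTs only move the relative position
TAG POS=1 TYPE=DIV ATTR=TXT:Add<SP>FriendFriend<SP>request<SP>sentMore<SP>*
TAG POS=R-1 TYPE=* ATTR=* EXTRACT=TXT
TAG POS=R{{POS_Pix}} TYPE=IMG ATTR=SRC:https://*.fbcdn.net/*_n.jpg* EXTRACT=TXT
' Discard the "fake" extractions so only the name is saved
SET !EXTRACT NULL
' The "real" EXTRACT: the user's name link
TAG POS=R1 TYPE=A ATTR=HREF:*?ref=br_rs EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
' Give the infinite scroll time to load more results
WAIT SECONDS=1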
a few things i've noticed:

it turns out that it doesn't matter what you enter for SET Like_Search. I've tried "Fazio's Cat Jewelry" but scraped different pages and seems to still work.

i added wait 1 second at the end because once the code gets to the end of the page there needs to be some time to load more results, the page is infinite scroll sadly.

one thing i've noticed is that once i've scraped around 500 users, the code tries to locate the initial TAG POS at the top of the screen. meaning, for some reason the code goes back to user number 1 and tries to scrape from there. not sure why this happens. it turns out that TAG POS here:

Code: Select all

TAG POS=1 TYPE=DIV ATTR=TXT:Add<SP>FriendFriend<SP>request<SP>sentMore<SP>* 
needs to be changed after scraping around 500 users.

i struggle to determine what exactly is the new TAG POS. i know the last user that was scraped because i can check the file. so what i did was test various TAG POS until i landed where i needed to be. it was like trial and error. what i arrived at was TAG POS=3660. this lands me on the user where the code stopped scraping. so i stop the macro and input this new tag and now it's:

Code: Select all

TAG POS=3660 TYPE=DIV ATTR=TXT:Add<SP>FriendFriend<SP>request<SP>sentMore<SP>*
i don't reload the page so i can still see all the users and where i was just scraping. so then i start the macro again on a 100,000 times loop and boom, it continues scraping where it left off

i'm not sure if there's a way to implement this change automatically. but it's not a big deal if there isn't. can you tell me how to find the TAG POS of an item on the webpage? If i knew the exact TAG POS was 3660, i could just make these quick changes as needed and fire the macro up again without having to trial and error

other than this i think the script is running fine, i plan to scrape 100k users

once again thanks for the support

EDIT: well looks like i'm struggling again with imacros. the reason i am scraping is because i would like to use the data to create custom audiences in facebook ads. i scraped a few hundred users and decided to test with them to see if i can import to facebook. unfortunately there is low match potential with just the first and last names so facebook is asking me to provide additional info per user. i was thinking i can scrape the city and state of where each user lives along with the names and hopefully that will be enough data. not every user has a city and state listed, which is fine, i can just skip those users since most have it. i tried to fiddle with this code to see if i can scrape locations for each profile if imacros comes across it

Code: Select all

VERSION BUILD=8820413 RECORDER=FX
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
TAB T=1

SET Like_Search "Fazio's Cat Jewelry"
SET POS_Pix {{!LOOP}}

'TAG POS={{!LOOP}} TYPE=DIV ATTR=TXT:Likes<SP>{{Like_Search}}*

TAG POS=1 TYPE=DIV ATTR=TXT:Add<SP>FriendFriend<SP>request<SP>sentMore<SP>*
TAG POS=R-1 TYPE=* ATTR=* EXTRACT=TXT
TAG POS=R-4 TYPE=* ATTR=* EXTRACT=TXT
TAG POS=R{{POS_Pix}} TYPE=IMG ATTR=SRC:https://*.fbcdn.net/*_n.jpg* EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R1 TYPE=A ATTR=HREF:*?ref=br_rs EXTRACT=TXT
TAG POS=R4 TYPE=A ATTR=HREF:*/pages/*?ref=br_rs EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
WAIT SECONDS=1
the code doesn't get it done properly but i think i have the right idea. i need imacros to scrape 2 columns of data, one with names and another with city, state. if the profile doesn't have a city,state listed then i would need imacros to skip that and continue scraping. i would need the column city, state to match with the correct users. the current code i have can scrape but the locations are jumbled around with the users, along with other data that i don't need. this is very challenging to accomplish and hard for me to understand how to work these commands to accomplish proper scraping. assistance when you get the chance would be appreciated, thanks
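
One way to handle the "skip users without a location" part is outside iMacros, by cleaning file.csv after the run. Below is a minimal Python sketch of that idea, not a tested fix for the macro itself. It assumes each saved row has exactly two columns (name, then city and state) and that iMacros wrote `#EANF#` for an extraction it could not find. The function name `clean_rows` and the sample rows are made up for illustration.

```python
import csv
import io

MISSING = "#EANF#"  # assumed marker iMacros writes when an extraction finds nothing


def clean_rows(csv_text):
    """Return unique (name, location) pairs, skipping rows with a missing field."""
    seen = set()
    cleaned = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumed layout: exactly two columns, name then "city, state".
        if len(row) != 2:
            continue
        name, location = (field.strip() for field in row)
        if not name or not location or MISSING in (name, location):
            continue
        # Drop repeats, e.g. when the macro restarts from the first user.
        if (name, location) in seen:
            continue
        seen.add((name, location))
        cleaned.append((name, location))
    return cleaned


# Made-up sample rows in the quoted, comma-separated shape assumed above.
sample = (
    '"Jane Doe","Austin, Texas"\n'
    '"John Roe","#EANF#"\n'
    '"Jane Doe","Austin, Texas"\n'
    '"Ann Poe","Miami, Florida"\n'
)
print(clean_rows(sample))
```

This only filters and de-duplicates what was saved. It cannot repair rows where the location was picked up from the wrong user, so the macro still needs the name and location extracted for the same result before each SAVEAS.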