PBNSurprise wrote:it appears that i have ran into an issue. it seems that these numbers stop increasing at increments of +4 at one point, which causes the scrapping to stop. is there any way that i can get in touch with you privately? i would like to show you what i am trying to do and see if you would be very kind to help me arrive at a smooth code on how to scrape? thanks
Uh..., nope..., I don't do anything "privately", ah-ah...! Or only rarely when I "work" on a paid Project, or very rarely if the Data/Site is very "sensitive"; I usually only help via the Forum...
Yeah, well, "these numbers stop increasing at increments of +4 at one point" is a bit vague, and I cannot find a Solution with that Info alone. You must describe exactly what the Increment becomes at that point, then we can adapt the Maths Formula...
I actually already wanted to make the Formula more generic/configurable right away, but I thought it would be a bit "overkill". Anyway, here it is, and you can then easily adapt the Script with the 2 configurable Vars at the beginning...:
Code:
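' First POS to extract, and the Step between 2 consecutive POS:
SET POS_Start 91
SET Increment 4
'>
' POS for the current Loop = (POS_Start - Increment) + (LOOP * Increment):
SET MyLoop EVAL("var ps='{{POS_Start}}', i='{{Increment}}', d='{{!LOOP}}'; var z=((ps*1)-(i*1))+(d*i); z;")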
TAG POS={{MyLoop}} TYPE=SPAN ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=file.csv
=> When your Script stops, you will need to identify the last 'POS_Nb' that worked, the next 'POS_Nb' (the one that should become 'POS_Start' in your Script), and the (new) Value for the 'Increment'.
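To make the Formula a bit more concrete, here is a minimal sketch of the same arithmetic as plain JavaScript, runnable outside of iMacros. The helper name 'posForLoop' is only an illustration (it is not an iMacros Function), and the Values 91 and 4 are simply the ones from the Script above:

```javascript
// Sketch of the POS computed by the EVAL() line, outside of iMacros.
// posForLoop() is a hypothetical helper name, not an iMacros function.
function posForLoop(posStart, increment, loop) {
  // iMacros substitutes variables as text, so coerce to numbers first.
  const ps = Number(posStart);
  const i = Number(increment);
  const d = Number(loop);
  // Same arithmetic as the macro: (POS_Start - Increment) + (LOOP * Increment)
  return (ps - i) + (d * i);
}

// First five Loops with POS_Start=91 and Increment=4:
const firstPositions = [1, 2, 3, 4, 5].map((loop) => posForLoop('91', '4', loop));
console.log(firstPositions.join(', ')); // 91, 95, 99, 103, 107
```

At Loop 1 the Result is exactly 'POS_Start', which is why the next 'POS_Nb' to extract is the Value to put in 'POS_Start' when you restart the Loop from 1.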
And that part could even easily be automated for each Loop by "proactively" checking if the next Loop will still work with the same Settings, or aborting the Script "voluntarily" (with 'EVAL()' + 'MacroError()') and displaying the last working 'POS_Nb' and what next Value it should take (+ the next Value for the Increment if it changes as well) for you to adapt those 2 Vars in the Script manually before running it again...
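One possible way to do the "voluntary" Abort, as a rough sketch only and not necessarily how I would implement it on your Site: test what the 'TAG' extracted and call 'MacroError()' when it failed. The Logic is shown here as plain JavaScript so it can be run outside of iMacros. 'checkExtract' is a hypothetical helper name, the 'MacroError' defined here is just a stand-in for the one iMacros provides inside 'EVAL()', and I am assuming that a failed Extraction returns the Text '#EANF#':

```javascript
// Stand-in so the sketch runs outside iMacros: inside an EVAL() string,
// iMacros supplies its own MacroError(), which stops the macro with the message.
function MacroError(message) {
  throw new Error(message);
}

// Hypothetical helper: returns the extracted text, or aborts with a report.
// Assumption: a failed extraction yields the text '#EANF#'.
function checkExtract(extracted, posStart, increment, loop) {
  const ps = Number(posStart);
  const i = Number(increment);
  const d = Number(loop);
  // POS that was just tried, same formula as in the macro.
  const failedPos = (ps - i) + (d * i);
  if (extracted === '#EANF#') {
    // On Loop 1 nothing has worked yet with the current settings.
    const lastWorking = d > 1 ? String(failedPos - i) : 'none';
    MacroError('POS=' + failedPos + ' failed, last working POS_Nb=' + lastWorking);
  }
  return extracted;
}

// Usage: Loop 3 with POS_Start=91 and Increment=4 tries POS=99.
let report = '';
try {
  checkExtract('#EANF#', '91', '4', '3');
} catch (err) {
  report = err.message;
}
console.log(report); // POS=99 failed, last working POS_Nb=95
```

The Message only tells you where the Script stopped; the new 'POS_Start' and the new 'Increment' still have to be read from the Page itself.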
And if those 2 Values keep changing "a little bit too often", it would even be possible to make them "Dynamic" and automatically "detected"/computed by the Script..., which is pretty cool actually, ah-ah...!
Let me see if I can find it again, I had already "demonstrated" one Technique on how to do that... (I say "one" Technique because there are several that I use; many of my own Scripts are indeed "clever" enough to detect by themselves when something is "changing" on a Site and to (try to) adapt themselves directly, ah-ah...!)
Yep..., found it...!:
- Re: hdwallpapers
=> If you have a look at the Script, it's about the 'AutoDetect' Functionality/Section and the Nb of Pix per Page..., the same Principle can (probably) be applied to your Site to dynamically determine the Increment. Though hum..., it might be a little bit more complicated in your Case if the Increment is not constant on the whole Page, that would bring an extra "Layer" of inner/nested Logic, ah-ah...! Nice...!!
And you had mentioned "thousands" of Items to scrape; if they are all on the same Page, you might hit some "Performance" Issues if the Page is very large and the Increment needs to be recalculated on each Loop... (... Hum..., solved already I think..., just got a few "Creative" Ideas on how to solve it in a more efficient way, ah-ah...!)
>>>
But hum..., my Answers and Solutions are based on your exact Qt(s), the Info you provided, and the Solution/Implementation you had in mind... It is pretty cumbersome in my Opinion to have to increment your 'POS_Nb' by 4, or even "worse" to handle an Increment that is "Dynamic" through the Page, when you only want to scrape/tag/extract/save every 4th Item...
If you post the URL of your Site, I can have a Look and will probably find a simpler and more efficient Implementation where you wouldn't even need to care about that Increment part...
>>>
Mini-Request:
Could you remove the "Simple question: " part in your Thread Title...?
Those 2 Words are always useless in a Thread Title on the Forum anyway, as every Qt is "simple" if you know the Answer. And if the "Asker" can already qualify their Qt as "simple", then they should be able to find the Answer by themselves: that is the "Logic" I usually use, and I usually don't even bother to answer such Threads...
But more important for this Thread: because of the 64-Char Limit for Thread Titles, the Post Titles of all Replies in the Thread get truncated, and the Word/Term "automatically", which is much more important than "Simple" + "question", gets cut off as well. This means that any User searching the Forum with "automatically" as a Search Keyword won't find those Posts in the Search Results...
["Thread Title" = The Title of your first Post in this Thread... => 'Edit' Button and you can edit it...]
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...