Extract ID from list

DDon
Posts: 20
Joined: Sat Aug 06, 2016 1:41 pm

Extract ID from list

Post by DDon » Fri Nov 02, 2018 7:02 am

Firefox ESR 52.9, iMacros 8.9.7, Windows 10 64-bit

Is there any way to extract the data-ad-id values from the list? I would like to extract them one by one if possible.

Thank you.

[Attachment: hcScTYN.png (Screenshot)]
Last edited by DDon on Fri Nov 02, 2018 4:16 pm, edited 1 time in total.
chivracq
Posts: 7722
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract ID from list

Post by chivracq » Fri Nov 02, 2018 7:46 am

Beh, yeah, probably, what have you tried, what is the Pb...?

And I don't remember the GitHub Name of that fancy Code-Screenshot Tool, but it is better suited to impressing People on Twitter than to a TechForum, even if SOF are advertising for it... :roll:
Simply Copy&Paste that Content into some ]CODE[ Tags in your Post, that's more useful..., I can't even find your 'data-ad-id'...
Hum, and even more useful would be a Link to that Page, or upload some HTML Saveas to your Thread...

And when posting Images, upload them directly to the Forum without using external Hosting Providers that will become commercial one day or delete all Images after a few months...

(But I will have a look anyway only after you've mentioned what you've tried and why/where you got stuck... :idea: )
=> Quick Answer to your Qt is: YES...! :D

EDIT:
Oh...!, found the 'data-ad-id' List Items, ah-ah...! :x
Hum, not sure what "one by one" now means then...!? :?
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
DDon
Posts: 20
Joined: Sat Aug 06, 2016 1:41 pm

Re: Extract ID from list

Post by DDon » Fri Nov 02, 2018 4:15 pm

This is from the Gumtree website (a local selling site, www.gumtree.com.au). The code is from the My ads list. I am trying to help my uncle with semi-automatic relisting, as he has so many old items to sell (100+).

I tried using the Tag Class search-result-set but it doesn't seem to work; I put it in more or less at random to see if it would work. Basically, I don't know how to write the code to extract the ID, as it only appears in the page source, not on the page itself. I tested code based on my research, but it didn't work in my case: https://stackoverflow.com/questions/191 ... t#19159592


By "one by one" I meant that if I want to extract the 1st ID, it should extract only 1195627335.

Let me know what other info you need. I appreciate your help.
chivracq
Posts: 7722
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract ID from list

Post by chivracq » Fri Nov 02, 2018 5:41 pm

Hum, Thanks for uploading the Screenshot directly to the Forum, all those external Image Hosting Sites stop one day or become commercial or clean up "older" Stuff from time to time, which renders (older) Threads a bit useless with an empty Image Container... Hum, and I've personally blocked many of those Sites as well (except 'imgur' btw, as it is also stupidly used on SOF), as many Users on some Games Sites and Forums I use regularly have stupidly large '.GIF's in their Sigs that are hosted on such Sites...
Now the Image is in the Thread twice, but OK, no big deal... But the Content of an Image is still not searchable on the Forum...

"I tried put the Tag Class as search-result-set but seem it doesn't work...", hum..., well it should work..., I can't tell you what you are doing wrong if you don't post your Attempt..., "but seem it doesn't work" is a bit vague... I don't even try to type it for you as I can't copy&paste any Content from your sexy Image..., and I don't see any List on your 'gumtree' URL...
[Hum, "funny", you have some Double Space Typo in "search-result-set [ ] but" which is automatically trimmed by the Forum Software..., I had never really noticed... 8) ]

And yep the List Items are not visible because their 'TXT' Attribute is empty, but you can still extract them from iMacros..., (well, you can always extract everything present in the Source anyway...), => at the 'LI' (=> one by one) or 'UL' (=> the whole List) Level if you manage to tag it in Record Mode, or at one of the Containing 'DIV''s Level... (+ Use 'EVAL()' to keep only the Data you want, even at the 'LI' Level...)
DDon
Posts: 20
Joined: Sat Aug 06, 2016 1:41 pm

Re: Extract ID from list

Post by DDon » Mon Nov 05, 2018 1:54 pm

Well.

I tried code like this:

TAG POS=1 TYPE=UL ATTR=ID:search-result-set
TAG POS=R1 TYPE=LI ATTR=ID:* EXTRACT=TXTALL

But the code doesn't work. I also played with the POS number and changed the TYPE. I have never extracted specific data from the page source before, and I can't find any example similar to my case, so pretty much all the code I found through Google doesn't work in my situation.

I uploaded the website source code, even though it exposes some private details of my uncle's account; I don't see another way in this case.
Last edited by DDon on Fri Nov 09, 2018 3:13 pm, edited 1 time in total.
chivracq
Posts: 7722
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract ID from list

Post by chivracq » Mon Nov 05, 2018 2:37 pm

OK, I've downloaded your '.zip' File, in case you want to remove it from your previous Post, if you don't want "too many" People to download it... :wink:
And I could open it in Pale Moon...

But a mini-"Warning": I don't or very rarely write Scripts for other Users, I explain how to implement some Functionality and I let you and guide you until you find the/a "Solution" by yourself, but I won't be writing your Script... But some other (Advanced) User(s) might still do that for you, if you are a bit lazy + lucky, ah-ah...! :wink:

But OK, even without having a look at your Page (yet), I can already tell you what I think you are "doing wrong"... 8)

1- The 'EXTRACT=TXTALL' Parameter you are trying to use is meant for DDLB's (Drop-Down List Boxes), I don't think it will work on Bullet Lists..., I've never tried actually..., but I would be surprised if it did anything..., and certainly not at the 'LI' Level, maybe at the 'UL' Level, but hum-hum..., I don't know...!

2- The 'Relative Positioning' you are trying to use for the 1st 'LI' Element by using the 'UL' Element as Anchor is a "good Idea", except that I don't think it will work, because I reckon all 'LI' Elements are actually seen by iMacros as being "inside" the 'UL' Element, and iMacros will start looking for the 'POS=R1' only after that 'UL' Element at the same or higher Level in the HTML Structure of the Page, but not at a lower Level (= "inside").
=> You would need to use 'Double Relative Positioning' for iMacros to be able to see inside the 'UL' if using that 'UL' Element as the Anchor. :idea:

3- Once you'll have managed to tag an 'LI' Element, you'll want to extract it then, 'EXTRACT=TXT' won't do what you want because those Elements are a bit "invisible" because they have no 'TXT' Content, and 'EXTRACT=TXT' will simply return an empty String.
As the "data_whatever_id" Value you are interested in is not a 'TXT' Attribute but a "Custom" Attribute, you will need to use 'EXTRACT=HTM' on that 'LI' Element to extract its complete 'HTML' Source, and then, like I had already mentioned in my previous Post, use 'EVAL()' to isolate only the part that you want to keep...

iMB and iMacros for IE from v11.5 (I think...?, or maybe only from v12.0, I'm not sure anymore...) support extracting Custom Attributes directly from the 'EXTRACT' Mechanism, and iMacros for FF maybe, (but only from v9.0.3 or v10.0.2 then...?), but not v8.9.7, I had already tried and it is not supported in that Version, so the "way to go" is using 'EXTRACT=HTM'.
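As a rough illustration of the 'EXTRACT=HTM' plus 'EVAL()' idea above, the sketch below shows the JavaScript part on its own, using a regular expression rather than 'split()'. The sample markup in `liHtml` is an assumption (only the attribute name data-ad-id and the ID value appear in the thread), and the function name `adIdFromLi` is made up for the example.

```javascript
// Hypothetical sample of what EXTRACT=HTM might return for one list item.
// Only the attribute name and the ID value come from the thread;
// the rest of the markup is assumed.
const liHtml = '<li class="ad-row" data-ad-id="1195627335"></li>';

// Return the data-ad-id value, or an empty string when the attribute is absent.
function adIdFromLi(html) {
  const match = /data-ad-id="([^"]*)"/.exec(html);
  return match ? match[1] : '';
}

const firstId = adIdFromLi(liHtml);
```

Inside a macro, the body of such a function would have to be written inline in the 'EVAL()' string, with the double quotes escaped.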
DDon
Posts: 20
Joined: Sat Aug 06, 2016 1:41 pm

Re: Extract ID from list

Post by DDon » Fri Nov 09, 2018 3:13 pm

Hi chivracq,

Thank you for your reply. I'm a bit lost by your advice; however, it gave me an idea for a workaround.

I looked for another element (a button) on the page that has the ID in its link, then I extract the URL containing the ID and use EVAL to extract the ID from the URL. I solved the puzzle. YAY

Thank you!
chivracq
Posts: 7722
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract ID from list

Post by chivracq » Fri Nov 09, 2018 5:59 pm

Okay..., I'm not sure I understand if you are still looking for a/the Solution or if you've managed to get your Script to work...? :?
But if you've found a Solution, you should post your Script anyway as I'll be "curious" to see how you managed to implement it, and I could tell you if it can be improved, for Simplicity and/or Reliability, and that would also be "useful" for other Users...

But in a Nutshell if you are still looking for a Solution, the 'R-Pos' Implementation on the 'UL' Element as Anchor is a good Idea but you must use "Double Relative Positioning" (search the Forum to find Threads (with Code Examples) where I've explained the Concept many times already...), or the 'POS=R1' will catch the 1st 'LI' from the 2nd List on the Page (if there are several Lists..., or nothing if there is only 1 List...).

But maybe even simpler would be to extract the whole List in just one 'EXTRACT=HTM' on the 'UL' Element (then you don't even need to use 'Double R-Pos') and let 'EVAL()' "do the job" to isolate each 'LI_id' one by one... :idea:
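A minimal sketch of that "whole List in one go" idea, showing only the JavaScript side: it assumes the 'UL' source has already been extracted into a string. The markup in `ulHtml` is hypothetical (the second ID is invented for the example), and `allAdIds` is an illustrative name, not something from iMacros.

```javascript
// Assumed shape of the extracted UL source. The id "search-result-set",
// the attribute name and the first ID come from the thread;
// the second ID is made up for illustration.
const ulHtml =
  '<ul id="search-result-set">' +
  '<li data-ad-id="1195627335"></li>' +
  '<li data-ad-id="1000000002"></li>' +
  '</ul>';

// Collect every data-ad-id value, in document order.
function allAdIds(html) {
  const parts = html.split('data-ad-id="');
  // parts[0] is the text before the first attribute, so skip it.
  return parts.slice(1).map(function (part) {
    return part.split('"')[0];
  });
}

const ids = allAdIds(ulHtml);
```

Each entry of `ids` then corresponds to one 'LI', so picking the 1st, 2nd, 3rd ID is a matter of indexing into the array.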
chivracq
Posts: 7722
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract ID from list

Post by chivracq » Sun Nov 11, 2018 8:28 am

OK, never mind, losing interest a bit, you would need to react and follow up a bit quicker than once every 4 days... :idea:

So OK, I guess you managed to get your Script to work, very good then... :D , but it would still be useful for other Users with a similar Case/Qt and trying to follow the Thread if you could share your Final Script so they don't have "to re-invent the wheel"... :idea:
(Hum, and it's more than "an Idea", I won't help you next time if you don't share your Script/Solution anyway... :idea: )
Thread started 9 days ago..., should have been handled and solved in 1/2 hour... :roll:
DDon
Posts: 20
Joined: Sat Aug 06, 2016 1:41 pm

Re: Extract ID from list

Post by DDon » Sun Nov 11, 2018 11:19 pm

Ahh, sorry, I don't check the forums every day; I just log back in when I suddenly think "oops, forgot to check the iMacros forums".

Below is the solution: get the link which contains the listing ID from the Edit button, then extract the ID with EVAL, 8) .

If you noticed that there is var x,y,z in my EVAL script, that is because I didn't write the whole script by myself. I researched on the Internet to find a similar solution, as I'm not a programmer, but I spent a lot of hours making the script work for my case. For other beginners: var x,y,z; can be deleted to make the script look tidier. :wink:

Code: Select all

TAG POS={{row}} TYPE=A ATTR=TXT:Edit EXTRACT=HREF
SET listingID EVAL("var s='{{!EXTRACT}}'; var x,y,z; y=s.split('='); y[1];")
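For readers following along, here is a sketch of what the 'EVAL()' above does, written as standalone JavaScript. The real format of the Edit link is not shown in the thread, so the URL in `href` is purely hypothetical, and the function name `idAfterFirstEquals` and its guard for a missing '=' are additions for the example.

```javascript
// Hypothetical Edit link; the actual Gumtree URL format is not shown in the thread.
const href = 'https://www.example.com/manage-ad.html?adId=1195627335';

// Same idea as the EVAL above: take the text that follows the first '='.
// Returns an empty string when the URL contains no '=' at all.
function idAfterFirstEquals(url) {
  const parts = url.split('=');
  return parts.length > 1 ? parts[1] : '';
}

const listingId = idAfterFirstEquals(href);
```

One caveat of this approach: if the link carried further parameters after the ID, the text between the first and second '=' would include the start of the next parameter, so it only fits links where the ID is the last thing in the URL.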
chivracq
Posts: 7722
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract ID from list

Post by chivracq » Mon Nov 12, 2018 5:36 am

Yeah, well, "I research on the Internet" + "because the whole script I didn't write it by myself", yeah sure, you didn't need to search "on the Internet" because that Script is from me and I only post on the iMacros Forum...! :lol:

And the Script you posted has hardly anything to do with your original Qt..., it is simply one of my Examples and not really the best one in your Case, even if yep, the Syntax I used in the 'EVAL()' is indeed the one to use to implement the Functionality that you need (applied to the "real" HTML Element you want to extract, not this 'Link' + 'HREF' that have nothing to do with your Thread I would think...), you need one with a Double 'split()' and then you'll understand the Use for "x,y,z", ah-ah...!

>>>

I don't get it, either you've solved your Pb and you post your "real" Script or you still only have some vague "half-Idea" on how to solve it but you don't need to "pretend"... and I can still help you to get your Script to work..., but you've already been using iMacros for at least 2 years from your Reg-Date on the Forum, nothing is very complicated in what I'm trying to explain... :o
chivracq
Posts: 7722
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract ID from list

Post by chivracq » Mon Nov 12, 2018 1:34 pm

But OK, applied to the 'UL' Element, that would give stg like this for the 1st Item in the List...:

Code: Select all

TAG POS=1 TYPE=UL ATTR=ID:search-result-set EXTRACT=HTM
SET List_ID EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('data-ad-id=\"'); y=x[1].split('\"'); z=y[0]; z;")
PROMPT List_ID:<SP>_{{List_ID}}_ 
(Not tested...)

And if you want the 2nd, 3rd Item in the List etc..., you simply change the "y=x[1]..." part into "y=x[2]...", "y=x[3]..." etc... 8)

Oh...!, and if you ever expect that List to be empty (and/or the 'UL' Element not to be found), then you need to activate '!ERRORIGNORE' before the 'EVAL()' as 'EVAL()' doesn't "like" a 'split()[1]' if there is no 2nd Element in the 'split()', (Index starts at '[0]'), or your Script will abort otherwise...
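The index handling and the empty-List case described above can also be sketched in plain JavaScript. This is an illustration under assumptions, not a tested macro: `nthAdId` is a made-up name, and instead of relying on '!ERRORIGNORE' it checks the length of the 'split()' result before indexing, so a missing Item yields an empty string rather than an error.

```javascript
// Return the n-th data-ad-id value (n starts at 1) from an HTML string,
// or an empty string when there is no such item.
function nthAdId(html, n) {
  const x = html.split('data-ad-id="');
  // x[0] is the text before the first attribute; x[n] only exists
  // when the HTML contains at least n data-ad-id attributes.
  if (n < 1 || n >= x.length) {
    return '';
  }
  return x[n].split('"')[0];
}

// Hypothetical one-item list source for a quick check.
const oneItemHtml = '<ul id="search-result-set"><li data-ad-id="1195627335"></li></ul>';
const firstItemId = nthAdId(oneItemHtml, 1);
const missingItemId = nthAdId(oneItemHtml, 2);
```

Without the length check, indexing past the end gives `undefined`, and calling 'split()' on that is what makes the 'EVAL()' fail.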