Search for a list of URLs in site and extract to file?

Candita
Posts: 3
Joined: Wed Nov 04, 2015 7:22 pm

Search for a list of URLs in site and extract to file?

Post by Candita » Wed Nov 04, 2015 7:38 pm

I need to search my site for several old URLs so they can be updated. Then, for each URL that's found, I need to extract the page URL where it was found and the link text that's displayed. This all needs to be written to a CSV file for review. I'm fairly sure the URL will only appear once on a page and there won't be multiple URLs found on a page, but I can't rule that out. Ideally the macro could read the URLs I'm searching for from an Excel or CSV file (in which case I would need the macro to list which URL was found with the other info mentioned above), but I'm willing to search them individually. I've tried checking the example macros, forums and wiki, but I'm having trouble fitting it all together (haven't used iMacros in a loooong time). Help or a point in the right direction is appreciated!
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Search for a list of URLs in site and extract to file?

Post by chivracq » Thu Nov 05, 2015 12:56 am

Candita wrote: I need to search my site for several old URLs so they can be updated. Then, for each URL that's found, I need to extract the page URL where it was found and the link text that's displayed. This all needs to be written to a CSV file for review. I'm fairly sure the URL will only appear once on a page and there won't be multiple URLs found on a page, but I can't rule that out. Ideally the macro could read the URLs I'm searching for from an Excel or CSV file (in which case I would need the macro to list which URL was found with the other info mentioned above), but I'm willing to search them individually. I've tried checking the example macros, forums and wiki, but I'm having trouble fitting it all together (haven't used iMacros in a loooong time). Help or a point in the right direction is appreciated!
Hum, Post/Thread approved because no Spam, but CIM...! :mrgreen: for me to read...

OK, I read 11h->5h, you use the right Terminology and you've already used iMacros... => CIM indeed...! :mrgreen:
(I only read and answer Threads where Users mention their FCI, like mentioned as Required Info in the Forum Rules...)

Then, URL is missing + Script + where you get stuck exactly and what you've tried...

Compliment: Thread Title is perfect...! :D
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
Candita
Posts: 3
Joined: Wed Nov 04, 2015 7:22 pm

Re: Search for a list of URLs in site and extract to file?

Post by Candita » Thu Nov 05, 2015 2:36 pm

Thanks, I guess it's obvious I'm really out of practice and not just with iMacros :oops:

I've managed to search a single page and save a CSV with:

VERSION BUILD=8940826 RECORDER=FX
TAB T=1
URL GOTO=http://www.####.html
'TAG POS=1 TYPE=A ATTR=TXT:Knowledge<SP>Center EXTRACT=HREF
SET !LOOP 1
TAG POS= {{!LOOP}} TYPE=A ATTR=HREF:*[URL pattern I'm searching for]* EXTRACT=TXT
TAG POS= {{!LOOP}} TYPE=A ATTR=HREF:*[URL pattern I'm searching for]* EXTRACT=HREF
SET !EXTRACTADD {{!URLCURRENT}}
SAVEAS TYPE=EXTRACT FOLDER=* FILE=SUPPORT-EXTRACT

I'm not seeing how to search the entire site or how to search each page until the pattern is not found, then move to another page. This also reloads the page each time, which seems inefficient.
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Search for a list of URLs in site and extract to file?

Post by chivracq » Thu Nov 05, 2015 3:16 pm

Candita wrote:Thanks, I guess it's obvious I'm really out of practice and not just with iMacros :oops:

I've managed to search a single page and save a CSV with:

Code: Select all

VERSION BUILD=8940826 RECORDER=FX
TAB T=1
URL GOTO=http://www.####.html
'TAG POS=1 TYPE=A ATTR=TXT:Knowledge<SP>Center EXTRACT=HREF
SET !LOOP 1
TAG POS= {{!LOOP}} TYPE=A ATTR=HREF:*[URL pattern I'm searching for]* EXTRACT=TXT
TAG POS= {{!LOOP}} TYPE=A ATTR=HREF:*[URL pattern I'm searching for]* EXTRACT=HREF
SET !EXTRACTADD {{!URLCURRENT}} 
SAVEAS TYPE=EXTRACT FOLDER=* FILE=SUPPORT-EXTRACT
I'm not seeing how to search the entire site or how to search each page until the pattern is not found, then move to another page. This also reloads the page each time, which seems inefficient.
OK, that's already a bit better, but CIM => FCIM...! :mrgreen: Read my Sig if you don't get that one... :idea:

Code: Select all

=> iMacros for FF v8.9.4, FF...?, OS...?
If it's "your Site", then I don't understand why you obfuscate the URL... :?
It's difficult to have a clear idea of what you want if you don't provide all Info..., => a few Samples of your "[URL pattern I'm searching for]" would be needed as well...

For using a .CSV File as a DataSource, look at the 'Loop-CSV-2-Web.iim' Demo Macro... :idea:
And/or search the Forum on "!DATASOURCE" and you'll find many Examples...
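
As a rough sketch of the '!DATASOURCE' route (not a tested Macro): it assumes a hypothetical File 'old-urls.csv' in the iMacros Datasources Folder with one URL pattern per line, that the Page to check is already open in the Browser, and that the Macro is started with 'Play (Loop)' so that {{!LOOP}} picks the next line on each run. The File name and the one-column layout are assumptions, not something given in this Thread.

```
' Hypothetical DataSource: old-urls.csv, one URL pattern per line
SET !DATASOURCE old-urls.csv
SET !DATASOURCE_COLUMNS 1
' Each loop reads the next line of the File
SET !DATASOURCE_LINE {{!LOOP}}
' Keep going when a pattern is not found on the Page
SET !ERRORIGNORE YES
' Record which pattern was searched for and on which Page
ADD !EXTRACT {{!COL1}}
ADD !EXTRACT {{!URLCURRENT}}
' Link text and HREF of the first link matching the pattern
TAG POS=1 TYPE=A ATTR=HREF:*{{!COL1}}* EXTRACT=TXT
TAG POS=1 TYPE=A ATTR=HREF:*{{!COL1}}* EXTRACT=HREF
SAVEAS TYPE=EXTRACT FOLDER=* FILE=SUPPORT-EXTRACT
```

Each loop should then append one row with the pattern, the Page URL, the link text and the HREF. A pattern that is not found on the Page should show up as #EANF# in the extract instead of stopping the Macro. Patterns containing spaces or other special characters may need escaping in the 'ATTR' part.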

You come indeed from some "ice age period" with iMacros...! :shock: , I haven't seen '!EXTRACTADD' in years, it was deprecated several years ago, even if I think it still works (except on CR maybe) thanks to Backward Compatibility, but you should rather use the 'ADD' Command now.
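
For illustration, a minimal sketch of that replacement applied to the line from the Macro quoted above (only this one line changes, the rest of the Macro is assumed to stay as it is):

```
' Deprecated form used in the Macro above
'SET !EXTRACTADD {{!URLCURRENT}}
' Same effect with the 'ADD' Command
ADD !EXTRACT {{!URLCURRENT}}
```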

Reloading the Page each time with 'URL GOTO' is indeed not very efficient, you can just comment it out.
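
A hedged sketch of what that could look like, based on the Macro quoted above and untested: the 'URL GOTO' line is commented out with a leading apostrophe, the Page is assumed to be opened once by hand, and the Macro is assumed to be started with 'Play (Loop)' so that 'POS' follows the loop counter. The '[URL pattern I'm searching for]' placeholder is kept exactly as posted.

```
TAB T=1
' Page load left out: open the Page once by hand, then use Play (Loop)
'URL GOTO=http://www.####.html
' Keep going once POS runs past the last matching link
SET !ERRORIGNORE YES
' POS follows the loop counter: 1st match on loop 1, 2nd match on loop 2, ...
TAG POS={{!LOOP}} TYPE=A ATTR=HREF:*[URL pattern I'm searching for]* EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=A ATTR=HREF:*[URL pattern I'm searching for]* EXTRACT=HREF
ADD !EXTRACT {{!URLCURRENT}}
SAVEAS TYPE=EXTRACT FOLDER=* FILE=SUPPORT-EXTRACT
```

Once the loop counter is higher than the number of matching links on the Page, the extract should contain #EANF# for the link text and HREF, which marks the point where the Page has no further matches.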
Candita
Posts: 3
Joined: Wed Nov 04, 2015 7:22 pm

Re: Search for a list of URLs in site and extract to file?

Post by Candita » Thu Nov 05, 2015 5:48 pm

Sorry to have bothered you. Please delete my posts and account. I'll find another solution.
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Search for a list of URLs in site and extract to file?

Post by chivracq » Thu Nov 05, 2015 7:09 pm

Candita wrote:Sorry to have bothered you. Please delete my posts and account. I'll find another solution.
You don't bother me nor any other Advanced User(s) who'll be willing to help you, I'm just asking you to mention your FCI (Full Config Info) as stated in the Forum Rules as Required Info, and I/we need some more Info to be able to help you. From just looking at your Script we cannot deduce what your Page looks like, and iMacros depends heavily on the HTML Structure of a Page..., and I guess you know that as you've been using iMacros for a long time/in the past...

"Please delete my posts and account." is a bit of a childish reaction... (and you can do it yourself... if you want to go that far..., that's why I quoted all your Posts...)
This Forum is the best place to help you when you have some Pb/Question with/about iMacros and finding a way to get your Script running, but you have to play by the "Rules" and mention all relevant Info needed for other (More Advanced) Users to be able to try to help you... It will be the same on any other (Technical) Forums... OK, good luck anyway... 8)
pursharthahuja
Posts: 3
Joined: Mon Jan 25, 2016 7:17 am

Re: Search for a list of URLs in site and extract to file?

Post by pursharthahuja » Mon Jan 25, 2016 8:14 am

I am facing the same problem. My setup:
iMacros for FF v8.9.4, and the OS is Linux.
The rest of the details are the same as mentioned above.
Please provide an answer to this question ASAP.

thanks in advance.