Extract In Multiple HTML Tag

Discussions and Tech Support related to website data extraction, screen scraping and data mining using iMacros.
debyling12
Posts: 2
Joined: Fri Dec 29, 2017 11:16 am

Extract In Multiple HTML Tag

Post by debyling12 » Fri Dec 29, 2017 11:23 am

Hi, I would like to ask how to extract the bold text "The Photographer (2017)" from the HTML below.

<h1><a href="hxxt://sample.com" title="The Photographer (2017)" target="blank;"><img src="http://www.senimovies.net/wp-content/th ... .png"/></a> The Photographer (2017) </h1>

I have used this code:
TAG POS=1 TYPE=H1 ATTR=* EXTRACT=TXT

The result is the same text repeated many times, like this:
The Photographer (2017)The Photographer (2017)The Photographer (2017)The Photographer (2017)The Photographer (2017)The Photographer (2017)The Photographer (2017)The Photographer (2017)
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract In Multiple HTML Tag

Post by chivracq » Fri Dec 29, 2017 12:29 pm

CIM...! (Read my sig...)

Your 'EXTRACT=TXT' on that 'TYPE=H1' element should indeed extract the data you want, I would think. But you've removed all attributes from your 'TAG' statement, so you might be extracting a completely different element, unless iMacros itself recorded "ATTR=*" in 'Record' mode. No URL was posted, so we can't have a look at your site, even if its URL is probably not difficult to guess.

Another option would be 'EXTRACT=TITLE'.
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
debyling12
Posts: 2
Joined: Fri Dec 29, 2017 11:16 am

Re: Extract In Multiple HTML Tag

Post by debyling12 » Sat Dec 30, 2017 5:36 am

Hi, thanks for the reply. I also tried EXTRACT=TITLE, but there is nothing to extract because the title is inside <h1><a title>.
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract In Multiple HTML Tag

Post by chivracq » Sat Dec 30, 2017 9:08 am

Yeah, I don't know. Provide the URL of your site so I can have a look, and mention your FCI as I already asked, so that I can follow up.
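
On the point that the title attribute sits on the <a> and not on the <h1>: one thing that might be worth trying is to tag the link itself instead of the heading. This is only a minimal sketch. It assumes the link inside the heading is the first <a> on the page that carries a title attribute, which cannot be checked without the URL. Both POS=1 and the ATTR=TITLE:* filter are assumptions and may need adjusting on the real page.

```
' Hypothetical sketch, untested against the real page.
' Clear any values left over from earlier extractions.
SET !EXTRACT NULL
' Tag the first <a> that has a title attribute and extract that attribute.
TAG POS=1 TYPE=A ATTR=TITLE:* EXTRACT=TITLE
' Display what was extracted, to check it by eye.
PROMPT {{!EXTRACT}}
```

If this matches the wrong link, narrowing the ATTR filter or raising POS would be the first things to try.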