Select text from paragraph

Post by marion » Wed Jan 18, 2017 8:02 am

Hi,
I'm using the iMacros Firefox extension (9.0.3) and Firefox 50.1.0.

I have the following HTML:

Code:

<p>
first line<br /> second line<br /> E-Mail: <a href="mailto:test@web.com">test@web.com</a><br>  <a href="http://www.web.com" target="_blank">http://www.web.com</a><br> <br />
</p>
I have the following iMacro:

Code:

TAG POS=1 TYPE=P ATTR=TXT:* EXTRACT=TXT
When I extract this, the text repeats from the top at the position of each link:

Code:

first line
second line
E-Mail:first line
second line
E-Mail:test@web.com
first line
second line
E-Mail:first line
second line
E-Mail:test@web.com
http://www.web.com
How can I avoid this? What am I doing wrong?

Thanks for your help

Post by chivracq » Wed Jan 18, 2017 4:05 pm

Extracting paragraph with links in it

Posted by marion on 18 Jan 2017, 10:46

Hi,

Code:

Firefox version: 50.1.0
iMacros version (add-on for Firefox): 9.0.3
I want to extract a text from a paragraph.

This is the HTML:

Code:

    <p>
    first line<br /> second line<br /> E-Mail: <a href="mailto:test@web.com">test@web.com</a><br>  <a href="http://www.web.com" target="_blank">http://www.web.com</a><br> <br />
    </p>
This is my iMacros code:

Code:

    TAG POS=1 TYPE=P ATTR=TXT:* EXTRACT=TXT
I'm expecting this:

Code:

    first line
    second line
    E-Mail: test@web.com
    http://www.web.com
But I get:

Code:

    first line
    second line
    E-Mail:first line
    second line
    E-Mail:test@web.com
    first line
    second line
    E-Mail:first line
    second line
    E-Mail:test@web.com
    http://www.web.com
I can't understand this. Am I doing something wrong? How can I avoid this?

No need to open duplicates of your thread in different sub-forums. This thread in the 'Data Extraction' sub-forum is enough. I've deleted your duplicate thread in the 'FF' sub-forum, as your question has nothing specific to iMacros for FF only.

Your OS is missing from your FCI, by the way. It won't play a role in this thread, but FCI = iMacros + browser + OS.

You didn't post the URL, so I cannot test on the page myself. I suspect that, because you removed the content of the 'TXT' attribute (ATTR=TXT:*) for the 'P' element you are trying to extract, you now extract some other, hidden 'P' element higher in the HTML structure of the page, and the 'P' element you actually want has shifted to POS=2, POS=3, etc.
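
One way to check that suspicion is to probe the first few 'P' elements and look at what each one returns. The macro below is only a sketch: the POS values and the 'second line' text filter are guesses based on the HTML snippet in the question, not something tested on the real page.

```
' Sketch only: POS values are guesses, adjust them to the real page.
' Suppress the extraction popup while probing.
SET !EXTRACT_TEST_POPUP NO
' Probe the first three P elements; results are collected in !EXTRACT.
TAG POS=1 TYPE=P ATTR=TXT:* EXTRACT=TXT
TAG POS=2 TYPE=P ATTR=TXT:* EXTRACT=TXT
TAG POS=3 TYPE=P ATTR=TXT:* EXTRACT=TXT
' Show everything extracted so far.
PROMPT {{!EXTRACT}}

' Alternative: narrow the match with known text instead of TXT:*.
' <SP> stands for a space inside an ATTR value.
SET !EXTRACT NULL
TAG POS=1 TYPE=P ATTR=TXT:*second<SP>line* EXTRACT=TXT
PROMPT {{!EXTRACT}}
```

If the paragraph you want only shows up at POS=2 or POS=3, or only with the text filter, then POS=1 with TXT:* is matching a different 'P' element.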

Another method, since your 'P' element is probably contained in some 'DIV' element, would be to extract that 'DIV' element at the higher level, for example by its class or ID.
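
A minimal sketch of that second method follows. The class and ID value 'contact' is purely hypothetical, since the surrounding HTML was not posted; replace it with whatever the real 'DIV' uses.

```
' Sketch only: "contact" is a hypothetical class name / ID.
' Extract the parent DIV by its class ...
TAG POS=1 TYPE=DIV ATTR=CLASS:contact EXTRACT=TXT
' ... or by its ID, if it has one.
TAG POS=1 TYPE=DIV ATTR=ID:contact EXTRACT=TXT
```

Use only one of the two TAG lines in the real macro, depending on which attribute the 'DIV' actually has.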
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...