Issue Extract with firefox plugin

Felipe1
Posts: 2
Joined: Mon May 15, 2017 8:10 am

Issue Extract with firefox plugin

Post by Felipe1 » Mon May 15, 2017 8:27 am

Browser: Firefox 53.0.2 (32 bit)
OS: Windows 8.1 Pro
Plugin: iMacros for Firefox 9.0.3 (updated April 24, 2017)

-----------------------

Hello,

First of all, sorry for my bad English.

I'm having an issue with EXTRACT=TXT when the text to extract contains bold or italic characters in the middle. It happens with the Firefox plugin.

The iMacros program works properly.

I've changed the HTML of the demo page (http://demo.imacros.net/Automate/Extract2) to test whether the problem was caused by the web page I want to extract the text from.

I've made the words "second line" bold in the middle of the phrase "The second line is extracted, too."

In HTML:

<td style="width: 52%; outline: 1px solid blue;" class="bdytxt">
This line is extracted.<br>
The <strong>second line</strong>
is extracted, too.
</td>

Running this macro:

VERSION BUILD=8031994
TAB T=1
SET !EXTRACT_TEST_POPUP NO
URL GOTO=http://demo.imacros.net/Automate/Extract2
TAG POS=1 TYPE=TD ATTR=CLASS:bdytxt&&TXT:* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=+_{{!NOW:yyyymmdd_hhnnss}}

the result is:

This line is extracted.
TheThis line is extracted.
Thesecond lineis extracted, too.

instead of:

This line is extracted.
The second line is extracted, too.


It seems that the extracted text repeats itself when the extraction reaches the bold tag (<strong>).

Thank you in advance.
chivracq
Posts: 9374
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Issue Extract with firefox plugin

Post by chivracq » Mon May 15, 2017 12:49 pm

Hum..., sounds like a Bug to me indeed... But..., hum again..., I cannot reproduce it, it works fine for me after modifying the Demo Page like you did... I attach my "own" Demo Page to the Thread...

Your Extract Statement:

TAG POS=1 TYPE=TD ATTR=CLASS:bdytxt&&TXT:* EXTRACT=TXT

returns for me the expected Result:

                This line is extracted.
                The second line is extracted, too.

And even the following Extract on the 'STRONG' Element:

'TAG POS=1 TYPE=STRONG ATTR=TXT:second<SP>line EXTRACT=HTM
TAG POS=1 TYPE=STRONG ATTR=TXT:* EXTRACT=TXT

returns the expected Result:

second line

I tested in 2 different FCI's, both with the same Results:
- FCI_1: iMacros for FF v8.8.2, Pale Moon v26.3.3 (=FF47), Win10-x64.
- FCI_2: iMacros for FF v8.9.7, FF51, Win10-x64.

It could be specific to only v9.0.3 that you are using as I tested on 2 earlier Versions of iMacros for FF...
But v9.0.3 is pretty Buggy actually and never really got tested as most "serious" Users quickly reverted to v8.9.7 when v9.0.3 was released, so new Bugs keep slowly reaching the Forum...
But v8.9.7 is much more stable and reliable than v9.0.3, so Advice would be for you to revert to v8.9.7...! (It still works on FF52/53.)
(Make sure to disable Automatic Updates for iMacros otherwise it will update itself back to v9.0.3, ah-ah...!)
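
If reverting is not an option, one thing that might be worth trying is to extract the cell with EXTRACT=HTM instead of EXTRACT=TXT and flatten the markup outside iMacros. This is only an assumption: it has not been tested against v9.0.3, and the HTM extraction could be affected by the same bug. A minimal Python sketch of the flattening step follows. The names `TextCollector`, `extract_text`, `BREAK` and `SAMPLE` are hypothetical helpers, not part of iMacros, and the sample is the modified demo cell from the first post.

```python
from html.parser import HTMLParser

# Internal marker for <br>, so newlines in the HTML source are not mistaken for line breaks.
BREAK = "\x00"


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers text nodes in document order."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Only <br> produces a line break; inline tags such as <strong> add nothing.
        if tag == "br":
            self.parts.append(BREAK)

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(html):
    """Return the visible text of an HTML fragment, one line per <br>-separated segment."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()  # flush any text still buffered by the parser
    segments = "".join(collector.parts).split(BREAK)
    # Collapse runs of whitespace (including source newlines) inside each segment.
    lines = [" ".join(segment.split()) for segment in segments]
    return "\n".join(line for line in lines if line)


# Modified demo cell from the first post.
SAMPLE = """<td style="width: 52%; outline: 1px solid blue;" class="bdytxt">
This line is extracted.<br>
The <strong>second line</strong>
is extracted, too.
</td>"""

print(extract_text(SAMPLE))
```

For the sample cell this prints the two lines that were expected in the first post, without the duplicated fragment.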
Attachments
iMacros - STRONG.zip
(82.47 KiB)
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...
Felipe1
Posts: 2
Joined: Mon May 15, 2017 8:10 am

Re: Issue Extract with firefox plugin

Post by Felipe1 » Tue May 16, 2017 7:12 am

Thank you very much!!!

Yes, it's a bug in version 9.0.3. After reverting to version 8.9.7, it works fine.

;-)