Extract or eliminate text from duplicate text

deepesh
Posts: 3
Joined: Sat Jul 01, 2017 6:44 am

Post by deepesh » Fri Jul 07, 2017 10:13 am

I'm extracting data from this link: "http://www.magicbricks.com/property-for ... ax=20-Lacs"
I tried:

Code:

TAG POS=2 TYPE=a ATTR=CLASS:property-sticky-link&&TXT:* EXTRACT=TXT
But this gives me:
BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartment1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartment1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan690 sqft
What I want is just "1 BHK Apartment for sale in Kalyan", only once,
so I tried this:

Code:

SET !VAR1 EVAL("var s=\"{{!EXTRACT}}\"; s.split(' ')[-7];")
But neither of these works.

Any idea how to go about this?

Any help would be much appreciated. Thanks.
chivracq
Posts: 8703
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract or eliminate text from duplicate text

Post by chivracq » Fri Jul 07, 2017 3:05 pm

Hum, nice to see that you've "finally" found our Forum, ah-ah...! I had noticed that you had been asking a few "interesting" Qt's on the SOF Forum in the last few weeks, for a few of which I would probably have had some better/simpler Solutions btw, but I never answered any as you never mention your FCI and I don't like the Reputation System on SOF...

We are a little bit "stricter" on this Forum, and you need to mention your FCI (read my Sig...) for each Thread you open, or when you post for the first time in some existing Thread...
=> CIM...!

But OK, I had a look at your Site, but hum, I get "1 BHK Apartment for sale in Ambernath 650 sqft" only once for your Extract on 'POS=2'... The Site Content seems to be updated regularly...
I've removed a few Soft Returns from the Extract btw, the exact Result is:

Code:




1 BHK Apartment

for sale in
Ambernath




650 sqft

The Data looks a little bit "cleaner" if you extract it from the 'H3' Element embedded in the Link:

Code:

TAG POS=2 TYPE=A ATTR=CLASS:property-sticky-link&&TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=* ATTR=* EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R-1 TYPE=H3 ATTR=* EXTRACT=TXT
=> Extracted:

Code:

1 BHK Apartment

for sale in
Ambernath




650 sqft
But for both you would still need to trim and remove the Soft Returns, and you would get the same Result, so your 'EXTRACT' on the Link directly is "good enough" already, I would think, ah-ah...!
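For the "trim and remove the Soft Returns" part, a minimal JavaScript sketch could look like the one below. It is only an illustration: `collapseWhitespace` is a hypothetical helper name, and the sample string is an assumption modelled on the Extract shown above (the real Extract may contain other whitespace, such as tabs or spaces between the lines).

```javascript
// Hypothetical helper (not part of iMacros): collapse the soft returns
// and padding found in an extracted string.
function collapseWhitespace(raw) {
  // Replace every run of whitespace (spaces, tabs, newlines) with a
  // single space, then strip whatever is left at both ends.
  return raw.replace(/\s+/g, ' ').trim();
}

// Sample input modelled on the Extract above (an assumption, not the
// byte-exact page content).
var sample = '\n\n1 BHK Apartment\n\nfor sale in\nAmbernath\n\n\n\n650 sqft\n';

console.log(collapseWhitespace(sample));
// prints "1 BHK Apartment for sale in Ambernath 650 sqft"
```

The same expression should be usable inside an 'EVAL()', keeping in mind that backslashes need to be doubled inside the iMacros string.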

But OK, this one seems to work on all Properties on this Page, simply change the 'POS=2' on the Link to 'POS=n' or 'POS={{!LOOP}}' to test...:

Code:

VERSION BUILD=8820413 RECORDER=FX
TAB T=1
SET !EXTRACT_TEST_POPUP NO

'Extract 'Property Description':
TAG POS=2 TYPE=A ATTR=CLASS:property-sticky-link&&TXT:* EXTRACT=TXT
SET Ppt_Extract {{!EXTRACT}}

'Extract 'nnn sqft' Value (to remove it from the Description):
TAG POS=R1 TYPE=* ATTR=* EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R-1 TYPE=B ATTR=CLASS:"areaValue" EXTRACT=TXT
SET Sqft {{!EXTRACT}}

'Isolate and clean 'Property Description':
SET Ppt_Descr EVAL("var pe='{{Ppt_Extract}}', sqft='{{Sqft}}'; var x=pe.replace(sqft,''); x=x.replace('for sale in',' for sale in '); var y=x.split('\\n').join(''); var z=y.trim(); z;")
PROMPT _{{Ppt_Descr}}_
=> Result from the 'PROMPT' will be:
_1 BHK Apartment for sale in Ambernath_
(Tested on iMacros for FF v8.8.2, Pale Moon v26.3.3 (=FF47), Win10-x64.)
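To make the 'EVAL()' step easier to follow, here is the same chain of operations as a standalone JavaScript sketch that can be run outside iMacros. `cleanDescription` is a hypothetical name, and the sample input is an assumption modelled on the Extract shown earlier, not the exact content of the Page.

```javascript
// Hypothetical standalone version of the cleaning steps from the EVAL() above.
function cleanDescription(extract, sqft) {
  return extract
    .replace(sqft, '')                        // drop the 'nnn sqft' value
    .replace('for sale in', ' for sale in ')  // pad the phrase with spaces
    .split('\n').join('')                     // remove the soft returns
    .trim();                                  // strip leading/trailing blanks
}

// Sample input (an assumption): link text with soft returns and the sqft value.
var extract = '\n1 BHK Apartment\n\nfor sale in\nAmbernath\n\n\n650 sqft\n';

console.log('_' + cleanDescription(extract, '650 sqft') + '_');
// prints "_1 BHK Apartment for sale in Ambernath_"
```

As a side note on the original attempt: JavaScript arrays do not support negative indices, so `s.split(' ')[-7]` evaluates to `undefined`. Counting from the end needs something like `arr[arr.length - 7]` or `arr.slice(-7)[0]`.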
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...
deepesh
Posts: 3
Joined: Sat Jul 01, 2017 6:44 am

Re: Extract or eliminate text from duplicate text

Post by deepesh » Tue Jul 11, 2017 5:02 am

Oh yes, it worked... Thanks a lot for your time and effort. Thanks a ton!
chivracq
Posts: 8703
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract or eliminate text from duplicate text

Post by chivracq » Tue Jul 11, 2017 6:55 am

Okay, Thanks for the Update, well, 4 days later... Glad to hear that my Script solved your Pb, but hum, I asked you to mention your FCI and you didn't comply, so I guess I won't try to help you next time, or you'll have to wait just as long until you mention it...
(Sorry, but I only help Users who use the Forum "a bit correctly", and reading the Forum Rules and answering my Qt's belong to that "a bit correctly"...)
And hum, many/most of your Threads on SOF are still waiting for some Follow-up from you, I notice. You should respect the People helping you on Tech Forums a bit more, just saying...