Extract or eliminate text from duplicate text

by deepesh on Fri Jul 07, 2017 3:13 am

I'm extracting data from this link: "http://www.magicbricks.com/property-for-sale/residential-real-estate?bedroom=1&proptype=Multistorey-Apartment,Builder-Floor-Apartment,Penthouse,Studio-Apartment&cityName=Thane&BudgetMin=5-Lacs&BudgetMax=20-Lacs"
I tried:
Code: Select all
TAG POS=2 TYPE=a ATTR=CLASS:property-sticky-link&&TXT:* EXTRACT=TXT

But this gives me:
BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartment1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartment1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan1 BHK Apartmentfor sale in1 BHK Apartmentfor sale inKalyan690 sqft


But what I want is just "1 BHK Apartment for sale in Kalyan", only once,
so I tried this:
Code: Select all
SET !VAR1 EVAL("var s=\"{{!EXTRACT}}\"; s.split(' ')[-7];")

But neither of these works.

Any idea how to go about this?

Any help would be much appreciated. Thanks
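
A side note on the second attempt above: in plain JavaScript, a negative array index such as [-7] does not count from the end, it is looked up as a property named "-7" and gives undefined. A minimal sketch of the usual alternatives follows. The sample string is made up for illustration, and how iMacros passes a multi-line {{!EXTRACT}} into EVAL is not covered here.

```javascript
// Hypothetical sample string; the real value would come from {{!EXTRACT}}.
const s = "1 BHK Apartment for sale in Kalyan 690 sqft";
const words = s.split(' ');

// A negative index is treated as a property named "-7", which does not exist.
const viaNegativeIndex = words[-7];

// slice(-n) does count from the end and returns a new array.
const lastTwo = words.slice(-2).join(' ');

// Indexing from the end by hand: length minus offset.
const thirdFromEnd = words[words.length - 3];

console.log(viaNegativeIndex); // undefined
console.log(lastTwo);          // 690 sqft
console.log(thirdFromEnd);     // Kalyan
```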
deepesh
 
Posts: 3
Joined: Fri Jun 30, 2017 11:44 pm

Re: Extract or eliminate text from duplicate text

by chivracq on Fri Jul 07, 2017 8:05 am

Hum, nice to see that you've "finally" found our Forum, ah-ah...!, I had noticed you had been asking a few "interesting" Qt's on the SOF Forum in the last few weeks, for a few of which I would have probably had some better/simpler Solutions btw, but I never answered any as you never mention(ed) your FCI and I don't like the Reputation System on SOF...

We are a little bit "stricter" on this Forum, and you need to mention your FCI (read my Sig...) for each Thread you open, or if you post for the first time in some existing Thread...
=> CIM...! :mrgreen:

But OK, I had a look at your Site, but hum, for your Extract on 'POS=2' I get only one Result, without any Duplicates: "1 BHK Apartment for sale in Ambernath 650 sqft"... The Site Content seems to be updated regularly...
I've removed a few Soft Returns from the Extract in that Line btw, the exact Result is:
Code: Select all



1 BHK Apartment

for sale in
Ambernath




650 sqft



The Data looks a little bit "cleaner" if you extract it from the 'H3' Element embedded in the Link:
Code: Select all
TAG POS=2 TYPE=A ATTR=CLASS:property-sticky-link&&TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=* ATTR=* EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R-1 TYPE=H3 ATTR=* EXTRACT=TXT

=> Extracted:
Code: Select all
1 BHK Apartment

for sale in
Ambernath




650 sqft

But for both you would still need to trim and remove the Soft Returns, and you would get the same Result, so your 'EXTRACT' on the Link directly is "good enough" already, I would think, ah-ah...!
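
For the "trim and remove the Soft Returns" part, here is a minimal sketch in plain JavaScript, assuming the Soft Returns arrive as ordinary line-break characters in the extracted string. The helper name collapseWhitespace and the sample string are made up for illustration, they are not part of iMacros.

```javascript
// Hypothetical helper: collapse every run of whitespace (spaces, tabs,
// line breaks) into a single space, then trim both ends.
function collapseWhitespace(raw) {
  return raw.replace(/\s+/g, ' ').trim();
}

// Sample modelled on the raw Extract shown above (line breaks assumed to be "\n").
const rawExtract = "\n\n\n1 BHK Apartment\n\nfor sale in\nAmbernath\n\n\n\n\n650 sqft\n\n\n";

const cleaned = collapseWhitespace(rawExtract);
console.log(cleaned); // 1 BHK Apartment for sale in Ambernath 650 sqft
```

This keeps the 'sqft' Value in the Result; removing it needs an extra step.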

But OK, this one seems to work on all Properties on this Page, simply change the 'POS=2' on the Link to 'POS=n' or 'POS={{!LOOP}}' to test...:
Code: Select all
VERSION BUILD=8820413 RECORDER=FX
TAB T=1
SET !EXTRACT_TEST_POPUP NO

'Extract 'Property Description':
TAG POS=2 TYPE=A ATTR=CLASS:property-sticky-link&&TXT:* EXTRACT=TXT
SET Ppt_Extract {{!EXTRACT}}

'Extract 'nnn sqft' Value (to remove it from the Description):
TAG POS=R1 TYPE=* ATTR=* EXTRACT=TXT
SET !EXTRACT NULL
TAG POS=R-1 TYPE=B ATTR=CLASS:"areaValue" EXTRACT=TXT
SET Sqft {{!EXTRACT}}

'Isolate and clean 'Property Description':
SET Ppt_Descr EVAL("var pe='{{Ppt_Extract}}', sqft='{{Sqft}}'; x=pe.replace(sqft,''); x=x.replace('for sale in',' for sale in '); y=x.split('\\n').join(''); z=y.trim(); z;;")
PROMPT _{{Ppt_Descr}}_

=> Result from the 'PROMPT' will be:
_1 BHK Apartment for sale in Ambernath_
(Tested on iMacros for FF v8.8.2, Pale Moon v26.3.3 (=FF47), Win10-x64.)
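
To make the 'EVAL' Line in the Script above easier to follow, here are the same Steps written out as a standalone JavaScript function. This is only a sketch: the function name cleanDescription and the sample Input are made up for illustration, and the Input assumes that the Soft Returns are "\n" characters.

```javascript
// Hypothetical standalone version of the cleaning steps from the EVAL() line.
function cleanDescription(pptExtract, sqft) {
  // 1. Drop the area value (e.g. "650 sqft") from the description.
  let x = pptExtract.replace(sqft, '');
  // 2. Pad the phrase so it keeps a space on both sides
  //    once the line breaks around it are removed.
  x = x.replace('for sale in', ' for sale in ');
  // 3. Remove all line breaks, then trim leading/trailing whitespace.
  return x.split('\n').join('').trim();
}

// Sample input modelled on the raw Extract shown earlier in this post.
const pptExtract = "\n\n\n1 BHK Apartment\n\nfor sale in\nAmbernath\n\n\n\n\n650 sqft\n\n\n";
const description = cleanDescription(pptExtract, "650 sqft");

console.log("_" + description + "_"); // _1 BHK Apartment for sale in Ambernath_
```

The Underscores in the Output play the same Role as in the 'PROMPT': they make any leftover leading or trailing Space visible.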
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 
Posts: 6474
Joined: Sat Apr 13, 2013 6:07 am
Location: Amsterdam (NL)

Re: Extract or eliminate text from duplicate text

by deepesh on Mon Jul 10, 2017 10:02 pm

Oh yes, it worked... Thanks a lot for your time and effort... Thanks a ton!

Re: Extract or eliminate text from duplicate text

by chivracq on Mon Jul 10, 2017 11:55 pm

Okay, Thanks for the Update, well 4 days later, glad to hear that my Script solved your Pb..., but hum..., I asked you to mention your FCI but you didn't comply so I guess I won't try to help you next time, or you'll have to wait for the same time until you mention it... :wink:
(Sorry but I only help Users using the Forum "a bit correctly" and reading the Forum Rules and answering my Qt's belong to that "a bit correctly"...)
And hum, many/most of your Threads on SOF are waiting for some Follow-up from you I notice, you should respect a bit more the People helping you on Tech Forums, just saying... :idea:

