Extracting from Salesforce webpage

by deplane on Wed Oct 26, 2016 9:00 am

FCI: Win 7 x32 + FF 49.0.2 (always the latest) + iMacros for FF 9.0.3 (always the latest)

I'm attempting to extract portions of text from a Salesforce website and I'm not having any luck figuring out how to accomplish this task.

Here is a snippet of the code from the page that I'm trying to extract data from. "Member Banks need to journal" is the text I'm trying to extract. The website is internal to my work domain so a link is not going to be helpful.

Code: Select all
</div>
        </td>
        <td class="efhpCenterContent ">
        <div class="efhpCenterContentBody">
            <div class="efhpCenterTopRow">
                <span class="efhpIcon" style="background: transparent url(/img/icon/cases16.png) 0 0 no-repeat;"><img title="" alt="" src="/s.gif"/></span>
                <span class="efhpCenterLabel">Case Number</span>
                <span class="efhpCenterValue  "><span  title="00056071">00056071</span></span>
                <span class="efhpSeparator"> </span>
                <span class="efhpCenterLabel">Created Date</span>
                <span class="efhpCenterValue  ">9/20/2016 12:30 PM</span>
            </div>
            <div class="efhpTitle" title="Member Banks need to journal">Member Banks need to journal</div>


Here is some sample code that I've been playing with but can't seem to work it out.
I've tried inputting all manner of values for CLASS: [which is not present in the code] and for TYPE.

My hope is that once I see how it is done for this value I'll be able to build an iMacro that will extract all the values I need into a CSV file. I'll also want to extract efhpCenterLabel, efhpCenterValue, etc. Those are within a TD and SPAN so I'm not sure if they are handled differently. XPATH may be the way to go, but I'm not sure how to get the XPATH values.

Code: Select all
VERSION BUILD=9030808 RECORDER=FX
TAB T=1
'URL GOTO=https://na33.salesforce.com/5003900001gmeKcAAI?nooverride=1
TAG POS=1 TYPE=DIV ATTR=TXT: EXTRACT=TXT


Thanks in Advance
de Plane
[Boss] de Plane [has landed!]
deplane
 
Posts: 2
Joined: Wed Oct 26, 2016 8:31 am

Re: Extracting from Salesforce webpage

by chivracq on Wed Oct 26, 2016 10:18 am

No need to open several Duplicates when you want to open a Thread... And always use a Descriptive Thread Title, like the one you finally got a bit right in this 3rd Thread; all Users posting in this Sub-Forum have a "Problem with Extraction", ah-ah...!
(But hum, your Duplicate Posting was maybe related to the fact that you had already opened your original-original Thread last Friday (which got approved then). I think I remember your Pseudo, but "your Timing" was "unlucky": the Forum had to revert on that day at the end of the afternoon (EUR Time) to a Back-up from a few hours earlier, and a few Posts and New Users "disappeared" in the Process...)

>>>

Hum, for your Case, extracting your "Member Banks need to journal" can be done in several ways, I would think:
Code: Select all
TAG POS=1 TYPE=DIV ATTR=CLASS:"efhpTitle" EXTRACT=TXT

Code: Select all
TAG POS=1 TYPE=DIV ATTR=CLASS:"efhpTitle" EXTRACT=TITLE

Code: Select all
TAG POS=1 TYPE=DIV ATTR=TXT:*Case<SP>Number*
TAG POS=R1 TYPE=DIV ATTR=* EXTRACT=TXT

Code: Select all
TAG POS=1 TYPE=DIV ATTR=TXT:*Case<SP>Number*
TAG POS=R1 TYPE=DIV ATTR=* EXTRACT=TITLE

Code: Select all
TAG POS=1 TYPE=SPAN ATTR=TXT:*Case<SP>Number*
TAG POS=R1 TYPE=DIV ATTR=* EXTRACT=TXT

Code: Select all
TAG POS=1 TYPE=SPAN ATTR=TXT:*Case<SP>Number*
TAG POS=R1 TYPE=DIV ATTR=* EXTRACT=TITLE

=> Not tested obviously but I would expect all Code Suggestions to work..., I would need an HTML Saveas of the Page (zipped, Max 256Kb) uploaded to the Thread otherwise to have a "deeper" Look...

To extract the 3 other 'SPAN' Fields, you can reuse a Construction similar to the first one, using the Class Name directly, or similar to the last 2 ones (using "EXTRACT=TXT"), where I used Relative Positioning on the "Case Number" 'SPAN' Element. Relative Positioning wouldn't work from the 'DIV' Element because all 4 'SPAN' Elements are located within that same 'DIV' Element; you would need to use "Double Relative Positioning" to first get outside the 'DIV' for iMacros to be able to "see" inside again.
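
As a rough Sketch of that first Construction applied to the 'SPAN' Fields, plus the CSV Export asked about in the Question (untested, like the Suggestions above). Assumptions: the trailing "*" is only there to cope with the extra Spaces inside class="efhpCenterValue  ", the POS Values assume the posted Snippet holds the first such Elements on the Page, and the File Name "cases.csv" is a Placeholder.
Code: Select all
' Skip the Extraction Dialog while the Macro runs
SET !EXTRACT_TEST_POPUP NO
' Title, matched on its Class Name
TAG POS=1 TYPE=DIV ATTR=CLASS:efhpTitle EXTRACT=TXT
' 1st Label/Value Pair (Case Number); POS counts the Occurrences of the same Class
TAG POS=1 TYPE=SPAN ATTR=CLASS:efhpCenterLabel EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=CLASS:efhpCenterValue* EXTRACT=TXT
' 2nd Label/Value Pair (Created Date)
TAG POS=2 TYPE=SPAN ATTR=CLASS:efhpCenterLabel EXTRACT=TXT
TAG POS=2 TYPE=SPAN ATTR=CLASS:efhpCenterValue* EXTRACT=TXT
' Write all extracted Values as one Row to a CSV File in the default Downloads Folder
SAVEAS TYPE=EXTRACT FOLDER=* FILE=cases.csv

For the XPATH Route, an Expression can be written by hand from the posted HTML. A Sketch (also untested), assuming the 'DIV' carries exactly class="efhpTitle":
Code: Select all
TAG XPATH="//div[@class='efhpTitle']" EXTRACT=TXT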
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 
Posts: 6484
Joined: Sat Apr 13, 2013 6:07 am
Location: Amsterdam (NL)

Re: Extracting from Salesforce webpage

by deplane on Wed Oct 26, 2016 10:54 am

FCI: Win 7 x32 + FF 49.0.2 (always the latest) + iMacros for FF 9.0.3 (always the latest)

This is actually my first ever post. I've been using iMacros for about 6 months and usually I can find answers to my problems by Googling. I didn't mean to post dupes, I was trying to edit my subject to make it more descriptive, and one thing led to another and bada bing bada boom, I got dupes.

Anyway, thanks for the suggestions, I likely won't have time to try them today as I'm now busy with other tasks, but first thing tomorrow I'll give them a go and see if I can sort things out. A few of the suggestions look like things I've already tried but perhaps you've given me enough to extract [pun intended] an answer.

Good Day!
de Plane

Re: Extracting from Salesforce webpage

by chivracq on Wed Oct 26, 2016 2:43 pm

Ah OK for first Post, some other User then was unlucky to post on Friday for the first time and their Thread disappeared after the Forum had recovered from a Back-up and I thought it could have been you... And I guess they must have thought that their Thread had been deleted deliberately and didn't dare to post again, oops...!

OK for the Duplicates, you can always edit your previous Msg's and even delete a Post/Thread as long as nobody has posted a Reply after you...

Well, post your Results once you've had a chance to try all the Suggs I posted, and upload some HTML Saveas of your Page or find some Salesforce Demo on the internet, for me or other Advanced Users to be able to have a look if you can't work it out by yourself...