Web Scraping Table With SPAN Class

by swolfe2 on Tue Apr 25, 2017 5:19 am

Hello iMacros Community,

I'm brand new to iMacros, but I have spent a lot of time watching YouTube videos and browsing this forum. However, I haven't been able to figure out how to do this quite yet.

My company uses an eCRM system that loads via Internet Explorer 11, and I am able to get iMacros to go through the whole login process and get to the point where I'm ready to extract records. I can even get it to extract the first row of records. However, I cannot get it to move down to the next row and extract it. Each of my attempts either scrapes the first row multiple times or ends in an #EANF# error after the first row.

My code is below:
Code:
VERSION BUILD=11.5.498.2403
TAB T=1
TAB CLOSEALLOTHERS
FRAME NAME=WorkAreaFrame1

'Table fields to extract in order. [objectid] is the unique index for each record
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].objectid EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].createdon EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].sender EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].description EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].status EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].employee_concat EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].group_concat EXTRACT=TXT
'TAG POS={{!LOOP}} TYPE=A ATTR=ID:C17_W66_V67_V76_items_table[1].ITEMTYPE EXTRACT=TXT
'TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].priority EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=T:\TELESERV\Continuous<sp>Improvement<sp>Db<sp>Files\TCS-KNX<sp>Daily<sp>Digest FILE=Extract_{{!NOW:ddmmyy_hhnnss}}.csv

'This is the row selector
'TAG POS=1 TYPE=A ATTR=ID:C17_W66_V67_V76_ItemTree_sel_1-rowsel EXTRACT=TXT

'Forward button for next page of results. Need to push if exists, and grab next list.
'TAG POS=4 TYPE=SPAN ATTR=CLASS:th-clr-span EXTRACT=TXT

I have commented out some lines until I can figure out how to scrape a single list of results. The hard part is that this list can contain up to 200 results. If there are more than 200, the macro would need to press the "Next" button (the last TAG line in the code above), then extract those results as well, repeating until the "Next" button can no longer be pressed.

The row number is the index in square brackets near the end of each ID. So,
Code:
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[1].objectid EXTRACT=TXT
would become the following for the next row, and so on.
Code:
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[2].objectid EXTRACT=TXT

Below is a picture of how our eCRM GUI is set up.
[Image: eCRM Table Setup (2017-04-25 08_10_15-iMacros Browser V11.5.498.2403.png)]

Although I'm not grabbing every field, the names from the headers should give you an idea of the text that exists within them.

Any help/direction you all could provide would be greatly appreciated.

Thanks!
swolfe2
 

Re: Web Scraping Table With SPAN Class

by chivracq on Tue Apr 25, 2017 8:12 am

=> FCI: iMacros for IE v11.5, IE11, OS...?

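Hum, difficult to be sure without being able to "play" with the Page myself, but if it is the 'table[i]' Index that gets incremented for each Row, then each Field you want to extract stays at 'POS=1' within its Row. That would explain your '#EANF#': with 'table[1]' hardcoded and 'POS={{!LOOP}}', the 2nd Loop looks for a 2nd Element with the same ID, which doesn't exist. If that's correct, you have 2 Options: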

Option 1: Loop on 'POS' and use Wildcard for 'table[i]':
Code:
FRAME NAME=WorkAreaFrame1

'Table fields to extract in order. [objectid] is the unique index for each record
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].objectid EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].createdon EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].sender EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].description EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].status EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].employee_concat EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].group_concat EXTRACT=TXT
'TAG POS={{!LOOP}} TYPE=A ATTR=ID:C17_W66_V67_V76_items_table[*].ITEMTYPE EXTRACT=TXT
'TAG POS={{!LOOP}} TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[*].priority EXTRACT=TXT
PROMPT {{!EXTRACT}}


Option 2: Loop on 'table[i]':
Code:
FRAME NAME=WorkAreaFrame1

'Table fields to extract in order. [objectid] is the unique index for each record
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].objectid EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].createdon EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].sender EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].description EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].status EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].employee_concat EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].group_concat EXTRACT=TXT
'TAG POS=1 TYPE=A ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].ITEMTYPE EXTRACT=TXT
'TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].priority EXTRACT=TXT
PROMPT {{!EXTRACT}}
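
On saving the Rows, a possible next step once Option 2 extracts correctly (a sketch only, not tested against this eCRM): 'SAVEAS TYPE=EXTRACT' writes the current content of '!EXTRACT' as one CSV line and then resets it, so calling it at the end of each Loop gives one line per Row. The date-only File Name below is an assumption on my side: with seconds in the Name, as in the original Macro, each Loop may end up writing to a different File. '!EXTRACT_TEST_POPUP' is set to NO so that no Extraction Dialog interrupts the Loop.

```
'Suppress the Extraction Dialog while looping
SET !EXTRACT_TEST_POPUP NO
FRAME NAME=WorkAreaFrame1

'One Row per Loop: {{!LOOP}} is the Row Index inside the ID, POS stays at 1
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].objectid EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].createdon EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].sender EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].description EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].status EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].employee_concat EXTRACT=TXT
TAG POS=1 TYPE=SPAN ATTR=ID:C17_W66_V67_V76_items_table[{{!LOOP}}].group_concat EXTRACT=TXT

'Assumption: date-only File Name, so all Rows of a Run are appended to the same CSV
SAVEAS TYPE=EXTRACT FOLDER=T:\TELESERV\Continuous<sp>Improvement<sp>Db<sp>Files\TCS-KNX<sp>Daily<sp>Digest FILE=Extract_{{!NOW:ddmmyy}}.csv
```

Run it with the Loop button, with 'Max' set to the number of Rows on the Page (up to 200 here). A Row that doesn't exist would come back as '#EANF#' in the File.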
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 

Re: Web Scraping Table With SPAN Class

by chivracq on Sat Apr 29, 2017 7:53 pm

Hum, a bit frustrating... 4-5 days later and still no Follow-up from @OP on this Thread (especially for a first Thread on the Forum!). I won't be trying to help you next time. You need to follow up a bit quicker on your Thread(s); 5 days without any Follow-up is a bit too long on a Technical Forum, sorry...

