Loop through extraction using field name versus POS?

by azbob on Mon Sep 05, 2016 7:03 pm

Configuration: Surface Pro 2, Win 10, Firefox 47.0.1, iMacros Standard Edition (x86) Version 11.0.246.4051

Hi,
I am looking for a way to reduce the number of statements when extracting data. Ideally I could get it down to 2 statements for the lot. Since this is a small section of a larger macro, I would prefer not to go to .js but to handle it in .iim files. I only want to extract the data from the fields that match certain field names. I can't figure out how to loop through POS, since the extract selection is based on ATTR="field name", not on position on the web page.

Here is what I have now:
VERSION BUILD=10.3.27.5830
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP YES
SET !TIMEOUT_STEP 0
'Subject GLA SF
TAG POS=1 TYPE=LABEL ATTR=TXT:Living<SP>Area
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'Subject Year built
TAG POS=1 TYPE=LABEL ATTR=TXT:Year<SP>Built
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'Subject Stories
TAG POS=1 TYPE=LABEL ATTR=TXT:Stories
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'Begin Collect Subdivision data
'year range
TAG POS=1 TYPE=LABEL ATTR=TXT:Year<SP>Built<SP>Range
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'# with pool
TAG POS=1 TYPE=LABEL ATTR=TXT:With<SP>Pool
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'# single story
TAG POS=1 TYPE=LABEL ATTR=TXT:Single<SP>Story
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'# multi story
TAG POS=1 TYPE=LABEL ATTR=TXT:Multiple<SP>Story
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'Avg GLA SF
TAG POS=1 TYPE=LABEL ATTR=TXT:Sqft
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'Avg Lot SF
TAG POS=1 TYPE=LABEL ATTR=TXT:Lot<SP>Sqft
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
'End Subdivision
'Deed History - Last Sale date
TAG POS=1 TYPE=TH ATTR=TXT:Sale<SP>Date
TAG POS=R1 TYPE=TD ATTR=CLASS:collapsible-caption EXTRACT=TXT
'Deed History - Last Sale price
TAG POS=1 TYPE=TH ATTR=TXT:Sale<SP>Price
TAG POS=R1 TYPE=TD ATTR=CLASS:cell-numeric EXTRACT=TXT
' Most recent Land value
TAG POS=1 TYPE=DIV ATTR=TXT:2017<SP>Prelim
TAG POS=R2 TYPE=SPAN ATTR=CLASS:htable-data EXTRACT=TXT
' Most recent taxes
TAG POS=1 TYPE=DIV ATTR=TXT:2015<SP>Final
TAG POS=R8 TYPE=SPAN ATTR=CLASS:htable-data EXTRACT=TXT
' Subject subdivision
TAG POS=1 TYPE=LABEL ATTR=TXT:Subdivision
TAG POS=R1 TYPE=DIV ATTR=CLASS:* EXTRACT=TXT
' Subject Legal Description
TAG POS=1 TYPE=SPAN ATTR=TXT:Description
TAG POS=R1 TYPE=LABEL ATTR=CLASS:monsoon-fielddata EXTRACT=TXT
' Subject Zoning
TAG POS=1 TYPE=DIV ATTR=TXT:City<SP>Zone
TAG POS=R1 TYPE=LABEL ATTR=CLASS:field-label EXTRACT=TXT
'Does subject have pool
TAG POS=1 TYPE=LABEL ATTR=TXT:Pool
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT

Thanks
azbob
 
Posts: 76
Joined: Mon Sep 21, 2009 11:16 am

Re: Loop through extraction using field name versus POS?

by chivracq on Mon Sep 05, 2016 7:40 pm

I guess you can loop them on:
TAG POS={{!LOOP}} TYPE=LABEL ATTR=TXT:*
TAG POS=R1 TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT
or even more simply, as you use Relative Positioning, directly on:
TAG POS={{!LOOP}} TYPE=DIV ATTR=CLASS:field-value EXTRACT=TXT


Oh...! But hum, this will only work for the first 9 Fields, which share the same Class Attribute. I see that the Class changes after that, and the Types differ as well (DIV + TD + SPAN + LABEL), so pfff, it won't be easy...! But if your Script works, why do you want to modify it...!?
If you ever notice that one Field is missing, or you want to exclude some specific Field, finding out which one it is will be some kind of hassle: you would have to work out whether it's POS=12 or =13 or =15, etc...
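
One way to keep the named anchors and still cut down on hand-typed repetition is to generate the TAG pairs outside iMacros and paste the result into the .iim file. The following is only a minimal Python sketch of that idea, not an iMacros feature: `FIELDS` and `build_macro` are hypothetical names, and the table holds just a few of the fields from the macro above.

```python
# Hypothetical helper (not part of iMacros): builds the anchor + extract
# TAG pairs from a small table, so each field is described only once.
FIELDS = [
    # (comment, anchor type, anchor text, relative pos, target type, target class)
    ("Subject GLA SF", "LABEL", "Living Area", 1, "DIV", "field-value"),
    ("Subject Year built", "LABEL", "Year Built", 1, "DIV", "field-value"),
    ("Last Sale date", "TH", "Sale Date", 1, "TD", "collapsible-caption"),
    ("Most recent taxes", "DIV", "2015 Final", 8, "SPAN", "htable-data"),
]


def build_macro(fields):
    """Return the macro text: one comment line and two TAG lines per field."""
    lines = []
    for comment, a_type, a_text, rel, t_type, t_class in fields:
        text = a_text.replace(" ", "<SP>")  # spaces are written as <SP> in the macro
        lines.append(f"'{comment}")
        lines.append(f"TAG POS=1 TYPE={a_type} ATTR=TXT:{text}")
        lines.append(f"TAG POS=R{rel} TYPE={t_type} ATTR=CLASS:{t_class} EXTRACT=TXT")
    return "\n".join(lines)


macro_text = build_macro(FIELDS)
print(macro_text)
```

The generated macro still anchors every extract on its label text, so it stays as readable as the hand-written version; only the typing is automated.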
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 
Posts: 6490
Joined: Sat Apr 13, 2013 6:07 am
Location: Amsterdam (NL)

Re: Loop through extraction using field name versus POS?

by azbob on Tue Sep 06, 2016 4:32 pm

I only wanted to see if there was a way to handle what seemed, on the surface, like repetitive steps.

Re: Loop through extraction using field name versus POS?

by chivracq on Tue Sep 06, 2016 6:49 pm

azbob wrote:Only wanted to see if there was a way to handle what seemed on surface as repetitive steps.

"Repetitive Steps", ah-ah...! That's exactly what iMacros is for...! You record once which Fields you want to extract and you never have to do it again..., well, unless one day you want to add or remove Fields.
And Web-Pages can change. If they added some "Elevator" Field for example, your current Script would still work, whereas if you were looping, that extra Field would shift all the other Extracts and you would need a lot of Debugging to get it to work again..., until the next Change. Your current Script, on the other hand, is very easy to follow, maintain and debug, by yourself or by anybody you give the Script to for a Modification, thanks to the Labels and all the nice Comments you included in it...
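
The shifting problem can be illustrated with a toy model. This is a rough Python sketch, not iMacros code: the page is reduced to an ordered list of (label, value) pairs, all values are made up, and `extract_by_pos` / `extract_by_label` are hypothetical stand-ins for looping on POS and for anchoring on a label.

```python
# Toy model of a page: ordered (label, value) pairs. All values are made up.
page_before = [("Living Area", "1,850"), ("Year Built", "1998"), ("Stories", "1")]
page_after = [("Living Area", "1,850"), ("Elevator", "No"),
              ("Year Built", "1998"), ("Stories", "1")]


def extract_by_pos(page, pos):
    """Stand-in for looping on POS: take the pos-th value (1-based), whatever it is."""
    return page[pos - 1][1]


def extract_by_label(page, label):
    """Stand-in for anchoring on a label and reading the value next to it."""
    for name, value in page:
        if name == label:
            return value
    return None


# Position-based: the 2nd field was "Year Built", but a new field shifts it.
print(extract_by_pos(page_before, 2))   # 1998
print(extract_by_pos(page_after, 2))    # No

# Label-based: unaffected by the extra field.
print(extract_by_label(page_before, "Year Built"))  # 1998
print(extract_by_label(page_after, "Year Built"))   # 1998
```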

You didn't post the URL of the Site, but I could still understand and follow what your Script was doing/meant to do... If it had been some kind of
TAG POS={{!LOOP}} TYPE=* ATTR=TXT:* EXTRACT=TXT
=> I wouldn't even have reacted to your Thread, as I wouldn't have had any Idea what your Script was doing exactly without Access to the Site...

What's the saying btw...? "If it ain't broke, don't fix it...!", or something like that...! 8)

