Trouble Extracting Data - no reference position with automatic extractor

Post by brg48 » Mon Sep 20, 2021 3:18 pm

I’m not a coder and I use iMacros because it helps a non-coder like me. I’m trying to extract data from a website, but I don’t see a reference position when I use the automatic extractor. I’m entering a parcel number into the “Map/Block/Parcel” field to get the total amount due. Specifically, I’m trying to extract the “Total Due” amount shown in red in the table. What am I missing? Any help would be greatly appreciated.

I’m using:
iMacros 2021.0
Progress Software Corporation
iMacros version 14.2.2.1
Released on 7/31/2021


Website: http://web.florenceco.org/cgi-bin/ta/tax-inq.cgi
Sample parcel numbers to use in the “Map/Block/Parcel” field:
80019 02 014
80018 06 009



My script:
VERSION BUILD=2021.0
TAB T=1
TAB CLOSEALLOTHERS
'SET !PLAYBACKDELAY 0.2
SET !TIMEOUT_STEP 1
SET !ERRORIGNORE YES
'use a data source
SET !DATASOURCE Florence21.csv
'start on the second record
SET !LOOP 2
'use the rest of the data in the data source?
SET !DATASOURCE_LINE {{!LOOP}}
URL GOTO=http://web.florenceco.org/cgi-bin/ta/tax-inq.cgi
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:map CONTENT={{!COL1}}
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:block CONTENT={{!COL2}}
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:parcel CONTENT={{!COL3}}
TAG POS=1 TYPE=INPUT:SUBMIT ATTR=*
TAG POS=1 TYPE=FONT ATTR=* EXTRACT=TXT
TAG POS=1 TYPE=FONT ATTR=* EXTRACT=TXT
TAG POS=1 TYPE=FONT ATTR=TXT:"Total Due" EXTRACT=TXT
TAG POS=1 TYPE=FONT ATTR=* EXTRACT=TXT
'save the extract to the following location
SAVEAS TYPE=EXTRACT FOLDER=C:\Users\trey_\OneDrive\Documents\iMacros\Downloads\SC\Florence FILE=Florence21Extract9-17.csv


This is what I get when I run the script:

Date: Sep 20, 2021 Date: Sep 20, 2021 Total Due Date: Sep 20, 2021
Post by brg48 » Mon Sep 20, 2021 3:20 pm

I forgot to add: I'm using Windows 10 Home, version 20H2, 64-bit.
Post by chivracq » Mon Sep 20, 2021 9:02 pm

Good Quality for your Post/Thread, that's a "Pleasure", Thanks...! :D

'Pro'/'Ent(erprise)'/'Trial' is also missing from your FCI (Full Config Info)... It will probably be 'Trial' I reckon, otherwise I guess you would have "Direct Access" to Tech Support and wouldn't be asking your Question on the Forum, ah-ah...! :wink:

>>>

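=> Alright, here is an "easy" Solution for your Scenario, using 'Relative Positioning', which will very often be "your Best Friend" when extracting some Data, especially from a 'TABLE' Element...: you first 'TAG' a fixed Element as Anchor (here the "Total Due" Label, without 'EXTRACT'), and a following 'TAG' with 'POS=R1' then picks the first matching Element after that Anchor... In your Script, 'POS=1 TYPE=FONT ATTR=*' always hits the first 'FONT' Element on the Page (the "Date: ..."), and 'ATTR=TXT:"Total Due"' extracts the Label itself, not the Amount... :idea: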

Code:

VERSION BUILD=8820413 RECORDER=FX
TAB T=1
'URL GOTO=http://web.florenceco.org/cgi-bin/ta/tax-inq.cgi?file=rpcpubf&step=2&name=&street=&map=80019&block=02&parcel=014&acct=&cat=&num=&rc1=&rc2=
'Anchor for the Relative Positioning: select the "Total Due" Label (no EXTRACT):
TAG POS=1 TYPE=FONT ATTR=TXT:Total<SP>Due

'Recorded:
'TAG POS=1 TYPE=FONT ATTR=TXT:584.69
'TAG POS=1 TYPE=TD ATTR=TXT:245.34<SP>272.60<SP>517.94<SP>584.69

'Attempts to extract the "Total Due" Amount:
'TAG POS=R1 TYPE=FONT ATTR=TXT:* EXTRACT=TXT
'TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
'TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=HTM
'>
'Works...!:
TAG POS=R1 TYPE=FONT ATTR=SIZE:"+1"&&TXT:* EXTRACT=TXT
=> For Parcel 80019 02 014 (the one in the commented-out URL), this will extract "584.69", the "Total Due" Amount that you want, I would think...! :P
(Tested with iMacros for FF v8.8.2, PM26, Win10_x64.)
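
Folded back into the looped macro from the first post, the two working lines could be used roughly as follows. This is only a sketch: it assumes the data source, the form field names and the output folder from the original script are unchanged, and it has not been run against the live site.

```
VERSION BUILD=2021.0
TAB T=1
TAB CLOSEALLOTHERS
SET !TIMEOUT_STEP 1
SET !ERRORIGNORE YES
'Read one parcel per loop from the data source, starting on the second record
SET !DATASOURCE Florence21.csv
SET !LOOP 2
SET !DATASOURCE_LINE {{!LOOP}}
URL GOTO=http://web.florenceco.org/cgi-bin/ta/tax-inq.cgi
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:map CONTENT={{!COL1}}
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:block CONTENT={{!COL2}}
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:parcel CONTENT={{!COL3}}
TAG POS=1 TYPE=INPUT:SUBMIT ATTR=*
'Anchor on the "Total Due" label (no EXTRACT on this line)
TAG POS=1 TYPE=FONT ATTR=TXT:Total<SP>Due
'Extract the first FONT element with SIZE "+1" after the anchor, i.e. the amount
TAG POS=R1 TYPE=FONT ATTR=SIZE:"+1"&&TXT:* EXTRACT=TXT
'Save the extract to the following location
SAVEAS TYPE=EXTRACT FOLDER=C:\Users\trey_\OneDrive\Documents\iMacros\Downloads\SC\Florence FILE=Florence21Extract9-17.csv
```

The four FONT extraction lines of the original script are replaced by the anchor line plus one relative extraction, so each loop should write a single value per parcel. With !ERRORIGNORE set to YES, a parcel whose result page has no "Total Due" label will most likely show up as #EANF# in the CSV rather than stopping the loop.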

I left my 3 first "Attempts" in the Script (commented out), they show how I tried to extract that Value/Element and to understand for myself the HTML Structure of that Page. You can reactivate them one by one if you want to see what each one returns... :idea:

From all 3, it would still have been possible to "isolate" the "final Data" that you want, but 'EVAL()' would then have been needed to clean up the extracted Text. The 'SIZE:"+1"' Attribute already "did the Trick" without requiring 'EVAL()', so I went for that one, ah-ah...! 8)
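
For comparison, the 'EVAL()' route mentioned above might look something like the sketch below. It is an assumption, not something tested in this thread: it supposes that the relative TD extraction returns the row of amounts on a single line (as the recorded TD line suggests, e.g. "245.34 272.60 517.94 584.69") with the Total Due as the last number, and it uses !VAR1 purely as a scratch variable.

```
'Anchor on the "Total Due" label, then extract the whole cell after it
TAG POS=1 TYPE=FONT ATTR=TXT:Total<SP>Due
TAG POS=R1 TYPE=TD ATTR=TXT:* EXTRACT=TXT
'Assumption: !EXTRACT now holds all amounts on one line, Total Due last
'Collect every run of digits, dots and commas and keep the last one
SET !VAR1 EVAL("var s=\"{{!EXTRACT}}\"; var m=s.match(/[0-9.,]+/g); m ? m[m.length-1] : \"\";")
'Replace the raw cell text with the isolated amount before saving
SET !EXTRACT NULL
SET !EXTRACT {{!VAR1}}
```

If the extracted text contains line breaks or double quotes, the string literal inside EVAL will break, which is one more reason the 'SIZE:"+1"' line is the simpler choice here.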
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
Post by brg48 » Wed Sep 22, 2021 2:33 pm

Thank you so much. This solves my issue and I appreciate your help!
Post by chivracq » Wed Sep 22, 2021 2:44 pm

Ah...!, alright, good..., I was indeed "starting" to wonder if you were going to follow up..., then OK, perfect..., and glad to hear that "it" works... :D

... And I "hope" you understand how I "tackled" your Case/Scenario..., don't hesitate to ask if you need some more Explanation... :wink: