Extracting a table when two columns share same element name

by tennisdude on Thu Mar 30, 2017 9:52 am

Hi,

I'm trying to extract a table with 7 columns of data.

Two columns in the table share the same element name, "CLASS:player-name". I have to apply a regular expression to that element to remove additional data I do not need.

When I loop on the POS of the individual elements, the script writes the same player name to both columns in the CSV file.

I've tried to get the script to use relative extraction and have had no luck.

Could some of the iMacros wizards help me or point me in the right direction?

Cheers
TD


Code: Select all

VERSION BUILD=10022823
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=https://matchstat.com/tennis/all-upcoming-matches

TAG POS=3 TYPE=TD ATTR=CLASS:round EXTRACT=TXT
TAG POS=3 TYPE=TD ATTR=CLASS:event-name EXTRACT=TXT

TAG POS=5 TYPE=TD ATTR=CLASS:player-name EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.replace(/ \\(.+/gm, '');")

TAG POS=6 TYPE=TD ATTR=CLASS:player-name EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.replace(/ \\(.+/gm, '');")

TAG POS=3 TYPE=TD ATTR=CLASS:odds-td<SP>odds-0 EXTRACT=TXT
TAG POS=3 TYPE=TD ATTR=CLASS:odds-td<SP>odds-1 EXTRACT=TXT
TAG POS=3 TYPE=TD ATTR=CLASS:h2h EXTRACT=TXT

SAVEAS TYPE=EXTRACT FOLDER=* FILE=T1.csv


Re: Extracting a table when two columns share same element name

by chivracq on Thu Mar 30, 2017 6:05 pm


FCIM...! (Always mention your FCI when you open a Thread, read my Sig... That's mostly why I never reacted to either of your 2 previous Threads, even if one mentioned IE: many Commands are not implemented for all Browsers/Versions, or behave differently...)
=> FCI:
Code: Select all
iMacros v10.0.2, iMB/IE...?, if IE v...?, OS...?

(If you could mention that Info here, and for your 2 previous Threads as well, I would appreciate it. You could also post an Update there btw: either you still have the Problem, or you managed to solve it and you are expected to share your Solution. I won't follow up otherwise, and I won't ask again in some future Thread(s)...)
I normally don't even read Threads when the FCI is not clearly mentioned (preferably at the top of the Opening Post in a Thread)...

But OK, I had a look at your Case as you luckily provided Script + URL, and hum..., no Difficulty at all using Relative Positioning. I don't know what you tried, but there is no Glitch at all with R-Positioning for the 2nd Player based on the 1st Player, or even for the whole Row based on the first Element (the "SF"/"QF"/etc...), like for example:
Code: Select all
VERSION BUILD=8820413 RECORDER=FX
'SET !EXTRACT_TEST_POPUP NO
TAB T=1

'URL GOTO=https://matchstat.com/tennis/all-upcoming-matches

TAG POS=5 TYPE=TD ATTR=CLASS:round EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=CLASS:event-name EXTRACT=TXT

TAG POS=R1 TYPE=TD ATTR=CLASS:player-name EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.replace(/ \\(.+/gm, '');")
PROMPT _{{!EXTRACT}}_

TAG POS=R1 TYPE=TD ATTR=CLASS:player-name EXTRACT=TXT
SET !EXTRACT EVAL("'{{!EXTRACT}}'.replace(/ \\(.+/gm, '');")
PROMPT _{{!EXTRACT}}_

TAG POS=R1 TYPE=TD ATTR=CLASS:odds-td<SP>odds-0 EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=CLASS:odds-td<SP>odds-1 EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=CLASS:h2h EXTRACT=TXT
PROMPT _{{!EXTRACT}}_

PAUSE
SAVEAS TYPE=EXTRACT FOLDER=* FILE=T1.csv
(Tested on iMacros for FF v8.8.2, Pale Moon v26.3.3 (=FF47), Win10-x64.)

If you want to loop your Script, you only need to "play" with '!LOOP' for the first ('round') Element, whose Position seems to follow 'TAG POS=3/5/7/9/etc'... =>:
Code: Select all
SET !LOOP 1
' My_Loop = 2 * !LOOP + 1  =>  POS=3, 5, 7, 9, ...
SET My_Loop {{!LOOP}}
ADD My_Loop {{!LOOP}}
ADD My_Loop 1

TAG POS={{My_Loop}} TYPE=TD ATTR=CLASS:round EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=CLASS:event-name EXTRACT=TXT

'/etc... all other Extracts with 'R1' as well...
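
The SET/ADD lines above add the loop counter to itself and then add 1, so loop 1 targets POS=3, loop 2 targets POS=5, and so on. As an alternative sketch (not part of the tested script, and assuming the 'round' cells really do sit at every odd position), the same position could be computed in one step with EVAL:

```
SET !LOOP 1
' 2 * loop counter + 1  =>  3, 5, 7, 9, ...
SET My_Loop EVAL("{{!LOOP}}*2+1;")

TAG POS={{My_Loop}} TYPE=TD ATTR=CLASS:round EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=CLASS:event-name EXTRACT=TXT
```

Either form should give the same POS values; the EVAL version just keeps the arithmetic on one line.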


Hum..., and doing all your Data Manipulation directly on '!EXTRACT', and especially running the 'replace()' on the complete '!EXTRACT' just to clean up the 2 Names, is not Best Practice in my Opinion and could be "dangerous" and unreliable, as '!EXTRACT' also holds everything extracted before. It's better to use a Temp_Var for each Extract and to "reconstruct" the '!EXTRACT' before doing the 'SAVEAS'...
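
A minimal sketch of that Temp_Vars approach might look like the following. It was not run against the site, and the variable names (Round_Txt, Event_Txt, Player_1, Player_2) are hypothetical, chosen only for illustration. It also assumes that resetting '!EXTRACT' between TAG commands does not disturb Relative Positioning, and that 'POS=5' is a valid starting row as in the tested script above (swap in '{{My_Loop}}' when looping):

```
' Each value is extracted on its own, stored in a Temp_Var, then !EXTRACT is cleared.
SET !EXTRACT NULL
TAG POS=5 TYPE=TD ATTR=CLASS:round EXTRACT=TXT
SET Round_Txt {{!EXTRACT}}
SET !EXTRACT NULL

TAG POS=R1 TYPE=TD ATTR=CLASS:event-name EXTRACT=TXT
SET Event_Txt {{!EXTRACT}}
SET !EXTRACT NULL

' The RegEx now only ever sees one Name, not the whole Row.
TAG POS=R1 TYPE=TD ATTR=CLASS:player-name EXTRACT=TXT
SET Player_1 EVAL("'{{!EXTRACT}}'.replace(/ \\(.+/gm, '');")
SET !EXTRACT NULL

TAG POS=R1 TYPE=TD ATTR=CLASS:player-name EXTRACT=TXT
SET Player_2 EVAL("'{{!EXTRACT}}'.replace(/ \\(.+/gm, '');")
SET !EXTRACT NULL

' ... odds-0, odds-1 and h2h would follow the same pattern ...

' Rebuild the Row in column order, then save it.
ADD !EXTRACT {{Round_Txt}}
ADD !EXTRACT {{Event_Txt}}
ADD !EXTRACT {{Player_1}}
ADD !EXTRACT {{Player_2}}
SAVEAS TYPE=EXTRACT FOLDER=* FILE=T1.csv
```

Because the 'replace()' is applied to a single cell each time, a " (" appearing in another column can no longer truncate the row. One caveat that this sketch does not handle: wrapping the value in single quotes inside EVAL may fail for Names that contain an apostrophe.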
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...