anggun123 wrote: ↑Wed Aug 04, 2021 1:22 pm
Code: Select all
TAG POS=1 TYPE=TABLE ATTR=* EXTRACT=TXT
Not working
Yeah, well, "Not working" is a bit vague if you don't explain what "happens", what Result you get and what exactly is "not working", ah-ah...!
But OK, like I had "suspected", I can see from your (truncated) HTML Source for the 1st 'TR' Element that the 'TD' Elements indeed contain some inner 'DIV' and 'SPAN' Elements. The Data extracted at the 'TABLE' Level will then be extremely "bloated" and will also contain some Data that you don't want to retain, like the "Populer" + "Tambah" Strings, and "cleaning" that whole Data to keep only what you want would be a bit of a "hassle", ah-ah...!
>>>
anggun123 wrote: ↑Wed Aug 04, 2021 1:22 pm
this is html code
Code: Select all
<tr class="shopee-table__row valign-top" style=""><td class="is-first"><div class="shopee-table__cell first-cell"><div data-v-a290c0d4="" class="keyword">
shoes
<span data-v-a290c0d4="" class="hot"><i data-v-a290c0d4="" class="fire shopee-icon"><svg viewBox="0 0 8 11" xmlns="http://www.w3.org/2000/svg"><path d="M6.373 2.2c.315.943.158 2.436-.942 3.3C6.059 3.3 3.938.393 3.23.079c0 0-.079 0-.079-.079C3.545 3.614.245 4.243.009 7.071-.148 9.271 1.738 11 3.938 11a3.89 3.89 0 0 0 3.928-3.929c0-1.807-.55-3.614-1.493-4.871z" fill-rule="nonzero"></path></svg></i>
Populer
</span></div></div></td><td class=""><div class="shopee-table__cell"><div data-v-a290c0d4="" class="quality-wrap"><span data-v-a290c0d4="" class="quality-process" style="width: 60%;"></span></div></div></td><td class=""><div class="shopee-table__cell"><span data-v-a290c0d4="" class="left">25.217</span></div></td><td class=""><div class="shopee-table__cell"><span data-v-a290c0d4="" class="left">
Rp3.029
</span></div></td><td class="is-last"><div class="shopee-table__cell last-cell"><button data-v-a290c0d4="" type="button" class="shopee-button shopee-button--link shopee-button--small"><span>
Tambah
</span>
i want to take the value
columns 1 = shoes
columns 2 = 25,217
columns 3 = Rp3,029
any idea ?
Then OK, if extracting the Data at the 'TABLE' Level is "not going to work", then hum..., back to Method_2 that I mentioned, => you'll have to extract Cell by Cell at the 'TD' Level.
And like "always" when extracting Data from a Table, => 'Relative Positioning' will be your "Best Friend", ah-ah...!
And if/as it's already working, you can use your existing 'TAG XPATH="//tbody/tr[n]"' Statement as 'Anchor', but you'll need to use "Double" 'Relative Positioning' because the Data you want is located "inside" that 'TR' Element.
Pretty "straightforward" Technique that I've explained and demonstrated many-many times, you'll find many Examples on the Forum if you "search" a bit, but from the truncated HTML Source you've posted, that would give something like...:
Hum, wait, I first need to simplify the HTML Structure of the 'TR' Element:
Code: Select all
<tr class="shopee-table__row valign-top" style="">
<td class="is-first"><div class="shopee-table__cell first-cell">
shoes
Populer
</td>
<td class=""></td>
<td class="">25.217</td>
<td class="">Rp3.029</td>
<td class="is-last">
Tambah
[...?...]
</td>
[...?...]
</tr>
=> Then OK, the 'TR' Element seems to contain 5 'TD''s, and you want "shoes" from 'TD_1' + "25.217" from 'TD_3' + "Rp3.029" from 'TD_4'.
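To make the counting of the Cells a bit more concrete, here is a minimal JavaScript sketch (outside iMacros, only for illustration) of how relative Steps of +1, +2 and +1 would walk across those 5 'TD''s. The `cells` Array and the `walkRelative` Helper are hypothetical, and the Sketch assumes that each Match becomes the Anchor for the next Step:

```javascript
// Hypothetical stand-in for the text of the 5 'TD' cells of one 'TR'
// (values taken from the simplified HTML above).
const cells = ["shoes Populer", "", "25.217", "Rp3.029", "Tambah"];

// Walk the cells with relative steps; each match becomes the next anchor.
// 'start' is -1 so that a first step of 1 lands on TD_1 (index 0).
function walkRelative(cellTexts, steps, start = -1) {
  const picked = [];
  let anchor = start;
  for (const step of steps) {
    anchor += step;
    picked.push(cellTexts[anchor]);
  }
  return picked;
}

// Steps of 1, 2, 1 land on TD_1, TD_3 and TD_4.
console.log(walkRelative(cells, [1, 2, 1]));
```

TD_2 (the "quality" Bar) and TD_5 (the "Tambah" Button) are simply stepped over.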
Now the Implementation, in pure '.iim', I'll let you convert it to your on-the-fly Script in '.js'...:
Code: Select all
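'SET !EXTRACT_TEST_POPUP YES/NO
SET !LOOP 1
'1st Anchor: the current 'TR' (Row n):
TAG XPATH="//tbody/tr[{{!LOOP}}]" EXTRACT=TXT
'2nd Anchor: step back to an Element located before the 'TR' (a Guess, see below):
TAG POS=R-1 TYPE=SPAN ATTR=* EXTRACT=TXT
'Discard what the 2 Anchors extracted:
SET !EXTRACT NULL
'TD_1 ("shoes" + "Populer"), then TD_3 ("25.217"), then TD_4 ("Rp3.029"):
TAG POS=R1 TYPE=TD ATTR=* EXTRACT=TXT
TAG POS=R2 TYPE=TD ATTR=* EXTRACT=TXT
TAG POS=R1 TYPE=TD ATTR=* EXTRACT=TXT
PROMPT _{{!EXTRACT}}_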
... Well, the 1st "TAG POS=R1 TYPE=TD ATTR=* EXTRACT=TXT" will actually extract "shoes" + "Populer" together, while you only want "shoes". You would need to adjust that Line to extract at the corresponding 'DIV' or 'SPAN' Element instead, with something like:
Code: Select all
TAG POS=R1 TYPE=DIV ATTR=CLASS:keyword EXTRACT=TXT
But..., no Luck, this won't work either and will still contain the "Populer" String, because that "Populer" sits in a 'SPAN' Element that is itself inside the same 'DIV' Element, => then you would also need to extract that 'SPAN'/"Populer" separately and subtract it from the whole 'DIV'.
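A rough JavaScript sketch of that "subtract" Idea (the same Logic could go inside an 'EVAL()'). The Function Name `subtractLabel` is made up for the Example, and the two Inputs are assumed to be what the 'DIV' and the 'SPAN' Extractions would return as Text:

```javascript
// Remove the text of the inner 'SPAN' from the text of the whole 'DIV'.
// Both arguments are assumed to be plain-text extractions (EXTRACT=TXT).
function subtractLabel(divText, spanText) {
  const text = divText.trim();
  const label = spanText.trim();
  // Only strip the label when it really is the trailing part of the text,
  // so a keyword that merely contains the same word is left alone.
  if (label !== "" && text.endsWith(label)) {
    return text.slice(0, text.length - label.length).trim();
  }
  return text;
}

console.log(subtractLabel("shoes Populer", "Populer")); // shoes
```

Rows without the "Populer" Badge would come back unchanged, as there is nothing to strip.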
Or probably easier would be to use 'EXTRACT=HTM' on the 'DIV' and to "isolate" the "shoes" String from that 'EXTRACT=HTM', using the same Method/Principle I will use in Method_3...
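As a hedged Illustration of that 'EXTRACT=HTM' Route, a small JavaScript sketch that keeps only the Text located between the opening 'DIV' Tag and the next Tag. The `keywordFromDivHtml` Name and the `sample` String are Assumptions for the Example, the Sample being a shortened Copy of the 'DIV' from the HTML Source posted above:

```javascript
// Keep the text that sits between the opening <div ...> tag and the next tag.
// Limitation: assumes no '>' character inside the attribute values of the <div>.
function keywordFromDivHtml(divHtml) {
  // Skip past the end of the opening tag.
  const afterOpenTag = divHtml.slice(divHtml.indexOf(">") + 1);
  // Keep what comes before the next tag (here the inner <span>).
  return afterOpenTag.split("<")[0].trim();
}

// Shortened copy of the 'DIV' from the posted source (SVG icon left out).
const sample =
  '<div data-v-a290c0d4="" class="keyword">\nshoes\n' +
  '<span data-v-a290c0d4="" class="hot">\nPopuler\n</span></div>';

console.log(keywordFromDivHtml(sample)); // shoes
```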
>
Oh yeah...!, and the "TAG POS=R-1 TYPE=SPAN ATTR=* EXTRACT=TXT" Line for the first Part of the Double 'Relative Positioning' Implementation is a bit of a "Guess". It is based on the "Assumption" that the Table will contain some 'TH' Element (which matters especially when extracting the 1st Row of Data), and that this 'TH' Element will contain (at least) one 'SPAN' Element that can be used as "2nd Anchor" for the Double 'R-POS' Technique...
I'm not sure if the "TYPE=SPAN" will be correct, as you didn't post enough Info about the HTML Structure of the whole Table. This "shopee" Site is probably Public, you would make everybody's Life easier if you simply posted the URL of that Page/Site...
If that Assumption is not correct, then you'll need to adapt that Line to find some other Element or Type, maybe "TYPE=DIV", and possibly adapt the 1st "POS=R1" to "find" the "shoes" String. You might even need to "get outside" of the Table, or to reverse the Direction of "POS=R-1" + "POS=R1" into "POS=R1" + "POS=R-1". It's a bit "complicated" to explain with Words, I would need to test myself...
But then, Method_3 would then be much easier in that Case, I would think...
>>>
Hum, and I mentioned "Method_3", ah-ah...!
Or it's actually "Method_4", because I had already mentioned a "Method_3" in my previous Post, but both are a bit based on the same Principle actually...
Alright..., hum, that's a long Post already, ah-ah...! Still following...!?
But hum, I would think that's probably the easiest and most straightforward Method, I guess... And this one I can test myself, well..., a bit "approx" and still in pure '.iim', I don't do/use any '.js' Script myself...
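Alright, your existing 'TAG XPATH="//tbody/tr[n]"' Statement actually already returns all the Data that you want, then it's "simply" a matter of isolating and separating the Data that you want to keep into 3 Cols, ah-ah...! (The Scripts further below split on the HTML Tags, so the 'TR' needs to be extracted with "EXTRACT=HTM" rather than "EXTRACT=TXT".)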
And yep, that Method works directly, here is the Script I used myself, with your 'TR' HTML Source in my Clipboard...:
Code: Select all
VERSION BUILD=8820413 RECORDER=FX
TAB T=1
'Debug:
SET !EXTRACT {{!CLIPBOARD}}
SET Col_1 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('keyword\">'); y=x[1].split('<'); z=y[0].trim(); z;")
SET Col_2 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('class=\"left\">'); y=x[1].split('<'); z=y[0].trim(); z;")
SET Col_3 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('class=\"left\">'); y=x[2].split('<'); z=y[0].trim(); z;")
SET !EXTRACT {{Col_1}}[EXTRACT]{{Col_2}}[EXTRACT]{{Col_3}}
PROMPT {{!CLIPBOARD}}<BR><BR>Col_1:<SP>_{{Col_1}}_<BR>Col_2:<SP>_{{Col_2}}_<BR>Col_3:<SP>_{{Col_3}}_<BR><BR>SAVEAS:<BR>_{{!EXTRACT}}_
... Which will display in the 'PROMPT':
Code: Select all
[<tr> HTML Source from Clipboard]
Col_1: _shoes_
Col_2: _25.217_
Col_3: _Rp3.029_
SAVEAS:
_shoes[EXTRACT]25.217[EXTRACT]Rp3.029_
(Tested in iMacros for FF v8.8.2, PM v26.3.3, Win10_Pro_x64.)
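For anyone wanting to try the splitting Logic outside iMacros first, here is a minimal JavaScript sketch of what the three 'EVAL()' Lines do. The Names `textAfterMarker` and `splitRow` are hypothetical, `rowHtml` is a shortened Copy of the posted 'TR' Source, and the Guard for a missing Marker is an Addition that the 'EVAL()' Lines don't have:

```javascript
// Text between the n-th occurrence of 'marker' and the next '<'.
function textAfterMarker(html, marker, n) {
  const parts = html.split(marker);
  if (parts.length <= n) return ""; // marker not found often enough
  return parts[n].split("<")[0].trim();
}

// Same three lookups as Col_1, Col_2 and Col_3 in the macro.
function splitRow(trHtml) {
  return [
    textAfterMarker(trHtml, 'keyword">', 1),
    textAfterMarker(trHtml, 'class="left">', 1),
    textAfterMarker(trHtml, 'class="left">', 2),
  ];
}

// Shortened copy of the 'TR' source posted above (SVG and styles left out).
const rowHtml =
  '<tr class="shopee-table__row valign-top"><td class="is-first">' +
  '<div class="shopee-table__cell first-cell"><div class="keyword">\nshoes\n' +
  '<span class="hot">\nPopuler\n</span></div></div></td>' +
  '<td class=""><div class="shopee-table__cell"><span class="left">25.217</span></div></td>' +
  '<td class=""><div class="shopee-table__cell"><span class="left">\nRp3.029\n</span></div></td></tr>';

console.log(splitRow(rowHtml).join("[EXTRACT]"));
```

The Approach depends on the Class Names "keyword" and "left" staying the same on the Site; if they change, the Markers need to be adapted.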
Well, your HTML Code contains Dots for the Values in 'Col_2' and 'Col_3', if you "really" want/prefer Commas, well..., I'll let you adapt the Script... Not difficult...
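One possible way to do that Conversion, as a small JavaScript sketch. The `dotsToCommas` Name is only for the Example, and it is written with 'split'/'join' so that no Backslash is needed if the Expression gets copied into an 'EVAL()':

```javascript
// Replace every dot with a comma, e.g. for the Col_2 and Col_3 values.
function dotsToCommas(value) {
  return value.split(".").join(",");
}

console.log(dotsToCommas("25.217"));  // 25,217
console.log(dotsToCommas("Rp3.029")); // Rp3,029
```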
And if you want to test yourself that the Script works correctly indeed, I guess you can use this one:
Code: Select all
VERSION BUILD=8820413 RECORDER=FX
SET !EXTRACT_TEST_POPUP NO
TAB T=1
SET !LOOP 1
TAG XPATH="//tbody/tr[{{!LOOP}}]" EXTRACT=HTM
SET TR_Extracted {{!EXTRACT}}
'Debug:
'SET !EXTRACT {{!CLIPBOARD}}
SET Col_1 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('keyword\">'); y=x[1].split('<'); z=y[0].trim(); z;")
SET Col_2 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('class=\"left\">'); y=x[1].split('<'); z=y[0].trim(); z;")
SET Col_3 EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('class=\"left\">'); y=x[2].split('<'); z=y[0].trim(); z;")
SET !EXTRACT {{Col_1}}[EXTRACT]{{Col_2}}[EXTRACT]{{Col_3}}
'PROMPT {{!CLIPBOARD}}<BR><BR>Col_1:<SP>_{{Col_1}}_<BR>Col_2:<SP>_{{Col_2}}_<BR>Col_3:<SP>_{{Col_3}}_<BR><BR>SAVEAS:<BR>_{{!EXTRACT}}_
PROMPT {{TR_Extracted}}<BR><BR>Col_1:<SP>_{{Col_1}}_<BR>Col_2:<SP>_{{Col_2}}_<BR>Col_3:<SP>_{{Col_3}}_<BR><BR>SAVEAS:<BR>_{{!EXTRACT}}_
SAVEAS TYPE=EXTRACT FOLDER=* FILE=try.csv
(Not tested of course...)
And hum, be "careful" when saving some Data to the "C:\" Root on Win_x32/_x64, it is "not supposed" to work, for "Security" Reasons, ah-ah...!