Extract Last Word

Tamilselvan
Posts: 111
Joined: Mon Mar 07, 2016 10:49 am

Extract Last Word

Post by Tamilselvan » Tue Sep 14, 2021 5:50 pm

Firefox 52.9.0 (32-bit)
iMacros 8.9.7
Win-10 (64-bit)

Hi,
I am trying to open the link in EVENT mode, then extract the last word from two positions, combine them, and save the page as an HTML file in a specified folder. But the file gets saved with the name "undefined__undefined.html".

I also want to remove the "(" and ")" from the last word extracted at the 1st position.
I have no idea how to do this (I am gathering code from the Data Extraction and Web Screen Scraping forum as I need it). Please guide me.

Code: Select all

VERSION BUILD=8970419 RECORDER=FX
TAB T=1

'URL GOTO=https://mnregaweb2.nic.in/netnrega/FTO/fto_sign_detail.aspx?lflag=local&flg=W&page=b&state_name=%e0%ae%a4%e0%ae%ae%e0%ae%bf%e0%ae%b4%e0%af%8d%e0%ae%a8%e0%ae%be%e0%ae%9f%e0%af%81&state_code=29&district_name=%e0%ae%a4%e0%ae%bf%e0%ae%b0%e0%af%81%e0%ae%b5%e0%ae%a3%e0%af%8d%e0%ae%a3%e0%ae%be%e0%ae%ae%e0%ae%b2%e0%af%88&district_code=2906&block_name=Thellar&block_code=2906015&fin_year=2021-2022&typ=fst_sig&mode=b&source=&Digest=h6f9h6YyIpuMzlkCjaGneQ
SET !LOOP 2
EVENT TYPE=CLICK SELECTOR="#form1>DIV:nth-of-type(3)>TABLE:nth-of-type(4)>TBODY>TR:nth-of-type({{!LOOP}})>TD:nth-of-type(2)>A" BUTTON=0 MODIFIERS="ctrl"

TAB T=2
SET !EXTRACT_TEST_POPUP NO

SET !EXTRACT NULL
' Extract Panchayat Name
'TAG POS=1 TYPE=TD ATTR=TXT:TN-06-015-043-043/559-A<SP>(Sathapoondi) EXTRACT=TXT
TAG POS=1 TYPE=TD ATTR=TXT:TN-06-015-* EXTRACT=TXT
SET Pt_name {{!EXTRACT}}

SET Pt_name EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('('); y=s.substr(x);if(x<0){z='No Name';} else{z=y;}; z;")

PROMPT {{Pt_name}}

SET !EXTRACT NULL
' Extract FTO No.
'TAG POS=1 TYPE=B ATTR=TXT:Fto<SP>No.<SP>:<SP>TN2906015_010421FTO_1238 EXTRACT=TXT
TAG POS=1 TYPE=B ATTR=TXT:Fto<SP>No.<SP>:<SP>TN2906015_* EXTRACT=TXT

SET Fto_no EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('T'); y=s.substr(x); if(x<0){z='No FTO';} else{z=y;};z;")
PROMPT {{Fto_no}}

SAVEAS TYPE=HTM FOLDER=E:\Pt-Fto\ FILE={{!Pt_name}}{{!Fto_no}}.htm

Attachments: FTOs-min.jpg, Open FTO-min.jpg
Thanks & Regards,
S. Tamilselvan.
chivracq
Posts: 9929
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Extract Last Word

Post by chivracq » Tue Sep 14, 2021 9:47 pm

File saved as "undefined__undefined.html"
=> Yep, normal, ah-ah...!:

Code: Select all

SAVEAS TYPE=HTM FOLDER=E:\Pt-Fto\ FILE={{!Pt_name}}{{!Fto_no}}.htm
... You've added a leading "!" to both Vars in the 'FILE' Param, but your Vars are called 'Pt_name' and 'Fto_no' (without "!")... :idea:
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...

Re: Extract Last Word

Post by Tamilselvan » Wed Sep 15, 2021 11:38 am

Now I have changed it...

Code: Select all

SAVEAS TYPE=HTM FOLDER=E:\Pt-Fto\ FILE={{!Pt_name}}{{Fto_no}}.htm
But I get
" __undefined__TN2906015_010421FTO_1238"
How do I replace both "(" and ")" with blank?

Code: Select all

SET Pt_name EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('('); y=s.substr(x);if(x<0){z='No Name';} else{z=y;}; z;")

Re: Extract Last Word

Post by chivracq » Wed Sep 15, 2021 1:12 pm

Tamilselvan wrote:
Wed Sep 15, 2021 11:38 am
Now I have changed...

Code: Select all

SAVEAS TYPE=HTM FOLDER=E:\Pt-Fto\ FILE={{!Pt_name}}{{Fto_no}}.htm
But I get
" __undefined__TN2906015_010421FTO_1238"

Yeah, well, pay attention a bit, I said "both Vars", and "both" means "x2", ah-ah...! :idea:
But you've only corrected the Name of one Var, tja...! :roll:

>>>
Tamilselvan wrote:
Wed Sep 15, 2021 11:38 am
How to replace both "(" & ")" as blank....

Code: Select all

SET Pt_name EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('('); y=s.substr(x);if(x<0){z='No Name';} else{z=y;}; z;")

Waiting for you to post what you get as Result already in the "__undefined__" Part corresponding to that 'Pt_name' Var, as I don't see any "(" & ")" Chars in your current Output (" __undefined__TN2906015_010421FTO_1238")...

Also handy would be if you posted the original Content of the 'EXTRACT' (= Input), and what you expect as Result (= Output).

>

+ I'm not sure what you mean by "replace as blank", => "blank" = "empty" (=> remove/delete)...?, or "blank" = "Space"...?

Hum, I guess you mean "remove", as I think your current Expression was already cutting off everything before the "(" Char with 'indexOf()' + 'substr()' (the "(" itself is still kept), => then do another 'indexOf()' on the ")" and use 'substring()' on both Indexes instead of 'substr()'... :idea:
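
To make the difference between the two JS Methods concrete, here is a small standalone JavaScript sketch, meant to be run outside iMacros (for example in a browser console). The sample string is an assumption taken from the commented-out TAG line of the macro, and the variable names are only illustrative.

```javascript
// Assumed sample input, copied from the commented-out TAG line in the macro.
var sample = 'TN-06-015-043-043/559-A (Sathapoondi)';

var openIdx = sample.indexOf('(');   // position of "("
var closeIdx = sample.indexOf(')');  // position of ")"

// substr(start): everything from "start" to the end, "(" included.
var fromParen = sample.substr(openIdx);

// substr(start, length): the second argument is a LENGTH.
var byLength = sample.substr(openIdx, 5);

// substring(start, end): the second argument is an INDEX, and the
// character at "end" itself is not part of the result.
var byIndex = sample.substring(openIdx, closeIdx);

console.log(fromParen, byLength, byIndex);
```

With this sample, 'substr()' with only a start index keeps both parentheses, which matches the "(Sathapoondi)" seen in the saved file name.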

Re: Extract Last Word

Post by Tamilselvan » Thu Sep 16, 2021 11:05 am

My 1st extracted word is
[attachment: 1st Extr.jpg]
and the 2nd is
[attachment: 2nd Extract.jpg]
Now I have changed both:

Code: Select all

SAVEAS TYPE=HTM FOLDER=E:\Pt-Fto\ FILE={{Pt_name}}{{Fto_no}}.htm
The file is saved in the folder with the name below:
(Sathapoondi)TN2906015_010421FTO_1238
... I want to remove both '(' and ')'.

Code: Select all

VERSION BUILD=8970419 RECORDER=FX
TAB T=1

'URL GOTO=https://mnregaweb2.nic.in/netnrega/FTO/fto_sign_detail.aspx?lflag=local&flg=W&page=b&state_name=%e0%ae%a4%e0%ae%ae%e0%ae%bf%e0%ae%b4%e0%af%8d%e0%ae%a8%e0%ae%be%e0%ae%9f%e0%af%81&state_code=29&district_name=%e0%ae%a4%e0%ae%bf%e0%ae%b0%e0%af%81%e0%ae%b5%e0%ae%a3%e0%af%8d%e0%ae%a3%e0%ae%be%e0%ae%ae%e0%ae%b2%e0%af%88&district_code=2906&block_name=Thellar&block_code=2906015&fin_year=2021-2022&typ=fst_sig&mode=b&source=&Digest=h6f9h6YyIpuMzlkCjaGneQ
SET !LOOP 2
EVENT TYPE=CLICK SELECTOR="#form1>DIV:nth-of-type(3)>TABLE:nth-of-type(4)>TBODY>TR:nth-of-type({{!LOOP}})>TD:nth-of-type(2)>A" BUTTON=0 MODIFIERS="ctrl"

TAB T=2
SET !EXTRACT_TEST_POPUP NO

SET !EXTRACT NULL
' Extract Panchayat Name
'TAG POS=1 TYPE=TD ATTR=TXT:TN-06-015-043-043/559-A<SP>(Sathapoondi) EXTRACT=TXT
TAG POS=1 TYPE=TD ATTR=TXT:TN-06-015-* EXTRACT=TXT
SET Pt_name {{!EXTRACT}}

SET Pt_name EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('('); y=s.substring(x).trim()+ s.indexOf(')'); y=s.substring(x).trim();if(x<0){z='No Name';} else{z=y;}; z;")

PROMPT {{Pt_name}}

SET !EXTRACT NULL
' Extract FTO No.
'TAG POS=1 TYPE=B ATTR=TXT:Fto<SP>No.<SP>:<SP>TN2906015_010421FTO_1238 EXTRACT=TXT
TAG POS=1 TYPE=B ATTR=TXT:Fto<SP>No.<SP>:<SP>TN2906015_* EXTRACT=TXT

SET Fto_no EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('T'); y=s.substr(x); if(x<0){z='No FTO';} else{z=y;};z;")
PROMPT {{Fto_no}}

SAVEAS TYPE=HTM FOLDER=E:\Pt-Fto\ FILE={{Pt_name}}{{Fto_no}}.htm
Where do I have to change my code to find the last word?

Code: Select all

SET Pt_name EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('('); y=s.substring(x).trim()+ s.indexOf(')'); y=s.substring(x).trim();if(x<0){z='No Name';} else{z=y;}; z;")
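
As an alternative to the 'indexOf()' approach tried above (this is not something suggested in the thread), the "last word" can also be taken by splitting the text on whitespace and then stripping the parentheses with a regular expression. The following is only a sketch in plain JavaScript, outside iMacros; the function name `lastWord` and the two sample strings are assumptions based on the commented-out TAG lines.

```javascript
// Hypothetical helper: returns the last whitespace-separated word of "s",
// with any "(" and ")" characters removed. Returns "" for empty input.
function lastWord(s) {
  var words = s.trim().split(/\s+/);
  var last = words[words.length - 1];
  return last.replace(/[()]/g, '');
}

// Assumed sample inputs, taken from the commented-out TAG lines.
var ptName = lastWord('TN-06-015-043-043/559-A (Sathapoondi)');
var ftoNo = lastWord('Fto No. : TN2906015_010421FTO_1238');

console.log(ptName + ftoNo);
```

This only works when the name inside the parentheses is a single word, and inside an iMacros 'EVAL()' the backslash and quote characters may need extra escaping, so it is worth checking the result with 'PROMPT' first.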

Re: Extract Last Word

Post by chivracq » Thu Sep 16, 2021 11:54 am

Tamilselvan wrote:
Thu Sep 16, 2021 11:05 am
my 1st extracted word is
1st Extr.jpg

and 2nd is

2nd Extract.jpg

Now i have changed both..

Code: Select all

SAVEAS TYPE=HTM FOLDER=E:\Pt-Fto\ FILE={{Pt_name}}{{Fto_no}}.htm
File name saved in folder as below
(Sathapoondi)TN2906015_010421FTO_1238
.... I want to remove both '(' ,')' .

Alright, this is good/better already...! :D

>

Then hum, what you call "my 1st/2nd extracted word" is in both cases already the Result of the 'EVAL()' Statements, => the "Output", but the 'EXTRACT' / "extracted" Text is the "Input", and I'm still missing that Info... :(

=> A "better" way to use the 'PROMPT' for Debug Purpose, (even if I'm already very pleased to see that you are using 'PROMPT' to debug your 'EVAL()' Statements, and you also use "my" Syntax in 'EVAL()', very good...! :D ), is to display both the "original" 'EXTRACT' + the Result of your 'EVAL()', and each time mentioning the Name of your Var, + surrounding all Vars with some Delimiter (I usually use "_") to make sure that all (Soft) Tabs/Returns/Spaces are also visible, same also if the Var only contains an empty String...: :idea:

Code: Select all

PROMPT EXTRACT:<BR>_{{!EXTRACT}}_<BR><BR>Pt_name:<BR>_{{Pt_name}}_
... And you can do the same for the 'PROMPT' corresponding to your 'Fto_no' Var...

>

Oh yeah...!, and one "Detail", when displaying Vars in the 'PROMPT', I always remove the "!" Char from Built-in Vars in the Text Name of the Var (=> for '!EXTRACT' or '!VAR1'/'!VAR2'/etc), or iMacros "thinks" you want to use an "Input PROMPT" instead of a "Display PROMPT"... :!:

>>>
Tamilselvan wrote:
Thu Sep 16, 2021 11:05 am

Where do I have to change my code to find the last word?

Code: Select all

SET Pt_name EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('('); y=s.substring(x).trim()+ s.indexOf(')'); y=s.substring(x).trim();if(x<0){z='No Name';} else{z=y;}; z;")

=> Hum, OK...:

Code: Select all

x=s.indexOf('('); y=s.substring(x).trim()+ s.indexOf(')'); y=s.substring(x).trim();
Well, the "Purpose" of using 'substring()' is to use both Indexes in just one same 'substring()' Command, the Syntax of that JS Method is:

Code: Select all

[Input_String].substring([Start_Index],[End_Index])
And "Start_Index" will be "indexOf('(')+1" and "End_Index" will be "indexOf(')')-1", I would think, well, check for yourself if the "+/-1" are correct for both Indexes (keep in mind that 'substring()' already leaves out the Char at "End_Index"), and adjust accordingly if needed...

And that would give for the whole 'EVAL()' something like: :idea:

Code: Select all

SET Pt_name EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.indexOf('('); y=s.indexOf(')'); if(x<0){z='No Name';} else{z=s.substring(x+1,y-1);}; z;")
PROMPT EXTRACT:<BR>_{{!EXTRACT}}_<BR><BR>Pt_name:<BR>_{{Pt_name}}_
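
Because 'substring(start, end)' stops before the character at 'end', the "-1" on the end index is worth checking, as suggested above. The sketch below, in plain JavaScript outside iMacros, compares both variants. The function name `insideParens`, its `endOffset` parameter, and the sample string (taken from the commented-out TAG line) are assumptions for illustration only.

```javascript
// Hypothetical helper mirroring the EVAL() logic; "endOffset" is subtracted
// from the index of ")" so that both variants can be compared.
function insideParens(s, endOffset) {
  var x = s.indexOf('(');
  var y = s.indexOf(')');
  if (x < 0) { return 'No Name'; }
  return s.substring(x + 1, y - endOffset);
}

// Assumed sample input: ")" directly follows the name, no space in between.
var sample = 'TN-06-015-043-043/559-A (Sathapoondi)';
var withMinusOne = insideParens(sample, 1); // end index = indexOf(')') - 1
var withoutMinus = insideParens(sample, 0); // end index = indexOf(')')

console.log('_' + withMinusOne + '_', '_' + withoutMinus + '_');
```

With this sample, the "-1" variant loses the final letter of the name while the variant without "-1" returns it in full. If the real extract has a space before ")", the outcome is different, which is exactly what the delimiters in the 'PROMPT' make visible.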

Re: Extract Last Word

Post by Tamilselvan » Thu Sep 16, 2021 4:30 pm

The result is perfect.

It's working... :D I feel very happy... :D

Great explanation. Your knowledge is very useful to us!

You are a genius. Big salute.

Thanks a lot!