Select file from folder which is not already in CSV file

Mark S
Posts: 1
Joined: Mon Feb 19, 2018 1:16 pm

Select file from folder which is not already in CSV file

Post by Mark S » Mon Feb 19, 2018 1:59 pm

Win 7, FF 48.0, iMacros 8.9.7

What the script should do:
1. Check a CSV holding a list of file names, then pick the first file name in the folder that is not yet in the CSV.
2. Run an action with that file name.
3. Save the file name to the CSV so it isn't picked again in the next loop.

The CSV records which files from the folder have already been processed.
A plain !LOOP isn't enough, because when I run the script again it picks the same files that were used in the previous run.
New files are also added to the folder every so often, which messes up the queue.
The only workaround would be to delete finished files from the folder before restarting the script, but that is not an option.

I don't know much JavaScript, and I think that's what is needed to get this done.
Please help:


var folder="file:///D:/Folder/";
var fileProgress="progress.csv";
var folderCSV="file:///C:/Users/Pc%20User/Documents/iMacros/Downloads/";

var file;
file =  "CODE:";
file +=  "TAB CLOSEALLOTHERS" + "\n"; 
file +=  "SET fileProgress EVAL(\"{{LOOP}} == 1 || {{LOOP}} != 1 ? '{{folderCSV}}/{{fileProgress}}' : 'javascript:undefined;';\")" + "\n";
file +=  "URL GOTO={{fileProgress}}" + "\n";
file +=  "SET extPro EVAL(\"{{LOOP}} == 1 ? 1 : 0;\")" + "\n";
file +=  "SET !ERRORIGNORE YES" + "\n";
file +=  "TAG POS={{extPro}} TYPE=* ATTR=* EXTRACT=TXT" + "\n";
file +=  "SET !ERRORIGNORE NO" + "\n";
file +=  "SET text EVAL(\"{{!LOOP}} == 1 ? '{{!EXTRACT}}' : '{{text}}';\")" + "\n";
file +=  "SET !EXTRACT NULL" + "\n";

file +=  "SET !EXTRACT_TEST_POPUP NO" + "\n"; 
file +=  "SET !REPLAYSPEED FAST " + "\n"; 
file +=  "SET !ERRORIGNORE YES" + "\n"; 
file +=  "SET !DATSOURCE_LINE {{LOOP}}" + "\n"; 
file +=  "SET urlFolder EVAL(\"{{LOOP}} == 1 || {{LOOP}} != 1 ? '{{folder}}' : 'javascript:undefined;';\")" + "\n"; 
file +=  "URL GOTO={{urlFolder}}" + "\n"; 
file +=  "TAG POS=1 TYPE=TBODY ATTR=* EXTRACT=TXT" + "\n"; 

//Here I need to extract the first file name (without Size or Last Modified) from the extracted list of file names that is not already in the CSV file (the text extracted into {{text}}).

//Some of my EVAL attempts, which don't work; I don't know enough JS to fix them on my own:
// This should compare the text extracted from the CSV file with the list of file names, but it doesn't work as expected.
//file +=  "SET curFile EVAL(\"var reg = new RegExp('/', 'g'); '{{text}}'.indexOf('{{!EXTRACT}}') > -1 ? '' : '{{!EXTRACT}}'.replace(reg, '\\\\');\")" + "\n"; 

// This extracts a file name from the folder listing (without Size or Last Modified), but it picks the file by loop number, so if I run the script again it picks the same files from the beginning, as in the previous run.
//file +=  "SET curFile EVAL(\"var f = []; var a = '{{!EXTRACT}}'.match(/.+/g); for (i in a) {if (i % 4 == 0) f.push(a[i]);} f[{{LOOP}} - 1];\")" + "\n";

file +=  "SET !EXTRACT NULL" + "\n"; 
file +=  "SET !EXTRACT {{curFile}}" + "\n"; 


//Action with extracted Current file name will be here


var rite;
rite =  "CODE:";
rite +=  "SET !EXTRACT {{filename}}" + "\n";
rite +=  "SAVEAS TYPE=EXTRACT FOLDER=* FILE=progress.csv" + "\n";
rite +=  "SET !EXTRACT NULL" + "\n";

var filename;

for(var i=1;i<=200;i++)
	{
	iimSet("LOOP",i);
	iimSet("fileProgress",fileProgress);
	iimSet("folderCSV",folderCSV);
	iimSet("folder",folder);
	iimPlay(file);
	filename=iimGetLastExtract(1);
	iimSet("filename",filename);
	iimPlay(rite);
	
	//Action with extracted Current file name will be here
	
	}
chivracq
Posts: 8781
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Select file from folder which is not already in CSV file

Post by chivracq » Tue Feb 20, 2018 2:46 am

Yeah...!, that's an Interesting Case/Scenario, ah-ah...! :D

Hum, you are doing well already for somebody who "doesn't know much about JavaScript", ah-ah...!, you are already much more Advanced than me, oops...! :wink:
I won't try to "fix" your 'EVAL()' Statements, as I don't like/do/use REGEX, and I would use a much simpler Approach, in pure '.iim', that you can loop with the standard '!LOOP' and can of course easily convert to a '.js' Script if you want...

Part_1:
The Principle would be based on the Technique I've demonstrated in the following Thread:
- Re: Get number of lines from CSV and use as variable?
... with a Script in the Post corresponding to my direct Link. You can then reuse the last Row_Nb to directly access the last Row in your '.CSV' to get the Name of the last File that got processed (for '!DATASOURCE_LINE'). But as you would already have your 'Progress_Tracker' File open, you actually don't really need the iMacros '!DATASOURCE' Mechanism, you could retrieve the Name of that File directly.

To use "my" Technique, you simply need to enclose your Col in the 'SAVEAS' between 2 extra Cols containing some unique Char (Combination), like "###" and "##" for example that I used in the other Thread. (And a simple 'EVAL()' with 'lastIndexOf()' + 'split()' will get you the name of your File.)
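
For readers who want to see the 'lastIndexOf()' + 'split()' Idea spelled out, here is a minimal sketch in plain JavaScript. It is not the Script from the linked Thread. It assumes each saved Row looks like `###,<file name>,##` (optionally with every Col wrapped in double quotes) and that File Names contain no commas. The function name `lastProcessedFile` is hypothetical.

```javascript
// Hypothetical helper: returns the last file name recorded in the tracker log.
// Assumption: every saved row has three columns: ###, the file name, ##.
// Assumption: file names contain no commas and no "###" sequence.
function lastProcessedFile(logText) {
  // The 3-char marker only occurs at the start of a row, so the last
  // occurrence marks the beginning of the last saved row.
  var start = logText.lastIndexOf("###");
  if (start === -1) return ""; // empty log: nothing processed yet

  var cols = logText.slice(start).split(",");
  if (cols.length < 2) return "";

  // cols[1] is the file name; drop surrounding whitespace and quotes, if any.
  return cols[1].trim().replace(/^"|"$/g, "");
}
```

An empty Result means "first Run", so the caller can fall back to the first Row of the Folder Listing.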

One Condition for my Method to work is to NOT use the '.CSV' File Extension for the 'SAVEAS', ah-ah...!
=> '.TXT' or '.LOG' for example are OK...

Part_2:
Displaying the Content of your Folder with "URL GOTO=file:///D:/My_Folder/" is a perfect Idea. :D
The Dir Listing is then nicely displayed in an HTML Table by the Browser that you can extract Row by Row ('TYPE=TR') or Cell by Cell ('TYPE=TD').
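
If you prefer to stay close to the original '.js' Attempt (extract the whole Listing, then compare with the Progress File), the selection step could look roughly like the sketch below. It assumes, as the `i % 4` in the commented-out 'EVAL()' suggests, that the extracted Listing holds four non-empty lines per File (name, size, date, time), and that the Progress File holds one processed name per line. The function name `firstUnprocessedFile` is hypothetical, and the real extract format may differ per Browser version.

```javascript
// Hypothetical helper: first file name in the listing that is not in the progress file.
// Assumption: listingText has four non-empty lines per file (name, size, date, time).
// Assumption: doneText has one processed file name per line, possibly in double quotes.
function firstUnprocessedFile(listingText, doneText) {
  var lines = listingText.match(/.+/g) || [];

  // Build a lookup of processed names; Object.create(null) avoids clashes
  // with inherited keys such as "constructor".
  var done = Object.create(null);
  (doneText.match(/.+/g) || []).forEach(function (line) {
    done[line.trim().replace(/^"|"$/g, "")] = true;
  });

  // Every 4th line, starting at index 0, is a file name.
  for (var i = 0; i < lines.length; i += 4) {
    var name = lines[i].trim();
    if (!done[name]) return name;
  }
  return ""; // everything in the folder has been processed
}
```

Because the comparison is done in plain JavaScript and not inside an 'EVAL()' string, File Names with quotes or backslashes don't need extra escaping.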

When loading the Content of the Folder, make sure to sort it on "Last Modified" (oldest first), so that you always process your Files in the Order of their Creation/Modification Timestamp, starting from the oldest.

The very first time you run your Script, the Data (= the File you want to process) will be located on the first Row of Data after the Table Header. After processing that File in the rest of your Script, you save its Name in the 'Progress_Tracker.LOG' File.
For each Run/Loop after that, you first tag the Cell/Field containing the "last processed File" from Part_1. Then "TAG POS=R1 TYPE=TR" (Relative Positioning, for the whole Row) or "TAG POS=R4 TYPE=TD" (for the Cell directly) will give you the Name of the File to use for the current Run/Loop.
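
The same "take the Row right after the last processed File" Logic can be sketched in plain JavaScript, for anyone converting the Approach to a '.js' Script. This is only an illustration of the Principle, not a tested iMacros Script. It assumes you already have the File Names as an array in the Order shown by the Listing (oldest first), and the function name `nextFileAfter` is hypothetical.

```javascript
// Hypothetical helper: picks the file that follows the last processed one.
// names:    file names in listing order (assumed sorted oldest first).
// lastDone: name read from the tracker log, or "" on the very first run.
function nextFileAfter(names, lastDone) {
  // First run: nothing logged yet, so take the first row of data.
  if (!lastDone) return names.length ? names[0] : null;

  var idx = names.indexOf(lastDone);
  // Logged file no longer in the folder: return null rather than guess.
  if (idx === -1) return null;

  // Equivalent of stepping one row down from the tagged anchor row.
  return idx + 1 < names.length ? names[idx + 1] : null;
}
```

A `null` Result means there is nothing (safe) to process, so the loop can stop there.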

That's all...! 8)

>>>

Hum, and for those interested, parallel Thread on SOF...
(No Replies yet...)
Hum, @OP, pity you didn't mention your FCI on SOF (read my Sig...), I would have given you a "+1" Up-Vote, ah-ah...! :wink:
Last edited by chivracq on Thu Mar 22, 2018 5:14 pm, edited 1 time in total.
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...
chivracq

Re: Select file from folder which is not already in CSV file

Post by chivracq » Wed Feb 21, 2018 5:24 pm

And...?, 2 days later, and still no Follow-up...? No Replies on SOF (either), looks like I'm your "best Friend" for the moment, ah-ah...! 8)

But OK..., @OP, did you make any Progress...?, and did you understand my Approach...?

I might be willing to write/post a "quick and dirty" Implementation of my Approach if you have "Difficulties" understanding it and converting it to a concrete Script, but you need to follow up a bit on your Thread... :idea:
(I very rarely write Scripts for other Users, and only for "Interesting" a bit "High-Level" Cases when Users really get stuck after really-really! trying their best...)
chivracq

Re: Select file from folder which is not already in CSV file

Post by chivracq » Sat Feb 24, 2018 2:27 pm

Oh...!, I thought we were going to get some Follow-up/Update as I saw you were checking the Forum, @OP, but nope...! :cry:

OK, never mind, and don't worry, I won't ask again, ah-ah...! :wink:

Hum, and I see that Advanced User @Shugar finally reacted to your Thread on SOF, offering to help you with your Script but probably not for Free, ah-ah...!
Well, I guess he finds your Approach a bit cumbersome as well and has some Idea(s) for a different Implementation of your Scenario, probably using a '.js' Script as well like you started...
(To be honest, I was actually waiting/wondering a bit if he would react, ah-ah...! :twisted: , he's I think the only one on SOF with the "Skills" and the "Creative Mind" to come up with a Solution, ah-ah...! Good-good...! Pity he didn't mention anything about how he would tackle your Scenario...)
(He's very good btw, and probably the most Advanced iMacros User on SOF, his Coding is clean, and he regularly comes up with "Creative" Solutions/Workarounds... :wink: )

Well OK, I guess we won't see your Final Script then, pity though... :(
But hum..., if you ever need Help again on this Forum, I only help Users who use the Forum "a bit correctly", and that "a bit correctly" includes neat Follow-up + sharing their Final Script... :idea: