inputting keywords from CSV file


by Athira on Wed May 31, 2017 4:31 pm

I am using a CSV file of keywords as input to my iMacros program. So far I can only read down one column and generate the search output that I need. I want the program to read across the rows as well. The input has to be read like a nested for loop, where every row and column are combined in the form (1,2),(1,3),(1,4)...(1,n), (2,3),(2,4)...(2,n) and so on. Is it possible to read the input file with iMacros in this way, or is there an alternative?

Here is my program, which scrapes information using keywords and saves the parsed information in another file. However, I want to use both rows and columns and input more than one keyword at once.
Code: Select all
'VERSION BUILD=9030808 RECORDER=FX
TAB T=1
SET !EXTRACT_TEST_POPUP NO
SET !REPLAYSPEED FAST
SET !TIMEOUT_PAGE 200
SET !ERRORIGNORE YES
SET !TIMEOUT_STEP 2
SET !DATASOURCE keyword.CSV
SET !LOOP 1
SET !DATASOURCE_LINE {{!LOOP}}

URL GOTO=https://twitter.com/search-advanced

WAIT SECONDS=1
'Search on the keyword in column 1 of the current CSV row
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:NoFormName ATTR=NAME:ands CONTENT={{!COL1}}
TAG POS=1 TYPE=BUTTON FORM=NAME:NoFormName ATTR=TXT:Search
WAIT SECONDS=1

'URL of the search results page
ADD !EXTRACT {{!URLCURRENT}}
SET url {{!EXTRACT}}
SET !EXTRACT NULL

'Keyword Scrape
TAG POS=1 TYPE=H1 ATTR=CLASS:SearchNavigation-titleText* EXTRACT=TXT
SET key {{!EXTRACT}}
SET !EXTRACT NULL

'Main Heading Scraping
TAG POS=1 TYPE=A ATTR=CLASS:AdaptiveNewsLargeImageHeadline-title* EXTRACT=TXT
SET mainheading {{!EXTRACT}}
SET !EXTRACT NULL

'Main heading URL
TAG POS=1 TYPE=A ATTR=CLASS:AdaptiveNewsLargeImageHeadline-title* EXTRACT=HREF
SET mainheadingurl {{!EXTRACT}}
SET !EXTRACT NULL

'Date of Post
TAG POS=1 TYPE=A ATTR=CLASS:AdaptiveNewsHeadlineDetails-date<sp>js-nav* EXTRACT=TXT
SET date {{!EXTRACT}}
SET !EXTRACT NULL

'Name of the user who posted this article
TAG XPATH=//*[@id="page-container"]/div[2]/div/div/div[2]/div/div[2]/div/div[2]/div[2]/div[1]/a/span EXTRACT=TXT
SET username {{!EXTRACT}}
SET !EXTRACT NULL

'Extract the @username
TAG POS=1 TYPE=A ATTR=TXT:@* EXTRACT=TXT
SET username1 {{!EXTRACT}}
SET !EXTRACT NULL

'Rebuild one output row from the saved values
ADD !EXTRACT {{mainheading}}
ADD !EXTRACT {{date}}
ADD !EXTRACT {{mainheadingurl}}
ADD !EXTRACT {{url}}
ADD !EXTRACT {{username}}
ADD !EXTRACT {{key}}
ADD !EXTRACT {{username1}}

SAVEAS TYPE=EXTRACT FOLDER=* FILE=test1_output.csv
CLEAR
Athira
 
Posts: 4
Joined: Wed May 31, 2017 4:22 pm

Re: inputting keywords from CSV file

by chivracq on Wed May 31, 2017 6:21 pm

FCIM...!
(Always mention your FCI when you open a Thread, read my Sig..., many Commands are not implemented for all Browsers/Versions...)

But yep, what you want can be done... The main "part" of your Scenario is "Nested Loops", you'll find several Threads on the Forum with Examples, both in '.iim' and in '.js' Scripts.
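For orientation, the "Nested Loops" idea can be sketched in plain JavaScript. This is only an illustration, not one of the Forum Examples: the function name `keywordPairs` and the sample keywords are made up, and the step that would hand each pair to a macro (for example through the iMacros scripting interface) is deliberately left out.

```javascript
// Hypothetical helper (not from the thread): list every unordered pair of
// keywords in the order (1,2),(1,3)...(1,n),(2,3)...(n-1,n).
function keywordPairs(keywords) {
  var pairs = [];
  for (var i = 0; i < keywords.length - 1; i++) {    // outer loop: 1st keyword
    for (var j = i + 1; j < keywords.length; j++) {  // inner loop: 2nd keyword
      pairs.push([keywords[i], keywords[j]]);
    }
  }
  return pairs;
}

// Placeholder keywords, only for the demonstration.
var demoPairs = keywordPairs(["alpha", "beta", "gamma", "delta"]);
console.log(demoPairs.map(function (p) { return p.join(" + "); }).join("\n"));
```

Starting the inner loop at `i + 1` is what skips (2,1) once (1,2) has been searched.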

The "a bit difficult" part will indeed be the Looping through the different Columns, for which I think Solutions have only been posted in '.js' Scripts on the Forum.
I can think directly of a (somewhat cumbersome...!) Solution in pure '.iim' if your '.CSV' has a Max Nb of Columns as it involves repeating a certain Block of Code as many times as your '!DATASOURCE_COLUMNS' Value.

But hum..., to manage your "Expectations", ah-ah...!, I'm never too keen on helping Users who want to use/misuse iMacros with Social Media (+ Spam + Games/Votes + Hacking) even if you apparently only want to extract some Data, but the Script that you want is nearly a bit "too powerful" in the wrong "hands" and could easily be used for "Direct Targeting" and "Spam"..., so I won't be helping you "too precisely", oops...!

Hum..., I can think of several Solutions in pure '.iim' actually, 2 with 'EVAL()', and one more with "Relative Positioning" (yes indeed...!) which is pretty simple and probably even quicker than in '.js', I've already posted several times on the Forum about this "Technique"... (And all 3 Solutions (in '.iim') take into account that you'll probably have a variable Number of Cols for each Row I guess...)
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...
chivracq
 
Posts: 6474
Joined: Sat Apr 13, 2013 6:07 am
Location: Amsterdam (NL)

Re: inputting keywords from CSV file

by Athira on Wed May 31, 2017 8:57 pm

Thank you for your reply!!!

I am sorry for not mentioning the FCI: it is iMacros Version 9.0.3, Firefox, Windows 8.

I do not have any intention of misusing this code. This is a project for collecting data for my research work at school. My main aim is to create a simple automated system that can collect data on topics for reference, as my research requires up-to-date information. It was only after you said it that I realized the Script I want is actually "too powerful" and could be used for "Direct Targeting" and "Spam". But I promise that is not my intention, and I will make sure that nothing goes wrong.

Anyway, once again thank you for your reply and for making me realize how the code could be used negatively.

Re: inputting keywords from CSV file

by chivracq on Thu Jun 01, 2017 5:51 am

OK, good for FCI, even if the FF Version is missing, FF53, I reckon...

Hum, nice School you go to then, if you get such interesting Projects, and I guess your Teacher will "realize" the Potential of your Script if the "Concept" came from you, it's a bit the Principle of Search Engines or maybe closer 'FB-Graph'...

But hum, for your Script, if you search the Forum on "nested loops" + "looping columns" for example, you will find relevant Threads with Code Examples as those are the 2 main "Concepts" your Script will need...

Even if I didn't exactly understand the exact Structure of your '.CSV' DataSource and how (and how many) you want to use the Keywords...:
- Every Row has a main Keyword and you want to search on 'main Keyword' + any other (x1) Keyword from the same Row.
- You want to try all Combinations with 2 Keywords from the same Row.
- All Keywords are in Col_1 and the same Keywords are repeated horizontally in Row_1 (except the first one) and you want to try all Combinations with 2 Keywords. But if that's your Scenario, then you could better use only 1 or 2 Cols, then you only need to take care of "Nested Loops" and you avoid the Looping on Cols... And you then only need to add one more Level of Nested Loops each time, if you want to extend your Concept to 3, then 4 Keywords, etc...
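
Adding "one more Level of Nested Loops" for each extra Keyword can also be written once as a recursive helper. The sketch below is plain JavaScript under the assumption that all keywords are already available as an array (for example read from Col_1); the name `combinations` is hypothetical and nothing here is iMacros-specific.

```javascript
// Hypothetical helper: every combination of `size` items, order preserved.
// Each recursion level plays the role of one nested loop.
function combinations(items, size) {
  var result = [];
  function extend(start, chosen) {
    if (chosen.length === size) {
      result.push(chosen.slice());   // copy, because `chosen` is reused
      return;
    }
    for (var i = start; i < items.length; i++) {
      chosen.push(items[i]);
      extend(i + 1, chosen);         // next level only sees later items
      chosen.pop();
    }
  }
  extend(0, []);
  return result;
}

// Placeholder keywords: 2-keyword searches first, then 3-keyword searches.
var words = ["alpha", "beta", "gamma", "delta"];
console.log(combinations(words, 2).length + " searches with 2 keywords");
console.log(combinations(words, 3).length + " searches with 3 keywords");
```

Going from 2 to 3 or 4 keywords then only means changing the `size` argument instead of copying another loop.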

Re: inputting keywords from CSV file

by Athira on Thu Jun 01, 2017 9:43 am

Thank you so much for reconsidering and replying!

My school gives really challenging projects. I only discussed the idea with my teacher, and he left the complete project to me. I have not yet shown him the partial script and the output I have produced so far. I thought I would complete it first, then show it to him, and hope that he too realizes the potential of my idea and script.

Regarding my script, my input keyword CSV file is the third kind you mentioned.

All Keywords are in Col_1 and the same Keywords are repeated horizontally in Row_1 (except the first one, as (1,2) and (2,1) give the same search output). I want to try all Combinations first with 2 keywords and then keep adding Keywords, depending on the depth of data that we need to collect.

Once again, thank you so much for reconsidering.

Re: inputting keywords from CSV file

by chivracq on Thu Jun 01, 2017 11:34 am

OK, but then if your Input File is of Type_3, then that's the "easiest" Case, ah-ah...! You don't even need to repeat all Keywords in Row_1, you simply "play" with the "Inner" Nested Loop for the 2nd Keyword on '!DATASOURCE_LINE' using 'EVAL()'.

Mini-Difficulty will be that the Inner Loop (for the 2nd Keyword) will each time become 1 shorter than the previous one (on each 1st Keyword), and you will need a way to "know" that you have reached the last Row. There are several Methods for that:
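
The arithmetic behind that "Inner" Loop can be sketched as follows. This is plain JavaScript, not the 'EVAL()' code itself: it assumes a single-column file whose row count is already known, and the name `pairFromLoop` is made up for the illustration. It turns one running loop counter into the two row numbers to read for the 1st and 2nd Keyword.

```javascript
// Hypothetical helper: map a 1-based loop counter to the two row numbers
// (first < second) of the pair to search, for a file with `rowCount` rows.
// Returns null once every pair has been used.
function pairFromLoop(loop, rowCount) {
  var first = 1;
  var remaining = loop;      // position inside the block of the current 1st row
  var span = rowCount - 1;   // how many pairs start with the current 1st row
  while (span > 0 && remaining > span) {
    remaining -= span;       // skip the whole block of this 1st row
    first += 1;
    span -= 1;               // each following block is 1 shorter
  }
  if (span <= 0 || remaining < 1) {
    return null;
  }
  return { first: first, second: first + remaining };
}

// With 4 rows the counter runs 1..6 and then the helper returns null.
for (var k = 1; k <= 7; k++) {
  console.log(k, JSON.stringify(pairFromLoop(k, 4)));
}
```

The two numbers would then be used as row numbers for the 1st and 2nd Keyword; the `null` result is the signal that all Combinations are done.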
- In a '.js' Script, you can have a first '.iim' Macro that will first loop through all Rows to count how many Rows you have in total.
- In an '.iim', during one given Loop, you can always check the next Row at the same time and if there is no Data in the next Row, you know you have reached the EOF (End Of File) and you then know that the Outer Loop can now jump to the next Row for the 1st Keyword. But your Macro needs to be able to "communicate" with itself to pass that Info to the next Loop, which can be done through the OS Clipboard (a bit dangerous, if you do other "things" at the same time and you want to do some Copy&Paste, you then "screw" the Content of the Clipboard for your Script and you can start all over again...!) or with some mini-Temp File used as a 2nd DataSource.
- If the Number of Rows in your '.CSV' is fixed and known, you can hard-code that Nb in your Script or include a Counter in the '.CSV' itself. (Can easily be automatically computed in a Cell if you maintain your '.CSV' File from Excel/OO.)
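
As a rough illustration of the "count the Rows first" idea, here is a plain JavaScript sketch. It assumes the content of the '.CSV' is already available as one string (how it gets read is not shown), and both function names are hypothetical. The first empty line is treated as the End Of File, like the "no Data in the next Row" check described above.

```javascript
// Hypothetical helper: count keyword rows in single-column CSV text,
// stopping at the first empty line (treated as EOF).
function countKeywordRows(csvText) {
  var lines = csvText.split(/\r?\n/);
  var count = 0;
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].trim() === "") {
      break;                 // empty row: nothing more to read
    }
    count++;
  }
  return count;
}

// Number of 2-keyword searches for n rows: n * (n - 1) / 2.
function totalPairs(rowCount) {
  return rowCount * (rowCount - 1) / 2;
}

// Placeholder file content, only for the demonstration.
var sampleCsv = "alpha\r\nbeta\r\ngamma\r\ndelta\r\n";
var rows = countKeywordRows(sampleCsv);
console.log(rows + " rows, " + totalPairs(rows) + " pairs to search");
```

Knowing the total up front also gives the value to use as the maximum number of Loops.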

>>>

I see btw that you opened a similar Thread on StackOverFlow, ah-ah...! No Pb, don't worry...!
And Advanced User @Shugar (he's good...!) gave you a clever Trick (by using a "fake" Delimiter...!) to manage to retrieve the Content of a whole Row without the Need to check all Cols one by one. But this Trick would only work on FF... And hum..., I am afraid you may have to struggle a bit against the Double Quotes as Separator if your '.CSV' uses any Separator, ah-ah...!
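
@Shugar's Script is not reproduced in this Thread, so the following is only a sketch of the follow-up step: once a whole Row is available as one string, it still has to be split into Keywords, and Excel/OO may wrap some cells in Double Quotes. The function name `splitCsvRow` is hypothetical, and a comma is assumed as the real Separator.

```javascript
// Hypothetical helper: split one raw CSV row into cells, honouring
// double-quoted cells and doubled quotes ("") inside them.
function splitCsvRow(row) {
  var cells = [];
  var cell = "";
  var inQuotes = false;
  for (var i = 0; i < row.length; i++) {
    var ch = row.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (row.charAt(i + 1) === '"') {
          cell += '"';       // doubled quote = one literal quote
          i++;
        } else {
          inQuotes = false;  // closing quote
        }
      } else {
        cell += ch;          // commas inside quotes stay in the cell
      }
    } else if (ch === '"') {
      inQuotes = true;       // opening quote
    } else if (ch === ",") {
      cells.push(cell);      // separator: finish the current cell
      cell = "";
    } else {
      cell += ch;
    }
  }
  cells.push(cell);          // last cell has no trailing separator
  return cells;
}

// Placeholder row, only for the demonstration.
console.log(splitCsvRow('climate,"global, warming",energy'));
```

The number of cells returned is also a direct count of the Cols in that Row, which covers Rows of variable Lengths.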

His whole Script is a bit similar to one of the 2 Methods I had in mind (with 'EVAL()'), except that it doesn't take into account that you might have Rows of variable Lengths, but it can easily be adapted to include that Factor for each Row to be able to count dynamically how many Cols it contains, but then you fall back on the Drawback that the Macro needs to be able to communicate with itself between 2 Loops...

Ah...!, but hum...!, his Trick gives you another (easy...!) way to count the Nb of Rows/Keywords, but hum, that forces you to make sure that you maintain your List of Keywords very carefully, and that each time you add a Keyword "vertically" in an extra Row, you add it as well "horizontally" in the first Row. (But that was your original Plan anyway...)

Re: inputting keywords from CSV file

by Athira on Thu Jun 01, 2017 2:14 pm

Thank you for your reply!

Oh yes, I had been stuck on this part of my project for a long time. I was unable to resolve the issue, it was taking more time than I expected, and the deadline for completion is approaching, so I posted in 2 forums simultaneously.

I am sorry, I forgot to mention that the .CSV file I am using for both input and output is from Excel/OO. I used that because I thought it was more presentable and adding keywords to cells was much easier.

I tried the code suggested by @Shugar, but I could not get any output and iMacros hung. I tried to modify it, but I still could not run it properly.

As you said, it may be because of the format of the input file. I am a bit confused about this part, and I am not sure whether I am missing anything important in the code.