Extracting Page Source Information If Present


by jer1975 on Sun Jan 31, 2016 7:32 am

VERSION BUILD=8961227 RECORDER=FX
Windows 8.1 Firefox 44.0

Okay, please forgive me for what may end up being very novice questions. I am fairly new to trying to use iMacros. Added it to Chrome a couple years ago but never really tried using it. I just added it to Firefox and am trying to find out if it can even do what I want it to do.

I am trying to use iMacros to search the page HTML source for a specific string. If that string exists, I would like to extract it along with the following 60 characters and append that, plus the date and time, to a CSV file. The thing is, that string could appear in the source 1 time, 20 times or perhaps not at all. I would then want the macro to wait 10 minutes, refresh the page and do the search all over again. I am looking for what may be subtle differences in whether the string is found or not, though I can determine the differences by analyzing the CSV file later.

I am successfully able to search the page source with the below code. I have been able to get it to display a popup when the string is found. The problem is that it times out when the string is not found and the loop ends. Also, it is only prompting that the string exists. It is not logging that the string exists in a file for later analysis. I am also only able to extract the exact string, not the following 60 characters.

Code: Select all
VERSION BUILD=8961227 RECORDER=FX
TAB T=1
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:SearchForm ATTR=*
TAG POS=1 TYPE=A ATTR=TXT:More<SP>Dates
SEARCH SOURCE=REGEXP:"(String_Text_To_Find)" IGNORE_CASE=YES EXTRACT=$1
PROMPT {{!EXTRACT}}


I hope that I provided enough information for someone to be able to answer my question. I guess I want to know if this is even something that iMacros is capable of doing before I put a lot of time into working through it, only to find out it isn't possible. My knowledge of iMacros and the code is very rudimentary right now, but I hope to learn a lot here and from any resources you could point me to.

Thanks in advance!
jer1975
 

Re: Extracting Page Source Information If Present

by chivracq on Sun Jan 31, 2016 1:27 pm

Compliments on the way you presented your Thread...

Have a look at the Command Reference for a list of all commands in iMacros. You'll notice:
- '!ERRORIGNORE' (and maybe a shorter '!TIMEOUT_STEP') for your timeout problem.
- 'SAVEAS' for saving the data you want to a .CSV file.

For the string with the following 60 chars, I guess you can do it with 'SEARCH SOURCE=REGEXP' as well, but I don't know this command really well as I don't use it myself. The workaround I would use myself would be to tag some HTML element containing your search text and extract its TXT or HTM data, then use 'EVAL()' + 'split()' + 'substr()' to isolate the first 60 chars after your search text in the extract:
Code: Select all
TAG POS=1 TYPE=* ATTR=TXT:*String<SP>to<SP>find* EXTRACT=TXT

Use EXTRACT=TXT or EXTRACT=HTM, and TYPE=* or HTML or BODY or DIV or SPAN. You'll have to try different options depending on how generic you want your script to be.
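
As a rough illustration of the 'split()' + 'substr()' idea, here is a minimal sketch in plain JavaScript, which is the language 'EVAL()' runs. The function names charsAfter and allWithTail are made up for this example, and the quoting/escaping needed to fit such code inside an iMacros EVAL("...") string is not shown, so treat it as a starting point rather than a tested macro.

```javascript
// Hypothetical helper: return up to `count` characters that follow the
// FIRST occurrence of `needle` in `text`, or "" when `needle` is absent.
// Note: split() is case-sensitive.
function charsAfter(text, needle, count) {
  var parts = text.split(needle);
  if (parts.length < 2) {
    return ""; // needle not found
  }
  // Rejoin the remaining pieces so later occurrences stay in the tail.
  var tail = parts.slice(1).join(needle);
  return tail.substr(0, count);
}

// Escape regex metacharacters so `needle` is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant: return EVERY occurrence of `needle` (ignoring case)
// together with up to `count` characters that follow it. Returns [] when
// nothing matches. Matches do not overlap.
function allWithTail(text, needle, count) {
  var re = new RegExp(escapeRegExp(needle) + "[\\s\\S]{0," + count + "}", "gi");
  return text.match(re) || [];
}
```

charsAfter mirrors the split/substr approach and only looks at the first hit. allWithTail is a regex alternative for the case where the string shows up several times, and its empty-array result for "not found" could be checked before writing anything to the CSV file.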

