Unable to save 1 little box of text

bobbyd67
Posts: 15
Joined: Sat May 21, 2011 5:19 am

Post by bobbyd67 » Sun May 31, 2015 1:43 am

Hi All,

VERSION BUILD=8920312
Win7
Firefox 36.0.1

I am trying to search for a product on a website, then extract the product's text description and save it to a CSV file.

I have about 300 part numbers to search for.

This is the website it first starts at.
http://us.idec.com/Home.aspx

The macro then searches for the product successfully and lands on a page such as this one: http://us.idec.com/Catalog/SearchResult ... 3G8JT22TFB*

You can then see the small text description on the page: "Touchscreen HMI 8.4 inch TFT 65K Color SVGA 800x600 Black Bezel Ethernet USB Port".

This is the description I need to save for each searched product.

I have the looping all set, but I cannot get it to save that section of text.

Please help.

Here is the macro as I have it so far.

########

VERSION BUILD=8920312 RECORDER=FX
TAB T=1
URL GOTO=http://us.idec.com/

SET !LOOP 1
SET !DATASOURCE IDEC.csv
SET !DATASOURCE_COLUMNS 1
SET !DATASOURCE_LINE {{!LOOP}}

EVENT TYPE=CLICK SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV:nth-of-type(3)>DIV>DIV:nth-of-type(3)>DIV>DIV>DIV>INPUT" BUTTON=0
EVENTS TYPE=KEYPRESS SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV:nth-of-type(3)>DIV>DIV:nth-of-type(3)>DIV>DIV>DIV>INPUT" CHARS={{!COL1}}
EVENT TYPE=MOUSEUP POINT="(1003,86)"
EVENT TYPE=CLICK SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV:nth-of-type(3)>DIV>DIV:nth-of-type(3)>DIV>DIV>DIV>INPUT:nth-of-type(2)" BUTTON=0
EVENT TYPE=CLICK SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV>DIV:nth-of-type(2)>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV:nth-of-type(2)>P" BUTTON=0

TAG POS=1 TYPE=text ATTR=TXT:* EXTRACT=TXT


SAVEAS TYPE=TXT FOLDER=* FILE=idecscrape_{{!NOW:yymmdd_hhnnss}}.csv

WAIT SECONDS=5

I look forward to any help someone could give.
chivracq
Posts: 9809
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: Unable to save 1 little box of text

Post by chivracq » Sun May 31, 2015 2:16 pm

OK, but I don't see any difficulty, especially for somebody who's been using iMacros for at least 4 years, going by your Registration Date on the Forum...

You don't provide any Search Keywords from your .CSV File, but using Standard Recording Mode with IDs and one simple Relative Positioning Tag/Extract, the following Macro extracts the Product Image URL + Name + Description once you've already landed on the Product Page. The Macro looks generic enough to me to work with other Search Keywords on this Site:

Code:

VERSION BUILD=8820413 RECORDER=FX
TAB T=1
SET !EXTRACT_TEST_POPUP NO
URL GOTO=http://us.idec.com/Catalog/SearchResults.aspx?IndexCatalogue=IdecIndex&SearchQuery=HG3G8JT22TFB*

'Search Results:
TAG POS=1 TYPE=H3 ATTR=ID:ctl00_ContentPlaceHolder1_searchHeader
'TAG POS=1 TYPE=P ATTR=TXT:Showing<SP>1<SP>-<SP>1<SP>of<SP>1<SP>Products
'>
'Product Image:
'TAG POS=1 TYPE=INPUT:IMAGE FORM=ID:aspnetForm ATTR=ID:ctl00_ContentPlaceHolder1_ProductsList_ProductListRepeater_ctl00_ThumbnailImageButton
SET !EXTRACT NULL
TAG POS=1 TYPE=INPUT:IMAGE FORM=ID:aspnetForm ATTR=ID:ctl00_ContentPlaceHolder1_ProductsList_ProductListRepeater_ctl00_ThumbnailImageButton EXTRACT=HREF
SET Product_Image {{!EXTRACT}}
'TAG POS=1 TYPE=H2 ATTR=TXT:HG3G-8JT22TF-BTouchscreen<SP>8.4<SP>inch<SP>65k-color
'BACK
'>
'Product Name:
'TAG POS=1 TYPE=A ATTR=ID:ctl00_ContentPlaceHolder1_ProductsList_ProductListRepeater_ctl00_ProductNameHyperLink
SET !EXTRACT NULL
TAG POS=1 TYPE=A ATTR=ID:ctl00_ContentPlaceHolder1_ProductsList_ProductListRepeater_ctl00_ProductNameHyperLink EXTRACT=TXT
SET Product_Name {{!EXTRACT}}
'BACK
'>
'Product Description:
'TAG POS=1 TYPE=P ATTR=TXT:Touchscreen<SP>HMI<SP>8.4<SP>inch<SP>TFT<SP>65K<SP>Color*
SET !EXTRACT NULL
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
SET Product_Description {{!EXTRACT}}
'>
'View Product:
'TAG POS=1 TYPE=A ATTR=ID:ctl00_ContentPlaceHolder1_ProductsList_ProductListRepeater_ctl00_ViewProductHyperLink
'BACK
'>
'Display Extract Info:
PROMPT Product_Image:<BR>_{{Product_Image}}_<BR><BR>Product_Name:<BR>_{{Product_Name}}_<BR><BR>Product_Description:<BR>_{{Product_Description}}_
=> Extracted Info:
Product_Image:
_http://us.idec.com/productimages/Thumbnails//OperatorInterfaces/HG2G-Bcolor.jpg_

Product_Name:
_HG3G-8JT22TF-B_

Product_Description:
_Touchscreen HMI 8.4 inch TFT 65K Color SVGA 800x600 Black Bezel Ethernet USB Port_
(Tested on iMacros for FF v8.8.2, Pale Moon v24.6.2 (=FF31), Win7-x64.)
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- I don't even read the Qt if that (required) Info is not mentioned...!
- Script & URL help a lot for more "educated" Help...

Post by bobbyd67 » Mon Jun 01, 2015 1:29 am

chivracq, thank you for your reply.

Yes, I registered 4 years ago, but I did not use iMacros an awful lot. I have just returned to try to learn it a bit more and use it again.

I tried editing my macro to incorporate parts of yours.

Unfortunately it still won't work properly.

As you saw in my example, it reads from a spreadsheet file (IDEC.csv). Here is a sample of 6 part numbers.
FC5A-D12K1E
FC5A-D16RK1
FC5A-D32K3
FC5A-C10R2
FC5A-C10R2C
FC5A-C10R2D

What it needs to do is search for the part number, THEN extract ONLY the product description, store that in a CSV file, and continue through the list in IDEC.csv.

So what I would end up with in the CSV file is a long list of product descriptions only.

I hope this explains enough for you to understand.

Thank you again
Darren

P.S. My attempt at modifying the macro did extract info, but it seemed to extract the whole page, not just the Product Description.

Post by chivracq » Mon Jun 01, 2015 7:43 pm

Yep, that's normal: your 'SAVEAS' Statement is faulty in two ways (for your purpose). 'TYPE=TXT' saves the text of the whole Page rather than the extracted Data, and the '{{!NOW}}' Timestamp in the File Name goes down to the Second:

Code:

SAVEAS TYPE=TXT FOLDER=* FILE=idecscrape_{{!NOW:yymmdd_hhnnss}}.csv

You should be using something like this for 1 File per Hour:

Code:

SAVEAS TYPE=EXTRACT FOLDER=* FILE=idecscrape_{{!NOW:yyyymmdd_hhh}}.csv

or like this for 1 File per Day:

Code:

SAVEAS TYPE=EXTRACT FOLDER=* FILE=idecscrape_{{!NOW:yyyymmdd}}.csv

Post by bobbyd67 » Mon Jun 01, 2015 10:10 pm

chivracq, I am confused about the hour or day bit. I certainly do not want to save 1 entry per hour or day.

Below is my code as I have it now. I copied only the portion that extracts the "Product Description" into what I had originally.

####
VERSION BUILD=8920312 RECORDER=FX
TAB T=1
SET !EXTRACT_TEST_POPUP NO
URL GOTO=http://us.idec.com/

SET !LOOP 1
SET !DATASOURCE IDEC.csv
SET !DATASOURCE_COLUMNS 1
SET !DATASOURCE_LINE {{!LOOP}}

EVENT TYPE=CLICK SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV:nth-of-type(3)>DIV>DIV:nth-of-type(3)>DIV>DIV>DIV>INPUT" BUTTON=0
EVENTS TYPE=KEYPRESS SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV:nth-of-type(3)>DIV>DIV:nth-of-type(3)>DIV>DIV>DIV>INPUT" CHARS={{!COL1}}
EVENT TYPE=MOUSEUP POINT="(1003,86)"
EVENT TYPE=CLICK SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV:nth-of-type(3)>DIV>DIV:nth-of-type(3)>DIV>DIV>DIV>INPUT:nth-of-type(2)" BUTTON=0
EVENT TYPE=CLICK SELECTOR="HTML>BODY>FORM>DIV:nth-of-type(3)>DIV>DIV>DIV>DIV:nth-of-type(2)>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV:nth-of-type(2)>P" BUTTON=0

SET !EXTRACT NULL
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
SET Product_Description {{!EXTRACT}}

SAVEAS TYPE=TXT FOLDER=* FILE=idecscrape.csv

WAIT SECONDS=5

###

The macro did run, but what it saved to the CSV file was a heap of text, not just the product description line, as you can see in the screencast below.

http://screencast.com/t/hsKHG4xh6

All I need to do is loop the macro over the 300-odd part numbers and save those 300 product descriptions to 1 continuous CSV file, and the job is done.

Post by chivracq » Mon Jun 01, 2015 10:48 pm

About the Day/Hour: your original Macro was creating a new File every single Second because of your use of the '{{!NOW}}' Variable in the File Name. I slowed that process down to 1 new File per Hour or per Day, or always the same File like you have now...

In my first Macro for you, I said I was using Relative Positioning:

Code:

TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT

"R1" is for Relative...
"Relative" means relative to something, but you've removed the Statement tagging the Anchor (Product Name) just before it... And you need to keep the "fake" Extract on Product Name, even if you don't do anything with it, to avoid following the Link...
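
A minimal sketch of that anchor-then-relative pattern, reusing the Product Name link ID from the earlier macro in this thread (it assumes that ID still exists on the results page and that the description is the first P element after the link):

```
' Anchor: tag the Product Name link. EXTRACT=TXT makes this a "fake" extract,
' so the link is tagged without being followed.
TAG POS=1 TYPE=A ATTR=ID:ctl00_ContentPlaceHolder1_ProductsList_ProductListRepeater_ctl00_ProductNameHyperLink EXTRACT=TXT
' Throw away the anchor text so that only the description stays in !EXTRACT.
SET !EXTRACT NULL
' R1 = first matching P element after the anchor tagged above.
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
```

Without the first TAG line there is no anchor for POS=R1 to be relative to, which is why the extraction stops behaving as expected once that line is removed.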
techimac
Posts: 407
Joined: Fri Feb 20, 2015 9:27 pm

Post by techimac » Tue Jun 02, 2015 2:52 am

This works.

TAG POS=1 TYPE=P ATTR=TXT:Showing<SP>*Products
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT

If there are 2 descriptions:
TAG POS=1 TYPE=P ATTR=TXT:Showing<SP>*Products
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
Available for custom iim, javascript iMacros scripts

Post by bobbyd67 » Tue Jun 02, 2015 2:57 am

Where in the macro do I put these bits? I need it to write only the description field to the CSV.

Post by techimac » Tue Jun 02, 2015 3:05 am

You have complicated the macro.

Post by techimac » Tue Jun 02, 2015 3:07 am

URL GOTO=http://us.idec.com/
SET !DATASOURCE IDEC.csv
SET !DATASOURCE_COLUMNS 1

TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:aspnetForm ATTR=NAME:ctl00$ctl11$SearchTextBox CONTENT={{!COL1}}
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:aspnetForm ATTR=NAME:ctl00$ctl11$SearchButton

TAG POS=1 TYPE=P ATTR=TXT:Showing<SP>*Products
'1st description
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
'2nd description
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT

SAVEAS TYPE=TXT FOLDER=* FILE=idecscrape_{{!NOW:yymmdd_hhnnss}}.csv

WAIT SECONDS=5

Post by techimac » Tue Jun 02, 2015 3:10 am

This will also work

SET !DATASOURCE IDEC.csv
SET !DATASOURCE_COLUMNS 1

URL GOTO=http://us.idec.com/Catalog/SearchResult ... ry={{!COL1}}*

TAG POS=1 TYPE=P ATTR=TXT:Showing<SP>*Products
'1st description
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
'2nd description
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT

SAVEAS TYPE=TXT FOLDER=* FILE=idecscrape_{{!NOW:yymmdd_hhnnss}}.csv

WAIT SECONDS=5

Post by bobbyd67 » Tue Jun 02, 2015 3:12 am

Thanks, techimac. Where would I put the loop function? The file has about 300 part numbers that it needs to go through and save.

Post by techimac » Tue Jun 02, 2015 3:18 am

Specify 300 in the Max: field, then click Play (Loop).
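
Putting the pieces of this thread together, one possible looping macro might look like the sketch below. It is untested and rests on several assumptions: that the search-box TAG lines and the "Showing ... Products" anchor posted above still match the site, that IDEC.csv has one part number per line, and that each search returns a single product. SET !DATASOURCE_LINE {{!LOOP}} is carried over from the original macro so that each loop iteration reads the next line, and SAVEAS uses TYPE=EXTRACT with a fixed file name, as suggested earlier in the thread, so that the descriptions should accumulate in one file.

```
TAB T=1
SET !EXTRACT_TEST_POPUP NO
' Read one part number per loop iteration from IDEC.csv.
SET !DATASOURCE IDEC.csv
SET !DATASOURCE_COLUMNS 1
SET !DATASOURCE_LINE {{!LOOP}}
URL GOTO=http://us.idec.com/
' Fill in the search box and submit the form.
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:aspnetForm ATTR=NAME:ctl00$ctl11$SearchTextBox CONTENT={{!COL1}}
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:aspnetForm ATTR=NAME:ctl00$ctl11$SearchButton
' Anchor on the "Showing ... Products" line, then extract the next paragraph.
TAG POS=1 TYPE=P ATTR=TXT:Showing<SP>*Products
TAG POS=R1 TYPE=P ATTR=TXT:* EXTRACT=TXT
' Save only the extracted value; the fixed file name keeps everything in one file.
SAVEAS TYPE=EXTRACT FOLDER=* FILE=idecscrape.csv
WAIT SECONDS=5
```

Run it with Play (Loop) and Max set to the number of lines in IDEC.csv.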

Post by bobbyd67 » Tue Jun 02, 2015 3:20 am

of course....doh

Post by techimac » Tue Jun 02, 2015 3:24 am

Note that the "1st description" line must start with an apostrophe ('1st description), since it's a comment.