Scraping a Table that is not HTML

antoine
Posts: 6
Joined: Wed Oct 09, 2013 2:12 pm

Scraping a Table that is not HTML

Post by antoine » Fri Dec 13, 2013 2:30 pm

Hello everybody,

To make it short, I'm trying to scrape this URL:
http://gatherproxy(DOT)com/proxylist/country/?c=Venezuela
I'm trying to extract two cells, the IP address and the port, for each row.

While TAG POS=2 TYPE=TD ATTR=TXT EXTRACT=TXT works pretty well with HTML tables, it's not working here because the data is not inside a table in the page source but is inserted by JavaScript.

I don't know anything about JS; to me it's like Japanese. Can someone help me understand what to do when I face this kind of page?

Thank you very much
Derek, Tech Support
Posts: 12
Joined: Thu Dec 12, 2013 3:27 pm

Re: Scraping a Table that is not HTML

Post by Derek, Tech Support » Fri Dec 13, 2013 5:02 pm

Hello!

iMacros can easily extract well-formatted HTML tables with just one command:

http://wiki.imacros.net/Data_Extraction#Extract_Table

In this case, I looked at the source HTML code to find the main table with the data in it and discovered that it is uniquely identified with id="tblproxy". I used this information to construct the following command for extracting the data:

TAG POS=1 TYPE=TABLE ATTR=ID:tblproxy EXTRACT=TXT

Then save the data as a CSV file using the SAVEAS command:

SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv

Here's the full macro:

Code:

URL GOTO=http://gatherproxy.com/proxylist/country/?c=Venezuela
TAG POS=1 TYPE=TABLE ATTR=ID:tblproxy EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv

Another way to do this would be to start recording a macro and then open the Text Extraction Wizard (Record tab > Wizards > Extract > Text). You would then have to click on the outer table element, which is possible but a bit tricky. It is usually easier to look directly at the page source and construct the command manually.

Thanks!
Kind regards,
iMacros Support
antoine
Posts: 6
Joined: Wed Oct 09, 2013 2:12 pm

Re: Scraping a Table that is not HTML

Post by antoine » Fri Dec 13, 2013 7:39 pm

Hi Derek,

First of all, thanks for your help. The result I'm getting is not what I expected.

I need to extract the data into the CSV file like this:

"198.526.35.2";"8080"
"198.526.35.2";"8080"
"198.526.35.2";"8080"
"198.526.35.2";"8080"
"198.526.35.2";"8080"

Instead, with a single extract I'm getting everything together. I didn't mention it, but when this macro finishes it runs another macro that is used for PHP processing of each row, and a "one shot" extract of the table doesn't allow me to use the data afterwards.

Can you give me some other advice, please? :) Thank you so much
antoine
Posts: 6
Joined: Wed Oct 09, 2013 2:12 pm

Re: Scraping a Table that is not HTML

Post by antoine » Sun Dec 15, 2013 10:30 am

Hello everybody :)

Sorry for insisting about my table, but maybe the title of my post was not clear.

This is part of how the data that I would like to scrape is shown in the source code of the page:

Code:

<script type="text/javascript">
                    gp.insertPrx({"PROXY_CITY":"","PROXY_COUNTRY":"Venezuela","PROXY_IP":"186.93.245.143","PROXY_LAST_UPDATE":"7 48","PROXY_PORT":"8080","PROXY_REFS":null,"PROXY_STATE":null,"PROXY_STATUS":"OK","PROXY_TIME":"699","PROXY_TYPE":"Anonymous","PROXY_UID":null,"PROXY_UPTIMELD":"177\/20"});
                </script>
                <script type="text/javascript">
                    gp.insertPrx({"PROXY_CITY":"","PROXY_COUNTRY":"Venezuela","PROXY_IP":"190.205.163.189","PROXY_LAST_UPDATE":"7 17","PROXY_PORT":"8080","PROXY_REFS":null,"PROXY_STATE":null,"PROXY_STATUS":"OK","PROXY_TIME":"163","PROXY_TYPE":"Transparent","PROXY_UID":null,"PROXY_UPTIMELD":"1\/0"});
                </script>
                <script type="text/javascript">
                    gp.insertPrx({"PROXY_CITY":"","PROXY_COUNTRY":"Venezuela","PROXY_IP":"186.91.78.208","PROXY_LAST_UPDATE":"6 48","PROXY_PORT":"8080","PROXY_REFS":null,"PROXY_STATE":null,"PROXY_STATUS":"OK","PROXY_TIME":"87","PROXY_TYPE":"Transparent","PROXY_UID":null,"PROXY_UPTIMELD":"1\/0"});
                </script>


I need at least two values: the proxy IP and the proxy port.
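
One way to get just these two values, outside of iMacros, would be to parse the gp.insertPrx(...) calls straight from the page source. The Python sketch below assumes each call wraps a single flat JSON object on one line, as in the excerpt above. The function names and the shortened sample string are made up for illustration and are not part of the site or of iMacros.

```python
import csv
import io
import json
import re

# Matches the JSON object passed to each gp.insertPrx(...) call.
# Assumes the object is flat (no nested braces), as in the excerpt above.
INSERT_CALL = re.compile(r"gp\.insertPrx\((\{.*?\})\);")


def extract_proxies(page_source):
    """Return a list of (ip, port) tuples found in the page source."""
    proxies = []
    for match in INSERT_CALL.finditer(page_source):
        record = json.loads(match.group(1))
        proxies.append((record["PROXY_IP"], record["PROXY_PORT"]))
    return proxies


def to_csv(proxies):
    """Format the pairs as "ip";"port" lines, one proxy per line."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, delimiter=";", quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writerows(proxies)
    return buffer.getvalue()


# Shortened sample built from the first two records shown above.
sample = (
    '<script type="text/javascript">'
    'gp.insertPrx({"PROXY_IP":"186.93.245.143","PROXY_PORT":"8080",'
    '"PROXY_REFS":null,"PROXY_UPTIMELD":"177\\/20"});'
    "</script>"
    '<script type="text/javascript">'
    'gp.insertPrx({"PROXY_IP":"190.205.163.189","PROXY_PORT":"8080",'
    '"PROXY_REFS":null,"PROXY_UPTIMELD":"1\\/0"});'
    "</script>"
)

print(to_csv(extract_proxies(sample)), end="")
```

Each output line has the "ip";"port" shape asked for in this thread, so the resulting file could in principle be used as a row-per-proxy datasource.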

Derek helped me with a little trick: extracting the full content with TAG POS=1 TYPE=TABLE works, but not quite the way I need, because it extracts the whole table in a form that I can't use as a datasource in another macro. Basically, when I use this CSV in another macro, iMacros only finds undefined values :?

I have attached an image to show you what the CSV looks like.

I would like the extracted data in the CSV file to look the way it usually does, like this:

Code:

"198.526.35.2";"8080"
"198.526.35.2";"8080"
"198.526.35.2";"8080"
"198.526.35.2";"8080"
"198.526.35.2";"8080"
Thank you so much for your help
Attachments
Sans titre.png
What the CSV looks like when exporting the full content of the TABLE tag
Derek, Tech Support
Posts: 12
Joined: Thu Dec 12, 2013 3:27 pm

Re: Scraping a Table that is not HTML

Post by Derek, Tech Support » Tue Dec 17, 2013 9:13 pm

Hello!

Here's a quick macro that grabs just the data from the two columns. This macro is hard-coded for 25 rows of data. If there are fewer rows, you will see some #EANF# entries in the output file.

This macro could be broken into two and then controlled for more robust looping. Please see "Loop after Query or Login" for this capability.

Code:

SET !WAITPAGECOMPLETE YES
VERSION BUILD=9052613
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://gatherproxy.com/proxylist/country/?c=Venezuela
'Gather table column data
'Data Row 1
SET !VAR0 "TD"
SET !VAR1 "3"
SET !VAR2 "4"
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 2
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 3
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 4
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 5
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 6
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 7
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 8
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 9
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 10
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 11
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 12
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 13
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 14
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 15
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 16
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 17
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 18
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 19
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 20
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 21
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 22
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 23
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 24
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 +8
ADD !VAR2 +8
'Data Row 25
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
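
The repeated blocks above follow a fixed pattern: data row n reads the TD cells at positions 3 + 8(n - 1) and 4 + 8(n - 1). As a rough sketch, a short Python script could generate equivalent macro text with those positions written out literally, so that no arithmetic is needed inside the macro itself. The start positions 3 and 4 and the step of 8 are taken from the macro above; the function name, its parameters, and the trimmed header lines are illustrative assumptions.

```python
def build_macro(
    rows=25,
    first_ip_pos=3,
    first_port_pos=4,
    cells_per_row=8,
    url="http://gatherproxy.com/proxylist/country/?c=Venezuela",
    out_file="venezuela_proxy_list.csv",
):
    """Return macro text that extracts the IP and port cells of each row."""
    lines = [
        "SET !WAITPAGECOMPLETE YES",
        "TAB T=1",
        f"URL GOTO={url}",
        "'Gather table column data",
    ]
    for row in range(rows):
        # Every table row is assumed to contribute cells_per_row TD elements.
        offset = row * cells_per_row
        lines.append(f"'Data Row {row + 1}")
        lines.append(f"TAG POS={first_ip_pos + offset} TYPE=TD ATTR=* EXTRACT=TXT")
        lines.append(f"TAG POS={first_port_pos + offset} TYPE=TD ATTR=* EXTRACT=TXT")
        lines.append(f"SAVEAS TYPE=EXTRACT FOLDER=* FILE={out_file}")
    return "\n".join(lines) + "\n"


# Print a three-row macro to check the generated positions by eye.
print(build_macro(rows=3), end="")
```

The generated text would still need to be saved as a macro file and run in iMacros; if the page changes its column layout, the start positions and step would have to be adjusted.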
Kind regards,
iMacros Support
antoine
Posts: 6
Joined: Wed Oct 09, 2013 2:12 pm

Re: Scraping a Table that is not HTML

Post by antoine » Thu Dec 19, 2013 10:07 am

Thank you very much, Derek.

It works for the first row, but then I get this error message:

Code:

BadParameter: expected POS=<number> or POS=R<number>where <number> is a non-zero integer as parameter 1, line: 17 (Error code: -911)
Anyway, I want to thank you for your valuable help
Derek, Tech Support
Posts: 12
Joined: Thu Dec 12, 2013 3:27 pm

Re: Scraping a Table that is not HTML

Post by Derek, Tech Support » Thu Dec 19, 2013 7:26 pm

Hello!

Ah, I see what I did: I had a "+" in my ADD statements.

Give this a try:

Code:

SET !EXTRACT_TEST_POPUP NO
SET !WAITPAGECOMPLETE YES
VERSION BUILD=9052613
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://gatherproxy.com/proxylist/country/?c=Venezuela
'Gather table column data
'Data Row 1
SET !VAR0 "TD"
SET !VAR1 "3"
SET !VAR2 "4"
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 2
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 3
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 4
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 5
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 6
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 7
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 8
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 9
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 10
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 11
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 12
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 13
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 14
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 15
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 16
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 17
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 18
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 19
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 20
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 21
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 22
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 23
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 24
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
ADD !VAR1 8
ADD !VAR2 8
'Data Row 25
TAG POS={{!VAR1}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
TAG POS={{!VAR2}} TYPE={{!VAR0}} ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=venezuela_proxy_list.csv
Kind regards,
iMacros Support
Derek, Tech Support
Posts: 12
Joined: Thu Dec 12, 2013 3:27 pm

Re: Scraping a Table that is not HTML

Post by Derek, Tech Support » Fri Dec 20, 2013 2:36 pm

Hello!

antoine, what version of iMacros are you running, and in which browser? We may have found something that explains what you're seeing.

Thanks!
Kind regards,
iMacros Support
dragreat
Posts: 2
Joined: Sat Dec 21, 2013 3:26 pm

Re: Scraping a Table that is not HTML

Post by dragreat » Thu Dec 26, 2013 3:43 pm

Hi Derek,

I may be having a similar problem and really need your help.

Basically, I have a URL like this:

http://mytour(DOT)vn/141-khach-san-moevenpick-sai-gon-movenpick-sai-gon.html

I need to extract the price (e.g. "1,848,000"), but it's nowhere to be found in the source code. I tried the extraction wizard; it got everything in the table except the prices!

If I open the link in Firefox and save it as complete HTML, I can find the prices in the source code. However, when I used SAVEAS (TYPE=CPL) and called the macro from PHP, the complete page may have been saved, but the prices were not there!

I am really confused, can you kindly help me out? Thank you very much!