Extracting between two comment tags

Discussions and Tech Support related to website data extraction, screen scraping and data mining using iMacros.


Post by Sasha » Sun Dec 04, 2005 5:31 pm

Hey everyone, I'm having a little trouble extracting a few articles from a website. I want to extract the HTML from a certain location, including the <p> tags and so forth. The code for the article section looks like this:


<!-- google_ad_section_start -->

<p>Article here, it uses paragraph text</p> (this repeats multiple times, and the content is different for each article)

<!-- google_ad_section_end -->

So in the end I would like to extract everything between the two google_ad_section comments, including the HTML, but I think that because there are line breaks it has trouble reading it or picking it up.

I tried

<!-- google_ad_section_start -->*<!-- google_ad_section_end -->

and a few other variations, but it doesn't seem to work. Any assistance would be greatly appreciated.

Post by Tech Support » Mon Dec 05, 2005 4:02 pm

In the EXTRACT command, you can use TYPE=HTM. This will preserve all HTML formatting. Does this solve this problem?

No not really

Post by Sasha » Mon Dec 05, 2005 11:14 pm

No, it really does not. I tried that many times, but it won't extract it because there are line breaks in between, so it never reads it. I think the program only matches when the HTML is continuous on a single line, not when it spans multiple lines.
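
To illustrate the line-break behaviour outside of iMacros, here is a minimal Python sketch. It is not an iMacros feature or the way EXTRACT works internally; the function name `extract_between` and the sample HTML are made up for illustration. It only shows that a wildcard which stops at newlines finds nothing between the two comments, while one that is allowed to cross newlines does.

```python
import re

# The two comment markers from the page source.
START = "<!-- google_ad_section_start -->"
END = "<!-- google_ad_section_end -->"


def extract_between(html, start=START, end=END):
    """Return each block of HTML found between the two markers (hypothetical helper)."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    # re.DOTALL lets "." match newlines, so blocks spanning several lines are found.
    return [block.strip() for block in re.findall(pattern, html, flags=re.DOTALL)]


# Made-up sample that mimics the layout described in the post.
sample = """<div>
<!-- google_ad_section_start -->

<p>First paragraph</p>
<p>Second paragraph</p>

<!-- google_ad_section_end -->
</div>"""

# Same pattern without DOTALL: "." stops at line breaks, so nothing matches.
single_line = re.findall(re.escape(START) + r"(.*?)" + re.escape(END), sample)

print(single_line)
print(extract_between(sample))
```

The non-greedy `(.*?)` keeps each match from running past the first end marker, which matters if a page contains more than one such section.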

Post by Tech Support » Tue Dec 06, 2005 3:07 pm

Can you please post a link to the complete web page? Or email the HTML code of the page to support2 AT iopus.com.