Save extracted DIV to new page/file

mattiewae2
Posts: 1
Joined: Sun Dec 30, 2012 2:32 pm

Save extracted DIV to new page/file

Post by mattiewae2 » Sun Dec 30, 2012 2:43 pm

Hi!

http://www.vitisvitae.be/

From this page I would like to extract the div element content.

Code:

VERSION BUILD=7601105 RECORDER=FX
VERSION BUILD=6600525     
TAB T=1        
SET !EXTRACT_TEST_POPUP NO
TAG POS=1 TYPE=DIV ATTR=ID:content EXTRACT=HTM
SAVEAS TYPE=EXTRACT FOLDER=* FILE=+{{!URLCURRENT}}.htm

With this version I only get text in a CSV file, but I need the HTML of the extracted div.

Code:

VERSION BUILD=7601105 RECORDER=FX
TAB T=1
TAG POS=1 TYPE=DIV ATTR=ID:content
SAVEAS TYPE=HTM FOLDER=* FILE=+{{!URLCURRENT}}

With this version I can select the div element I want to extract, but it saves the whole page.
I only want to save the div element.

It would be really nice if someone could point me in the right direction.

Thanks!
Lantus
Posts: 4
Joined: Tue Jan 19, 2016 5:40 pm

Re: Save extracted DIV to new page/file

Post by Lantus » Mon Dec 17, 2018 11:47 pm

There is a way to do it with a JavaScript macro (this one relies on Firefox's Components API, so it only works in iMacros for Firefox)...

Something like:

Code:

// Write `string` to the file at `path` as UTF-8, appending if the file exists.
// Based on http://stackoverflow.com/questions/14677247/imacro-setting-variable-saveas-csv
function writeFile(path, string, exact) { // <versao>1.1</versao>
    // Import FileUtils.jsm
    Components.utils.import("resource://gre/modules/FileUtils.jsm");

    // The constructor already binds the file object to the path,
    // so a separate initWithPath() call is not needed
    var file = new FileUtils.File(path);

    // Create the file if it does not exist yet
    // (parseInt avoids the legacy octal literal 0666)
    if (!file.exists()) {
        file.create(file.NORMAL_FILE_TYPE, parseInt("0666", 8));
    }

    var charset = 'UTF-8';
    var fileStream = Components.classes['@mozilla.org/network/file-output-stream;1']
        .createInstance(Components.interfaces.nsIFileOutputStream);
    // 0x02 | 0x10 (= 18) opens the file write-only in append mode
    fileStream.init(file, 0x02 | 0x10, 0x200, 0);
    var converterStream = Components
        .classes['@mozilla.org/intl/converter-output-stream;1']
        .createInstance(Components.interfaces.nsIConverterOutputStream);
    converterStream.init(fileStream, charset, string.length,
        Components.interfaces.nsIConverterInputStream.DEFAULT_REPLACEMENT_CHARACTER);

    // Unless `exact` is set, start the appended text on a new line
    if (!exact) string = "\r\n" + string;
    converterStream.writeString(string);
    converterStream.close();
    fileStream.close();
}

var string = window.content.document.getElementById("content").innerHTML; // or outerHTML
var path = 'C:\\content.html'; // the backslash must be escaped in a JavaScript string
var exact = true;
writeFile(path, string, exact);
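
The original question built the file name from {{!URLCURRENT}}, whereas the path above is hardcoded. If one file per page is wanted, the name could be derived from the page URL instead. The following is only a sketch: urlToFileName is a hypothetical helper, not part of iMacros, and the choice of which characters to replace is an assumption.

```javascript
// Hypothetical helper (not part of iMacros): turn a URL into a string that
// is safe to use as a file name on common file systems.
function urlToFileName(url, extension) {
    var name = String(url)
        .replace(/^[a-z][a-z0-9+.-]*:\/\//i, '')  // drop the scheme, e.g. "http://"
        .replace(/[^A-Za-z0-9._-]+/g, '_')        // replace runs of unsafe characters with "_"
        .replace(/^_+|_+$/g, '');                 // trim leading and trailing "_"
    // Fall back to a fixed name if nothing usable is left
    return (name || 'page') + extension;
}
```

In the macro this could replace the fixed path, for example var path = 'C:\\' + urlToFileName(window.content.location.href, '.html'); assuming the page URL is read the same way the macro already reads window.content.document.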

The result won't be a complete HTML document, because the opening and closing html, head and body tags are missing. But you can add them to the string before calling writeFile:

Code:

string = '<html><head></head><body>' + string + '</body></html>';
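
The same wrapping can be put into a small function that also declares the encoding, since writeFile stores the text as UTF-8. This is a sketch only: wrapInHtmlDocument and escapeText are hypothetical names, and the title parameter is an addition not present in the original post.

```javascript
// Hypothetical helper: escape text so it can be placed inside <title>.
function escapeText(text) {
    return String(text).replace(/&/g, '&amp;').replace(/</g, '&lt;');
}

// Hypothetical helper: wrap an HTML fragment in a minimal, complete document.
// The <meta charset> matches the UTF-8 encoding used when writing the file.
function wrapInHtmlDocument(fragment, title) {
    return '<!DOCTYPE html>\n' +
        '<html>\n<head>\n' +
        '<meta charset="UTF-8">\n' +
        '<title>' + escapeText(title || '') + '</title>\n' +
        '</head>\n<body>\n' +
        fragment + '\n' +
        '</body>\n</html>\n';
}
```

It would be called on the extracted markup before writing, for example string = wrapInHtmlDocument(string, 'content');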

The question is very old and has probably been solved, but I'm leaving this here for people with the same issue.