How to retrieve the name of the downloaded PDF document?

RichardO8
Posts: 2
Joined: Thu Mar 22, 2018 4:19 pm

Post by RichardO8 » Thu Mar 22, 2018 4:38 pm

Hi,

I am using iMacros version 10 and want to retrieve the name of a PDF document that is stored behind a short link: e.g. http://shortlinks.de/40tr points to https://www.upc.at/pdf/anleitungen/fibe ... n-0517.pdf

The macro in Excel writes the short link to the variable {{linkname}}. Besides downloading the PDF file, I also want to hand the name of the actual file (b2c-installationsanleitung-wlan-0517.pdf) back to Excel.

Here is my macro that performs the download, but I did not succeed in writing the name of the PDF file to a variable:

VERSION BUILD=10002738
TAB T=1
TAB CLOSEALLOTHERS
Set !errorignore yes
ONDOWNLOAD FOLDER=C:\CHECK_Links FILE=* WAIT=YES
URL GOTO={{linkname}}
TAB T=1

I hope there is some simple solution or trick for this task.

Thank you very much for your support!

Richard
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: How to retrieve the name of the downloaded PDF document?

Post by chivracq » Thu Mar 22, 2018 5:27 pm

FCIM...! :mrgreen: (Read my Sig...)
=> iMB v10.0, OS=Win...?

Hum, trickier indeed than I first thought, I don't see any "easy" Solution, the Redirect and Download Mechanism seems to be handled by this 'shortlinks.de' Site at the Server Level and "nothing" is "visible" in the Source of the Page opened in a 2nd Tab to serve the '.PDF' to the Browser.
The Name of the File populates the 'File name' Field in the 'Save as' Popup but iMacros won't be able to retrieve that Info.

1- The "somewhat cumbersome" Method I would use, would be to actually let the Download take place, meaning you really download the '.PDF' (to the Folder that you already specify in 'ONDOWNLOAD'), and then using the exact Method I explained in the following Thread/Post ('Part_2' is relevant for you...) with "URL GOTO=file:///C:/CHECK_Links/" (in a/the 2nd Tab) to "extract" the '.PDF' Name from the 'dir' Listing:
- Re: Select file from folder which is not already in CSV file
It is a little bit cumbersome but not very complicated to implement... The User in that Thread never followed up, so I don't know if and how they managed to solve their Scenario, and I never posted "my Script" using the Method I had described, but I remember it was pretty straightforward when I tested my 'Proof of Concept'...

2- But hum, the Method I describe is "to do the job" from iMacros, but if you are originally launching your iMacros Macro from Excel already using the Scripting Interface, then you can probably achieve the same Functionality from '.vbs' as well I would think, to check the 'cmd' 'dir' Listing of that Folder and to retrieve the Name of the last File (sorted on Date) listed in that Directory.
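
A minimal sketch of the Idea behind Sol_2 (pick the most recently modified File in the Download Folder once the Macro has finished). It is written in Python rather than the '.vbs'/VBA mentioned above, so treat it as an Illustration of the Logic, not as the Script used in this Thread. The Function Name `newest_file` and the `*.pdf` Filter are Assumptions; only the Folder `C:\CHECK_Links` comes from the Macro above.

```python
from pathlib import Path
from typing import Optional


def newest_file(folder: str, pattern: str = "*.pdf") -> Optional[str]:
    """Return the name of the most recently modified file in `folder`
    that matches `pattern`, or None if nothing matches."""
    # Keep regular files only; sub-folders are ignored.
    candidates = [p for p in Path(folder).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # The file with the largest modification time is the latest download.
    return max(candidates, key=lambda p: p.stat().st_mtime).name


if __name__ == "__main__":
    # Folder taken from the ONDOWNLOAD line of the macro in this thread.
    print(newest_file(r"C:\CHECK_Links"))
```

This only gives the right Answer if nothing else writes to that Folder while the Macro runs, and it relies on 'WAIT=YES' in 'ONDOWNLOAD' so that the File is complete before the Folder is checked.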
Last edited by chivracq on Thu Mar 22, 2018 6:05 pm, edited 1 time in total.
- (F)CI(M) = (Full) Config Info (Missing): iMacros + Browser + OS (+ all 3 Versions + 'Free'/'PE'/'Trial').
- FCI not mentioned: I don't even read the Qt...! (or only to catch Spam!)
- Script & URL help a lot for more "educated" Help...
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: How to retrieve the name of the downloaded PDF document?

Post by chivracq » Thu Mar 22, 2018 6:00 pm

A few more "Ideas", hum..., that means I quite like your Case then, ah-ah...! 8)

3- Have a look at the following Thread as well for a "Mix" of both Methods, where you would first create a 'cmd dir' Listing from a '.BAT' File or from your '.vbs' Script / Excel Macro, and then use iMacros to extract its Content and the specific Data you want to isolate. But hum..., I find my (1st) Method easier and more straightforward... But it's interesting Reading anyway, ah-ah...!:
- Re: Choosing uploaded file by date (=> "The 2nd Solution".)

4- And another (Creative!) Solution I thought of (and tested), is if you were using FF, all the Info you are looking for is accessible from 'about:downloads':

VERSION BUILD=8820413 RECORDER=FX
TAB T=1
URL GOTO=about:downloads

SET !LOOP 31
TAG POS={{!LOOP}} TYPE=* ATTR=* EXTRACT=HTM
PROMPT {{!EXTRACT}}
=> This extracts the following Data:

<richlistitem xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul" class="download download-state" active="true" displayName="b2c-installationsanleitung-wlan-0517.pdf" extendedDisplayName="b2c-installationsanleitung-wlan-0517.pdf — upc.at" extendedDisplayNameTip="b2c-installationsanleitung-wlan-0517.pdf — www.upc.at" image="moz-icon://D:\TEMP\Temp\b2c-installationsanleitung-wlan-0517.pdf?size=32" state="1" status="317 KB — upc.at — 17:44" progressmode="normal" progress="100" orient="horizontal" align="center" current="true" selected="true" style="outline: 1px solid blue;"/>
(Tested on iMacros for FF v8.8.2, Pale Moon v26.3.3 (=FF47), Win10_x64.)
Quick PoC, possible to lower the 'POS_Nb' of course, to "1" probably...
But I don't think iMB (v10 or any Version) supports any similar Functionality like the FF 'about:downloads', tja...!
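
If the extracted String is handed over to a Script, the File Name still has to be isolated from the 'displayName' Attribute. A minimal sketch in Python (not iMacros, and not tested against a live Browser): the Function Name `download_name` is an Assumption, and the Pattern relies only on the Attribute Layout visible in the extracted Data above.

```python
import re
from typing import Optional

# Captures the value of displayName="..."; the leading \b stops the pattern
# from matching a longer attribute name that merely ends in "displayName".
_DISPLAY_NAME = re.compile(r'\bdisplayName="([^"]*)"')


def download_name(extracted_htm: str) -> Optional[str]:
    """Return the displayName value from an extracted <richlistitem>,
    or None if the attribute is not present."""
    match = _DISPLAY_NAME.search(extracted_htm)
    return match.group(1) if match else None
```

For the Data shown above this returns 'b2c-installationsanleitung-wlan-0517.pdf'. The 'extendedDisplayName' Attribute is not picked up because the Match is case-sensitive.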
RichardO8
Posts: 2
Joined: Thu Mar 22, 2018 4:19 pm

Re: How to retrieve the name of the downloaded PDF document?

Post by RichardO8 » Thu Mar 22, 2018 10:16 pm

Dear chivracq,

Thank you very much for your excellent thoughts and very helpful support!

I have chosen your proposed solution 2, retrieving the filename with the Excel macro. As I had been concentrating on solving this task with iMacros, I did not think of using Excel VBA for it. So your hint was really eye-opening for me. :D

Best regards,

Richard
chivracq
Posts: 10301
Joined: Sat Apr 13, 2013 1:07 pm
Location: Amsterdam (NL)

Re: How to retrieve the name of the downloaded PDF document?

Post by chivracq » Thu Mar 22, 2018 10:57 pm

OK..., Thanks for the Follow-up and Feedback... :D

But..., yep, I guess Sol_2 might be the "logical" Solution for you if you are (more) "fluent" with 'vbs'/'vba'. I would have gone myself for Sol_1>Sol_4>Sol_3>Sol_2, in this Order, as I'm more fluent with iMacros ('.iim'), ah-ah...!

:arrow: Still useful for the Forum would be if you could share your '.vba' Script on how you implemented that Functionality from Excel, as we don't have many '.vba' Examples on the Forum actually... :idea:

>>>

And hum..., I thought of a 5th Sol in the meantime, the most obvious Solution of all actually, oops...!, by simply using the Built-in Var '!DOWNLOADED_FILE_NAME', which provides exactly the Functionality that you want, ah-ah...! 8)
I didn't think of it directly because I've never used iMB myself (I only use the Free FF Add-on). This Var is only supported on iMB, and only from iMB v10.3, so it is probably still not an Option for you: you mentioned that you are using iMB v10.0, and it was not yet implemented in that Version...
But iMB v10.3 was the "stable" iMB v10.x Version anyway..., so if you still have a way to update iMB v10.0 to v10.3, that might be an Idea for you... :idea: