Data extracting from PDF

by digitalnative on Wed Feb 10, 2016 1:05 pm

Looking for help creating a script/workflow for extracting data from a PDF and separating the contents based on unique identifiers used as delimiters.

If anyone has experience with this sort of exercise, please get in touch with me. Need help ASAP.

Re: Data extracting from PDF

by IrishMacro on Thu Feb 11, 2016 5:22 am

ASAP eh?

I got curious, so I researched it a bit and... it cannot be done.
iMacros doesn't have the technology to look at a PDF and extract text from it.
Nor can you do it with Selenium IDE.

You will have to look at a workflow that downloads the PDFs, converts them to text, and then does the scraping with some other tool this time :)
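
For what it's worth, here is a rough sketch of the "convert to text, then split" part in Python. It is only one possible approach, not something iMacros provides. It assumes the pdftotext command-line tool (from Poppler) is installed and on the PATH, and the function names and the "ID-123" style pattern are made up for illustration; swap in whatever identifier actually marks a record boundary in your PDFs.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Convert a PDF to plain text with the pdftotext CLI.

    Assumption: pdftotext (Poppler) is installed and on the PATH.
    The trailing "-" tells pdftotext to write the text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def split_by_identifier(text, id_pattern=r"^ID-\d+"):
    """Split text into sections keyed by the identifier that starts each one.

    id_pattern is a hypothetical example (lines starting with "ID-<digits>").
    Anything before the first identifier is dropped, and a repeated
    identifier overwrites the earlier section with the same key.
    """
    marker = re.compile(id_pattern, re.MULTILINE)
    matches = list(marker.finditer(text))
    sections = {}
    for i, match in enumerate(matches):
        # A section runs from the end of this identifier to the next one.
        if i + 1 < len(matches):
            end = matches[i + 1].start()
        else:
            end = len(text)
        sections[match.group(0)] = text[match.end():end].strip()
    return sections
```

You would call something like split_by_identifier(pdf_to_text("report.pdf")), where "report.pdf" is a placeholder file name. Note that this only works for PDFs that contain real text; scanned pages would need OCR first.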
Firefox free plugin, last version
Win7

