I've been working on a project for a couple of years (it's a continual thing, so it's never-ending).
I've progressed to the point of using mouse/keyboard macros to scrape a list of text/links from a set of variable-length pages and paste it into Excel. I then run an Excel macro to manipulate that data, return to the webpage, close it & repeat on the next one (I have some error-checking in place in case of a failure).
I do this every 6 months or so.
I am scraping about *75,000 cemetery index pages on http://www.billiongraves.com, copying the names/dates/links of the people interred there, then sorting, filtering and eventually editing errors & merging duplicate records.
*FYI - there are about 600,000 cemetery pages, but I do some data preparation first, extracting only the 75,000 pages with data on them.
Recently, because of some minor site changes and Firefox add-on customizations, the macros that I painstakingly created over time (to pixel-perfect page coordinates, with JitBit Macro Recorder) need to be shifted & changed, which will take me a week or more to do. It's painful...
I'm thinking of iMacros as an alternative (or as an additional part of the process), as I would LIKE TO do the following, but am not sure it's capable of this.
A typical page has a particular <div> section, identified by its id attribute, which shows the data I want - it would be ideal if I could select JUST that <div> section and copy all of its contents at once to the clipboard, which I can then paste into Excel.
*right now, my macro is scrolling & selecting specifically-positioned lines depending on the length of the list...
So I'm looking for this very basic need first, as I can build up more functionality around it later as I learn more about iMacros.
EXAMPLE PAGE: https://billiongraves.com/site-map?ceme ... 295&page=0 - in the Page Source is the section:
<div id="content">
<h1 style="margin: 10px 0 25px 10px;">BillionGraves Site Map</h1>
<div class="card">
<h1 style="float:left; margin: 10px 0 10px 10px;">Burial records in <a href='/cemetery/Bethesda-Cemetery/100295' >Bethesda Cemetery</a></h1>
<br class="clearfloat" />
<div style="border-bottom:#CCC thin solid; width:916px;"> </div>
<div class="center">
<!-- HERE IS THE DIV ID SECTION 'MULTIPLE' WHICH CONTAINS THE DATA I WANT TO COPY -->
<div id="multiple">
<div class='backlinks'><a href='/site-map'>Sitemap</a> > <a href='/site-map?country=United+States'>United States</a> > <a href='/site-map?country=United+States&state=Tennessee'>Tennessee</a> > <a href='/cemetery/Bethesda-Cemetery/100295'>Bethesda Cemetery</a></div><div><div class='record'><a href='/grave/William-R-Brooks/31780628' alt='Brooks, William R. (1833 - 1864)' title='Brooks, William R. (1833 - 1864)'>Brooks, William R. (1833 - 1864)</a></div><div class='record'><a href='/grave/Nathan-Andrew-Jackson/31709567' alt='Jackson, Nathan Andrew (1838 - 1864)' title='Jackson, Nathan Andrew (1838 - 1864)'>Jackson, Nathan Andrew (1838 - 1864)</a></div><div class='record'><a href='/grave/Josiah-S-Price/31694361' alt='Price, Josiah S (1838 - 1862)' title='Price, Josiah S (1838 - 1862)'>Price, Josiah S (1838 - 1862)</a></div><div class='record'><a href='/grave/Charles-J-Shropshire/31780629' alt='Shropshire, Charles J. (1841 - 1863)' title='Shropshire, Charles J. (1841 - 1863)'>Shropshire, Charles J. (1841 - 1863)</a></div><div class='record'><a href='/grave/William-A-Wingard/31709460' alt='Wingard, William A. (1839 - 1864)' title='Wingard, William A. (1839 - 1864)'>Wingard, William A. (1839 - 1864)</a></div></div><br/><br/>Pages: <span>1</span> </div>
</div>
</div>
Is iMacros able to select a specific page section (edit: I have found that it can) and copy the hyperlinks it contains to the clipboard or to a file? I see that it can extract content to a CSV file, for example (edit: in the paid version, not the freeware one). This could work for me if it creates a 2-column file with the TEXT and also the LINK (really NEED both!), as I could later combine the CSVs and import them into Excel in bulk.
*If the functionality is there to quickly/easily copy a defined <div> section, I'm happy to pay the $99 for the basic version to allow me to SAVEAS a file...
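To make the target output concrete, here is a minimal Python sketch (not iMacros) of the 2-column TEXT/LINK extraction, run against a saved copy of the page source. It assumes every record is an <a> inside a <div class='record'> within <div id="multiple">, as in the example above. The names RecordParser, extract_records and rows_to_csv are made up for illustration, and the relative links are made absolute against https://billiongraves.com, which is also an assumption.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: relative hrefs on the page resolve against this host.
BASE_URL = "https://billiongraves.com"


class RecordParser(HTMLParser):
    """Collect (text, link) pairs for <a> tags inside
    <div class='record'> elements nested in <div id='multiple'>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._div_stack = []  # one marker per open <div>: "multiple", "record" or ""
        self._href = None     # href of the <a> currently being read, if any
        self._text = []       # text chunks of that <a>

    def _in_record(self):
        return (
            "multiple" in self._div_stack
            and bool(self._div_stack)
            and self._div_stack[-1] == "record"
        )

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if attrs.get("id") == "multiple":
                self._div_stack.append("multiple")
            elif "record" in (attrs.get("class") or "").split():
                self._div_stack.append("record")
            else:
                self._div_stack.append("")
        elif tag == "a" and self._in_record():
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._div_stack:
            self._div_stack.pop()
        elif tag == "a" and self._href is not None:
            text = "".join(self._text).strip()
            self.rows.append((text, urljoin(BASE_URL, self._href)))
            self._href = None


def extract_records(html):
    """Return a list of (text, absolute_link) tuples found in the page source."""
    parser = RecordParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Render the rows as 2-column CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["text", "link"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical file name: a page saved from the browser via "Save Page As".
    with open("page.html", encoding="utf-8") as fh:
        print(rows_to_csv(extract_records(fh.read())), end="")
```

Because the names contain commas, the csv module quotes the TEXT column, so Excel should still open the result as two columns. The "Sitemap > United States > ..." backlinks are skipped because they are not inside a 'record' div.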
Any help or direction appreciated.
I just installed the (free) Firefox add-on "iMacros for Firefox" - v. 10.1.0.1485, on Windows 10 Pro 64-bit (v.19043.1706) with Firefox v100 (64-bit).