Extract Number of Google Search Results

Post by Tech Support » Fri Jan 09, 2009 2:16 pm

How to extract the number of search results in Google.

Here is the solution:

Method A: Use the comma (the thousands separator) to identify the number we need. This macro fails if there are fewer than 1,000 results, because the number then contains no comma.

Code:

URL GOTO=http://www.google.com/
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:f ATTR=NAME:q CONTENT=solar<SP>cells
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:f ATTR=NAME:btnG
TAG POS=1 TYPE=B ATTR=TXT:*,* EXTRACT=TXT

Method B (recommended): Use relative extraction with the word "Results" as anchor:
[Figure: How to extract information from Google (google information retrieval.png)]

Code:

URL GOTO=http://www.google.com/
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:f ATTR=NAME:q CONTENT=solar<SP>cells
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:f ATTR=NAME:btnG
TAG POS=1 TYPE=P ATTR=TXT:*Results*
TAG POS=R3 TYPE=B ATTR=TXT:* EXTRACT=TXT
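
To run Method B unattended, it can be combined with the commands for suppressing the extraction dialog and saving the result. The following is only a sketch: the TAG lines are those of Method B unchanged, and the file name google_results.csv is an assumption chosen for illustration.

Code:

' Sketch: Method B plus saving the result (the file name is an assumed example)
' Do not show the extraction dialog during replay
SET !EXTRACT_TEST_POPUP NO
URL GOTO=http://www.google.com/
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:f ATTR=NAME:q CONTENT=solar<SP>cells
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:f ATTR=NAME:btnG
' Set the anchor on the paragraph containing "Results" (no EXTRACT here)
TAG POS=1 TYPE=P ATTR=TXT:*Results*
' Extract the third bold element counted relative to the anchor
TAG POS=R3 TYPE=B ATTR=TXT:* EXTRACT=TXT
' Save the extracted value to a CSV file in the default folder
SAVEAS TYPE=EXTRACT FOLDER=* FILE=google_results.csv

Because the anchor line has no EXTRACT parameter, only the number itself should end up in the saved file.
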
Further Reading:

(1) Web Scraping (General information)

(2) Extract with relative Positioning

Note: In this case we do not recommend simply using a fixed POS value, as suggested by the extraction wizard:

Code:

TAG POS=5 TYPE=B ATTR=TXT:* EXTRACT=TXT

This works, but as soon as there is one more bold word before our number, the TAG...EXTRACT command will pick up the wrong content.