Scrape phone numbers from contact pages using CEO as keyword

MarkWaynes
Posts: 7
Joined: Fri Nov 04, 2016 12:36 pm

Scrape phone numbers from contact pages using CEO as keyword

Post by MarkWaynes » Fri Nov 04, 2016 12:59 pm

What am I doing? I'm scraping CEO phone numbers from company websites.

What do I have?
This is my code. Currently, when the word CEO appears on a page, it extracts the next 2 elements.
What I want instead: when the text CEO appears on a page, extract the next element that contains numbers.

Code:

' Do not wait for elements, ignore errors, and suppress the extraction popup
SET !TIMEOUT_STEP 0
SET !ERRORIGNORE YES
SET !EXTRACT_TEST_POPUP NO
SET !DATASOURCE tj.csv

' Find the first element containing "CEO", then extract the next two elements
TAG POS=1 TYPE=* ATTR=TXT:*CEO* EXTRACT=TXT
TAG POS=R1 TYPE=* ATTR=* EXTRACT=TXT
TAG POS=R1 TYPE=* ATTR=* EXTRACT=TXT
SET ceo {{!EXTRACT}}
SET !EXTRACT NULL

ADD !EXTRACT {{ceo}}

SAVEAS TYPE=EXTRACT FOLDER=D:\imacro\ FILE=ceo.csv

Required answers:

1. What version of iMacros are you using?
VERSION BUILD=8871104 RECORDER=FX  (Newest PaleMoon version)

2. What operating system are you using? (please also specify language)
Windows 8.1

3. Which browser(s) are you using? (include version numbers)
Palemoon Version: 26.5.0 (x64)

4. Do the included demo macros work ok? 
Yes

5. If reporting a problem with the Scripting Interface, please also test if the included VBS sample scripts run ok.
N/A

6. If recording or replay fails on a specific website: Can you please post the URL of the web page and/or the imacro that creates the problem? If you can not post the imacro or login data in the public user forum, please email
N/A

7. Do you encounter the same problem with the iMacros Browser, iMacros for Internet Explorer and iMacros for Firefox? Note: If your question is specifically about iMacros for Firefox or iMacros for Chrome, please use their sub-forums. 
N/A
Help appreciated.
Br, Mark
PS. Is this message in the correct format for asking a question?
iimfun
Posts: 239
Joined: Tue Jul 19, 2016 1:06 pm

Re: Scrape phone numbers from contact pages using CEO as key

Post by iimfun » Tue Nov 08, 2016 8:36 am

What does the pattern of CEO phone numbers look like?
MarkWaynes

Re: Scrape phone numbers from contact pages using CEO as key

Post by MarkWaynes » Thu Nov 10, 2016 7:39 am

Hello :)

These are the most common number formats.
+358 44 425 1256
045 425 1256
06 3476700

Help much appreciated
Br. Mark
iimfun

Re: Scrape phone numbers from contact pages using CEO as key

Post by iimfun » Fri Nov 11, 2016 1:47 pm

Hello,

Since the number format is not fixed, as a last resort you can try the following line, which extracts the next element whose text contains a space:

Code:

TAG POS=R1 TYPE=* ATTR=TXT:*<SP>* EXTRACT=TXT
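
The three number formats listed earlier in the thread could also be filtered after extraction, outside iMacros. The following is only a rough sketch in Python, not something suggested by either poster: the regular expression, the `first_phone` helper, and the sample strings (including the made-up name) are assumptions for illustration. It treats a phone number as an optional leading "+" followed by 7 to 15 digits separated by at most single spaces.

```python
import re

# Assumed pattern (not from the thread): an optional leading "+", then
# 7 to 15 digits, each optionally preceded by a single space.
PHONE_RE = re.compile(r"(?<![\d+])\+?\d(?: ?\d){6,14}(?!\d)")


def first_phone(text):
    """Return the first phone-like substring in text, or None if there is none."""
    match = PHONE_RE.search(text)
    return match.group(0) if match else None


# Hypothetical extracted texts; only the numbers come from the thread.
samples = [
    "CEO John Doe +358 44 425 1256",
    "CEO: 045 425 1256",
    "Managing director (CEO), tel. 06 3476700",
    "CEO since 2016",
]
for sample in samples:
    print(first_phone(sample))
```

The last sample shows why a minimum digit count helps: a bare year near the word CEO is too short to be mistaken for a phone number. Formats with dashes, parentheses, or dots would need a wider pattern.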