Scrape elements which are generated by php code

by wootshuska on Fri Jan 06, 2017 5:02 am

Hi,
I want to scrape content from a webpage that is not visible in the page source. It's probably generated by a PHP script, so I can't find it by looking at the source code. Of course, I can click "Inspect Element" and see the HTML of the elements I want to extract. I can also click on them.

The recorded EVENT clicks are very similar:
Code:
EVENT TYPE=CLICK SELECTOR="HTML>BODY>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV>DIV>DIV:nth-of-type(5)>DIV:nth-of-type(2)>DIV>DIV>DIV:nth-of-type(3)>DIV:nth-of-type(1)>DIV>TABLE>TBODY>TR>TD:nth-of-type(2)>DIV:nth-of-type(3)>DIV:nth-of-type(2)" BUTTON=0
EVENT TYPE=CLICK SELECTOR="HTML>BODY>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV>DIV>DIV:nth-of-type(5)>DIV:nth-of-type(2)>DIV>DIV>DIV:nth-of-type(3)>DIV:nth-of-type(2)>DIV>TABLE>TBODY>TR>TD:nth-of-type(2)>DIV:nth-of-type(3)>DIV:nth-of-type(2)" BUTTON=0
EVENT TYPE=CLICK SELECTOR="HTML>BODY>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV>DIV>DIV:nth-of-type(5)>DIV:nth-of-type(2)>DIV>DIV>DIV:nth-of-type(3)>DIV:nth-of-type(3)>DIV>TABLE>TBODY>TR>TD:nth-of-type(2)>DIV:nth-of-type(3)>DIV:nth-of-type(2)" BUTTON=0
EVENT TYPE=CLICK SELECTOR="HTML>BODY>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV>DIV>DIV:nth-of-type(5)>DIV:nth-of-type(2)>DIV>DIV>DIV:nth-of-type(3)>DIV:nth-of-type(4)>DIV>TABLE>TBODY>TR>TD:nth-of-type(2)>DIV:nth-of-type(3)>DIV:nth-of-type(2)" BUTTON=0


As you can see, only one number changes between them. My first question: is it possible to extract the href from these selectors?
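Because the four recorded selectors differ only in a single nth-of-type index, they could be generated from a template instead of being listed by hand. The following is only a minimal Python sketch of that idea: `PREFIX`, `SUFFIX` and `build_click_lines` are hypothetical names, the selector text is copied from the recorded lines, and it assumes the page keeps the same structure for every result.

```python
# Selector text copied from the recorded EVENT lines; only the nth-of-type
# index between PREFIX and SUFFIX changes from one result to the next.
PREFIX = (
    "HTML>BODY>DIV>DIV:nth-of-type(2)>DIV>DIV>DIV>DIV>DIV:nth-of-type(5)"
    ">DIV:nth-of-type(2)>DIV>DIV>DIV:nth-of-type(3)>DIV:nth-of-type("
)
SUFFIX = ")>DIV>TABLE>TBODY>TR>TD:nth-of-type(2)>DIV:nth-of-type(3)>DIV:nth-of-type(2)"


def build_click_lines(count):
    """Return one EVENT click line per result, for indices 1..count (hypothetical helper)."""
    return [
        f'EVENT TYPE=CLICK SELECTOR="{PREFIX}{i}{SUFFIX}" BUTTON=0'
        for i in range(1, count + 1)
    ]


# Print the macro lines for the first four results.
for line in build_click_lines(4):
    print(line)
```

The printed lines can be pasted into a macro; how many results a page really has is not known from the post, so `count` has to be chosen by hand.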

When I click "Inspect Element" and go to the element I want to extract, this is what I see:
Code:
<div class="gs-per-result-labels" url="http://this-is-what-i-want-to-extract.html"></div>


I tried scraping by XPath, but it doesn't seem to work (I think because iMacros can't find these elements in the source code). Do you have any tips for me?


MY OS:
OS: Windows 7 N Service Pack 1, 64-bit
Intel Core i7-4700MQ
16 GB RAM
GT 755M

Firefox 50.1.0
iMacros for Firefox 9.0.3
VERSION BUILD=9030808 RECORDER=FX
Re: Scrape elements which are generated by php code

by chivracq on Fri Jan 06, 2017 6:37 am

Hmm, it would be easier if you had provided the URL of your page so I could have a look..., or some HTML 'SAVEAS' of the page uploaded to your thread if it's behind a login and password (zipped, max 256 KB), but OK...

If the 'EVENT' Mode is able to "see"/tag your Elements, I would expect the 'TAG' Mode to "see" them as well...
I have a (somewhat cumbersome!) way to "emulate" an 'EXTRACT=TXT' using the 'EVENT' Mode, but I'm not even sure that would work for your URLs...

What you can try:
Code:
TAG POS=1 TYPE=DIV ATTR=CLASS:gs-per-result-labels EXTRACT=HREF
and/or:
Code:
TAG POS=1 TYPE=DIV ATTR=CLASS:gs-per-result-labels EXTRACT=HTM

You can play with 'POS=n', but do those two extract statements already return something, i.e. not "#EANF#"? Especially for 'EXTRACT=HTM', I would expect your URL to be part of the extracted data...
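The inspected DIV carries the address in a custom `url` attribute rather than in an href, so if 'EXTRACT=HTM' does return the element's HTML, the URL still has to be pulled out of that text afterwards. A minimal sketch of that post-processing step in Python, outside iMacros, assuming the extracted HTML looks like the snippet posted above; `UrlAttrCollector` and `extract_urls` are hypothetical names, not part of iMacros:

```python
from html.parser import HTMLParser


class UrlAttrCollector(HTMLParser):
    """Collect the `url` attribute of every DIV with class gs-per-result-labels."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <DIV URL=...> also matches.
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "div" and "gs-per-result-labels" in classes and attr_map.get("url"):
            self.urls.append(attr_map["url"])


def extract_urls(html_fragment):
    """Return all matching `url` attribute values found in the fragment."""
    parser = UrlAttrCollector()
    parser.feed(html_fragment)
    parser.close()
    return parser.urls


# Sample input copied from the "Inspect Element" snippet in the question.
sample = '<div class="gs-per-result-labels" url="http://this-is-what-i-want-to-extract.html"></div>'
print(extract_urls(sample))
```

The same function can be fed the whole saved extract at once, since it collects every matching DIV rather than only the first.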
- (F)CIM = (Full) Config Info Missing: iMacros + Browser + OS with all 3 Versions...
- I usually don't even read the Question if that (required) Info is not mentioned...
- Script & URL usually help a lot for a more "educated" Help...

