I have the following structure:
<div class="left">
  <h2>title</h2>
  <div class="cls">
    <a>link</a>
    some text
  </div>
  <div class="cls">
    <a>link</a>
    some text
  </div>
  <div class="cls">
    <a>link</a>
    some text
  </div>
  <div class="cls">
    <a>link</a>
    some text
  </div>
</div>
Unfortunately, <div class="left"> can appear in various places on the page, so I cannot use XPath to get to it or to any of the divs inside it.
There's also no telling what comes after it. Lastly, <div class="cls"> also occurs in various other places on the page.
So is there a way to extract the text of the divs inside <div class="left">?
Ideally, is there a way to say: extract the text from body > div.content > div.left > div:nth-child(2), or (3), or (30)?
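To make the goal concrete, here is a rough sketch in Python of the behaviour I am after. It uses only the standard library's xml.etree.ElementTree, so it assumes the page is well-formed XML (real pages often are not). The function name nth_child_text, the sample markup and the numbered texts are made up for illustration, and the class match is an exact match on the whole attribute value.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; the numbered texts are invented so the
# divs can be told apart. Note the stray div.cls outside div.left.
SAMPLE = """
<body>
<div class="content">
<div class="left">
<h2>title</h2>
<div class="cls"><a>link</a> some text 1</div>
<div class="cls"><a>link</a> some text 2</div>
</div>
<div class="cls"><a>link</a> elsewhere</div>
</div>
</body>
"""


def nth_child_text(root, n):
    """Mimic 'div.left > div:nth-child(n)' and return that div's text.

    n is 1-based and counts ALL children of div.left (the h2 is child 1),
    which is how CSS nth-child counts. Returns None if there is no such
    child or if that child is not a div.
    """
    # Relative search: finds div.left wherever it sits below root.
    left = root.find(".//div[@class='left']")
    if left is None:
        return None
    children = list(left)
    if n < 1 or n > len(children):
        return None
    child = children[n - 1]
    if child.tag != "div":
        return None
    # itertext() yields the text of the div and of everything inside it;
    # split/join collapses runs of whitespace and newlines.
    return " ".join("".join(child.itertext()).split())


root = ET.fromstring(SAMPLE)
print(nth_child_text(root, 2))   # link some text 1
print(nth_child_text(root, 3))   # link some text 2
print(nth_child_text(root, 30))  # None
```

The div.cls outside div.left is never returned, because the lookup always starts from div.left.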
The whole structure would be something like:
body
  content
    div.title
    div.intro text
    div.short abstract
    div.gallery
    div.left
    div.right
    div.links
    div.bio
    div.more text
  /content
/body