
andyandy's avatar

Get html code of page and parse out content of <div class="img-wrapper">

I want to load the HTML of https://www.example.com/ and get the content of <div class="img-wrapper">. I want the raw data, i.e. all the HTML tags and text inside that DIV exactly as they appear in the source.

I've tried DOMXPath, DOMDocument, etc. Each one is either too stupid to handle HTML5 or too clever and strips all the HTML tags inside that DIV.

andyandy's avatar

That gets the content of the entire page. How do I get the content of a single DIV?

sr57's avatar

You filter out what you want with preg_match.

andyandy's avatar

Really? What if the given DIV contains other DIVs?
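To make the nested-DIV concern concrete, here is a minimal sketch. The sample markup and the pattern are made up for illustration and are not taken from anything posted in this thread. It shows a non-greedy pattern stopping at the first closing tag it meets, so the captured content is cut off inside the nested DIV:

```php
<?php
// Hypothetical sample markup: the target DIV contains a nested DIV.
$html = '<div class="img-wrapper"><div class="inner">A</div><p>B</p></div>';

// A non-greedy match ends at the FIRST </div>, which belongs to the inner DIV.
preg_match('~<div class="img-wrapper">(.*?)</div>~s', $html, $m);

echo $m[1]; // prints: <div class="inner">A
```

A greedy `.*` has the opposite problem: it runs to the last `</div>` on the page and swallows any sibling DIVs that follow the wrapper.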

sr57's avatar

Of course, ... give your example ...

andyandy's avatar
andyandy
OP
Best Answer

I downloaded this library (the single file simple_html_dom.php is enough to make this work):

https://sourceforge.net/projects/simplehtmldom/files/

A simple example is here:

https://code.tutsplus.com/tutorials/html-parsing-and-screen-scraping-with-the-simple-html-dom-library--net-11856

This code gets all DIVs with class "wrap". You can then loop over them, for example to append the inner HTML of the first three to a variable:

        require 'public/simple_html_dom.php';

        // Load and parse the remote page.
        $html = new \simple_html_dom();
        $html->load_file('https://www.example.com/');

        // All DIVs whose class attribute is exactly "wrap".
        $items = $html->find('div[class=wrap]');

        // Concatenate the raw inner HTML of the first three matches.
        $result = '';
        foreach (array_slice($items, 0, 3) as $item) {
            $result .= $item->innertext;
        }

        return $result;
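For anyone who prefers not to add a library, the same thing can probably be done with PHP's built-in DOM extension. This is only a sketch under some assumptions: the helper name innerHtmlByClass and the sample markup are hypothetical, the class name is inserted into the XPath query unescaped (so it should be a trusted value), and real pages may need extra care with character encoding. The key point is that calling saveHTML() on each child node keeps the tags, whereas reading nodeValue or textContent strips them.

```php
<?php
// Hypothetical helper: inner HTML of the first DIV carrying the given class.
function innerHtmlByClass(string $source, string $class): ?string
{
    $doc = new DOMDocument();

    // libxml warns about HTML5 tags it does not know; collect those quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($source);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word inside a space-separated class list.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $node = $xpath->query($query)->item(0);
    if ($node === null) {
        return null; // no matching DIV
    }

    // Serialize each child node separately so the markup inside is preserved.
    $inner = '';
    foreach ($node->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }

    return $inner;
}

// Made-up sample input with a nested DIV inside the wrapper.
$sample = '<html><body><div class="img-wrapper">'
    . '<div class="inner"><img src="a.png" alt="x"></div><p>Caption</p>'
    . '</div></body></html>';

echo innerHtmlByClass($sample, 'img-wrapper');
```

Because the parser builds a tree, nested DIVs inside the wrapper come back intact, which is the case a regular expression struggles with.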
