
Gabotronix's avatar

Issue crawling simple html table with Goutte PHP

Hi everybody, I have to create an array from an HTML table of cities and populations. Currently I'm able to get the first cell's value (a span inside a list inside the td tag), but I also want to get the value of the population td.

The structure of the HTML is this:

https://i.imgur.com/lZ67xPj.png

and how it looks:

https://i.imgur.com/1cG3ST5.png

This is my PHP code. Currently I only know how to get the first cell of each row.

        $table = $crawler->filter('table')->filter('tr')->each(function ($tr, $i) {
            return $tr->filter('td')->each(function ($td, $i) {
                return $td->filter('li')->each(function ($li, $i) {
                    return $li->filter('span')->each(function ($span, $i) {
                        return $span->text();
                    });
                });
            });
        });

        $array = [];

        foreach($table as $index => $row)
        {
            if($index > 0)
            {
                $array[] = [ 'cityName' => $row[0][0][0], 'population' => 'HOW TO GET POPULATION FROM TD' ];
            }
        }

Is there a way to grab both the city name and the population in the same each loop with Goutte?

Or maybe loop through the city names first, grab the populations in a second loop, and combine them in a third?
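To make that combining step concrete, here is a minimal sketch in plain PHP. It assumes the two earlier passes have already produced two lists of the same length and in the same row order; the variable names `$cityNames` and `$populations` and the values in them are placeholders, not data from the real page.

```php
<?php
// Placeholder values; in practice these would come from two separate crawler passes.
$cityNames   = ['City A', 'City B'];
$populations = ['1000', '2000'];

// Passing two arrays to array_map walks them in parallel, pairing items by position.
$array = array_map(
    function ($cityName, $population) {
        return ['cityName' => $cityName, 'population' => $population];
    },
    $cityNames,
    $populations
);

print_r($array);
```

This only lines up correctly if every row contributes exactly one entry to each list, which is why a single pass per row is usually less fragile.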

Thanks in advance.

0 likes
1 reply
automica's avatar

@gabotronix is this issue related to https://laracasts.com/discuss/channels/general-discussion/issue-scraping-html-table-with-goutte ?

If you can post the full method, including the URL of the HTML that it's scraping, then I'll have a look to see what's up.

In your example above, if your population data is in the third column, you'd find it at index 2, since indexes start at 0:

$array[] = [ 'cityName' => $row[0][0][0], 'population' => $row[2] ];

The easiest way to work this out yourself is to put

dd($row);

inside your foreach, so you can see the whole structure. From that you should be able to tell which index you need.
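One caveat: with the nested each() calls from the question, every td is mapped through its li and span children, so a cell that contains plain text and no li would come back as an empty array. A rough sketch of reading both values in a single pass per row is below. It is untested against your page: the sample markup, the placeholder values, and the assumption that the population sits in the third td are all mine. It uses Symfony's DomCrawler `Crawler` class directly (the crawler Goutte returns is built on it) and assumes `symfony/dom-crawler` and `symfony/css-selector` are installed via Composer.

```php
<?php
// Assumes Composer autoloading with symfony/dom-crawler and symfony/css-selector installed.
require 'vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical markup standing in for the table in the screenshots.
$html = <<<HTML
<table>
  <tr><th>City</th><th>Region</th><th>Population</th></tr>
  <tr><td><ul><li><span>City A</span></li></ul></td><td>Region X</td><td>1000</td></tr>
  <tr><td><ul><li><span>City B</span></li></ul></td><td>Region Y</td><td>2000</td></tr>
</table>
HTML;

$crawler = new Crawler($html);

$rows = $crawler->filter('table tr')->each(function (Crawler $tr) {
    $cells = $tr->filter('td');

    // Header rows (th only) or short rows have fewer than three td cells: skip them.
    if ($cells->count() < 3) {
        return null;
    }

    // City name: first span in the first cell, falling back to the cell's own text.
    $span = $cells->eq(0)->filter('span');
    $cityName = $span->count() > 0 ? $span->first()->text() : $cells->eq(0)->text();

    return [
        'cityName'   => trim($cityName),
        'population' => trim($cells->eq(2)->text()), // assumed third column
    ];
});

// Drop the nulls returned for skipped rows and reindex from 0.
$array = array_values(array_filter($rows));

print_r($array);
```

Reading the cell text with eq() on the row's td list avoids the extra nesting, so there is no need for the `$index > 0` check or the `[0][0][0]` lookups afterwards.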
