
mozgus_

Web scraping/crawling with Laravel

Hello all,

I want to make a small web application that scrapes known websites (the URLs are saved in a database) and saves the results in the database.

Currently I have Laravel 5.5.2 installed on my local Windows machine, and I have connected it to a database. The database has a table 'web' that holds the links I want to scrape. From the HTML of those links I will take the meta tags (og:title, og:description, og:image and og:keywords) and save them in a different table in my database.

After that, I will build a search over my database.

Now, the question is: what is the best way to do this? I've tried some tutorials, but I couldn't get them to work.

Please give me some hints :)

Thank you! Cheers!

mozgus_

With Goutte I figured out how to crawl a URL. Now I know how to return the meta tags as JSON from the URL that I'm crawling.

But now I have a problem: I can't figure out how to crawl all links that share a base URL.

For example, my base URL is:

$url = "http://novidom.ba/offer/";

I want to crawl and collect information from all links that start with http://novidom.ba/offer/.

An example of such a link is:

http://novidom.ba/offer/iznajmljuje-se-stan-u-prizemlju-kuce-midzic-mahala-bihac/775
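
One way to narrow a list of collected links down to the ones under a base URL is a plain prefix check. This is a minimal sketch, assuming the links are already absolute URLs; the helper name `filterByBaseUrl` and the `/contact` URL are made up for illustration.

```php
<?php

/**
 * Keep only the unique URLs that start with the given base URL.
 *
 * @param string[] $links   Absolute URLs collected from a page.
 * @param string   $baseUrl Prefix every kept URL must start with.
 * @return string[]
 */
function filterByBaseUrl(array $links, string $baseUrl): array
{
    $matching = array_filter($links, function (string $link) use ($baseUrl): bool {
        // strncmp compares only the first strlen($baseUrl) bytes.
        return strncmp($link, $baseUrl, strlen($baseUrl)) === 0;
    });

    // array_filter and array_unique preserve keys, so re-index the result.
    return array_values(array_unique($matching));
}

$offerUrl = 'http://novidom.ba/offer/iznajmljuje-se-stan-u-prizemlju-kuce-midzic-mahala-bihac/775';

$links = [
    $offerUrl,
    'http://novidom.ba/contact', // hypothetical URL outside the base
    $offerUrl,                   // duplicate, should be dropped
];

$offers = filterByBaseUrl($links, 'http://novidom.ba/offer/');
print_r($offers);
```

With Goutte, the input array could be built from `$link->getUri()` for each link on the page, as in the link-collecting code later in this thread.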

Any suggestions? Thanks

mozgus_

To make this easier to understand, I will add my function:

    // Assumes `use Goutte\Client;` at the top of the file.
    $client = new Client();
    $crawler = $client->request('GET', 'http://novidom.ba/offer/iznajmljuje-se-stan-u-prizemlju-kuce-midzic-mahala-bihac/775');

    // Collect the name, property and content attributes of every <meta> tag.
    $meta = $crawler->filter('meta')->each(function ($node) {
        return [
            'name' => $node->attr('name'),
            'property' => $node->attr('property'),
            'content' => $node->attr('content'),
        ];
    });

    return $meta;

This way I can return all meta tags as JSON (later I will write a function that saves this data in the database).
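
Since only the Open Graph tags named in the original question are needed, the array returned by the function above could be reduced before saving. This is a minimal sketch working on plain arrays; the helper name `pickOpenGraphTags` and the sample data are assumptions for illustration, not real values from the site.

```php
<?php

/**
 * Reduce a list of meta tags to a property => content map of wanted tags.
 *
 * @param array    $meta   Rows shaped like ['name' => ?, 'property' => ?, 'content' => ?].
 * @param string[] $wanted Property names to keep, e.g. 'og:title'.
 * @return array
 */
function pickOpenGraphTags(array $meta, array $wanted): array
{
    $result = [];

    foreach ($meta as $tag) {
        // Missing attributes come back as null, so guard against them.
        $property = $tag['property'] ?? null;

        if ($property !== null && in_array($property, $wanted, true)) {
            $result[$property] = $tag['content'] ?? null;
        }
    }

    return $result;
}

// Made-up sample rows in the same shape as the function above returns.
$meta = [
    ['name' => 'viewport', 'property' => null, 'content' => 'width=device-width'],
    ['name' => null, 'property' => 'og:title', 'content' => 'Example offer'],
    ['name' => null, 'property' => 'og:image', 'content' => 'http://example.com/a.jpg'],
];

$og = pickOpenGraphTags($meta, ['og:title', 'og:description', 'og:image', 'og:keywords']);
echo json_encode($og), PHP_EOL;
```

Tags that are absent from the page are simply missing from the result, so the code that saves the row has to decide what to store for them.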

I've tried to return all links with the function below, but pagination is a problem: it returns about 80 results, and I know there should be about 1000.

    // Assumes `use Goutte\Client;` at the top of the file.
    $url = "http://novidom.ba/offer/";
    $client = new Client();
    $crawler = $client->request('GET', $url);
    $links_count = $crawler->filter('a')->count();
    $all_links = [];

    if ($links_count > 0) {
        $links = $crawler->filter('a')->links();

        // Collect the absolute URI of every link on this one page.
        foreach ($links as $link) {
            $all_links[] = $link->getUri();
        }

        $all_links = array_unique($all_links);

        echo "All available links from this page ($url):<pre>";
        print_r($all_links);
        echo "</pre>";
    } else {
        echo "No links found";
    }

    die;
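
The code above fetches only the first page, which would explain getting about 80 links instead of about 1000. One possible approach is a breadth-first crawl that keeps following links under the base URL until no new ones appear. This is a sketch under stated assumptions: the pagination links are assumed to start with the base URL too (not verified for this site), `crawlUnderBase` and `$fetchLinks` are hypothetical names, and the in-memory `$site` array with made-up URLs stands in for real HTTP requests.

```php
<?php

/**
 * Breadth-first crawl restricted to URLs that start with $baseUrl.
 *
 * @param string   $baseUrl    Start page and required URL prefix.
 * @param callable $fetchLinks function (string $url): string[] returning
 *                             the absolute URLs found on that page.
 * @param int      $maxPages   Safety cap on the number of pages fetched.
 * @return string[] Every matching URL discovered, in discovery order.
 */
function crawlUnderBase(string $baseUrl, callable $fetchLinks, int $maxPages = 2000): array
{
    $queue = [$baseUrl];
    $seen = [$baseUrl => true];
    $fetched = 0;

    while ($queue !== [] && $fetched < $maxPages) {
        $url = array_shift($queue);
        $fetched++;

        foreach ($fetchLinks($url) as $link) {
            // Drop the fragment so "#top" variants are not treated as new pages.
            $link = explode('#', $link, 2)[0];

            $underBase = strncmp($link, $baseUrl, strlen($baseUrl)) === 0;
            if ($underBase && !isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = $link;
            }
        }
    }

    return array_keys($seen);
}

// Tiny fake site: each key is a page, each value the links found on it.
$site = [
    'http://example.test/offer/' => [
        'http://example.test/offer/a/1',
        'http://example.test/offer/?page=2',
        'http://example.test/about',
    ],
    'http://example.test/offer/?page=2' => [
        'http://example.test/offer/b/2',
        'http://example.test/offer/',
    ],
];

$fetchLinks = function (string $url) use ($site): array {
    return $site[$url] ?? [];
};

$found = crawlUnderBase('http://example.test/offer/', $fetchLinks);
print_r($found);
```

With Goutte, `$fetchLinks` could wrap the request from the code above and return `$link->getUri()` for each entry of `$crawler->filter('a')->links()`. The `$maxPages` cap and the `$seen` map keep the crawl from looping forever on pages that link back to each other.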
