How to speed up this script to verify 90K URLs for their HTTP status code
I have a list of 90K URLs and need to check the HTTP status code each one returns. I've written a script to do that, but it's slow. I would appreciate it if someone could help me speed it up.
I'd like to try this with the built-in Http client, but I don't know how to make only a HEAD request, or how to make concurrent requests for URLs fetched from the database.
@tisuchi Update: it looks like we can't pass an array of URLs as the second parameter to the request method of the Guzzle Client. It works fine with a single URL passed as a string.
I changed my mind a bit. What if you follow this approach?
public function handle()
{
    // Initialize the Guzzle HTTP client; http_errors => false stops 4xx/5xx responses from throwing
    $client = new \GuzzleHttp\Client(['http_errors' => false]);

    // Get the unchecked URLs from the database
    $urls = DB::table('internal_links')->whereNull('status')->orderBy('id')->get();

    // Start one async HEAD request per URL, keyed by the row id
    $promises = [];
    foreach ($urls as $url) {
        $promises[$url->id] = $client->requestAsync('HEAD', $url->href);
    }

    // Wait for every request to settle (fulfilled or rejected) without throwing
    $results = \GuzzleHttp\Promise\Utils::settle($promises)->wait();

    // Update the status code for every request that produced a response
    foreach ($results as $id => $result) {
        if ($result['state'] === 'fulfilled') {
            $statusCode = $result['value']->getStatusCode();
            DB::table('internal_links')->where('id', $id)->update(['status' => $statusCode]);
        }
    }

    return Command::SUCCESS;
}
FYI, GuzzleHttp\Promise\Utils::settle(...)->wait() waits for all of the requests to complete before the script continues, and it does not throw when an individual request fails (for example on a DNS error or a timeout). Because the promises themselves have no getStatusCode() method, the status code is read from the settled results instead. The http_errors => false option makes 4xx and 5xx responses come back as normal responses, so their status codes are recorded too. The final loop then updates the status code in the database for each URL that returned a response.
⚠️ You will need to install the Guzzle HTTP client package by running composer require guzzlehttp/guzzle before you can use it in your Laravel project.
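One caveat with starting a promise for every row: with 90K URLs that means 90K requests in flight at the same moment, which is likely to exhaust memory or sockets. A rough sketch of an alternative is Guzzle's Pool class, which sends requests from a generator and caps how many run concurrently. The function name checkStatusCodes, the concurrency of 50, and the 10-second timeout are assumptions for illustration, not values from the original script.

```php
<?php

use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use Psr\Http\Message\ResponseInterface;

/**
 * Hypothetical helper: send a HEAD request to every URL, running at most
 * $concurrency requests at a time.
 *
 * @param  array<int|string, string>  $urls  row id => URL
 * @return array<int|string, int|null>       row id => status code, or null when no response arrived
 */
function checkStatusCodes(Client $client, array $urls, int $concurrency = 50): array
{
    $statuses = [];

    // A generator lets the pool create requests lazily instead of all at once.
    $requests = function () use ($urls) {
        foreach ($urls as $id => $url) {
            yield $id => new Request('HEAD', $url);
        }
    };

    $pool = new Pool($client, $requests(), [
        'concurrency' => $concurrency,
        // Request options applied to every request in the pool (values are assumptions).
        'options' => ['http_errors' => false, 'timeout' => 10],
        'fulfilled' => function (ResponseInterface $response, $id) use (&$statuses) {
            $statuses[$id] = $response->getStatusCode();
        },
        'rejected' => function ($reason, $id) use (&$statuses) {
            // Connection failures and timeouts never produce a status code.
            $statuses[$id] = null;
        },
    ]);

    // Start the transfers and block until the whole pool has finished.
    $pool->promise()->wait();

    return $statuses;
}
```

Inside handle(), this could be fed one chunk of rows at a time, for example from DB::table('internal_links')->whereNull('status')->orderBy('id')->chunkById(1000, ...), passing $rows->pluck('href', 'id')->all() as $urls and writing each non-null status back by id. The right concurrency depends on the server and the target hosts, so it is worth trying a few values.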