You're on the right track. Two things are going wrong: DOMDocument "fixes" (normalizes) your HTML and does not always preserve attributes or structure the way you expect, and, more importantly, your code is not replacing the <img src="..."> value with a base64 data URI, so the original path is left in the output.
Why?
Because DOMDocument::loadHTML() is built for complete HTML documents (with <html>, <body>, etc.). If you pass in a fragment, libxml adds the missing wrapper elements itself and may restructure the markup, so the result is not always what you expect. Separately, if the image path does not resolve to an existing file, the base64 embedding is skipped and the original src is left untouched.
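To see what DOMDocument does with a fragment, you can load one and print the result. This is a minimal sketch using only PHP's DOM extension; the sample markup and the a.png file name are made up for illustration.

```php
<?php
// Load a bare fragment (no <html> or <body>) and serialize it two ways.
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML('<p>Hello <img src="a.png"></p>');
libxml_clear_errors();

// With no argument, saveHTML() serializes the whole document,
// including the <html> and <body> elements the parser added.
$whole = $dom->saveHTML();

// With a node argument, saveHTML() serializes only that node.
$img = $dom->getElementsByTagName('img')->item(0);
$onlyImg = $dom->saveHTML($img);

echo $whole, PHP_EOL, $onlyImg, PHP_EOL;
```

The first value contains wrapper elements you never wrote, which is why returning the whole document from a fragment-processing function gives surprising output.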
How to fix:
- Ensure you pass a full HTML document to DOMDocument (or wrap your fragment).
- Double-check your path normalization to ensure $fullPath points to the actual file.
- After modifying the DOM, output the correct part (the body inner HTML).
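The path-normalization step is the easiest one to get wrong, so it can help to test it in isolation. The sketch below is one possible way to do it, not the only one: resolveImagePath() and its $baseDir parameter are hypothetical names, and unlike the /storage/-only pattern it strips the host from any http(s) URL, drops query strings, and rejects paths that resolve outside the base directory.

```php
<?php
// Hypothetical helper (not part of the original code): map an <img> src to a
// real file under $baseDir, or return null if it cannot be resolved safely.
function resolveImagePath(string $src, string $baseDir): ?string
{
    // For absolute URLs, keep only the part after the host.
    if (preg_match('#^https?://[^/]+(/.*)$#i', $src, $m)) {
        $src = $m[1];
    }
    // Drop any query string or fragment such as "?v=3".
    $src = preg_replace('/[?#].*$/', '', $src);
    // Strip leading "./", "../" and slashes so the path is relative to $baseDir.
    $src = ltrim(preg_replace('#^(\.\./|\./)+#', '', $src), '/');

    $base = realpath($baseDir);
    $full = realpath($baseDir . DIRECTORY_SEPARATOR . $src);
    if ($base === false || $full === false || !is_file($full)) {
        return null;
    }
    // Refuse anything that resolves outside $baseDir.
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($full, $prefix, strlen($prefix)) === 0 ? $full : null;
}
```

In a Laravel app you would call it as resolveImagePath($src, public_path()) and treat a null result as "could not embed".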
Here's a revised version with comments:
    function embedImagesAsBase64ForPdf(string $html): string
    {
        \Log::info('🔧 embedImagesAsBase64ForPdf started');
        $publicPath = public_path();

        // Remove srcset and sizes attributes
        $html = preg_replace('/\s(srcset|sizes)="[^"]*"/i', '', $html);

        // Wrap in a full HTML document to ensure proper parsing
        $html = '<!DOCTYPE html><html><head><meta charset="UTF-8"></head><body>' . $html . '</body></html>';

        libxml_use_internal_errors(true);
        $dom = new \DOMDocument();
        $dom->loadHTML($html);
        libxml_clear_errors(); // discard the collected parser warnings

        foreach ($dom->getElementsByTagName('img') as $img) {
            $src = $img->getAttribute('src');

            // Skip images that are already embedded as data URIs
            if (strncmp($src, 'data:', 5) === 0) {
                continue;
            }

            // Normalize and strip domain if needed
            if (preg_match('#https?://[^/]+(/storage/.*)#i', $src, $m)) {
                $src = $m[1];
            }
            $src = preg_replace('#^(\.\./|\./)+#', '', $src);
            $src = ltrim($src, '/');

            $candidate = $publicPath . DIRECTORY_SEPARATOR . $src;
            $fullPath = realpath($candidate);

            if ($fullPath !== false && is_file($fullPath)) {
                $imageData = file_get_contents($fullPath);
                $mimeType = mime_content_type($fullPath);
                $base64 = base64_encode($imageData);
                $dataUri = "data:$mimeType;base64,$base64";
                $img->setAttribute('src', $dataUri);
            } else {
                // realpath() returns false for a missing file, so log the path that was tried
                \Log::warning("❌ Could not embed image: $src (tried: $candidate)");
            }
        }

        // Extract the body inner HTML
        $body = $dom->getElementsByTagName('body')->item(0);
        $fixedHtml = '';
        foreach ($body->childNodes as $child) {
            $fixedHtml .= $dom->saveHTML($child);
        }

        return $fixedHtml;
    }
Key changes:
- Wrap your HTML in a full document before parsing.
- Improved logging to show the resolved full path.
- The function now reliably replaces the src attribute with the base64 data URI.
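If you want to check the data URI step on its own, outside Laravel, something like the following sketch can be used. fileToDataUri() is a hypothetical helper name, and the fallback to application/octet-stream when the MIME type cannot be detected is my assumption, not something your code has to do.

```php
<?php
// Hypothetical helper (not part of the original code): turn a local file
// into a base64 data URI, or return null if the file cannot be read.
function fileToDataUri(string $path): ?string
{
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        return null;
    }
    // mime_content_type() needs the fileinfo extension; fall back to a
    // generic type if it is unavailable or detection fails.
    $mime = function_exists('mime_content_type') ? mime_content_type($path) : false;
    if ($mime === false) {
        $mime = 'application/octet-stream';
    }
    return 'data:' . $mime . ';base64,' . base64_encode($bytes);
}
```

Decoding everything after the first comma with base64_decode() should give back the original file bytes, which is a quick way to confirm that the embedding is intact.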
Debug tip:
If you still get the original src in the output, log $src and $fullPath to ensure the file exists and is being read.
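Another way to debug this is to scan the returned HTML for images that were not embedded. The sketch below assumes a hypothetical helper, findUnembeddedImages(), that you would run on the function's output; it uses only the DOM extension.

```php
<?php
// Hypothetical debugging helper (not part of the original code): return the
// src of every <img> in $html that is not a data URI.
function findUnembeddedImages(string $html): array
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Wrap the fragment so the parser always receives a full, non-empty document.
    $dom->loadHTML('<!DOCTYPE html><html><body>' . $html . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $missing = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if (strncmp($src, 'data:', 5) !== 0) {
            $missing[] = $src;
        }
    }
    return $missing;
}

// Example: one embedded image, one that still points at a file.
print_r(findUnembeddedImages(
    '<img src="data:image/png;base64,AAAA"><img src="/storage/a.png">'
));
```

Any path it reports is one where the file lookup failed, so those are the values worth comparing against what is actually on disk.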
Extra:
If your images are not in /public, adjust $publicPath accordingly.
Summary:
The main issue was that the HTML fragment was not wrapped in a full document before being passed to DOMDocument, possibly combined with incorrect path resolution. The code above should address both and embed your images as base64 in the HTML you pass to Dompdf.