PHP DOMDocument: loop through all elements with text in an HTML string
I wrote a PHP method that takes an HTML string, loops through all "p" paragraph elements, and translates them. Now I want to improve the method so that it loops through every element that has text inside, for example h2, span, and a elements.
Can you help me?
My PHP method
$htmlString =
'<section>
<h2>Ámbito de aplicación</h2>
<p>
Estas condiciones generales constituyen las condiciones de uso
junto con las normas comunitarias, disponibles en
http://add.com/community-rules y las posibles condiciones
complementarias pactadas entre el usuario!.
</p>
</section>';
$dom = new \DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($htmlString);
foreach( $dom->getElementsByTagName("p") as $pnode )
{
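// text content of the paragraph (any nested tags are dropped)
$nodeValue = $pnode->nodeValue;
$translated = $this->translate($nodeValue);
$newNode = $dom->createElement("p", $translated);
$pnode->parentNode->replaceChild($newNode, $pnode);
}
// saveHTML() returns the markup as a string, so keep the result
$html = $dom->saveHTML();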
If you know which tags you are looking for, then maybe you could use XPath. I found this Stack Overflow answer that explains it better than I could: https://stackoverflow.com/a/7906888
If you don't know which tags to look for (that is, you want any tag that has text inside), then the solution by @braunson is your best option, I think.
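For the "any tag that has text inside" case, XPath can also select the text nodes themselves rather than the elements, so no list of tag names is needed. Here is a minimal sketch, not a definitive implementation. translateText() is a hypothetical stand-in for your $this->translate() (it only upper-cases the text), and translateHtml() is a name I made up for this example:

```php
<?php
// Hypothetical stand-in for $this->translate(); it only upper-cases the text.
function translateText(string $text): string
{
    return strtoupper($text);
}

// Translate every non-blank text node and return the resulting markup.
function translateHtml(string $html, callable $translate): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Select every text node that contains something other than whitespace.
    foreach ($xpath->query('//text()[normalize-space()]') as $textNode) {
        // Only the text changes; the surrounding element and its attributes stay.
        $textNode->nodeValue = $translate($textNode->nodeValue);
    }

    return $dom->saveHTML();
}

$html = '<section><h2>Title</h2><p>Read the <a href="/rules">rules</a> first.</p></section>';
$result = translateHtml($html, 'translateText');
```

Because only text nodes are rewritten, the h2, p, and a elements and their attributes stay where they are. One trade-off: a sentence that is split by an inline tag gets translated in separate pieces, which may matter for a real translation service.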
I took the liberty of combining your code with the solution offered by @braunson.
This is untested code, but it should be pretty much correct.
$dom = new \DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($htmlString);
// get your first <section>
$section = $dom->getElementsByTagName('section')->item(0);
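// loop through a snapshot of the children of <section>;
// replacing nodes while iterating the live list can skip some of them
foreach (iterator_to_array($section->childNodes) as $childNode) {
// skip whitespace text nodes and comments; only elements are rebuilt
if ($childNode->nodeType !== XML_ELEMENT_NODE) {
continue;
}
// get the text of the child
$nodeValue = trim($childNode->nodeValue);
if ($nodeValue === '') {
// don't translate this, it doesn't have any text content.
continue;
}
// the child has text that can be translated
// almost identical to your original code
$translated = $this->translate($nodeValue);
$newNode = $dom->createElement($childNode->nodeName, $translated);
$childNode->parentNode->replaceChild($newNode, $childNode);
}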
// loop through each section
foreach ($dom->getElementsByTagName('section') as $section) {
// loop through each child of <section>
foreach ($section->childNodes as $childNode) {
// ....
}
}
As @braunson said, find all <section> elements and loop through each of them.
$dom = new \DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($htmlString);
// get all <section>
$sections = $dom->getElementsByTagName('section');
// loop through each <section>
foreach ($sections as $section) {
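// loop through a snapshot of the children of <section>;
// replacing nodes while iterating the live list can skip some of them
foreach (iterator_to_array($section->childNodes) as $childNode) {
// skip whitespace text nodes and comments; only elements are rebuilt
if ($childNode->nodeType !== XML_ELEMENT_NODE) {
continue;
}
// get the text of the child
$nodeValue = trim($childNode->nodeValue);
if ($nodeValue === '') {
// don't translate this, it doesn't have any text content.
continue;
}
// the child has text that can be translated
// almost identical to your original code
$translated = $this->translate($nodeValue);
$newNode = $dom->createElement($childNode->nodeName, $translated);
$childNode->parentNode->replaceChild($newNode, $childNode);
}
}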