
Gabotronix's avatar

Problem replacing text contents with PHP DOMDocument

I'm using PHP DOMDocument to loop through an HTML string like this:

$htmlString = "<section>
        <p>text</p>
        <p>text</p>
    </section>
    <section>
        <h2>text</h2>
            <p>text</p>
            <p>text</p>
    </section>";

I want to replace the contents of each tag that has a text value with a custom string (called $vueInterpolation), but for some reason ONLY the first tag of each section is getting its contents replaced!

I want all the tags inside each section to be replaced!

My method:

$dom = new \DOMDocument();
        libxml_use_internal_errors(true);

        $htmlString = mb_convert_encoding($htmlString, 'HTML-ENTITIES', 'UTF-8');
        $dom->loadHTML($htmlString);

        $count = 0;
        $keyPattern = 'ccpaRights';

        foreach ($dom->getElementsByTagName('section') as $section)
        {

            // loop through each child of <section>
            foreach ($section->childNodes as $childNode)
            { 
                 $nodeValue = trim($childNode->nodeValue); 

                if ($nodeValue === '')
                {
                    continue; 
                }

                $count = $count + 1;
                $key = (string) $keyPattern.'Text'.$count;
                
                $vueInterpolation = ' {{ $t('.'"'.$key.'"'.' }} ';

                $newNode = $dom->createElement($childNode->nodeName, $vueInterpolation );

                $childNode->parentNode->replaceChild($newNode, $childNode);
            }
        }

        
        $dom->saveHTML($dom);
        $dom->save(public_path('/temp/result.html'));

Just so you know, these are the results I'm getting (notice that only the first tag of each section was replaced):

<section>
        <p> {{ $t("ccpaRightsText1" }} </p>
        <p>text</p>
    </section>
    <section>
        <h2> {{ $t("ccpaRightsText2" }} </h2>
            <p>text</p>
            <p>text</p>
    </section>

And these are the results I want:

<section>
        <p> {{ $t("ccpaRightsText1" }} </p>
        <p> {{ $t("ccpaRightsText2" }} </p>
    </section>
    <section>
        <h2> {{ $t("ccpaRightsText3" }} </h2>
            <p> {{ $t("ccpaRightsText4" }} </p>
            <p> {{ $t("ccpaRightsText5" }} </p>
    </section>
LaryAI's avatar

The code replaces the child nodes of each section tag one by one, which is what causes the problem. Instead, we can replace the entire contents of the section tag with a new node that contains the custom string. Here's the updated code:

$dom = new \DOMDocument();
libxml_use_internal_errors(true);

$htmlString = mb_convert_encoding($htmlString, 'HTML-ENTITIES', 'UTF-8');
$dom->loadHTML($htmlString);

$count = 0;
$keyPattern = 'ccpaRights';

foreach ($dom->getElementsByTagName('section') as $section)
{
    $count = $count + 1;
    $key = (string) $keyPattern.'Text'.$count;
    $vueInterpolation = ' {{ $t('.'"'.$key.'"'.') }} ';

    $newNode = $dom->createDocumentFragment();
    $newNode->appendXML($vueInterpolation);

    $section->nodeValue = '';
    $section->appendChild($newNode);
}

$dom->saveHTML($dom);
$dom->save(public_path('/temp/result.html'));

In this updated code, we are creating a new document fragment for each section tag and appending the custom string to it. Then, we are replacing the entire contents of the section tag with the new node that contains the custom string. This way, all the child nodes of the section tag are replaced with the custom string.

EricZwart's avatar

I gave a possible solution (using iconv) in your other post about the same problem.

Your example worked fine. But when you have ë or á in the HTML string, it reproduces the error you show in the image. When you convert the HTML string to ISO-8859-1 with iconv:

$htmlString = iconv("UTF-8", "ISO-8859-1", $htmlString);

It shows up fine. Hope that works for you

Gabotronix's avatar

@EricZwart Hi Eric, yeah, the special chars are not the problem; the problem is that only the first tag of each section is getting replaced.

I want all tags inside sections to be replaced instead!

EricZwart's avatar

@Gabotronix sorry about that.

I still had the previous code and that still worked getting all the sections:

        $htmlString = '<section>
        <p>En esta Política de Privacidad describimos cómo recogemos sus datos personales y por qué los recogemos, qué hacemos con ellos, con quién los compartimos, cómo los protegemos y sus opciones en cuanto al tratamiento de sus datos personales.</p>
        <p>Esta Política se aplica al tratamiento de sus datos personales recogidos por la empresa para la prestación de sus servicios en la aplicación WOO!. Si cumplimenta cualquiera de nuestros formularios debe aceptar previamente las condiciones de esta Política y guardaremos registro de esa aceptación.</p>
    </section>
    <section>
        <h2>Addendum de CCPA Derechos de California</h2>
            <p>Si eres residente de California, consulta nuestra <a href="/legal/ccpa-rights">Declaración de privacidad de California</a>, que complementa esta Política de privacidad.</p>
            <p>Apreciamos que nos confíes tu información y trataremos de merecer esa confianza en todo momento. Para esto, el primer paso es que conozcas qué información recopilamos, por qué la recopilamos, cómo se usa y las opciones que tienes respecto de tu información. Esta Política describe nuestras prácticas de privacidad con un lenguaje sencillo, reduciendo al mínimo el vocabulario legal y técnico.</p>
    </section>';

        //$htmlString = iconv("UTF-8", "ISO-8859-1", $htmlString);
        $dom = new DOMDocument();
        libxml_use_internal_errors(true);
        $htmlString = mb_convert_encoding($htmlString, 'HTML-ENTITIES', 'UTF-8');
        $dom->loadHTML($htmlString);

        $count = 0;
        $keyPattern = 'ccpaRights';
        $newArray = [];


        foreach ($dom->getElementsByTagName('section') as $section) {

            // loop through each child of <section>
            foreach ($section->childNodes as $childNode) {
                $nodeValue = $childNode->nodeValue;

                if ($nodeValue === '') {
                    continue;
                }

                $count = $count + 1;
                $key = (string) $keyPattern.'Text'.$count;

                $newArray[$key] = $nodeValue;
            }
        }


        $fileName = '/temp/translated-'.rand(1, 1000).'.json';

        Storage::disk('public')->put($fileName, json_encode($newArray, JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT));
Gabotronix's avatar

@EricZwart Just so you know, these are the results I'm getting:

<section>
        <p> {{ $t("ccpaRightsText1" }} </p>
        <p>Esta Política se aplica al tratamiento de sus datos personales recogidos por la empresa para la prestación de sus servicios en la aplicación WOO!. Si cumplimenta cualquiera de nuestros formularios debe aceptar previamente las condiciones de esta Política y guardaremos registro de esa aceptación.</p>
    </section>
    <section>
        <h2> {{ $t("ccpaRightsText2" }} </h2>
            <p>Si eres residente de California, consulta nuestra Declaración de privacidad de California, que complementa esta Política de privacidad.</p>
            <p>Apreciamos que nos confíes tu información y trataremos de merecer esa confianza en todo momento. Para esto, el primer paso es que conozcas qué información recopilamos, por qué la recopilamos, cómo se usa y las opciones que tienes respecto de tu información. Esta Política describe nuestras pr&#xE1;cticas de privacidad con un lenguaje sencillo, reduciendo al mínimo el vocabulario legal y técnico.</p>
    </section>

And these are the results I want:

<section>
        <p> {{ $t("ccpaRightsText1" }} </p>
        <p> {{ $t("ccpaRightsText2" }} </p>
    </section>
    <section>
        <h2> {{ $t("ccpaRightsText3" }} </h2>
            <p> {{ $t("ccpaRightsText4" }} </p>
            <p> {{ $t("ccpaRightsText5" }} </p>
    </section>
kokoshneta's avatar

This is probably not the actual issue, but what exactly is $t("something" supposed to be? It’s not valid PHP/Blade, so – assuming it’s meant for later re-replacement, it will probably break.

For the actual problem, have you seen this comment in the DOMNode docs?

If you are trying to replace more than one node at once, you have to be careful about iterating over the DOMNodeList. If the old node has a different name from the new node, it will be removed from the list once it has been replaced. Use a regressive loop.

 

[code example left out]

 

The loop counter ($i) will always be in the list's interval as removed elements indexes are above the counter.
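
For this thread's case, a regressive loop could look roughly like the sketch below. It is only an illustration under a few assumptions that are not in the original post: the sample HTML is shortened, the closing parenthesis of `$t(...)` is added, only element children are replaced (text nodes are skipped), and the children are counted first so the keys still come out numbered in document order even though the list is walked backwards.

```php
<?php
// Shortened, ASCII-only sample input (an assumption for this sketch).
$htmlString = '<section><p>text</p> <p>text</p></section>'
    . '<section><h2>text</h2> <p>text</p> <p>text</p></section>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // <section> is reported as unknown by libxml
$dom->loadHTML($htmlString);
libxml_clear_errors();

$keyPattern = 'ccpaRights';
$count = 0;

foreach ($dom->getElementsByTagName('section') as $section) {
    $children = $section->childNodes;

    // First pass (read-only): count the children that will be replaced,
    // so keys can be assigned in document order during the backward walk.
    $total = 0;
    foreach ($children as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE && trim($child->nodeValue) !== '') {
            $total++;
        }
    }

    // Second pass: walk the live list from the end towards the start.
    $number = $count + $total;
    for ($i = $children->length - 1; $i >= 0; $i--) {
        $child = $children->item($i);

        if ($child->nodeType !== XML_ELEMENT_NODE || trim($child->nodeValue) === '') {
            continue; // skip whitespace text nodes and empty elements
        }

        $key = $keyPattern . 'Text' . $number;
        $number--;

        $newNode = $dom->createElement($child->nodeName, ' {{ $t("' . $key . '") }} ');
        $section->replaceChild($newNode, $child);
    }

    $count += $total;
}

// saveHTML() returns the markup as a string; it does not write a file.
$result = $dom->saveHTML();
echo $result;
```

Because the index only ever moves towards the start of the list, replacing the node at position `$i` cannot affect the positions that are still to be visited.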

EricZwart's avatar

Maybe this one will work:

 $dom = new \DOMDocument();
libxml_use_internal_errors(true);

$htmlString = mb_convert_encoding($htmlString, 'HTML-ENTITIES', 'UTF-8');
$dom->loadHTML($htmlString);

$count = 0;
$keyPattern = 'ccpaRights';

foreach ($dom->getElementsByTagName('section') as $section) {
    // Initialize an array to store the new nodes
    $newNodes = [];

    // Loop through each child of <section>
    foreach ($section->childNodes as $childNode) {
        $nodeValue = trim($childNode->nodeValue);

        if ($nodeValue === '') {
            continue;
        }

        $count++;
        $key = $keyPattern . 'Text' . $count;

        // Create a new node 
        $newNode = $dom->createElement($childNode->nodeName, '{{ $t("' . $key . '") }}');

        // Remember which existing child this new node should replace
        $newNodes[] = [$childNode, $newNode];
    }

    // Replace each original child with its new node, after the loop over the live list has finished
    foreach ($newNodes as [$oldNode, $newNode]) {
        $section->replaceChild($newNode, $oldNode);
    }
}

$dom->saveHTML($dom);
$dom->save(public_path('/temp/result.html'));
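
Another way to avoid modifying the list while looping over it is to copy the live `childNodes` list into a plain array first. The sketch below assumes a hypothetical helper name (`replaceSectionTexts`), a shortened ASCII-only sample string, and that only element children should be replaced; none of these come from the original post. For UTF-8 input you would probably still want the encoding step used earlier in the thread.

```php
<?php
// Hypothetical helper for illustration only; the name is not from the thread.
function replaceSectionTexts(string $html, string $keyPattern): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $count = 0;

    foreach ($dom->getElementsByTagName('section') as $section) {
        // iterator_to_array() takes a snapshot of the live DOMNodeList,
        // so replacing nodes below does not disturb this loop.
        foreach (iterator_to_array($section->childNodes) as $child) {
            if ($child->nodeType !== XML_ELEMENT_NODE || trim($child->nodeValue) === '') {
                continue; // skip whitespace text nodes and empty elements
            }

            $count++;
            $key = $keyPattern . 'Text' . $count;

            $newNode = $dom->createElement($child->nodeName, ' {{ $t("' . $key . '") }} ');
            $section->replaceChild($newNode, $child);
        }
    }

    // Return the serialized markup so the caller decides where to store it.
    return $dom->saveHTML();
}

$result = replaceSectionTexts(
    '<section><p>text</p> <p>text</p></section>'
    . '<section><h2>text</h2> <p>text</p> <p>text</p></section>',
    'ccpaRights'
);
echo $result;
```

Since the snapshot is walked front to back, the keys are numbered in document order without any extra bookkeeping.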
