
trihead's avatar

How can I extract data from this kind of XML file in Laravel

Hello Pals,

I am trying to extract data from an XML file that has a great many nodes and a lot of information. In this example I need to find out how many closure nodes (closure1, closure2, and so on) I have, and then look for the relevant data under each closure. If you need more detail I can explain, or send the XML file, which I couldn't upload here.

Thanks in advance

0 likes
29 replies
trihead's avatar

How so? Can you explain a little more, please?

trihead's avatar

The main nodes that I look for in the file are:

closure1, closure2, ..., closure'n', shell, support

Under each of these nodes I have many other nodes, and a simple file may contain more than 1000 lines of information.

Cronix's avatar

For large files, you'd probably want to read it in a stream so you don't blow your memory up. I've used this package for large xml files, and it works great with very low memory consumption since it's not reading the whole file into memory at once.

https://github.com/prewk/xml-string-streamer
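A minimal sketch of the same streaming idea, using PHP's built-in XMLReader rather than the linked package (so this is not that package's API). The sample XML, the element names in it, and the `streamSections` helper are assumptions for illustration only. Only one matching subtree is held as a SimpleXML object at a time.

```php
<?php
// Hypothetical sample standing in for the real file; element names are assumed.
$sample = <<<XML
<root>
  <pressureVessel>
    <closure1><leftClosure/><nozzle/></closure1>
    <closure2><rightClosure/></closure2>
    <shell><cylinder/></shell>
  </pressureVessel>
</root>
XML;
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, $sample);

/**
 * Stream the file and return, for each element whose name matches $pattern,
 * the names of its direct children.
 */
function streamSections(string $path, string $pattern): array
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    $sections = [];
    $more = $reader->read();
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && preg_match($pattern, $reader->name)) {
            // Only this one subtree is turned into a SimpleXML object.
            $section = simplexml_load_string($reader->readOuterXml());
            $children = [];
            foreach ($section->children() as $child) {
                $children[] = $child->getName();
            }
            $sections[$reader->name] = $children;
            $more = $reader->next(); // jump past the subtree we just handled
        } else {
            $more = $reader->read();
        }
    }
    $reader->close();

    return $sections;
}

$sections = streamSections($path, '/^closure\d+$/');
unlink($path);
print_r($sections);
```

For a file of around 1000 lines, loading everything with SimpleXML is usually fine; streaming like this mainly pays off when the file is too large to hold in memory.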

1 like
jlrdw's avatar

My son works for a very large firm. He makes over $100,000 a year programming exactly what you are asking about.

He takes large proprietary data sets and writes custom code to convert them to his company's data.

It takes a while to learn this stuff. You have to learn how to use string functions and the file system.

It's all about breaking down those patterns and taking one hunk at a time: converting it, writing it to a new file, etc.

It's hard, not easy. That's why people who write these kinds of programs make a lot of money doing it.

trihead's avatar

Thanks jlrdw. Actually, I know which strings I am looking for. I am new to Laravel and I just want to know how to access the nodes in the file.

jlrdw's avatar

Laravel has nothing to do with it. You could use Laravel, which in turn is PHP, but there are also packages that do this sort of thing, like the one @Cronix suggested.

But remember, if you rely on a package you will never learn this stuff for yourself.

Take pagination: there are Laravel users who would not know how to write their own custom paginator.

Just an observation.

1 like
trihead's avatar

Agreed. I need to learn, but at the same time I must deliver my project too.

Cronix's avatar

@jlrdw That's the whole point of packages though, so you don't have to reinvent the wheel for common tasks. If we needed to understand how all of the code is working, we'd all be using vanilla php without any framework or packages and take a heck of a lot longer to code an app, which most likely also won't be as well-built or secure as using a package that has been tested by thousands of people and had the bugs/security issues worked out. Not everybody is an expert at everything. Is it good to understand what's going on under the hood? Sure. Do you have to know every little thing? No.

2 likes
jlrdw's avatar

@Cronix not disagreeing here, but I just can't see someone installing Laravel and 15 packages, and bam, I have an application.

Example, datatables: yes, an OK package, but I always write my own tables and sort (order by) routines. And some packages actually require as much code to make them work as rolling your own would take.

However, I do agree some packages are needed. One is mobile detect; no, I would not want to attempt that on my own.

@trihead I hope the package works for you. In addition, you could later study its inner workings to see how it does the transformation.

I would imagine a small custom routine could be written that does the same thing. Again it is just manipulating (finding, counting, etc) strings.

jlrdw's avatar

For grins here is one function from the package, notice the string functions being used:

    protected function shave()
    {
        preg_match("/<[^>]+>/", $this->chunk, $matches, PREG_OFFSET_CAPTURE);
        if (isset($matches[0], $matches[0][0], $matches[0][1])) {
            list($captured, $offset) = $matches[0];
            if ($this->options["expectGT"]) {
                // Some elements support > inside
                foreach ($this->options["tagsWithAllowedGT"] as $tag) {
                    list($opening, $closing) = $tag;
                    if (substr($captured, 0, strlen($opening)) === $opening) {
                        // We have a match, our preg_match may have ended too early
                        // Most often, this isn't the case
                        if (substr($captured, -1 * strlen($closing)) !== $closing) {
                            // In this case, the preg_match ended too early, let's find the real end
                            $position = strpos($this->chunk, $closing);
                            if ($position === false) {
                                // We need more XML!
                                return false;
                            }
                            // We found the end, modify $captured
                            $captured = substr($this->chunk, $offset, $position + strlen($closing) - $offset);
                        }
                    }
                }
            }
            // Data in between
            $data = substr($this->chunk, 0, $offset);
            // Shave from chunk
            $this->chunk = substr($this->chunk, $offset + strlen($captured));
            return array($captured, $data . $captured);
        }
        return false;
    }
trihead's avatar

Can any of you help me with this matter?

Cronix's avatar

Help with what specifically? Did you try the package I linked to?

jlrdw's avatar

There is also the SimpleXML parser.

trihead's avatar

The XML file I have is more complicated than the examples I saw for that package.

For example:

root
--closure1
---leftClosure
----ellipsoidalHead
-----standardComponentData
------material
------idNumber
-------material2
. . . . .
--closure2
---leftClosure
----ellipsoidalHead
-----standardComponentData
------material
------idNumber
-------material2
. . . . .
--shell
---cylender
----ellipsoidalHead
-----standardComponentData
------material
------idNumber
-------material2

There are many levels under each node, and all of them are somehow related to each other. If you can help me with this, I will pay for the hours you spend. I am curious to know how to get data out of this kind of file, which has a random pattern. I can upload the file to my server so you can have a look at it.

jlrdw's avatar

Any XML file should have a repeatable pattern. If it's not a properly formed XML document, I would tell whoever sent it that I am not doing it, or charge a pretty penny to work with it. Like I said, you ought to see some of the data my son has to convert at the company he works for, but again, he makes darn good money doing it.

Does the package you tried only support so many levels of nesting?

trihead's avatar

Well, the part I need has a pattern. I meant that there can be 2 or more closures. I need to first find each closure, then the nodes inside it. A better example: say we have to find a house, then we search for its rooms, then for the lights, then whether it has an internet connection and, if it does, whether it is cable or Ethernet, and so on. Then I should collect all the numbers and make a calculation.
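The "find the house, then the rooms, then the lights" idea can be sketched with SimpleXML and XPath. The sample XML is hypothetical: the element names follow the dump that appears later in this thread, while the values and the choice of summing `nominalThickness` are invented purely to show the shape of the routine.

```php
<?php
// Hypothetical sample; values are invented and the real file is far larger.
$xmlString = <<<XML
<root>
  <pressureVessel>
    <closure1>
      <leftClosure>
        <nozzle><standardComponentData><nominalThickness>0.3750</nominalThickness></standardComponentData></nozzle>
        <nozzle><standardComponentData><nominalThickness>0.5000</nominalThickness></standardComponentData></nozzle>
      </leftClosure>
    </closure1>
    <closure2>
      <rightClosure>
        <nozzle><standardComponentData><nominalThickness>0.2500</nominalThickness></standardComponentData></nozzle>
      </rightClosure>
    </closure2>
    <shell>
      <nozzle><standardComponentData><nominalThickness>1.0000</nominalThickness></standardComponentData></nozzle>
    </shell>
  </pressureVessel>
</root>
XML;

$xml = simplexml_load_string($xmlString);

$totals = [];
// Step 1: find every closure under pressureVessel, however many there are.
foreach ($xml->xpath('//pressureVessel/*[starts-with(name(), "closure")]') as $closure) {
    $sum = 0.0;
    // Step 2: search inside this closure only (the leading dot makes the path relative).
    foreach ($closure->xpath('.//nozzle/standardComponentData/nominalThickness') as $value) {
        // Step 3: collect the numbers for the calculation.
        $sum += (float) (string) $value;
    }
    $totals[$closure->getName()] = $sum;
}

print_r($totals);
```

Note that `starts-with(name(), "closure")` would also match a tag such as `closureData`; if the real file has tags like that, a stricter check on the name is needed.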

jlrdw's avatar

Could you try to break the XML file down into more than one part?

benjivm's avatar

@jlrdw nobody cares how much "money" your "son" makes, not only is it completely irrelevant, but worst of all it's totally useless rofl.

@trihead You should look into your options re: packages available to make your life easier here. You can use PHP's robust options or something like this:

I did this using this package:

use XmlParser;

Route::get('/', 'HomeController@index')->name('home');

Route::get('/xml', function() {
    $xml = XmlParser::load(storage_path('app/public/Comp2.xml'));

    $test = $xml->parse([
        'vesselId' => ['uses' => 'pressureVessel.generalVesselInfo.identifier'],
        'vesselLocation' => ['uses' => 'pressureVessel.generalVesselInfo.location'],
        'vesselPurchaser' => ['uses' => 'pressureVessel.generalVesselInfo.purchaser'],
    ]);

    dd($test);
});

Which outputs the following:

array:3 [▼
  "vesselId" => "Drum"
  "vesselLocation" => "Location"
  "vesselPurchaser" => "Dove"
]

I'm sure there are even more powerful solutions to this issue out there, but you can see it's possible.

Good luck.

1 like
trihead's avatar

Actually, the part I need is under --pressureVessel--. There can be more or fewer closures, and I need the information between each pair of tags below: --closure1--...--closure1--, --closure2--...--closure2--, --shell--...--shell--, --support--...--support--

I will show you a picture of the construction so you have an idea of what this XML is for: http://trihead.com/vessel.png

jlrdw's avatar

you should look into your options re: packages available to make your life easier here

But someone had to write the packages. Whether they make good money or just did it to help people, the point is that someone has to know how to code that.

And have you done any experiments, like importing the file, stripping the tags, or anything like that?

Did you try the package that was earlier suggested?

trihead's avatar

I will try the package later today.

benjivm's avatar

@jlrdw that package is very easy to understand, and most of these parser wrappers are simply elegant implementations of simple ideas.

@trihead you have a lot of options, not just that one. You need to spend some time looking at how you want to do this.

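    // Load the whole file into memory (fine for small files).
    $xmlFile = file_get_contents(storage_path('app/public/Comp2.xml'));

    $xml = new SimpleXMLElement($xmlFile);

    // Initialise first so dd() still works when no <closure1> is present.
    $closure1 = [];

    // Collect the direct children of every <closure1> element by tag name.
    foreach ($xml->pressureVessel->closure1 as $element) {
        foreach ($element as $key => $val) {
            $closure1[$key] = $val;
        }
    }

    dd($closure1);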

Outputs:

array:1 [▼
  "leftClosure" => SimpleXMLElement {#449 ▼
    +"ellipsoidalHead": SimpleXMLElement {#459 ▼
      +"standardComponentData": SimpleXMLElement {#463 ▶}
      +"straightFlangeLength": "2.0000"
      +"straightFlangeNominalThickness": "0.4375"
      +"straightFlangeInnerDiameter": "60.0000"
      +"straightFlangeOuterDiameter": "60.8750"
      +"straightFlangeStaticHeadOperating": "1.5845"
      +"straightFlangeStaticHeadOperatingPlusDesignP": "101.5845"
      +"ellipsoidalHeadRatio2": "2"
      +"ellipsoidalHeadRatio": "2:1"
    }
    +"nozzle": array:2 [▼
      0 => SimpleXMLElement {#460 ▼
        +"standardComponentData": SimpleXMLElement {#469 ▼
          +"comment": array:2 [▶]
          +"identifier": "Nozzle N1 (N1)"
          +"idNumber": "1524890210"
          +"attachedTo": "Left Head"
          +"attachedToKind": "Ellipsoidal Head"
          +"attachedToidNumber": "1524890152"
          +"material": "SA-105"
          +"material2": SimpleXMLElement {#486 ▶}
          +"outerDiameterDesign": "TRUE"
          +"innerDiameter": "0.5000"
          +"outerDiameter": "1.2500"
          +"innerCorrosion": "0.0000"
          +"outerCorrosion": "0.0000"
          +"nominalThickness": "0.3750"

... etc.
1 like
trihead's avatar

@benm thanks for your help, it really means a lot to me. Actually, I need to write a routine that counts the number of closures. For example, another XML file might have 3 closures, named closure1, closure2 and closure3. In that case there would be 3 closures, and 3 searches, one within each closure, for every possible node in it.
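A counting routine along those lines could look like the sketch below. It assumes the closures are direct children of `pressureVessel` and are always named `closure` followed by digits. The sample XML and the `findClosures` helper are hypothetical.

```php
<?php
// Hypothetical sample with three closures plus the other top-level sections.
$xmlString = <<<XML
<root>
  <pressureVessel>
    <closure1><leftClosure/></closure1>
    <closure2><rightClosure/></closure2>
    <closure3><topClosure/></closure3>
    <shell/>
    <support/>
  </pressureVessel>
</root>
XML;

/**
 * Return the closureN elements found directly under the given element,
 * keyed by tag name.
 */
function findClosures(SimpleXMLElement $vessel): array
{
    $closures = [];
    foreach ($vessel->children() as $child) {
        if (preg_match('/^closure\d+$/', $child->getName())) {
            $closures[$child->getName()] = $child;
        }
    }
    return $closures;
}

$xml = simplexml_load_string($xmlString);
$closures = findClosures($xml->pressureVessel);

$count = count($closures);
$names = array_keys($closures);
echo "$count closures: " . implode(', ', $names) . PHP_EOL;
```

Each value in `$closures` is still a SimpleXMLElement, so the search inside a closure can continue from it, for example with `foreach ($closures as $name => $closure) { ... }`.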

How do you guys paste code into the comments? I can't do that :)

Abdullah_Iftikhar's avatar
    $xmlString = file_get_contents(public_path('sample.xml'));

    $xmlObject = simplexml_load_string($xmlString);

    $json = json_encode($xmlObject);
    $phpArray = json_decode($json, true);

    dd($phpArray);
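Once the XML is a plain array, one way to see every node it contains is to flatten it into dot-separated paths. This is only a sketch: the sample XML and the `flattenPaths` helper are hypothetical. In a Laravel app, the `Arr::dot` helper produces a similar result.

```php
<?php
// Hypothetical sample; the real file has many more levels.
$xmlString = '<root>'
    . '<closure1><leftClosure><material>SA-105</material><idNumber>1524890152</idNumber></leftClosure></closure1>'
    . '<shell><material>SA-516</material></shell>'
    . '</root>';

// Same conversion as above: SimpleXML -> JSON -> associative array.
$phpArray = json_decode(json_encode(simplexml_load_string($xmlString)), true);

/**
 * Flatten a nested array into "parent.child.leaf" => value pairs.
 * Empty elements (which decode to empty arrays) produce no entry.
 */
function flattenPaths(array $data, string $prefix = ''): array
{
    $flat = [];
    foreach ($data as $key => $value) {
        $path = $prefix === '' ? (string) $key : $prefix . '.' . $key;
        if (is_array($value)) {
            $flat += flattenPaths($value, $path);
        } else {
            $flat[$path] = $value;
        }
    }
    return $flat;
}

$flat = flattenPaths($phpArray);
print_r($flat);
```

One caveat with the JSON round trip: a tag that appears once decodes to a single value, while a repeated tag decodes to a numerically indexed list, so the same node can show up under different shapes from file to file.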
