I used simplexml_load_string(); after that you can read the items and namespaces of the XML like this:
$namespaces = $job->getNamespaces(true);
$taleo = $job->children($namespaces['taleo']);
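For context, here is a minimal self-contained sketch of that approach. The sample XML, the namespace URI, and the element names are made up for illustration and are not Taleo's real format.

```php
<?php
// Hypothetical sample document; the namespace URI and element names are invented.
$xml = <<<XML
<jobs xmlns:taleo="http://example.com/taleo">
  <job>
    <title>Editor</title>
    <taleo:reqId>12345</taleo:reqId>
  </job>
</jobs>
XML;

$jobs = simplexml_load_string($xml);
$job  = $jobs->job;

// Map of prefix => URI for every namespace used in or below this element.
$namespaces = $job->getNamespaces(true);

// Children of <job> that live in the "taleo" namespace.
$taleo = $job->children($namespaces['taleo']);

$title = (string) $job->title;   // un-namespaced child
$reqId = (string) $taleo->reqId; // namespaced child
```

The point is that namespaced children are only reachable through children() with the namespace, not through plain property access.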
I had to use it once for Oracle's Taleo system. It was horrid D:
I am working on a project that gets criminal and credit reports for renters from a Credit Bureau. The Credit Bureau posts XML data to us when the process is complete.
I've noticed that I can't seem to access any data past a certain nesting point. I can clearly see in the XML that the data is there, and if I just call simplexml_load_string() the SimpleXMLElement has all the data. But if I try to convert it to an array or JSON (which is required for storage), I lose some data after a certain nesting point.
json_encode is supposed to return false if too many nesting levels are hit (512 is my default), but it doesn't. I also read that it may be a browser limitation of the pre tag or of Laravel's dd() function, but if I try to dump something deep in the nest it's still empty.
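For what it's worth, the depth behaviour can be checked in isolation. This is only a sketch with a made-up nested array, not the Credit Bureau data: when the depth limit really is exceeded, json_encode() returns false and json_last_error() reports JSON_ERROR_DEPTH, so silently missing data points to a different cause.

```php
<?php
// Made-up data: three arrays nested inside each other.
$nested = [[[1]]];

// A depth limit of 1 is too small for this structure, so encoding fails.
$tooDeep = json_encode($nested, 0, 1);
$error   = json_last_error(); // captured before the next json_* call resets it

// With the default limit (512) the same data encodes fine.
$ok = json_encode($nested);
```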
Is there any way I can get access to the Most Wanted Rapsheet?
Here is the Pastebin containing the XML string and my attempt to get the data.
So that XML to JSON function I found does work great, but it wouldn't work for my needs because it incorporates the namespace into the key name, and I need the namespace removed from the keys.
The problem is that when XML elements use namespaces, the child nodes that are namespaced get dropped during the conversion to an array or JSON. I realized that if I remove the namespaces from the XML string before converting it to an XML element, json_encode works as expected and no data is lost.
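A small sketch of the data loss, using an invented document. The `rap` prefix comes from this thread, but the namespace URI and element names are assumptions for illustration.

```php
<?php
// Hypothetical sample; the URI and element names are invented.
$xml = <<<XML
<report xmlns:rap="http://example.com/rap">
  <status>complete</status>
  <rap:charge>example</rap:charge>
</report>
XML;

$element = simplexml_load_string($xml);

// json_encode() only walks children that have no namespace prefix,
// so <rap:charge> is missing from the result.
$lossy = json_decode(json_encode($element), true);

// The prefixed child is still in the element; it has to be requested by namespace.
$charge = (string) $element->children('rap', true)->charge;
```

Here $lossy contains only the `status` key, even though the `rap:charge` value is still readable through children().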
So, for anyone else having this issue, here is how I solved the problem for my needs:
function removeNamespaceFromXML($xml)
{
    // Because I know all of the namespaces that can possibly appear in
    // the XML string, I can just hard-code them and check for
    // them to remove them
    $toRemove = ['rap', 'turss', 'crim', 'cred', 'j', 'rap-code', 'evic'];
    // Cycle through each namespace and remove it from the XML string
    foreach ($toRemove as $remove) {
        // First remove the namespace from the opening of the tag
        $xml = str_replace('<' . $remove . ':', '<', $xml);
        // Now remove the namespace from the closing of the tag
        $xml = str_replace('</' . $remove . ':', '</', $xml);
        // This XML uses the namespace with commentText, so remove that too
        $xml = str_replace($remove . ':commentText', 'commentText', $xml);
        // Pattern matching exactly this prefix's declaration, e.g. xmlns:rap="..."
        // (preg_quote keeps the hyphen in 'rap-code' literal)
        $pattern = '/\s+xmlns:' . preg_quote($remove, '/') . '\s*=\s*("[^"]*"|\'[^\']*\')/';
        // Remove the actual namespace declaration using the pattern
        $xml = preg_replace($pattern, '', $xml, 1);
    }
    // Return sanitized and cleaned up XML with no namespaces
    return $xml;
}
function namespacedXMLToArray($xml)
{
    // One function to both clean the XML string and return an array
    return json_decode(json_encode(simplexml_load_string(removeNamespaceFromXML($xml))), true);
}
By calling the namespacedXMLToArray() function I can simply get an array that is 100% good to go in my case.
Hopefully this approach helps others. I am sure that if you don't know which namespaces may exist, you can use a regex to find the namespaces defined in the XML and then remove them once you know their names.
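A rough sketch of that idea, in case it is useful. The findDeclaredPrefixes() helper and the sample XML are hypothetical and not part of my solution above; it assumes the declarations appear as plain xmlns:prefix="..." attributes and would also match that pattern inside comments or CDATA.

```php
<?php
// Hypothetical helper: collect every prefix declared as xmlns:prefix="...".
function findDeclaredPrefixes(string $xml): array
{
    // Capture the prefix name between "xmlns:" and "=".
    preg_match_all('/\sxmlns:([A-Za-z_][\w.-]*)\s*=/', $xml, $matches);

    // Drop duplicates and reindex so the result is a plain list.
    return array_values(array_unique($matches[1]));
}

// Invented sample document using two of the prefixes from this thread.
$xml = '<r xmlns:rap="urn:a" xmlns:rap-code="urn:b"><rap:x/><rap-code:y/></r>';
$prefixes = findDeclaredPrefixes($xml);
```

The resulting list could stand in for the hard-coded $toRemove array. If the XML already parses, SimpleXMLElement::getDocNamespaces(true) is another way to get the declared prefixes without a regex.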