Be part of JetBrains PHPverse 2026 on June 9 – a free online event bringing PHP devs worldwide together.

manny1979's avatar

Return a large XML dataset via api

Hi guys,

I have a bit of an issue I hope someone can help me with. I am currently trying to return a large xml dataset created from a database query via an api response. Roughly about 26000 records.

Biggest issue I am having at the moment is the fact that the date while the query takes about a 1 or 2 seconds to run, returning the xml response takes forever to return any data. On first load, seems to take up to 4 minutes to return a response.

The below shows an example of he code that I am using. In an attempt to speed up the response times, I cached the xml response in redis. However, I am finding that this is actually making no difference. Even when cached, I am still waiting for about 3 minutes.

$records = \Cache::remember('fa-products-family', 60, function() {
            $xml = new SimpleXMLElement('<?xml version="1.0" encoding="utf-8"?><ProductFamily/>');
            $records = DB::table('products')
                ->select([
                    'products.nav_id as PF_ID',
                    'products.title as Name',
                    DB::raw('products.long_description as ShortDesc'),
                    DB::raw('products.short_description as LongDesc'),
                    DB::raw('CONCAT("STYLE==",product_groups.name) as Attributes'),
                    DB::raw('NULL as Manufacturer'),
                    DB::raw('NULL as keywords'),
                    DB::raw('"Colour" as Option1'),
                    DB::raw('"Size" as Option2'),
                    DB::raw('NULL as Option3'),
                    'style_code as StyleCode',
                    DB::raw('0 as ReducedPrice'),
                    DB::raw('NULL as Offer'),
                    DB::raw('products.status as Status'),
                    DB::raw('1 as ShowInThisList'),
                    DB::raw('product_web_categories.code as WebCategoryID'),
                    DB::raw('NULL as WebCategoryIDSequence'),
                    DB::raw('NULL as WebAlt1CategoryID'),
                    DB::raw('NULL as WebAlt1CategoryIDSequence'),
                    DB::raw('NULL as WebAlt2CategoryID'),
                    DB::raw('NULL as WebAlt2CategoryIDSequence'),
                    DB::raw('NULL as WebAlt3CategoryID'),
                    DB::raw('NULL as WebAlt3CategoryIDSequence'),
                    DB::raw('NULL as WebAlt4CategoryID'),
                    DB::raw('NULL as WebAlt4CategoryIDSequence'),
                    DB::raw('NULL as WebBrandSequence'),
                    DB::raw('NULL as ColourCode'),
                    DB::raw('NULL as ModelWears'),
                    DB::raw('NULL as ModelHeight'),
                    DB::raw('0 as HideBuyButton'),
                ])
                ->leftJoin('product_variants', 'product_variants.product_id', '=', 'products.id')
                ->leftJoin('product_variant_prices', 'product_variant_prices.variant_id', '=', 'product_variants.id')
                ->leftJoin('product_groups', 'product_groups.id', '=', 'products.group_id')
                ->leftJoin('product_web_categories', 'product_web_categories.id', '=', 'products.web_category_id')
                ->leftJoin('product_colours', 'product_colours.id', '=', 'products.colour_id')
                ->leftJoin('product_brands', 'product_brands.id', '=', 'products.brand_id')
                ->leftJoin('product_genders', 'product_genders.id', '=', 'products.gender_id')
                ->distinct()
                ->get()
                ->each(function($sku) {
                   yield $sku;
                });

            $records->each(function ($sku) use ($xml) {
                $stockKeeping = $xml->addChild('item');
                foreach ($sku as $key => $value) {
                    $stockKeeping->addChild($key, htmlspecialchars($value));
                }
            });

            return $xml->asXML();
        });

        return response($records)->header('Content-Type', 'text/xml');

So just wondering if anyone has good ideas or know what I can improve with the above code to improve the data return speed.

Thanks

0 likes
6 replies
ftiersch's avatar

How big is the actual response and how fast is the connection?

If you're trying to transfer 10MB of data through a very slow connection you'll wait a couple minutes and unfortunately can't really change that (except for zipping the response for example).

davorminchorov's avatar

Why do you need to return all of records? Why not just paginate the data?

manny1979's avatar

@FTIERSCH - This is a medium sized AWS instance and the connection is pretty fast (cant be sure of the speed)

manny1979's avatar

@RUFFLES - Unfortunately this is something that cannot be changed as the company consuming the data is unwilling to update their current architecture to retrieve the data in a paginated form.

ftiersch's avatar

@EOKORIEFA - Okay, I guess AWS has a connection that should be appropriate ;-)

Have you tried to check where the actual performance bottleneck is?

Also (just out of curiosity):

What are you doing with the yield $sku part? I haven't used yield before but if I understand the code correctly it shouldn't do much here, does it?

manny1979's avatar

@FTIERSCH - Anyway to resolve his issue, I have decided to dump the results a file which will be uploaded to an Amazon S3 container. The file can ten be downloaded by the third party that required the file.

In regards to the "yield" functionality, this was me just testing things and trying to understand the functionality. It has since been removed though :)

Thanks for the assist ...

Please or to participate in this conversation.