stebrl's avatar

Best practice?

So, I'm starting a new Laravel project for my company (Apple reseller) where I have to import lots of CSV files from different distributors, which include thousands of products. All these products need to be updated a few times everyday (in-stock counter, price, ...) with a cron job (task scheduler).

On a frontend I'm selecting product categories which I want to filter when importing these files. Also I have two buttons, one for exporting all filtered products to one CSV file for Shopware (e-commerce shop system), so I can import them once in that system, and the other button to manually update the products (again, in-stock count, price, etc.) in the external Shopware database.

When importing these CSV files, every distributor has his own "fields" (properties), so there also needs to be a schema or something like this where I can define which index of a CSV row represents which product property... and so on.

Now my question is: How would you structure the code of such a "complex" project to be as simple and fast as possible (in terms of runtime for thousands of datasets), what is the best approach in a situation like this? I would like to programatically add distributors in the future, and don't worry about the individual functionalities (Interface?).

I've already coded something before which works, but sadly didn't come up to my expectations, so I need a little help to completely refactor my horrible first try.

There has to be an elegant way.

Thank you for your help. :)

0 likes
4 replies
mdecooman's avatar

Hi @StefanBeierl

Well, your question is too fuzzy to be answered properly. However, here are some thoughts after sipping my morning coffee:

  • You are not talking about 1 application doing everything even the coffee, separate the concerns and treat them as 1 "service" at a time. I would do some drawings first. Being agile doesn't mean old fashion documentation and requirement elicitation is forbidden. It will help you to clarify some business rules and processes.

  • We are not savages (us the developers) so we do not import directly in production, you need to have a temp schema in between that will allow you to play with the data before updating the live DB so if your import fails... the app serving the client is still running.

  • You do not want your schema to be highly coupled with your CSV structures so a package like fractal will come handy.

Your questions:

  • Q1: How would you structure the code of such a "complex" project to be as simple and fast as possible (in terms of runtime for thousands of datasets), what is the best approach in a situation like this?

A: This is all about software development... :) Decide what can/(has to) be computed in backend and what can be delegated to front end...

  • Q2: I would like to programmatically add distributors in the future, and don't worry about the individual functionalities (Interface?).

A: That looks like more a negotiation issue than development issue first...

PS: your question is like asking existential questions in front of a blackhole and expecting getting a very precise answer from some quantum physics magic. :D

2 likes
martinbean's avatar

@StefanBeierl With CSVs from multiple providers being in different “shapes”, this is a good candidate for the Adapter pattern. You would import the CSV rows and then use an adapter per supplier to transfer the shape of that array into the columns and value your application is expecting.

At its simplest, it would look like this:

interface ProductInterface
{
    public function getProviderId();
    public function getName();
    public function getPrice();
}
// Imagine CSV has columns: id, title, unit_price
class SomeProviderCsvAdapter implements ProductInterface
{
    public function __construct(array $csv)
    {
        $this->csv = $csv;
    }

    public function getProviderId()
    {
        return $this->csv['id'];
    }

    public function getName()
    {
        // Name corresponds to title column
        return $this->csv['title'];
    }

    public function getPrice()
    {
        // Price may be formatted as pounds
        // Convert it to pence
        return $this->csv['unit_price'] * 100;
    }
}

So importing a CSV would look something like this, in pseudo-code:

$csv = loadCsvFromSomeProvider();

$product = new SomeProviderCsvAdapter($csv);

$product = Product::updateOrCreate([
    'provider_id' => $product->getProviderId(),
], [
    'name' => $product->getName(),
    'price' => $product->getPrice(),
]);

It’s just a simplified example, but should give you an idea. You can create an adapter class per provider, and have a console job per provider too, i.e.

  • products:import-from-provider-a
  • products:import-from-provider-b
  • products:import-from-provider-c

I’d also run the import within a transaction as like @mdecooman says, you don’t want your database left in an inconsistent state if an import fails part-way through.

1 like
EventFellows's avatar

check out the video about adapter pattern here on laracasts. it tells you what and how to do it.

Please or to participate in this conversation.