Mindexperiment

Advice for serving different models based on user input

Hey community,

I'm building a scraper that uses so-called "reading-models" to grab the data from pages.

The main use case is that a user pastes the source code of an HTML page, then chooses which reading-model to scrape the data with (from a drop-down list, maybe) and submits the request.

As you can imagine, there can be many reading-models, each grabbing a different type of data from a page.

I'm stuck on how to serve these many types of reading-models. How would you load the correct model selected by the user?

Every model is made of a Composite and some Values: the former is a container for data, the latter is the concrete value to scrape. Here is a basic example:


    // basic reading-model

    $links = new Composite('links', 'h3.resultItem-name');
    $links->addElement(new Value('link', 'a'));

    $model = new Composite('page-c', 'div#results');
    $model->addElement($links);

    // scrape
    $parser = new Parser($document);
    $parser->parse($model)->getValue(); // get all links inside div#results

I have many types of models for different pages, and I need to let users choose which type of model to use for the page they are scraping.

I have something in mind but first I want to hear what you say :D

Thank you

Mindexperiment

Here are some solutions I've thought of:

A - Simply use a config file with an array of key => value pairs that point to the different models. Low-cost solution.

<?php

return [
  'models' => [
    'model1' => '/path/to/model1',
    'model2' => '/path/to/model2',
    'model3' => '/path/to/model3',
  ],
];

Then let users select which model to use based on that array, and literally load (include) a PHP file containing the structure of the reading-model.
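A rough sketch of how option A could look, assuming each model file ends with a `return` statement that hands back the built model. The helper name `loadReadingModel` is made up for illustration, and a temporary file stands in for a real model file so the snippet runs on its own. The important part is that the user's choice is only ever used as a key into the config array, never as a path.

```php
<?php

// Hypothetical helper: look up the user's choice in the config whitelist
// and load the matching file. Unknown keys are rejected.
function loadReadingModel(array $config, string $key)
{
    $models = $config['models'] ?? [];

    if (!array_key_exists($key, $models)) {
        throw new InvalidArgumentException("Unknown reading-model: {$key}");
    }

    // Assumption: each model file ends with `return $model;`.
    return require $models[$key];
}

// Demo only: a temporary file plays the role of a real model definition.
$path = tempnam(sys_get_temp_dir(), 'model');
file_put_contents($path, "<?php return ['name' => 'page-c', 'selector' => 'div#results'];");

$config = ['models' => ['model1' => $path]];
$model  = loadReadingModel($config, 'model1');

unlink($path);
```

The keys of the same array can also feed the drop-down list, so the form and the loader never get out of sync.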

B - Build a factory and define a JSON/array-like schema that simplifies building the reading-model, then use a DB table to save the schemas and retrieve each reading-model from the database. High-cost solution.
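To give an idea of what option B might involve, here is a minimal sketch of such a factory. The schema format (`type`, `name`, `selector`, `elements`) and the `buildModel` function are invented for this example, and the `Composite` and `Value` classes are bare stand-ins that only mirror the constructor and `addElement()` calls shown in the question. It needs PHP 8.0+ for constructor promotion.

```php
<?php

// Stand-ins for the real classes; only what the question shows is assumed.
class Value
{
    public function __construct(public string $name, public string $selector) {}
}

class Composite
{
    public array $elements = [];

    public function __construct(public string $name, public string $selector) {}

    public function addElement($element): void
    {
        $this->elements[] = $element;
    }
}

// Hypothetical factory: recursively turns a schema array into objects.
function buildModel(array $schema)
{
    if (($schema['type'] ?? null) === 'value') {
        return new Value($schema['name'], $schema['selector']);
    }

    $composite = new Composite($schema['name'], $schema['selector']);
    foreach ($schema['elements'] ?? [] as $child) {
        $composite->addElement(buildModel($child));
    }

    return $composite;
}

// What a database column might hold (the format is made up for this sketch).
$json = '{"type":"composite","name":"page-c","selector":"div#results","elements":[
  {"type":"composite","name":"links","selector":"h3.resultItem-name","elements":[
    {"type":"value","name":"link","selector":"a"}]}]}';

$model = buildModel(json_decode($json, true, 512, JSON_THROW_ON_ERROR));
```

The result has the same shape as the hand-built model from the first post, so it could be passed to the parser in the same way.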

C - ??

Thanks

Sinnbeck

You can use resolve() to get the model:

$modelName = $request->model;
$model = resolve($modelName);
