Seeking architecture suggestions to handle data files for ETL & reporting
I build a lot of Laravel apps that parse source data, load it into a database, and then offer views of the data (tables, charts) and other reports (emailed CSVs, etc.).
I'm seeking high-level suggestions on more elegant, perhaps more 'Laravel-esque' ways to structure the collection, ETL & archiving of this data.
Currently I use an external script (bash/python, etc.) to scp data files to a local 'incoming' directory. Then a scheduled Laravel command moves them into a 'processing' directory and dispatches a job onto a database queue. This job reads in the file, parses it, and loads it into the DB per the appropriate model structure and its relationships. Finally the file is moved to an archive directory, and old files are purged from the archive by an external cron script. FYI, everything runs on CentOS.
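For context, the scheduled command side of that pipeline looks roughly like the sketch below (a minimal sketch, not my actual code; `ProcessIncomingFiles`, the `datafiles:process` signature, and the `ImportDataFile` job that does the actual parse/load/archive are placeholder names):

```php
<?php
// app/Console/Commands/ProcessIncomingFiles.php (hypothetical names)

namespace App\Console\Commands;

use App\Jobs\ImportDataFile;
use Illuminate\Console\Command;
use Illuminate\Support\Facades\File;

class ProcessIncomingFiles extends Command
{
    protected $signature = 'datafiles:process';
    protected $description = 'Move incoming data files to processing and queue an import for each';

    public function handle(): void
    {
        $incoming   = storage_path('app/incoming');
        $processing = storage_path('app/processing');

        foreach (File::files($incoming) as $file) {
            $target = $processing.'/'.$file->getFilename();

            // Move the file out of 'incoming' so the next scheduled run skips it,
            // then hand the parse/load/archive work to the queued job.
            File::move($file->getPathname(), $target);

            ImportDataFile::dispatch($target);
        }
    }
}
```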
All this file handling works, but it feels really kludgy and unsatisfying. I was wondering if I couldn't shoot the data files at the app over HTTP instead (curl with authentication?), and have Laravel receive the files, run the ETL, and archive them, along with event notifications (for errors) and a KPI dashboard for app health and performance.
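Something along these lines is what I had in mind (again just a sketch of the idea, assuming Sanctum token auth, a `/api/datafiles` route, and the same hypothetical `ImportDataFile` job as above):

```php
<?php
// routes/api.php — hypothetical authenticated upload endpoint.
// The external source would push files with something like:
//   curl -H "Authorization: Bearer <token>" -F "datafile=@export.csv" https://app.example.com/api/datafiles

use App\Jobs\ImportDataFile;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Route;

Route::middleware('auth:sanctum')->post('/datafiles', function (Request $request) {
    $request->validate(['datafile' => ['required', 'file']]);

    // Store the upload straight into the processing area and queue the import,
    // replacing the scp drop + scheduled move steps entirely.
    $path = $request->file('datafile')->store('processing');

    ImportDataFile::dispatch(storage_path('app/'.$path));

    return response()->json(['queued' => $path], 202);
});
```

Is that a sane direction, or is there a more idiomatic Laravel pattern for this kind of ingest?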
Ben