BENderIsGr8te's avatar

Lumen for Mass File Uploads

Background

I've only been developing for about 5 years, and like many PHP devs I started off procedurally. Then I learned about OOP in PHP, then I noticed CodeIgniter, which didn't seem as overcomplicated as Symfony, CakePHP, or Yii, and I used it for a few projects. But I quickly felt like I was having to add a ton of stuff that seemed pretty standard, so I started looking for a more feature-rich framework that didn't have too much of a learning curve.

Thanks to the PHPAcademy YouTube videos, from which I have learned a lot, I noticed Laravel. They cautioned that it has a steep learning curve compared to CI, but that once you get rolling it's downhill and not too bad. So I gave it a shot.

I did not have any formal training in programming; I taught myself everything through books, blogs, videos, and practice. Laravel was a HUGE step above my head, but everything looked more like the stuff I see in some other languages, so I figured it would make me a better developer (and it has).

So What is a MicroFramework?

Having said all that, I had never looked into a "micro framework" before. It wasn't until @TaylorOtwell released Lumen that I even knew they existed. I understand some of the practical uses, but not fully.

So, here is my question...

I am developing an app for people to advertise rental properties on. One thing that landlords love is pictures, so I have built a multiple-file uploader (similar to Facebook's) that allows either drag & drop or just selecting multiple files. Each file has its own progress percentage meter and, once uploaded (and stored on Amazon S3), is displayed for the user.

Right now we are only in 1 city and have about 2,000 ads. As we grow to national coverage we expect hundreds of thousands of ads with pictures.

Would Lumen be a good way to handle some of these Ajax requests, so as to keep the main app from becoming sluggish with dozens of HTTP requests per upload set?

I've read about Lumen for incoming requests, and I think I will use it for that with a handful of external services I use. I am just trying to see if there are other ways to make my app quicker and more efficient. The photo uploading accepts a file up to 20MB in size, re-sizes it (black-boxing it if necessary to maintain a specific aspect ratio), pushes it to Amazon S3, and then stores the S3 location and information about each file in our MySQL database. It goes great on my dev server, but I feel that in production it's going to be a bottleneck for the site.
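To make the resize-and-black-box step concrete, here is a minimal sketch of just the arithmetic, assuming PHP 7+ and a fixed target canvas. The `letterbox` function name and the 1200x800 canvas are made up for illustration and are not taken from the actual app.

```php
<?php
// Hypothetical helper: fit a source image inside a fixed-ratio canvas,
// padding ("black-boxing") whichever dimension comes up short.
function letterbox(int $srcW, int $srcH, int $canvasW, int $canvasH): array
{
    // Scale by the smaller ratio so the whole image fits inside the canvas.
    $scale = min($canvasW / $srcW, $canvasH / $srcH);
    $w = (int) round($srcW * $scale);
    $h = (int) round($srcH * $scale);

    return [
        'width'  => $w,
        'height' => $h,
        // Offsets centre the scaled image; the leftover space becomes the black bars.
        'x' => intdiv($canvasW - $w, 2),
        'y' => intdiv($canvasH - $h, 2),
    ];
}

// A portrait 3000x4000 photo placed on a 1200x800 canvas.
$box = letterbox(3000, 4000, 1200, 800);
```

Here the photo is scaled to 600x800 and placed 300 pixels in from the left, so the bars fall on the left and right sides. The actual pixel work (GD, Imagick, or a package) would use these numbers as the destination rectangle.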

So, is Lumen a good choice for that? Is a separate Laravel install a good choice for that? Should I just use my existing Laravel and have it run like normal?

Thanks in advance for your considerations.

toniperic's avatar

Hey @BenderIsGreat, and welcome to Laracasts.

What Taylor tried to achieve with Lumen is basically a framework that's not bloated with code that might never be used throughout an app's lifespan. For instance, if you're building a RESTful API, you perhaps won't ever need the Blade templating system, CSRF token validation, sessions, etc.

So, by getting rid of all the stuff that is unnecessary for such projects, you use fewer resources and your app delivers responses significantly faster.

As for your situation: if I've understood it correctly, it all works fine now, but you're a bit worried about what might happen once the traffic gets higher?

Have you thought about having a server dedicated to processing images? Image processing is a time- and resource-consuming task. What you might do is set up a queueing system: once the user uploads a photo, just hand it off to another server that will do all the image processing, black-boxing, pushing to S3 and all that stuff. This way, once the user uploads photos, you can respond instantly and tell them the ad is being processed and will be published once it's ready. No matter how many requests you have, the jobs will be (and should be) stored and run somewhere else, as your app should return a response to the client as soon as possible.

Hope I could help.

BENderIsGr8te's avatar

As you can tell from my background, a lot of concepts were new to me once I switched to Laravel. Queues are a good example of something I have never used before. On my CI apps I just created a cron job that ran every minute and checked for certain things it needed to do based on a MySQL table. I could then add things to the table and tell it whether they needed to run now or in the future (using a Unix timestamp of when they should run).

Queues sound like a much better way to achieve this and again take some of the work off the server. Queues are something I had planned on looking into in the future for handling my Incoming Notifications. In fact, I was thinking of setting up a Lumen install just to handle Queue jobs and run them all through there so they are off my main server.

Sounds like your suggestion of using a different server for processing all of the media files would be the best way to keep the app speedy. It only takes about 1 second per picture to do all the tasks it needs to (if it's a 20MB file with a huge resolution and high DPI), but if someone is uploading 30 pictures that big, that's 30 seconds. So imagine hundreds of people uploading dozens of pictures at the same time, and I can see the server starting to get bogged down.

Thanks for the suggestion.

toniperic's avatar

Queues are not really Laravel-specific, but Laravel makes leveraging them super easy. I have written a long and thorough post about queues, so continue reading if you wish to understand queues in general and specifically how they are used with Laravel.

What are queues in web development?

  • Queues allow you to defer/postpone the processing of a time-consuming task, such as sending an e-mail or processing an image, until a later time. This drastically speeds up web requests to your application, which returns a response and serves the client significantly faster than it would if the task ran synchronously.
  • The Laravel Queue component provides a unified API across a variety of different queue services. Some of them are hosted elsewhere, and some can be self-hosted, such as Beanstalkd, which I'll cover here.

How do they work behind the scenes?

  • Once you push a job onto the queue (from somewhere within your code, e.g. to send an email after a user has registered), all the client has to wait for is the job being pushed onto the queue service that is listening for jobs. The client doesn't have to wait for the job itself to finish.
  • The most trivial example would be a McDonald's employee working at the cash register: the cashier takes your order, but doesn't go and make the french fries and hamburgers for you. The cashier delegates the task to someone in the background (the people who are making hamburgers and french fries), returns to you immediately, and in the end delivers the french fries once they're ready.
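The cashier/kitchen split above can be sketched in plain PHP. This is only an illustration of the idea, using PHP's built-in `SplQueue` as a stand-in for a real queue service; `handleUpload` and `runWorker` are made-up names and none of this is Laravel's API.

```php
<?php
// Plain-PHP stand-in for a queue service; no Laravel APIs are used here.
$queue = new SplQueue();

// "Cashier": accept the order, push a job, and return right away.
function handleUpload(SplQueue $queue, string $file): string
{
    $queue->enqueue(['task' => 'process-image', 'file' => $file]);
    return "Accepted {$file}; processing in the background";
}

// "Kitchen": a worker drains the queue later, typically in a separate process.
function runWorker(SplQueue $queue): array
{
    $done = [];
    while (!$queue->isEmpty()) {
        $job = $queue->dequeue();
        // The slow work (resize, upload to S3, ...) would happen here.
        $done[] = $job['file'];
    }
    return $done;
}

$response = handleUpload($queue, 'kitchen.jpg');
handleUpload($queue, 'garden.jpg');
$processed = runWorker($queue);
```

In a real setup the queue lives outside the web process (database, beanstalkd, Redis, and so on), so the request that pushed the job has long since returned by the time the worker picks it up.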

Queue drivers in Laravel

There are several drivers for queues in Laravel. There are some differences between them, and I'll try to explain them as best I can. Usually the biggest difference is how the jobs are stored by the queueing service.

database driver
  • In order to use the database queue driver, you will need a database table to hold the jobs. To generate a migration to create this table, run the queue:table Artisan command.
  • For every job that you push onto the queue, it will get stored into the database. What this means is it uses I/O operations on the filesystem, which might not be the fastest thing in the world. It depends how fast you want your jobs to run.
    • pros: uses disk storage which is very cheap, thus meaning you could have a really huge number of jobs in the queue
    • cons: a lot slower than e.g. some driver that reads/writes to memory, as it has to read/write onto the filesystem
beanstalkd driver
  • In order to use this driver, you have to pull in the pda/pheanstalk ~3.0 package and have beanstalkd installed somewhere. You can install it on the same server as your Laravel app, or you can have a dedicated server only for listening and running queued jobs. Either one works.
    • pros: is an in-memory queue service, which means it's blazing fast. It's also free and you can host it yourself.
    • cons: well, eventually you run out of memory once there are too many jobs on the queue.
      • Hint: you can run Supervisor to restart the process if it goes down, or you can start beanstalkd with the -b option so that it writes all jobs to a binlog. If something odd happens, you can restart beanstalkd with the same option and it will recover the contents of the log.
  • Pretty much the same pros and cons apply to the redis driver as well.
iron driver
  • If you do not wish to bother installing and hosting a queue service such as beanstalkd, you can pay for an external one, such as iron.io. It basically does the same thing, except you don't have to worry about crashes, exceeding memory limits, etc.
    • pros: it's a very reliable solution. You don't have to worry about setting up anything.
    • cons: it costs money, for one. The queued job has to be pushed to the cloud and retrieved back from the cloud. Sometimes that takes time, so it's not the fastest thing in the world either.
sync driver
  • It is basically intended for development purposes. It just runs the queued jobs synchronously, as if you hadn't pushed them onto a queue service at all. It lets you develop against the familiar queue API; then, when you go to production, you should ideally only have to specify in the config which driver you want to use and everything should work automatically.
null driver
  • It just doesn't run the jobs you push to the queue using the API.

Listening for jobs

Once everything is set up, you have to run the queue:listen Artisan command in order to start listening for jobs. If you don't, the jobs will be pushed onto the queue but never actually executed.

You might also try running queue:work with the --daemon option, which forces the queue worker to continue processing jobs without ever re-booting the framework. This results in a significant reduction of CPU usage when compared to the queue:listen command, but at the added complexity of needing to drain the queues of currently executing jobs during your deployments.

Daemon queue workers do not restart the framework before processing each job. Therefore, you should be careful to free any heavy resources before your job finishes. For example, if you are doing image manipulation with the GD library, you should free the memory with imagedestroy when you are done.
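As a rough sketch of that clean-up, assuming the GD extension is installed: the `makeThumbnail` name and the blank in-memory source image are stand-ins for a real job and a real uploaded photo. On PHP 8.0 and later GD images are objects and imagedestroy no longer has any effect, so the sketch also drops the references.

```php
<?php
// Hypothetical job body: shrink an image and release GD memory before returning.
function makeThumbnail(int $srcW, int $srcH, int $thumbW, int $thumbH): array
{
    $source = imagecreatetruecolor($srcW, $srcH);   // stands in for the uploaded photo
    $thumb  = imagecreatetruecolor($thumbW, $thumbH);

    imagecopyresampled($thumb, $source, 0, 0, 0, 0, $thumbW, $thumbH, $srcW, $srcH);
    $size = [imagesx($thumb), imagesy($thumb)];

    // Release the images so a long-lived daemon worker doesn't accumulate memory.
    // Before PHP 8.0 imagedestroy() frees the resource explicitly; from 8.0 on,
    // GD images are objects that are freed once no references remain.
    if (PHP_VERSION_ID < 80000) {
        imagedestroy($source);
        imagedestroy($thumb);
    }
    unset($source, $thumb);

    return $size;
}
```

The same habit applies to any other large buffers a job builds up, since nothing resets the process between jobs when the worker runs as a daemon.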

Similarly, your database connection may disconnect when being used by a long-running daemon. You may use the DB::reconnect method to ensure you have a fresh connection.

Hope it was helpful.

BENderIsGr8te's avatar

Wow, thanks for the info. I was reading about Redis all morning today. I hadn't really read about it for queues so much as for simple persistent key/value storage. The article I was reading talked about using it to store, for example, view counters (and then persisting them to the database once a day, as opposed to persisting to the database on every view), or other items (such as config items) that need to be dynamic and changeable but are used on every request. I'm thinking of something like a "listing id" counter that is incremented on each use, so that it returns a unique listing ID that is different from the database ID.

The article talked more about leaderboards, or "recently added" types of information, where instead of having to run a select * query in SQL you could just store the IDs in a collection as they are added and then select those specific IDs from the database when needed.
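To show the shape of the "recently added" idea, here is a plain-PHP sketch that uses an array in place of a Redis list. With a real Redis client the two steps would map to the LPUSH and LTRIM commands; the `pushRecent` function name and the listing IDs are made up for illustration.

```php
<?php
// In-memory stand-in for a Redis list; with a real client the two steps
// below would correspond to the LPUSH and LTRIM commands.
function pushRecent(array $recent, int $id, int $limit): array
{
    array_unshift($recent, $id);            // like LPUSH: newest ID goes first
    return array_slice($recent, 0, $limit); // like LTRIM 0 (limit - 1): cap the list
}

$recent = [];
foreach ([101, 102, 103, 104] as $listingId) {
    $recent = pushRecent($recent, $listingId, 3);
}
// $recent now holds the three newest IDs, newest first; the rows themselves
// would then be fetched from MySQL with a WHERE id IN (...) query.
```

Keeping only IDs in the list keeps it tiny, and the database still stays the source of truth for the listing data itself.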

Again, not that familiar with it, but figured I would start with something simple and add to it if it looked like it would be a good solution.

Here's the main article I was reading: http://oldblog.antirez.com/post/take-advantage-of-redis-adding-it-to-your-stack.html

UhOh's avatar

mind sharing with us what packages you are using for your image processing? Tnx
