webrobert

storing base64 pdfs... until we get the final file

Ok, back on my contract project. We are now building the document, sending it to a third party (DocuSign) where it gets signed/executed, and then saving the final PDF to a disk. The format required for the third-party send is a base64-encoded PDF.

What I am thinking is that as each document is made (by dispatching a job), we save the base64 PDF to a column, so that when we are ready to send the document set to the third party, we don't have to generate 10 PDFs. Once the document set is sent, we just wait for a webhook from the third party confirming the document has been executed. Then we clear the base64 column and add the file path.

// documents columns
  "ps_transaction_id" => 1
  "name" => "Standby loan agreement"
  "slug" => "standby-loan-agreement"
  "values" => [] // don't judge me.
  "base64" => "" 
  "path" => "" 
  ...

I don't expect we'll ever have more than 20-50 documents in base64. Is that ok? Is there a more efficient, better, healthier way [to store those base64 PDFs]?
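For illustration, the column-based workflow described above could be sketched roughly like this. It is written in Python with an in-memory SQLite database only so it runs on its own; the table mirrors the columns listed above, while the helper names (`open_db`, `store_draft`, `payloads_for_send`, `mark_executed`) and the example file path are hypothetical, not part of the actual project.

```python
import base64
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a documents table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE documents (
               id INTEGER PRIMARY KEY,
               ps_transaction_id INTEGER NOT NULL,
               name TEXT NOT NULL,
               slug TEXT NOT NULL,
               base64 TEXT,   -- filled while the document waits to be sent/signed
               path TEXT      -- filled once the executed PDF is on disk
           )"""
    )
    return conn


def store_draft(conn, transaction_id, name, slug, pdf_bytes):
    """Save a freshly generated PDF as base64 so it is ready for the send."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    cur = conn.execute(
        "INSERT INTO documents (ps_transaction_id, name, slug, base64) "
        "VALUES (?, ?, ?, ?)",
        (transaction_id, name, slug, encoded),
    )
    conn.commit()
    return cur.lastrowid


def payloads_for_send(conn, transaction_id):
    """Collect the stored base64 strings for one transaction, oldest first."""
    rows = conn.execute(
        "SELECT base64 FROM documents "
        "WHERE ps_transaction_id = ? AND base64 IS NOT NULL ORDER BY id",
        (transaction_id,),
    ).fetchall()
    return [row[0] for row in rows]


def mark_executed(conn, doc_id, path):
    """Webhook step: clear the base64 column and record the final file path."""
    conn.execute(
        "UPDATE documents SET base64 = NULL, path = ? WHERE id = ?",
        (path, doc_id),
    )
    conn.commit()
```

The point of the sketch is just that the base64 column is temporary: it is read once at send time and set to NULL when the webhook arrives.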

0 likes
11 replies
tisuchi

@webrobert It seems like a pragmatic solution. I have one concern though: the file size. Base64 encoding will definitely increase it.

You mentioned 20-50 documents. Do you know exactly what kind of content will be inside the files?

1 like
webrobert

@tisuchi they are text documents (legal docs), one to two pages each. I am converting them with Browsershot.

webrobert

interesting [in base64]...

  1. the first one page document... 87.49 KB

  2. the 2 page document 81.57 KB

  3. a small 1 page document 35.98 KB

What I mean is 20-50 documents in the whole database, and most are one page each. Each transaction is made up of 5-7 one-page documents and 1 two-page document, and no more than 5 transactions are stored as base64 at any given time.

tisuchi

@webrobert hmm.. it's interesting!!!

I would expect base64 to increase the file size. :)

1 like
webrobert

@tisuchi you're right, it does... the PDFs I have saved to disk are 63 KB, whereas the base64 version is 82 KB
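That ratio is roughly what base64 predicts: every 3 raw bytes become 4 encoded characters, so the output is about a third larger. A quick check in Python (the function name `base64_size` is just for this illustration, and the 63 KB input is the rounded figure quoted above):

```python
import base64
import math


def base64_size(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes of input."""
    return 4 * math.ceil(raw_bytes / 3)


raw = 63 * 1024  # a 63 KB PDF, using the rounded size from the post
encoded = base64_size(raw)

# Cross-check the formula against the real encoder on dummy data.
assert encoded == len(base64.b64encode(bytes(raw)))

print(f"{encoded / 1024:.2f} KB")  # prints "84.00 KB"
```

84 KB versus the 82 KB measured is within the rounding of the 63 KB figure. At these sizes, even 50 documents at the largest size reported above (87.49 KB) add up to a little over 4 MB.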

jlrdw

@webrobert Rather than use MySQL for this, have you considered MongoDB just for the PDF storage, linked of course to an id in the MySQL database? Just a thought.

1 like
jlrdw

@webrobert well, when you delete the base64, the MySQL database is still bloated (the freed space stays behind as empty space). Whereas in MongoDB, you can just leave them as is, in base64.

In InnoDB (MySQL), the only way I have been able to compact the data (after deleting a large number of records) is to dump the data, create a new database, and import it. In MySQL I try not to store base64.

But again, these are just my thoughts on it. It's no big deal if you clean out the bloat now and then anyway.

1 like
Tray2
Best Answer

@webrobert Storing CLOBs/BLOBs in the database is not a good solution in my opinion. It will increase the table size, and if you do a lot of inserts and deletes, the table files will become fragmented on disk and will have large areas of unused space inside the table.

A better solution would be to store the files in a drafts directory, and then move them to a final directory when they are finalized.

Using MongoDB as @jlrdw suggests is another option, but I would stick with using the file system.
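A minimal sketch of the drafts-then-final idea, again in Python just so it is runnable on its own. The directory names `drafts` and `final` and the helpers `save_draft`, `encode_for_send`, and `finalize` are assumptions for the example; in the real app the same steps would go through whatever filesystem layer the project already uses.

```python
import base64
import shutil
import tempfile
from pathlib import Path


def save_draft(root: Path, slug: str, pdf_bytes: bytes) -> Path:
    """Write the generated PDF into the drafts directory and return its path."""
    drafts = root / "drafts"
    drafts.mkdir(parents=True, exist_ok=True)
    path = drafts / f"{slug}.pdf"
    path.write_bytes(pdf_bytes)
    return path


def encode_for_send(path: Path) -> str:
    """Base64-encode the stored PDF only at the moment it is sent."""
    return base64.b64encode(path.read_bytes()).decode("ascii")


def finalize(root: Path, draft_path: Path) -> Path:
    """Move a draft into the final directory once the document is executed."""
    final_dir = root / "final"
    final_dir.mkdir(parents=True, exist_ok=True)
    target = final_dir / draft_path.name
    shutil.move(str(draft_path), str(target))
    return target


# Usage: draft -> encode for the third-party send -> move on the webhook.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    draft = save_draft(root, "standby-loan-agreement", b"%PDF-1.4 placeholder")
    payload = encode_for_send(draft)
    final_path = finalize(root, draft)
    print(final_path.relative_to(root).as_posix())
```

With this layout the database row only ever needs the `path` column, and the base64 string exists just long enough to be sent. Encoding a handful of small files at send time is cheap compared with regenerating the PDFs.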

1 like
