All options are probably fine. Use whatever you already know ;)
Also, if you're toying with AWS, it might be worth looking into Lambda. Basically a function you can fire for each document on AWS. No server needed ;)
Hiya all,
I am trying to figure out the best approach for the following problem. Once a day (occasionally twice, but mostly once) I need to convert 1200+ RTF documents to PDF.
For the time being we are using paid software for the conversion. Long story short, it starts up an MS Office instance, sends the document to a printer queue, and then picks up the PostScript file to convert it to PDF. It is not fast and sometimes fails randomly...
I was playing with LibreOffice on Ubuntu for the conversion and heck, it just works, and boy it works fast: around 1200 documents done within one minute...
But now I am looking at the best way to deploy this.
I have about three options:
I do not have any experience with options 1 and 2, and I do not feel like using option 3. I tested the listener option and the unoconv solution, but just running libreoffice in --headless mode is plain faster.
Does anyone have any good tips on how to do this, or any suggestions? Anything would be great!
Currently my test setup is just a bash script (where the argument is the directory where the RTF files are stored):
#!/bin/bash
shopt -s nullglob
baseDirectory="/home/user/test/conversion"
# Check that an input directory was passed as the first argument.
if [ -z "$1" ]
then
    echo "No input directory specified." >&2
    exit 1
fi
# Balance files into directories, each holding a maximum of 200 files.
# This is done because Linux has a shell limit of 249 with libreoffice.
rtfs=("$1"/*.rtf)
for ((i=0; i < ${#rtfs[@]}; i += 200)); do
    printf -v b "${baseDirectory}/balancer/%03d" $((++n))
    mkdir -p "$b" && mv -t "$b" "${rtfs[@]:$i:200}"
done
# Loop over the balance directories and let libreoffice do its magic in converting the RTF files to PDF.
# Each balance directory is removed after its conversion is done.
for d in "${baseDirectory}"/balancer/*; do
    id=${d: -3}
    soffice "-env:UserInstallation=file:///${baseDirectory}/environments/${id}" \
        --headless --convert-to pdf "$d"/*.rtf \
        --outdir "$1" > "${baseDirectory}/environments/output_${id}.log" 2>&1 && \
        rm -rf "$d" &
done
wait
echo "Conversion done"
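To make the balancing step in the script above a bit more concrete, here is a rough sketch of the arithmetic behind it. With 1200 files and batches of 200, the loop ends up with 6 balancer directories, so 6 soffice instances run in parallel. The helper names batch_count and batch_dir are made up for illustration and are not part of the script itself.

```shell
#!/bin/bash

# Hypothetical helper: how many batches are needed for a given
# number of files and batch size (integer ceiling division).
batch_count() {
    local total=$1 size=$2
    echo $(( (total + size - 1) / size ))
}

# Hypothetical helper: the zero-padded balancer directory for a
# 1-based batch index, using the same %03d naming as the script.
batch_dir() {
    local base=$1 index=$2
    printf '%s/balancer/%03d\n' "$base" "$index"
}

# Example calls:
#   batch_count 1200 200                      # prints 6
#   batch_dir /home/user/test/conversion 3    # prints /home/user/test/conversion/balancer/003
```

Since each batch gets its own soffice process and its own UserInstallation profile, the batch size also controls how many LibreOffice instances start at once, which may matter when picking a machine to deploy this on.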