Extracting the most recent message from an email thread is tricky, because there is no universal standard for how email clients quote the original message in a reply. However, you can use heuristics to separate the new reply from the quoted original.
One approach is to look for common patterns used by email clients to separate the quoted message from the reply, such as lines starting with ">" or patterns like "On [date], [name] wrote:". You can then use regular expressions to split the email content at these points and extract the most recent message.
Here's a basic example of how you might implement this in PHP using the php-mime-mail-parser library:
<?php
require_once 'vendor/autoload.php';

use PhpMimeMailParser\Parser;

// Read the raw email piped to the script (assumes it arrives on STDIN)
$emailContent = file_get_contents('php://stdin');

$parser = new Parser();
$parser->setText($emailContent);

// Get the text and HTML bodies
$text = $parser->getMessageBody('text');
$html = $parser->getMessageBody('html');

// Return the part of $body that precedes the first quote marker
function cleanReply($body) {
    // Patterns are anchored to line starts (m flag) so that "On" or ">"
    // in the middle of a sentence does not count as a quote marker
    $patterns = [
        '/^On\b.{0,200}?\bwrote:\s*$/ms',         // English style
        '/^Le\b.{0,200}?\ba écrit\s*:\s*$/msu',   // French style
        '/^>/m',                                  // Lines starting with '>'
        // Add more patterns as needed
    ];

    // Keep the earliest match across all patterns,
    // not just the first pattern in the list that happens to match
    $cut = null;
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $body, $matches, PREG_OFFSET_CAPTURE)) {
            $offset = $matches[0][1];
            if ($cut === null || $offset < $cut) {
                $cut = $offset;
            }
        }
    }

    // If no pattern matched, return the whole body
    return $cut === null ? $body : rtrim(substr($body, 0, $cut));
}

// Clean the text part; the same patterns are only a rough fit for HTML,
// where quotes are usually wrapped in elements such as <blockquote>
// and cutting mid-document can leave unclosed tags
$cleanText = cleanReply($text);
$cleanHtml = cleanReply($html);

// Now you can store $cleanText and $cleanHtml as the most recent message
This is a very simplistic approach and may not work for all email clients or languages. You might need to add more patterns to the $patterns array to match the quoting styles of different email clients.
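As a rough sketch of what a broader pattern list might look like, the snippet below adds a German attribution line and two Outlook-style markers, and returns the earliest offset at which any of them matches. The name findQuoteOffset is hypothetical (it is not part of php-mime-mail-parser), and the German and Outlook patterns are assumptions about common formats, so check them against the mail you actually receive.

```php
<?php
// Hypothetical helper: returns the byte offset where quoted text appears
// to start, or null when none of the markers is found.
function findQuoteOffset(string $body): ?int
{
    // Assumed attribution/separator formats; extend or prune as needed.
    $patterns = [
        '/^On\b.{0,200}?\bwrote:\s*$/ms',               // English attribution line
        '/^Le\b.{0,200}?\ba écrit\s*:\s*$/msu',         // French attribution line
        '/^Am\b.{0,200}?\bschrieb\b.{0,200}?:\s*$/msu', // German attribution line
        '/^-{2,}\s*Original Message\s*-{2,}\s*$/mi',    // Outlook-style separator
        '/^From:\s.+\R(?:Sent|Date):\s/m',              // Outlook-style header block
        '/^>/m',                                        // first line starting with '>'
    ];

    $earliest = null;
    foreach ($patterns as $pattern) {
        // preg_match returns false on invalid UTF-8 with the u flag; skip those.
        if (preg_match($pattern, $body, $m, PREG_OFFSET_CAPTURE) === 1) {
            $offset = $m[0][1];
            if ($earliest === null || $offset < $earliest) {
                $earliest = $offset;
            }
        }
    }

    return $earliest;
}
```

A caller could then take `$offset === null ? $body : rtrim(substr($body, 0, $offset))` as the reply. Bounding the gap with `.{0,200}?` is a design choice to stop an "On" at the start of one paragraph from pairing with a "wrote:" much further down.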
Also, be aware that this method can be error-prone, especially with complex email threads or when users manually edit the quoted text. There is no perfect solution, and you may need to refine the regular expressions based on the actual emails you receive.
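One case where cutting at the first marker loses content is an inline reply, where the sender answers between quoted lines. A minimal sketch of an alternative, assuming quotes are marked with ">" and attribution lines end in "wrote:", is to filter line by line instead of truncating. The name stripQuotedLines is hypothetical, and this will not catch quotes that a client renders without ">" prefixes.

```php
<?php
// Hypothetical helper: keeps the sender's own lines from a reply whose
// answers may be interleaved with '>'-quoted lines.
function stripQuotedLines(string $body): string
{
    $lines = preg_split('/\R/', $body);
    $kept = [];

    foreach ($lines as $i => $line) {
        // Skip quoted lines.
        if (preg_match('/^\s*>/', $line)) {
            continue;
        }

        // Skip an attribution line ("... wrote:") only when the next
        // non-empty line is quoted, so ordinary sentences survive.
        if (preg_match('/wrote:\s*$/', $line)) {
            $next = $i + 1;
            while (isset($lines[$next]) && trim($lines[$next]) === '') {
                $next++;
            }
            if (isset($lines[$next]) && preg_match('/^\s*>/', $lines[$next])) {
                continue;
            }
        }

        $kept[] = $line;
    }

    // Collapse the runs of blank lines left behind by removed quotes.
    $text = preg_replace('/\n{3,}/', "\n\n", implode("\n", $kept));

    return trim($text);
}
```

Whether to truncate or filter depends on the mail you see: truncation is safer for top-posted replies, while filtering keeps answers written below or between quotes.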
If you're looking for a more robust solution, consider a library built specifically for this job. GitHub's email_reply_parser (written in Ruby) has a PHP port, available on Packagist as willdurand/email-reply-parser, whose EmailReplyParser::parseReply() method returns the visible reply text. Check that it is still maintained and that it handles the clients and languages you receive before relying on it.