twopiers's avatar

Eloquent save converting unicode characters

I have an app that pulls some information from an API and simply saves it (via an Eloquent model) to MySQL. I'm running into an issue when a string from the API contains an en dash character. Somewhere along the way (I don't know if this is an Eloquent or PHP or MySQL thing), the en dash character gets converted to something else.

Some code:

//lookup value via API and store in $response
//for this example, $response['location'] = 'Bordeaux–Mérignac Airport'
//the hyphen is an en dash character
$location = $response['location'];

dd($location); // 'Bordeaux–Mérignac Airport'

So far, the en dash is just fine. If I copy/paste this value into a unicode converter tool (like this one), it says the en dash character's hex value is

U+2013

which is correct (I think). Then I save:

$some_record = App\SomeModel::find(22); //grab my model
$some_record->location = $location;
$some_record->save();

dd($some_record->location); // 'Bordeaux–Mérignac Airport'

Now the en dash is gone, replaced with a large space character. If I copy/paste the value into a unicode converter tool and convert it, it says the character is now

U+0096

which is something called a START OF GUARDED AREA character.
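
For readers hitting the same symptom, here is a minimal sketch of one way U+2013 can turn into U+0096. It assumes (this is not confirmed anywhere in the thread) that the text is converted to Windows-1252 at some point and that those bytes are later read back as ISO-8859-1. The helper name codePoints is made up for illustration, and the mbstring extension is required.

```php
<?php

// Return each character of a UTF-8 string as a "U+XXXX" code point label.
// (codePoints is a hypothetical helper, not part of PHP or Laravel.)
function codePoints(string $text): array
{
    return array_map(
        fn (string $char): string => sprintf('U+%04X', mb_ord($char, 'UTF-8')),
        mb_str_split($text, 1, 'UTF-8')
    );
}

// An en dash followed by an e-acute, written with escapes so the
// encoding of this source file does not matter.
$original = "\u{2013}\u{E9}";

// Assumed step 1: something converts the UTF-8 text to Windows-1252,
// where the en dash is the single byte 0x96.
$asCp1252 = mb_convert_encoding($original, 'Windows-1252', 'UTF-8');

// Assumed step 2: those bytes are read back as ISO-8859-1 (Latin-1),
// where 0x96 is the C1 control character START OF GUARDED AREA.
$mangled = mb_convert_encoding($asCp1252, 'UTF-8', 'ISO-8859-1');

echo implode(' ', codePoints($original)), PHP_EOL; // U+2013 U+00E9
echo implode(' ', codePoints($mangled)), PHP_EOL;  // U+0096 U+00E9
```

Note that the é survives this round trip while the en dash does not, which matches what the post describes. Running codePoints on the value before and after save() is also a quicker check than pasting into a converter tool.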

The MySQL table's charset is set to 'utf8mb4' and the collation is set to 'utf8mb4_unicode_ci'.

I'm clearly doing something wrong or forgetting a step somewhere. I've reached the limit of my knowledge on this. Any advice would be most welcome.

Thanks for listening.

laracoft's avatar

VARCHAR and TEXT columns have their own charset and collation. Make sure they are set to utf8mb4. :)

twopiers's avatar

Yeah, should've posted that. Sorry.

location is a varchar field. Its collation is 'utf8mb4_unicode_ci' and its charset is 'utf8mb4'.

laracoft's avatar

Can it be text? I default to text and only use varchar when forced to. With text columns, I have been able to store all sorts of characters, including emojis, with utf8mb4.

twopiers's avatar

Thank you for the suggestion. I changed the field type to text and tried it again. Same result. Bummer.

newbie360's avatar

what about

CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
laracoft's avatar

Check whether the mysql settings in config/database.php match your database's actual values.

twopiers's avatar

mysql settings in config/database.php

'mysql' => [
    'driver' => 'mysql',
    'url' => env('DATABASE_URL'),
    'host' => env('DB_HOST', '127.0.0.1'),
    'port' => env('DB_PORT', '3306'),
    'database' => env('DB_DATABASE', 'forge'),
    'username' => env('DB_USERNAME', 'forge'),
    'password' => env('DB_PASSWORD', ''),
    'unix_socket' => env('DB_SOCKET', ''),
    'charset' => 'utf8mb4',
    'collation' => 'utf8mb4_unicode_ci',
    'prefix' => '',
    'prefix_indexes' => true,
    'strict' => true,
    'engine' => null,
    'options' => extension_loaded('pdo_mysql') ? array_filter([
        PDO::MYSQL_ATTR_SSL_CA => env('MYSQL_ATTR_SSL_CA'),
    ]) : [],
],
laracoft's avatar

If you paste 'Bordeaux–Mérignac Airport' via another client, does it save correctly?

twopiers's avatar

First, THANK YOU for taking time out of your day to help me. I love the Laravel community.

Just ran a couple of tests...

If I paste 'Bordeaux–Mérignac Airport' directly into the location field of my MySQL client (TablePlus on Mac), it saves just fine.

Here's the weird(er) part: I wanted to see if Eloquent was doing anything to the string to prep it prior to saving to MySQL, so I created a very simple web form to edit the SomeModel location field. When presented in the browser, the location field text input element contains the large space character (as expected). If I highlight that and then type an en dash character (option-dash on the Mac), then save, the en dash saves perfectly.

public function update(SomeModel $someModel, Request $request)
{
    $input = $request->validate([
        'location' => 'required|string|max:255'
    ]);

    $someModel->update($input);
}

So, I don't think it's a MySQL collation or charset issue, and I don't think it's an Eloquent issue.

Now I'm really stumped.

newbie360's avatar

As far as I know, if the field is set to unique and will contain anything other than digits or English text, you should use

$table->string('name')->collation('utf8mb4_bin')->unique();

For example, with utf8mb4_unicode_ci, try inserting these two different names into the table; you will get a duplicate entry error:

森下えりか
森下エリカ

https://mathiasbynens.be/notes/mysql-utf8mb4

I don't know whether character-set-client-handshake could cause this problem too.

Is the .php file saved as UTF-8?

laracoft's avatar

@twopiers Your client test shows it is not a MySQL issue. So that leaves only Laravel and PHP.

  1. If you write a raw query, does it save correctly? Be sure to paste Bordeaux–Mérignac Airport from here, the dash in question is – and not a typical - (hyphen/minus on our keyboard).
  2. If raw query works, then only Laravel remains as the sole suspect.
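
The raw-query test in the steps above could be sketched as follows. This is an illustration only: the function name rawRoundTrip, the table name some_models and the column name are assumptions, since the thread only names the model. It uses plain PDO so it runs outside Laravel.

```php
<?php

/**
 * Write $value with a raw parameterised UPDATE, read it back, and report
 * whether the bytes survived the round trip unchanged.
 *
 * rawRoundTrip is a hypothetical helper. $table and $column are
 * interpolated into the SQL, so only pass trusted identifiers.
 */
function rawRoundTrip(PDO $pdo, string $table, string $column, int $id, string $value): array
{
    $update = $pdo->prepare("UPDATE {$table} SET {$column} = :value WHERE id = :id");
    $update->execute(['value' => $value, 'id' => $id]);

    $select = $pdo->prepare("SELECT {$column} FROM {$table} WHERE id = :id");
    $select->execute(['id' => $id]);
    $stored = (string) $select->fetchColumn();

    return [
        'sent_hex'   => bin2hex($value),  // bytes handed to the driver
        'stored_hex' => bin2hex($stored), // bytes that came back
        'identical'  => $stored === $value,
    ];
}
```

Inside a Laravel app, the PDO handle can be obtained with DB::connection()->getPdo(), for example rawRoundTrip(DB::connection()->getPdo(), 'some_models', 'location', 22, $location). If 'identical' is true here but the Eloquent save still mangles the value, the raw query path is ruled out.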
rahman23's avatar

I know it's late, but I think this will help in the future.

First of all, this is Eloquent's doing, not the database's. If you insert manually, it works normally.

The problem is that Eloquent calls json_encode without the JSON_UNESCAPED_UNICODE flag when it serializes attributes cast to array. Without this flag, multibyte characters are converted to \uXXXX escape sequences. To solve this, use mutators:

use Illuminate\Database\Eloquent\Model;

class Order extends Model
{
    protected $fillable = [
        'name',
        'equipments',
        'documents',
    ];

    protected $casts = [
        'id'                => 'integer',
        'equipments'        => 'array',
        'documents'         => 'array',
    ];

    public function setEquipmentsAttribute($value)
    {
        $this->attributes['equipments'] = json_encode($value, JSON_UNESCAPED_UNICODE);
    }

    public function setDocumentsAttribute($value)
    {
        $this->attributes['documents'] = json_encode($value, JSON_UNESCAPED_UNICODE);
    }
}

So, now it will work normally. No need to thank ^)
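
To make the difference concrete, here is a small standalone sketch of what json_encode produces with and without JSON_UNESCAPED_UNICODE. The payload is made up for illustration. This only concerns values that pass through json_encode, such as attributes cast to array; whether it applies to a plain string column like the one in the original question is not established in this thread.

```php
<?php

// Illustrative payload; escapes are used so the encoding of this
// source file does not matter.
$payload = ['location' => "Bordeaux\u{2013}M\u{E9}rignac Airport"];

// Default behaviour: non-ASCII characters become \uXXXX escapes.
$escaped = json_encode($payload);

// With the flag: characters are written as raw UTF-8.
$unescaped = json_encode($payload, JSON_UNESCAPED_UNICODE);

echo $escaped, PHP_EOL;   // {"location":"Bordeaux\u2013M\u00e9rignac Airport"}
echo $unescaped, PHP_EOL; // {"location":"Bordeaux–Mérignac Airport"}

// Both forms decode back to the same PHP array.
var_dump(json_decode($escaped, true) === json_decode($unescaped, true)); // bool(true)
```

The escaped form is lossless when decoded, so the flag mainly matters when the stored JSON is read or searched directly in the database.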
