Nael.Saeed's avatar

URL encoding

Hi guys, I need a way to encode a URL before saving it to the database, similar to how the Chrome browser does it: symbols such as : / - _ # and so on are left unencoded, unlike with rawurlencode(). Is there a package or function that does this? I have searched a lot but found nothing.

There is this solution, but it is a function written by a single developer, and it looks like there are still characters it encodes that Chrome leaves alone: https://stackoverflow.com/questions/4929584/encodeuri-in-php

bugsysha's avatar

Can you show us examples?

I know about rawurlencode() and urlencode().

Why do you need to match Chrome's encoding?

Nael.Saeed's avatar

@bugsysha thanks for your reply,

Well, I store URLs in the database. I need them to be encoded, especially URLs that contain Arabic and other non-English letters. Later, when I send these URLs to the frontend, I do not decode them, because the right thing to do is to put the encoded URL in the href, for example. But rawurlencode() encodes everything in the URL, so the encoded link does not work when a user clicks on it.

For example, https://example.com/postid#2 becomes https%3A%2F%2Fexample.com%2Fpostid%232
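
To make the difference concrete, here is a minimal sketch comparing PHP's two built-in encoders on that URL. The variable names are only for illustration.

```php
<?php
$url = 'https://example.com/postid#2';

// Both functions are meant for a single component (a path segment or a
// query value), so they also escape the URL's own delimiters.
$raw  = rawurlencode($url);
$form = urlencode($url);

echo $raw, PHP_EOL;  // https%3A%2F%2Fexample.com%2Fpostid%232
echo $form, PHP_EOL; // https%3A%2F%2Fexample.com%2Fpostid%232
```

The two only differ on a few characters such as the space, which rawurlencode() turns into %20 and urlencode() into +.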

Sinnbeck's avatar

A very simple hack would be to use mutators to base64 encode/decode the URL on save/load :)

public function setUrlAttribute($value)
{
    // Store the URL base64-encoded.
    $this->attributes['url'] = base64_encode($value);
}

public function getUrlAttribute($value)
{
    // Decode the stored value when the attribute is read.
    return base64_decode($value);
}
Nael.Saeed's avatar

@sinnbeck

Thanks for your reply,

This does not get me what I want. I could have used urlencode() and urldecode() instead:

public function setUrlAttribute($value)
{
    // Store the URL percent-encoded.
    $this->attributes['url'] = urlencode($value);
}

public function getUrlAttribute($value)
{
    // Decode the stored value when the attribute is read.
    return urldecode($value);
}

But this way the URL I use in the frontend is not encoded, as in:

<a href="http://example.com/لغة-عربية">a link with non English letters</a>

What I need is to save the URL to the database in an encoded form that still works as expected when it is used, still encoded, inside the HTML on the frontend.

Sinnbeck's avatar

So the browser is not able to send the URL to the backend as is? If you can send it to the backend intact and save it as base64, you can get it back like this. I am doing the very same thing for Chinese/Japanese characters in a project that uses an old database, and it works perfectly.

bugsysha's avatar

Please show better examples: what the URL is, how you want to save it in the database, and how you want it to be presented once it is retrieved from the database. Because if I understand what you are saying, you want to translate those non-English letters to English letters, right?

Sinnbeck's avatar

Ok here you go :)

$parts = parse_url($url);
$save_url = $parts['scheme'] . '://' . $parts['host'] . '/' . urlencode($parts['path']);
Sinnbeck's avatar

I took the exact example you posted. Please also show what that URL looks like in Chrome :)

Sinnbeck's avatar

Ok just tested it.

Here is an updated answer :)

$parts = parse_url($url);
$fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';
$save_url = $parts['scheme'] . '://' . $parts['host'] . urlencode($parts['path']) . $fragment;
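
One thing to watch: urlencode() applied to the whole path also escapes the slashes between segments. A minimal sketch that encodes each segment on its own is shown here. The helper name encodePathSegments is hypothetical, and the sketch assumes the path is not already percent-encoded (existing escapes would be encoded a second time) and leaves the query and fragment untouched.

```php
<?php
// Hypothetical helper, not part of any library.
function encodePathSegments(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not an absolute URL this sketch understands
    }

    // Encode every segment separately so the "/" separators survive.
    $segments = explode('/', $parts['path'] ?? '');
    $path = implode('/', array_map('rawurlencode', $segments));

    $port     = isset($parts['port']) ? ':' . $parts['port'] : '';
    $query    = isset($parts['query']) ? '?' . $parts['query'] : '';
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $parts['scheme'] . '://' . $parts['host'] . $port . $path . $query . $fragment;
}

echo encodePathSegments('https://example.com/a b/c#x'), PHP_EOL;
// https://example.com/a%20b/c#x
```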
Nael.Saeed's avatar

@sinnbeck

Chrome does not encode characters whose encoding would cause a problem like this, so it keeps the link as: https://www.techmeme.com/191209/p1#a191209p1

As I mentioned in my question above, this is the best solution I have found so far, but it is still not as good as it should be:

function encodeURI($url) {
    // http://php.net/manual/en/function.rawurlencode.php
    // https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/encodeURI
    $unescaped = array(
        '%2D'=>'-','%5F'=>'_','%2E'=>'.','%21'=>'!', '%7E'=>'~',
        '%2A'=>'*', '%27'=>"'", '%28'=>'(', '%29'=>')'
    );
    $reserved = array(
        '%3B'=>';','%2C'=>',','%2F'=>'/','%3F'=>'?','%3A'=>':',
        '%40'=>'@','%26'=>'&','%3D'=>'=','%2B'=>'+','%24'=>'$'
    );
    $score = array(
        '%23'=>'#'
    );
    return strtr(rawurlencode($url), array_merge($reserved,$unescaped,$score));

}
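
One concrete gap, as a hedged sketch: because rawurlencode() also escapes "%", a URL that already contains percent-escapes gets encoded twice. The block repeats the function above in condensed form so it runs on its own. The wrapper encodeURIPreservingEscapes is a hypothetical name, and it simply assumes that every existing %XX sequence should be left as it is.

```php
<?php
// encodeURI() as posted above, with the three lookup tables merged.
function encodeURI(string $url): string
{
    $keep = [
        '%2D' => '-', '%5F' => '_', '%2E' => '.', '%21' => '!', '%7E' => '~',
        '%2A' => '*', '%27' => "'", '%28' => '(', '%29' => ')',
        '%3B' => ';', '%2C' => ',', '%2F' => '/', '%3F' => '?', '%3A' => ':',
        '%40' => '@', '%26' => '&', '%3D' => '=', '%2B' => '+', '%24' => '$',
        '%23' => '#',
    ];
    return strtr(rawurlencode($url), $keep);
}

// Hypothetical wrapper: keep existing %XX escapes, encode everything else.
function encodeURIPreservingEscapes(string $url): string
{
    $parts = preg_split('/(%[0-9A-Fa-f]{2})/', $url, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $i => $part) {
        // Even indexes are plain text; odd indexes are the captured escapes.
        if ($i % 2 === 0) {
            $parts[$i] = encodeURI($part);
        }
    }
    return implode('', $parts);
}

echo encodeURI('https://example.com/a%20b'), PHP_EOL;
// https://example.com/a%2520b
echo encodeURIPreservingEscapes('https://example.com/a%20b'), PHP_EOL;
// https://example.com/a%20b
```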
bugsysha's avatar

@nael.saeed I never knew that you can have such issues with language specific links. Anyway I think that you should go with the solution you've posted and improve it. You know much more about this topic than I do. I don't think there is some function that will do what you want it to do so you have to tinker till you find it.

Sinnbeck's avatar

Just found a possible solution that does exactly as you want. Hope it works :)

preg_replace_callback('/[^\x20-\x7f]/', function($match) {
    return urlencode($match[0]);
}, $url);
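
For reference, a sketch of what this snippet produces on the Arabic example from earlier in the thread. The wrapper name encodeNonAscii is hypothetical. Note that the character class only matches bytes outside the printable ASCII range, so spaces and any other ASCII characters pass through unchanged, which may or may not be what is wanted.

```php
<?php
// Hypothetical wrapper around the snippet above.
function encodeNonAscii(string $url): string
{
    // Without the /u modifier the pattern matches single bytes, so each
    // byte of a multi-byte UTF-8 character is percent-encoded in turn.
    return preg_replace_callback('/[^\x20-\x7f]/', function ($match) {
        return urlencode($match[0]);
    }, $url);
}

echo encodeNonAscii('http://example.com/لغة-عربية'), PHP_EOL;
// http://example.com/%D9%84%D8%BA%D8%A9-%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9

echo encodeNonAscii('https://example.com/a b'), PHP_EOL;
// https://example.com/a b   (the space is left as is)
```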
Sinnbeck's avatar

Did the last example work out for you? I have tested it with several urls and the output seems exactly what you are looking for :)

Nael.Saeed's avatar

No, it did not. Could you please share the result you get from applying it to one of the examples we talked about earlier?

Nael.Saeed's avatar
Best Answer

Well, this is the best way I have found so far: use the League URI packages to parse the URL and break it into components, then encode each component in the way that is right for its type. At the end, the encoded components are joined together to form the encoded URL.

use League\Uri\Components\Fragment;
use League\Uri\Components\Host;
use League\Uri\Components\Path;
use League\Uri\Components\Port;
use League\Uri\Components\Query;
use League\Uri\Components\Scheme;
use League\Uri\Components\UserInfo;
use League\Uri\Uri;

// Method name is assumed; the original post left it out.
public function encodeUrl(string $url)
{
    // Create a new League\Uri\Uri instance.
    $uri = Uri::createFromString($url);

    // Break the URI into its components.
    $scheme = Scheme::createFromUri($uri);
    $userInfo = UserInfo::createFromUri($uri);
    $host = Host::createFromUri($uri);
    $port = Port::createFromUri($uri);
    $path = Path::createFromUri($uri);
    $query = Query::createFromUri($uri);
    $fragment = Fragment::createFromUri($uri);

    // Build an array of the encoded components, converted to strings.
    $components = [
        'scheme' => optional($scheme)->getContent(),
        'user' => optional($userInfo)->getUser(),
        'pass' => optional($userInfo)->getPass(),
        'host' => optional($host)->getContent(),
        'port' => optional($port)->toInt(),
        'path' => optional($path)->getContent(),
        'query' => optional($query)->getContent(),
        'fragment' => optional($fragment)->getContent()
    ];

    // Create an encoded URI from the encoded components.
    $encodedUri = Uri::createFromComponents($components);

    // Compose the URI from its components and return it as one string.
    return optional($encodedUri)->jsonSerialize();
}

This solution also handles non-English domain names, which are encoded the right way into a form starting with "xn--", for example: موقع.com
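
For the domain part on its own, a rough sketch using PHP's intl extension rather than the League packages is shown here. The helper name hostToAscii is hypothetical, and the sketch falls back to returning the host unchanged when intl is not loaded or the conversion fails.

```php
<?php
// Hypothetical helper; real conversion needs the intl extension.
function hostToAscii(string $host): string
{
    if (!function_exists('idn_to_ascii')) {
        return $host; // intl not loaded: return the host unchanged
    }

    // Convert a Unicode host name to its ASCII ("xn--") form.
    $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

    return $ascii === false ? $host : $ascii;
}

// With intl available, the Arabic label comes back in its "xn--" form.
echo hostToAscii('موقع.com'), PHP_EOL;
```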

I would be glad to hear any thoughts on this solution. Thanks to all participants.