Esa's avatar
Level 1

Find unicode character class in PHP

I am having a hard time finding a way to get the Unicode category of a character.

List of Unicode categories: https://www.php.net/manual/en/regexp.reference.unicode.php

The function I want, in Python: https://docs.python.org/3/library/unicodedata.html#unicodedata.category

I just want the PHP equivalent of this Python function.

For example, calling a function x like this: x('-') should return Pd, because Pd (dash punctuation) is the category the hyphen belongs to.

Thanks.

Esa's avatar
Level 1

I couldn't find a built-in function that returns the category name directly, so I wrote this function:

<?php
// All 30 Unicode general categories, each listed once.
$UNICODE_CATEGORIES = [
        "Lu",
        "Ll",
        "Lt",
        "Lm",
        "Lo",
        "Mn",
        "Mc",
        "Me",
        "Nd",
        "Nl",
        "No",
        "Pc",
        "Pd",
        "Ps",
        "Pe",
        "Pi",
        "Pf",
        "Po",
        "Sm",
        "Sc",
        "Sk",
        "So",
        "Zs",
        "Zl",
        "Zp",
        "Cc",
        "Cf",
        "Cs",
        "Co",
        "Cn"
    ];

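// Returns the category of the first character of $char, or null if none matches.
function uni_category($char, $UNICODE_CATEGORIES) {
	foreach ($UNICODE_CATEGORIES as $category) {
		// The "u" modifier makes PCRE read $char as UTF-8 instead of single bytes;
		// "^" ties the test to the first character.
		if (preg_match('/^\p{'.$category.'}/u', $char))
			return $category;
	}
	return null;
}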
// call the function
print uni_category('-', $UNICODE_CATEGORIES); // prints Pd

This code works for me, I hope it helps somebody in the future :).
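
For anyone landing here later: if the intl extension happens to be installed, IntlChar::charType() returns the general category as an integer constant, which can be mapped to the two-letter names. Below is a rough sketch under that assumption, not a definitive implementation. The function name unicode_category is made up for this example, and when intl is missing it falls back to the same \p{..} probing as the function above.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (name invented for this sketch): returns the two-letter
// general category of the first character of $char, or null when the input is
// empty or not valid UTF-8.
function unicode_category(string $char): ?string
{
    // Take the first UTF-8 character; preg_match() returns false on invalid UTF-8.
    if ($char === '' || !preg_match('/^./us', $char, $m)) {
        return null;
    }
    $first = $m[0];

    // Assumption: the intl extension is loaded. If not, use the fallback below.
    if (class_exists('IntlChar')) {
        $map = [
            IntlChar::CHAR_CATEGORY_UNASSIGNED => 'Cn',
            IntlChar::CHAR_CATEGORY_UPPERCASE_LETTER => 'Lu',
            IntlChar::CHAR_CATEGORY_LOWERCASE_LETTER => 'Ll',
            IntlChar::CHAR_CATEGORY_TITLECASE_LETTER => 'Lt',
            IntlChar::CHAR_CATEGORY_MODIFIER_LETTER => 'Lm',
            IntlChar::CHAR_CATEGORY_OTHER_LETTER => 'Lo',
            IntlChar::CHAR_CATEGORY_NON_SPACING_MARK => 'Mn',
            IntlChar::CHAR_CATEGORY_ENCLOSING_MARK => 'Me',
            IntlChar::CHAR_CATEGORY_COMBINING_SPACING_MARK => 'Mc',
            IntlChar::CHAR_CATEGORY_DECIMAL_DIGIT_NUMBER => 'Nd',
            IntlChar::CHAR_CATEGORY_LETTER_NUMBER => 'Nl',
            IntlChar::CHAR_CATEGORY_OTHER_NUMBER => 'No',
            IntlChar::CHAR_CATEGORY_SPACE_SEPARATOR => 'Zs',
            IntlChar::CHAR_CATEGORY_LINE_SEPARATOR => 'Zl',
            IntlChar::CHAR_CATEGORY_PARAGRAPH_SEPARATOR => 'Zp',
            IntlChar::CHAR_CATEGORY_CONTROL_CHAR => 'Cc',
            IntlChar::CHAR_CATEGORY_FORMAT_CHAR => 'Cf',
            IntlChar::CHAR_CATEGORY_PRIVATE_USE_CHAR => 'Co',
            IntlChar::CHAR_CATEGORY_SURROGATE => 'Cs',
            IntlChar::CHAR_CATEGORY_DASH_PUNCTUATION => 'Pd',
            IntlChar::CHAR_CATEGORY_START_PUNCTUATION => 'Ps',
            IntlChar::CHAR_CATEGORY_END_PUNCTUATION => 'Pe',
            IntlChar::CHAR_CATEGORY_CONNECTOR_PUNCTUATION => 'Pc',
            IntlChar::CHAR_CATEGORY_OTHER_PUNCTUATION => 'Po',
            IntlChar::CHAR_CATEGORY_MATH_SYMBOL => 'Sm',
            IntlChar::CHAR_CATEGORY_CURRENCY_SYMBOL => 'Sc',
            IntlChar::CHAR_CATEGORY_MODIFIER_SYMBOL => 'Sk',
            IntlChar::CHAR_CATEGORY_OTHER_SYMBOL => 'So',
            IntlChar::CHAR_CATEGORY_INITIAL_PUNCTUATION => 'Pi',
            IntlChar::CHAR_CATEGORY_FINAL_PUNCTUATION => 'Pf',
        ];
        $type = IntlChar::charType($first);

        return $map[$type] ?? null;
    }

    // Fallback without intl: probe each category with a UTF-8 aware regex.
    $categories = [
        'Lu', 'Ll', 'Lt', 'Lm', 'Lo', 'Mn', 'Mc', 'Me', 'Nd', 'Nl',
        'No', 'Pc', 'Pd', 'Ps', 'Pe', 'Pi', 'Pf', 'Po', 'Sm', 'Sc',
        'Sk', 'So', 'Zs', 'Zl', 'Zp', 'Cc', 'Cf', 'Cs', 'Co', 'Cn',
    ];
    foreach ($categories as $category) {
        if (preg_match('/^\p{' . $category . '}/u', $first) === 1) {
            return $category;
        }
    }

    return null;
}

echo unicode_category('-'), "\n"; // Pd
echo unicode_category('é'), "\n"; // Ll
```

Mapping by constant name rather than by raw integer keeps the table readable and avoids depending on the numeric values. For recently assigned characters the two branches may disagree, since ICU and PCRE can ship different Unicode versions.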

Esa's avatar
Level 1

@Braunson I updated the thread with an answer, thanks for your time.