
Atef95's avatar

Group regex patterns by values

I have the following example:

$str = "Jhon Doe 1990";
preg_match_all("/Jh(.*) D(.*) ([0-9]{4})/", $str, $matches);

// preg_match_all() returns one array per group, so take the first entry of each
echo $matches[1][0]; // on
echo $matches[2][0]; // oe
echo $matches[3][0]; // 1990

I want to group the results by regex pattern, so the output would look like this:

[
'(.*)' => ['on', 'oe'],
'([0-9]{4})' => ['1990']
]

How can I achieve this?

Thanks!

Sinnbeck's avatar

Or do you always get those 3 exact matches in that order?

$a = [
'(.*)' => [$matches[1], $matches[2]],
'([0-9]{4})' => [$matches[3]]
];
kokoshneta's avatar

I don’t think there is a way to do that. The regex pattern and the returned results are entirely separate and don’t know anything about each other. I don’t believe PHP offers a way to access the actual regex parsing of the pattern, or indeed any lower-level interaction between the pattern and the string.
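
To illustrate the point, here is a small sketch of what preg_match_all() actually hands back for the example from the question. The result array is keyed by group position (0 for the full match, then 1, 2, 3 in the order of the opening parentheses) and carries no trace of the source text of each group. Only the variable names from the question are used; nothing else is assumed.

```php
<?php
$str = "Jhon Doe 1990";
preg_match_all("/Jh(.*) D(.*) ([0-9]{4})/", $str, $matches);

// $matches is indexed by group number, not by the group's pattern text:
// [0] => full matches, [1] => first group, [2] => second group, [3] => third group
foreach ($matches as $group => $values) {
    echo $group, ' => ', implode(', ', $values), PHP_EOL;
}
```

Both (.*) groups come back under separate numeric keys (1 and 2), so any grouping "by pattern" has to be layered on top by your own code.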

Atef95's avatar

@kokoshneta

:( Do you have any other suggestions?

What I want to achieve is the following:

I'm storing values with wildcards in the DB (I call them templates).

A basic example of the values:

Jh(.*) D(.*) ([0-9]{4})
Monitor Network (.*)
([0-9]{4}) Networks

Jh(.*) D(.*) ([0-9]{4}) matches Jhon Doe 1990.

I then want to substitute the captured values into the rest of the templates, which would give:

Monitor Network on (or oe)
1990 Networks

Then I keep searching for relevant results with those, and so on.

kokoshneta's avatar

@Atef95 I don’t really understand what you’re trying to do. You’re saying the value in the database is Jh(.*) D(.*) ([0-9]{4}), right? That’s your ‘template’? Or are all three lines a single template? Are they all in the same cell in the database, or are they stored separately? Do you control the regex patterns in the templates? How do you know which pattern to choose from the search template into the replace template?

And how do you get Jhon Doe 1990 from the regex pattern? Where is the text you’re matching against coming from?

Atef95's avatar

@kokoshneta Sorry for the confusion!

They are all separate templates, so each one is a row in the table:

Jh(.*) D(.*) ([0-9]{4})
Monitor Network (.*)
([0-9]{4}) Networks

And yes, I control them: they are entered by the end user as regex values.

The first row is the parent template and the following ones are its dependencies.

The main idea: I have a monitors table, and each monitor has a name.

So instead of linking monitors to each other manually, the linking is done dynamically through these templates.

Let's say Jh(.*) D(.*) ([0-9]{4}) matches 4 results:

Jhon Doe 1990
Jhon Doe 1991
Jhon Doe 2000
Jhon Doe 2001

For each line above, I have to pick up the wildcard values and substitute them into the following template rows.

So for Jhon Doe 1990 I keep searching for:

Monitor Network (on or oe)
1990 Networks

For Jhon Doe 2000:

Monitor Network (on or oe)
2000 Networks

And so on, until the end of the template list.

kokoshneta's avatar

@Atef95 So, if I understand you correctly:

  • you have a collection of monitors, which have a parent/child-type relationship to each other (e.g., Jhon Doe 1990 may be the name of a parent monitor, which may have child/dependency monitors named something like Monitor Network on and 1991 Networks, for example)
  • your monitor names are ‘grouped’ so that a user can predict (their own?) monitor names
  • the user can input regex patterns that should match specific (their own?) parent monitors
  • you start out by using such a regex pattern to find parent monitors
  • for each matched parent monitor, you then want to match child monitors by performing a regex match on child monitor names, using the child templates (regex patterns) provided by the user, but substituting the actual grouped matches from the parent template

If that is indeed what you’re trying to do, it isn’t possible using just simple regex. Your users will have to do some of the work for you. The best option, I think, would be to use named subpatterns, ensuring that your users adhere to a tight naming scheme that you can then process programmatically in PHP.

Simple option

If you don’t absolutely need to support multiple, grouped matches (e.g., for on or oe to be replaced into the same slot in Monitor Network (on|oe)), it would be fairly straightforward:

  • You would instruct your users to use named subpatterns in the parent template and add references to those named subpatterns (in a syntax you define beforehand) in the child templates

Use (?<name>pattern) to define a subpattern in the parent template.

Use {{name}} to refer to named subpatterns in the child templates.

———

Example:

Parent – Jh(?<a>.*) D(?<b>.*) (?<year>\d{4})

Child – Monitor Network ({{a}}|{{b}}) or ({{year}}) Networks

  • In your backend code, you would then filter the array containing the matches from the parent template search to only retain the named matches, and then use another regex replacement to replace the references in the child templates with the relevant matches:
$parentMonitors = 'Jhon Doe 1990';
$parentTemplate = 'Jh(?<a>.*) D(?<b>.*) (?<year>\d{4})';
preg_match_all("/{$parentTemplate}/", $parentMonitors, $parentMatches);
// Named groups are also returned under numeric keys; keep only the named ones
$parentMatches = array_filter($parentMatches, "is_string", ARRAY_FILTER_USE_KEY);
// $parentMatches is now ['a' => ['on'], 'b' => ['oe'], 'year' => ['1990']]

$childTemplates = [
	'Monitor Network ({{a}}|{{b}})',
	'({{year}}) Networks'
];

// Loop by reference so the replaced template is written back into the array
foreach ($childTemplates as &$t) {
	// Match each {{name}} separately (\w+ cannot run past the closing braces)
	$t = preg_replace_callback('/\{\{(\w+)\}\}/', fn($m) => $parentMatches[$m[1]][0], $t);
}
unset($t);

// $childTemplates is now ['Monitor Network (on|oe)', '(1990) Networks']

(This is untested, so there are probably some errors in it here and there, but you get the gist of it.)
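
The same idea can be wrapped in a function and run once per parent monitor name, which is closer to the "Jhon Doe 1990 / 1991 / 2000 / 2001" case described earlier. This is only a sketch: resolveChildTemplates() is a hypothetical helper name, anchoring the parent pattern with ^ and $ is an assumption, and the templates are assumed not to contain the / delimiter. preg_quote() is added so that captured text is treated literally once it is pasted into a child regex.

```php
<?php
/**
 * Hypothetical helper: fill {{name}} references in child templates
 * with the named groups captured from one parent monitor name.
 * Returns null when the monitor name does not match the parent template.
 */
function resolveChildTemplates(string $parentTemplate, string $monitorName, array $childTemplates): ?array
{
    // Assumption: the whole monitor name must match, hence the ^...$ anchors.
    if (!preg_match('/^' . $parentTemplate . '$/', $monitorName, $m)) {
        return null;
    }

    // Keep only the named groups ('a', 'b', 'year'), dropping numeric keys.
    $named = array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);

    return array_map(
        fn (string $child) => preg_replace_callback(
            '/\{\{(\w+)\}\}/',
            // Escape the captured text; leave unknown references untouched.
            fn (array $ref) => isset($named[$ref[1]]) ? preg_quote($named[$ref[1]], '/') : $ref[0],
            $child
        ),
        $childTemplates
    );
}

$parentTemplate = 'Jh(?<a>.*) D(?<b>.*) (?<year>\d{4})';
$childTemplates = ['Monitor Network ({{a}}|{{b}})', '({{year}}) Networks'];

foreach (['Jhon Doe 1990', 'Jhon Doe 2001', 'Someone Else'] as $name) {
    $resolved = resolveChildTemplates($parentTemplate, $name, $childTemplates);
    echo $name, ': ', $resolved === null ? 'no match' : implode(' | ', $resolved), PHP_EOL;
}
```

For Jhon Doe 1990 this yields the child patterns Monitor Network (on|oe) and (1990) Networks, which can then be used to search the child monitor names.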

More complex options

If you do absolutely need to support grouping multiple patterns into one, you will need to instruct your users to make array-like subpattern names according to a syntax that you can then process to make actual arrays. Most of it would work as before, but there would be some added complexity. (I won’t go through that now in case you can do without it.)
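
For a rough idea of what that could look like, here is one possible scheme, not the only one. It assumes a purely hypothetical naming convention in which groups named base_number (for example n_1 and n_2) are collected into one list under base, and a {{base}} reference in a child template expands to an alternation of all collected values. The helper name groupNamedMatches() and the convention itself are illustrative assumptions.

```php
<?php
/**
 * Hypothetical convention: groups named "<base>_<number>" (e.g. n_1, n_2)
 * are collected into one list under "<base>"; other names become
 * single-element lists.
 */
function groupNamedMatches(array $matches): array
{
    $grouped = [];
    foreach ($matches as $key => $value) {
        if (!is_string($key)) {
            continue; // skip the numeric duplicates preg_match() adds
        }
        // Strip a trailing "_<digits>" suffix to get the base name.
        $base = preg_replace('/_\d+$/', '', $key);
        $grouped[$base][] = $value;
    }
    return $grouped;
}

preg_match('/Jh(?<n_1>.*) D(?<n_2>.*) (?<year>\d{4})/', 'Jhon Doe 1990', $m);
$grouped = groupNamedMatches($m);
// $grouped is now ['n' => ['on', 'oe'], 'year' => ['1990']]

// Each {{base}} becomes a non-capturing alternation of its collected values.
$child = preg_replace_callback(
    '/\{\{(\w+)\}\}/',
    fn (array $ref) => '(?:' . implode('|', array_map(
        fn (string $v) => preg_quote($v, '/'),
        $grouped[$ref[1]] ?? []
    )) . ')',
    'Monitor Network {{n}}'
);

echo $child, PHP_EOL; // Monitor Network (?:on|oe)
```

The grouped array has the same shape as the result asked for in the original question, with the user-chosen base name standing in for the raw pattern text.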

Atef95's avatar

@kokoshneta Thank you so much MAN!!

$resolved = [];

foreach ($childTemplates as $t) {
    // Replace every {{name}} reference with the first value the parent template captured for it
    $resolved[] = preg_replace_callback('/\{\{([a-zA-Z]+)\}\}/', function ($matches) use ($parentMatches) {
        return $parentMatches[$matches[1]][0];
    }, $t);
}

// Collect all templates; a single $tl variable would only keep the last one
dd($resolved);

I had to tweak the last part a little, but it achieves exactly what I want!
