kvz.io

Reverse a Multibyte String in PHP

Authors
  • Kevin van Zonneveld (Twitter: @kvz)

PHP's strrev is not safe to use on UTF-8 strings because it reverses a string one byte at a time. A character that consists of multiple bytes therefore has its bytes reordered and does not survive as a single unit in the reversed result.
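To see the problem at the byte level, here is a minimal sketch using a single character. The variable names are only for illustration, and the character is written as an escape sequence (PHP 7.0+) so the result does not depend on how the file itself is saved.

```php
<?php
// "ç" (U+00E7), written as an escape so the file encoding does not matter.
$char = "\u{00E7}";

// In UTF-8 this one character occupies two bytes: 0xC3 0xA7.
$hexOriginal = bin2hex($char);

// strrev() swaps those two bytes.
$hexReversed = bin2hex(strrev($char));

// The swapped sequence starts with a continuation byte, so it is not valid UTF-8.
$stillValid = mb_check_encoding(strrev($char), 'UTF-8');

echo $hexOriginal, ' -> ', $hexReversed, "\n"; // prints c3a7 -> a7c3
var_dump($stillValid);                          // prints bool(false)
```

The two question marks you typically see in a terminal are those two orphaned bytes being rendered as unknown characters.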

The Multibyte String (mbstring) extension does not provide an alternative to strrev either.

We did some googling, but strangely enough every solution we encountered was either incorrect or very heavy in terms of memory or code.

For example:

What We Came Up With

Our solution is dead simple, but I'm putting it online anyway since it's apparently not common knowledge.

<?php
function mb_strrev ($string, $encoding = null) {
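	if ($encoding === null) {
		// mb_detect_encoding() returns false when it cannot decide;
		// fall back to the internal encoding in that case.
		$encoding = mb_detect_encoding($string) ?: mb_internal_encoding();
	}

	$length   = mb_strlen($string, $encoding);
	$reversed = '';
	// Walk from the last character (offset $length - 1) down to offset 0.
	while ($length-- > 0) {
		$reversed .= mb_substr($string, $length, 1, $encoding);
	}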

	return $reversed;
}
?>

Example:

<?php
echo    strrev('Gonçalves') . "\n"; // prints sevla??noG (the two bytes of ç are swapped, so they are no longer valid UTF-8)
echo mb_strrev('Gonçalves') . "\n"; // prints sevlaçnoG
?>

In our tests, the function above was about 5x more memory-efficient than the preg_match_all solution.
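For readers on PHP 7.4 or later, mb_str_split offers a shorter way to write the same reversal. The sketch below is only an illustration: the function name mb_strrev_split is made up for this example, the default encoding of UTF-8 is an assumption, and it was not part of the tests mentioned above. Because it builds an array holding every character, its memory use is likely closer to the array-based approaches than to the loop.

```php
<?php
// Hypothetical helper name; requires PHP 7.4+ for mb_str_split().
function mb_strrev_split(string $string, string $encoding = 'UTF-8'): string {
    // Split into one-character pieces, reverse their order, and glue them back together.
    return implode('', array_reverse(mb_str_split($string, 1, $encoding)));
}

echo mb_strrev_split("Gon\u{00E7}alves"), "\n"; // prints sevlaçnoG
```

Which variant is preferable depends on whether brevity or memory use matters more for your strings.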

Hope this helps.

Legacy Comments (5)

These comments were imported from the previous blog system (Disqus).

Guest

how does
while ($length-- > 0) {
...
}

compare to
while ($length > 0) {
...
$length--;
}

Kev van Zonneveld

while ($length-- > 0) // for a 10-character string, the loop body sees $length go from 9 down to 0, which are exactly the offsets mb_substr needs
while ($length > 0) // the loop body sees $length go from 10 down to 1, so it would need extra code to subtract 1, on top of the extra line for $length--
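To make the difference between the two loops concrete, here is a small sketch that records the value of $length each loop body sees. The two function names are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical helpers: each returns the values of $length seen inside the loop body.
function indexes_post_decrement(int $length): array {
    $seen = [];
    while ($length-- > 0) {
        $seen[] = $length; // already decremented, so this is a valid 0-based offset
    }
    return $seen;
}

function indexes_decrement_at_end(int $length): array {
    $seen = [];
    while ($length > 0) {
        $seen[] = $length; // not yet decremented, so this is one past the offset we want
        $length--;
    }
    return $seen;
}

echo implode(', ', indexes_post_decrement(3)), "\n";   // prints 2, 1, 0
echo implode(', ', indexes_decrement_at_end(3)), "\n"; // prints 3, 2, 1
```

With the second form, every mb_substr call would have to use $length - 1 to read the right character.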

kamweti muriuki (39 likes)

fair enough, thanks

Tux Lector

This one works for me.

function string_reverse ($string) {
/// Multibyte String Reverse
return \implode ('', \array_reverse (\preg_split ('//u', $string, -1, PREG_SPLIT_NO_EMPTY)));
}

Sammitch

This will break for strings containing combining marks, such as accents that are encoded as separate code points.

E.g.: y̅a becomes a̅y with this function.
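One possible way around the combining-mark problem is to reverse by grapheme cluster instead of by code point. The sketch below is an assumption about how that could be done and is not from the original post: the function name strrev_graphemes is hypothetical, the input is assumed to be UTF-8, and it relies on PCRE's \X escape, which matches one extended grapheme cluster. It builds an array of clusters, so it trades memory for correctness.

```php
<?php
/**
 * Hypothetical helper: reverses a UTF-8 string one grapheme cluster at a
 * time, so a base character keeps the combining marks that follow it.
 */
function strrev_graphemes(string $string): string {
    // \X matches one extended grapheme cluster; /u makes PCRE treat the subject as UTF-8.
    if (preg_match_all('/\X/u', $string, $matches) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return implode('', array_reverse($matches[0]));
}

// "y" followed by U+0305 COMBINING OVERLINE, then "a".
$input = "y\u{0305}a";

// The overline stays attached to the "y" in the reversed result.
echo strrev_graphemes($input), "\n";
```

If the intl extension is available, its grapheme_* functions may be another route, but that is not explored here.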