PHP’s strrev is not safe to use on UTF-8 strings because it reverses a string one byte at a time. So if a character consists of multiple bytes, its bytes end up in the wrong order and the character is not preserved as a unit in the reversed result.
There is no Multibyte String (mbstring) alternative either: the extension has no mb_strrev.
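To make the problem concrete, here is a small sketch that reverses "héllo" with strrev and prints the bytes before and after. The variable names are just for illustration, and the string is written with explicit byte escapes so the example does not depend on how the file is saved.

```php
<?php
// "héllo" in UTF-8: the "é" is the two-byte sequence C3 A9.
$original = "h\xC3\xA9llo";

// strrev works on bytes, so C3 A9 comes back as A9 C3, which is not valid UTF-8.
$reversed = strrev($original);

echo bin2hex($original), "\n"; // 68c3a96c6c6f
echo bin2hex($reversed), "\n"; // 6f6c6ca9c368
```

The lead byte (C3) now follows its continuation byte (A9), so the "é" is gone and anything that decodes the result as UTF-8 will see garbage at that position.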
We did some googling, but strangely enough all the solutions we encountered were either invalid or incredibly heavy in terms of memory or code.
- using utf8_decode only works if every character in the string exists in the ISO-8859-1 character set
- using preg_match_all seems weirdly over-engineered
- a simpler preg_match_all works, but on a 2MB string PHP was already using 150MB of memory. This is actually what sparked our search when @renan_saddam noticed his PHP port of GitHub’s email_reply_parser choked on a 2MB multibyte email.
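For reference, the simpler preg_match_all idea usually looks something like the sketch below. This is an assumption about the general shape of that approach, not the exact code we tested, and the function name is made up for this example.

```php
<?php
// Hypothetical name; a sketch of the regex-based approach, assuming valid UTF-8 input.
function regex_strrev_sketch(string $string): string
{
    // With the /u modifier "." matches one UTF-8 character; /s lets it match newlines too.
    if (preg_match_all('/./us', $string, $matches) === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    // $matches[0] holds one array element per character, which is where the memory goes.
    return implode('', array_reverse($matches[0]));
}
```

It gives the right answer, but it builds an array with one entry per character and then a reversed copy of it, which is why memory use balloons on large inputs.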
What we came up with
It is dead simple, but I’m putting it online anyway since it’s apparently not common knowledge.
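The listing itself is missing from this copy of the post. As a stand-in, here is a minimal sketch of one low-memory way to do it: walk the bytes from the end and copy each character over as a whole. The function name is hypothetical, the input is assumed to be valid UTF-8, and this is not necessarily the function our measurements refer to.

```php
<?php
// Hypothetical helper; a sketch, not the function from the original listing.
// Assumes $string is valid UTF-8.
function utf8_strrev_sketch(string $string): string
{
    $reversed = '';
    $end = strlen($string); // byte offset just past the character being collected

    for ($i = $end - 1; $i >= 0; $i--) {
        // Continuation bytes have the bit pattern 10xxxxxx. Keep walking back
        // until we reach the first byte of the character.
        if ((ord($string[$i]) & 0xC0) === 0x80 && $i > 0) {
            continue;
        }

        // Copy the whole character (1 to 4 bytes) in its original byte order.
        $reversed .= substr($string, $i, $end - $i);
        $end = $i;
    }

    return $reversed;
}

echo utf8_strrev_sketch("h\xC3\xA9llo"), "\n"; // olléh
```

Apart from the output string, this only keeps two integers around, so memory use stays close to twice the input size. Note that it reverses code points, not grapheme clusters: a base letter followed by a combining accent will come out with the accent first.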
In our tests, the above function was about 5x more efficient in terms of memory consumption than the
Hope this helps.