strlen vs mb_strlen in PHP

As a PHP developer, I am sure you often use strlen to check the length of strings. However, strlen does not return the number of characters in a string; it returns the number of bytes. PHP strings are sequences of bytes, so for characters in the ASCII range (0 – 127), each of which UTF-8 encodes as a single byte, all seems well: the string length matches the number of bytes. There is nothing wrong with using strlen this way if the string contains only plain ASCII characters, such as unaccented English letters, digits, and basic punctuation that you typed into your program.


The problem with strlen appears when a string contains a character that needs more than one byte in its encoding. strlen then returns a value greater than the number of characters, which can lead to bugs and general confusion. The solution is to use mb_strlen, from the mbstring extension, which returns the number of characters by decoding the string according to an encoding (the internal encoding by default, which is usually UTF-8). Check out the snippet displaying this:

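// strlen okay: 4 ASCII characters, 4 bytes. Result is 4
echo strlen('Rose'), PHP_EOL;

// strlen not okay: "ł" takes 2 bytes in UTF-8. Result is 7
echo strlen('Michał'), PHP_EOL;

// mb_strlen okay: it counts characters, not bytes. Result is 6
echo mb_strlen('Michał'), PHP_EOL;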
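
To see where the extra byte comes from, here is a minimal sketch that dumps the raw bytes of the string and passes the encoding to mb_strlen explicitly. It assumes the source file is saved as UTF-8 and that the mbstring extension is enabled; the variable names are only for illustration.

```php
<?php
// Assumes this file is saved as UTF-8 and mbstring is enabled.
$name = 'Michał';

// Hex dump of the raw bytes: "ł" (U+0142) is stored as the two bytes c5 82.
$hex = bin2hex($name);
echo $hex, PHP_EOL; // 4d69636861c582

// Passing the encoding explicitly avoids depending on the configured default.
$chars = mb_strlen($name, 'UTF-8');
echo $chars, PHP_EOL; // 6

// The '8bit' encoding treats every byte as one character, so it matches strlen.
$bytes = mb_strlen($name, '8bit');
echo $bytes, PHP_EOL; // 7
```

Passing 'UTF-8' explicitly is a defensive choice: the result then does not change if the internal encoding is configured differently on another server.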

My rule of thumb: when I am checking the length of text entered by a user, especially from a web browser, I use mb_strlen. When I need the actual size of the string in bytes, for example when transferring it over a network or saving it to byte-limited storage such as a database column sized in bytes, I use strlen. I also use strlen for plain ASCII strings that I typed out myself.
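
The rule of thumb above could be sketched roughly like this. The function names and the limits are hypothetical, made up for this example, and the code assumes the input is UTF-8 and that mbstring is enabled.

```php
<?php
// Hypothetical limits, chosen only for illustration.
const MAX_NAME_CHARS = 6; // the length the user sees
const MAX_NAME_BYTES = 6; // a byte budget for storage or transfer

// Character-based check for user input; assumes the input is UTF-8.
function fitsCharLimit(string $input, int $maxChars): bool
{
    return mb_strlen($input, 'UTF-8') <= $maxChars;
}

// Byte-based check for storage or transfer.
function fitsByteLimit(string $input, int $maxBytes): bool
{
    return strlen($input) <= $maxBytes;
}

$name = 'Michał';
var_dump(fitsCharLimit($name, MAX_NAME_CHARS)); // bool(true): 6 characters
var_dump(fitsByteLimit($name, MAX_NAME_BYTES)); // bool(false): 7 bytes
```

The same string passes one check and fails the other, which is exactly why it matters which of the two questions you are asking.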

Next time you want to check the length of a string, in PHP or any other language, first decide whether you need the number of characters or the number of bytes, then pick the function that matches.


