[PHP] Normalization of strings – Upper and lower case


Hi,

today there is another small task to solve in PHP. The situation is as follows:

We have a string that contains some text. Let's say the content is:
HERE you’ll get today CHEAP cucumber DIRECT from the farmer.
We want to change this string so that every word consisting only of uppercase letters gets an uppercase first letter and a lowercase rest, as it should be.
Strictly speaking, we should also check which words really have to start with an uppercase letter when they appear in the middle of a sentence. Otherwise we badly mess up the spelling.
But unless you have a good dictionary (database) and a suitable routine lying around, we have to be happy with my routine as it is for now.

So, the task is clear. Words written entirely in uppercase shall be converted to first letter uppercase and the rest lowercase. For the example above: HERE -> Here

There is no ready-made function for this, so we need to write our own. The first idea is to split the text into words and check whether each word consists only of capitals. If yes, we convert it; if not, we leave it alone. At the end we put everything back together and we are done.

So far, so good. It works very well. Just one thing I noticed: for the conversion I first used the function ucfirst(). It upper-cases the first letter but leaves the rest of the string unchanged, so we need to convert everything to lowercase with strtolower() beforehand.
And here I ran into a problem. By accident I used a string with German umlauts, and strtolower() messed them up. That probably doesn't matter for English, which has no such characters, but it may be relevant for Turkish, Spanish or French, for example.
So, what can we do? I found the solution with: mb_strtolower($String, 'UTF-8')

As you can see, you can specify the charset (in this case 'UTF-8'). That works very well and the umlauts come out correct.
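To make the difference visible, here is a small sketch that compares the functions mentioned above. It assumes the mbstring extension is installed and that the source file is saved as UTF-8; the variable names are only for illustration.

```php
<?php
// ucfirst() only touches the first character; the rest stays as it was.
$a = ucfirst("HERE");
$b = ucfirst(strtolower("HERE"));

// strtolower() works byte by byte and does not understand UTF-8 umlauts.
// What exactly comes out depends on the PHP version and the locale,
// but it is not the expected lowercase word.
$c = strtolower("ÄPFEL");

// mb_strtolower() with an explicit charset handles the umlaut.
$d = mb_strtolower("ÄPFEL", 'UTF-8');

var_dump($a, $b, $c === "äpfel", $d);
```

The third value is false, while the last one is the correctly lowercased word.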

And here is the whole code snippet:

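function NormalizeString($String) {
	$StrA = explode(" ", $String);
	$StrAc = count($StrA);
	for ($i = 0; $i < $StrAc; $i++) {
		// Keep only letters so punctuation does not disturb the comparison (/u = UTF-8 mode)
		$tmp = preg_replace('/[^a-zA-ZäÄöÖüÜß]/u', '', $StrA[$i]);
		// All upper case? Compare with the multibyte function so umlauts count too
		if (!empty($tmp) && mb_strtoupper($tmp, 'UTF-8') === $tmp) {
			$lower = mb_strtolower($StrA[$i], 'UTF-8');
			// ucfirst() only handles single-byte characters, so a leading umlaut
			// would stay lowercase; upper-case the first character with mb_* instead
			$StrA[$i] = mb_strtoupper(mb_substr($lower, 0, 1, 'UTF-8'), 'UTF-8')
				. mb_substr($lower, 1, null, 'UTF-8');
		}
	}

	// implode() joins the words without leaving a trailing space
	return implode(" ", $StrA);
}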

First we create an array of the single words using explode(). Then we go through every word with a for loop. We copy the word into a temporary variable and strip everything except letters so that it is easier to compare. If the word is all upper case, we convert it. Finally we put the string back together and return it. That's it.
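If you prefer something shorter, the same idea can probably also be sketched with a regular expression instead of the loop. This is not the routine from above but an alternative: the function name normalizeUppercaseWords() is made up for this example, and it assumes the mbstring extension and UTF-8 input. It looks for runs of uppercase letters that are not attached to other letters and title-cases them with mb_convert_case().

```php
<?php
// Hypothetical helper, not part of the routine above.
function normalizeUppercaseWords(string $text): string
{
    $result = preg_replace_callback(
        // One or more uppercase letters with no letter directly before or after;
        // the /u modifier switches the pattern to UTF-8 mode.
        '/(?<!\p{L})\p{Lu}+(?!\p{L})/u',
        function (array $m): string {
            // MB_CASE_TITLE: first letter uppercase, the rest lowercase
            return mb_convert_case($m[0], MB_CASE_TITLE, 'UTF-8');
        },
        $text
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8);
    // in that case hand back the input untouched.
    return $result ?? $text;
}

echo normalizeUppercaseWords("HERE you'll get today CHEAP cucumber DIRECT from the farmer."), "\n";
echo normalizeUppercaseWords("ÄPFEL und BIRNEN"), "\n";
```

Like the loop version, this knows nothing about spelling rules. It only looks at whether a word is written completely in capitals.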

Best regards
Gordon
