Using Regular Expressions in C# vs PHP
Recently at work I put out a call to all of the engineers to send me code they thought should be placed in the common library for .NET development. One of the responses was an email from a PHP developer who shared a regular expression to check the format of money. He said he didn't know how often we use them (referring to regular expressions) in C#, but he thought he'd pass it along anyway. That got me thinking. Do I use regular expressions as much in C# as I did in PHP?
For those who are curious, the regular expression that was sent over works perfectly well in C#, no problems. I think developers will find that regular expressions move easily from one language to the other. If you happen to be porting PHP regular expressions to C#, or vice versa, you shouldn't run into too many problems. For example, here is a quick test I threw together to make sure the regular expression worked.
using System;
using System.Text.RegularExpressions;

class Money
{
    public static void Main()
    {
        Regex exp = new Regex(@"^[1-9][0-9]{0,2}(,{0,1}[0-9]{3})*(\.[0-9]{0,2})$");
        Console.WriteLine(exp.IsMatch("123.23").ToString());
    }
}
When compiled and run, it printed True.
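The pattern accepts "123.23"; a few more inputs show where its edges are. The following is only a sketch: the class name MoneyFormat, its methods, and the sample strings are my own assumptions, and only the pattern itself comes from the test above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; only the pattern comes from the article.
public static class MoneyFormat
{
    // Same pattern as in the quick test above.
    private static readonly Regex Pattern =
        new Regex(@"^[1-9][0-9]{0,2}(,{0,1}[0-9]{3})*(\.[0-9]{0,2})$");

    // True when the whole input matches the money pattern.
    public static bool IsValid(string input)
    {
        return Pattern.IsMatch(input);
    }

    // Prints one line per sample so accepted and rejected shapes can be compared.
    public static void PrintSamples()
    {
        string[] samples = { "123.23", "1,234.50", "1234.50", "123", "0.50", "123.234" };
        foreach (string s in samples)
        {
            Console.WriteLine(s + " -> " + IsValid(s));
        }
    }
}
```

Calling MoneyFormat.PrintSamples() should show that whole amounts such as "123" and amounts below one such as "0.50" are rejected, because the pattern requires a leading digit from 1 to 9 and a decimal point. Whether that is the intended behavior depends on the original author's requirements, which I don't know.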
It worked. When it comes to regular expressions in C#, I do use them, but they are more like pepper sprinkled into a good stew than a main course. That is how I would describe the difference in how I use them in the two languages. If I had to put a number on it, I would say I use regular expressions at least 50% less in C#, and perhaps as much as 90% less depending on the situation. Why?
I think it boils down to the fact that C# is strongly typed and PHP is loosely typed. For example, the following code could never be written in C#.
<?php
$x = 1;
$y = '2keith';
echo $x + $y; // will print 3 (yes you can add strings and numbers in php)
// set $x to something completely different
$x = array('a', 'b', 'c', 'd');
echo $x[1]; // will print b
?>
Running the entire thing will result in: 3b
To some this example may be scary; others may look at it as a feature. Whichever way you see it, most developers give all sorts of different answers when asked what the result will be. When I taught PHP classes as a consultant I would use a similar example with the class. Without exception, no one could ever predict the outcome. The answers ranged from "it should throw an exception since you can't add a string and a number" to "it should throw an exception because the variable $x was reassigned to a different type."
Do you see now why PHP code relies on regular expressions more? Since $x can literally become any type at any time, a PHP developer can never rely on $x being an integer. About the only way to check that value is to use a regular expression. In the PHP world this behavior is called type juggling. Conversely, in C#, once the variable x is declared with a type, that type cannot change and only valid numbers can be assigned to it, which eliminates the need for a regular expression to check the value of the variable.
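As a rough C# counterpart to the PHP snippet, here is a minimal sketch of what the typed version might look like. The names TypedInput and AddToParsed are hypothetical, chosen only for illustration. The point is that text has to be converted explicitly before it can be added to an int, and that reassigning x to an array is rejected by the compiler.

```csharp
using System;

// Hypothetical demo class, not from the article.
public static class TypedInput
{
    // Adds a known int to text that may or may not hold a whole number.
    // Returns null when the text is not a valid integer.
    public static int? AddToParsed(int x, string text)
    {
        int parsed;
        if (int.TryParse(text, out parsed))
        {
            return x + parsed;
        }
        return null;
    }

    public static void Demo()
    {
        int x = 1;
        // x = new[] { "a", "b", "c", "d" };   // would not compile: x is declared as int
        Console.WriteLine(AddToParsed(x, "2").HasValue);       // valid integer text
        Console.WriteLine(AddToParsed(x, "2keith").HasValue);  // "2keith" is not an integer
    }
}
```

Here the only place a bad value can sneak in is the string-to-int conversion, and int.TryParse reports that failure directly, so no pattern matching is needed.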
The question then becomes: is this the C# way to test for a money value? I would argue it isn't the "best" way to handle money in C#. While it certainly works, there are other things to take into consideration when adding money. For example, two different currencies such as US Dollars and Euros cannot be added together. One amount must first be converted to the other currency and then added. The same could be said of other operators applied to a variable of type money. This is where it would be suitable to use a struct and create a new type called Money.
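To make the idea concrete, here is a minimal sketch of such a struct. Everything in it is an assumption for illustration: the name SimpleMoney, the string currency code, and the choice to throw InvalidOperationException on a currency mismatch are mine and are not taken from any particular library.

```csharp
using System;

// Illustrative only: SimpleMoney, Currency and Amount are hypothetical names.
public struct SimpleMoney
{
    public readonly string Currency;   // e.g. "USD" or "EUR"
    public readonly decimal Amount;

    public SimpleMoney(string currency, decimal amount)
    {
        Currency = currency;
        Amount = amount;
    }

    // Addition is only defined within a single currency.
    public static SimpleMoney operator +(SimpleMoney left, SimpleMoney right)
    {
        if (left.Currency != right.Currency)
        {
            throw new InvalidOperationException(
                "Cannot add " + left.Currency + " to " + right.Currency + " without converting first.");
        }
        return new SimpleMoney(left.Currency, left.Amount + right.Amount);
    }

    public override string ToString()
    {
        return Amount + " " + Currency;
    }
}
```

With this in place, adding two USD amounts yields a USD amount, while adding USD to EUR fails loudly instead of silently producing a meaningless number. A real type would also need comparison operators, equality, and some notion of exchange rates.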
In C# we can also declare a variable of type decimal and use it as money if we choose. In this case we still don't need a regular expression to validate the value of our variable. Here's a sample showing one way to handle a rogue value:
decimal money;
if (Decimal.TryParse("123.a234", out money))
{
    Console.WriteLine("money is valid");
}
else
{
    Console.WriteLine("money is invalid");
}
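One caveat worth sketching: the two-argument Decimal.TryParse interprets the text using the current culture, so the same string can parse differently on differently configured machines. The variant below pins the format to the invariant culture. AmountParser and TryParseAmount are hypothetical names for this illustration; the overload taking NumberStyles and an IFormatProvider is part of the standard library.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not from the article.
public static class AmountParser
{
    // Parses text as a decimal using fixed (invariant) formatting rules,
    // so "." is always the decimal separator and "," the group separator.
    public static bool TryParseAmount(string text, out decimal amount)
    {
        return decimal.TryParse(
            text,
            NumberStyles.Number,
            CultureInfo.InvariantCulture,
            out amount);
    }
}
```

Note that this accepts more shapes than the regular expression does, for example "0.50" and values with a leading sign, so it validates "is this a number" rather than "is this formatted like money".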
I suspect a lot of programmers use this method, but again, a struct is more desirable. Andre de Cavaignac has a great example of building a struct for a money type. He provides these examples:
Money eur10 = new Money(MoneyCurrency.Euro, 10);
Money eurNeg10 = new Money(MoneyCurrency.Euro, -10);
Money usd10 = new Money(MoneyCurrency.USDollar, 10);
Money usdZero = new Money(MoneyCurrency.USDollar, 0);
bool result = (eur10 == usd10); // returns false
result = (eur10 > usd10); // throws InvalidOperationException (comparison not valid)
result = (eur10 > Money.Zero); // returns true
result = (eur10 > usdZero); // returns true
result = (usd10 > eurNeg10); // returns true (positive always greater than negative)
Obviously he's put a lot of thought into how money should be handled, and if you look at his library you'll see he accounts for many different types of currencies.
For those wondering about the differences in how regular expressions are used in PHP and C#, I hope this gives you some insight into how the two languages handle these situations. It all boils down to strong typing vs loose typing and the ability to create new types based on structs.
Published at DZone with permission of Keith Elder. See the original article here.