How to Validate a URL
by Jane Bailey
in Representative Line
on 2015-02-05
There's an old joke among programmers, particularly those who have had to use regexes more often than they're comfortable with:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
It's a seductive trap: Regexes are good at processing strings, and are more complex than your usual string-processing utilities, so it seems logical to use regexes to do advanced string-parsing. But regular expressions are not meant to do arbitrary string parsing. Regular expressions are meant to parse regular languages. Furthermore, regular expressions are notoriously hard to read, resulting in, what appears to be, a string of random characters sneezed out all over your screen. For example, consider the following that's used for parsing a valid URL:
Regex regex =new Regex(
@"^((((H|h)(T|t)|(F|f))(T|t)(P|p)((S|s)?))\://)?(www.|[a-zA-Z0-9].)[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,6}(\:[0-9]{1,5})*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$"
);