Best REGEX for first/last name validation?
I'm looking to stop people putting initials in the First / Last name fields, plus any special characters that you would not associate with a name. I've got something, but it comes unstuck on names like McGowan or MacGowan. I understand why, but I'm stumped for a solution.
This is what I have:
AND( $User.ProfileId <> '00e30000001jDdz', OR( LEN(FirstName ) <=1, MID(FirstName ,2,1) = " ", NOT( REGEX( FirstName, '([A-Z][a-z]*)([\\s\\\'-][A-Z][a-z]*)*' ) ) ) )
The $User.ProfileId check is there to let a System Admin do what they want. The LEN and MID bits are to stop initials, or people putting in "S J ".
There is probably a more elegant way of doing this but I'm rather fresh to REGEX.
Any suggestions on how to better this?
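To make the failure concrete, here is a minimal Python sketch of the pattern from the formula above. It assumes that REGEX() compares the whole field against the pattern (hence `fullmatch`) and that Python's `re` module behaves like the Salesforce engine for this pattern. The function name `passes_pattern` is made up for illustration.

```python
import re

# Python transcription of the pattern in the formula, with the extra
# escaping required by the formula string removed.
PATTERN = re.compile(r"([A-Z][a-z]*)([\s\'-][A-Z][a-z]*)*")


def passes_pattern(name: str) -> bool:
    # fullmatch mimics a whole-string comparison (an assumption about REGEX()).
    return PATTERN.fullmatch(name) is not None


for name in ["Smith", "Anne-Marie", "O'Brien", "McGowan", "MacGowan", "S"]:
    print(name, passes_pattern(name))
```

McGowan and MacGowan fail because a capital letter is only accepted at the start of the name or right after a space, apostrophe or hyphen. A lone "S" passes the pattern itself, which is why the formula needs the separate LEN check.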
Be careful when "validating" names. Anything more than verifying a reasonable length (1kb?) for names will be too restrictive.
Thanks for the huge amount of feedback so far. Interesting can of worms this opens up.
Yes! Let's validate some names with RegEx.
After all, we know that all people must have a first and last name, right? And no single person has more than three or four names total? And no doubt the same person will forever be identifiable by the same name?
Plus, we know that no modern culture uses patronymic naming and people in the same nuclear family must have the same last name, right?
I think your choice of RegEx to validate names is missing the point: this is a huge unwieldy problem and, even if you massively restrict the scope of names you allow, you will forever suffer the risk of false negatives and you will be turning away people from other cultures and languages. In other words, I don't think that even attempting to validate names is worth your time.
Yes. When faced with a similar problem of detecting bogus leads, the best I ended up with to prevent false negatives was to screen out names with 4+ consecutive repeated letters, or names from a limited set of keyboard-noise strings ('asdf' and the like). I wasn't very satisfied with the results in terms of screening spam leads, but I did avoid false negatives.
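A rough Python sketch of that kind of screen might look like the following. The list of noise strings is hypothetical (only "asdf" is mentioned above), and `looks_bogus` is a made-up name; this is not the actual rule the commenter used.

```python
import re

# Hypothetical keyboard-noise strings; only "asdf" comes from the comment.
NOISE_STRINGS = ("asdf", "qwer", "zxcv")

# A letter followed by three more copies of itself, i.e. 4+ in a row.
# [^\W\d_] approximates "any Unicode letter" in Python's re module.
REPEATED_LETTER = re.compile(r"([^\W\d_])\1{3,}")


def looks_bogus(name: str) -> bool:
    folded = name.casefold()  # compare case-insensitively
    if REPEATED_LETTER.search(folded):
        return True
    return any(noise in folded for noise in NOISE_STRINGS)
```

Because it only flags obvious junk, a check like this leaves names such as "Aaron" or "Björn" alone, at the cost of missing entries like "XX".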
I get what you are saying here. On the whole, the database is made up of paid-for data, which is wonderfully clean. However, I would rather manage what the users are adding to the database and deal with exceptions than give them free rein in here. Of the two evils, it appears to be the lesser one, as evidenced by the sheer lack of care being taken as it stands. Far too many entries are being entered as "L" or "??" or "XX" or "(secretary)". That completely negates the one-customer-picture push.
From my experience, working in a multilingual company, you'd probably be better off validating against characters you don't want to allow.
On our EComtract docs we have used something similar to this:
Mainly because London works VERY closely with some of the Scandinavian countries, but also because we are global, and something like a Chinese name won't match the RegEx that you are looking to implement.
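As an illustration of the "disallow what you don't want" idea, here is a minimal Python sketch. The character set is an assumption chosen for the example, not the expression used in the documents mentioned above, and `has_disallowed_chars` is a hypothetical helper name.

```python
import re

# Hypothetical denylist: digits plus punctuation that rarely belongs in a
# name. The exact set is an assumption and should be tuned to your data.
DISALLOWED = re.compile(r"[0-9?!@#$%^&*()_+=<>{}\[\]|\\/~]")


def has_disallowed_chars(name: str) -> bool:
    # True if at least one denylisted character appears anywhere in the name.
    return DISALLOWED.search(name) is not None
```

This rejects entries such as "??" and "(secretary)" while letting "Björn", "O'Brien" and Chinese names through. It does not catch "L" or "XX", so a separate length or initials check would still be needed for those.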
REGEX( FirstName, '([A-Z][a-z]*)([\\s\\\'-][A-Z][a-z]*)*'
In the above, the first `*` is a 'greedy match', meaning it "Matches 0 or more of the preceding token. This is a greedy match, and will match as many characters as possible before satisfying the next token", which would be where you have the `\\s\\` that represents a whitespace character.
So, I don't think you want the `*` there if you want to prevent single characters using Regex. The same would apply to the `*` you have following the second set of alpha characters.
It appears that you've also escaped an apostrophe with the `\'`. Is that something you also don't want to allow? The `-` following it isn't applicable in terms of a range of characters and thus would likely only apply when it follows an apostrophe, meaning Regex would look for `'-`. I'm not totally certain of that, but I believe that would be the case.
The `*` before the last closing parens would again be a 'greedy match' that, due to its placement, I believe would cause the entire pattern search to repeat. From your description, I don't think that's what you desire.
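The points about the quantifier and the bracketed set can be checked directly. The sketch below uses Python's `re` module, which may differ in detail from the engine Salesforce uses, so treat the results as indicative only. The helper `matches` is a made-up name.

```python
import re

# "*" allows zero lowercase letters, so a lone capital passes;
# "+" requires at least one lowercase letter after the capital.
star = re.compile(r"[A-Z][a-z]*")
plus = re.compile(r"[A-Z][a-z]+")

# Inside [...], a trailing "-" is a literal hyphen, so this set matches
# exactly one character: whitespace, an apostrophe, or a hyphen.
separator = re.compile(r"[\s\'-]")


def matches(pattern: re.Pattern, text: str) -> bool:
    return pattern.fullmatch(text) is not None


print(matches(star, "S"), matches(plus, "S"))
print([matches(separator, ch) for ch in (" ", "'", "-", "'-")])
```

In Python at least, swapping `*` for `+` is enough to reject single-letter entries, and the set accepts a space, an apostrophe or a hyphen on its own rather than requiring the two-character sequence `'-`.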
I highly recommend you visit RegExr and use the online Regex expression builder there to create and test your expression, to ensure you're creating what you desire, as it's unclear to me from what you've posted. There's a quick YouTube tutorial on how to use the builder. The site also has a library of user-submitted RegEx patterns that you can use to help create your own custom patterns if one doesn't already meet your needs.
So, say, Swedish people like Björn Borg will not be able to use this...
RegEx supports the use of Unicode characters and those from other languages. As I said in my answer, it's not entirely clear what @44f wanted to accomplish with the RegEx code he posted. It's much easier to look for something you're expecting than it is to screen out all the potential things that an external party might enter. I'll add that I was answering his specific question and not commenting on the validity of his use case.
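As a rough illustration of a Unicode-aware pattern, here is a Python sketch. It relies on Python's `re` treating `\w` as Unicode-aware for `str` patterns. Whether the same construct works in Salesforce's REGEX() is an assumption that would need testing, and `is_plausible_name` is a hypothetical name.

```python
import re

# [^\W\d_] approximates "any Unicode letter": a word character that is
# neither a digit nor an underscore.
LETTERS = r"[^\W\d_]+"

# One or more runs of letters, separated by a single space, apostrophe
# or hyphen.
UNICODE_NAME = re.compile(rf"{LETTERS}(?:[\s'-]{LETTERS})*")


def is_plausible_name(name: str) -> bool:
    return UNICODE_NAME.fullmatch(name) is not None
```

With this, "Björn Borg", "McGowan" and "李小龙" are accepted, while "??", "(secretary)" and "J0hn" are not.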
This will match "James Woods", "Henry Ian Cusack" (first_mid will be "Henry Ian", last will be "Cusack"), "John McGlynn", "Raymond J Reynolds", etc.
It will also match "foo Manchu". Use ucfirst to force initial capitalization. There's no way to both enforce lowercase trailing letters and preserve internal caps (like McGlynn), so it's best to simply accept what the user submitted (like "Sarah McGilBertA").
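For readers not working in a language with a built-in ucfirst, a minimal Python equivalent is sketched below. The function is written for illustration and is not part of the answer above.

```python
def ucfirst(name: str) -> str:
    # Upper-case only the first character; leave the rest exactly as typed,
    # so internal capitals such as the "G" in "McGlynn" survive.
    return name[:1].upper() + name[1:]


print(ucfirst("foo Manchu"))
print(ucfirst("mcGlynn"))
```

Python's own `str.capitalize()` is deliberately avoided here, because it lower-cases the remaining letters and would turn "McGlynn" into "Mcglynn".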
Just set the ValidationExpression property of the regular expression validator to "^[a-z A-Z]+$"; it will then accept only letters and spaces. For example:
<asp:RegularExpressionValidator runat="server" ID="rev1" ControlToValidate="txtfirstname" ErrorMessage="Your First Name May Consist Only Character" ForeColor="#cc0000" SetFocusOnError="true" ValidationExpression="^[a-z A-Z]+$"></asp:RegularExpressionValidator>
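To see what that expression lets through, here is a quick Python check of the same pattern. It assumes Python's `re` treats the pattern the same way the validator does, and `is_valid` is a made-up helper.

```python
import re

# Same expression as in the validator: ASCII letters and spaces only.
LETTERS_AND_SPACES = re.compile(r"^[a-z A-Z]+$")


def is_valid(text: str) -> bool:
    return LETTERS_AND_SPACES.fullmatch(text) is not None


for sample in ["Mary Ann", "O'Brien", "Björn", "   "]:
    print(repr(sample), is_valid(sample))
```

Under that assumption the pattern rejects "O'Brien" and "Björn" but accepts a string made only of spaces, so it is both stricter and looser than it first appears.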