• ray10k (unregistered)

    Paranoid programming strikes again, I see.

  • Hasse the Great (unregistered)

    Zeroth

  • Quite (unregistered)

    My own technique for ensuring a field is a number is to use the language's internal convert-to-number method and then to trap out the exception. :-)

  • Oneway (unregistered)

    Paranoid programming indeed. There's no way this snippet will throw an exception and the isset() check is useless as a null value will also be caught by the is_numeric() check.

  • void (unregistered)

    But the regex will prevent the function from returning negative numbers or stuff like "1.2e12", which is_numeric would allow, right?

  • Necropaw (unregistered) in reply to void

    @Void: you are correct. This function only return positive number, which use the decimal mark ".". So, if the database connection uses another locale, this will never return a number.

  • The Optimizator (unregistered)

    Also the string "Infinity" is a number which is blocked by the regex. I bet this is not a WTF and this double-checking was include because of performance optimization: If is_numeric() is much faster than checking with the regEx directly, and if in most cases there would be a non-number string submitted (let's say in 90% of all calls) then first applying is_numeric() would return in 90% of the calls very fast and only in 10% of the calls it would return slower, so the average performance increased as if you would have called it with the regex only in all cases.

  • Gooch (unregistered)

    Even if you don't trust the data source, you could just replace this entire function with floatval($value)

  • Omego2K (unregistered)

    This regard doesn't make sense to me

    /^[0-9]+(.[0-9]*|)$/

    Must start with one or more numbers. Then must be followed by a period. Then can be followed by 0 or more numbers. Then the unescaped pipe symbol meaning or? Should it be ^[0-9]+.?[0-9]*$

  • Oneway (unregistered) in reply to Gooch

    Passing a non-scalar value to floatval() will emit a E_NOTICE level error, so i would at the very least have to be acompanied by a call to is_scalar()

  • Omego2K (unregistered)

    Regard = regex

    Typed it on a tablet

  • Omego2K (unregistered)

    Actually must be followed by any character. Not just a period

  • EvilSnack (unregistered)

    A regex? Will it summon Cthulhu if you pass HTML to it?

  • Cidolfas (unregistered)

    So it looks like this was actually meant to do something like returning only the decimal portion of a number (the parentheses around the decimal part of the regex seem to suggest that). There's no built-in way to do do this in PHP and in this StackOverflow question, half of the answers indeed suggest using string operations to do it: http://stackoverflow.com/questions/6619377/how-to-get-whole-and-decimal-part-of-a-number

    The bug in the code is a) the | operator, which ensures that the decimal part is actually optional, and b) the fact that instead of returning the decimal portion, it returns the original value. -_-

    As for it being called only on DB values that are already numbers, it was written as a generic function and coding defensively in case someone else decides to use it on input that wasn't pre-validated is actually a good thing in my opinion. But yeah, the code doesn't do what it thinks it does or needs to do.

  • (nodebb) in reply to Omego2K

    Actually must be followed by any character.

    Go and read the article again; that has a key backslash which Markdown swallowed…

  • Omego2K (unregistered) in reply to dkf

    Ah you're right, but still it means that a decimal must be included but doesn't have to by followed by a number. Also wtf is up with the pipe at the end?

  • AstuteComputing (unregistered)

    Also, the pattern match will fail on negative numbers.

  • Untitled (unregistered) in reply to Omego2K

    (.[0-9]|) means “either .[0-9] OR an empty string”, effectively making the decimal part optional.

  • Fucking Numbers (unregistered)

    Ha ha ha

  • Omego2K (unregistered) in reply to Untitled

    Er didn't know that. Good to learn. Thanks!

  • Omego2K (unregistered)

    Actually I still don't get it. Would it not match

    Without anything after the decimal?

  • jmm (unregistered) in reply to The Optimizator

    True, but if it is used only in the context noted (decimals coming from the database), the original remark of 'copy/paste' is probably correct. They find a nice bit of code, and just use it everywhere-- even in places it's not needed.

  • Omego2K (unregistered)

    Then I would say it's not a reusable function anyway. It only works for values coming from a statically typed database and only for those specific columns. Maybe the function should be renamed to "get_decimal_value_from_numeric_database_column_value". Not saying it would fix the issue of having this monstrosity, but at least it would clarify intent.

  • Ex-lurker (unregistered) in reply to Cidolfas

    "So it looks like this was actually meant to do something like returning only the decimal portion of a number (the parentheses around the decimal part of the regex seem to suggest that)"

    Facepalm

    I can't believe I'm reading this. There isn't even an emoticon at the end to indicate it is tongue-in-cheek

  • verisimilidude (unregistered)

    Remy's disdain for dynamically typed languages because of type errors is misplaced I feel. In the hands of disciplined coders conventions in naming and documentation are sufficient. (Of course this website is all about programmers who should be disciplined - preferably by someone dressed in rubber with a whip.) I code in Python every day and type errors are rare in my team. If they did rise to the level of a problem Python 3 has optional type annotations on function calls. Still not a compile time answer but good unit tests should be able to catch bad function usages.

  • DEeelete! (unregistered)

    DELEte this! THis Website???

    Delete e l e t e

    this shitty fuuuuuuuuuuuuuuuuuuuuuuuuuuuuuucking website

  • isthisunique (unregistered) in reply to ray10k

    Paranoid programming is good if done right. This is actually naive and paranoid programming. One leads to the other.

    For example, I just would not bother with is_numeric because I am paranoid about it. It does way too many things that might change between versions to really mean anything and in reality you almost never ever need all of those things altogether. It's far too easy to be naive about the method even when you don't want to be. When you aren't naive it has so many quirks and edge cases that you have to consider them all.

    Proper paranoid programming might be as such:

    Pre checks...

    Check the type is a string (because otherwise why would you run this one anything but data to be deserialised, if you have an int you know it's an int and don't touch it). Some people might be lazy and not think about what they are doing though, just running their function all the time on input. That is a kind of paranoia through laziness. You filter untrusted data in reality, not trusted data, except in really sensitive hotspots on special occasions. You might expect another type or a couple of types. Either way you should check that it's exactly what you expect and you should know what to expect. Anything else, you haven't anticipated or coded for. Alternatively, why code for lots of cases that you don't need to?

    Check that the length is within reasonable bounds. You will probably always want this. It basically helps protect against overflow exploits, DDOS, etc.

    The next things depend on circumstances. You may want to escape, filter or convert it. Converting it to the right type such as double is fine but the converter should be strict (it should throw an error on anything that could not be produced by the reverse conversion). It should actually only be filtered or converted on demand except in very specific cases. You only escape for HTML just before interpolating into HTML. You only except for SQL just before interpolating into an SQL string. If you use DOM or parametised SQL then that can be done for you.

  • Anonymous (unregistered)

    When will those spambots learn the difference between hackers and crackers? :(

  • Me (unregistered) in reply to Anonymous

    Once upon a time, I think that sort of thing was called detective work. Of course, once upon a time, that sort of thing required getting out of your chair.

    And then as now, you still get in trouble for doing it without a license.

  • atornblad.se (unregistered) in reply to EvilSnack

    All regexes do!

  • El Dorko (unregistered) in reply to Quite

    Just note that if you expect to get a lof failed (non-number) cases, throwing all those exceptions can get expensive. Some comparison here: http://stackoverflow.com/questions/237159/whats-the-best-way-to-check-to-see-if-a-string-represents-an-integer-in-java

  • El Dorko (unregistered) in reply to Quite

    Just note that if you expect to get a lof failed (non-number) cases, throwing all those exceptions can get expensive. Some comparison here: http://stackoverflow.com/questions/237159/whats-the-best-way-to-check-to-see-if-a-string-represents-an-integer-in-java

  • tldr; (unregistered)

    Ummm, ten points for discussing the regex, but i do declare a FAIL by all; read the article much? Read the bit where it said the $value came from a DB and was guaranteed to both exist and be a number, perhaps becvause it was stored in a number-only data type column?

    Fail

  • tldr; (unregistered)

    suzan white (unregistered) - genius. Come at me, bro.

  • tldr; (unregistered) in reply to Omego2K

    Close. GET_potentiallydecimalNUMBER_FROM_potentiallydecimalNUMBER_AND_CONFIRM_DATABASE_UNDERSTANDS_NUMBER_IS_NUMBER_AND_THROW_EXCEPTION_IF_NUMBER_IS_STORED_IN_WORDS_BUT_NOT_ROMAN_NUMERALS

  • Nullivator (unregistered)

    Hey, Copy Paste is one of the most productive programmers out there. Don't knock him.

  • Top Coder (unregistered)

    For those of you questioning why do the regex after the IsNumeric call, have you considered that it might be running in a multithreaded process? I think this is good practice and will instruct all my subordinates to use this method going forward.

  • isthisunique (unregistered) in reply to Omego2K

    /^[0-9]+(.[0-9]*|)$/

    I didn't even know you could to that.

    I assume it's the same as...

    /^[0-9]+(.[0-9]*$|$)/

    That . is broken though and it supports 1. without a trailing number. My guess is that it wants to be... /^[0-9]+(.[0-9]+)?$/D

    None of this deals with if the input is already an it or float properly or things like overflow, sign, etc. In an ideal world PHP would expose strict conversion functions that would do the job well enough. Honestly in this case it would probably be better off just to replace the whole lot with +$var. To understand why /D, RTFM.

  • (nodebb)

    There is a subtle difference between (...|) and (...)?, and that's that the first one always captures a result and the second one doesn't (in JavaScript the relevant value would be "" vs. undefined depending).

  • Hexadecimal (unregistered)

    not sure about in this programming language, but some places (like some SQL variants) isnumeric also accepts hexadecimal numbers as numeric (like "1A23456"), potentially causing problems if you're trying to insert into a numeric column.

Leave a comment on “Checked Numbers”

Log In or post as a guest

Replying to comment #:

« Return to Article