• Bill M (unregistered) in reply to Sal Undy
    Right, that's exactly the point. The whole story hinges on the negative numbers being all camouflaged because they end with "O" instead of "0". It falls apart unless somehow every negative number ends in 6.
    ... or, of course, ends with a 2
  • Darren Poulson (unregistered)

    At my last place of work, we still used this format (and probably continue to after I left) as we still had interfaces using it from the old days of COBOL. It is a standard, and one I would've thought a BI person would've come across at some point. Not really a WTF.

    The 2t is a WTF tho.

  • Alex Godofsky (unregistered)

    TRWTF is that she didn't immediately recognize signed overpunch.

  • (cs)

    TRWTF is writing code to handle the zoned overpunch. COBOL takes care of that for you if you define the variable correctly. (or is TRWTF that I know this?)

  • ih8u (unregistered) in reply to Valued Service
    Valued Service:
    Anonymous Coward:
    Original Article:
    monkeys who were annoyed at their lack of an opposable thumb
    Awwww... now I feel really sorry for the monkeys in this story - did they lose their thumbs in an industrial accident or something?

    Poor monkeys!

    Yeah, they also have opposable toes. Which gives them an advantage over us.

    Finally got sick of the writing. Skipped to the comments after the bit about the monkeys. Glad somebody elected to educate the author.

  • anonymous (unregistered) in reply to Darren Poulson
    Darren Poulson:
    At my last place of work, we still used this format (and probably continue to after I left) as we still had interfaces using it from the old days of COBOL. It is a standard, and one I would've thought a BI person would've come across at some point. Not really a WTF.

    The 2t is a WTF tho.

    You can tell how old it is, because if it was a recent invention it'd probably use @ABCDEFGHI and PQRSTUVWXY instead of {ABCDEFGHI and }JKLMNOPQR.

  • (cs) in reply to Alex Godofsky
    Alex Godofsky:
    TRWTF is that she didn't immediately recognize signed overpunch.
    If all you have is a number ending in "O", it's not unreasonable to expect a typo. So it was correct for her to ask if that was erronous da1a.
  • (cs) in reply to no laughing matter
    The Article:
    The particular value that was causing grief was 2196O. Now the very astute among you might have already spotted the problem.
    Not really Bruce, no, on account of you neglecting to mention at this point in the story that "2196O" was supposed to be a numeric value.

    And shame on you for missing a golden opportunity to make a "type O negative" joke.

  • anonymous (unregistered) in reply to Zylon
    Zylon:
    The Article:
    The particular value that was causing grief was 2196O. Now the very astute among you might have already spotted the problem.
    Not really Bruce, no, on account of you neglecting to mention at this point in the story that "2196O" was supposed to be a numeric value.

    And shame on you for missing a golden opportunity to make a "type O negative" joke.

    If it was supposed to be a string value it would have been enclosed in quotes. You're welcome.

  • Russell (unregistered)

    This one made me WTF out loud. Using letters for negatives...WTF?

  • (cs) in reply to PedanticCurmudgeon
    PedanticCurmudgeon:
    TRWTF is writing code to handle the zoned overpunch. COBOL takes care of that for you if you define the variable correctly. (or is TRWTF that I know this?)
    Today's software: A data-importer for a BI-application. Written, as another commenter has noted, in the SQR programming language, part of the Oracle HyperboleHyperion business-intelligence software.

    Now, how do you use that nice COBOL feature from within your SQR-application?

    Addendum (2013-12-05 12:31): Upon second look, the code-snippet might be part of the data-exporter, not the importer.

    Still it indicates an environment where COBOL is not necessarily a given.

  • Fenix (unregistered) in reply to Sal Undy

    I think the author embellished a tad. I wouldn't be surprised if it failed on the FIRST negative number, and they kept trying to change the definition of insanity by running it again and again and again, hoping for a different result.

    Maybe there were N's and P's and whatnot in the sets later on, but they just got hung up on 2196O.

    Captcha: ingenium... because Alicia is the ingenium star of the NSA

  • Wayne (unregistered) in reply to Mainframer

    I ran into this in my very first programming job in '85! I was extracting data from a time/charge billing system called, cleverly enough, TCB, sucking it into dBase III to provide prettier invoices and better reporting. And that was exactly how they stored balances and charges. I had a subroutine and another set of columns to hold the converted data.

    Fun times.

  • (cs) in reply to anonymous
    anonymous:
    If it was supposed to be a string value it would have been enclosed in quotes. You're welcome.
    Wrong. This article was programmed in English, which always encloses literals in quotes regardless of data type. So not only was it confusingly written, it was poorly formatted. Probably wouldn't even compile.
  • anonymous (unregistered) in reply to Zylon
    Zylon:
    anonymous:
    If it was supposed to be a string value it would have been enclosed in quotes. You're welcome.
    Wrong. This article was programmed in English, which always encloses literals in quotes regardless of data type. So not only was it confusingly written, it was poorly formatted. Probably wouldn't even compile.
    English is a horrible choice of programming languages due to its rampant (or is that rampart) ambiguity and the overall difficulty of making it say what it's supposed to say. Why we write our laws in English is beyond me; if they were written in a sane language (say, BASIC) most lawyers, judges, and politicians would all be unemployed.
  • (cs)

    OK, since there seems to be some confusion about this....

    First of all, you must understand the concept of a "packed decimal". In addition to binary integers and floating point numbers, IBM 360s (and I assume other IBM computers of that era -- 1960s) had a data type called "packed decimal", which is essentially what we now call "binary-coded decimal" (BCD).

    Basically, one decimal digit is stored in one hex digit. Calculations done in packed decimal were able to maintain their precision and accuracy (even with decimals), at the cost of being slow.

    Of course, "slow" is a relative term, when the alternative is doing it by hand with an adding machine, so packed decimal became the standard data type used, particularly in COBOL, but I believe also in Fortran.

    One innovation of packed decimal over standard BCD is the use of an additional digit at the end to represent the sign. Since they had 6 hex digits they weren't using ('A' thru 'F'), they made A, C, E, F positive, and B & D negative. So 01 23 4F is +1234, and 12 34 5B is -12345. Note, however, the "preferred" sign characters were C & D, so after any calculation, the sign would be changed to one of those as appropriate.
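
    A minimal Python sketch of the layout just described, for anyone who wants to poke at it. The function name `unpack_packed_decimal` is made up for illustration; it only assumes what the paragraph says (one digit per nibble, sign in the last nibble, B and D negative):

```python
def unpack_packed_decimal(data: bytes) -> int:
    """Decode packed decimal: one digit per nibble, sign in the final nibble."""
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)      # high nibble
        nibbles.append(byte & 0x0F)    # low nibble
    *digits, sign = nibbles            # raises ValueError on empty input
    if any(d > 9 for d in digits):
        raise ValueError("digit nibble out of range 0-9")
    if sign < 0xA:
        raise ValueError("last nibble is not a sign (A-F)")
    value = 0
    for d in digits:
        value = value * 10 + d
    # B and D mean negative; A, C, E, F mean positive.
    return -value if sign in (0xB, 0xD) else value


print(unpack_packed_decimal(bytes.fromhex("01234F")))  # 1234
print(unpack_packed_decimal(bytes.fromhex("12345B")))  # -12345
```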

    The next step is how to get that number from a punched card. Data from a card is stored in EBCDIC (EBCDIC was IBM's alternative to ASCII, because IBM had to be different). So "1234" in EBCDIC is F1 F2 F3 F4, and IBM 360s had an instruction called "PACK" which would strip the upper nybble off each byte in a block except for the last, on which it reversed the nybbles. So F1 F2 F3 F4 would become 01 23 4F (you can see now how the decimal digits are "packed" together).
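
    Roughly what that does to the bytes, as a Python sketch. This is a simplification for illustration only: the real instruction takes explicit operand lengths, and the helper name `pack` and the zero left-padding to a whole byte are my assumptions here.

```python
def pack(zoned: bytes) -> bytes:
    """Mimic the described effect of PACK on a field of zoned (EBCDIC) digits."""
    if not zoned:
        raise ValueError("empty field")
    # Keep only the low (digit) nibble of every byte except the last...
    nibbles = [b & 0x0F for b in zoned[:-1]]
    # ...and swap the nibbles of the last byte: digit first, then zone/sign.
    last = zoned[-1]
    nibbles += [last & 0x0F, last >> 4]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # assumed: left-pad with a zero nibble
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


print(pack(bytes.fromhex("F1F2F3F4")).hex())  # 01234f
```

    Feed it a last byte of D4 instead of F4 and the result ends in 4D, which is where the sign trick further down comes from.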

    This worked fine for positive numbers, but what to do about negative numbers? For those, IBM came up with a creative hack.

    On punch cards, there are printed digits 0 - 9 in each column. A single punch in any of those rows would represent that digit. There was also some blank space for two more rows at the top which could also be punched (called the X & Y rows). Holes in the X and/or Y row plus a hole in a digit would represent some other letter or character. The dash (-) was just a hole in the X row.

    So, to enter a negative number, you would type the number, then backspace, and over-type the last digit with a dash (minus sign). On the printing on top, it would look like a struck-thru digit, but a card reader would read the X-punch + digit-punch as a letter, specifically the letters 'J' thru 'R' which in EBCDIC are D1 thru D9, so when flipped by PACK become 1D thru 9D -- making the numbers negative.

    Of course, if you are working in a medium that doesn't allow overpunching (say in a text editor on a video monitor) you'd have to do the digit-to-letter translation manually.

    Technically, numbers without an overpunched final digit were considered "unsigned". To force a number to be positive, you were supposed to over-punch it with the character which was just a Y-hole, which was... Ampersand (go figure).
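
    If you only have the text and no COBOL to do it for you, the translation can be sketched in a few lines of Python. This assumes the {ABCDEFGHI / }JKLMNOPQR table quoted earlier in the thread and treats a plain trailing digit as unsigned; `decode_overpunch` is a hypothetical helper, not anything from the article's code.

```python
DIGITS = "0123456789"
POSITIVE = "{ABCDEFGHI"  # last-digit characters for +0 .. +9
NEGATIVE = "}JKLMNOPQR"  # last-digit characters for -0 .. -9


def decode_overpunch(field: str) -> int:
    """Decode a text field whose last character may carry a sign overpunch."""
    if not field:
        raise ValueError("empty field")
    body, last = field[:-1], field[-1]
    if any(c not in DIGITS for c in body):
        raise ValueError(f"non-digit in {field!r}")
    if last in DIGITS:            # no overpunch: treat as unsigned
        return int(field)
    if last in POSITIVE:
        return int(body + str(POSITIVE.index(last)))
    if last in NEGATIVE:
        return -int(body + str(NEGATIVE.index(last)))
    raise ValueError(f"unrecognised sign character {last!r}")


print(decode_overpunch("2196O"))  # -21966
print(decode_overpunch("17A"))    # 171
```

    Anything outside the table (a stray "t", say) raises an error instead of quietly becoming a number.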

  • (cs) in reply to Zylon
    Zylon:
    anonymous:
    If it was supposed to be a string value it would have been enclosed in quotes. You're welcome.
    Wrong. This article was programmed in English, which always encloses literals in quotes regardless of data type. So not only was it confusingly written, it was poorly formatted. Probably wouldn't even compile.

    These are literals and not quoted in English: 2 December 5, 2013

    Sincerely,

    Gene Wirchenko

  • (cs) in reply to M-x org-mode
    M-x org-mode:
    C6 D9 C9 E2 E3 4F 5A
    FTFY, but I got to it a bit late!
  • OldCoder (unregistered) in reply to Mainframer
    Mainframer:
    For the record, that's a type of zoned decimal format. It's used when your flat file dataset has fixed width columns and you want to be able to represent a negative number without losing a digit.

    It was somewhat clever for 1979, but it should be avoided at all costs now.

    1979? You young whipper-snappers know nothing!

    I was taught COBOL, overpunches and all, in 1973. As punched cards were the only way of getting data into the program work-arounds like this were not uncommon.

  • OldCoder (unregistered) in reply to anonymous
    anonymous:
    English is a horrible choice of programming languages due to its rampant (or is that rampart) ambiguity and the overall difficulty of making it say what it's supposed to say. Why we write our laws in English is beyond me; if they were written in a sane language (say, BASIC) most lawyers, judges, and politicians would all be unemployed.
    Please, please let it not be Visual BASIC!
  • (cs) in reply to no laughing matter
    no laughing matter:
    Upon second look, the code-snippet might be part of the data-exporter, not the importer.

    Still it indicates an environment where COBOL is not necessarily a given.

    Using zoned overpunch without COBOL would be another WTF.

  • Harrow (unregistered) in reply to anonymous
    anonymous:
    Mainframer:
    For the record, that's a type of zoned decimal format. It's used when your flat file dataset has fixed width columns and you want to be able to represent a negative number without losing a digit.

    It was somewhat clever for 1979, but it should be avoided at all costs now.

    Want to be "clever"? Why not use base 36 or something?
    Because very few keypunch operators could accurately convert an integer value from base-10 to base-36 on sight.

    'Zoned values with sign overpunch' was a format created for the 'binary coded decimal' 'unit record' environment, which derives from the 80-column punch card introduced in 1929. It is strongly associated with COBOL because COBOL was first developed exclusively in that environment. By 1979 it was all starting to be replaced but existing kit was not just tossed out every Xmas like it is today.

    A 'big' computer was one that had 128 KB of RAM and 20 MB of disk, and ran at about 500 KHz. There was no computing power available for interacting with humans in real time. Keypunch operators created the input records manually offline, using a honking great machine like a typewriter on steroids. Every field definition was a compromise of space, time, and complexity constraints.

    You need to understand that the world was not always as lacking in humor as it is now.

  • Kevin Klein (unregistered)

    Other than the "2t" typo, this implementation actually jibes with standard practice in the COBOL world.

    http://en.wikipedia.org/wiki/Signed_overpunch

  • n_slash_a (unregistered) in reply to Anon
    Anon:
    +1 respect for the Princess Bride reference :)
    +2 respect for the Borderlands 2 reference :D
  • My name indeed (unregistered) in reply to Zylon
    Zylon:
    The Article:
    The particular value that was causing grief was 2196O. Now the very astute among you might have already spotted the problem.
    Not really Bruce, no, on account of you neglecting to mention at this point in the story that "2196O" was supposed to be a numeric value.

    And shame on you for missing a golden opportunity to make a "type O negative" joke.

    I have O negative blood. Unlike people blessed with AB+, if I should have a major accident, it might be quite difficult to find a blood donor. Let me assure you, it's no laughing matter.

  • Salvius23 (unregistered) in reply to Anonymous Coward
    Anonymous Coward:
    Original Article:
    monkeys who were annoyed at their lack of an opposable thumb
    Awwww... now I feel really sorry for the monkeys in this story - did they lose their thumbs in an industrial accident or something?

    Poor monkeys!

    If you think about it too much, that's the only interpretation that even makes sense: Since opposition isn't actually necessary to operate a space bar with your thumbs, the monkeys' thumbs can only be missing entirely.

  • (cs) in reply to My name indeed
    My name indeed:
    I have O negative blood. Unlike people blessed with AB+, if I should have a major accident, it might be quite difficult to find a blood donor. Let me assure you, it's no laughing matter.
    AB+? Get with the program! 0xAB = 171(dec) AB+ = 17A
  • KingBeardo (unregistered)

    Gotta love BI-curious girls...

  • Barf 4Eva (unregistered)

    honestly, fail to see the WTF... looks like EBCDIC for a VAX or something. But yeah, always fun reading files like that one... :|

  • (cs) in reply to JamesCurran
    JamesCurran:
    OK, since there seems to be some confusion about this....
    Not before reading your comment, but now...

    (But thanks for explaining the word-origin of "overpunch")

    JamesCurran:
    First of all, you must understand the concept of a "packed decimal". In addition to binary integers and floating point numbers, IBM 360s (and I assume other IBM computers of that era -- 1960s) had a data type called "packed decimal", which is essentially what we now call "binary-coded decimal" (BCD).
    What has BCD to do with this? The issue at hand is encoding signed numbers in EBCDIC. The internal representation when performing calculations on them has nothing to do with this!
    JamesCurran:
    Basically, one decimal digit is stored in one hex digit. Calculations done in packed decimal were able to maintain their precision and accuracy (even with decimals), at the cost of being slow.
    "Maintain their precision and accuracy"? They are imprecise and inaccurate! The point of BCD is that the inaccuracy is exactly the same as in calculation with numbers represented as decimals.
    JamesCurran:
    One innovation of packed decimal over standard BCD is the use of an additional digit at the end to represent the sign.
    So they are spending a digit (4 bits) where a single bit would have been enough.

    It seems some of those "but we had only so little memory" explanations are lousy excuses for not doing a proper job!

    (Also: Fixed width data-formats, where most of the space is allocated by spaces. But still enforces limits on field lengths. Slim data formats? I don't think so!)

    JamesCurran:
    The next step is how to get that number from a punched card. Data from a card is stored in EBCDIC (EBCDIC was IBM's alternative to ASCII, because IBM had to be different).
    They even invented multiple variants of EBCDIC, because IBM had to be different from IBM.

    Vendor-lockin is not enough, we want our customers locked to a specific series of our product range!

    JamesCurran:
    Technically, numbers without an overpunched final digit were considered "unsigned". To force a number to be positive, you were supposed to over-punch it with the character which was just a Y-hole, which was... Ampersand (go figure).
    So "AB+" is actually typed "171&"? Now let's encode this in XML!
  • (cs) in reply to Barf 4Eva
    Barf 4Eva:
    honestly, fail to see the WTF...
    Choose one:
    * Self acclaimed "leading consultant" in the ETL business does not recognise widely used format
    * Customer does not specify at all in which format numbers are written and customer developer does not recognise the problem when conversion repeatedly fails at "2196O".
    * Typed programming language does not report an error when it sees "2t" as a number literal.
    * Export of the data writes negative numbers sometimes as positive and nobody notices it.
    * Or was the issue recognised, but never thoroughly analysed? "They are so rare that we usually don't worry about them."
  • (cs) in reply to no laughing matter
    What has BCD to do with this? The issue at hand is encoding signed numbers in EBCDIC. The internal representation when performing calculations on them has nothing to do with this!

    It has everything to do with this. It explains why this particular scheme for negative numbers was used, and why some scheme needed to be used. (If we didn't need to convert from EBCDIC to an internal representation, we could just punch "-1234")

    The point of BCD is that the inaccuracy is exactly the same as in calculation with numbers represented as decimals.
    I'm not sure what point you are trying to make here. The result of adding BCD numbers is accurate. It does maintain precision. You may choose to limit precision to a certain number of digits (actually you had to) but how many was up to you. Saying "inaccuracy is exactly the same as in calculation with numbers represented as decimals" is merely citing the definition of the term "precision".
    So they are spending a digit (4 bits) where a single bit would have been enough.
    True, but remember, bits are doled out 8 at a time, so even if you limited the sign to one bit, the other three were going to go to waste.
    They even invented multiple variants of EBCDIC, because IBM had to be different from IBM.

    And now we have Apple for that.

  • (cs) in reply to Andrea
    Andrea:
    It can be worse, guys...

    I worked with Russian military blueprints, which were "digitized" by hand, and then should be processed and put to the DB. In those times we had FIVE variations of zero character (based on the sight quality of encoding granny):

    1. 0
    2. O
    3. o
    4. Russian O (similar to latin)
    5. Russian o (also similar to latin)

    And I'll bet they all appeared in Captcha texts too.

    So fill me in on a piece of unexplained jargon. Is a BI system one that goes both ways?

  • Not Hans (unregistered) in reply to da Doctah
    da Doctah:
    Andrea:
    It can be worse, guys...

    I worked with Russian military blueprints, which were "digitized" by hand, and then should be processed and put to the DB. In those times we had FIVE variations of zero character (based on the sight quality of encoding granny):

    1. 0
    2. O
    3. o
    4. Russian O (similar to latin)
    5. Russian o (also similar to latin)

    And I'll bet they all appeared in Captcha texts too.

    So fill me in on a piece of unexplained jargon. Is a BI system one that goes both ways?

    They go Male, Female, and File Not Found.

  • (cs)

    2t is definitely signed overpunch for ±26

    Addendum (2013-12-05 15:18): Therefore:

    2O == 2t
    2F == 2t
    2O != 2F

  • Zapp Brannigan (unregistered) in reply to Harrow
    Harrow:
    anonymous:
    Mainframer:
    For the record, that's a type of zoned decimal format. It's used when your flat file dataset has fixed width columns and you want to be able to represent a negative number without losing a digit.

    It was somewhat clever for 1979, but it should be avoided at all costs now.

    Want to be "clever"? Why not use base 36 or something?
    Because very few keypunch operators could accurately convert an integer value from base-10 to base-36 on sight.

    'Zoned values with sign overpunch' was a format created for the 'binary coded decimal' 'unit record' environment, which derives from the 80-column punch card introduced in 1929. It is strongly associated with COBOL because COBOL was first developed exclusively in that environment. By 1979 it was all starting to be replaced but existing kit was not just tossed out every Xmas like it is today.

    A 'big' computer was one that had 128 KB of RAM and 20 MB of disk, and ran at about 500 KHz. There was no computing power available for interacting with humans in real time. Keypunch operators created the input records manually offline, using a honking great machine like a typewriter on steroids. Every field definition was a compromise of space, time, and complexity constraints.

    You need to understand that the world was not always as lacking in humor as it is now.

    I don't know when the 80 column punch card came into being, but Hollerith invented the punch card for tabulating in the 1880s.
  • (cs)

    Just to set you straight, the Byzantine Empire didn't build labyrinths. We get the adjective "Byzantine" in its modern sense through "Byzantine politics," which were a complex web of plots and intrigues.

  • @OddsWithReality (unregistered) in reply to Mainframer
    Mainframer:
    For the record, that's a type of zoned decimal format. It's used when your flat file dataset has fixed width columns and you want to be able to represent a negative number without losing a digit.

    It was somewhat clever for 1979, but it should be avoided at all costs now.

    I ran into it for the first time not 2 weeks ago. Figured someone was trying to save a byte in the fixed-width package sent between servers, which I grudgingly accepted. Then I noticed that no matter how big the actual dataset was, the network code padded the message to a multiple of 1 kb packages, with additional padding to ensure that a record would never be split across two packages.

    So, to send "ok", it would send "ok" and 998 blanks. But at least we saved the byte for signs.

    Then I lost my will to live.

  • Gunslinger (unregistered)

    Who would ... but I don't even ... how is that ... What? What!?

  • The Man in Black (unregistered)

    You keep using that word. I do not think it means what you think it means.

  • donger (unregistered)

    How did Grandma come up with that system

  • Santa (unregistered)

    Gramma got run over by a reindeer

  • (cs) in reply to JamesCurran
    JamesCurran:
    It has everything to do with this. It explains why this particular scheme for negative numbers was used, and why some scheme needed to be used. (If we didn't need to convert from EBCDIC to an internal representation, we could just punch "-1234").
    As you write it: There will always be a conversion, the only difference is how easy or complex the conversion process is. Converting "-1234" only means carrying around a bit of information until the last letter is reached.
    JamesCurran:
    The point of BCD is that the inaccuracy is exactly the same as in calculation with numbers represented as decimals.
    I'm not sure what point you are trying to make here. The result of adding BCD numbers is accurate.
    It's not the adding of numbers where you lose precision! Division is where the problem lies: How do you represent 1/3 accurately with a limited number of digits?
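
    To make that concrete, here is a small sketch using Python's standard `decimal` module as a stand-in for decimal arithmetic (it is not BCD hardware, and the helper name `third_at` and the 5-digit limit are arbitrary choices of mine):

```python
from decimal import Decimal, localcontext


def third_at(digits: int) -> Decimal:
    """Compute 1/3 with the working precision limited to `digits` digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return Decimal(1) / Decimal(3)


third = third_at(5)
print(third)          # 0.33333
print(third * 3)      # 0.99999, not 1

# Addition of values that fit in the available digits stays exact.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True
```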
    JamesCurran:
    So they are spending a digit (4 bits) where a single bit would have been enough.
    True, but remember, bits are doled out 8 at a time, so even if you limited the sign to one bit, the other three were going to go to waste.
    I remember having learned that some of those older machines had weird word sizes, not all of them multiples of 8...

    Weren't there machines where an int was 36 bits?

  • Yolken Bit (unregistered)

    Aren't Bits and Bytes a breakfast cereal? Or is that Gramma's bedtime medicine?

  • Barf 4Eva (unregistered) in reply to no laughing matter
    no laughing matter:
    Barf 4Eva:
    honestly, fail to see the WTF...
    Choose one:
    * Self acclaimed "leading consultant" in the ETL business does not recognise widely used format
    * Customer does not specify at all in which format numbers are written and customer developer does not recognise the problem when conversion repeatedly fails at "2196O".
    * Typed programming language does not report an error when it sees "2t" as a number literal.
    * Export of the data writes negative numbers sometimes as positive and nobody notices it.
    * Or was the issue recognised, but never thoroughly analysed? "They are so rare that we usually don't worry about them."

    Oh, then I'll take "Self acclaimed "leading consultant" in the ETL business does not recognise widely used format" for the answer then!

  • sdban (unregistered) in reply to M-x org-mode
    M-x org-mode:
    C6 C9 D9 E2 E3 4F 5A
    you sunk my battleship
  • The other Frank (unregistered) in reply to EvilSnack

    "If it isn't broken, don't try to fix it" says nothing about if a clearly easier and better solution comes around. We use hammers today because while rocks worked just fine, hammers are easier and more precise and produce a better result. Rocks weren't "broke", but are clearly inferior.

  • (cs) in reply to no laughing matter
    no laughing matter:
    Barf 4Eva:
    honestly, fail to see the WTF...
    Choose one: * Self acclaimed "leading consultant" in the ETL business does not recognise widely used format
    That a "leading consultant" in the ETL business, who presumably routinely deals with decades-old data, does not recognize the data format may indeed be a WTF (although, personally, I'd go with "Customer does not specify at all in which format numbers are written ..." as the #1 WTF). However, I see nothing in the story to indicate she was "self acclaimed." As far as I can tell, the only person to acclaim her as such is Bruce Johnson.
  • Diego Gutierrez (unregistered) in reply to Mainframer

    Stumbled upon this case when working on data transfers between AS/400 and MSSQL a bunch of years ago. I had forgotten about that until now.

  • Norman Diamond (unregistered) in reply to no laughing matter
    no laughing matter:
    Weren't there machines where an int was 36 bits?
    Yes. One is mentioned near the beginning of the first edition of K&R's book on the C programming language as an example of a machine where C worked just fine.

    OK, I see you're wondering about James Curran's statement that bits were doled out 8 at a time. Right, when talking about packed decimal, we only need to observe 4 bits at a time. The matter of a sign nibble remains unchanged.
