• Homer (unregistered)

    Isn't this the default table format MSSQL uses when importing CSV files? This by itself isn't much of a WTF. Did it come from a dev server? Is it an abandoned table used for importing? WhereTF is the WTF?

  • (cs) in reply to Homer


    Anonymous:
    Isn't this the default table format MSSQL uses when importing CSV files? This by itself isn't much of a WTF. Did it come from a dev server? Is it an abandoned table used for importing? WhereTF is the WTF?


    Actually by default SQL Server will saddle you with nvarchar(4000), so you can use up two bytes for every character.
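
    A rough way to see the two-bytes-per-character cost, sketched in Python rather than T-SQL. It assumes nvarchar stores text as UTF-16 and uses Latin-1 as a stand-in for a single-byte varchar code page; the sample string is made up for illustration.

```python
# Hypothetical sample value; it only uses characters that fit in one byte in Latin-1.
value = "http://www.example.com"

# Stand-in for varchar with a single-byte code page: one byte per character.
varchar_bytes = len(value.encode("latin-1"))

# Stand-in for nvarchar (assumed UTF-16): two bytes per character for this text.
nvarchar_bytes = len(value.encode("utf-16-le"))

print(varchar_bytes, nvarchar_bytes)
```

    For plain ASCII data like URLs, the second number is simply double the first.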

  • (cs) in reply to Homer
    Anonymous:
    Isn't this the default table format MSSQL uses when importing CSV files? This by itself isn't much of a WTF. Did it come from a dev server? Is it an abandoned table used for importing? WhereTF is the WTF?
    more like that's the format of reports from RDBMSs.  MySQL and Oracle, I know, print reports like that, but it's so hard to parse that I don't see the point.

    then again, I did have to fix a problem with our CSV import code last week, which had trouble with XML-like data in CSV files....

    thank God we don't actually store data in JSON organized with XML nested in CSV files in a standard RDBMS.  I'd quit at that point.
  • (cs) in reply to John Hensley
    Anonymous:
    We're going to need some better goggles.

  • maht (unregistered) in reply to WTF Batman
    WTF Batman:


    Thought so. I've actually been playing with Postgres recently. I'm finding it to be an incredible database. However, some of the syntax can be a little... well... obfuscated. So far, I'd say that's the exception rather than the rule, but you sure found a real gem.


    tbh I use PostgreSQL's arrays myself on one project

    here's the offending page:

    http://www.postgresql.org/docs/8.1/static/arrays.html

    though arrays have been around in PostgreSQL for a while.


  • (cs)
    Alex Papadimoulis:

    Yesterday, we learned about the latest programming paradigm, Vector Oriented Programming. Continuing that trend, I'd like to introduce the next big thing in data storage: the Column Separated Value DataBase. Discovered by Shaun, the CSVDB will do nothing short of revolutionizing the way we store, manage, and think about data ...

    CREATE TABLE [Tab001]
    (
    [Col001] VARCHAR(8000) NULL,
    [Col002] VARCHAR(8000) NULL,
    [Col003] VARCHAR(8000) NULL,
    [Col004] VARCHAR(8000) NULL,
    [Col005] VARCHAR(8000) NULL,
    [Col006] VARCHAR(8000) NULL,
    [Col007] VARCHAR(8000) NULL,
    ...
    [Col041] VARCHAR(8000) NULL,
    [Col042] VARCHAR(8000) NULL,
    [Col043] VARCHAR(8000) NULL,
    [Col044] VARCHAR(8000) NULL,
    [Col045] VARCHAR(8000) NULL
    )
    CREATE TABLE [Tab002] ( ...)
    CREATE TABLE [Tab003] ( ...)
    ...

    As you can probably tell, the CSVDB combines the best parts of CSV (Comma Separated Value) files with a relational database. No longer do you need to be tied down with things like a data model -- just pick a column from the vast array of tables, and use it how ever you'd please. And on top of that, the CSVDB doesn't burden you with foreign keys, column names, data validation, table names, or even primary keys ...

    Col001 | Col002   | Col003   | Col004        | Col005 | Col006 | Col007 | Col008      | ...
    -------+----------+----------+---------------+--------+--------+----------------------+-----
    3      | NULL     | NULL     | http://www... | NULL   | 5,6,7  | (really long string) | ...

    ... and yes, you saw right. The CSVDB, unlike CSV files, allows you to have comma-separated data within a cell of column-separated data.



    This leads us to our next lesson today kids.... Don't smoke crack!
  • (cs) in reply to greyfade

    In opposition to the very popular SQL (commonly pronounced "Sequel"), this should be called PRQL ("Prequel").

    Pretty Ridiculous and Queer Logic

  • (cs) in reply to DiamondDave
    DiamondDave:
    Alex Papadimoulis:

    Yesterday, we learned about the latest programming paradigm, Vector Oriented Programming. Continuing that trend, I'd like to introduce the next big thing in data storage: the Column Separated Value DataBase. Discovered by Shaun, the CSVDB will do nothing short of revolutionizing the way we store, manage, and think about data ...

    CREATE TABLE [Tab001]
    (
    [Col001] VARCHAR(8000) NULL,
    [Col002] VARCHAR(8000) NULL,
    [Col003] VARCHAR(8000) NULL,
    [Col004] VARCHAR(8000) NULL,
    [Col005] VARCHAR(8000) NULL,
    [Col006] VARCHAR(8000) NULL,
    [Col007] VARCHAR(8000) NULL,
    ...
    [Col041] VARCHAR(8000) NULL,
    [Col042] VARCHAR(8000) NULL,
    [Col043] VARCHAR(8000) NULL,
    [Col044] VARCHAR(8000) NULL,
    [Col045] VARCHAR(8000) NULL
    )
    CREATE TABLE [Tab002] ( ...)
    CREATE TABLE [Tab003] ( ...)
    ...

    As you can probably tell, the CSVDB combines the best parts of CSV (Comma Separated Value) files with a relational database. No longer do you need to be tied down with things like a data model -- just pick a column from the vast array of tables, and use it how ever you'd please. And on top of that, the CSVDB doesn't burden you with foreign keys, column names, data validation, table names, or even primary keys ...

    Col001 | Col002   | Col003   | Col004        | Col005 | Col006 | Col007 | Col008      | ...
    -------+----------+----------+---------------+--------+--------+----------------------+-----
    3      | NULL     | NULL     | http://www... | NULL   | 5,6,7  | (really long string) | ...

    ... and yes, you saw right. The CSVDB, unlike CSV files, allows you to have comma-separated data within a cell of column-separated data.



    This leads us to our next lesson today kids.... Don't smoke crack!

    Must have been a Government Project!

  • Jimmy (unregistered) in reply to ParkinT

    Now that the CSVDB is in place, it's time to write some middleware that will hide the complexity of dealing with it from new developers (for many of whom a CSV database will be a difficult concept) and seamlessly merge the data with other established and older technologies (such as relational databases) that may exist in the enterprise. XMLOCSVDB was invented for this purpose (XML Over CSVDB).

     

  • Stinky Pete (unregistered) in reply to DiamondDave

    DiamondDave:

    This leads us to our next lesson today kids.... Don't smoke crack!

    That made my day!!   Thanks dude!

  • RohnertGeek (unregistered) in reply to e.thermal

    Outsourcing man, outsourcing!

  • Ashley Visagie (unregistered)

    And the winner for "Obfuscated Database design of the Year" goes to this WTF!  I love the smell of stupid design in the morning [;)]

    Ashley

  • (cs) in reply to LOLLORCOASTER

    Anonymous:
    Anonymous:
    I don't know about you, but I store my persistent data on our conference room projector.  It is high resolution, so I think I can cram lots of data in there.

    Oh that's silly.  We have "offshored" all our data storage needs.  Now whenever any data is needed we call up a special residential warehouse (a data warehouse, heh) built to house and feed the workers, otherwise known as "Memorizers".  Then we either ask them to recall some data or we give them some data to memorize.  I cannot even describe the savings we realized from not having to buy any hardware or software, and it's so scalable!!!  And our users love it, because they don't have to use any complicated query languages (though some practice in accent decoding is necessary).

     

    Truly one of the best "I'm gonna outdo you" or "My dad can beat your dad up" comebacks. Thank you for making my day a whole lot happier as I left the office!!!

  • (cs) in reply to e.thermal
    e.thermal:

    Volmarias:
    I propose the following solution to programmers who write things like this: Euthanasia

    What do the Youth in Asia have to do with this?

    Only Gene W. would find that funny.

  • (cs) in reply to ParkinT

    What, only two dimensions?  Where's the collection of files to give a third dimension?  Of directories to offer a fourth?  (And Parent Directories for additional dimensions?)

    Or, to take it the other direction, why not a file for each vector (row)?  A file for each entry (value)?  (Why take up all that file space with commas when you can waste it on directory entries instead?  Why make it difficult to manage the guts of the file when you can punish the directory performance?)  At least with a file for each entry you can more easily differentiate between null and empty entries, should you so desire.  And if you indicate it somewhere (e.g. encode it in the file name) you can also use different character encodings for different entries, too!

    But for all its faults, it seems to meet the most basic requirements for storage: an agreed set of delimiters (cr/lf for new vector; comma as delimiter), and an agreed character encoding (presumably ASCII, UTF-8 or UTF-16).  The default storage model for, say, Excel, is quite a lot richer but it's not like Micro$oft publishes their complete file specification for others to easily reuse.  And CSV does have some limited portability to other applications. 

    Is it time to resurrect the "binary format" versus "text-based (or human-readable) format" wars?

  • (cs) in reply to neek
    Anonymous:
    Paula coded it? It's brillant!

    You forgot the else clause of your ternary operator.  I think you meant..

    Paula coded it? It's Brillant! : It's Billiant!
  • Timbo Jones (unregistered)
    Alex Papadimoulis:

    Col001 | Col002   | Col003   | Col004        | Col005 | Col006 | Col007 | Col008      | ...
    -------+----------+----------+---------------+--------+--------+----------------------+-----
    3      | NULL     | NULL     | http://www... | NULL   | 5,6,7  | (really long string) | ...

    ... and yes, you saw right. The CSVDB, unlike CSV files, allows you to have comma-separated data within a cell of column-separated data.



    Not only that, but apparently it allows really long strings to span columns!  How'd they do that?!
  • Gio (unregistered) in reply to marvin_rabbit

    Don't be ridiculous.  You're not supposed to store CSV in the fields...  That's supposed to be XML data.


    XML data? Why limit yourself to XML data? Store serialized java object in the database!
    Think of a database where you can store a full hashtable in a single field... wouldn't it be a dream?

    (I've actually seen it done - in an enterprise web application of course)
  • (cs) in reply to maht
    Anonymous:
    Anonymous:
    ... then XML is probably the better choice.


    may I ask which other schemes you considered before making this decision ?

    CSV is not a "standard" anywhere so it is a poor choice for unambiguous data transmission.

    in XML

    <row><cell>cell1</cell></row>
    and
    <row>
        <cell>
              cell1
        </cell>
    </row>

    are *not* the same

    I hoped that worked out, XML makes it hard to post examples, another reason to stand idly by not calling 911 when it looks like it might die !!



    Am I missing something here or are the two examples exactly the same in XML?
  • (cs) in reply to anon
    Anonymous:
    About the best I can say about this is at least they used VARCHAR instead of CHAR!


    Don't be too impressed. In most databases, that many CHAR(8000) columns would have exceeded the maximum row length...

  • Kiss me, I'm Polish (unregistered) in reply to Timbo Jones
    Timbo Jones:
    Alex Papadimoulis:

    Col001 | Col002   | Col003   | Col004        | Col005 | Col006 | Col007 | Col008      | ...
    -------+----------+----------+---------------+--------+--------+----------------------+-----
    3      | NULL     | NULL     | http://www... | NULL   | 5,6,7  | (really long string) | ...

    ... and yes, you saw right. The CSVDB, unlike CSV files, allows you to have comma-separated data within a cell of column-separated data.



    Not only that, but apparently it allows really long strings to span columns!  How'd they do that?!

    You can span columns and even rows in Word since 1995. And it's designed for word processing, not table processing. A modern database should do that without problems. What's the big deal?
  • HwAoRrDk (unregistered) in reply to LOLLORCOASTER

    Anonymous:
    Oh that's silly.  We have "offshored" all our data storage needs.  Now whenever any data is needed we call up a special residential warehouse (a data warehouse, heh) built to house and feed the workers, otherwise known as "Memorizers".  Then we either ask them to recall some data or we give them some data to memorize.  I cannot even describe the savings we realized from not having to buy any hardware or software, and it's so scalable!!!  And our users love it, because they don't have to use any complicated query languages (though some practice in accent decoding is necessary).

    So, I assume you will soon be embarking on a round of lawsuits (SCO-style!), suing all those offshore tech support call-centres for infringing your patented data storage methods, yes?

    Oh, wait... No, slightly different concept. Those tech support call-centres never accurately recall the information they have stored.

    [:D]

  • (cs)

    The REAL WTF here is that this is only half a design.

    The columns should have been declared with a variety of data types (int, small int, blob, char, etc etc etc).

    Each table should have had a key to allow joining to the (obviously missing) sister tables (1 per data table) that could contain such things as fieldname and allowed values, etc.  How the hell else are you gonna properly document this sucker.

    Then they would really have something.

    I'd call it Meta WTF Database Enterprise And Deployment System (MWTFDEADS).

    It would really have the juice!

  • foonly (unregistered) in reply to say what

    PHB: You don't need to take a course in SQL, just put the data in a database.

  • O(nnnnnnnnnnnnnnnnn) (unregistered) in reply to foonly

    What's wrong with that Oracle database?

  • Ron Savage (unregistered) in reply to Kiss me, I'm Polish

    > Not only that, but apparently it allows really long strings to span columns!  How'd they do that?!

    HTML tables and the colspan option, of course?

     

  • (cs) in reply to nobody
    nobody:

    > Minor quibble. The CSV format does allow for
    > comma-separated data within a field. That field,
    > however, must be surrounded by double quotes.
    > Double quotes inside of such a field must be
    > escaped using another double-quote character.

    You're kidding, right?


    No, he's not kidding. CSV does allow for commas within the field; if they exist, the field is surrounded by quotation marks. Suppose your data looks like this:

    column1 = a
    column2 = b,c
    column3 = d

    Then the CSV equivalent is:

    a,"b,c",d
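
    For anyone who wants to check this without hand-counting quotes, here is a minimal sketch using Python's standard csv module (my choice of tool, not anything from the original post). It writes the three values above and reads them back.

```python
import csv
import io

# The three values from the example above; the middle one contains a comma.
row = ["a", "b,c", "d"]

# Write the row. With default settings the writer quotes only the
# field that contains the delimiter.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(row)
encoded = buffer.getvalue()

# Read it back to confirm the comma stayed inside a single field.
decoded = next(csv.reader(io.StringIO(encoded)))

print(encoded.strip())
print(decoded)
```

    The encoded line comes out as a,"b,c",d and the decoded row still has three fields, not four.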
  • John (unregistered) in reply to Scott S. McCoy

    Sometimes you're stuck with the tools at hand.

    I worked for my local county government budget office, we had: Excel, Word, ESSBASE (multidimensional database) and CSV text files FTP'd to-from the old mainframe.

    Actually buying something was often impossibly difficult. For example, the tech support for our ESSBASE consultant was on Compuserve. Compuserve would take payment by credit card or check (in advance); the county had no credit cards (that I could use) and would only issue checks with invoices; I couldn't pay for it myself, nor could the consultants, as that would be an illegal 'gift' to the county... Using any software without well-documented licenses was absolutely forbidden.

    So I wrote a reporting program in Excel VBA that used the API to pull data from ESSBASE, matched position numbers to names from the text files, formatted the pages like the old mainframe reports, and even shaded the backgrounds of the cells in alternating blocks of 3 lines (mod(row,6)>2), tweaked the page setup options, etc. It was beautiful. Then I stayed there 36 hours straight to print the final reports (Monday deadline written in state law, data not done until Friday), with poor little desktop printers spewing out 5000 pages from trays that only held 200ish pages, all while suffering from a bleeding ulcer (from medication I was on, not stress).

    Unfortunately, when my temp position became permanent, someone massively overqualified applied, and the official HR policies required they hire him instead (fair is fair; I was recommended to the temp job by my also-white-male brother). He left for a better-paying job after a month, but I had already gotten a new job by then, so they had to hire two people to replace me.

    Several months later I stopped by to say 'Hi' to folks (while in the courthouse for other business), and one of the two asked me how I produced the final report. "I stayed here for 36 hours straight, fixing up the files, running the macros, and watching the printers. Good luck."

    I loved that job.

  • Geoff Oliver (unregistered)

    The person who dreamt this gem up should be dragged out in the road and shot. Why in the hell would you do something like this?

  • (cs) in reply to Casiotone
    Casiotone:
    Anonymous:
    Anonymous:
    ... then XML is probably the better choice.


    may I ask which other schemes you considered before making this decision ?

    CSV is not a "standard" anywhere so it is a poor choice for unambiguous data transmission.

    in XML

    <row><cell>cell1</cell></row>
    and
    <row>
        <cell>
              cell1
        </cell>
    </row>

    are *not* the same

    I hoped that worked out, XML makes it hard to post examples, another reason to stand idly by not calling 911 when it looks like it might die !!



    Am I missing something here or are the two examples exactly the same in XML?


    In the second example, cell has extra line feeds and space in it.  I suppose it depends on whether the parser strips that off or not.
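
    A quick way to see the difference, sketched with Python's standard xml.etree.ElementTree (just one parser picked for illustration; others may be configured to treat whitespace differently). This parser keeps the whitespace, so the two cells only match once the caller strips it.

```python
import xml.etree.ElementTree as ET

# The two documents from the quoted post.
compact = ET.fromstring("<row><cell>cell1</cell></row>")
spread = ET.fromstring(
    "<row>\n"
    "    <cell>\n"
    "          cell1\n"
    "    </cell>\n"
    "</row>"
)

compact_text = compact.find("cell").text
spread_text = spread.find("cell").text

# ElementTree preserves the line feeds and indentation inside <cell>.
same_raw = compact_text == spread_text
same_stripped = compact_text == spread_text.strip()

print(repr(compact_text), repr(spread_text))
print(same_raw, same_stripped)
```

    So whether the two are "the same" really is up to whoever consumes the parsed text.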
  • KDM (unregistered)

    I think this one justifies publishing the name of the guy who concocted this ridiculousness

  • (cs) in reply to powerlord

    Hmmm. My guess is that these were created by some importing process. The source files were likely CSV or somesuch with wildly varying content in the same position. Maybe from a marketing survey or something.

    No way a person did this. (Please god, no way)

  • jesus (unregistered) in reply to marvin_rabbit
    marvin_rabbit:
    Alex Papadimoulis:

    ... and yes, you saw right. The CSVDB, unlike CSV files, allows you to have comma-separated data within a cell of column-separated data.


    Don't be ridiculous.  You're not supposed to store CSV in the fields...  That's supposed to be XML data.

    Sheesh.



  • jesus (unregistered) in reply to jesus

    Apparently the CAPTCHA was case-sensitive and erased my message the first time....
    the XML would look like:

    (getting tired of retyping this)

    <csv>
     <row>
      <value>...</value>
      <comma />
      <value>...</value>
     </row>
    </csv>

  • DaBookshah (unregistered) in reply to Runtime Error
    Anonymous:

    Couldn't we just track these people down and brand "WTF" on their foreheads. 


    Amen to that.
  • John Hensley (unregistered) in reply to DaBookshah
    Anonymous:
    Anonymous:

    Couldn't we just track these people down and brand "WTF" on their foreheads. 


    Amen to that.

    or maybe just the letter F. They would thenceforth be known as "F-heads."
  • (cs) in reply to Gio
    Anonymous:
    > Don't be ridiculous.  You're not supposed to store CSV in the fields...  That's supposed to be XML data.

    XML data? Why limit yourself to XML data? Store serialized java object in the database!
    Think of a database where you can store a full hashtable in a single field... wouldn't it be a dream?

    (I've actually seen it done - in an enterprise web application of course)

    Me too. Table design is:
      int id;
      blob object;
    Table is named "data" of course.
  • (cs) in reply to dmitriy
    dmitriy:
    nobody:

    > Double quotes inside of such a field must be
    > escaped using another double quotes character.

    You're kidding, right?


    No, he's not kidding. CSV does allow for commas within the field; if they exist, the field is surrounded by quotation marks.


    I had the displeasure of working with a system recently that doesn't bother escaping what it writes out - double quotes, newlines...

    Why people don't realise a CSV could contain multiple lines and double quotes is beyond me...
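
    To illustrate what a writer that does bother would produce, here is a small sketch with Python's standard csv module (an assumption on my part; the system I'm complaining about was something else entirely). The field values are made up.

```python
import csv
import io

# Hypothetical fields: one with embedded double quotes, one with a newline.
fields = ['say "hi"', "line one\nline two", "plain"]

# The writer wraps the awkward fields in quotes and doubles any
# embedded double-quote characters.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(fields)
encoded = buffer.getvalue()

# A reader that understands quoting gets back one record with three
# fields, even though the encoded text spans two physical lines.
records = list(csv.reader(io.StringIO(encoded, newline="")))

print(encoded)
print(records)
```

    Splitting that output on newlines and commas by hand would give the wrong answer, which is exactly the trap.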
  • me (unregistered) in reply to Volmarias
    Volmarias:
    I propose the following solution to programmers who write things like this:

    Euthanasia



    Yes those young chinese kids are quite smart (excellent musicians quite often as well), plus little nimble fingers make for quick typing.  The programmer should have been outsourcing his work.

  • (cs) in reply to Yeah, what the fag
    Anonymous:

    Actually, the format is less bloated and easier processable than the modern idiot programmer's favorite toy-format, the Xtremely Moronic Language.


    ...:|

    I love these little mini-WTFs that invariably follow any daily WTF post
    :D
  • (cs) in reply to ParkinT

    I object to the 'queer' bit
    gay people WOULD NOT do it like that :P

  • (cs) in reply to analysis

    Yes, I know a company which did it this way. The really interesting thing is, they are proud of their flexible design [:'(]

  • Crazy Loon (unregistered) in reply to borealis

    What if there is a pipe within my data? Can I escape it ("|")?

  • (cs)

    Isn't this just a flat text file with some transaction management bolted on?

  • C (unregistered)

    Looks like someone's (mis)using a code generator.

  • Cope with IT (unregistered) in reply to Gio

    That would be a real object-oriented database....

  • (cs) in reply to xcor057
    Anonymous:

    The forum should stop protecting the identity of these perpetrators.  Instead, those posting and reporting these incidents should be required to hack into the company’s security badge image database



    ...and try to figure out how that database is put together? No thanks. Not after today's post.


  • (cs) in reply to WTF Batman
    WTF Batman:
    Thought so. I've actually been playing with Postgres recently. I'm finding it to be an incredible database. However, some of the syntax can be a little... well... obfuscated. So far, I'd say that's the exception rather than the rule, but you sure found a real gem.


    Yep, PostgreSQL rules. And it also has tons of really really... interesting data types and you can extend the thing.

    And luckily it's not the OSS database "market leader", and the new users who turn to it are My-"what's a foreign key?"-SQL expatriates who start working on it nice and simple, so we haven't seen too many abuses of the type system yet. =) I guess making new types in PostgreSQL takes some skill; therefore, if you find an abuse, at least you can claim that it's the product of conscious malice. =)

    Or that's how I see it. Haven't really looked much deeper in the system. Fun datatypes in any case.
  • Chris (unregistered) in reply to fat-tony
    Anonymous:
    Anonymous:
    Doesn't get much less worthless than this. [8-|]


    well, it could just be 1 column of csv :)

    It could put the entire CSV row into XML, and store that in one column. In fact, store the entire CSV as XML in a table of one column of one row.

    ob.wtf: Brillant!

    Chris

  • (cs)

    With the WTFs so far, I can follow the programmer's thinking process, even if it is broken. But in this case, no primary keys. I am unable to see how someone would think "I need information in tables." but not "I need to locate the appropriate record."

    Aarrgh!

Leave a comment on “Introducing the CSVDB”
