• Acsi (unregistered) in reply to Memo Ree
    Memo Ree:
    Imagine what happens should two people run the same report at the same time.
    What a mind blowing concept! Two people using the same thing! At the same time even!

    We still have a lot of people who can't wrap their brains around the idea that there is anything more than one computer and one user. Or, if there is more than one, it must be an identical computer using its own local storage for its own copies of software, spreadsheets, documents, etc.

    We had this problem nicely solved about 30 years ago. Professionals who understood data management and systems design used that knowledge to, uhmm, manage data and design systems. And they put it all together on hardware and operating systems that were designed from the ground up for multiple users and graceful resource sharing. You could think of this package of hardware and software as a computer specifically intended for business, or, more succinctly, a Business Computer (BC).

    Then along came the Personal Computer (PC) and technology was dealt a blow from which it has yet to fully recover. Ignoring the clear implication that a Personal computer is designed for one person and not for a business, businesses flocked to gobble them up by the ton. People who taught themselves "how to program" using a toy were suddenly everywhere, producing software aligned with the Personal Computer focus. Who cares about file locking? Concurrency control? Hell, even file permissions and user IDs were superfluous and therefore omitted. Who needs the concept of a root user when there's only one user per computer? Let the user do anything they want! Basically all the solutions that were already well understood were thrown in the trash, only to be painfully, gradually, and only partially, rediscovered in the decades to follow.

    Yeah, some wise ass will chime in to trash the multi-user computers of the 1970s as expensive, slow, and puny in their specs. Do you really think everything would have stagnated there? Just try to imagine where 30 years of progress in Business Computers could have taken us by now. The gap between that potential and today's reality -- where ignorant (or worse: arrogant) developers still assume everyone's computer is just like mine, and everyone runs as admin -- can be charged entirely to the damning influence of the Personal Computer and the software companies whose blatant lies were swallowed whole by an unsuspecting public.

    Yeah, those with the experience to see through the lies learned to despise those marketing-driven empty-promise factories. But the public, with little computer knowledge, believed everything, and is only now gradually obtaining enough computer savvy to understand the need for things we had, and discarded, before half of us were even born.

    Winner, winner, chicken dinner!

  • foxyshadis (unregistered) in reply to Memo Ree
    Memo Ree:
    Imagine what happens should two people run the same report at the same time.
    What a mind blowing concept! Two people using the same thing! At the same time even!

    We still have a lot of people who can't wrap their brains around the idea that there is anything more than one computer and one user. Or, if there is more than one, it must be an identical computer using its own local storage for its own copies of software, spreadsheets, documents, etc.

    We had this problem nicely solved about 30 years ago. Professionals who understood data management and systems design used that knowledge to, uhmm, manage data and design systems. And they put it all together on hardware and operating systems that were designed from the ground up for multiple users and graceful resource sharing. You could think of this package of hardware and software as a computer specifically intended for business, or, more succinctly, a Business Computer (BC).

    Then along came the Personal Computer (PC) and technology was dealt a blow from which it has yet to fully recover. Ignoring the clear implication that a Personal computer is designed for one person and not for a business, businesses flocked to gobble them up by the ton. People who taught themselves "how to program" using a toy were suddenly everywhere, producing software aligned with the Personal Computer focus. Who cares about file locking? Concurrency control? Hell, even file permissions and user IDs were superfluous and therefore omitted. Who needs the concept of a root user when there's only one user per computer? Let the user do anything they want! Basically all the solutions that were already well understood were thrown in the trash, only to be painfully, gradually, and only partially, rediscovered in the decades to follow.

    Yeah, some wise ass will chime in to trash the multi-user computers of the 1970s as expensive, slow, and puny in their specs. Do you really think everything would have stagnated there? Just try to imagine where 30 years of progress in Business Computers could have taken us by now. The gap between that potential and today's reality -- where ignorant (or worse: arrogant) developers still assume everyone's computer is just like mine, and everyone runs as admin -- can be charged entirely to the damning influence of the Personal Computer and the software companies whose blatant lies were swallowed whole by an unsuspecting public.

    Yeah, those with the experience to see through the lies learned to despise those marketing-driven empty-promise factories. But the public, with little computer knowledge, believed everything, and is only now gradually obtaining enough computer savvy to understand the need for things we had, and discarded, before half of us were even born.

    I'm not entirely sure whether you're trolling or not, but this phenomenon has nothing to do with the class of machine in use and everything to do with the expansion and commoditization of programming as a discipline. The more you invite every yahoo in to design your software, the more kludgy and half-broken it gets, and if you won't pay for expertise you deserve it.

    As long as there's a demand for more software than top-tier CS experts can develop, there will be plenty of non-experts developing, same as any other industry. Besides cloning, what is your grand solution, oh great whiner?

    You'd prefer to keep programming in the hands of the wizards who know what's best for the company, with no tolerance for feedback, and years- or decades-long delays to move to new projects, as the 50's-70's were. Business computers failed not because of the price - look at how expensive current mainframes are - but because the culture was stagnant and rotten.

    But all of this is evolution on a huge scale: PC hardware and software evolves over time to become better and cheaper, bad systems eventually improve or die out. Dedicated business systems tend to evolve by stealing the best PC ideas, and then only when threatened by PC competition.

  • blunder (unregistered) in reply to Jay
    Jay:
    I'm a big fan of natural keys. But I heartily agree that when there is no obvious natural key, don't try to force one. I've seen crazy keys like "customer zip code plus bith date plus first three letters of last name". Like, give it up man! Just use a sequence number! I'm especially annoyed with db designers who put together some crazy collection of fields in the hope that it will be unique, and then tack on a sequence number because they know duplicates are possible. If you're going to put a sequence number on the end anyway, why not just forget the rest and just use the sequence number?

    It's was said once already, but it bears repeating: SSNs are not guaranteed to be unique. They're not unique even if everyone gives you their real SSN, and they won't. It also does your users a disservice when you ask any (or all) of them for an SSN simply to identify them with. Coming up with a unique ID is your job, not theirs.

    http://blogs.computerworld.com/node/5969

    (Some embedded devices don't support file systems or surrogate keys, so obviously in that case SSNs are the way to go.)

  • (cs) in reply to Jay
    Jay:
    Jaime:
    I'm a big fan of surrogate keys. Looking through some of the databases I am responsible for, most of the tables in them have no natural unique identifier. What would you suggest a natural identifier for an order header record would be?

    I'm a big fan of natural keys. But I heartily agree that when there is no obvious natural key, don't try to force one. I've seen crazy keys like "customer zip code plus bith date plus first three letters of last name". Like, give it up man! Just use a sequence number! I'm especially annoyed with db designers who put together some crazy collection of fields in the hope that it will be unique, and then tack on a sequence number because they know duplicates are possible. If you're going to put a sequence number on the end anyway, why not just forget the rest and just use the sequence number?

    +1

    That said, when there is an obvious natural key, like "social security number" for employee or "two-letter state abbreviation" for a U.S. state, I'd prefer to just use it. Then I don't need a bunch of extra joins to get a recognizable identifier.

    -10

    SSNs are unique only in theory. Additionally, many people have none. It makes for an awful key... if you're using SSN and you haven't yet run into a case where it doesn't work, you've been lucky not smart.

    Unless you're trolling me, in which case godspeed.

  • Kozz (unregistered) in reply to Craig
    On a related note, I'm always amazed when I read the comments on the article. It makes me wonder how many WTFs the readers of the blog are creating for future generations to scratch their heads over.

    Sir, your real WTF is that readers suppose they'll find insightful, informative comments attached to thse articles.

  • AnOldRelic (unregistered) in reply to Marc B
    Marc B:
    I wonder if his please for help were ignored because he didn't say pleas.

    ......There is a special level in Hell dedicated to people like you....

  • rbjacobs (unregistered)

    Yes, this is the real WTF! Keeping such a monstrosity alive is actually Worse Than Failure.

    Bonus ending: Someone received a bonus for 'managing' this.

  • PRMan (unregistered) in reply to Cyrus
    Cyrus:
    Fred:
    Anon:
    Fred:
    >The code module contained such gems as this function: > Function FileExists(f As String) As Boolean

    That one is actually well designed - it just looks sucky because of the way you had to do things in VB.

    No, no it isn't.

    Ok, why it the design poor? It has a reasonable name, and actually uses parameters instead of global variables. Ok, the FreeFile thing is a red herring, but other than that it does what it has to do. Blame Bill Gates for the error handling weird syntax.

    Public Function FileExists(sFullPath As String) As Boolean
        Dim oFile As New Scripting.FileSystemObject
        FileExists = oFile.FileExists(sFullPath)
    End Function

    The original is better as it doesn't leak FileSystemObjects.

  • Mike D. (unregistered) in reply to Jay
    Jay:
    <comment type=grammar subtype=nazi> "largely because writing applications are boring"

    I think that should be, "writing applications IS boring". Just because applications is plural doesn't mean that the sentence calls for a plural verb. The subject of the verb is "writing", not "applications".

    I usually brush off the grammar errors but for some reason this was leaped [sic] out and attacked me. </comment>

    Your attribute values need quotes.

    I usually brush off XML errors but this one leapt out and attacked me, too. ;p

    On a different note, I am wondering how a database having "not a single stored procedure" means that it is defective. I've got a CMS-style web app I cooked up in Django this past month that seems to be working pretty well, even though I haven't written any SQL, much less stored procedures. Django's ORM seems to be working pretty well so far. Mostly, I'm wondering if you guys have good examples of stored procedure usage that are more than just SQL macros.

  • PRMan (unregistered) in reply to AnOldHacker

    Does nobody use transactions?

    Even most of the poor programmers I know use transactions to avoid this problem in their badly-written MAX+1 code.

  • Doc (unregistered)

    Great Scott! They mastered ternary logic! And the Cubs did win the World Series! I've got to get back to 1985 and tell Marty!

  • Someone like Kevin (unregistered) in reply to Jay
    Jay:
    Craig:
    You've never had a natural identifier change on you?

    ON UPDATE CASCADE

    Yeah, that's great for performance in your OLTP DB. </sarcasm>

  • (cs) in reply to Jay
    Jay:
    <comment type=grammar subtype=nazi> "largely because writing applications are boring"

    I think that should be, "writing applications IS boring". Just because applications is plural doesn't mean that the sentence calls for a plural verb. The subject of the verb is "writing", not "applications".

    I usually brush off the grammar errors but for some reason this was leaped out and attacked me. </comment>

    Teaching opportunity: in "writing applications are boring", "writing" is a present participle; in "writing applications is boring", it's a gerund.

  • (cs) in reply to PRMan
    PRMan:
    Cyrus:
    Fred:
    Anon:
    Fred:
    >The code module contained such gems as this function: > Function FileExists(f As String) As Boolean

    That one is actually well designed - it just looks sucky because of the way you had to do things in VB.

    No, no it isn't.

    Ok, why it the design poor? It has a reasonable name, and actually uses parameters instead of global variables. Ok, the FreeFile thing is a red herring, but other than that it does what it has to do. Blame Bill Gates for the error handling weird syntax.

    Public Function FileExists(sFullPath As String) As Boolean
        Dim oFile As New Scripting.FileSystemObject
        FileExists = oFile.FileExists(sFullPath)
    End Function

    The original is better as it doesn't leak FileSystemObjects.

    VB6 destroyes COM objects by reference counting. Every FileSystemObject created by this function will be properly destroyed at the end of the function as the (poorly named) oFile variable goes out of scope.

  • (cs) in reply to Mike D.
    Mike D.:
    Jay:
    <comment type=grammar subtype=nazi> "largely because writing applications are boring"

    I think that should be, "writing applications IS boring". Just because applications is plural doesn't mean that the sentence calls for a plural verb. The subject of the verb is "writing", not "applications".

    I usually brush off the grammar errors but for some reason this was leaped [sic] out and attacked me. </comment>

    Your attribute values need quotes.

    I usually brush off XML errors but this one leapt out and attacked me, too. ;p

    On a different note, I am wondering how a database having "not a single stored procedure" means that it is defective. I've got a CMS-style web app I cooked up in Django this past month that seems to be working pretty well, even though I haven't written any SQL, much less stored procedures. Django's ORM seems to be working pretty well so far. Mostly, I'm wondering if you guys have good examples of stored procedure usage that are more than just SQL macros.

    I'll go one further... Too many stored procedures often indicates cargo cult programming, or an application design philosophy that hasn't been updated in ten years. Stored procedures can be used for a lot more than SQL macros. However, the best uses of stored procedures I've seen are in the applications where they are only used when they provide some specific benefit. People who use stored procedures by reflex instead of by design sometimes end up simply wrapping CRUD in a pretty dress.

  • will (unregistered)

    unfortunately that sql code is not uncommon. It comes from offices that have bought into the "use stored procedures for everything, dynamic SQL is bad"

    You run into a lot of instances where dynamic SQL in Crystal reports would be easy to do but you have to use a stored procedure, so you have to build the function mentioned.

  • Simon (unregistered) in reply to PRMan
    PRMan:
    Does nobody use transactions?

    Even most of the poor programmers I know use transactions to avoid this problem in their badly-written MAX+1 code.

    They're certainly poor programmers, if they think transactions will make any difference. The problem is simply that if two users run "select max+1" at the same time, they'll get the same answer, and they'll try to use the same 'unique' key on the next insert.

  • SW (unregistered) in reply to The Real Jason

    I'm supporting it now.

  • (cs) in reply to Simon
    Simon:
    PRMan:
    Does nobody use transactions?

    Even most of the poor programmers I know use transactions to avoid this problem in their badly-written MAX+1 code.

    They're certainly poor programmers, if they think transactions will make any difference. The problem is simply that if two users run "select max+1" at the same time, they'll get the same answer, and they'll try to use the same 'unique' key on the next insert.

    If they use the SERIALIZABLE transaction isolation mode, one will wait for the other instead of arriving at the same answer. Of course, it effectively single-threads inserts, but it gives correct results.
  • oheso (unregistered) in reply to da Doctah
    da Doctah:
    Teaching opportunity: in "writing applications are boring", "writing" is a present participle; in "writing applications is boring", it's a gerund.

    Further teaching opp: "writing applications" is the subject; therefore the original sentence is correct. Writing applications (e.g., Microsoft Word, OpenOffice Writer, et hoc genus omni) are boring.

  • Delicious (unregistered)

    TRWTF is expecting people to read VBScript for laughs. I'd rather poke a blunt fork through my testicles.

  • Rand (unregistered) in reply to Jaime
    Jaime:
    I'll go one further... Too many stored procedures often indicates cargo cult programming, or an application design philosophy that hasn't been updated in ten years. Stored procedures can be used for a lot more than SQL macros. However, the best uses of stored procedures I've seen are in the applications where they are only used when they provide some specific benefit. People who use stored procedures by reflex instead of by design sometimes end up simply wrapping CRUD in a pretty dress.

    Stored procedures are pre-compiled by the database when they are loaded, while in-line SQL needs to be compiled when it is run; thus stored procedures always provide a speed benefit.

    On another note, keeping all code in stored procedures helps make the code more searchable. I work on an application with tens of thousands of files. It is much faster for me to find the SQL I need if it is always in the SQL files (as stored procedures) and I don't need to search the other file types.

  • Mike D. (unregistered) in reply to Simon
    Simon:
    PRMan:
    Does nobody use transactions?

    Even most of the poor programmers I know use transactions to avoid this problem in their badly-written MAX+1 code.

    They're certainly poor programmers, if they think transactions will make any difference. The problem is simply that if two users run "select max+1" at the same time, they'll get the same answer, and they'll try to use the same 'unique' key on the next insert.

    And one of those transactions will fail because of the "unique" constraint. The process that fails reruns the SELECT MAX+1 transaction. I.e. "Keep running the play until you get it right."

    ... SQL programmers check result codes, right? Now I'm wondering how much Django is doing for me.

    Rand:
    Stored procedures are pre-compiled by the database when they are loaded, while in-line SQL needs to be compiled when it is run; thus stored procedures always provide a speed benefit.
    So the time required to fetch a stored procedure is always less than the time needed to compile the SQL?

    I think I'll be sticking with Django and its compiled-SQL cache for now.

    Rand:
    On another note, keeping all code in stored procedures helps make the code more searchable. I work on an application with tens of thousands of files. It is much faster for me to find the SQL I need if it is always in the SQL files (as stored procedures) and I don't need to search the other file types.
    ... Wut? Your development workspace is the database itself, and not offline? I hope you have some version control in place. And I can't think of a time when a database search could do something that a good `find` couldn't in about the same time.
  • (cs) in reply to Rand
    Rand:
    Jaime:
    I'll go one further... Too many stored procedures often indicates cargo cult programming, or an application design philosophy that hasn't been updated in ten years. Stored procedures can be used for a lot more than SQL macros. However, the best uses of stored procedures I've seen are in the applications where they are only used when they provide some specific benefit. People who use stored procedures by reflex instead of by design sometimes end up simply wrapping CRUD in a pretty dress.

    Stored procedures are pre-compiled by the database when they are loaded, while in-line SQL needs to be compiled when it is run; thus stored procedures always provide a speed benefit.

    On another note, keeping all code in stored procedures helps make the code more searchable. I work on an application with tens of thousands of files. It is much faster for me to find the SQL I need if it is always in the SQL files (as stored procedures) and I don't need to search the other file types.

    I think a new episode of Friends is on tonight... oh, wait, it's not the early 1990s. Every major database confers the same benefits to in-line SQL through a process called "ad-hoc batch caching". It was first added to Microsoft SQL Server in 1997. This feature was the exact source of my comment about not revising application design philosophies in ten years. Interestingly, one of the specific cases where I always use stored procedures is where I want to force the execution plan to never be cached. You can easily create a stored procedure with the "recompile" option, but it is very difficult to fool the ad-hoc batch caching mechanism. As for searchability, you seem to suggest that I keep my SQL embedded within my application code. On the contrary, I can simply search the data access layer code (which sometimes consists of stored procedures).

    Finally, how long does it take to compile a stored procedure? Do you really drive the entire architecture of your projects around the idea of saving three seconds of processor time per day on a database server that is probably disk-bound anyways?

  • hans (unregistered) in reply to Grammar Nazi

    Should remind you that not all of us are native American writers and that there exists more than one English.

  • Cheong (unregistered) in reply to Jay
    Jay:
    <comment type=grammar subtype=nazi> "largely because writing applications are boring"

    I think that should be, "writing applications IS boring". Just because applications is plural doesn't mean that the sentence calls for a plural verb. The subject of the verb is "writing", not "applications".

    I usually brush off the grammar errors but for some reason this was leaped out and attacked me. </comment>

    Except when he really means writing the word ("application") multiple times on paper is boring. :P
  • 50% Opacity (unregistered) in reply to hans
    hans:
    Should remind you that not all of us are native American writers and that there exists more than one English.

    Of course, how could we forget about Ungrammaticlish and Typonglish, the other two main branches of English?

  • 50% Opacity (unregistered) in reply to Cheong
    Cheong:
    Except when he really means writing the word ("application") multiple times on paper is boring. :P

    No. Even in this case... no!

  • Coder (unregistered)

    There is no Windows API for "FileExists". The closest is probably "FindFirstFile", which in VB maps to the DIR function, which would be used for the canonical MS BASIC "FileExists" function.

    Both the "On Error" and "filesystem" approaches are ugly hacks. Using the filesystem object requires that you create and destroy a filesystem object every time you test: that's going to be slower even than the disk access. Using "On Error" in VB4,5, or 6, creates an Error Object every time access fails. Which is not as bad, but still not useful if all you are going to do with the object is immediately destroy it.

    Many of the "VB" examples you used to see at microsoft.com where actually "ASP Classic" examples, and included the explicit destruction of objects, on the assumption that the code would run in a global namespace, rather than in a function. And of course, many people advocate the explicit destruction of objects in VB because they don't know the difference between VB and broken C++.

  • (cs) in reply to Mike D.
    Mike D.:
    Jay:
    <comment type=grammar subtype=nazi> [...] </comment>
    Your attribute values need quotes.

    I usually brush off XML errors but this one leapt out and attacked me, too. ;p

    It could be SGML. That has different, laxer rules in this area than XML so simple words such as above do not need quoting (which is one reason why HTML is so horrendous to parse; SGML is… difficult).

  • Dan (unregistered) in reply to EmperorOfCanada

    Lotus Notes is ideal for a CMS. You have authentication, a very suitable security model, a document based database, workflow, a web server, e-mail support, choice of programming languages, clustering support etc. etc.

    In fact I would say there were few platforms that were better suited.

  • Jimmy Jones (unregistered)

    ] "Function FileExists(f As String) As Boolean "

    Is there something actually wrong with that? (Apart from the suckyness which BASIC forces you to add)

  • Raw (unregistered) in reply to anon
    anon:
    Function FileExists(FilePath As String) As Boolean
    FileExists = Len(Dir(FilePath)) > 0
    

    End Function

    Dangerous, as Dir seqences can't be run concurrently, so if you call that from a loop which scans a directory, you'll get wierd fails.

  • Ralf (unregistered) in reply to Jaime
    Jaime:
    Simon:
    PRMan:
    Does nobody use transactions?

    Even most of the poor programmers I know use transactions to avoid this problem in their badly-written MAX+1 code.

    They're certainly poor programmers, if they think transactions will make any difference. The problem is simply that if two users run "select max+1" at the same time, they'll get the same answer, and they'll try to use the same 'unique' key on the next insert.

    If they use the SERIALIZABLE transaction isolation mode, one will wait for the other instead of arriving at the same answer. Of course, it effectively single-threads inserts, but it gives correct results.

    Hm, no, pretty sure a Microsoft SQL Server will give back the same "next" number to two callers for all TRANSACTION ISOLATION LEVELS. It might block the insert of one party until after the other commited - but then fail the second one as this ID exists.

    So: SERIALIZABLE also does not help here!

  • (cs) in reply to Ralf
    Ralf:
    Jaime:
    Simon:
    PRMan:
    Does nobody use transactions?

    Even most of the poor programmers I know use transactions to avoid this problem in their badly-written MAX+1 code.

    They're certainly poor programmers, if they think transactions will make any difference. The problem is simply that if two users run "select max+1" at the same time, they'll get the same answer, and they'll try to use the same 'unique' key on the next insert.

    If they use the SERIALIZABLE transaction isolation mode, one will wait for the other instead of arriving at the same answer. Of course, it effectively single-threads inserts, but it gives correct results.

    Hm, no, pretty sure a Microsoft SQL Server will give back the same "next" number to two callers for all TRANSACTION ISOLATION LEVELS. It might block the insert of one party until after the other commited - but then fail the second one as this ID exists.

    So: SERIALIZABLE also does not help here!

    Our highly paid consultants solved that problem by creating a "NextId" table and putting SERIALIZABLE on a stored procedure that retrieved Max(Id)+1 and updated the Id in the same transaction.

    Addendum (2010-05-25 06:32): Of course they used this table to generate every unique id for every table in their system, by keeping a NextId for every table. When only one person was importing thousands of records the speed was acceptable. But as soon as multiple people or processes where importing data, everything would grind to a halt.

  • (cs) in reply to Cheong
    Cheong:
    Jay:
    <comment type=grammar subtype=nazi> "largely because writing applications are boring"

    I think that should be, "writing applications IS boring". Just because applications is plural doesn't mean that the sentence calls for a plural verb. The subject of the verb is "writing", not "applications".

    I usually brush off the grammar errors but for some reason this was leaped out and attacked me. </comment>

    Except when he really means writing the word ("application") multiple times on paper is boring. :P

    No - it would be correct if he referred to applications that write something or are used for writing (in that case he'd be missing a hyphen: writing-applications).

    If he refers to the fact that the process of writing one or more applications, Jay is correct.

  • Banking Contractor (unregistered) in reply to Fred
    Fred:
    >Boolean-type values in the database were rendered as a CHAR(5) field holding possible values of 'true', 'false', or NULL.

    He must have been burned in the past by MySql's crazy syntax to store and retrieve booleans in a BIT(1) type field.

    The code module contained such gems as this function: Function FileExists(f As String) As Boolean

    That one is actually well designed - it just looks sucky because of the way you had to do things in VB.

    deleted in-line SQL where ever he found it

    Also just his personal preference. Having in-line SQL isn't necessarily bad - depending on the application and how it's implemented.

    Wow!!! Are you sure this isn't your code??? I'm sorry, but I've written many many great applications in VB6 - and in fact I can say that a very very successful software company that provides real-time trading systems used VB6 for it's extremely rich, fast and flexible GUI.

    "Having in-line SQL isn't necessarily bad" - Don't let your boss see that statement - I would fire you on the spot!

    Cool Captcha - SALUTO!!

  • Marc (unregistered) in reply to Crabs
    Crabs:
    Strings suck for joining, selecting, and indexing on, because the compare takes an order of N longer than comparing an integer.

    You seem to believe in fairytales and are bitching about too many things at once. Let me clarify.

    Either you complain about indexing performance or select/join performance, because the problems are mutually exclusive. That is, once you've indexed, select/join performance issues due to use of strings will have disappeared. If you don't create indexes, you can't bitch about indexing performance.

    That aside, to compare strings, you usually don't have to compare all the characters between strings. String length is usually stored with strings; if string length differs, you've got your "no match" in O(1). All characters only need to be compared in case of a full match. Only in this worst case scenario, performance is O(n). For all other scenarios, just like integers, the strings can be compared 32 (or even 64) bits at a time. In most cases, the first comparison gives sufficient information to move on. So in practice, although there is a difference in performance, it is marginal.

    For all practical purposes, the performance of the database does not depend on using strings vs. integers but mainly on proper normalization. If you've got columns like "phonenumber1", "phonenumber2" etc and need to search the database for a phone number, this type of poor design has a much bigger performance impact than strings vs integers.

  • (cs) in reply to Marc
    Marc:
    Crabs:
    Strings suck for joining, selecting, and indexing on, because the compare takes an order of N longer than comparing an integer.

    You seem to believe in fairytales and are bitching about too many things at once. Let me clarify.

    Either you complain about indexing performance or select/join performance, because the problems are mutually exclusive. That is, once you've indexed, select/join performance issues due to use of strings will have disappeared. If you don't create indexes, you can't bitch about indexing performance.

    That aside, to compare strings, you usually don't have to compare all the characters between strings. String length is usually stored with strings; if string length differs, you've got your "no match" in O(1). All characters only need to be compared in case of a full match. Only in this worst case scenario, performance is O(n). For all other scenarios, just like integers, the strings can be compared 32 (or even 64) bits at a time. In most cases, the first comparison gives sufficient information to move on. So in practice, although there is a difference in performance, it is marginal.

    For all practical purposes, the performance of the database does not depend on using strings vs. integers but mainly on proper normalization. If you've got columns like "phonenumber1", "phonenumber2" etc and need to search the database for a phone number, this type of poor design has a much bigger performance impact than strings vs integers.

    Collation plays a huge role in the performance of string comparisons. Most databases don't do string comparisons at a binary level. They typically allow things like case insensitivity, ignore trailing spaces, and match equivalent unicode characters. Although, the performance of a search is usually limited by IO, not CPU, so all this becomes irrelevant.

  • DC (unregistered) in reply to anon
    anon:
    Function FileExists(FilePath As String) As Boolean
    FileExists = Len(Dir(FilePath)) > 0
    

    End Function

    Spot on, except you want FileExists = Len(Dir(FilePath, vbNormal)) > 0

    The vbNormal ensures you are getting file, not a directory. It is fileExists, after all.

    You can also do:

    FileExists =(GetAttr(FileName) <> vbDirectory) which is a bit faster, and avoids odd things that happen when using Dir.

    Using FSO is nice, but it's slower (x10) than the native VB stuff.

    The idiot who thought it was well designed hadn't considered what happens when yoo do fileExists() for a file locked by someone elseat, and the fact that in VB (& other languages) opening a file that doesn't exist actually creates the file, thus the function always succeeds. Awesome.

    Well designed? WTF? Avoid that guys code.

  • DC (unregistered) in reply to PRMan

    [quote]

    Public Function FileExists(sFullPath As String) As Boolean
        Dim oFile As New Scripting.FileSystemObject
        FileExists = oFile.FileExists(sFullPath)
    End Function
    [/quote]

    The original is better as it doesn't leak FileSystemObjects.[/quote]

    It doesn't leak, VB handles that when the routine goes out of scope. Most people put a 'Set ofile = Nothing' in there to be neat.

    We ain't talking about C, ya know. I suppose you think it was well designed too.

  • wisi (unregistered) in reply to blunder
    blunder:
    Jay:
    I'm a big fan of natural keys. But I heartily agree that when there is no obvious natural key, don't try to force one. I've seen crazy keys like "customer zip code plus bith date plus first three letters of last name". Like, give it up man! Just use a sequence number! I'm especially annoyed with db designers who put together some crazy collection of fields in the hope that it will be unique, and then tack on a sequence number because they know duplicates are possible. If you're going to put a sequence number on the end anyway, why not just forget the rest and just use the sequence number?

    It's was said once already, but it bears repeating: SSNs are not guaranteed to be unique. They're not unique even if everyone gives you their real SSN, and they won't. It also does your users a disservice when you ask any (or all) of them for an SSN simply to identify them with. Coming up with a unique ID is your job, not theirs.

    http://blogs.computerworld.com/node/5969

    (Some embedded devices don't support file systems or surrogate keys, so obviously in that case SSNs are the way to go.)

    A few years ago, universities had to rewrite all their applications and database tables that used SSN as an identifier to comply with Federal regulations. If you want a unique key for your tables, make sure it is under your control and not owned by someone else.

  • Jay (unregistered) in reply to AnOldHacker
    AnOldHacker:
    Jay:
    That said, when there is an obvious natural key, like "social security number" for employee or "two-letter state abbreviation" for a U.S. state, I'd prefer to just use it. Then I don't need a bunch of extra joins to get a recognizable identifier.

    Except that SSNs are NOT unique!

    It's hard to come up with a natural identifier which is stable. (Think zipcode or telephone number--what happens when you go global?) What's worse, many of the things which ought to work (SSNs, City+State) DON'T.

    In case anyone misses it, the problem with max+1 indexing (aside from races) is that the final element might be deleted. If the deletion is incomplete, the orphaned pointers will point to the next record to be created.

    How are SSN's not unique? Unless you're thinking that an illegal alien might be using someone else's SSN.

    I absolutely agree that city+state is not an adequate key for a place. There are states where you can have more than one city with the same name. That sort of thing should certainly be considered when you are thinking about natural keys. Almost every textbook ever written on database design points out that a person's name is not an adequate primary key because there could be many people named "John Smith" or "Mary Jones".

    I'd be reluctant to use a phone number as a primary key as people change their phone numbers fairly routinely. But suppose we just accepted that and used phone number as a customer id. Then we go global and American-format phone numbers are no longer adequate. I think I'd say, So what? If we haven't planned for going global, then it's likely that we'll have to add or change many columns in the database related to addresses and telephones and currency and probably lots of other things -- like what data do you need to calculate a VAT as opposed to a sales tax and what new data do we need because of government regulations in various countries and who knows what other cultural things. (I once worked for a software house that made a medical office management system. When we sold a copy to a clinic on an Indian reservation, we had to add a field to the customer record for "blood", i.e. how many eighths Indian this person was. 8=full-blooded Indian, 4=half Indian, etc.) To say that amidst all this work we also have to expand our primary key from varchar(10) to varchar(15) would be a drop in the bucket.

  • Mike M (unregistered) in reply to Fred

    Fred:

    Ok, why it the design poor? It has a reasonable name, and actually uses parameters instead of global variables. Ok, the FreeFile thing is a red herring, but other than that it does what it has to do. Blame Bill Gates for the error handling weird syntax.

    Wellllll.... assume the target file is an Excel file that some other user or perhaps, the application itself, has open for reading. The open statement in the function will fail due to a file lock, not because the file doesn't exist.

    Why assume that f is an Excel file? Because it often was.

    But that's just kid's stuff. How about taking a stab at defending the reports section? Each Crystal Report was pointed at a purpose-built SQL table that was emptied row by row and repopulated row by row and field by field each time the report was called. The original developer had apparently never heard of stored procedures.

  • (cs) in reply to Mike M
    Mike M:
    Fred:
    Ok, why it the design poor? It has a reasonable name, and actually uses parameters instead of global variables. Ok, the FreeFile thing is a red herring, but other than that it does what it has to do. Blame Bill Gates for the error handling weird syntax.

    Wellllll.... assume the target file is an Excel file that some other user or perhaps, the application itself, has open for reading. The open statement in the function will fail due to a file lock, not because the file doesn't exist.

    Why assume that f is an Excel file? Because it often was.

    But that's just kid's stuff. How about taking a stab at defending the reports section? Each Crystal Report was pointed at a purpose-built SQL table that was emptied row by row and repopulated row by row and field by field each time the report was called. The original developer had apparently never heard of stored procedures.

    How would a stored procedure help here? If you forced this idiot to use a stored procedure, he would simply write a stored procedure that clears and fills his "temp tables" using the same row by row approach he currently uses. Stored procedures are not a magic panacea for people who don't know how to use databases.

  • Mike M (unregistered) in reply to Jaime
    How would a stored procedure help here? If you forced this idiot to use a stored procedure, he would simply write a stored procedure that clears and fills his "temp tables" using the same row by row approach he currently uses. Stored procedures are not a magic panacea for people who don't know how to use databases.

    Yeah, good point. I meant using a stored procedure as the source data for the report, but as you have made plain, I made far too many assumptions about the original developer's capabilities. A stupid mistake, especially since I've actually met him.

  • runfaraway (unregistered)

    "sdate As String"

    I want to be sedated.

  • Ack! (unregistered)

    It's the 2 girls-1 cup of code.... You can't unsee it!

  • Andy (unregistered) in reply to anon

    What if the file exists, but is of length zero?

  • Quirkafleeg (unregistered)

    To do what FileExists is trying to do, I'd use access(); but I don't write VB code.

Leave a comment on “The Homegrown CMS”

Log In or post as a guest

Replying to comment #:

« Return to Article