• my name (unregistered)

    Greenfield is a myth, and the big-bang replacement usually never happens.

  • (nodebb)

    My election day prediction is a lengthy argument about the most efficient way to count the number of lines in a CSV file.

  • Greg (unregistered)

    Bonus point for the potential off-by-one error: you can have n rows with only n-1 newlines.

  • (nodebb) in reply to Greg

    Or, worse, n rows and n newlines, so the split creates a blank row at the end, and you finish with n+1 rows to process, one of which is guaranteed to be empty.
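To make the two off-by-one cases above concrete, here is a minimal Python sketch (not the article's code). The function names and sample strings are made up for illustration.

```python
def count_rows_by_split(text: str) -> int:
    # Splitting on the separator leaves a trailing empty string
    # when the text ends with a newline, inflating the count by one.
    return len(text.split("\n"))


def count_rows_by_splitlines(text: str) -> int:
    # str.splitlines() does not emit an empty final element for a
    # trailing newline. Note it also splits on other line boundaries
    # (for example "\r" and "\u2028"), which may or may not be wanted.
    return len(text.splitlines())


with_trailing = "a,1\nb,2\nc,3\n"      # 3 rows, 3 newlines
without_trailing = "a,1\nb,2\nc,3"     # 3 rows, 2 newlines

split_with = count_rows_by_split(with_trailing)
split_without = count_rows_by_split(without_trailing)
lines_with = count_rows_by_splitlines(with_trailing)
lines_without = count_rows_by_splitlines(without_trailing)
```

The split-based count only agrees with the real row count when the file has no trailing newline, so whichever convention the file uses has to be known up front or handled explicitly.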

  • Jaloopa (unregistered)

    What are the odds the row count is later passed in to another method that reads the entire file again and loops through the lines up to countRows to process them?

  • TS (unregistered)

    Just because it's a CSV file doesn't mean it isn't going to run into the 65,536-row limit. And just because this code dates from 2007 doesn't mean the same thing can't bite you badly a decade or more later. https://www.bbc.co.uk/news/technology-54423988

  • (nodebb)

    And of course if any of the CSV data consists of quoted string fields containing embedded newlines, then all newline-based counts are meaningless. Despite the existence of a standard (RFC 4180) the reality is CSV is a very loose format with lots of variations in what gets generated or accepted. Maybe these devs know embedded unescaped newlines are impossible. Far more likely they just never considered the possibility.
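A small sketch of the embedded-newline problem, using Python's standard `csv` module rather than anything from the article. The sample data is invented, and the parser is only assumed to be a reasonable stand-in for whatever dialect a real file uses.

```python
import csv
import io

# One header row and two data rows; the first data row has a quoted
# field containing an embedded newline.
data = 'id,comment\r\n1,"line one\nline two"\r\n2,plain\r\n'

# Counting newline characters sees four line endings.
newline_count = data.count("\n")

# A CSV-aware parser sees three records (header included).
# newline="" keeps the line endings untranslated for the csv module.
records = list(csv.reader(io.StringIO(data, newline="")))
record_count = len(records)
embedded_field = records[1][1]
```

Here the newline count says four while the parser says three records, and the gap grows with every multi-line field in the file.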

  • (nodebb)

    I am not really religious, but Jesus Christ.

    Addendum 2024-11-05 08:08: Even though this code made me nearly puke, it has to be pointed out that they at least correctly disposed the stream. So there's that.

  • (nodebb) in reply to MaxiTB

    @MaxiTB :

    it has to be pointed out that they at least correctly disposed the stream.

    Ah, so you're the sort of person who'd be up there on that cross at the end of The Life of Brian, singing "Always Look on the Bright Side of Life" with all the rest???

  • (nodebb)

    There's also our old friend, the check-if-exists-then-open antipattern.
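For anyone who hasn't met that antipattern: the file can vanish between the existence check and the open, so the check buys nothing. A hedged Python sketch of the usual alternative, which is to just open the file and handle the failure (`read_text_or_none` is a hypothetical helper name):

```python
from typing import Optional


def read_text_or_none(path: str) -> Optional[str]:
    # Attempt the open directly instead of checking for existence first;
    # a separate check can be stale by the time open() runs.
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        # The caller decides what a missing file means.
        return None


missing = read_text_or_none("/nonexistent/definitely_missing_file.csv")
```

Other failures (permissions, the path being a directory) still raise, which is usually preferable to silently treating them as "file not found".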

  • OldCoder (unregistered)

    I think I'd be as much concerned about the fact that the first line is a case statement!! Particularly as the value of that constant might just be 65536. The mind boggles what the other case values might be - and WTF the switch statement is doing.

  • Steve (not that one) (unregistered)

    "But we're not even reading an Excel file, we're reading a CSV." -- which was probably dumped from an Excel file -- and then read back in after the counting.

  • (nodebb) in reply to Steve_The_Cynic

    Nope, again, not religious. But it's funny to me that someone reads a complete file into memory, then creates countless additional substrings just to count how many Environment.NewLine are in a file, while making sure they are a good citizen and don't waste OS resources like file handles. Then again, on second thought, just remember the last laugh is on you :-)
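For contrast, a sketch of counting lines without loading the whole file or building a list of substrings. This is Python, not the C# from the article, and `count_lines` is an illustrative name; it accepts any iterable of lines, such as an open file object.

```python
import io
from typing import Iterable


def count_lines(lines: Iterable[str]) -> int:
    # Consume one line at a time; nothing is kept after it is counted,
    # so memory use does not grow with the size of the input.
    return sum(1 for _ in lines)


# io.StringIO stands in for an open file here.
with_trailing = count_lines(io.StringIO("a,1\nb,2\nc,3\n"))
without_trailing = count_lines(io.StringIO("a,1\nb,2\nc,3"))
```

With a real file this would be `with open(path, newline="") as f: count_lines(f)`, which also closes the handle. It still counts physical lines, so quoted fields with embedded newlines will throw it off just the same.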

  • Anon (unregistered)

    Excel 2007 raised the row limit to 1,000,000 rows.

    Anyone who has pressed Ctrl+Down Arrow one too many times, or understands how computers store numbers, would know that the row limit increased to 1,048,576.
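The arithmetic behind those limits, as a quick Python check. Both row limits are exact powers of two; the variable names are just for illustration.

```python
# Row limit of the legacy .xls format: 2 to the 16th power.
legacy_row_limit = 2 ** 16

# Row limit from Excel 2007 onward (.xlsx): 2 to the 20th power.
modern_row_limit = 2 ** 20

# How far the "1,000,000 rows" approximation undershoots.
shortfall = modern_row_limit - 1_000_000
```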
