• iNFiNiTyLoOp (unregistered)
    #include <stdio.h>
    #include <ctype.h>
    char * RemoveSpaces(char * source){
    	char * dest, * ret = dest = source;
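    	/* Copy every non-space char, terminator included. Reading and bumping
    	   source inside one assignment would be unsequenced. */
    	do {
    		if( !isspace((unsigned char)*source) ) *dest++ = *source;
    	} while( *source++ );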
    	return ret;
    }
    int main(int argc, char ** argv){  /*For testing the above.*/
    	while( *(++argv) ) printf("%s", RemoveSpaces(*argv));
    }
    
  • panzi (unregistered)

    python:

    def removeSpaces(s):
    	return s.replace(' ','')

    C++:

    #include <string>
    
    std::string & removeSpaces( std::string & str ) {	
    	std::string::iterator to = str.begin();
    
    	for(
    		std::string::const_iterator from = str.begin();
    		from != str.end();
    		++ from ) {
    		if ( ' ' != *from )
    			*(to ++) = *from;
    	}
    
    	// Drop the leftover tail; writing '\0' would not shorten a std::string.
    	str.erase( to, str.end() );
    	return str;
    }

    Java:

    class StringThings {
    	static String removeSpaces(String s) {
    		return s.replaceAll(" ","");
    	}
    }

    It's been a long time since I wrote anything in VB, but I think this should do it (I run Linux here, so I can't verify it):

    Function RemoveSpaces(ByRef S As String) As String
    	Return S.Replace(" ","")
    End Function
  • Huan (unregistered) in reply to Guy

    Yes, however strings are immutable. Since removing spaces changes the string, new strings are created. The original string remains the same.

  • (cs)

    Sorry if this isn't original, I've only read the first two pages of comments. (But I have searched for "Unicode" in the other pages and didn't find anything.)

    Python, using the Unicode character database to see which characters are spaces:

    import unicodedata
    
    def removespaces(subject):
        """
        Returns subject, sans spaces. (A space is defined as any character
        considered a space by Unicode.)
        """
        # XXX Do I absolutely have to use a separate 'stripped' variable?
        # XXX If there's a way to get a list of all of the Unicode characters in a
        # category, I could do that instead of calling unicodedata.category() on
        # every character.
        stripped = ''
        for char in subject:
            # Calling unicode() even if subject is already a Unicode string is bad
            # but also the simplest way to code it.
            if unicodedata.category(unicode(char)) != 'Zs':
                stripped += char
        return stripped
  • panzi (unregistered) in reply to Anonymous
    Anonymous:
    Here's a Python version that is more dedicated to the subject of this site ;-)
    WTF = 'Hello    World   ab c '
    ''.join([c for c in WTF if c!=' '])

    Actually this code ("lazy" version; python 2.4):

    WTF = 'Hello    World   ab c '
    print ''.join(c for c in WTF if c != ' ')

    might perform about as well as:

    WTF = 'Hello    World   ab c '
    print WTF.replace(' ','')

    Maybe even faster, because it doesn't have to append all those empty strings when a space is found. ;) But on the other hand, replace is implemented in C and is therefore faster.

  • panzi (unregistered) in reply to Matt Nordhofff

    [quote user="Matt Nordhofff"]Sorry if this isn't original, I've only read the first two pages of comments. (But I have searched for "Unicode" in the other pages and didn't find anything.)

    Python, using the Unicode character database to see which characters are spaces:

    from unicodedata import category
    
    def removespaces(subject):
    	return u''.join(c for c in subject if category(c) not in ('Zs','Cc'))
    

    But doesn't the isspace() method do the same thing for str and unicode objects? If so, I can write a generic function that accepts both str and unicode:

    def removespaces(subject):
    	return ''.join(c for c in subject if not c.isspace())
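
    To the question of whether isspace() does the same thing as a category check: not quite, as far as I can tell. Below is a rough Python 3 sketch (an assumption on my part, since the thread uses Python 2; in Python 3 every str is Unicode). The names remove_by_category and remove_by_isspace are made up for the comparison.

```python
import unicodedata

def remove_by_category(subject):
    # Drop characters whose Unicode category is Zs (space separator) or Cc (control).
    return ''.join(c for c in subject if unicodedata.category(c) not in ('Zs', 'Cc'))

def remove_by_isspace(subject):
    # Drop characters for which str.isspace() is true.
    return ''.join(c for c in subject if not c.isspace())

# A few characters where the two tests agree or disagree.
samples = {
    'ASCII space': ' ',
    'tab': '\t',
    'no-break space': '\u00a0',
    'NUL': '\x00',
    'line separator': '\u2028',
}
for name, ch in samples.items():
    print(name, unicodedata.category(ch), ch.isspace())
```

    The two agree on ordinary spaces and tabs. They differ at the edges: NUL is category Cc but not whitespace, and U+2028 is whitespace but has category Zl, so each filter keeps something the other drops.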
    
  • panzi (unregistered)

    Erm, the last comment of mine is messed up. Sorry. The quote is broken and a bit of text I wrote is missing: Your implementation is O(n^2) because of the +=. With the join method O(n) is possible, while still using the unicode library.
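
    A small sketch of the difference, in Python 3 (my assumption; the code in this thread is Python 2). The function names are hypothetical. Note that CPython can often extend a string in place on +=, so the quadratic behaviour is a worst case and the timings may not show it on every interpreter.

```python
import timeit

def strip_concat(subject):
    # Builds the result with repeated +=; each step may copy the string so far.
    stripped = ''
    for char in subject:
        if char != ' ':
            stripped += char
    return stripped

def strip_join(subject):
    # Hands the kept characters to a single join call at the end.
    return ''.join(char for char in subject if char != ' ')

if __name__ == '__main__':
    text = 'test this   string ! ' * 1000
    assert strip_concat(text) == strip_join(text)
    for fn in (strip_concat, strip_join):
        print(fn.__name__, timeit.timeit(lambda: fn(text), number=20))
```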

  • Mark (unregistered) in reply to Sebastián

    C is a subset of C++.

    I'd first complain that in-place string operations are often a BAD IDEA (strings passed as parameters, for example).

  • AdT (unregistered) in reply to Anonymous Tart
    Anonymous Tart:
    str.erase(std::remove_if(str.begin(),
        str.end(), std::bind2nd(std::equal_to<char>(),
        remove)), str.end());

    Aaargh! You must be kidding!

    str.erase(std::remove(str.begin(),
        str.end(), remove), str.end());
    
  • AdT (unregistered) in reply to AdT

    remove_if only makes sense if you actually need a predicate function (or functor), e.g.

    std::remove_if(str.begin(), str.end(), static_cast<int(*)(int)>(std::isspace));
    
  • (cs) in reply to panzi

    (This post is in reply to panzi's comments about my silly Python Unicode character database implementation.)

    Well, the point was to be silly, not to be O(n). :P

    Your first implementation looks good, except you should use ''.join(...) instead of u''.join(...) so that the returned string won't be a Unicode string unless the string passed to it is. Your second implementation (with str.isspace()) ruins the fun.

    Interesting to know that using += makes it O(n^2), though. Python FastCGI scripts have to return all of the output at once (or use yield), so I do a lot of +=ing in them.

  • (cs)

    ABAP:

    CONDENSE <fieldname> NO-GAPS.

  • Joe P (unregistered)

    The real question to me is why does RemoveSpace remove all of the Spaces??? Should it be RemoveSpaces?

    Some Perl implementations:

    #!/usr/bin/perl
    #remSpaces.pl
    use warnings;
    use strict;
    
    print @ARGV, "\n";
    

    This works for something simple like:

    remSpaces.pl this is a line of text
    thisisalineoftext

    It will of course fail if you use 'quoted strings' in the cmd line arguments, but you should know how to use your tools...

    remSpaces.pl 'this is a line of text'
    this is a line of text

    $str=join'',split/ /,$str;
    

    The above may have been mentioned, but it is a unique method that I thought of after people used a straight replace regex. I did learn about tr/ //d, and a person mentioned that it is faster (I have yet to research this) than s/ //g.

    This has been a great read! Some really cool methods.

  • Stephen Touset (unregistered) in reply to Michael
    char* remove_spaces(char *str) {
    
      if (NULL == str)
        return str;
    
      if (0 == *str)
        return str;
    
      char *to   = str;
      char *from = str;
    
      do {
        if (' ' != *from)
          *to++ = *from;
      } while (from++);
    
      return str;
    
    }

    This avoids needing any special code to append the remaining '\0' on the end, since the do-while loop will always run an "extra" time.

  • AdT (unregistered) in reply to Stephen Touset
    Stephen Touset:
      do {
        if (' ' != *from)
          *to++ = *from;
      } while (from++);
    

    This will loop until from is the NULL pointer. Apart from that, the special case code for the empty string is completely redundant.

  • Jason (unregistered)

    To all the arrogant half-wits who criticize the idea of putting the r-value on the left, consider this:

    bool x;
    ...
    if (x = true) {
        return;
    }
    ...

    GCC does not catch that (I realize that comparing with true/false is absurd, but I've seen it in a lot of code not written by me). Also, why not put the r-value on the left? It's perfectly legal C/C++ and does, under some circumstances, save you from writing bugs. I never could adopt the habit, maybe because I've never been bitten by this problem, but why the hostility? You complain that people should know the difference between = and ==, yet you can't seem to bring yourselves to understand the concept of commutativity.

    Of course, it does fail, as you say, when comparing two l-values, but it does catch many problems.

    As for the comment about the original code being "reasonably efficient", it's not. It does string concatenation, which doesn't work so well in VB. A better way would be to use Space$ or similar to allocate a string big enough to hold the output buffer, then use Mid$(...) = Mid$(...) to assign to the string, then RTrim$ the result before returning.
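
    The preallocate-then-trim idea can be sketched outside VB too. Here is a rough Python 3 analogue, not VB and not the article's code; remove_spaces_prealloc is a made-up name, the list of characters stands in for the Space$ buffer, and rstrip stands in for RTrim$.

```python
def remove_spaces_prealloc(data):
    # The output can never be longer than the input, so allocate that much,
    # filled with spaces (the stand-in for Space$).
    buf = [' '] * len(data)
    n = 0
    for ch in data:
        if ch != ' ':
            buf[n] = ch   # the stand-in for Mid$(...) = Mid$(...)
            n += 1
    # No kept character is a space, so the trailing padding is exactly the
    # unused slots and trimming it (the RTrim$ step) is safe.
    return ''.join(buf).rstrip(' ')

print(repr(remove_spaces_prealloc(' test this   string ! ')))
```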

  • My Name (unregistered) in reply to JGW
    JGW:
    Obfuscators of the world, unite:
    void remove(char* s, char c) {
    for(char*p=s;*p=*s;p+=(*s++!=c));
    }

    That's not obfuscated. That's ... SWEET !

  • Joe P (unregistered)

    Ohh, the best Perl script: y/// is the same as tr///! y/ //d

  • Joe P (unregistered) in reply to FooBaz
    bstorer:
    Zylon:
    This all would have been much more interesting if the task was instead to remove all EXTRA spaces, leaving only single spaces, if any.
    Not in Ruby:
    str.squeeze! ' '
    
    In perl: y/ / /s;
  • (cs)

    Here's a dandy in Ruby, that I just wrote up.

    Rather than go the direct approach (like "'remove my spaces'.tr(' ','')") this will do the same thing but in the process will also,

    - Find all the combinations of characters that can be removed. 
    - Then removes them for each combination. 
    - Then takes those removed characters from each combination and finds which one removed the most spaces without removing any other character and picks that solution.
    

    Man, it's hard to write code this convoluted, but fun!

    # remove_spaces.rb
    
    def do_combos(range, pos, combos)
      if range.last - range.first == 0 then return combos end
      split = range.first + ((range.last - range.first) / 2)
      range.each { |i| combos[i][pos] = i>split }
      combos = do_combos(range.first..split.to_i, pos+1, combos)
      do_combos(split.to_i+1..range.last,  pos+1, combos)
    end
    
    class String
      def remove_spaces
        # make map of possible combinations
        combos = (0..2**length-1).map { [] }
        combos = do_combos(0..2**length-1, 0, combos)
    
        # make map of possible solutions
        solutions = combos.map do |combo|
          i = -1
          combo.inject(['','']) do |solution, do_it|
            i += 1
            if do_it then
              [solution[0]+self[i..i], solution[1]]
            else
              [solution[0], solution[1]+self[i..i]]
            end
          end
        end
    
        # find best solution, the one that removed the most spaces without other characters in there
        solutions.inject(['',self]) { |best, sol| (sol[0].match(/^ +$/) and sol[0].length >= best[0].length) ? sol : best }[1]
      end
    end
    
    p ARGV.shift.remove_spaces
    

    A typical example,

    mike@dedeX ruby $ time ruby remove_spaces.rb "test this   string ! "
    "testthisstring!"
    
    real    12m3.352s
    user    11m27.059s
    sys     0m35.102s
    

    But be sure to have a good machine to run this on if you are removing spaces from a string with more than like 23 some odd characters. It quickly gets out of hand and just gets killed on my iron.

    mike@dedeX ruby $ time ruby remove_spaces.rb "test         this     string ! "
    Killed
    
    real    3m31.015s
    user    1m20.429s
    sys     0m19.149s
    
  • pst (unregistered) in reply to cfreeman
    cfreeman:
    Perl:

    $str =~ s/ //g;

    Some things Perl was just meant to do.

    I'm kind of fond of `y' ... maybe it makes me feel esoteric, but since this is WTF,

    $no_whitespace_str = join "", grep /[^ ]/, split // for $str;

    ... but, as with all these examples, WHY?

  • Fernando (unregistered)

    a Haskell version without lib calls (actually works!):

     removeSpaces = filter (/=' ')

    and for removing extra spaces:

     removeExtraSpaces = unwords . words
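
    One thing worth knowing about the words/unwords trick is that it does more than squeeze runs of spaces: it splits on any whitespace and drops leading and trailing whitespace. A rough Python 3 comparison (hypothetical function names, not a translation anyone in the thread posted):

```python
import re

def remove_extra_spaces(s):
    # Split on runs of whitespace, then rejoin with single spaces.
    # Like unwords . words, this also drops leading and trailing whitespace.
    return ' '.join(s.split())

def squeeze_spaces(s):
    # Collapse each run of spaces to one space, keeping any at the ends.
    return re.sub(r' +', ' ', s)

sample = '  Hello    World   ab c '
print(repr(remove_extra_spaces(sample)))
print(repr(squeeze_spaces(sample)))
```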
  • qwert (unregistered)

    "I don't even know what the hell the Try/Catch is supposed to be doing"

    It's because once a space is removed the string length changes and (assuming the FOR statement doesn't evaluate the range end each time) this will lead to an array index overflow...
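
    The same trap is easy to reproduce elsewhere. A rough Python 3 analogue (not the article's VB code; the function names are made up): range() fixes the loop bound up front, so deleting while walking forward eventually indexes past the end, and it also skips the character that slides into the deleted slot.

```python
def remove_spaces_buggy(data):
    chars = list(data)
    # range() is evaluated once, so the bound stays at the original length.
    for i in range(len(chars)):
        if chars[i] == ' ':   # raises IndexError once the list has shrunk
            del chars[i]
    return ''.join(chars)

def remove_spaces_backwards(data):
    chars = list(data)
    # Walking from the end means a deletion never shifts an index
    # that is still to be visited.
    for i in range(len(chars) - 1, -1, -1):
        if chars[i] == ' ':
            del chars[i]
    return ''.join(chars)

try:
    remove_spaces_buggy('a b c')
except IndexError as exc:
    print('forward loop failed:', exc)
print(remove_spaces_backwards('a b c'))
```

    Wrapping the forward loop in a Try/Catch hides the error rather than fixing it; iterating backwards, or building a new string, avoids it.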

  • regards (unregistered)

    your code is sucked!!!!!!!!!!!

  • David Guaraglia (unregistered)

    The greatest WTF is the first implementation (in VB.NET) accomplishes exactly nothing.

    If you check the code, it's only concatenating the spaces (not the useful data) into a string variable local to the function... He might as well "refactor" it as:

    Function RemoveSpaces(ByVal data as String) as String
    End Function
    

    And my implementation is even more efficient, it returns in constant time! LOL

    Abysmal...

  • Daniel (unregistered) in reply to Zonkers

    Easier way in Ruby to remove whitespaces in a string:

    str.gsub(/\s/, '')

Leave a comment on “Removing Spaces, the Easy Way”
