Comment On 7.5k - 10k lines / day

Rocco Caputo pointed me to a Perl job opportunity the other day that was pretty, umm, demanding. I always wondered ... how could anyone possibly write that much code in one day? Apparently, your best bet is to go the whole IHBLRIA route and rewrite things like the lc() function, as Edy's colleague demonstrates:

Perl

2004-11-04 14:01 • by Guardian Bob
Yeah, the point of Perl is to do as much on one line as possible. For example, Towers of Hanoi:

sub a{if(my$l=pop){a(@_[0,2,1],--$l);print"Move disc $l from $_[0] to $_[2]
";a(@_[1,0,2],$l);}}a 'A'..'C',pop;

Or (in this case) use Perl's lc() function.

re: 7.5k - 10k lines / day

2004-11-04 14:33 • by Jason
Something like this should work:

byte[] ret = new byte[100];
int sofar = 0;
foreach(DictionaryEntry d in h)
{
string s = (string)d.Value;
System.Text.Encoding.ASCII.GetBytes(s,0,s.Length,ret,sofar);
sofar += s.Length;
}

re: 7.5k - 10k lines / day

2004-11-09 12:45 • by mauke
@ Maciej Ceglowski:
Using /o for this would be another WTF. Perl compiles constant
regexes at compile time anyway. /o is for when you have a pattern containing
variables and you want to tell Perl that their content never changes. But in
that case it would probably be better to use qr//.

re: 7.5k - 10k lines / day

2004-11-04 14:33 • by Maciej Ceglowski
This one earns extra WTF points for the i flag (ignore case), the completely irrelevant s flag (which tells Perl to match . on newlines), and gratuitous use of the comment marker as a regular expression delimiter.

I bet you could really speed this baby up by adding the o flag (compile each regex once, instead of every time the subroutine is called)! And think of the job security when it's time to make the application support Unicode.

re: 7.5k - 10k lines / day

2004-11-04 14:34 • by Jason
Oh yeah, h is a Hashtable (of strings, of course). Missed a line there.

re: 7.5k - 10k lines / day

2004-11-04 14:49 • by Alan Bellingham
And if you don't want to trust lc(), then how about

tr/A-Z/a-z/;

re: 7.5k - 10k lines / day

2004-11-04 14:49 • by Chad
Jason, you need to resize the "ret" byte array, or that code would throw an array size exception.

the following would work, however it is not the most efficient code in the world:

Hashtable hash = new Hashtable();
hash.Add("string 1", "string 1 value");
hash.Add("string 2", "string 2 value");
hash.Add("string 3", "string 3 value");

ArrayList arr = new ArrayList();
foreach(string str in hash.Values)
{
char[] chars = str.ToCharArray();
foreach(char c in chars)
arr.Add((byte)c);
}
byte[] bArr = (byte[])arr.ToArray(typeof(byte));

re: 7.5k - 10k lines / day

2004-11-04 14:51 • by Manni
I've seen people defend some of the weirdest code on this site that the vast majority collectively agrees is a WTF. I'd love to see someone defend this.

I'm not condemning the author of this code though, because I wrote a very inefficient and comical implementation of VB's Replace() function. Then again, the author knows enough to use regular expressions (not very well), but doesn't know about lc().

re: 7.5k - 10k lines / day

2004-11-04 15:06 • by Jason
Yeah, I agree about the Array size. I wasn't going for genericity, just an example ;)

BTW, I didn't know that you could cast a char straight to a byte. Good to know.

re: 7.5k - 10k lines / day

2004-11-04 15:09 • by Mike
Here's a C# way to do it. Don't know if this is exactly what you want, but it works.

ArrayList a = new ArrayList();

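// Iterate the values: enumerating the Hashtable itself yields DictionaryEntry, not string
foreach( string s in yourHashTable.Values )
{
// AddRange appends each byte; Add would store the whole byte[] as a single element
a.AddRange( System.Text.Encoding.Default.GetBytes( s ) );
}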

byte[] hashTableStringBytes = (byte[])a.ToArray( typeof( byte ) );

re: 7.5k - 10k lines / day

2004-11-04 15:19 • by Mork van Ork
Alex, use postgres. And while you're at it, move this site to a real OS unless you can stand the humiliation to have your frontpage defaced to display your own code ;-)

re: 7.5k - 10k lines / day

2004-11-04 15:33 • by Alex Papadimoulis
Thanks Jason & Mike; that should definitely do the trick to get them in. Now how do we get it out of a byte array?

And Mork, I think you misunderstood. This site is running on Windows 2003, not some hippie freeware thrown together by a bunch of stoned slackers. ;-).

re: 7.5k - 10k lines / day

2004-11-04 16:10 • by Voytek
Or maybe this:

Hashtable ht = new Hashtable();
ht[0] = "aaa";
ht[1] = "bbb";
ht[2] = "ccc";

MemoryStream stream = new MemoryStream();
new BinaryFormatter().Serialize(stream, ht.Values);
byte[] bytes = stream.ToArray();

re: 7.5k - 10k lines / day

2004-11-04 16:51 • by RichB
Jason wrote:
> I didn't know that you could cast a char straight to a byte. Good to know.
Compile this code, then use ildasm to look at the generated IL - you'll see why it is possible to cast a char to a byte

class X { static void Main() { char a='A'; System.Console.WriteLine(a); } }

re: 7.5k - 10k lines / day

2004-11-04 17:13 • by Chad Grant
You need some sort of delimiter, I used \r and \n

public static void Main()
{
Hashtable hash = new Hashtable();
hash.Add("string 1", "string 1 value");
hash.Add("string 2", "string 2 value");
hash.Add("string 3", "string 3 value");

byte[] bArr = HashToByteArray(hash);
hash = ByteArrayToHashtable(bArr);

foreach(object key in hash.Keys)
Console.WriteLine(String.Format("Key:{0} Value:{1}",key, hash[key]));

RL();
}

private static byte[] HashToByteArray(Hashtable hash)
{
char keyValueDelim = '\n';
char delim = '\r';

ArrayList arr = new ArrayList();
foreach(object key in hash.Keys)
{
foreach(char c in key.ToString())
arr.Add((byte)c);

arr.Add((byte)keyValueDelim);

foreach(char c in hash[key].ToString())
arr.Add((byte)c);

arr.Add((byte)delim);
}
return (byte[])arr.ToArray(typeof(byte));
}

private static Hashtable ByteArrayToHashtable(byte[] bArr)
{
char keyValueDelim = '\n';
char delim = '\r';

string[] keyVals = Encoding.Default.GetString(bArr).Split(delim);

Hashtable hash = new Hashtable();
foreach(string str in keyVals)
{
if (str == String.Empty)
continue;

string[] keyVal = str.Split(keyValueDelim);
hash.Add(keyVal[0],keyVal[1]);
}
return hash;
}

Doing it right...

2004-11-04 17:23 • by Ian Bicking
Well, if I were going to write this code, I'd be afraid of typos. So I'd probably write a program to write the program; much more efficient.

print "sub utl {\n \$i = \$_[0];\n";
for ($i=0; $i < 26 ; $i++) {
print " \$i =~ s#"
.chr(ord('A')+$i)."#".chr(ord('a')+$i)."#gsi;\n";
}
print " return \$i;\n}\n";

I'm not much of a Perl programmer, or I could probably do it in fewer lines. Or I *guess* you could do:

print "sub utl {return lc(\$_[0]);}\n";

But at that point you should really make this reusable, like:

sub genfunc {print "sub $_[0] {return $_[1](\$_[0]);}\n";}
genfunc "utl", "lc";

As you can probably tell, I'm just addicted to reusability. I could easily write 7.5-10k lines a day with only a few hundred lines of code. And after one day, I could keep reusing that code, pumping out another 7.5-10k of code each day without even writing any new code; now *that* is efficiency! Damn I'm good.

re: 7.5k - 10k lines / day

2004-11-04 17:43 • by AvonWyss
private static byte[] HashToByteArray(Hashtable hash)
{
char keyValueDelim = '\n';
char delim = '\r';

ArrayList arr = new ArrayList();
foreach(object key in hash.Keys)
{
foreach(char c in key.ToString())
arr.Add((byte)c);

arr.Add((byte)keyValueDelim);

foreach(char c in hash[key].ToString())
arr.Add((byte)c);

arr.Add((byte)delim);
}
return (byte[])arr.ToArray(typeof(byte));
}

Come on, you can't be serious. Ever heard of UNICODE and chars not in the US-ASCII set? No? You CANNOT reliably convert a char to a byte. Put in some äöüàéè etc. and it will fail.
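
A minimal sketch of that failure mode, assuming only the standard System.Text.Encoding classes (CastDemo and CastToByte are made-up names for illustration). A char is a 16-bit UTF-16 code unit, so the cast keeps just the low 8 bits, and anything above U+00FF loses its high byte:

```csharp
using System;
using System.Text;

// Illustrative only: CastDemo and CastToByte are hypothetical names.
public static class CastDemo
{
    // Narrowing cast: keeps only the low 8 bits of the UTF-16 code unit.
    public static byte CastToByte(char c)
    {
        return unchecked((byte)c);
    }

    public static void Main()
    {
        char euro = '\u20AC'; // the euro sign
        byte truncated = CastToByte(euro);
        Console.WriteLine("cast:  {0:X2}", truncated); // AC (the 0x20 is gone)

        // An explicit encoding keeps the whole character.
        byte[] utf8 = Encoding.UTF8.GetBytes(euro.ToString());
        Console.WriteLine("utf-8: {0}", BitConverter.ToString(utf8)); // E2-82-AC
        Console.WriteLine(Encoding.UTF8.GetString(utf8) == euro.ToString()); // True
    }
}
```

Characters in the U+0000 to U+00FF range survive the cast numerically, but whether they display correctly afterwards still depends on which encoding is used to decode the bytes.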

re: 7.5k - 10k lines / day

2004-11-04 17:55 • by Chad Grant
AvonWyss , It works fine, dumbass.

re: 7.5k - 10k lines / day

2004-11-04 17:57 • by Phil Scott
Personally, I'm curious how the mysterious "hashtable of strings to a byte array" function helps you avoid needing a dedicated database server.

This site can't possibly be that taxing on a server, can it?

re: 7.5k - 10k lines / day

2004-11-04 18:01 • by Chad Grant
In the end, no one knows what the heck this is to be used for, what would be in the hashtables, etc... so how could we write the *perfect* code for him?

re: 7.5k - 10k lines / day

2004-11-04 18:08 • by AvonWyss
public static byte[] HashtableContainingStringsOnlyToBytes(Hashtable input) {
using (MemoryStream stream=new MemoryStream()) {
using (BinaryWriter writer=new BinaryWriter(stream, Encoding.UTF8)) {
writer.Write(input.Count);
foreach (DictionaryEntry entry in input) {
writer.Write((string)entry.Key);
writer.Write((string)entry.Value);
}
writer.Flush();
return stream.ToArray();
}
}
}

public static Hashtable BytesToHashtableContainingStringsOnly(byte[] input) {
using (BinaryReader reader=new BinaryReader(new MemoryStream(input), Encoding.UTF8)) {
int count=reader.ReadInt32();
Hashtable result=new Hashtable(count);
for (int i=0; i<count; i++) {
result.Add(reader.ReadString(), reader.ReadString());
}
return result;
}
}

re: 7.5k - 10k lines / day

2004-11-04 18:11 • by Kelsey
Casting from char to byte

Don't do it; you're ignoring any encoding that the char may have. In particular, if you have anything other than ASCII characters, then casting from a char to a byte will fail to give you meaningful results.

I'm from the land beyond OZ in Java, but I'm sure that C# will have encoding transformers in it.

re: 7.5k - 10k lines / day

2004-11-04 18:12 • by AvonWyss
It does? With your code, for the strings:
hash.Add("string 1 äöü", "string 1 value");
hash.Add("string 2 àéè", "string 2 value");
hash.Add("string 3 ???", "string 3 value");

I get the output:
Key:string 2 …‚Š Value:string 2 value
Key:string 1 „” Value:string 1 value
Key:string 3 … ƒ Value:string 3 value

So it DOES NOT work. Not to say that any \r \n will break your code. I'd think again the next time before you call someone a dumbass.
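
A rough sketch of the delimiter problem, not the code from the posts above: it assumes the same '\n' / '\r' delimiter convention and compares it with BinaryWriter's length-prefixed strings. DelimiterDemo and its method names are invented for the example.

```csharp
using System;
using System.IO;
using System.Text;

// Illustrative only: all names here are hypothetical.
public static class DelimiterDemo
{
    // Delimiter scheme: key '\n' value '\r'. Returns how many records come back out.
    public static int CountRecordsAfterDelimiterSplit(string key, string value)
    {
        string packed = key + "\n" + value + "\r";
        int records = 0;
        foreach (string record in packed.Split('\r'))
        {
            if (record.Length > 0) records++;
        }
        return records;
    }

    // Length-prefixed scheme: BinaryWriter writes the string's byte length first,
    // so the content may contain any character.
    public static string LengthPrefixedRoundTrip(string value)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            BinaryWriter writer = new BinaryWriter(stream, Encoding.UTF8);
            writer.Write(value);
            writer.Flush();
            stream.Position = 0;
            BinaryReader reader = new BinaryReader(stream, Encoding.UTF8);
            return reader.ReadString();
        }
    }

    public static void Main()
    {
        string tricky = "line one\r\nline two";
        Console.WriteLine(CountRecordsAfterDelimiterSplit("k", tricky)); // 2 (one pair went in)
        Console.WriteLine(LengthPrefixedRoundTrip(tricky) == tricky);    // True
    }
}
```

One key/value pair goes in and two records come out, because the '\r' inside the value is indistinguishable from the record separator.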

re: 7.5k - 10k lines / day

2004-11-04 18:18 • by AvonWyss
(I had some chars from the Arabic alphabet in string 3, which unfortunately are not recognized here, but at least they are gracefully handled and displayed as '?')

re: 7.5k - 10k lines / day

2004-11-04 18:28 • by Alex Papadimoulis
@Phil Scott

as you know, I had to limit the RSS feeds to 1 item. I want to bring it to 10, but my home cable connection can't handle it: http://thedailywtf.com/stats .. last week, with one item a day.

I want a dedicated server because ... well, why not. But, if I get a server, I need to buy SQL Server. I don't mind the $150/mo. But I do mind the $2500+ software costs, especially for Version 2000, as 2005 is due soon.

So in the mean time I've signed up with a shared hosting provider (1&1 hosting). Unfortunately, they're lousy bastards and restrict the hell out of .NET code. No signed assemblies (easy fix), and apparently the BinaryFormatter class is restricted. Hence why I'm asking for a workaround.

It's all good, I'm going to save the $ from the ads and put it into the "Sql Server fund". Hopefully when 2005 comes out, there will be enough saved to pay for a small portion, and I won't feel so bad forking over the $$$.

re: 7.5k - 10k lines / day

2004-11-04 18:45 • by AvonWyss
@ Alex Papadimoulis

I have a web site with http://www.discountasp.net/ and I am completely satisfied. They offer an unrestricted service, good and quick support, and you can also have SQL2000 databases for very little money.

(I'm not affiliated with them, just a happy customer!)

re: 7.5k - 10k lines / day

2004-11-04 20:45 • by Curt Sampson
Hmmm. I wonder if the URLs are fixed yet? Or are we still not allowing urls, but only URLs starting with "http:"? As well, what's the encoding of your page? On my system it comes out as Shift_JIS, but I suspect that's not what you want.

Anyway, those gripes aside, I suggest you give PostgreSQL a try. I've used both it and SQL Server extensively, and PostgreSQL compares pretty well. It's a real DBMS, not a lame attempt like MySQL.

re: 7.5k - 10k lines / day

2004-11-04 20:53 • by Joe White
My site is hosted with www.webhost4life.com, and while there's a few seconds' lag every now and then as it JIT-compiles my ASP.NET code, other than that I have no complaints. Their tech support is fairly responsive, and $10 a month gets you both ASP.NET and an SQL Server database.

And instead of buying MSSQL, couldn't you just use MSDE? I don't recall any restrictions against using it on a server, though it's been a while since I read its license agreement.

re: 7.5k - 10k lines / day

2004-11-04 20:55 • by Chad
AvonWyss, there must be something wrong with your Console. I get the same output as I do input with that code. I do understand unicode, HOWEVER this site is in English. Here is the Unicode version, not that unicode is some big mystery, but this code outputs twice the amount of bytes, since unicode uses 2 bytes, and ASCII only one. So.... you decide... ASCII or Unicode, eat a bowl of d*ck either way

private static byte[] HashToByteArray(Hashtable hash)
{
char keyValueDelim = '\n';
char delim = '\r';

StringBuilder sb = new StringBuilder();
foreach(object key in hash.Keys)
sb.AppendFormat("{0}{1}{2}{3}", key, keyValueDelim, hash[key], delim);
return Encoding.Unicode.GetBytes(sb.ToString());
}

private static Hashtable ByteArrayToHashtable(byte[] bArr)
{
char keyValueDelim = '\n';
char delim = '\r';

Hashtable hash = new Hashtable();
foreach(string str in Encoding.Unicode.GetString(bArr).Split(delim))
if (str.Trim() != String.Empty)
hash.Add(str.Split(keyValueDelim)[0],str.Split(keyValueDelim)[1]);
return hash;
}

re: 7.5k - 10k lines / day

2004-11-04 22:03 • by Matt Ryall
Alex said: "This site is running on Windows 2003, not some hippie freeware thrown together by a bunch of stoned slackers."

So you won't be using the hippie freeware code provided by the stoned slackers above for your Hashtable problem? You'd prefer to pay some company to come up with a half-baked unstable solution?

You can hardly knock collaborative open development when you're promoting it on your own site.

re: 7.5k - 10k lines / day

2004-11-05 00:03 • by wulong
I won't trust no dang hippie with my hashtable problem... they'll smoke it all!

(sorry, couldn't help myself ;)

re: 7.5k - 10k lines / day

2004-11-05 03:48 • by AvonWyss
Chad, there's nothing wrong with my console. But there's something seriously wrong with your attitude. The comments and the code you posted speak for themselves; even your Unicode solution breaks as soon as any of the strings contains \r and/or \n.

For a working and efficient solution, check out my previous comment:
http://thedailywtf.com/archive/2004/11/04/3303.aspx#3327

I don't think I have to add anything. Just maybe that the code you posted could be used as a WTF, at least that's what came to my mind when I saw your attempts to solve that problem.

re: 7.5k - 10k lines / day

2004-11-05 05:51 • by Lothar
@Chad:
>HOWEVER this site is in English. Here is the
>Unicode version, not that unicode is some big
>mystery, but this code outputs twice the amount
>of bytes, since unicode uses 2 bytes, and ASCII
>only one. So.... you decide... ASCII or Unicode,
>eat a bowl of d*ck either way

man UTF-8

You need Unicode the moment you want to use exotic things like the euro symbol (which is not #0x80 but #0x20AC).

re: 7.5k - 10k lines / day

2004-11-05 06:36 • by jasmine strong
The Euro symbol is hardly exotic; it's the biggest currency in the world!

re: 7.5k - 10k lines / day

2004-11-05 08:31 • by Sergio Pereira
Everybody got the Hashtable question wrong :) I think Alex meant that he wanted the answer in true WTF style :P

int byteCount=0;
string key0 = (string)hashtable.Keys[0];
byte key0byte0 = (byte)key0[0];
byteCount++;
byte key0byte1 = (byte)key0[1];
byteCount++;
...
string value0 = (string)hashtable.Values[0];
byte value0byte0 = (byte)value0[0];
byteCount++;
byte value0byte1 = (byte)value0[1];
byteCount++;
...
string key1 = (string)hashtable.Keys[1];
byte key1byte0 = (byte)key1[0];
byteCount++;
byte key1byte1 = (byte)key1[1];
byteCount++;
...
string value1 = (string)hashtable.Values[1];
byte value1byte0 = (byte)value1[0];
byteCount++;
byte value1byte1 = (byte)value1[1];
byteCount++;
...

byte[] bytes = new byte[byteCount];
int currentIndex = 0;
bytes[0] = key0byte0;
if(byteCount==0) return bytes;
bytes[1] = key0byte1;
if(byteCount==1) return bytes;
...
//make sure you write enough code lines
// like above to allow for maximum capacity!


The inverse process is left as a brainteaser for the readers :)

re: 7.5k - 10k lines / day

2004-11-05 10:06 • by Bart Park
The belief that unicode means 2 bytes is also a WTF. Here is a fairly good overview of unicode and character sets.

http://www.joelonsoftware.com/articles/Unicode.html

How many WTF's can you get in one WTF, anyway?
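
To put numbers on it, here is a small sketch (ByteCountDemo and its helpers are illustrative names, not anything from the thread) that counts the encoded bytes for a few characters. In .NET, Encoding.Unicode means UTF-16 little-endian:

```csharp
using System;
using System.Text;

// Illustrative only: ByteCountDemo and its helpers are hypothetical names.
public static class ByteCountDemo
{
    public static int Utf8Length(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }

    public static int Utf16Length(string s)
    {
        return Encoding.Unicode.GetByteCount(s);
    }

    public static void Main()
    {
        // 'A', a-umlaut, the euro sign, and a musical G clef outside the 16-bit range.
        string[] samples = { "A", "\u00E4", "\u20AC", "\U0001D11E" };
        foreach (string s in samples)
        {
            Console.WriteLine("U+{0:X4}: UTF-8 = {1} bytes, UTF-16 = {2} bytes",
                char.ConvertToUtf32(s, 0), Utf8Length(s), Utf16Length(s));
        }
        // U+0041: UTF-8 = 1 bytes, UTF-16 = 2 bytes
        // U+00E4: UTF-8 = 2 bytes, UTF-16 = 2 bytes
        // U+20AC: UTF-8 = 3 bytes, UTF-16 = 2 bytes
        // U+1D11E: UTF-8 = 4 bytes, UTF-16 = 4 bytes
    }
}
```

So "2 bytes per character" only holds for UTF-16 and only inside the 16-bit range; for mostly-English text, UTF-8 costs one byte per character.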

re: 7.5k - 10k lines / day

2004-11-05 11:28 • by AvonWyss
Thanks Bart, but I doubt that my friend Chad will get that...

re: 7.5k - 10k lines / day

2004-11-05 11:42 • by Phil Scott
Ah, well doesn't that suck about the BinaryFormatter. I was curious as to why the work around was needed.

Have you thought about using HTTP compression? We saw a 90% decrease in our traffic on some text-heavy pages, with little impact on the overall performance of the servers.

re: 7.5k - 10k lines / day

2004-11-05 15:52 • by Jason
@Kelsey,

>I'm from the land beyond OZ in java, but I'm
>sure that C# will have encoding transformers
>in it.

Yes, it does: System.Text.Encoding.ASCII|Unicode.

@Everyone who mentioned the non-safeness of directly casting char to byte

Thanks - that's at least as good to know as that it can be done ;)

re: 7.5k - 10k lines / day

2004-11-05 17:42 • by Daniel H.
Well, at least I am happy to see that some guys admit that the computer world doesn't end at the US borders...

re: 7.5k - 10k lines / day

2004-11-05 21:12 • by Fogelman
String.ToCharArray() does your per-item work in, and new String(char[]) brings it back out (ToString() on a char[] only returns the type name).

Hashtable.CopyTo gives you an array representation.
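
The char[] round trip can be sketched like this (CharArrayDemo is a made-up name). Note that ToString() on an array yields the type name, which is why the sketch goes back through the string constructor:

```csharp
using System;

// Illustrative only: CharArrayDemo is a hypothetical name.
public static class CharArrayDemo
{
    // string -> char[] -> string
    public static string RoundTrip(string text)
    {
        char[] chars = text.ToCharArray();
        return new string(chars);
    }

    public static void Main()
    {
        char[] chars = "abc".ToCharArray();
        Console.WriteLine(chars.ToString()); // System.Char[]
        Console.WriteLine(RoundTrip("abc")); // abc
    }
}
```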

If stream in / out etc then create a thin wrapper around a static Hashtable or StringDictionary etc, then hand code

protected MyObject(SerializationInfo info, StreamingContext context)
{
// Get Serialization Enumerator from info.GetEnumerator()
// GetNext() to start,
// then.. Key = GetNext, Value = GetNext, Add hashtable item until enum exhausted.
// Since you only put in pairs, you only get back pairs. Alignment not an issue.
}
[SecurityPermissionAttribute(SecurityAction.Demand,SerializationFormatter=true)]
public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
{
foreach hashtable item
.. info.Add(/* strongly typed version to avoid reflection*/)
}

re: 7.5k - 10k lines / day

2004-11-07 17:26 • by Peter da Silva
You don't need to use a hippie californian OS to run PostgreSQL, I'm sure it'll run under that professional coffee-drinking subsystem from the Pacific Northwest that those outstanding blokes from Redmond ship in Services for UNIX 3.5...

Re: re: 7.5k - 10k lines / day

2011-05-03 11:56 • by tharpa
Alex Papadimoulis:
Thanks Jason & Mike; that should definitely do the trick to get them in. Now how do we get it out of a byte array?

And Mork, I think you misunderstood. This site is running on Windows 2003, not some hippie freeware thrown together by a bunch of stoned slackers. ;-).


This post and forum have been brought to you by Microsoft.