Admin
No, that's slashdot your thinking of there. backslashdot would be the place where everyone is helpful and informative, and never condescending. It is a scary, scary place.
Admin
However, the XML spec says that you may use an empty-element tag for any empty element regardless of whether it can contain anything else, so it looks like browsers are in violation if they don't allow <script/>.
(The spec also notes that you SHOULD only use empty-element tags for tags specifically declared EMPTY so you can be compatible with old parsers.)
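A minimal sketch of that equivalence, using Python's standard `xml.etree.ElementTree`. The sample strings and the `describe` helper are made up for illustration; the point is only that an XML parser reports the same element for both spellings.

```python
import xml.etree.ElementTree as ET

def describe(xml_text):
    """Parse a one-element document and return what the parser reports."""
    elem = ET.fromstring(xml_text)
    # An empty element has no text and no children, however it was written.
    return (elem.tag, elem.attrib, elem.text, len(elem))

# Hypothetical sample input: the same empty element in both spellings.
short_form = describe('<script src="a.js"/>')
long_form = describe('<script src="a.js"></script>')

print(short_form == long_form)
```

An HTML parser is a different beast, which is where the `<script/>` trouble in browsers comes from.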
Admin
I agree with you that performance can really be an issue, but they should have kept the possibility to understand an empty tag.
Last week, the company I'm working at had a performance problem with their web service - it took 4-5 seconds to perform when not having many clients, and slowed down considerably under load. We used a lot of XML for data transfer between the tiers, so I decided that the .Net XML parser was way too slow. I built a custom parser in a night just for testing. My use of hash-tables and stack-buffers not only improved the speed, but also made it much more scalable in a multi-cpu environment. Also, when I encountered a "/>", some custom goto-table hijacked the whole process of parsing the empty tag and went into a lightning fast ASM subroutine that pops characters out of the buffer, making using <ABC/> MUCH MUCH more performant than reading <ABC></ABC>. You can't imagine how much time parsers lose on those...
Turns out, in only one night of intense coding, I've saved almost 50 ms from the 10-20 parsing I did in my tests (against .Net parser), on only one CPU (my machine isn't dual-core yet). That's 1 percent right there, in one day, and it will scale way better on a multi-core processor so maybe even more than 1% in the production environment.
Admin
I'm sure they'll be eager to hire you to hand-edit dozens (possibly hundreds) of files daily! It's so easy!
Admin
oops... I forgot my handy dandy /sarcasm tag
Admin
know!
Admin
So either restrict the DTD that you accept or don't validate the xml in prod (in the parser, anyway). I generally take the second approach, but I also tend to have a limited set of clients or servers to contend with.
Admin
How about seperate and then instead of separate and than. "My idea to seperate them was better then his" shudder
Admin
Crap, now I have to go play some starcraft; carrier rush is a beautiful thing.
Admin
All hail topcod3r, king of the trolls!
Admin
Wrong thread, you want the Root beer one
CAPTCHA: nibh National Institute of Beer Homes?
Admin
But Bill built an empire by . . .
Admin
Dudes come on
Its "u" and "ur". Get with the now!
Admin
I just found out the other day that we have data like this:
<Element id="whatever">
  <Name "some value"/>
  <OtherName "some other value"/>
</Element>
And, yes, the guy that came up with that also made sure we used our own parser, built on boost::spirit.
My head nearly exploded.
Admin
Seconded, this had me rolling when I heard it in my mind with a Boris Badenov accent!
Admin
[quote user="NM"][quote]I've seen official parsers barf, or behave in ways you wouldn't expect, on empty tags.[/quote]
I call <bs/>
Empty tags have been part of XML from the beginning. There is no reason but ignorance / stupidity to barf on them.
It's even more stupid than that braindead developer in the other room who doesn't understand that you should use 'void *' and not 'char *' when a pointer goes to various types. At least, there was a time (1979?) when that wasn't completely retarded, because 'void' wasn't a part of the C language yet. [/quote]
Sorry, but <bs> is a content required tag. You must call <bs></bs>!! :-)
Admin
Forgot to fix the first one! But, this was well done, I might even go so far as to label it a clbuttic!
Admin
There are many advantages to this stricture. It's gender-neutral, because monorchism is a physical anomaly either way. It would create a fun diversion in the office when crazed XML developers (there is no other kind) either hack off one of their testicles with a pruning fork or else superglue one on. And it would keep the fearsome troll in HR happy. ("Drop those pants! Now! It's Company Policy!")
But mostly, the advantage is that it would stop people using XML.
Admin
When you translate while speaking or writing a foreign language, you're simply not doing it right.
Admin
I worked on a project where a vendor was supplying an "XML" feed. I put XML in quotes because it wasn't actually well-formed at all, but it looked kinda like XML so that was good enough for them.
We actually had a conference call with them and our client where we agreed that the feed would be well-formed XML -- and to verify the standard we all agreed that expat would be the parser of choice.
I'll never understand how people make invalid XML, given all of the open source and built-in solutions for all modern development platforms.
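For what it's worth, a sketch of what one of those built-in solutions looks like, here Python's standard `xml.etree.ElementTree`. The `customer` wrapper element, the `build_customer` helper, and the sample values are assumptions for illustration, not anything from the vendor's feed.

```python
import xml.etree.ElementTree as ET

def build_customer(name, address):
    """Build a small document through the library instead of by pasting strings."""
    root = ET.Element("customer")  # hypothetical wrapper element
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "address").text = address
    # The serializer escapes &, < and > in text, so the output stays well-formed.
    return ET.tostring(root, encoding="unicode")

# Sample values chosen because they would break naive string pasting.
doc = build_customer("Smith & Sons", "1 <Main> St")
print(doc)

# Round trip: the document parses, and the original values come back intact.
parsed = ET.fromstring(doc)
print(parsed.findtext("name"), "|", parsed.findtext("address"))
```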
I can only imagine they're writing code like this:
$xml = "<name>$name</name>
$address";
Admin
From this site's HTTP headers:
This site uses HTML, not XHTML. This is just as well, because it's not well-formed, and, if it were XHTML, conforming browsers would have to stop processing the document at the first error.
Admin
This may be a silly question, but is there a single standard set of test case XML files somewhere, that could be used to test XML parsers for compliance to standards. For example, NIST has test files for MD5 and SHA1 hash generators. If your md5sum utility generates the correct hash for each of these test files, then it is correct.
(http://csrc.nist.gov/archive/ipsec/papers/rfc2202-testcases.txt)
Use of some sort of industry-standard XML test suite would end a lot of arguments. If your parser handles all of the test cases correctly, it is by definition, correct. If a parser can't handle an XML file that a compliant parser can handle, then the problem is in the parser, and vice versa.
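Such a suite does exist: the W3C publishes an XML Conformance Test Suite. As a rough sketch of the idea only, here is a tiny well-formedness harness in Python. The inline cases are made-up stand-ins, not files from the W3C suite, and `is_well_formed` / `run_cases` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# (document text, should it be accepted as well-formed?)
CASES = [
    ("<a/>", True),
    ("<a></a>", True),
    ("<a><b/></a>", True),
    ("<a><b></a>", False),        # mismatched end tag
    ('<a x="1" x="2"/>', False),  # duplicate attribute
    ("<a>", False),               # missing end tag
]

def is_well_formed(text):
    """Return True if the parser under test accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def run_cases(cases):
    """Return the cases where the parser disagrees with the expected verdict."""
    return [(text, expected) for text, expected in cases
            if is_well_formed(text) != expected]

failures = run_cases(CASES)
print("failures:", failures)
```

Swapping a different parser into `is_well_formed` is all it takes to point the same cases at another implementation.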
Admin
They're never going to forgive your not using "their" there.
(captcha: erat -- indeed)
Admin
When parsing in XML mode they are exactly the same. The fact that XHTML is often served and parsed as plain HTML (or the fact that IE is broken and always parses as HTML) doesn't mean it's according to the spec.
Admin
You've confused XHTML with text/html tagsoup sprinkled with slashes.
Set proper MIME type and try again.
Admin
XML on its own is useless, there has to be a schema that defines the structure. And the schema might require that instead of <tag /> you write <tag></tag>. Also that someTag must come before otherTag and no other way.
So the real WTF is they changed the schema without notice (even though it's more probable they have an inbreed something-like-XML parser)
Admin
Is this possible to require in standard XSD? I've never seen it done...
Admin
Does this mean I have to start writing
in my HTML? :(
Admin
I second that!
Admin
you missed one:
"... until it's to late."
I work with people who write like this every day. Its going to kill me at some point.
Admin
So Somebody implemented another <XML Pronounced Less than eXtensible Markup Language
Admin
That would be ill-formed;
is defined in HTML as an empty element, so no end tag is allowed. OTOH,
is valid XHTML.
Admin
XSD doesn't allow that restriction to be defined. Ever heard of DTD?
Admin
I'm pretty sure a DTD can't specify either that you must use <tag></tag>, or that you must use <tag />. How do you think you'd express that?
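To illustrate where the choice actually lives, a small sketch with Python's standard `xml.etree.ElementTree`: the spelling is picked by the serializer, not by the data. The `Carrier` element name is borrowed from the article purely as an example.

```python
import xml.etree.ElementTree as ET

carrier = ET.Element("Carrier")  # one empty element, no text, no children

# Same in-memory element, two spellings on output.
short_form = ET.tostring(carrier, encoding="unicode")
long_form = ET.tostring(carrier, encoding="unicode", short_empty_elements=False)
print(short_form)
print(long_form)

# Parsing either spelling gives back an indistinguishable element.
a = ET.fromstring(short_form)
b = ET.fromstring(long_form)
print((a.tag, a.text, len(a)) == (b.tag, b.text, len(b)))
```

So a consumer that insists on one form can be fed by flipping a serializer option, but nothing in the parsed document records which form was used.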
Admin
I'm working on a project right now with two subcontractors. Our API basically says "We exchange data with protocol <X> and the data satisfies xml schema <Y>". How many of our subcontractors do you think can actually deliver a client capable of satisfying both these requirements at, say, the third attempt? (Hint: it's not a positive integer.)
Admin
Fuck the vendor for not realising <Carrier /> is in fact invalid XML on two counts. Typical developer incompetence
And to a lower extent, fuck the client for thinking they know enough to mention their "solution" which is also invalid.
Admin
Hence it's a genuine WTF, as the tags being talked about most certainly weren't part of the XHTML definition.
Admin
Having worked in the Logistics software business for a while, let me just say that there is a good bit too little understanding of IT standards. In fact, most smaller carriers still seem to think of "this computer thing" as a nuisance that they hope will go away soon again.
The carriers are more concerned that you don't put any spray cans into your shipment than they are with ensuring their data is consistent. Security? Their drivers get security training, that should do.
I just wait for the day that somebody is going to badly misuse this. I won't give you any help here, but shipment data submission (via FTP) might be a good place to start looking... ;-)
Admin
That it doesn't add anything - I understand people doing it for the br tag in XHTML (although I don't understand why people would want to use XHTML), but in XML it really has no value whatsoever (it doesn't even add to the readability).
(All IMHO, of course.)
Admin
Yes, but while you're learning you generally translate. Even now, when I'm proofreading, I might translate back & forth just to check this kind of error. Mostly it comes naturally, but I'm now fluent in French.
So when you translate while speaking or writing a foreign language, you're just not fluent yet.
Admin
Don't get me started on people that don't know the difference between there/their/they're. And English is only my third language!
Captcha: praesent
Admin
Wow, don't you just love substandard standards! LOL
Jiff www.anonweb.eu.tc
Admin
Wait until you've worked with companies that think it's a good idea to not just make their own XML-Parser in Java, but also throw in one made in JavaScript for good measure... none of which handles all the aspects of XML of course... Oh, and both mixed freely...
I can only imagine what they'll come up with when they upgrade to Java 6 (where you can run JavaScript inside of a Java program, wee hoo!)
Well, well, one day when the period for prosecution has expired on those crimes, I might just post it to this site...
Admin
That was actually painful to read
Admin
I fear that some code over here would break in a similar manner. One of my coworkers seemed to be oblivious to the fact that Java has XML parsers, so his genious (sic) solution was worthy of the main page. Long story short, let's say that even changing the tag order will break his code. Or adding a new attribute.
Hell, I might actually submit the whole thing as a WTF!
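For contrast, a sketch of the non-brittle way, shown in Python's standard `xml.etree.ElementTree` rather than Java to keep it short. The `Shipment`, `Carrier` and `Weight` names and the `read_shipment` helper are invented for illustration. Looking children up by name means neither tag order nor extra attributes matter.

```python
import xml.etree.ElementTree as ET

def read_shipment(xml_text):
    """Pull fields out by element name, ignoring order and unknown attributes."""
    root = ET.fromstring(xml_text)
    return {
        "carrier": root.findtext("Carrier", default=""),
        "weight": root.findtext("Weight", default=""),
    }

# Hypothetical inputs: same data, different tag order, plus new attributes.
original = "<Shipment><Carrier>ACME</Carrier><Weight>12</Weight></Shipment>"
reordered = ('<Shipment id="7"><Weight unit="kg">12</Weight>'
             "<Carrier>ACME</Carrier></Shipment>")

print(read_shipment(original) == read_shipment(reordered))
```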