Admin
Most of these temporary strings are unnecessary, because stripos accepts an optional offset and substr takes an offset and a length. Simply get the index of the start tag ($position), then the index of the end tag using stripos($string, $end_tag, $position + strlen($start_tag)), and finally substr($string, $position + strlen($start_tag), $second_position - ($position + strlen($start_tag))). Note that the third argument to substr is a length, not an end index, hence the subtraction. Add the trim and caching for strlen($start_tag) and that's only 5 calls: 2x stripos, 1x strlen, 1x substr and 1x trim. That's 2 fewer substr calls.
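For what it's worth, here is a minimal sketch of that approach. The function name fetch_between and its signature are my own invention for illustration, not the article's fetch_data(), and returning null for a missing tag is an assumption about what the caller would want.

```php
<?php
// Hypothetical helper: extracts the text between the first $start_tag and the
// next $end_tag, matching both case-insensitively (as stripos does).
function fetch_between(string $string, string $start_tag, string $end_tag): ?string
{
    $position = stripos($string, $start_tag);
    if ($position === false) {
        return null; // start tag not present
    }

    // Cache where the value begins, so strlen() runs only once.
    $value_start = $position + strlen($start_tag);

    // Search for the end tag only after the start tag, using the offset argument.
    $second_position = stripos($string, $end_tag, $value_start);
    if ($second_position === false) {
        return null; // end tag not present
    }

    // substr() takes a length, so subtract the start of the value.
    return trim(substr($string, $value_start, $second_position - $value_start));
}

$response = '<response><authCode> ABC123 </authCode></response>';
echo fetch_between($response, '<authcode>', '</authcode>'), PHP_EOL; // prints "ABC123"
```

It still isn't an XML parser, of course. It just does the same job without the temporary strings.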
Admin
Given that they had no schema doc it's a decent bet that their understanding of what XML is extended no more deeply than what we see demonstrated in fetch_data(...). To them it's simply CSV with very verbose commas.
Addendum 2024-11-11 07:11: Heck, maybe their original interchange format really was CSV back in the 1200 baud modem days and they "modernized" it to XML in about 2020 when they added a web API. Come 2030 they'll add a REST API and finally adopt JSON too in about 2035.
Admin
Imagine their attempt at parsing JSON with substring 🤣
Admin
One thing I noticed: the tag matching is case-insensitive! XML itself is case-sensitive, so in the actual XML "authCode", "AuthCode", and even "authcode" are three different elements, yet this code would find any of them.
Admin
Honest question: Why is JSON worse than XML?
Admin
No native date data type. No concept of inter-document references. Only one numeric data type. No namespace support.
Admin
One person's bug is another person's feature.
Admin
The other (another?) big problem with this is that it is only ever going to find the first incidence of any chosen tag. What happens when your document has multiple entries?
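To illustrate, a sketch of what handling repeated tags would take if you stayed with string functions: keep calling stripos with an advancing offset. The name fetch_all_between is hypothetical and not from the article, and the sample input is made up. It still ignores nesting, attributes, and XML's case sensitivity, so a real XML parser remains the sturdier choice.

```php
<?php
// Hypothetical helper: collects the text between every $start_tag/$end_tag
// pair, scanning left to right with a moving offset.
function fetch_all_between(string $string, string $start_tag, string $end_tag): array
{
    // Guard: an empty needle would never advance the offset.
    if ($start_tag === '' || $end_tag === '') {
        return [];
    }

    $values = [];
    $offset = 0;
    $start_len = strlen($start_tag);
    $end_len = strlen($end_tag);

    while (($position = stripos($string, $start_tag, $offset)) !== false) {
        $value_start = $position + $start_len;
        $second_position = stripos($string, $end_tag, $value_start);
        if ($second_position === false) {
            break; // unterminated tag: stop rather than guess
        }
        $values[] = trim(substr($string, $value_start, $second_position - $value_start));

        // Resume the search just past the end tag we consumed.
        $offset = $second_position + $end_len;
    }

    return $values;
}

$response = '<items><item>a</item><item> b </item></items>';
echo implode(', ', fetch_all_between($response, '<item>', '</item>')), PHP_EOL; // prints "a, b"
```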
Admin
I am going to stand up here in front of everyone and admit that I've used a similar technique on a program to scrape data from web pages.
Admin
Is no one going to make disparaging remarks about the WTFery of using curl to scrape the response, instead of, y'know, something less WTFy?
Addendum 2024-11-11 13:37: To elaborate: providing the response in a less WTFy way which doesn't require it to be scraped by curl.
Admin
I've dealt with this sort of thing, in the transaction-processing world.
This time, I was trying to get a new register system talking to the transaction processor we were selling. This system was pretty old - the documentation talked about serial cables, though we were running this over TCP. And I could see the several layers built atop one another - there was a rudimentary SOH/STX/ETX frame, even on TCP, with its own checksum footer and length header, then there was a CSV layer where it was arguments, delimited by commas, in a fixed order, and then one of the last of those contained a bunch of very-useful metadata in an XML string.
So I got it all working, and got our server sending responses back, and those responses matched their documentation and passed an XML validator. They even had a schema to validate against, IIRC, so I was very confident despite not having access to actual hardware to test on. Sure, I was working in PHP, but I was using the well-tested standard XML libraries.
We get the hardware, and it falls flat. After fixing a few issues the documentation had failed to note, about that TCP layer (because of course they predated the idea of HTTP for everything), I was still stuck on getting the XML to work. It kept throwing an arcane error code. Eventually an engineer responded to my email, and explained that my schema-compliant XML was in fact invalid, according to their parser, and that I needed to make it look more like the samples.
I don't know what the code looks like, on their side. But I bet it's even worse than this, because I only got their parser to accept my responses, when I formatted them with one element per line, with whitespace, in the order that the sample responses had used.
Needless to say, before I pushed this out, I had added a detailed comment to my "build XML by concatenating strings" code, explaining exactly why this abomination was necessary, and including the commit hash of the version that did things properly.
Admin
“XML” that merely dresses up a name/value list with angle brackets is not unheard of in transaction systems — or any other domain that established standards in the 2000s. Code processing such formats doesn’t need to handle empty or nested values because they just don’t happen.
Admin
I maintained a server that had clients parse using static offsets. The parsing was done by an unmaintained 1st party Windows app that set URLs in an embedded IE frame to do GET requests for everything, then grabbed the body that has to be HTML, but contained "XML" in the body.
We once inadvertently broke it when someone's editor added a trailing line break in the template file. Leading to this at the end of the template.
Admin
This reminds me of what I had to do to process bookmarks in the DOCX format's "XML". Maybe it's been fixed now, but way back when those were not represented in valid structured XML... Regex was the only way to go... or yeah, maybe I should have gone with string operators. It was a mess...
Admin
Seems okay for the current usage. Maybe a simplified version (a wrapper) could be implemented, which you could call like this:
$trxnnumber = fetch_data_simple($response, 'trxnnumber');
Admin
"No native date data type."
What is XML's native date data type?
"No concept of inter-document references."
That's never got XML users into trouble in the past. /s
"Only one numeric data type."
You only need one. JSON can express any number that can be written down using a finite number of decimal digits.
"No namespace support."
The only things in JSON that are named are the keys of properties in objects. Each object is its own namespace. Why would you need namespacing for anything else? There is nothing else to name.
The real WTF of JSON is that it is so convenient to use, people use it for things for which it was not designed. It's a data interchange format, not a config file format or a UI design format or even a programming language.