• Rob (unregistered)

    Most of these temporary strings are unnecessary, because stripos has an optional offset and substr takes an offset and a length. Simply get the index of the start tag ($position), then the index of the end tag using stripos($string, $end_tag, $position + strlen($start_tag)), and finally substr($string, $position + strlen($start_tag), $second_position - $position - strlen($start_tag)). Add the trim and caching for strlen($start_tag) and that's only 5 calls: 2x stripos, 1x strlen, 1x substr and 1x trim. That's 2 fewer substr calls.
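
    Roughly, as an untested sketch (assuming the article's fetch_data takes the raw response plus literal start and end tags; the helper name here is made up):

    <?php
    // Offset-based version: no temporary copies of the response string.
    function fetch_data_offsets($string, $start_tag, $end_tag)
    {
        $start_len = strlen($start_tag);                         // cache strlen once
        $position = stripos($string, $start_tag);                // 1st stripos
        if ($position === false) {
            return null;                                         // start tag not found
        }
        $value_start = $position + $start_len;
        $second_position = stripos($string, $end_tag, $value_start); // 2nd stripos, with offset
        if ($second_position === false) {
            return null;                                         // end tag not found
        }
        // Grab only the value between the tags, then trim it.
        return trim(substr($string, $value_start, $second_position - $value_start));
    }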

  • (nodebb)

    Given that they had no schema doc it's a decent bet that their understanding of what XML is extended no more deeply than what we see demonstrated in fetch_data(...). To them it's simply CSV with very verbose commas.

    Addendum 2024-11-11 07:11: Heck, maybe their original interchange format really was CSV back in the 1200 baud modem days and they "modernized" it to XML in about 2020 when they added a web API. Come 2030 they'll add a REST API and finally adopt JSON too in about 2035.

  • (nodebb) in reply to WTFGuy

    Imagine their attempt at parsing JSON with substring 🤣

  • (nodebb)

    One thing I noticed: the tag matching is case-insensitive! That means in the actual XML, this code would find "authCode", "AuthCode", and even "authcode".
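
    For example, if the matching really is done with stripos (as the earlier comments suggest), all of these hit:

    <?php
    // stripos is case-insensitive, so every one of these returns 0 rather than false.
    $xml = '<AuthCode>ABC123</AuthCode>';
    var_dump(stripos($xml, '<authcode>'));   // int(0)
    var_dump(stripos($xml, '<AUTHCODE>'));   // int(0)
    var_dump(stripos($xml, '<authCode>'));   // int(0)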

  • (nodebb)

    JSON is terrible; XML was bureaucratic and bloated, but JSON isn't fit-for-purpose

    Honest question: Why is JSON worse than XML?

  • (nodebb) in reply to Gearhead

    Honest question: Why is JSON worse than XML?

    No native date data type. No concept of inter-document references. Only one numeric data type. No namespace support.

  • (nodebb) in reply to Jaime

    One person's bug is another person's feature.

  • OldCoder (unregistered)

    The other (another?) big problem with this is that it is only ever going to find the first occurrence of any chosen tag. What happens when your document has multiple entries?

  • (nodebb)

    I am going to stand up here in front of everyone and admit that I've used a similar technique on a program to scrape data from web pages.

  • (nodebb)

    Is no one going to make disparaging remarks about the WTFery of using curl to scrape the response, instead of, y'know, something less WTFy?

    Addendum 2024-11-11 13:37: To elaborate: providing the response in a less WTFy way which doesn't require it to be scraped by curl

  • gman003 (unregistered)

    I've dealt with this sort of thing, in the transaction-processing world.

    In this case, I was trying to get a new register system talking to the transaction processor we were selling. This system was pretty old - the documentation talked about serial cables, though we were running this over TCP. And I could see the several layers built atop one another: there was a rudimentary SOH/STX/ETX frame, even on TCP, with its own checksum footer and length header; then there was a CSV layer where the arguments were delimited by commas in a fixed order; and then one of the last of those fields contained a bunch of very useful metadata in an XML string.
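
    From memory, building a frame looked roughly like this - the checksum and header details below are my guesses for illustration, not the vendor's actual spec:

    <?php
    // Illustrative only: a made-up framing in the same spirit, not the real format.
    function build_frame(array $fields, $xml_metadata)
    {
        $fields[] = $xml_metadata;              // the XML blob rode along as one of the last CSV fields
        $payload  = implode(',', $fields);      // the "CSV" layer: fixed-order, comma-delimited

        $body = "\x02" . $payload . "\x03";     // STX ... ETX

        // Simple LRC-style XOR checksum over the body (a guess at "its own checksum").
        $lrc = 0;
        foreach (str_split($body) as $byte) {
            $lrc ^= ord($byte);
        }

        // SOH, then a 2-byte length header, then the body, then the checksum footer.
        return "\x01" . pack('n', strlen($body)) . $body . chr($lrc);
    }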

    So I got it all working, and got our server sending responses back, and those responses matched their documentation and passed an XML validator. They even had a schema to validate against, IIRC, so I was very confident despite not having access to actual hardware to test on. Sure, I was working in PHP, but I was using the well-tested standard XML libraries.

    We get the hardware, and it falls flat. After fixing a few issues with that TCP layer that the documentation had failed to note (because of course they predated the idea of HTTP for everything), I was still stuck on getting the XML to work. It kept throwing an arcane error code. Eventually an engineer responded to my email and explained that my schema-compliant XML was in fact invalid, according to their parser, and that I needed to make it look more like the samples.

    I don't know what the code looks like on their side. But I bet it's even worse than this, because their parser only accepted my responses when I formatted them with one element per line, with whitespace, in the order that the sample responses had used.

    Needless to say, before I pushed this out, I had added a detailed comment to my "build XML by concatenating strings" code, explaining exactly why this abomination was necessary, and including the commit hash of the version that did things properly.

  • Duke of New York (unregistered)

    “XML” that merely dresses up a name/value list with angle brackets is not unheard of in transaction systems — or any other domain that established standards in the 2000s. Code processing such formats doesn’t need to handle empty or nested values because they just don’t happen.

  • Duckbrain (unregistered)

    I maintained a server whose clients parsed responses using static offsets. The parsing was done by an unmaintained 1st-party Windows app that set URLs in an embedded IE frame to do GET requests for everything, then grabbed the body, which had to be HTML but contained "XML".

    We once inadvertently broke it when someone's editor added a trailing line break to the template file, leading to this at the end of the template:

    <?php // The whitespace before this tag is needed for parsing in <redacted>
    
  • (nodebb)

    This reminds me of what I had to do to process bookmarks in the DOCX format's "XML". Maybe it's been fixed now, but way back when those were not represented in valid structured XML... Regex was the only way to go... or yeah, maybe I should have gone with string operators. It was a mess...
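
    Something in this flavour (an illustrative sketch, not the actual code; the w:bookmarkStart / w:bookmarkEnd element names are standard OOXML, the rest is just the shape of it):

    <?php
    // Rough sketch: pull bookmark names straight out of word/document.xml with a regex,
    // since the bookmarkStart/bookmarkEnd milestones don't wrap the content between them.
    $document_xml = file_get_contents('word/document.xml');

    preg_match_all(
        '/<w:bookmarkStart\b[^>]*\bw:name="([^"]*)"[^>]*\/?>/',
        $document_xml,
        $matches
    );

    $bookmark_names = $matches[1];   // e.g. ["Intro", "Signature", ...]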

  • lzsiga (unregistered)

    Seems okay for the current usage. Maybe a simplified version (a wrapper) could be implemented, which you could call like this: $trxnnumber = fetch_data_simple($response, 'trxnnumber');
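
    A minimal sketch of that wrapper, assuming the article's fetch_data($string, $start_tag, $end_tag) signature:

    <?php
    // Thin wrapper: build the start/end tags from the element name and delegate.
    function fetch_data_simple($response, $tag_name)
    {
        return fetch_data($response, "<$tag_name>", "</$tag_name>");
    }

    $trxnnumber = fetch_data_simple($response, 'trxnnumber');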

  • (nodebb) in reply to Jaime

    No native date data type.

    What is XML's native date data type?

    No concept of inter-document references.

    That's never got XML users into trouble in the past. /s

    Only one numeric data type.

    You only need one. JSON can express any number that can be written down using a finite number of decimal digits.

    No namespace support.

    The only things in JSON that are named are the keys of properties in objects. Each object is its own namespace. Why would you need namespacing for anything else? There is nothing else to name.

    The real WTF of JSON is that it is so convenient to use, people use it for things for which it was not designed. It's a data interchange format, not a config file format or a UI design format or even a programming language.

  • Loren Pechtel (unregistered)

    I'm going to say this one isn't a WTF. I think they're reading a very simple data structure and don't need to worry about all the fancy things that are legal. And yes, while it's a bit more verbose than it needs to be, doing it this way gives you stuff to examine while debugging, and I would expect the optimizer to clean it up.
