So it turns out that, while digging into the XML::Atom source, I noticed a note about Unicode in the POD that I had previously missed. Adding
$XML::Atom::ForceUnicode = 1;
to Wirebird::Remote::XMLFeed eliminated the need to decode/flag the incoming data. Not sure how I overlooked that.
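For context, here is a minimal sketch of what that setting changes, assuming XML::Atom is installed with its usual XML::LibXML backend. The sample feed, its IDs, and its dates are made up for illustration; only `$XML::Atom::ForceUnicode`, `XML::Atom::Feed->new`, `entries`, and `title` come from the library itself.

```perl
use strict;
use warnings;
use XML::Atom::Feed;

# Ask XML::Atom to hand back character strings rather than UTF-8 octets,
# so callers do not have to decode each value themselves.
$XML::Atom::ForceUnicode = 1;

# Hypothetical sample feed. "&#233;" is an e-acute, written as a character
# reference so this source file stays plain ASCII.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <id>urn:example:feed</id>
  <updated>2012-05-01T00:00:00Z</updated>
  <entry>
    <title>Caf&#233; notes</title>
    <id>urn:example:entry-1</id>
    <updated>2012-05-01T00:00:00Z</updated>
  </entry>
</feed>
XML

# A scalar reference is parsed as an in-memory XML string.
my $feed    = XML::Atom::Feed->new(\$xml);
my ($entry) = $feed->entries;
my $title   = $entry->title;

# Encode on the way out; the title itself is already a character string.
binmode STDOUT, ':encoding(UTF-8)';
printf "%s (%d characters)\n", $title, length $title;
```

Without the flag, the same accessor returns raw UTF-8 bytes, so `length` counts bytes and the value would need an explicit decode before use.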
And then I noticed that XML::FeedPP had received some substantial bugfixes, so I decided to install it and play with it. Alas, it’s built on top of XML::TreePP, which seems considerably less robust than LibXML.
Both of them drop everything but the most basic fields, in the name of providing a standardized interface. This is usually fine since, honestly, almost no one fills in the more esoteric Atom fields. But if the data is there, I’d like to grab it, which means I’m increasingly leaning toward dropping the off-the-shelf library and just fleshing out Wirebird::Remote::XML.
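A rough sketch of grabbing those extra fields directly, assuming XML::LibXML as the parser. This is not the actual Wirebird::Remote::XML code; the sample feed and the choice of `rights`, `contributor`, and `category` are purely illustrative, not a list of what either library drops.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical feed carrying a few of the less common Atom elements.
my $xml = <<'XML';
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First post</title>
    <rights>CC BY 4.0</rights>
    <contributor><name>A. Contributor</name></contributor>
    <category term="perl" label="Perl"/>
  </entry>
</feed>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Atom elements live in a namespace, so XPath needs a registered prefix.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(atom => 'http://www.w3.org/2005/Atom');

my @entries;
for my $node ($xpc->findnodes('/atom:feed/atom:entry')) {
    push @entries, {
        title        => $xpc->findvalue('atom:title',  $node),
        rights       => $xpc->findvalue('atom:rights', $node),
        contributors => [ map { $_->textContent }
                          $xpc->findnodes('atom:contributor/atom:name', $node) ],
        categories   => [ map { $_->getAttribute('term') }
                          $xpc->findnodes('atom:category', $node) ],
    };
}

printf "%s: rights=%s, contributors=%s, categories=%s\n",
    $_->{title}, $_->{rights},
    join(', ', @{ $_->{contributors} }),
    join(', ', @{ $_->{categories} })
    for @entries;
```

Working at the XPath level means nothing is lost to a standardized interface, at the cost of handling each feed format's quirks by hand.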
This probably means it’s time to learn GRDDL, a tortured acronym for a method of converting XML to RDF/XML by means of a ruleset. Which really means this should be a “what I’m reading” entry rather than a “coding out loud” entry, since I’m scrapping code instead of committing it. Oops.
On a more meta note: it’s graduation week, so probably another light schedule for Wirebird work.