XKCD 927: standards

As the Wirebird::Remote hierarchy gradually fills in, there are some things I need to consider. As I’ve alluded to, a resource will very often have multiple sources of data, and Wirebird will need to decide what’s best because there is no “authoritative” source. The bid() system currently just parses the single best available source (best in terms of what the standard can provide, not necessarily what a particular implementation does provide), but ultimately it will need to parse all sources above a minimum quality.

Take my mastodon.social profile as an example again. Currently, Wirebird can parse it as follows, in order of preference:

  • Atom
  • RSS
  • HTML

The RSS doesn’t offer anything the Atom doesn’t. If the Atom is missing, though, the RSS lacks a lot of things the HTML has. The RSS standard is capable of carrying them, so I don’t necessarily want to bump RSS below HTML in priority. Instead, the Remote::retrieve() routines will probably need to avoid stomping on properties that already exist.
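To make the “don’t stomp” idea concrete, here is a rough sketch in Python. It is not Wirebird’s code: the bid numbers, the MIN_BID cutoff, the merge_sources() name and the property names are all made up for illustration. The idea is to visit every source whose bid clears the minimum, best first, and only fill in properties that nothing better has already set.

```python
from typing import Any

# Assumed cutoff: sources bidding below this are ignored entirely.
MIN_BID = 10


def merge_sources(
    sources: list[tuple[int, dict[str, Any]]],
    min_bid: int = MIN_BID,
) -> dict[str, Any]:
    """Merge parsed properties from every source at or above min_bid.

    Sources are visited from highest bid to lowest. A property that is
    already set is never overwritten by a lower-bid source.
    """
    merged: dict[str, Any] = {}
    for bid, props in sorted(sources, key=lambda s: s[0], reverse=True):
        if bid < min_bid:
            continue
        for key, value in props.items():
            if value is None:
                continue  # an empty value should not block a later source
            merged.setdefault(key, value)  # keep the existing value if present
    return merged


# Hypothetical data: the Atom is missing, RSS outbids HTML,
# and a very low-quality source falls under the cutoff.
rss = (70, {"title": "RSS title", "link": "https://example.org/feed"})
html = (50, {"title": "HTML title", "avatar": "a.png", "bio": "hello"})
junk = (5, {"title": "junk", "extra": "x"})

profile = merge_sources([html, rss, junk])
```

Here profile keeps the RSS title, picks up avatar and bio from the HTML, and ignores the source that bid below the cutoff. RSS stays above HTML in priority without hiding anything the HTML alone can supply.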

When processing the subscriptions, things get even more interesting. If we process the feed entries, that RSS still doesn’t have authorship information - but if we follow the link to the page for each status, we have a lot more options. That means a lot more HTTP calls, though. That’s probably unavoidable - I just have to keep track of whether a link has been thoroughly mined out, so I don’t keep fetching it every time Wirebird notices it.
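The “mined out” bookkeeping could look something like the following Python sketch. Again, this is an illustration under assumptions, not what Wirebird does: LinkTracker, process_entries() and the fetch callable are hypothetical names, and a real version would persist the set somewhere instead of holding it in memory.

```python
from typing import Any, Callable


class LinkTracker:
    """Remember which links have been fully mined so each is fetched once."""

    def __init__(self) -> None:
        self._mined: set[str] = set()

    def needs_fetch(self, url: str) -> bool:
        return url not in self._mined

    def mark_mined(self, url: str) -> None:
        self._mined.add(url)


def process_entries(
    entries: list[dict[str, Any]],
    tracker: LinkTracker,
    fetch: Callable[[str], dict[str, Any]],
) -> list[dict[str, Any]]:
    """Fill in missing authorship by following each entry's link at most once."""
    for entry in entries:
        url = entry.get("link")
        if entry.get("author") or not url or not tracker.needs_fetch(url):
            continue  # nothing missing, nowhere to look, or already mined
        page_props = fetch(url)  # hypothetical: returns properties parsed from the page
        for key, value in page_props.items():
            if entry.get(key) is None:
                entry[key] = value  # only fill gaps, never stomp
        tracker.mark_mined(url)
    return entries


# Stand-in for the real HTTP call, counting how often it is used.
calls: list[str] = []


def fake_fetch(url: str) -> dict[str, Any]:
    calls.append(url)
    return {"author": "someone", "title": "from the page"}


tracker = LinkTracker()
first = process_entries(
    [{"link": "https://example.org/1", "title": "from the feed"}], tracker, fake_fetch
)
# Wirebird notices the same link again later; no second fetch happens.
second = process_entries([{"link": "https://example.org/1"}], tracker, fake_fetch)
```

The trade-off in this sketch is that the second sighting of the link gets no author at all, because nothing is fetched. In practice the mined properties would need to be stored alongside the "mined" flag so they can be reused.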

“Minimum viable product” turns out not to be so “minimal” when you’re trying to build a solid foundation.

No commits for a couple days, sorry. I was wrong last week when I said I’d have more time this week.



