Rubberducking ahead! Mostly I’m recording my thought process so I don’t reinvent the wheel later.

The first decisions that need to be made here are about file structure.

The Site

The root will be a sioc:Site (and also a schema:WebSite). For single-user sites this will probably display recent blog posts or whatever in the HTML, but the linked data will be the site info.

Four (five, if you count HEAD) HTTP verbs are legal, auth permitting:

  • GET/HEAD : Self-evident, and generally permitted regardless of authentication.
  • PUT : Will update the non-list site attributes.
  • POST : Will add list-type site attributes.
  • DELETE : Will delete the resource. (Pretty rare at the site level, hopefully.)

Four attributes are lists of items:

  • space_of : Any resource. That doesn’t narrow things down much, but a sioc:Space (which a Site is a specific type of) can hold… anything.
  • usergroup_of : sioc:UserGroup. A UserGroup is just a group of members; literally the only unique attribute it has is has_member. This is also inherited from Space.
  • has_administrator : sioc:UserAccount. This is a Site-specific attribute, and conceivably a single-user site could have no UserGroup, just a direct administrator.
  • host_of : sioc:Container. A Container is anything that contains other things, so that would be a Blog or Forum. (Why a UserGroup isn’t just another Container I do not know.)

Most random resources POSTed here are going to bounce with a 501 Not Implemented. I’m sure things will come up down the line that just need to be posted to the site generically, but I can’t think of any offhand.

I am inclined to say a Wirebird will have a default UserGroup at /users/, but if not the client will have to POST one before UserAccounts can be created.

POSTing a UserAccount directly to the Site will (auth permitting) promote that UserAccount to administrator status, creating it if necessary (and adding it to the default UserGroup).

Containers being POSTed here will be sitewide forums and such. For reading feeds, we won’t need those yet.
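
A minimal sketch of how a POST to the Site might be routed, assuming the server has already worked out the RDF type of the incoming resource. The function name route_site_post, the 403 for failed auth and the 201 on success are my assumptions for illustration; only the type-to-attribute mapping and the 501 come from the notes above.

```perl
use strict;
use warnings;

# Hypothetical mapping from the RDF type of a POSTed resource to the
# Site list attribute it would be appended to (names as used above).
my %SITE_LIST_FOR = (
    'sioc:UserGroup'   => 'usergroup_of',
    'sioc:UserAccount' => 'has_administrator',
    'sioc:Container'   => 'host_of',
);

# Returns (HTTP status, attribute name or undef) for a POST to the Site.
sub route_site_post {
    my ($rdf_type, $is_authorized) = @_;
    return (403, undef) unless $is_authorized;    # assumed auth failure code
    my $attr = $SITE_LIST_FOR{$rdf_type};
    return (501, undef) unless defined $attr;     # nothing handles this type yet
    return (201, $attr);                          # assumed success code
}

my ($status, $attr) = route_site_post('sioc:Container', 1);
print "$status $attr\n";
```

Keeping the mapping in one hash means a new POSTable type is a one-line change, and everything else keeps bouncing with a 501.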

UserGroups

HTTP verbs will look like the Site ones.

UserGroups only have one list-type attribute of interest: has_member : sioc:UserAccount.

Having urls built from usernames brings up the (well, an) endless REST debate: since the client is (probably) choosing the username, should it just build the url and PUT the data directly to /~username or whatever, or should it POST to the UserGroup and let the server assign the url? I think I’m going with the former. /~username and /@username may be aliases to the same thing, we’ll see.

UserAccounts

SIOC doesn’t have users, just accounts. (Actual humans get to be foaf:Person or schema:Person.)

UserAccounts can have all sorts of attributes, but for our purposes we will only be concerned with:

  • subscriber_of : sioc:Container. For the most part, Containers POSTed here will be already-existing things, but we’ll allow one type (so far) to be created by POST: a SubscriptionList.

Because we’re also going to be ActivityPub-compatible, a UserAccount will also be a stream:Actor. This means it needs an Inbox and Outbox, both flavored stream:OrderedCollection and sioc:Container.

SubscriptionLists

Despite being a specific subtype, SubscriptionLists don’t have any different attributes than base Containers. Here again, the attribute we’re most interested in is:

  • container_of : sioc:Item.

This is where things get a little hairy. The Item is going to be a (representation of a) Weblog - does it live under the SubscriptionList? How about the Weblog’s Posts? For a single-user site, it doesn’t really matter, but for a multi-user site should there be a shared area? Would doing so be potential exposure of something (even if we restrict GET to subscribers)?

Our urls can get kind of long:

http://SITENAME/users/USERNAME/subscriptions/WEBLOG/POST

Or I could put it all in a shared space, which could reduce them to:

http://SITENAME/cache/WEBLOG/POST

I know RESTful urls aren’t supposed to matter, but I think I’ll go with the latter form. The cache directory can itself be a Container, of course.

Subscriptions

This seems like a lot of work to get all the way down to subscribing, which is as simple as… POSTing an url to the SubscriptionList. But for minimum-viable, most of the above stuff will get initialized on database creation and won’t need to be edited.

When the url gets POSTed, we hand things off to XML::Feed. If the target isn’t a feed itself, XML::Feed includes a find_feeds function. It also doesn’t care if it’s RSS or Atom; the interface is the same.

Once we have the feed parsed, it’s pretty simple to build the Item: whatever url find_feeds found goes in sioc:feed, $feed->title goes in sioc:name, $feed->link goes in sioc:link.
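
Roughly, that handoff could look like the sketch below. XML::Feed->find_feeds, XML::Feed->parse, errstr, title and link are the module's real interface; the function names subscribe and build_item, and the plain hash standing in for the stored Item, are placeholders of mine.

```perl
use strict;
use warnings;

# Build the Item for a subscription from an already-parsed feed.
# $feed only needs title() and link() methods, which XML::Feed objects have.
sub build_item {
    my ($feed_url, $feed) = @_;
    return {
        'sioc:feed' => $feed_url,
        'sioc:name' => $feed->title,
        'sioc:link' => $feed->link,
    };
}

# Resolve a POSTed url to a feed and build its Item.
# Dies with a message if no feed can be found or parsed.
sub subscribe {
    my ($url) = @_;
    require URI;
    require XML::Feed;    # loaded lazily; needs network access to do anything

    # Take the first feed url that find_feeds reports for the target.
    my ($feed_url) = XML::Feed->find_feeds($url)
        or die "No feed found at $url\n";
    my $feed = XML::Feed->parse(URI->new($feed_url))
        or die XML::Feed->errstr, "\n";
    return build_item($feed_url, $feed);
}
```

Splitting build_item out from subscribe keeps the network-dependent part small, so the mapping into sioc: properties can be exercised with any object that answers title and link.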

Polling

The polling process is pretty simple: retrieve all the SubscriptionLists, their subscriber_of and their container_of, and use XML::Feed to get the current feed for each container_of Item. Store each item in the cache, and put a link in each subscriber_of UserAccount’s Inbox.
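
As a sketch of one polling pass, with plain hashes standing in for the real storage. Everything here is hypothetical scaffolding (poll_once, the shape of $lists and $store, and the injected $fetch callback); in real use $fetch would wrap XML::Feed->parse and the feed's entries method.

```perl
use strict;
use warnings;

# One polling pass over hypothetical in-memory structures:
#   $lists - arrayref of { subscribers => [account ids], items => [ { feed => url } ] }
#   $fetch - coderef: feed url -> list of entries, each { link => ..., title => ... }
#   $store - { cache => { entry link => entry }, inbox => { account id => [links] } }
sub poll_once {
    my ($lists, $fetch, $store) = @_;
    for my $list (@$lists) {
        for my $item (@{ $list->{items} }) {
            for my $entry ($fetch->($item->{feed})) {
                my $link = $entry->{link};
                # Keep a single shared copy of each entry in the cache.
                $store->{cache}{$link} //= $entry;
                # Link it from every subscriber's Inbox, at most once each.
                for my $acct (@{ $list->{subscribers} }) {
                    my $inbox = $store->{inbox}{$acct} //= [];
                    push @$inbox, $link unless grep { $_ eq $link } @$inbox;
                }
            }
        }
    }
    return $store;
}
```

The cache is keyed by entry link, so two users subscribed to the same Weblog share one stored copy while each still gets their own Inbox link.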

This last bit gets a little sticky, since ActivityPub expects Inbox entries to be wrapped in an Activity, but luckily it will assume a Create if it’s given just a naked Object. So each feed entry will be saved as a sioc:Post and a stream:Article, and eventually when we have an ActivityPub API we can read the Inbox directly that way.

Authors

There is one thing that’s iffy about both RSS and Atom feeds, and that’s the unpredictability of what you end up with in the “author” fields. You can have authors attached to the individual entries or the overall feed, and the authors can be a plaintext string, an url, an email address, or various combinations thereof. And since the point of Wirebird is to be a social network, that’s not good. We may not care about reliably identifying the author of news articles, but if I’m reading a friend’s blog I want to be able to link that data properly.

But that’s for down the road.
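
When that day comes, a first pass might just sort whatever string the feed hands over into its recognizable parts. This is only a guess at a starting point: classify_author and its regexes are mine, deliberately loose, and nowhere near full email or url validation.

```perl
use strict;
use warnings;

# Split a raw feed author string into whatever parts can be recognized.
# Returns a hashref with any of the keys: url, email, name.
sub classify_author {
    my ($raw) = @_;
    my %author;
    return \%author unless defined $raw;

    # Pull out a url first so an "@" inside it is not mistaken for an email.
    if (my ($url) = $raw =~ m{(https?://[^\s<>()]+)}) {
        $author{url} = $url;
        $raw =~ s/\Q$url\E//;
    }
    if (my ($email) = $raw =~ /([^\s<>()]+\@[^\s<>()]+)/) {
        $author{email} = $email;
        $raw =~ s/\Q$email\E//;
    }

    # Whatever is left, minus brackets and padding, is treated as the name.
    $raw =~ s/[<>()]//g;
    $raw =~ s/^\s+|\s+$//g;
    $author{name} = $raw if length $raw;
    return \%author;
}

my $who = classify_author('Jane Doe <jane@example.com>');
print join(', ', map { "$_=$who->{$_}" } sort keys %$who), "\n";
```

Matching the result to an actual foaf:Person or UserAccount is the hard part, and this does nothing for it.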

