If you look at the screenshots from last week, you’ll notice a number of missing bits of data. The “member since” dates (in fact, all of the creation/modification dates) are missing, as are the authors of blog posts. The dates are fairly easy to put back in, but the authors are a little more complicated.

Facebook maintains “shadow profiles” on people - a compilation of things it has figured out indirectly about you. Wirebird needs to build shadow profiles too. Okay, technically they’re “address books” or “contact lists” but I’m going to keep calling them shadow profiles just as a reminder that they can easily be abused.

Let us assume that Alyssa P. Hacker follows me on Mastodon. Masto offers much of the usual profile information:

    url https://mastodon.social/@gamehawk
    a as:Person
    as:preferredUsername gamehawk
    as:name Karen 🥨👵🏻🌲🏖️
    as:summary Wandering Jayhawker[...]
    as:icon/as:image [as:Image pointing to https://files.mastodon.social/accounts/avatars/000/007/209/original/media.png]

None of this is really uniquely identifiable, or at least non-forgeable, except the url. When I was on Facebook, I regularly got friend requests from accounts that had cloned my mother’s: they used her profile picture and information, and sent friend requests to everyone on her friends list. Some of her friends would automatically accept (thereby revealing all their “show only to friends” info), others would helpfully tell her to “change your password” (when no unauthorized access had been made to her account, just public info). But we’ll say that Alyssa has followed this account long enough that, even if the owner’s real name isn’t Karen-Bunch-of-Emoji, she is interested in other things they write.

Alyssa also subscribes to the Atom feed for this blog. Can Wirebird authoritatively link them up? The feed offers very little in the way of profile information.

    name: Karen
    email: silver@phoenyx.net

Exactly zero bits of that (except the low-order ASCII part of my name) match up to anything in my Mastodon account, so Wirebird won’t notice a link. But if it processes the HTML home of that feed, it can find a link to the project’s Gitlab page. It has to be careful about these, since there’s no semantic markup: is that a blogroll, or a list of the author’s other homes around the web, or a link to the software that created the website? (There is, as I write this, a link to Plerd at the bottom of the page but I am not jmac.)
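To make the "no semantic markup" problem concrete, here is a minimal Python sketch of how a crawler might sort the links on a page. It assumes the only usable hints are the standard `rel="me"` and `rel="author"` values; the `LinkCollector` and `classify_links` names are made up for illustration and say nothing about how Wirebird actually does it. Anything without such a hint lands in an "unmarked" bucket that still needs guesswork.

```python
from html.parser import HTMLParser

# rel values treated as "the author claims this link" (an assumption of this sketch)
IDENTITY_RELS = {"me", "author"}


class LinkCollector(HTMLParser):
    """Collect (href, rel-token-set) pairs for every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href:
            # rel may be absent or valueless, hence the `or ""`
            rel = set((attr.get("rel") or "").split())
            self.links.append((href, rel))


def classify_links(html):
    """Split links into those carrying an identity hint and those without one."""
    parser = LinkCollector()
    parser.feed(html)
    claimed = [href for href, rel in parser.links if rel & IDENTITY_RELS]
    unmarked = [href for href, rel in parser.links if not rel & IDENTITY_RELS]
    return claimed, unmarked
```

On a page like this blog's, everything would presumably come back in `unmarked`, which is exactly why the blogroll, the project link, and the "powered by" link all look alike to a machine.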

Wirebird might have some specific knowledge about how to deal with Gitlab, so even though Gitlab doesn’t have any semantic markup to speak of (at least from a quick human browse), Wirebird can tell that one of the members is https://gitlab.com/gamehawk. Now we have the tenuous linkage of a shared username between a Mastodon account and a Gitlab account, and a blog that points to something that Gitlab account worked on.

The Gitlab account doesn’t share an avatar with the Mastodon account, and the blog doesn’t have one at all. But the Gitlab account’s display name is “Karen Cravens” so there’s that shared first name again - another tenuous link.

Wirebird will almost certainly have specific knowledge about how to deal with email addresses, though, so it will look at http://phoenyx.net/silver, ~silver, and @silver, and do a webfinger request at phoenyx.net. These all turn up nothing (even though the long-neglected phoenyx.net is mine too). Looking at the Atom and RSS feeds for phoenyx.net itself, that email address turns up only in the body of an entry, not in any of the author fields - Hugo doesn’t seem to do that, or maybe I have it configured wrong. So even though that’s been our domain for over two decades(!) Wirebird can’t positively identify me with it.
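As a rough illustration of that email step, the Python sketch below turns an address into the list of places worth probing. The function name and dictionary keys are hypothetical; the WebFinger URL follows the `/.well-known/webfinger?resource=acct:...` form from RFC 7033. Nothing is actually fetched here.

```python
from urllib.parse import quote


def candidate_lookups(email):
    """Return URLs worth probing for an email address (no requests are made)."""
    user, _, host = email.partition("@")
    if not user or not host:
        raise ValueError(f"not an email address: {email!r}")
    resource = quote(f"acct:{user}@{host}", safe="")
    return {
        "path": f"http://{host}/{user}",     # http://phoenyx.net/silver
        "tilde": f"http://{host}/~{user}",   # old-school ~user home directory
        "at": f"http://{host}/@{user}",      # fediverse-style profile path
        # WebFinger is served over HTTPS at a well-known path (RFC 7033)
        "webfinger": f"https://{host}/.well-known/webfinger?resource={resource}",
    }
```

A 404 on every one of these, as in my case, tells Wirebird nothing either way: it neither confirms nor rules out that the address and the domain belong to the same person.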

Of course, an Atom feed can put anything in for the email address, so even if we matched that up we can’t trust it yet. So for some more authoritative info, Wirebird might have knowledge about how to check Keybase. Is there a gamehawk user there? Why, yes there is. Alas, Keybase still doesn’t have fediverse verification, but now Wirebird knows authoritatively that a person who calls themselves “Karen C” on Keybase has verified their ownership of phoenyx.net and a Github account named tyrosinase. Oh, and the Keybase avatar is the same (a diagram of a tyrosinase structure) as the Gitlab avatar. So now we know:

[Figure: Relationship web]

There are a lot of tenuous links, and not many fully authenticated ones. And some of the fully authenticated ones don’t assert identity - knowing that gamehawk is a Keybase member is not saying that gamehawk is Keybase, and knowing that silver is a phoenyx.net email is not saying anything other than that phoenyx.net handles mail for silver. (And technically, until we send email to that account we don’t even know that for sure.)

There are also some false links. In the same sidebar list, wirebird.com links to the Fielding Dissertation, so Wirebird will have to explore whether https://www.ics.uci.edu/~fielding/ is another homepage for the author of the blog at wirebird.com. (Spoiler: it is not.) Same for the CPAN libraries (though there is in fact a GAMEHAWK contributor, and it is me, although “contributor” is loosely used here).

It’s likely that Wirebird will have some other trusted sites besides Keybase, and those will probably be another plugin system. So if Alyssa is a Perl person, they might have installed the CPAN plugin, and so Wirebird might check CPAN as well as Keybase. CPAN is not a double-opt-in (well, the email address might be; it’s been years) so its arrows wouldn’t be as bold or bidirectional as Keybase’s, but the fact that it’s GAMEHAWK, Karen Cravens, silver@phoenyx.net, and wirebird.com starts to add up.

Deciding when to consider it authoritative is tough, though. It’s all circumstantial, and even if we start to look at the posts on Masto, seeing mentions of, say, the CPAN account isn’t necessarily confirmation, from a machine’s perspective. For all it knows, I might be complaining about the counterfeit CPAN account that claims it’s me. For the sake of the “blue checkmark” level, it needs to be something that both identities assert unambiguously. Nothing we’ve discovered so far is that, yet.
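One way to picture the "both identities assert it" rule is as a directed graph where each arrow records whether its source provably makes the claim. The Python sketch below is only that, a picture: the `Link` class, the node labels, and the way I've encoded the findings so far (Keybase and phoenyx.net as a verified two-way pair, the shared username as an unverified hint) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: str       # e.g. "proof", "same-username", "same-first-name"
    verified: bool  # True only when the source provably asserts the target


def mutually_asserted(links, a, b):
    """Blue-checkmark test: a verified arrow must exist in both directions."""
    forward = any(l.verified and l.source == a and l.target == b for l in links)
    backward = any(l.verified and l.source == b and l.target == a for l in links)
    return forward and backward


# Illustrative encoding of some of the links discussed above
links = [
    Link("keybase:gamehawk", "phoenyx.net", "proof", True),
    Link("phoenyx.net", "keybase:gamehawk", "proof", True),
    Link("mastodon:gamehawk", "gitlab:gamehawk", "same-username", False),
    Link("gitlab:gamehawk", "mastodon:gamehawk", "same-username", False),
]
```

With this data, `mutually_asserted(links, "keybase:gamehawk", "phoenyx.net")` holds, while the Mastodon and Gitlab accounts stay unlinked no matter how many tenuous arrows pile up between them.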

User intervention

Overall, we have the beginnings of a robust web of connections for this blog, but not much for the Mastodon account.

At this point, Alyssa’s Wirebird might ask her if these pages are all the same person, but I think it’s important that it internally regard this as “Alyssa wants me to group these” and not “I now have authoritative evidence that these all belong to the same entity.” There are two reasons for this:

  • Alyssa might have just glanced at the avatars and usernames and gotten phished, and
  • All the information we’ve collected up to this point is publicly available, whereas Alyssa might reveal secret info by explicitly linking a friend’s home and work profiles, for instance.
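The distinction between those two internal states could be kept by recording why each profile is in a group. This is a hedged Python sketch with invented names (`Basis`, `ContactGroup`, `disclosable`), not a description of Wirebird's data model: user-grouped members are remembered but never treated as safe to reveal, which covers both the phishing case and the secret-info case.

```python
from dataclasses import dataclass, field
from enum import Enum


class Basis(Enum):
    USER_GROUPED = "user-grouped"  # "Alyssa wants me to group these"
    VERIFIED = "verified"          # both identities assert the link


@dataclass
class ContactGroup:
    label: str
    members: dict = field(default_factory=dict)  # profile URL -> Basis

    def add(self, url, basis=Basis.USER_GROUPED):
        # Never downgrade a member that has already been verified
        if self.members.get(url) is not Basis.VERIFIED:
            self.members[url] = basis

    def disclosable(self):
        """URLs that may be shown to anyone else: verified members only."""
        return sorted(u for u, b in self.members.items() if b is Basis.VERIFIED)
```

Alyssa still sees everything she grouped; the point of `disclosable` is only that her say-so alone never leaks out as if it were evidence.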

There are some things that her Wirebird can do to ask me to authoritatively confirm the linkages, the same way Keybase confirms things without having an explicit API (for example, https://gist.github.com/tyrosinase/7e1f4c2e6b47b6081a968be1906db251). Of course, if I were willing to do that, I’d probably also be willing to add some links to better identify myself. Mastodon doesn’t have a defined place for a URL in the profile, but it’s probably safe to assume that if someone has a link in their text profile, and the linked page points back to that profile using author fields or a FOAF file or webfinger, the two identities are linked.
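That two-way rule is simple enough to sketch in Python. The version below assumes the profile's free text and the page's author-type links have already been fetched by something else; the function names and the deliberately crude URL normalization are my own simplifications, not anything Wirebird is known to do.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[^\s<>]+")


def normalize(url):
    """Lowercase scheme and host, drop trailing slash, ignore query and fragment."""
    parts = urlsplit(url.strip())
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path.rstrip('/')}"


def mutually_linked(profile_url, profile_text, page_url, page_author_links):
    """True only if the profile text links to the page AND the page links back."""
    # Strip punctuation that often trails a URL in running text
    text_links = {normalize(u.rstrip(".,;")) for u in URL_RE.findall(profile_text)}
    back_links = {normalize(u) for u in page_author_links}
    return normalize(page_url) in text_links and normalize(profile_url) in back_links
```

For a made-up profile at `https://social.example/@alyssa` whose bio mentions `https://blog.example/`, the check passes only once the blog lists that profile among its author links; a link in one direction alone is just another tenuous arrow.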

For now, I’m just going to leave my identities the way they are, and wait for Wirebird to get smart enough to dox me by itself.
