Unfurl v2022.11: Social Media Edition
It's been a while, but a new Unfurl release is here! v2022.11 adds new features and has behind-the-scenes changes. With all the attention on Twitter lately, in this post I'm going to highlight changes related to social media websites:
- Defining Twitter's sharing (`s`) parameter values (all 71 of them!)
- Extracting timestamps from Mastodon IDs
- Decoding multiple types of LinkedIn identifiers
- Expanding Substack redirect links
- Parsing common tracking/analytics query string parameters
Get it now, or read on for more details about the new features!
Besides the headline-grabbing changes at Twitter, there have been some gradual, less obvious changes as well: the query string parameters. A few years ago (maybe 2018?) the `s` parameter appeared, and people (myself included) began speculating and trying to figure out its purpose. By experimentation, the values for `s` of 19, 20, and 21 seemed pretty clear: they meant a sharing source of Android, Twitter Web, and iOS, respectively (and Unfurl parsed them as such).
A few weeks ago, someone was poking at Twitter's JavaScript files and discovered an object with the mappings of 71 values for the sharing codes! They kindly shared this with me (thanks 2xyo!) and I added them to Unfurl.
The codes generally show the combination of device type (iOS, iPhone, Android, web browser) and method (email, WhatsApp, copy) used to share the tweet. I haven't personally seen the majority of these codes in use so I can't say they all are still valid, but then I also haven't shared a tweet from my iPad using LinkedIn (`s=71`)!
Here's my cleaned-up interpretation of what the `s` codes mean (links to the original .js files are in the GitHub issue if you're curious).
s Parameter | Shared From |
---|---|
01 | an Android using SMS |
02 | an Android using Email |
03 | an Android using Gmail |
04 | an Android using Facebook |
05 | an Android using WeChat |
06 | an Android using Line |
07 | an Android using FBMessenger |
08 | an Android using WhatsApp |
09 | an Android using Other |
10 | iOS using Messages or SMS |
11 | iOS using Email |
12 | iOS using Other |
13 | an Android using Download |
14 | iOS using Download |
15 | an Android using Hangouts |
16 | an Android using Twitter DM |
17 | Twitter Web using Email |
18 | Twitter Web using Download |
19 | an Android using Copy |
20 | Twitter Web using Copy |
21 | iOS using Copy |
22 | iOS using Snapchat |
23 | an Android using Snapchat |
24 | iOS using WhatsApp |
25 | iOS using FBMessenger |
26 | iOS using Facebook |
27 | iOS using Gmail |
28 | iOS using Telegram |
29 | iOS using Line |
30 | iOS using Viber |
31 | an Android using Slack |
32 | an Android using Kakao |
33 | an Android using Discord |
34 | an Android using Reddit |
35 | an Android using Telegram |
36 | an Android using Instagram |
37 | an Android using Daum |
38 | iOS using Instagram |
39 | iOS using LinkedIn |
40 | an Android using LinkedIn |
41 | Gryphon using Copy |
42 | an iPhone using SMS |
43 | an iPhone using Email |
44 | an iPhone using Other |
45 | an iPhone using Download |
46 | an iPhone using Copy |
47 | an iPhone using Snapchat |
48 | an iPhone using WhatsApp |
49 | an iPhone using FBMessenger |
50 | an iPhone using Facebook |
51 | an iPhone using Gmail |
52 | an iPhone using Telegram |
53 | an iPhone using Line |
54 | an iPhone using Viber |
55 | an iPhone using Instagram |
56 | an iPhone using LinkedIn |
57 | an iPad using SMS |
58 | an iPad using Email |
59 | an iPad using Other |
60 | an iPad using Download |
61 | an iPad using Copy |
62 | an iPad using Snapchat |
63 | an iPad using WhatsApp |
64 | an iPad using FBMessenger |
65 | an iPad using Facebook |
66 | an iPad using Gmail |
67 | an iPad using Telegram |
68 | an iPad using Line |
69 | an iPad using Viber |
70 | an iPad using Instagram |
71 | an iPad using LinkedIn |
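As a minimal sketch of how a lookup like this might work (this is not Unfurl's actual implementation), the snippet below pulls `s` out of a tweet URL and maps it against a small subset of the table above. The `SHARE_SOURCES` dict, the `describe_share_source` helper, and the example URL are all made up for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Small subset of the table above; a full lookup would carry all 71 codes.
SHARE_SOURCES = {
    "19": "an Android using Copy",
    "20": "Twitter Web using Copy",
    "21": "iOS using Copy",
    "71": "an iPad using LinkedIn",
}


def describe_share_source(url):
    """Return the 'Shared From' text for a tweet URL's s value, or None."""
    values = parse_qs(urlparse(url).query).get("s")
    if not values:
        return None
    # The table lists codes below 10 zero-padded; zfill accepts either form.
    return SHARE_SOURCES.get(values[0].zfill(2))


# Hypothetical tweet URL, for illustration only.
print(describe_share_source("https://twitter.com/example/status/1234567890?s=20"))
```

Codes missing from the subset (or URLs without `s`) simply return `None`, which seems a safer default than guessing.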
In addition to the `s` parameter, we've seen `t` roll out gradually. I saw `t` on links shared from Android in late 2021 (`s=19`), then from Twitter Web (`s=20`) in early 2022, and finally from iOS (`s=21`) a bit later in 2022. I don't think anyone outside of Twitter knows exactly how the `t` parameter is constructed, but from my observations it appears consistent per device for a time. I shared tweets via numerous methods in August from my phone and the `t` was consistently the same. I did similar tests again in November, and the `t` value was again the same for different sharing methods, but it was different from the August value. Maybe a software update or some other change on the device caused a change in the `t` "fingerprint"? With this in mind, I think seeing the same `t` values on multiple links suggests the same device was the sharing source. However, different `t` values could still be from the same device, just over a longer time period.
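If you have a pile of tweet links to compare, grouping them by `t` is one quick way to apply this idea. The sketch below is only an illustration; the `group_by_t` helper, the URLs, and the `t` values are invented, and a shared `t` is a hint rather than proof of a shared device.

```python
from collections import defaultdict
from urllib.parse import urlparse, parse_qs


def group_by_t(urls):
    """Group tweet URLs by their t value; URLs without t are skipped."""
    groups = defaultdict(list)
    for url in urls:
        t_values = parse_qs(urlparse(url).query).get("t")
        if t_values:
            groups[t_values[0]].append(url)
    return dict(groups)


# Hypothetical links and t values, made up for illustration.
links = [
    "https://twitter.com/example/status/1?s=20&t=AAAAexampleAAAA",
    "https://twitter.com/example/status/2?s=20&t=AAAAexampleAAAA",
    "https://twitter.com/example/status/3?s=21&t=BBBBexampleBBBB",
    "https://twitter.com/example/status/4",
]

for t_value, urls in group_by_t(links).items():
    print(t_value, len(urls))
```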
Mastodon
This isn't actually a new parser (it's been in Unfurl for a few years), but I figured it would be worth mentioning with the increased interest in Mastodon. Mastodon is similar to Twitter in some respects; one of those is that the URLs of "toots" (Mastodon's version of tweets) contain an embedded timestamp. The long ID at the end of the URL is similar to a Twitter Snowflake:
https://infosec.exchange/web/@RyanDFIR/109306117687853105
Due to the federated nature of Mastodon, an instance could be running on a domain that Unfurl doesn't know about. To avoid false positives, I only have a short allowlist of domains to parse as Mastodon instances. If you know of any others that you'd like to be parsed, let me know.
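Here is a rough sketch of pulling the timestamp out of that ID. It assumes the ID keeps milliseconds since the Unix epoch in its upper bits, with the low 16 bits used for sequence data; that layout is my understanding of Mastodon's ID scheme rather than something this post spells out, and `mastodon_id_to_datetime` is a hypothetical helper, not Unfurl's code.

```python
from datetime import datetime, timezone


def mastodon_id_to_datetime(toot_id):
    """Convert a Mastodon ID to a UTC datetime (assumed layout, see above)."""
    # Assumption: dropping the low 16 bits leaves milliseconds since the Unix epoch.
    millis = int(toot_id) >> 16
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


url = "https://infosec.exchange/web/@RyanDFIR/109306117687853105"
toot_id = url.rstrip("/").rsplit("/", 1)[-1]
print(mastodon_id_to_datetime(toot_id).isoformat())
```

For the example URL, this lands in early November 2022.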
A while ago, I did some research and discovered how to dissect a TikTok identifier and extract a timestamp. Ollie Boyd figured out that IDs in LinkedIn post URLs had a similar makeup and made a tool to extract those timestamps. I've added this ability to Unfurl.
LinkedIn Messaging IDs
It turns out these LinkedIn IDs are used in more places than posts. One place they used to appear was in Messaging threads. When viewing messages on linkedin.com, the URL for each message thread (series of messages with a user) looked like `https://www.linkedin.com/messaging/thread/6685980502161199104/`. The ID at the end has an embedded timestamp that seemed to line up with when the first message in the thread was sent.
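A minimal sketch of that extraction follows. It assumes the commonly described layout for these IDs, where the first 41 bits of the ID's binary form are milliseconds since the Unix epoch; the post itself doesn't spell out the bit layout, so treat that (and the `linkedin_id_to_datetime` name) as assumptions.

```python
from datetime import datetime, timezone


def linkedin_id_to_datetime(linkedin_id):
    """Convert a numeric LinkedIn ID to a UTC datetime (assumed layout)."""
    bits = bin(int(linkedin_id))[2:]
    # Assumption: the leading 41 bits hold milliseconds since the Unix epoch.
    millis = int(bits[:41], 2)
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


print(linkedin_id_to_datetime("6685980502161199104").isoformat())
```

For the thread ID above, this gives a date in July 2020.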
I've been referencing this in past tense because this isn't the case anymore; message threads now have URLs that look like `https://www.linkedin.com/messaging/thread/2-ZTRkNzljZjgtOTRmNC00ZGJkLWJlYTktMDFjOWU4MTgxMjhjXzAxMA==/`. These new IDs (which I'm calling "v2" from the `2-` at the beginning) are base64-encoded UUIDs with a few characters appended. The above "v2" ID decodes to `e4d79cf8-94f4-4dbd-bea9-01c9e818128c_010`.
For those familiar with UUIDs, you may spot that this looks like a UUIDv4 (randomly-generated). I went back through my LinkedIn message threads, all the way back to 2009 (wow, I've been on there a long time), and found something interesting. The older message threads had UUIDs that fit the form of UUIDv5 (name-based), while the newer ones fit UUIDv4. From my messages, the switch from UUIDv5 to UUIDv4 happened around early 2021-05 (I have a UUIDv5 message on 2021-04-26 and a UUIDv4 on 2021-05-14).
Why am I going on about this? Neither version 4 nor version 5 UUIDs contain any embedded timestamp information (unlike version 1). However, for this particular use case, we can infer that a LinkedIn ID based on UUIDv5 corresponds to a message thread started before 2021-05, while one with a UUIDv4 corresponds to a thread started after that. It's a small, rough bit of timing information, but that's what Unfurl is all about: trying to parse all those tiny pieces of knowledge, in the hope that when put together they might paint a clearer picture.
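To make the decoding concrete, here is a small sketch using only the Python standard library. The helper names are hypothetical, and the before/after labels only reflect the cutover I saw in my own messages; it may well differ for other accounts.

```python
import base64
import uuid


def decode_v2_thread_id(thread_id):
    """Decode a '2-' prefixed messaging thread ID into (UUID, suffix)."""
    if not thread_id.startswith("2-"):
        raise ValueError("not a v2 thread ID")
    # urlsafe_b64decode also handles the standard alphabet when + and / are absent.
    decoded = base64.urlsafe_b64decode(thread_id[2:]).decode("ascii")
    uuid_text, _, suffix = decoded.partition("_")
    return uuid.UUID(uuid_text), suffix


def rough_thread_age(thread_uuid):
    """Rough inference from the UUID version (based on one account's history)."""
    if thread_uuid.version == 5:
        return "likely before 2021-05"
    if thread_uuid.version == 4:
        return "likely after 2021-05"
    return "unknown"


thread_uuid, suffix = decode_v2_thread_id(
    "2-ZTRkNzljZjgtOTRmNC00ZGJkLWJlYTktMDFjOWU4MTgxMjhjXzAxMA=="
)
print(thread_uuid, suffix, thread_uuid.version, rough_thread_age(thread_uuid))
```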
LinkedIn Profile IDs
A few months ago, Jack Crook showed how to decode LinkedIn Profile IDs and use their sequential nature to estimate profile creation time.
These "profile IDs" are different from the other IDs we discussed previously. I thought this technique was really interesting; I've added parsing the ID from base64 to Unfurl. I don't yet do anything with taking that number and estimating the creation time, but that sounds like a neat little project when I find the time.
Tracking URL Parameters
Many websites add URL parameters to links to help with user tracking and analytics. This is not a new practice; we've all seen a bunch of parameters tacked on the end of links. As investigators, we can sometimes use these parameters to infer more information: how a user clicked on a link, what site the link was on, or even when they clicked it.
These parameters are key/value pairs; for example, in `utm_source=newsletter`, the key is `utm_source` and the value is `newsletter`. The values often contain helpful clues (in the example, I'd guess that the link was from an email newsletter). Even in the cases when the values are opaque, we can glean some information from the key. For example, with `fbclid=IwAR3Nuy7koMAB1KyVE1NqjcVGqAExIxVjQLSx-01U_e3LHKwSOzf2NsyP0UI`, I have no idea (yet!) how to parse anything out of the `IwAR3...` value, but from the key I can infer the link was from Facebook.
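The same idea can be sketched in a few lines of Python. This is only an illustration, not how Unfurl does it: `KEY_HINTS`, `describe_tracking_params`, and the example URL are made up, and the hint table covers just the one key discussed above.

```python
from urllib.parse import urlparse, parse_qsl

# Keys whose mere presence tells us something, even when the value is opaque.
KEY_HINTS = {
    "fbclid": "link was likely clicked from Facebook",
}


def describe_tracking_params(url):
    """Return human-readable notes about known tracking parameters in a URL."""
    notes = []
    for key, value in parse_qsl(urlparse(url).query):
        if key.startswith("utm_"):
            # utm_* values are usually readable, so report them as-is.
            notes.append(f"{key}: {value}")
        elif key in KEY_HINTS:
            notes.append(f"{key}: {KEY_HINTS[key]}")
    return notes


# Hypothetical URL, for illustration only.
example = "https://example.com/post?utm_source=newsletter&fbclid=IwAR3example&page=2"
for note in describe_tracking_params(example):
    print(note)
```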
I've added parsing of some of the most common of the tracking/analytics parameters to Unfurl. If you find one you'd like added, please let me know.
Substack
I've seen Substack increase in popularity as well. So far I only subscribe to "The Info Op" by the grugq, but there is a lot of other good content there too. I typically read it via email and noticed that all the links go through Substack redirects. I added expanding of Substack's redirect links to Unfurl; since many of the links are to Twitter/Mastodon and Substack adds `utm_*` tracking parameters, this enables those parsers to run as well, making some nice Unfurl graphs.
Get it!
Those are the major items in this Unfurl release. There are more changes that didn't make it into the blog post; check out the release notes for more. To get Unfurl with these latest updates, you can:
- Use it online at dfir.blog/unfurl or unfurl.link
- If using pip, `pip install dfir-unfurl -U` will upgrade your local Unfurl to the latest
- View the release on GitHub
All features work in both the web UI and command line versions (unfurl_app.py & unfurl_cli.py).