<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[dfir.blog]]></title><description><![CDATA[Digital forensics, web browsers, visualizations, & open source tools]]></description><link>https://dfir.blog/</link><image><url>https://dfir.blog/favicon.png</url><title>dfir.blog</title><link>https://dfir.blog/</link></image><generator>Ghost 5.82</generator><lastBuildDate>Tue, 07 Apr 2026 14:08:54 GMT</lastBuildDate><atom:link href="https://dfir.blog/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Hindsight v2026.01 Released!]]></title><description><![CDATA[Hindsight v2026.01 brings new features, including parsing Sync Data, an updated terminal interface, improved output formats, and dozens of fixes and enhancements.]]></description><link>https://dfir.blog/hindsight-v2026-01/</link><guid isPermaLink="false">69821bfc04abfd293590e01b</guid><category><![CDATA[Hindsight]]></category><category><![CDATA[Chrome]]></category><category><![CDATA[Open Source Tools]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Wed, 04 Feb 2026 18:00:00 GMT</pubDate><media:content url="https://dfir.blog/content/images/2026/02/hindsight-cli.png" medium="image"/><content:encoded><![CDATA[<h2 id="sync-data-parsing">Sync Data Parsing</h2><img src="https://dfir.blog/content/images/2026/02/hindsight-cli.png" alt="Hindsight v2026.01 Released!"><p>A new feature that I&apos;m excited about in this release is parsing of Chrome&apos;s Sync Data. When a user signs into Chrome with their Google account, Chrome can sync bookmarks, passwords, extensions, history, and more across devices. </p><p>This sync functionality stores data locally in LevelDB files, and Hindsight can now parse it - at least partially. 
Most of the LevelDB records hold data encoded in different protobufs, many of which Hindsight now parses. The <em>meaning</em> and function of these parsed records are definitely an area for further research, as there is a wealth of information in the data. Hindsight currently only parses out what devices were used for syncing and enhances the existing &quot;Source&quot; column in the timeline with details about the originating device for synced URL visits:</p><figure class="kg-card kg-image-card"><img src="https://dfir.blog/content/images/2026/02/hindsight-sync-source.png" class="kg-image" alt="Hindsight v2026.01 Released!" loading="lazy" width="943" height="576" srcset="https://dfir.blog/content/images/size/w600/2026/02/hindsight-sync-source.png 600w, https://dfir.blog/content/images/2026/02/hindsight-sync-source.png 943w" sizes="(min-width: 720px) 720px"></figure><h2 id="updated-terminal-interface">Updated Terminal Interface</h2><p>Hindsight&apos;s terminal interface has been largely unchanged for almost 10 years (!?) now, and it showed. Hindsight now uses the <code>rich</code> library to provide a much more polished command-line interface, while still keeping with the spirit and style of the original version. This is mostly a cosmetic change; the command line syntax remains the same. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2026/02/hindsight-rich-cli.gif" class="kg-image" alt="Hindsight v2026.01 Released!" 
loading="lazy" width="1278" height="716" srcset="https://dfir.blog/content/images/size/w600/2026/02/hindsight-rich-cli.gif 600w, https://dfir.blog/content/images/size/w1000/2026/02/hindsight-rich-cli.gif 1000w, https://dfir.blog/content/images/2026/02/hindsight-rich-cli.gif 1278w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Hindsight&apos;s updated terminal interface</span></figcaption></figure><h2 id="new-artifacts-expanded-parsing">New Artifacts &amp; Expanded Parsing</h2><p>Beyond Sync Data, v2026.01 adds parsing for several other Chrome artifacts:</p><ul><li><strong>Permission Actions</strong> from the Preferences file, showing what permission requests websites have made</li><li><strong>Login Data For Account</strong> database, used for account-specific saved credentials in recent Chrome versions</li><li><strong>Account Capabilities</strong> from Preferences, translated into human-readable descriptions</li><li><strong>Parsing for more timestamped values in Preferences, </strong>as there are many top- or second-level keys that just hold a timestamp and are easy to parse</li></ul><h2 id="improved-output-formats">Improved Output Formats</h2><p>All three output formats (XLSX, JSONL, and SQLite) received improvements in this release. The SQLite output in particular was overhauled to be more comparable to the other formats, making it easier to work with Hindsight data in your tool of choice. The JSONL output, which was introduced to make it easier to import Hindsight results into <a href="https://timesketch.org/?ref=dfir.blog" rel="noreferrer">Timesketch</a>, previously only had timestamped records. 
It now includes all records; those without any intrinsic timestamp (like various storage items) have their timestamp set to the Unix epoch and a timestamp description of &quot;Not a time&quot;.</p><h2 id="more-robust-parsing">More Robust Parsing</h2><p>There are over a dozen fixes and improvements to make Hindsight&apos;s parsing more reliable and complete:</p><ul><li>Updated parsing for changes in Chrome v142&apos;s DIPS records</li><li>New danger types and interrupt reason codes for download records</li><li>Better handling of extension version strings and preference timestamps</li><li>More tolerant File System logical path creation</li><li>Improved file-closing and resource management</li></ul><h2 id="get-hindsight">Get Hindsight!</h2><p>You can get Hindsight, view the code, and see the full change log on <a href="https://github.com/obsidianforensics/hindsight?ref=dfir.blog" rel="noopener">GitHub</a>. Both the command line and web UI versions of this release are available as:</p><ul><li>compiled exes attached to the <a href="https://hindsig.ht/release?ref=dfir.blog">GitHub release</a> or in the dist/ folder</li><li>.py versions are available by <code>pip install pyhindsight</code> or downloading/cloning the <a href="https://hindsig.ht/github?ref=dfir.blog">GitHub repo</a>.</li></ul>]]></content:encoded></item><item><title><![CDATA[Unfurl 2025.03]]></title><description><![CDATA[Unfurl v2025.03 adds new features, including 
parsing Google Search's UDM parameter, support for Mastodon forks (like Truth Social), and a utility parser to "clean up" inputs.]]></description><link>https://dfir.blog/unfurl-parses-googe-udm-and-truth-social/</link><guid isPermaLink="false">67d1001404abfd293590df33</guid><category><![CDATA[Unfurl]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Thu, 13 Mar 2025 13:30:40 GMT</pubDate><media:content url="https://dfir.blog/content/images/2025/03/unfurl-google-udm-3.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2025/03/unfurl-google-udm-3.png" alt="Unfurl 2025.03"><p>A new Unfurl release is here! v2025.03 adds new features and some fixes, including:</p><ul><li>Parsing Google Search&apos;s UDM parameter</li><li>Recognizing Mastodon usernames and parsing Mastodon forks (like truthsocial[.]com and gab[.]com) </li><li>Utility parser to &quot;clean up&quot; inputs</li></ul><p><a href="#get-it" rel="noreferrer">Get the new version now</a>, or read on for more details about the new features!</p><h2 id="google-search-udm-parameter">Google Search UDM Parameter</h2><p>I was first made aware of the UDM query string parameter in Google Search when lots of people started posting about the &quot;udm=14 hack&quot; to turn off AI-generated content in Search results. What this parameter seems to do is control the results page type, and <strong>udm=14</strong> sets the results page to &quot;Web&quot;. </p><p>When you click on different results types in a Google Search results page, you can observe the <code>udm</code> value changing as well. 
In the screenshot below, I selected &quot;Images&quot; and the <code>udm</code> value changed to <code>2</code>.</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/03/unfurl-google-udm-1.png" class="kg-image" alt="Unfurl 2025.03" loading="lazy" width="1125" height="665" srcset="https://dfir.blog/content/images/size/w600/2025/03/unfurl-google-udm-1.png 600w, https://dfir.blog/content/images/size/w1000/2025/03/unfurl-google-udm-1.png 1000w, https://dfir.blog/content/images/2025/03/unfurl-google-udm-1.png 1125w"><figcaption><span style="white-space: pre-wrap;">Google Search &quot;Images&quot; Results Page with UDM=2</span></figcaption></figure><p>I manually incremented the <code>udm</code> value in the URL and observed what type of results page was served. <code>udm</code> of <code>51</code> was the highest value I found; setting it to <code>56</code> and above results in a redirect back to the search results page with the <code>udm</code> parameter stripped off (at least up to 65, at which point I stopped testing). The results are in the table below:</p>
<!--kg-card-begin: html-->
<table>
<thead>
<tr>
<th>UDM Value</th>
<th>Google Search Results Page Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>All</td>
</tr>
<tr>
<td>2</td>
<td>Images</td>
</tr>
<tr>
<td>3</td>
<td>Products</td>
</tr>
<tr>
<td>6</td>
<td>Learn</td>
</tr>
<tr>
<td>7</td>
<td>Videos</td>
</tr>
<tr>
<td>8</td>
<td>Jobs</td>
</tr>
<tr>
<td>12</td>
<td>News</td>
</tr>
<tr>
<td>14</td>
<td>Web</td>
</tr>
<tr>
<td>15</td>
<td>Things to do</td>
</tr>
<tr>
<td>18</td>
<td>Forums</td>
</tr>
<tr>
<td>28</td>
<td>Shopping</td>
</tr>
<tr>
<td>36</td>
<td>Books</td>
</tr>
<tr>
<td>37</td>
<td>Products</td>
</tr>
<tr>
<td>38</td>
<td>Videos</td>
</tr>
<tr>
<td>44</td>
<td>Visual matches</td>
</tr>
<tr>
<td>47</td>
<td>Web (+&quot;Refine Results&quot; panel)</td>
</tr>
<tr>
<td>48</td>
<td>Exact matches</td>
</tr>
<tr>
<td>51</td>
<td>Homework</td>
</tr>
</tbody>
</table>
<!--kg-card-end: html-->
<h2 id="mastodon-parsing-improvements">Mastodon Parsing Improvements</h2><p>There are a few minor enhancements to the Mastodon parser in this release. Unfurl now recognizes the username section of a post URL, and splits it into local username and account domain, if applicable. </p><figure class="kg-card kg-image-card kg-width-wide"><a href="https://dfir.blog/unfurl/?url=https://infosec.exchange/@404mediaco@mastodon.social/114116259626492341"><img src="https://dfir.blog/content/images/2025/03/unfurl-mastodon-long-username.png" class="kg-image" alt="Unfurl 2025.03" loading="lazy" width="1852" height="1225" srcset="https://dfir.blog/content/images/size/w600/2025/03/unfurl-mastodon-long-username.png 600w, https://dfir.blog/content/images/size/w1000/2025/03/unfurl-mastodon-long-username.png 1000w, https://dfir.blog/content/images/size/w1600/2025/03/unfurl-mastodon-long-username.png 1600w, https://dfir.blog/content/images/2025/03/unfurl-mastodon-long-username.png 1852w" sizes="(min-width: 1200px) 1200px"></a></figure><p>I&apos;ve also added <code>truthsocial.com</code> and <code>gab.com</code> to the Mastodon parser. Even though they aren&apos;t part of the Fediverse (like most other Mastodon servers are), since they&apos;re based on Mastodon&apos;s code, Unfurl can parse them just the same. 
</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://dfir.blog/content/images/2025/03/unfurl-truth-social.png" class="kg-image" alt="Unfurl 2025.03" loading="lazy" width="1686" height="1137" srcset="https://dfir.blog/content/images/size/w600/2025/03/unfurl-truth-social.png 600w, https://dfir.blog/content/images/size/w1000/2025/03/unfurl-truth-social.png 1000w, https://dfir.blog/content/images/size/w1600/2025/03/unfurl-truth-social.png 1600w, https://dfir.blog/content/images/2025/03/unfurl-truth-social.png 1686w" sizes="(min-width: 1200px) 1200px"></figure><h2 id="input-clean-up-actions">Input &quot;Clean Up&quot; Actions</h2><p>I&apos;m always on the lookout for ways to make Unfurl more helpful and usable. Some of the most common issues I&apos;ve seen when people use Unfurl are improperly formatted inputs, like enclosing the input URL or string in quotes or including leading/trailing spaces. If this happens, Unfurl can&apos;t properly parse the inputs (as it doesn&apos;t <em>know</em> that those are errors), and so gives an unsatisfying result to the user. </p><p>I&apos;ve added a few &quot;clean up&quot; actions to fix these common issues. 
Since I, like Unfurl, can&apos;t be truly sure that these extra characters are unintentional, I wanted to make these modifications visible to the user (both for transparency and to stick with Unfurl&apos;s &quot;show your work&quot; philosophy).</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/03/unfurl-quote-cleanup.png" class="kg-image" alt="Unfurl 2025.03" loading="lazy" width="1908" height="921" srcset="https://dfir.blog/content/images/size/w600/2025/03/unfurl-quote-cleanup.png 600w, https://dfir.blog/content/images/size/w1000/2025/03/unfurl-quote-cleanup.png 1000w, https://dfir.blog/content/images/size/w1600/2025/03/unfurl-quote-cleanup.png 1600w, https://dfir.blog/content/images/2025/03/unfurl-quote-cleanup.png 1908w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Unfurl &quot;Clean Up&quot; parser removing quotes</span></figcaption></figure><p>If you use Unfurl and have any other &quot;annoyances&quot; or quality-of-life type issues, please let me know! I&apos;d love to make Unfurl easy and enjoyable to use for everyone.</p><h2 id="get-it">Get it!</h2><p>Those are the major items in this Unfurl release. There are more changes that didn&apos;t make it into the blog post; check out the <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2025.03?ref=dfir.blog" rel="noreferrer">release notes</a> for more. 
To get Unfurl with these latest updates, you can:</p><ul><li>use it online at <a href="https://dfir.blog/unfurl/">dfir.blog/unfurl</a>  or <a href="https://unfurl.link/?ref=dfir.blog">unfurl.link</a></li><li>if using pip, <code>pip install dfir-unfurl -U</code> will upgrade your local Unfurl to the latest</li><li>View the release on <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2025.03?ref=dfir.blog" rel="noreferrer">GitHub</a></li></ul><p>All features work in both the web UI and command line versions.</p>]]></content:encoded></item><item><title><![CDATA[Hindsight v2025.03 Released!]]></title><description><![CDATA[Hindsight v2025.03 focuses on Extensions - parsing more activity and state records, highlighting Extension permissions, and making it easier to examine Manifests.]]></description><link>https://dfir.blog/hindsight-parses-browser-extensions/</link><guid isPermaLink="false">67cf125f04abfd293590deba</guid><category><![CDATA[Hindsight]]></category><category><![CDATA[Web Browsers]]></category><category><![CDATA[Tools]]></category><category><![CDATA[Chrome]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Tue, 11 Mar 2025 17:02:06 GMT</pubDate><content:encoded><![CDATA[<h3 id="background">Background</h3><p>I&apos;ve been following some of the news related to attacks involving browser extensions and read some great write-ups about what happened and how. 
I&apos;d encourage everyone to read the post by John Tuckner (of Secure Annex) about the Cyberhaven Extension compromise:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://secureannex.com/blog/cyberhaven-extension-compromise/?ref=dfir.blog"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Cyberhaven Extension Compromise</div><div class="kg-bookmark-description">How the Cyberhaven extension was compromised and what it means for your organization.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://secureannex.com/assets/icon.png" alt><span class="kg-bookmark-author">Secure Annex</span><span class="kg-bookmark-publisher">John Tuckner</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://secureannex.com/blog-images/cyberhaven-extension-compromise/cyberhaven.png" alt></div></a></figure><p>One of the things that&apos;s been on my radar for a long time was adding more parsing of Extension-related databases to Hindsight, and this seemed like a timely excuse!</p><h2 id="new-extension-data-section">New &quot;Extension Data&quot; Section</h2><p>Hindsight can now parse eight more databases related to Extension activity (they all use LevelDB and share a similar format). They are:</p><ul><li>Extension Rules</li><li>Extension Scripts</li><li>Extension State</li><li>Local App Settings</li><li>Local Extension Settings</li><li>Managed Extension Settings</li><li>Sync App Settings</li><li>Sync Extension Settings</li></ul><p>As these records are different from other &quot;Storage&quot; ones, I decided to put them in a new&#xA0;<code>Extension Data</code>&#xA0;output section. There aren&apos;t any explicit timestamps associated with these records (although plenty of timestamps are present inside the unstructured <code>Value</code> fields). I have some ideas on plugins and additional parsing, but that will need to wait for a subsequent release. 
For now, I think simply surfacing this data is a good place to start.</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/03/hindsight-extension-data.png" class="kg-image" alt="New &quot;Extension Data&quot; Tab in XLSX Output" loading="lazy" width="1717" height="388" srcset="https://dfir.blog/content/images/size/w600/2025/03/hindsight-extension-data.png 600w, https://dfir.blog/content/images/size/w1000/2025/03/hindsight-extension-data.png 1000w, https://dfir.blog/content/images/size/w1600/2025/03/hindsight-extension-data.png 1600w, https://dfir.blog/content/images/2025/03/hindsight-extension-data.png 1717w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">New &quot;Extension Data&quot; Tab in XLSX Output</span></figcaption></figure><p>Another, more minor, change in this version is to the <code>Installed Extensions</code> section of the output - I&apos;ve added <strong>Permissions</strong> and <strong>Manifest</strong> columns. The <strong>Manifest</strong> column contains the extension&apos;s entire <code>manifest.json</code> file, as lots of different parts of it are relevant for analysis, depending on the question being asked. I pulled out the <strong>Permissions</strong> section from the manifest into its own column to highlight it, as I think it&apos;s particularly important. I also think it&apos;s useful to be able to quickly scan down the list of installed extensions and see what permissions each has, in case something jumps out as a bit unusual. 
</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/03/hindsight-extension-permissions-manifest.png" class="kg-image" alt="Updated &quot;Installed Extensions&quot; Tab, with Permissions and Manifest Columns" loading="lazy" width="2000" height="273" srcset="https://dfir.blog/content/images/size/w600/2025/03/hindsight-extension-permissions-manifest.png 600w, https://dfir.blog/content/images/size/w1000/2025/03/hindsight-extension-permissions-manifest.png 1000w, https://dfir.blog/content/images/size/w1600/2025/03/hindsight-extension-permissions-manifest.png 1600w, https://dfir.blog/content/images/2025/03/hindsight-extension-permissions-manifest.png 2168w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Updated &quot;Installed Extensions&quot; Tab, with Permissions and Manifest Columns</span></figcaption></figure><h2 id="get-hindsight">Get Hindsight!</h2><p>You can get Hindsight, view the code, and see the full change log on <a href="https://github.com/obsidianforensics/hindsight?ref=dfir.blog" rel="noopener">GitHub</a>. 
Both the command line and web UI versions of this release are available as:</p><ul><li>compiled exes attached to the <a href="https://hindsig.ht/release?ref=dfir.blog">GitHub release</a> or in the dist/ folder</li><li>.py versions are available by <code>pip install pyhindsight</code> or downloading/cloning the <a href="https://hindsig.ht/github?ref=dfir.blog">GitHub repo</a>.</li></ul>]]></content:encoded></item><item><title><![CDATA[Unfurl v2025.02 Released]]></title><description><![CDATA[Unfurl v2025.02 adds parsing of obfuscated IP addresses, more Bluesky timestamps, and more!]]></description><link>https://dfir.blog/unfurl-parses-obfuscated-ip-addresses/</link><guid isPermaLink="false">67b1f90804abfd293590de21</guid><category><![CDATA[Unfurl]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Wed, 19 Feb 2025 14:41:19 GMT</pubDate><media:content url="https://dfir.blog/content/images/2025/02/unfurl-deceptive-ip-address.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2025/02/unfurl-deceptive-ip-address.png" alt="Unfurl v2025.02 Released"><p>A new Unfurl release is here! 
v2025.02 adds new features and some fixes, including:</p><ul><li>Parsing of IP addresses, including encoded or obfuscated variants</li><li>Resolving Bluesky handles to their backing identifiers (DIDs), and then looking up that DID in the plc.directory audit log to find its creation timestamp</li><li>Bug fixes and speed enhancements for bulk parsing</li></ul><p>This is a relatively small release, but in addition to the new features, it fixes a few bugs (see the full changelog on the&#xA0;<a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2025.02?ref=dfir.blog" rel="noreferrer">GitHub release page</a>).&#xA0;<a href="https://dfir.blog/unfurl-parses-obfuscated-ip-addresses/#get-it" rel="noreferrer">Get it now</a>, or read on for more details about the new features!</p><h3 id="parsing-of-ip-addresses-in-many-forms">Parsing of IP Addresses (in many forms)</h3><p>Unfurl previously only parsed domain names, but can now correctly recognize IP addresses. Not just IPs as they most typically appear (like 8.8.8.8 or 10.0.0.1), but in other forms, which are often used by attackers to try to obscure the actual destination (like http://example.com@1157586937). 
Below are more supported examples (from a <a href="https://www.trustwave.com/en-us/resources/blogs/spiderlabs-blog/evasive-urls-in-spam/?ref=dfir.blog" rel="noreferrer">Trustwave report</a>); all examples point to a Google IP:</p><ul><li>Dotted decimal IP address:&#xA0;<a href="https://216.58.199.78/?ref=dfir.blog" rel="nofollow">https://216.58.199.78</a>&#xA0;(the most common)</li><li>Octal IP address:&#xA0;<a href="https://216.58.199.78/?ref=dfir.blog" rel="nofollow">https://0330.0072.0307.0116</a>&#xA0;(convert each decimal number to octal)</li><li>Hexadecimal IP address:&#xA0;<a href="https://216.58.199.78/?ref=dfir.blog" rel="nofollow">https://0xD83AC74E</a>&#xA0;(convert each decimal number to hexadecimal)</li><li>Integer or DWORD IP address:&#xA0;<a href="https://216.58.199.78/?ref=dfir.blog" rel="nofollow">https://3627730766</a>&#xA0;(convert hexadecimal IP to integer)</li></ul><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/02/unfurl-deceptive-ip-address-1.png" class="kg-image" alt="Unfurl v2025.02 Released" loading="lazy" width="1423" height="1021" srcset="https://dfir.blog/content/images/size/w600/2025/02/unfurl-deceptive-ip-address-1.png 600w, https://dfir.blog/content/images/size/w1000/2025/02/unfurl-deceptive-ip-address-1.png 1000w, https://dfir.blog/content/images/2025/02/unfurl-deceptive-ip-address-1.png 1423w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Unfurl parsing a deceptive URL with a username and encoded IP address</span></figcaption></figure><p></p><h3 id="parsing-and-lookups-of-bluesky-handles">Parsing and Lookups of Bluesky Handles</h3><p>Unfurl added support for parsing the embedded timestamps out of Bluesky post IDs (&quot;TIDs&quot;) in the v2024.11 release; this latest release adds the ability to resolve a Bluesky handle to its underlying <code>did</code> , then consult the plc.directory audit log to see when that 
<code>did</code> was created. </p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/02/unfurl-bsky-timestamps.png" class="kg-image" alt="Unfurl v2025.02 Released" loading="lazy" width="1710" height="1038" srcset="https://dfir.blog/content/images/size/w600/2025/02/unfurl-bsky-timestamps.png 600w, https://dfir.blog/content/images/size/w1000/2025/02/unfurl-bsky-timestamps.png 1000w, https://dfir.blog/content/images/size/w1600/2025/02/unfurl-bsky-timestamps.png 1600w, https://dfir.blog/content/images/2025/02/unfurl-bsky-timestamps.png 1710w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Unfurl parsing a bsky.app URL, showing the handle creation and the post timestamps</span></figcaption></figure><div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x2139;&#xFE0F;</div><div class="kg-callout-text">Note: both the handle resolution and reading the creation timestamp from the audit log require a remote lookup, which is disabled by default in the local Python version. You can enable it by changing the <code spellcheck="false" style="white-space: pre-wrap;">unfurl.ini</code> file.</div></div><h2 id="get-it">Get it!</h2><p>Those are the major items in this Unfurl release. There are more changes that didn&apos;t make it into the blog post; check out the <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2025.02?ref=dfir.blog" rel="noreferrer">release notes</a> for more. 
To get Unfurl with these latest updates, you can:</p><ul><li>use it online at <a href="https://dfir.blog/unfurl/">dfir.blog/unfurl</a>  or <a href="https://unfurl.link/?ref=dfir.blog">unfurl.link</a></li><li>if using pip, <code>pip install dfir-unfurl -U</code> will upgrade your local Unfurl to the latest</li><li>View the release on <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2025.02?ref=dfir.blog" rel="noreferrer">GitHub</a></li></ul><p>All features work in both the web UI and command line versions.</p>]]></content:encoded></item><item><title><![CDATA[Authenticating Screenshots from Netflix's Carry-On Movie]]></title><description><![CDATA[I watch Netflix's Carry-On, notice a real Google Search URL on screen, extract lots of data points from it and "authenticate" the screenshot.]]></description><link>https://dfir.blog/authenticating-screenshots-from-netflix-carry-on-movie/</link><guid isPermaLink="false">677b39a704abfd293590dbca</guid><category><![CDATA[Unfurl]]></category><category><![CDATA[Web Browsers]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Mon, 13 Jan 2025 17:12:09 GMT</pubDate><media:content url="https://dfir.blog/content/images/2025/01/carry-on-google-search-url-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2025/01/carry-on-google-search-url-1.png" alt="Authenticating Screenshots from Netflix&apos;s Carry-On Movie"><p>Over the winter holiday, I got a bit of downtime. During this, I was watching Netflix&apos;s <em>Carry-On</em> when I noticed something: an actual URL on screen! Often in movies and TV, any &quot;web browsers&quot; that appear are mock-ups (and either look awesomely futuristic or laughably bad). Not only did this appear to be a real-life web browser showing a real webpage, it was a Google Search Engine Results Page (SERP), which I know can have tons of interesting bits encoded in it. 
My wife chuckled at me as I paused the movie to take a closer look (she&apos;s used to that by now). Here it is:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/01/carry-on-google-search-url.png" class="kg-image" alt="Authenticating Screenshots from Netflix&apos;s Carry-On Movie" loading="lazy" width="2000" height="1269" srcset="https://dfir.blog/content/images/size/w600/2025/01/carry-on-google-search-url.png 600w, https://dfir.blog/content/images/size/w1000/2025/01/carry-on-google-search-url.png 1000w, https://dfir.blog/content/images/size/w1600/2025/01/carry-on-google-search-url.png 1600w, https://dfir.blog/content/images/2025/01/carry-on-google-search-url.png 2210w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">A Google Search Results Page (SERP) from the Netflix movie </span><i><em class="italic" style="white-space: pre-wrap;">Carry-On</em></i></figcaption></figure><p>The next day, I went back to that scene (about 47 minutes in, if you want to see it yourself) and did my best to type out the URL. I got as far as the <code>oq</code> query string parameter, then gave up, as the image was getting blurry and I already had quite a bit. For the Google SERP URL, I was able to read the <code>q</code>, <code>rlz</code>, <code>ei</code>, <code>ved</code>, <code>uact</code>, and <code>oq</code> query string parameters. 
I put the URL into <a href="https://unfurl.link/?ref=dfir.blog" rel="noreferrer">Unfurl</a>, and got: </p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/01/unfurl-carry-on-google-serp-url-1.png" class="kg-image" alt="Authenticating Screenshots from Netflix&apos;s Carry-On Movie" loading="lazy" width="2000" height="589" srcset="https://dfir.blog/content/images/size/w600/2025/01/unfurl-carry-on-google-serp-url-1.png 600w, https://dfir.blog/content/images/size/w1000/2025/01/unfurl-carry-on-google-serp-url-1.png 1000w, https://dfir.blog/content/images/size/w1600/2025/01/unfurl-carry-on-google-serp-url-1.png 1600w, https://dfir.blog/content/images/2025/01/unfurl-carry-on-google-serp-url-1.png 2000w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Unfurl parsing a Google SERP that appeared in Netflix&apos;s </span><i><em class="italic" style="white-space: pre-wrap;">Carry-On</em></i></figcaption></figure><p>There&apos;s a ton of stuff here! If you don&apos;t know what all those Google Search parameters mean, no problem; Unfurl does its best to parse and explain them. I&apos;d encourage you to <a href="https://dfir.blog/unfurl/?url=https://www.google.com/search?q=nova+shock&amp;rlz=1C1RXQR_enUS928US928&amp;ei=HEBlZL-eLrOmptQP3pmK4AI&amp;ved=0ahUKEwi_nJusn_3-AhUzk4kEHd6MAiwQ4dUDCBA&amp;uact=5&amp;oq=nova+shock" rel="noreferrer">take a look at the interactive graph</a> yourself; there&apos;s useful hover text on some nodes that isn&apos;t visible in the screenshot above. 
</p><p>I&apos;ll summarize what Unfurl pulled out of each query string parameter from the <em>Carry-On</em> Google SERP URL:</p><ul><li><code>q</code>: &quot;nova shock&quot; - the terms used in the Google search <strong>q</strong>uery</li><li><code>oq</code>: &quot;nova shock&quot; - the &quot;<strong>o</strong>riginal <strong>q</strong>uery&quot; terms entered by the user.<ul><li>Sometimes auto-complete or suggestions are used to reach the actual search terms (in <code>q</code>) from the <code>oq</code> value, but that doesn&apos;t look to have happened here, since the <code>q</code> and <code>oq</code> are the same.</li></ul></li><li><code>rlz</code>: this is used for grouping promotion event signals and anonymous user cohorts (<a href="https://dfir.blog/google-search-rlz/" rel="noreferrer">more info on <code>rlz</code> in this post</a>). Interesting parsed info:<ul><li>the search was performed using Chrome Omnibox (that combination URL and search box at the top of Chrome)</li><li>the language was English</li><li>the Chrome browser used to make the search was installed in the United States the week of <strong>2020-11-16</strong>, which is also the same time period the first Google search was made from that system</li></ul></li><li><code>ei</code>: has info about when the search session started. The <em>search session</em> starting timestamp is before the actual <em>search </em>occurred; this is often seconds before, but could be many hours.  <ul><li>The search session started <strong>2023-05-17 20:59:08.757567+00:00</strong></li></ul></li><li><code>ved</code>: often appears when a user clicks a link on a Google page. 
It contains information about the link that was clicked on: position on the page, link type, and timing&#xA0;(<a href="https://dfir.blog/google-ved-versions/" rel="noreferrer">more info on <code>ved</code> in this post</a>).<ul><li>The search session started <strong>2023-05-17 20:59:08.757567+00:00</strong> (matches the <code>ei</code> timestamp)</li></ul></li></ul><div class="kg-card kg-callout-card kg-callout-card-blue"><div class="kg-callout-emoji">&#x2139;&#xFE0F;</div><div class="kg-callout-text">An important note: what those of us outside Google know (or think we know) about Google Search URLs has been deduced through research and testing, and could be invalidated at any time if Google makes changes. Google doesn&apos;t publish what these query string parameters mean or how to interpret them, but a lot of people have spent a lot of time and effort trying to figure that out (for both forensic and search engine optimization reasons). </div></div><p>That&apos;s a lot of information extracted from one URL! Most of the time when I&apos;m doing this kind of analysis, I don&apos;t have a video (or screenshot) of the user performing the actions in the browser, and the data points from the URL help paint the full picture of what happened. In this instance, however, we <em>can</em> see what the user was doing, which lets us ask a different question: is what is encoded in the URL consistent with what we&apos;re seeing? Or phrased another way: <strong>has the screenshot been manipulated?</strong></p><h2 id="is-the-carry-on-screenshot-consistent-with-the-movie-setting">Is the <em>Carry-On</em> screenshot consistent with the movie setting?</h2><p>So, how did the <em>Carry-On</em> screenshot do as far as being consistent with the events around it? Let&apos;s go through each data point from the URL and see how it fits with what we see in the movie:</p><table>
<thead>
<tr>
<th>Attribute</th>
<th>On-Screen</th>
<th>Extracted Data Point</th>
<th>Match</th>
</tr>
</thead>
<tbody>
<tr>
<td>Search query</td>
<td>&quot;Nov Chuck&quot;</td>
<td>&quot;nova shock&quot;</td>
<td>&#x274C;</td>
</tr>
<tr>
<td>Browser is Chrome</td>
<td>Yes</td>
<td>Yes</td>
<td>&#x2705;</td>
</tr>
<tr>
<td>Search location</td>
<td>Google Home Page <br> or New Tab Page</td>
<td>Omnibox</td>
<td>&#x274C;</td>
</tr>
<tr>
<td>Language</td>
<td>English</td>
<td>English</td>
<td>&#x2705;</td>
</tr>
<tr>
<td>Browser install date</td>
<td>Unknown</td>
<td>2020-11-16</td>
<td>&#x2754;</td>
</tr>
<tr>
<td>Search session start</td>
<td>202?-12-24</td>
<td>2023-05-17</td>
<td>&#x274C;</td>
</tr>
</tbody>
</table>
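The <code>ei</code> decoding behind the &quot;Search session start&quot; row above can be reproduced directly. Below is a minimal Python sketch based on community research into the <code>ei</code> parameter (Google doesn&apos;t document this format, and it could change): the value is URL-safe base64, the first 4 bytes hold a little-endian Unix timestamp in seconds, and the protobuf-style varint that follows holds the microseconds.

```python
import base64
import struct
from datetime import datetime, timezone

def decode_ei(ei: str) -> datetime:
    """Decode a Google 'ei' parameter into its embedded session timestamp.

    Based on community research, not any Google documentation: the value is
    URL-safe base64; bytes 0-3 are a little-endian Unix timestamp (seconds),
    and the varint starting at byte 4 holds the microseconds.
    """
    raw = base64.urlsafe_b64decode(ei + '=' * (-len(ei) % 4))
    seconds = struct.unpack('<i', raw[0:4])[0]
    # Read a protobuf-style varint (7 bits per byte, least-significant first)
    micros, shift = 0, 0
    for b in raw[4:]:
        micros |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:
            break
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=micros)

# The ei value transcribed from the Carry-On SERP URL:
print(decode_ei('HEBlZL-eLrOmptQP3pmK4AI'))  # 2023-05-17 20:59:08.757567+00:00
```

Running this on the movie&apos;s <code>ei</code> value reproduces the session timestamp Unfurl reported in the graph above.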
<p><strong>Conclusion</strong>: <em>The screenshot has been altered!</em> &#x1F632; </p><p>I know, who could have guessed a computer screen in a movie had some edits applied? The search query from the URL didn&apos;t match what was on the screen, which is the most definitive mismatch I can see. The two matching attributes, the browser being Chrome and the language being English, are so common that it would be strange if they didn&apos;t match. I don&apos;t weigh the search location mismatch (Omnibox vs Home Page) heavily, as I&apos;ve had a hard time getting a <code>rlz</code> parameter to appear in SERP URLs, so I haven&apos;t been able to verify its behavior. Likewise, the browser install date from the <code>rlz</code> is plausible, but not useful for verification in this case. </p><p>The last big mismatch is the search session timestamp. While the search session starting timestamp can be a ways before the search actually occurs, 7 months is quite a stretch (the movie is set on December 24th, while the embedded timestamp is May 17th). However, if you kind of squint at the computer&apos;s clock while the search is happening, it might resemble <code>5/1?/????</code>. So maybe the computer and the Google search agree on the date at least, but the people on-screen aren&apos;t being honest about what the timeframe is? 
&#x1F9D0;</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/01/carry-on-search-clock.png" class="kg-image" alt="Authenticating Screenshots from Netflix&apos;s Carry-On Movie" loading="lazy" width="558" height="283"><figcaption><span style="white-space: pre-wrap;">Blurry screenshot of the system clock onscreen in </span><i><em class="italic" style="white-space: pre-wrap;">Carry-On</em></i></figcaption></figure><h2 id="real-life-applications">Real Life Applications</h2><h3 id="evaluating-the-authenticity-of-screenshots">Evaluating the Authenticity of Screenshots</h3><p>Now, this post is just a fun exercise (no one expects screenshots from movies to match reality), but it does have more serious parallels. If you come across a screenshot, whether that&apos;s during a DFIR investigation, some OSINT research, or just on social media, if that screenshot has a URL in it, you potentially have some more data points around the veracity of that screenshot.</p><p>This post highlighted how useful something like a search engine URL can be, but all sorts of URLs can have interesting bits encoded inside them, like those from <a href="https://dfir.blog/unfurl/?url=https://twitter.com/_RyanBenson/status/1189581422685634560?s=20" rel="noreferrer">Twitter/X</a>, <a href="https://dfir.blog/unfurl/?url=https://discordapp.com/channels/427876741990711298/537760691302563843/643183730227281931" rel="noreferrer">Discord</a>, <a href="https://dfir.blog/unfurl/?url=https://www.tiktok.com/@billnye/video/6854717870488702213?lang=en" rel="noreferrer">TikTok</a>, and many more!</p><h3 id="importance-of-verification">Importance of Verification </h3><p>Above, when I said I just typed out the URL from <em>Carry-On</em> and dropped it into Unfurl to get all those results, I wasn&apos;t being completely honest. 
I did put my transcribed URL into Unfurl, but when I took a close look at the results I noticed things weren&apos;t quite right.</p><p>Some of the things we&apos;ve observed about the URLs are useful, like what we think the timestamps represent, and some are more like trivia. One of the less-useful things we&apos;ve figured out is that the last (or 3rd) value in the <code>ei</code> parameter should match the 13-3 value in the <code>ved</code> parameter. We don&apos;t know what these values mean, but after looking at enough examples, we expect them to match. And in my first transcribed example... they don&apos;t. We also expect the timestamps in the <code>ved</code> and the <code>ei</code> to match, and those don&apos;t either. What&apos;s going on?</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2025/01/unfurl-carry-on-1.png" class="kg-image" alt="Authenticating Screenshots from Netflix&apos;s Carry-On Movie" loading="lazy" width="2000" height="563" srcset="https://dfir.blog/content/images/size/w600/2025/01/unfurl-carry-on-1.png 600w, https://dfir.blog/content/images/size/w1000/2025/01/unfurl-carry-on-1.png 1000w, https://dfir.blog/content/images/size/w1600/2025/01/unfurl-carry-on-1.png 1600w, https://dfir.blog/content/images/2025/01/unfurl-carry-on-1.png 2000w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">First attempt at Unfurling the SERP URL from </span><i><em class="italic" style="white-space: pre-wrap;">Carry-On</em></i></figcaption></figure><p>This led me to experiment with the <code>ei</code> and <code>ved</code> parameters; specifically, with the characters that can be a little ambiguous (like lowercase &quot;L&quot; (l) and uppercase &quot;i&quot; (I)). After some tinkering, I found that I had initially misread two characters, both in the <code>ei</code> parameter. 
The correct value was <code>HEBlZL-eLrOmptQP3pmK4AI</code>; previously I had the 4th and last characters switched with their homoglyphs (<code>HEBIZL-eLrOmptQP3pmK4Al</code>). This helps illustrate that even &quot;trivia&quot;-type knowledge has its uses; while I don&apos;t know what those values <em>mean</em>, I was able to use them as a kind of consistency check. </p><h2 id="try-it-out">Try It Out!</h2><p>That&apos;s it for this post. If you found it interesting, I&apos;d encourage you to try it on a screenshot you find and let me know how it goes! Unfurl is useful for this, and you can use it <a href="https://unfurl.link/?ref=dfir.blog" rel="noreferrer">online </a>or <a href="https://github.com/obsidianforensics/unfurl?ref=dfir.blog" rel="noreferrer">locally</a>.</p>]]></content:encoded></item><item><title><![CDATA[Video of "What Can DFIQ Do For You?" Posted]]></title><description><![CDATA[The talk "What Can DFIQ Do For You?" that Jon Brown and I gave at the SANS DFIR Summit 2023 has been posted on YouTube!]]></description><link>https://dfir.blog/dfiq-video-at-sans-dfir-summit-2023/</link><guid isPermaLink="false">6765afa704abfd293590dbae</guid><category><![CDATA[Presentations & Interviews]]></category><category><![CDATA[Open Source Tools]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Wed, 20 Dec 2023 17:59:00 GMT</pubDate><media:content url="https://dfir.blog/content/images/2024/12/dfiq-sans-video.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2024/12/dfiq-sans-video.png" alt="Video of &quot;What Can DFIQ Do For You?&quot; Posted"><p>The talk &quot;What Can DFIQ Do For You?&quot; that Jon Brown and I gave at the SANS DFIR Summit 2023 has been posted on YouTube! It was awesome to be able to publicly launch <a href="https://dfiq.org/?ref=dfir.blog" rel="noreferrer">DFIQ</a>; I hope this is just the start to a new DFIR community resource. 
</p><figure class="kg-card kg-embed-card"><iframe width="200" height="113" src="https://www.youtube.com/embed/oFCVREL3IDE?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen title="What Can DFIQ Do For You?"></iframe></figure>]]></content:encoded></item><item><title><![CDATA[Unfurl v2023.09 Released!]]></title><description><![CDATA[Unfurl v2023.09 adds parsing for JWTs, URLs with encoded DoH (DNS over HTTPS) requests, and more Mastodon servers. ]]></description><link>https://dfir.blog/unfurl-parsing-jwt-and-doh/</link><guid isPermaLink="false">66579e6f04abfd293590d980</guid><category><![CDATA[Unfurl]]></category><category><![CDATA[Open Source Tools]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Wed, 27 Sep 2023 13:30:00 GMT</pubDate><media:content url="https://dfir.blog/content/images/2023/09/unfurl-parse-jwt-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2023/09/unfurl-parse-jwt-1.png" alt="Unfurl v2023.09 Released!"><p>A new Unfurl release is here! v2023.09 adds new features and some fixes. The release adds:</p><ul><li>Parsing of JWTs (JSON Web Tokens)</li><li>Parsing of DoH (DNS over HTTPS) URLs</li><li>More recognized Mastodon servers</li></ul><p>This is a relatively small release, but in addition to the new features, it fixes a few bugs (see the full changelog on the <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2023.09.05?ref=dfir.blog">GitHub release page</a>). <a href="#get-it">Get it now</a>, or read on for more details about the new features!</p><h2 id="parse-json-web-tokens-jwts">Parse JSON Web Tokens (JWTs)</h2><p>JSON Web Tokens (JWTs) are used frequently for authorization and signing purposes. 
I won&apos;t go into much detail about their structure here (<a href="https://jwt.io/introduction?ref=dfir.blog">check this out for an introduction</a>), but at the highest level, JWTs have three parts: header, payload, and signature. Each part is base64-encoded, and the parts are separated by a <code>.</code>. Unfurl first splits a JWT into those three components, then base64-decodes the header and payload, then parses the resulting JSON objects. While Unfurl could do all that in one step, it uses three steps to keep with the &quot;show your work&quot; spirit of the tool. </p><p>Here&apos;s Unfurl parsing a simple JWT (<a href="https://en.wikipedia.org/wiki/JSON_Web_Token?ref=dfir.blog#Structure">from Wikipedia</a>):</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2023/09/unfurl-parse-jwt.png" class="kg-image" alt="Unfurl v2023.09 Released!" loading="lazy" width="1460" height="1061" srcset="https://dfir.blog/content/images/size/w600/2023/09/unfurl-parse-jwt.png 600w, https://dfir.blog/content/images/size/w1000/2023/09/unfurl-parse-jwt.png 1000w, https://dfir.blog/content/images/2023/09/unfurl-parse-jwt.png 1460w" sizes="(min-width: 1200px) 1200px"><figcaption>Unfurl parsing a simple JWT</figcaption></figure><p>I encounter these often when looking through links in emails. Here&apos;s another example, with many more parsers involved as well:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2023/09/unfurl-jwt-lnks.gd-email.png" class="kg-image" alt="Unfurl v2023.09 Released!" 
loading="lazy" width="1410" height="1116" srcset="https://dfir.blog/content/images/size/w600/2023/09/unfurl-jwt-lnks.gd-email.png 600w, https://dfir.blog/content/images/size/w1000/2023/09/unfurl-jwt-lnks.gd-email.png 1000w, https://dfir.blog/content/images/2023/09/unfurl-jwt-lnks.gd-email.png 1410w" sizes="(min-width: 1200px) 1200px"><figcaption>Unfurl parsing an email link with a JWT</figcaption></figure><p>Don&apos;t you just love how ridiculous email links have gotten? This one wasn&apos;t even malicious. </p><h2 id="dns-over-https-doh">DNS over HTTPS (DoH)</h2><p>I was reading a <a href="https://www.dshield.org/diary/Decoding+DNS+over+HTTPs+Requests/29488?ref=dfir.blog">SANS Internet Storm Center post by Johannes Ullrich</a> a while ago about decoding DoH requests in their honeypot and found it interesting. I knew a little about DoH, but hadn&apos;t seen URLs containing encoded requests before. I created an Unfurl parser for them; see an example below:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2023/09/unfurl-dns-doh.png" class="kg-image" alt="Unfurl v2023.09 Released!" loading="lazy" width="1609" height="940" srcset="https://dfir.blog/content/images/size/w600/2023/09/unfurl-dns-doh.png 600w, https://dfir.blog/content/images/size/w1000/2023/09/unfurl-dns-doh.png 1000w, https://dfir.blog/content/images/size/w1600/2023/09/unfurl-dns-doh.png 1600w, https://dfir.blog/content/images/2023/09/unfurl-dns-doh.png 1609w" sizes="(min-width: 1200px) 1200px"><figcaption>Unfurl parsing a URL containing an encoded DoH message</figcaption></figure><h2 id="more-mastodon-servers">More Mastodon Servers</h2><p>Unfurl has parsed timestamps from Mastodon&apos;s Toots for a long time, but it previously recognized a limited number of Mastodon servers. With the recent surge in Mastodon usage, I&apos;ve updated the list of Mastodon servers Unfurl knows about to nearly 250. 
</p><h2 id="get-it">Get it!</h2><p>Those are the major items in this Unfurl release. There are more changes that didn&apos;t make it into the blog post; check out the <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2023.09.05?ref=dfir.blog">release notes</a> for more. To get Unfurl with these latest updates, you can:</p><ul><li>use it online at <a href="https://dfir.blog/unfurl/">dfir.blog/unfurl</a> &#xA0;or <a href="https://unfurl.link/?ref=dfir.blog">unfurl.link</a></li><li>if using pip, <code>pip install dfir-unfurl -U</code> will upgrade your local Unfurl to the latest</li><li>View the release on <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2023.09.05?ref=dfir.blog">GitHub</a></li></ul><p>All features work in both the web UI and command line versions (<strong><strong>unfurl_app.py</strong></strong> &amp; <strong><strong>unfurl_cli.py</strong></strong>).</p>]]></content:encoded></item><item><title><![CDATA[Unfurl v2022.11: Social Media Edition]]></title><description><![CDATA[This "social media edition" Unfurl release includes parsing Twitter sharing codes, timestamps from Mastodon and LinkedIn IDs, expanding Substack redirects, & more!]]></description><link>https://dfir.blog/unfurl-parsing-twitter-mastodon-linkedin/</link><guid isPermaLink="false">66579e6f04abfd293590d97e</guid><category><![CDATA[Unfurl]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Thu, 10 Nov 2022 14:18:00 GMT</pubDate><media:content url="https://dfir.blog/content/images/2022/11/unfurl-2022.11-square-2.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2022/11/unfurl-2022.11-square-2.png" alt="Unfurl v2022.11: Social Media Edition"><p>It&apos;s been a while, but a new Unfurl release is here! v2022.11 adds new features and has behind-the-scenes changes. 
With all the attention on Twitter lately, in this post I&apos;m going to highlight changes related to social media websites:</p><ul><li>Defining <strong>Twitter&apos;s</strong> sharing (<code>s</code>) parameter values (all 71 of them!)</li><li>Extracting timestamps from <strong>Mastodon</strong> IDs</li><li>Decoding multiple types of <strong>LinkedIn</strong> identifiers</li><li>Expanding <strong>Substack</strong> redirect links </li><li>Parsing common tracking/analytics query string parameters</li></ul><p><a href="#get-it">Get it now</a>, or read on for more details about the new features!</p><h2 id="twitter">Twitter</h2><p>Besides the headline-grabbing changes at Twitter, there have been some gradual, less obvious changes as well: the query string parameters. A few years ago (maybe 2018?) the <code>s</code> parameter appeared, and people (myself included) <a href="https://twitter.com/EikoFried/status/995601093001400320?ref=dfir.blog">began</a> <a href="https://twitter.com/mattnavarra/status/1044147538922803201?ref=dfir.blog">speculating</a> and trying to figure out its purpose. By experimentation, the values for <code>s</code> of 19, 20, and 21 seemed pretty clear: they meant a sharing source of Android, Twitter Web, and iOS, respectively (and Unfurl parsed them as such). </p><p>A few weeks ago, someone was poking at Twitter&apos;s JavaScript files and discovered an object with the mappings of 71 values for the sharing codes! They kindly <a href="https://github.com/obsidianforensics/unfurl/issues/162?ref=dfir.blog">shared this with me</a> (<strong>thanks <a href="https://github.com/2xyo?ref=dfir.blog">2xyo</a>!</strong>) and I added them to Unfurl. </p><p>The codes generally show the combination of device type (iOS, iPhone, Android, web browser) and method (email, WhatsApp, copy) used to share the tweet. 
I haven&apos;t personally seen the majority of these codes in use so I can&apos;t say they all are still valid, but then I also haven&apos;t shared a tweet from my iPad using LinkedIn (<code>s=71</code>)! </p><p>Here&apos;s my cleaned-up interpretation of what the <code>s</code> codes mean (links to the original .js files are in the <a href="https://github.com/obsidianforensics/unfurl/issues/162?ref=dfir.blog">GitHub issue</a> if you&apos;re curious).</p><!--kg-card-begin: markdown--><table>
<thead>
<tr>
<th><code>s</code> Parameter</th>
<th>Shared From</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>an Android using SMS</td>
</tr>
<tr>
<td>02</td>
<td>an Android using Email</td>
</tr>
<tr>
<td>03</td>
<td>an Android using Gmail</td>
</tr>
<tr>
<td>04</td>
<td>an Android using Facebook</td>
</tr>
<tr>
<td>05</td>
<td>an Android using WeChat</td>
</tr>
<tr>
<td>06</td>
<td>an Android using Line</td>
</tr>
<tr>
<td>07</td>
<td>an Android using FBMessenger</td>
</tr>
<tr>
<td>08</td>
<td>an Android using WhatsApp</td>
</tr>
<tr>
<td>09</td>
<td>an Android using Other</td>
</tr>
<tr>
<td>10</td>
<td>iOS using Messages or SMS</td>
</tr>
<tr>
<td>11</td>
<td>iOS using Email</td>
</tr>
<tr>
<td>12</td>
<td>iOS using Other</td>
</tr>
<tr>
<td>13</td>
<td>an Android using Download</td>
</tr>
<tr>
<td>14</td>
<td>iOS using Download</td>
</tr>
<tr>
<td>15</td>
<td>an Android using Hangouts</td>
</tr>
<tr>
<td>16</td>
<td>an Android using Twitter DM</td>
</tr>
<tr>
<td>17</td>
<td>Twitter Web using Email</td>
</tr>
<tr>
<td>18</td>
<td>Twitter Web using Download</td>
</tr>
<tr>
<td>19</td>
<td>an Android using Copy</td>
</tr>
<tr>
<td>20</td>
<td>Twitter Web using Copy</td>
</tr>
<tr>
<td>21</td>
<td>iOS using Copy</td>
</tr>
<tr>
<td>22</td>
<td>iOS using Snapchat</td>
</tr>
<tr>
<td>23</td>
<td>an Android using Snapchat</td>
</tr>
<tr>
<td>24</td>
<td>iOS using WhatsApp</td>
</tr>
<tr>
<td>25</td>
<td>iOS using FBMessenger</td>
</tr>
<tr>
<td>26</td>
<td>iOS using Facebook</td>
</tr>
<tr>
<td>27</td>
<td>iOS using Gmail</td>
</tr>
<tr>
<td>28</td>
<td>iOS using Telegram</td>
</tr>
<tr>
<td>29</td>
<td>iOS using Line</td>
</tr>
<tr>
<td>30</td>
<td>iOS using Viber</td>
</tr>
<tr>
<td>31</td>
<td>an Android using Slack</td>
</tr>
<tr>
<td>32</td>
<td>an Android using Kakao</td>
</tr>
<tr>
<td>33</td>
<td>an Android using Discord</td>
</tr>
<tr>
<td>34</td>
<td>an Android using Reddit</td>
</tr>
<tr>
<td>35</td>
<td>an Android using Telegram</td>
</tr>
<tr>
<td>36</td>
<td>an Android using Instagram</td>
</tr>
<tr>
<td>37</td>
<td>an Android using Daum</td>
</tr>
<tr>
<td>38</td>
<td>iOS using Instagram</td>
</tr>
<tr>
<td>39</td>
<td>iOS using LinkedIn</td>
</tr>
<tr>
<td>40</td>
<td>an Android using LinkedIn</td>
</tr>
<tr>
<td>41</td>
<td>Gryphon using Copy</td>
</tr>
<tr>
<td>42</td>
<td>an iPhone using SMS</td>
</tr>
<tr>
<td>43</td>
<td>an iPhone using Email</td>
</tr>
<tr>
<td>44</td>
<td>an iPhone using Other</td>
</tr>
<tr>
<td>45</td>
<td>an iPhone using Download</td>
</tr>
<tr>
<td>46</td>
<td>an iPhone using Copy</td>
</tr>
<tr>
<td>47</td>
<td>an iPhone using Snapchat</td>
</tr>
<tr>
<td>48</td>
<td>an iPhone using WhatsApp</td>
</tr>
<tr>
<td>49</td>
<td>an iPhone using FBMessenger</td>
</tr>
<tr>
<td>50</td>
<td>an iPhone using Facebook</td>
</tr>
<tr>
<td>51</td>
<td>an iPhone using Gmail</td>
</tr>
<tr>
<td>52</td>
<td>an iPhone using Telegram</td>
</tr>
<tr>
<td>53</td>
<td>an iPhone using Line</td>
</tr>
<tr>
<td>54</td>
<td>an iPhone using Viber</td>
</tr>
<tr>
<td>55</td>
<td>an iPhone using Instagram</td>
</tr>
<tr>
<td>56</td>
<td>an iPhone using LinkedIn</td>
</tr>
<tr>
<td>57</td>
<td>an iPad using SMS</td>
</tr>
<tr>
<td>58</td>
<td>an iPad using Email</td>
</tr>
<tr>
<td>59</td>
<td>an iPad using Other</td>
</tr>
<tr>
<td>60</td>
<td>an iPad using Download</td>
</tr>
<tr>
<td>61</td>
<td>an iPad using Copy</td>
</tr>
<tr>
<td>62</td>
<td>an iPad using Snapchat</td>
</tr>
<tr>
<td>63</td>
<td>an iPad using WhatsApp</td>
</tr>
<tr>
<td>64</td>
<td>an iPad using FBMessenger</td>
</tr>
<tr>
<td>65</td>
<td>an iPad using Facebook</td>
</tr>
<tr>
<td>66</td>
<td>an iPad using Gmail</td>
</tr>
<tr>
<td>67</td>
<td>an iPad using Telegram</td>
</tr>
<tr>
<td>68</td>
<td>an iPad using Line</td>
</tr>
<tr>
<td>69</td>
<td>an iPad using Viber</td>
</tr>
<tr>
<td>70</td>
<td>an iPad using Instagram</td>
</tr>
<tr>
<td>71</td>
<td>an iPad using LinkedIn</td>
</tr>
</tbody>
</table>
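In practice, the table above is just a lookup. Here is a small illustrative Python sketch (the dictionary holds only a handful of the 71 codes, and the function name is my own, not Unfurl's):

```python
from urllib.parse import urlparse, parse_qs

# A few of the s-parameter mappings from the table above (71 total).
S_PARAM_SOURCES = {
    '19': 'an Android using Copy',
    '20': 'Twitter Web using Copy',
    '21': 'iOS using Copy',
    '71': 'an iPad using LinkedIn',
}

def describe_share_source(url: str) -> str:
    """Describe how a tweet link was shared, based on its s query parameter."""
    s_values = parse_qs(urlparse(url).query).get('s')
    if not s_values:
        return 'no s parameter present'
    return S_PARAM_SOURCES.get(s_values[0], 'unknown sharing code')

print(describe_share_source('https://twitter.com/_RyanBenson/status/1189581422685634560?s=20'))
# → Twitter Web using Copy
```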
<!--kg-card-end: markdown--><p>In addition to the <code>s</code> parameter, we&apos;ve seen <code>t</code> roll out gradually. I saw <code>t</code> on links shared from Android in late 2021 (<code>s=19</code>), then from Twitter Web (<code>s=20</code>) in early 2022, and finally from iOS (<code>s=21</code>) a bit later in 2022. I don&apos;t think anyone outside of Twitter knows exactly how the <code>t</code> parameter is constructed, but from my observations it appears consistent per device <em>for a time. </em>I shared tweets via numerous methods in August from my phone and the <code>t</code> was consistently the same. I did similar tests again in November, and the <code>t</code> value was again the same for different sharing methods, but it was different than from August. Maybe a software update or some other change on the device caused a change in the <code>t</code> &quot;fingerprint&quot;? With this in mind, I think seeing the same <code>t</code> values on multiple links suggests the same device was the sharing source. However, different <code>t</code> values could still be from the same device, just over a longer time period.</p><h2 id="mastodon">Mastodon</h2><p>This isn&apos;t actually a new parser (it&apos;s been in Unfurl for a few years), but I figured it would be worth mentioning with the increased interest in Mastodon. Mastodon is similar to Twitter in some respects; one of those is that the URLs of &quot;toots&quot; (Mastodon&apos;s version of tweets) contain an embedded timestamp. 
The long ID at the end of the URL is similar to a Twitter Snowflake:</p><p><strong>https://infosec.exchange/web/@RyanDFIR/109306117687853105</strong></p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://dfir.blog/content/images/2022/11/unfurl-mastodon-ryandfir.png" class="kg-image" alt="Unfurl v2022.11: Social Media Edition" loading="lazy" width="1424" height="982" srcset="https://dfir.blog/content/images/size/w600/2022/11/unfurl-mastodon-ryandfir.png 600w, https://dfir.blog/content/images/size/w1000/2022/11/unfurl-mastodon-ryandfir.png 1000w, https://dfir.blog/content/images/2022/11/unfurl-mastodon-ryandfir.png 1424w" sizes="(min-width: 1200px) 1200px"></figure><p>Due to the federated nature of Mastodon, it could be running on a domain that Unfurl doesn&apos;t know about. To avoid false positives, I only have a <a href="https://github.com/obsidianforensics/unfurl/blob/master/unfurl/parsers/parse_mastodon.py?ref=dfir.blog#L57">short allowlist</a> of domains to parse as Mastodon instances. If you know of any others that you&apos;d like to be parsed, <a href="https://infosec.exchange/@RyanDFIR?ref=dfir.blog">let me know</a>. </p><h2 id="linkedin">LinkedIn</h2><p>A while ago, I did some research and discovered how to <a href="https://dfir.blog/tinkering-with-tiktok-timestamps/">dissect a TikTok identifier and extract a timestamp</a>. <a href="https://twitter.com/ollie_boyd_/status/1465340486588276739?ref=dfir.blog">Ollie Boyd</a> figured out that IDs in LinkedIn post URLs had a similar makeup and <a href="https://github.com/Ollie-Boyd/Linkedin-post-timestamp-extractor?ref=dfir.blog">made a tool</a> to extract those timestamps. 
I&apos;ve added this ability to Unfurl:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2022/11/unfurl-linkedin-post.png" class="kg-image" alt="Unfurl v2022.11: Social Media Edition" loading="lazy" width="1227" height="922" srcset="https://dfir.blog/content/images/size/w600/2022/11/unfurl-linkedin-post.png 600w, https://dfir.blog/content/images/size/w1000/2022/11/unfurl-linkedin-post.png 1000w, https://dfir.blog/content/images/2022/11/unfurl-linkedin-post.png 1227w" sizes="(min-width: 1200px) 1200px"><figcaption>Unfurl extracting a timestamp from a LinkedIn Post ID</figcaption></figure><h3 id="linkedin-messaging-ids">LinkedIn Messaging IDs</h3><p>It turns out these LinkedIn IDs are used in more places than posts. One place they used to appear was in Messaging threads. When viewing messages on linkedin.com, the URL for each message thread (series of messages with a user) looked like <code>https://www.linkedin.com/messaging/thread/6685980502161199104/</code>. The ID at the end has an embedded timestamp that seemed to line up with when the first message in the thread was sent. </p><p>I&apos;ve been referencing this in past tense because this isn&apos;t the case anymore; message threads now have URLs that look like <code>https://www.linkedin.com/messaging/thread/2-ZTRkNzljZjgtOTRmNC00ZGJkLWJlYTktMDFjOWU4MTgxMjhjXzAxMA==/</code>. These new IDs (which I&apos;m calling &quot;v2&quot; from the <code>2-</code> at the beginning) are base64-encoded UUIDs with a few characters appended. The above &quot;v2&quot; ID decodes to <code>e4d79cf8-94f4-4dbd-bea9-01c9e818128c_010</code>. </p><p>For those familiar with UUIDs, you may spot that this looks like a <a href="https://www.rfc-editor.org/rfc/rfc4122.html?ref=dfir.blog#section-4.4">UUIDv4 </a>(randomly-generated). 
I went back through my LinkedIn message threads, all the way back to 2009 (wow, I&apos;ve been on there a long time), and found something interesting. The older message threads had UUIDs that fit the form of <a href="https://www.rfc-editor.org/rfc/rfc4122.html?ref=dfir.blog#section-4.3">UUIDv5 </a>(name-based), while the newer ones fit UUIDv4. From my messages, the switch from UUIDv5 to UUIDv4 happened in early May 2021 (I have a UUIDv5 message on 2021-04-26 and a UUIDv4 on 2021-05-14). </p><p>Why am I going on about this? Neither version 4 nor version 5 UUIDs contain any embedded timestamp information (unlike <a href="https://dfir.blog/unfurl/?url=a28cad70-0d73-11ea-aaef-0800200c9a66">version 1</a>). However, for this particular use case, we can now infer that a LinkedIn ID based on UUIDv5 corresponds to a message thread <em>older </em>than 2021-05, while one with a UUIDv4 was created after that. It&apos;s a small, rough bit of timing information, but that&apos;s what Unfurl is all about: trying to parse all those tiny pieces of knowledge, in the hope that when put together they might paint a clearer picture. </p><h3 id="linkedin-profile-ids">LinkedIn Profile IDs</h3><p>A few months ago, <a href="https://twitter.com/jackcr?ref=dfir.blog">Jack Crook</a> showed how to decode LinkedIn Profile IDs and use their sequential nature to estimate profile creation time:</p><figure class="kg-card kg-embed-card"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">All of the profiles listed in the article and this thread were created within days of each other.  <br>jennie-biller-9b631120a<br>victor-sites-40139b20a<br>charolette-pare-93b3a220a<br>vivian-christy-b1246320a<br>maryann-robles-2924b620a<br>1/4 <a href="https://t.co/N3Na6HAydN?ref=dfir.blog">https://t.co/N3Na6HAydN</a></p>&#x2014; Jack Crook (@jackcr) <a href="https://twitter.com/jackcr/status/1575915823075495936?ref_src=twsrc%5Etfw&amp;ref=dfir.blog">September 30, 2022</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
</figure><p>These &quot;profile IDs&quot; are different from the other IDs we discussed previously. I thought this technique was really interesting; I&apos;ve added parsing of the ID from base12 to Unfurl. I don&apos;t yet do anything with taking that number and estimating the creation time, but that sounds like a neat little project for when I find the time. </p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://dfir.blog/content/images/2022/11/unfurl-linkedin-profile-id.png" class="kg-image" alt="Unfurl v2022.11: Social Media Edition" loading="lazy" width="1289" height="895" srcset="https://dfir.blog/content/images/size/w600/2022/11/unfurl-linkedin-profile-id.png 600w, https://dfir.blog/content/images/size/w1000/2022/11/unfurl-linkedin-profile-id.png 1000w, https://dfir.blog/content/images/2022/11/unfurl-linkedin-profile-id.png 1289w" sizes="(min-width: 1200px) 1200px"></figure><h2 id="tracking-url-parameters">Tracking URL Parameters</h2><p>Many websites add URL parameters to links to help with user tracking and analytics. This is not a new practice; we&apos;ve all seen a bunch of parameters tacked onto the end of links. As investigators, we can sometimes use these parameters to infer more information: how a user clicked on a link, what site the link was on, or even when they clicked it.</p><p>These parameters are key/value pairs; for example, in <code>utm_source=newsletter</code>, the key is <code>utm_source</code> and the value is <code>newsletter</code>. The values often contain helpful clues (in the example, I&apos;d guess that the link was from an email newsletter). Even when the values are opaque, we can glean some information from the key. For example, with <code>fbclid=IwAR3Nuy7koMAB1KyVE1NqjcVGqAExIxVjQLSx-01U_e3LHKwSOzf2NsyP0UI</code>, I have no idea (yet!) how to parse anything out of the <code>IwAR3...</code> value, but from the key I can infer the link was from Facebook. 
</p><p>I&apos;ve added parsing of some of the most common of the tracking/analytics parameters to Unfurl. If you find one you&apos;d like added, <a href="https://github.com/obsidianforensics/unfurl/issues/new/?ref=dfir.blog">please let me know</a>. </p><h2 id="substack">Substack</h2><p>I&apos;ve seen Substack increase in popularity as well. I so far only subscribe to <a href="https://grugq.substack.com/?ref=dfir.blog">&quot;The Info Op&quot; by the grugq</a>, but there is a lot of other good content there too. I typically read it via email and noticed that all the links go through Substack redirects. I added expanding of Substack&apos;s redirect links to Unfurl; since many of the links are to Twitter/Mastodon and Substack adds <code>utm_*</code> tracking parameters, this enables those parsers to run as well, making some nice Unfurl graphs:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2022/11/unfurl-substack.png" class="kg-image" alt="Unfurl v2022.11: Social Media Edition" loading="lazy" width="1907" height="1098" srcset="https://dfir.blog/content/images/size/w600/2022/11/unfurl-substack.png 600w, https://dfir.blog/content/images/size/w1000/2022/11/unfurl-substack.png 1000w, https://dfir.blog/content/images/size/w1600/2022/11/unfurl-substack.png 1600w, https://dfir.blog/content/images/2022/11/unfurl-substack.png 1907w" sizes="(min-width: 1200px) 1200px"><figcaption>Unfurl parsing a Substack redirect link from an email</figcaption></figure><h2 id="get-it">Get it!</h2><p>Those are the major items in this Unfurl release. There are more changes that didn&apos;t make it into the blog post; check out the <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2022.11?ref=dfir.blog">release notes</a> for more. 
To get Unfurl with these latest updates, you can:</p><ul><li>use it online at <a href="https://dfir.blog/unfurl/">dfir.blog/unfurl</a> or <a href="https://unfurl.link/?ref=dfir.blog">unfurl.link</a></li><li>if using pip, <code>pip install dfir-unfurl -U</code> will upgrade your local Unfurl to the latest</li><li>View the release on <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2022.11?ref=dfir.blog">GitHub</a></li></ul><p>All features work in both the web UI and command line versions (<strong>unfurl_app.py</strong> &amp; <strong>unfurl_cli.py</strong>).</p>]]></content:encoded></item><item><title><![CDATA[More Search URL Parsing, MISP Lists, & More in Unfurl v2022.02]]></title><description><![CDATA[Unfurl v2022.02 adds parsing for Google Search's aqs parameter, integrates MISP "warninglists", adds 3x more shortlink expansions, and more! ]]></description><link>https://dfir.blog/search-parsing-and-misp-lists-in-unfurl/</link><guid isPermaLink="false">66579e6f04abfd293590d97d</guid><category><![CDATA[Unfurl]]></category><category><![CDATA[Open Source Tools]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Wed, 02 Mar 2022 14:41:01 GMT</pubDate><media:content url="https://dfir.blog/content/images/2022/03/unfurl-misp-domain-lists-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2022/03/unfurl-misp-domain-lists-1.png" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02"><p>A new Unfurl release is here! 
v2022.02 has been a long time coming and adds new features, including:</p><ul><li>Parsing for Google Search&apos;s <code>aqs</code> parameter</li><li>Integration of MISP&apos;s &quot;warning lists&quot; to enrich domain names</li><li>Shortlink expansion for 3x more domains</li><li>Extraction of encoded timestamps from Twitter image filenames</li><li>Parsing for Brave Search</li></ul><p><a href="#get-it">Get it now</a>, or read on for more details about the new features!</p><h2 id="google-searchs-aqs-parameter">Google Search&apos;s <code>aqs</code> Parameter</h2><p>Google Search&apos;s Assisted Query Stats (or <code>aqs</code>) parameter isn&apos;t new (it&apos;s been around <a href="https://bugs.chromium.org/p/chromium/issues/detail?id=132667&amp;ref=dfir.blog">since 2012</a> from what I can tell). Unlike many other Google Search URL parameters, it isn&apos;t a secret - it&apos;s (mostly) documented in the Chromium source. Per a <a href="https://source.chromium.org/chromium/chromium/src/+/main:components/search_engines/template_url.h;l=195?ref=dfir.blog">comment in the code</a>, AQS&apos; purpose is to log &quot;impressions of all autocomplete matches shown at the query submission time.&quot;</p><p>So what does that really mean? 
Consider the following screenshot:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2022/02/google_search_aqs_suggestions-.png" class="kg-image" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02" loading="lazy" width="868" height="250" srcset="https://dfir.blog/content/images/size/w600/2022/02/google_search_aqs_suggestions-.png 600w, https://dfir.blog/content/images/2022/02/google_search_aqs_suggestions-.png 868w"><figcaption><span style="white-space: pre-wrap;">Searching for &quot;unfurl url&quot; in Chrome&apos;s Omnibox</span></figcaption></figure><p>In the screenshot, I have typed &quot;unfurl url&quot; into Chrome&apos;s &quot;Omnibox&quot; (the address/search box). Chrome is showing me four suggestions relevant to what I have entered:</p><p><strong>Suggestion 1</strong>: Do a Google Search for the text I entered (&quot;unfurl url&quot;)<br><strong>Suggestions 2-4</strong>: Visit relevant pages from my local history - parts of the page title and URL that contain the words I entered are bolded in each suggestion</p><p>I ultimately selected the first suggestion and was sent to the Google Search Engine Results Page (SERP) for &quot;unfurl url&quot;. The URL had an <code>aqs</code> parameter: <code>aqs=chrome..69i57j69i60l3.7758j0j9</code>. 
Parsing <a href="https://dfir.blog/unfurl/?url=https://www.google.com/search?q=unfurl+url&amp;oq=unfurl+url&amp;aqs=chrome.0.69i59j69i60l3.19794j0j1&amp;sourceid=chrome&amp;ie=UTF-8">that URL with Unfurl</a> yields:</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2022/02/unfurl_google_search_aqs.png" class="kg-image" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02" loading="lazy" width="2000" height="687" srcset="https://dfir.blog/content/images/size/w600/2022/02/unfurl_google_search_aqs.png 600w, https://dfir.blog/content/images/size/w1000/2022/02/unfurl_google_search_aqs.png 1000w, https://dfir.blog/content/images/size/w1600/2022/02/unfurl_google_search_aqs.png 1600w, https://dfir.blog/content/images/size/w2400/2022/02/unfurl_google_search_aqs.png 2400w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Google SERP URL containing an </span><code spellcheck="false" style="white-space: pre-wrap;"><span>aqs</span></code><span style="white-space: pre-wrap;"> parameter, parsed with Unfurl</span></figcaption></figure><p>What Unfurl parses from the <code>aqs</code> parameter can give quite a bit of insight about what I did to get to that Google SERP: </p><ul><li>I started on the &quot;New Tab Page&quot; in Chrome</li><li>I was shown four suggestions (&quot;Autocomplete Matches&quot;)</li><li>The first (index 0) was a Google Search suggestion</li><li>The second, third, and fourth (indexes 1-3) were URLs from my local history that were related to the text I entered</li><li>I selected the first suggestion</li><li>It was 19.794 seconds from when I started typing to when I went to the SERP (this seems long; evidently taking a screenshot slowed me down)</li></ul><p>The <code>aqs</code> parameter doesn&apos;t capture the <em>content</em> of the suggestions offered to me, but I think you&apos;d agree that what it does log is pretty interesting. 
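</p><p>As one hedged example (the field layout is inferred from the <code>aqs</code> values shown above; this is not a full parser), the duration component can be pulled out like this:</p><pre><code class="language-python">def aqs_duration_seconds(aqs):
    # The last dot-separated field starts with the elapsed milliseconds
    # between the first keystroke and the navigation, e.g. "19794j0j1"
    timing_field = aqs.split(".")[-1]
    milliseconds = int(timing_field.split("j")[0])
    return milliseconds / 1000

print(aqs_duration_seconds("chrome.0.69i59j69i60l3.19794j0j1"))  # 19.794</code></pre><p>That single number is where the 19.794-second figure above comes from. 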
The mechanics of unpacking the <code>aqs</code> parameter would be too much for this post, but I may come back to it in a future post. You can also take a look through <a href="https://github.com/obsidianforensics/unfurl/blob/master/unfurl/parsers/parse_google.py?ref=dfir.blog">Unfurl&apos;s code for parsing it</a> if you&apos;re curious.</p><h2 id="enrich-domain-names-using-misp-lists">Enrich Domain Names using MISP Lists</h2><p>One requested feature was to have some sort of annotation for domain names showing how popular they are. The <a href="https://www.misp-project.org/?ref=dfir.blog">open source MISP project</a> has a curated set of lists of all sorts, including domain names:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/MISP/misp-warninglists?ref=dfir.blog"><div class="kg-bookmark-content"><div class="kg-bookmark-title">GitHub - MISP/misp-warninglists: Warning lists to inform users of MISP about potential false-positives or other information in indicators</div><div class="kg-bookmark-description">Warning lists to inform users of MISP about potential false-positives or other information in indicators - GitHub - MISP/misp-warninglists: Warning lists to inform users of MISP about potential fal...</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/favicons/favicon.svg" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">MISP</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/7b36abb30607d0c2a88abf6f90e6d993289d5671c53aab6cf81a4ae78c05238c/MISP/misp-warninglists" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02"></div></a></figure><p>The purpose of these lists is to add context (a domain is in the top 1K/5K/1M domains, an IP address belongs to GCP, a hash is of EICAR, etc) to help in 
deciding whether something is a false positive, not to label things as &quot;good&quot; or &quot;bad&quot;.</p><p>Unfurl uses the various domain lists to annotate a domain (see below). Check out the link above to <code>misp-warninglists</code> for the full list of their lists (there are a lot). </p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://dfir.blog/content/images/2022/03/unfurl-misp-domain-lists.png" class="kg-image" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02" loading="lazy" width="1580" height="781" srcset="https://dfir.blog/content/images/size/w600/2022/03/unfurl-misp-domain-lists.png 600w, https://dfir.blog/content/images/size/w1000/2022/03/unfurl-misp-domain-lists.png 1000w, https://dfir.blog/content/images/2022/03/unfurl-misp-domain-lists.png 1580w" sizes="(min-width: 1200px) 1200px"></figure><h2 id="more-shortlink-resolutions">More Shortlink Resolutions</h2><p>One of those MISP &quot;warninglists&quot; is of domains used for link shortening. Unfurl already supported resolving some shortlinks, but it was a list I had manually pulled together and tested. Adding MISP&apos;s list to my own triples the number of shortlink domains Unfurl supports (from 27 to 81). </p><p>One other shortlink-related improvement was parsing LinkedIn &quot;slinks&quot;, as Brian Krebs calls them:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://krebsonsecurity.com/2022/02/how-phishers-are-slinking-their-links-into-linkedin/?ref=dfir.blog"><div class="kg-bookmark-content"><div class="kg-bookmark-title">How Phishers Are Slinking Their Links Into LinkedIn</div><div class="kg-bookmark-description">If you received a link to LinkedIn.com via email, SMS or instant message, would you click it? 
Spammers, phishers and other ne&#x2019;er-do-wells are hoping you will, because they&#x2019;ve long taken advantage of a marketing feature on the business networking site&#x2026;</div><div class="kg-bookmark-metadata"><span class="kg-bookmark-author">Krebs on Security</span><span class="kg-bookmark-publisher">Skip to content</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://krebsonsecurity.com/wp-content/uploads/2022/02/redirect.png" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02"></div></a></figure><p>Unfurl already resolved LinkedIn shortlinks with the format <code>lnkd.in/xyz123</code>. This involves extracting the shortcode (<code>xyz123</code> in my fictitious example), creating the intermediary &quot;slink&quot; URL using that shortcode (<code>https://www.linkedin.com/slink?code=xyz123</code>), then finally determining the destination of that shortlink using the <code>Location</code> header. This Unfurl update adds the ability to expand &quot;slinks&quot; directly, in addition to the more typical <code>lnkd.in</code> shortlinks. 
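</p><p>Those steps can be sketched roughly as follows (a simplified illustration, not Unfurl&apos;s actual implementation):</p><pre><code class="language-python">def build_slink_url(shortcode):
    # Create the intermediary "slink" URL from the extracted shortcode
    return "https://www.linkedin.com/slink?code=" + shortcode

def expand_slink(shortcode):
    # Ask LinkedIn where the shortlink points, without following the
    # redirect to (or otherwise contacting) the destination itself
    import requests  # third-party HTTP client, assumed available
    response = requests.get(build_slink_url(shortcode), allow_redirects=False, timeout=10)
    return response.headers.get("Location")</code></pre><p>Contacting only the shortener, and never the destination, matches the external-lookup approach described in the note below.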
</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2022/03/unfurl-slink-krebs.png" class="kg-image" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02" loading="lazy" width="1312" height="1174" srcset="https://dfir.blog/content/images/size/w600/2022/03/unfurl-slink-krebs.png 600w, https://dfir.blog/content/images/size/w1000/2022/03/unfurl-slink-krebs.png 1000w, https://dfir.blog/content/images/2022/03/unfurl-slink-krebs.png 1312w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">A LinkedIn &quot;slink&quot; mentioned in Krebs&apos; article, parsed with Unfurl</span></figcaption></figure><blockquote><strong>A note on contacting external resources</strong>: For many different reasons, I wanted to ensure that Unfurl reached out to external domains as little as possible, but some external resources would be really useful in Unfurl (as in the case of expanding shortlinks). My &quot;middle ground&quot; was to allow Unfurl to contact an allowlist of link shortener services to get the &quot;expanded&quot; link, but <strong>not</strong> contact the destination. If this doesn&apos;t work for you and you&apos;d rather Unfurl not reach out to any external sites, there is a setting to disable all remote lookups. </blockquote><h2 id="recognize-and-parse-twitter-image-filenames">Recognize and Parse Twitter Image Filenames</h2><p>Unfurl has parsed the <a href="https://dfir.blog/unfurl/?url=https://twitter.com/_RyanBenson/status/1189581422685634560?s=20">Twitter Snowflakes in tweets</a> since its inception, but I only recently learned that the names Twitter gives to uploaded images also contain a Snowflake! It&apos;s mentioned by Dr. 
Neal Krawetz on his blog way back in 2014 (!):</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.hackerfactor.com/blog/index.php?%2Farchives%2F634-Name-Dropping.html=&amp;ref=dfir.blog"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Name Dropping - The Hacker Factor Blog</div><div class="kg-bookmark-description"></div><div class="kg-bookmark-metadata"><span class="kg-bookmark-author">The Hacker Factor Blog</span><span class="kg-bookmark-publisher">Filename Ballistics</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.hackerfactor.com/blog/templates/default/img/emoticons/smile.png" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02"></div></a></figure><p>It appears different from the Snowflakes used in tweets - it&apos;s base64-encoded rather than shown as a decimal (<code>EqmR8DPVEAAd5mv</code> vs <code>1344769819887865856</code>) and has three extra bytes at the end (I haven&apos;t been able to determine their purpose yet). But like tweets, the timestamp embedded in the Snowflake is consistent with when the object (tweet or image) was created - which in the case of images means the time it was uploaded to Twitter.</p><p>If we encounter one of these images elsewhere still with the name Twitter gave it, we have some hints about it: that it came from Twitter and when it was uploaded. The odds of an image having a name that can be properly decoded as a Twitter Snowflake, with a reasonable embedded timestamp, and <em>not </em>being from Twitter are vanishingly small (unless it was deliberately renamed by someone). </p><p>In this <a href="https://dfir.blog/unfurl/?url=https://dfir.blog/content/images/size/w1000/2022/03/EqmR8DPVEAAd5mv.jpeg">example below</a>, I saved an image from a tweet, then uploaded it to my site (without renaming it). 
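</p><p>The decoding described above can be sketched as follows (URL-safe base64 and big-endian byte order are my assumptions here, based on testing against known examples):</p><pre><code class="language-python">import base64
from datetime import datetime, timezone

TWITTER_EPOCH_MS = 1288834974657  # millisecond offset used by Twitter Snowflakes

def twitter_image_timestamp(name):
    # Re-pad the filename to a multiple of 4 and base64-decode it; the
    # first 8 bytes are the Snowflake, the trailing 3 bytes are the
    # extra data of (so far) unknown purpose
    raw = base64.urlsafe_b64decode(name + "=" * (-len(name) % 4))
    snowflake = int.from_bytes(raw[:8], "big")
    milliseconds = (snowflake >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)

print(twitter_image_timestamp("EqmR8DPVEAAd5mv"))</code></pre><p>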
Unfurl indicates that the image might have originally come from Twitter and shows the upload timestamp from the Snowflake. </p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://dfir.blog/content/images/2022/03/unfurl-twitter-image-outside-twitter-1.png" class="kg-image" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02" loading="lazy" width="1695" height="1151" srcset="https://dfir.blog/content/images/size/w600/2022/03/unfurl-twitter-image-outside-twitter-1.png 600w, https://dfir.blog/content/images/size/w1000/2022/03/unfurl-twitter-image-outside-twitter-1.png 1000w, https://dfir.blog/content/images/size/w1600/2022/03/unfurl-twitter-image-outside-twitter-1.png 1600w, https://dfir.blog/content/images/2022/03/unfurl-twitter-image-outside-twitter-1.png 1695w" sizes="(min-width: 1200px) 1200px"></figure><h2 id="brave-search">Brave Search</h2><p>Lastly, this update adds the ability for Unfurl to <a href="https://dfir.blog/unfurl/?url=https://search.brave.com/search?q=unfurl&amp;source=web&amp;tf=pm">parse a Brave Search URL</a>. 
It&apos;s relatively basic, at least compared to the Google Search parser (which is massive), but I think it&apos;s a good start.</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2022/02/unfurl-brave-search.png" class="kg-image" alt="More Search URL Parsing, MISP Lists, &amp; More in Unfurl v2022.02" loading="lazy" width="1703" height="808" srcset="https://dfir.blog/content/images/size/w600/2022/02/unfurl-brave-search.png 600w, https://dfir.blog/content/images/size/w1000/2022/02/unfurl-brave-search.png 1000w, https://dfir.blog/content/images/size/w1600/2022/02/unfurl-brave-search.png 1600w, https://dfir.blog/content/images/2022/02/unfurl-brave-search.png 1703w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Brave Search URL parsed with Unfurl</span></figcaption></figure><h2 id="get-it">Get it!</h2><p>Those are the major items in this Unfurl release. There are more changes that didn&apos;t make it into the blog post; check out the <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2022.02?ref=dfir.blog">release notes</a> for more. 
To get Unfurl with these latest updates, you can:</p><ul><li>use it online at <a href="https://dfir.blog/unfurl/">dfir.blog/unfurl</a>  or <a href="https://unfurl.link/?ref=dfir.blog">unfurl.link</a></li><li>if using pip, <code>pip install dfir-unfurl -U</code> will upgrade your local Unfurl to the latest</li><li>View the release on <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2022.02?ref=dfir.blog">GitHub</a></li></ul><p>All features work in both the web UI and command line versions (<strong>unfurl_app.py</strong> &amp; <strong>unfurl_cli.py</strong>).</p>]]></content:encoded></item><item><title><![CDATA[Hindsight v2021.12]]></title><description><![CDATA[Hindsight v2021.12 adds parsing of more preference items, site settings (including HSTS records), Session Storage, and more!]]></description><link>https://dfir.blog/hindsight-v2021-12/</link><guid isPermaLink="false">66579e6f04abfd293590d97b</guid><category><![CDATA[Hindsight]]></category><category><![CDATA[Open Source Tools]]></category><category><![CDATA[Chrome]]></category><category><![CDATA[Tools]]></category><category><![CDATA[Web Browsers]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Tue, 21 Dec 2021 14:14:00 GMT</pubDate><media:content url="https://dfir.blog/content/images/2021/12/hindsight-v2021.12.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2021/12/hindsight-v2021.12.png" alt="Hindsight v2021.12"><p>This latest version of Hindsight adds parsing of more preference items, site settings (including HSTS records), Session Storage, and more! 
It also includes other small enhancements, bug fixes, and minor changes to support Chrome up to version 96.</p><h3 id="new-site-setting-record-type">New &quot;Site Setting&quot; Record Type</h3><p>Over time, Hindsight has gained the ability to parse more and more artifacts from Chrome, many of which are a bit different from &quot;traditional&quot; browser history items like URL visits, cookies, or cached items. Hindsight parses things like whether a site was muted, whether the user zoomed in, whether a site used HSTS, or even whether the page title changed in the background. </p><p>I had been adding these to Hindsight&apos;s timeline as &quot;Preference&quot; items (as the initial ones came from the <code>Preferences</code> file), but over time that label seemed less and less apt. I decided to add a new &quot;Site Setting&quot; record type, as most of these records pertain to a setting for the visited site. Like other record types, it can have variations (<code>zoom level</code>, <code>hsts</code>, <code>engagement</code>, &amp; more).</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/12/hindsight-site-setting-records.png" class="kg-image" alt="Hindsight v2021.12" loading="lazy" width="1580" height="511" srcset="https://dfir.blog/content/images/size/w600/2021/12/hindsight-site-setting-records.png 600w, https://dfir.blog/content/images/size/w1000/2021/12/hindsight-site-setting-records.png 1000w, https://dfir.blog/content/images/2021/12/hindsight-site-setting-records.png 1580w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">Examples of the new &quot;Site Setting&quot; records</span></figcaption></figure><p>I plan on adding more &quot;Site Setting&quot; records in the future - these might not be critical to every investigation, but I really like the level of detail they provide and you never know when they might come in handy. 
</p><h3 id="parsing-of-hsts-records">Parsing of HSTS Records</h3><p>HSTS is one of the new &quot;Site Setting&quot; records. We can use <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Strict-Transport-Security?ref=dfir.blog">HTTP Strict-Transport-Security (HSTS)</a> settings to tell whether a browser has visited a particular site before, and to learn a little about the timing of that visit.</p><p>The <code>TransportSecurity</code> file holds HSTS settings, most of which look like this:</p><pre><code class="language-JSON">{
	&quot;expiry&quot;: 1671127807.687742,
	&quot;host&quot;: &quot;df0sSkr4gOg4VK8d/NNTAWFtAN/MjCgPCJ5ml+ucdZE=&quot;,
	&quot;mode&quot;: &quot;force-https&quot;,
	&quot;sts_include_subdomains&quot;: false,
	&quot;sts_observed&quot;: 1639591807.687746
}</code></pre><p>The <code>host</code> is a hashed value (according to <a href="https://source.chromium.org/chromium/chromium/src/+/main:net/http/transport_security_persister.h;l=110?ref=dfir.blog">Chromium source code</a>) &quot;so that the stored state does not trivially reveal a user&apos;s browsing history to an attacker reading the serialized state on disk.&quot; The code also shows how this hashed value is constructed. This doesn&apos;t let us reverse the hash (since that&apos;s not how hashes work), but it does let us generate hashes from known inputs and compare. Hindsight does just that, computing the hashed <code>host</code> value for every domain and subdomain seen in other browser artifacts, and comparing to <code>host</code> values in the <code>TransportSecurity</code> file. If it finds a match, Hindsight will show the domain; if not it will show the hashed version: </p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/12/hindsight-hsts-records.png" class="kg-image" alt="Hindsight v2021.12" loading="lazy" width="1589" height="453" srcset="https://dfir.blog/content/images/size/w600/2021/12/hindsight-hsts-records.png 600w, https://dfir.blog/content/images/size/w1000/2021/12/hindsight-hsts-records.png 1000w, https://dfir.blog/content/images/2021/12/hindsight-hsts-records.png 1589w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">HSTS records in Hindsight XLSX Report</span></figcaption></figure><h3 id="parsing-additional-preference-items">Parsing Additional Preference Items</h3><p>Hindsight can also parse more from Chrome&apos;s Preferences file, including whether network prefetching is enabled, sync settings, zoom percentages (instead of raw levels), password manager usage, and the session event log. 
These all are interesting, but I especially like the session event log records: </p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/12/hindsight-session-event-log.png" class="kg-image" alt="Hindsight v2021.12" loading="lazy" width="1113" height="430" srcset="https://dfir.blog/content/images/size/w600/2021/12/hindsight-session-event-log.png 600w, https://dfir.blog/content/images/size/w1000/2021/12/hindsight-session-event-log.png 1000w, https://dfir.blog/content/images/2021/12/hindsight-session-event-log.png 1113w"><figcaption><span style="white-space: pre-wrap;">Session Event Log records in Hindsight XLSX Report</span></figcaption></figure><p>They give some high-level insights about usage; for example, from the above screenshot you can infer that:</p><ul><li>I have Chrome set to &quot;Continue where you left off&quot;, as seconds after each session start, a restore happens</li><li>None of these sessions ended in a crash (can sometimes happen if an exploit was attempted against the browser) - useful knowledge in some particular investigations</li><li>I tend to leave Chrome running quasi-permanently, not opening/closing a lot</li><li>I have a tab hoarding problem</li></ul><h2 id="get-hindsight">Get Hindsight</h2><p>You can get Hindsight, view the code, and see the full change log on <a href="https://github.com/obsidianforensics/hindsight?ref=dfir.blog" rel="noopener">GitHub</a>. 
Both the command line and web UI versions of this release are available as:</p><ul><li>compiled exes attached to the <a href="https://hindsig.ht/release?ref=dfir.blog">GitHub release</a> or in the dist/ folder</li><li>.py versions are available by <code>pip install pyhindsight</code> or downloading/cloning the <a href="https://hindsig.ht/github?ref=dfir.blog">GitHub repo</a>.</li></ul>]]></content:encoded></item><item><title><![CDATA[Cookies Database Moving in Chrome 96]]></title><description><![CDATA[To support stronger security for Chrome, some network-related files - including the Cookies database - are moving locations on disk. ]]></description><link>https://dfir.blog/cookies-database-moving-in-chrome-96/</link><guid isPermaLink="false">66579e6f04abfd293590d97a</guid><category><![CDATA[Chrome]]></category><category><![CDATA[Web Browsers]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Thu, 16 Dec 2021 15:28:34 GMT</pubDate><media:content url="https://dfir.blog/content/images/2021/12/chrome-network-folder.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2021/12/chrome-network-folder.png" alt="Cookies Database Moving in Chrome 96"><p>The reason for this change is to enable sandboxing of Chrome&apos;s network service, so it can only access files on the file system that it needs. This would make it so any compromised network service can&apos;t access other files in the user&apos;s profile directory. 
Because of how ACLs work on Windows, accomplishing this meant moving the files needed by the network service from the user&apos;s profile directory into a <code>Network</code> subdirectory.</p><p>The network-related files that have been (or will be) moved are:</p><ul><li>Cookies (SQLite)</li><li>Network Persistent State (JSON) </li><li>Reporting and NEL (SQLite)</li><li>TransportSecurity (JSON)</li><li>Trust Tokens (SQLite)</li></ul><p>The &quot;Cache&quot; directory (HTTP cache) is also included in the sandbox, but it was already in its own directory so it didn&apos;t need to move. </p><p>You can use my Chrome Evolution visualization to compare files in Chrome <a href="https://dfir.blog/chrome-evolution/?ver=95">95</a> vs <a href="https://dfir.blog/chrome-evolution/?ver=96">96</a>.</p><p>This migration is starting with Windows, and is eventually planned to happen on macOS, Linux, Android, and ChromeOS. Other operating systems might be included later (but <em>not </em>iOS). </p><p>For more details on how the data is moving and why, please see <a href="https://docs.google.com/document/d/1Q7VwAsrWU45eC3Sl4bj9rj10H0pWdwBwwbKziomDCUc/edit?ref=dfir.blog#heading=h.7nki9mck5t64">Migration of Network Data</a> by Will Harris (<a href="https://twitter.com/parityzero?ref=dfir.blog">@parityzero</a>) - and thanks to Will for <a href="https://twitter.com/parityzero/status/1449033853457207302?ref=dfir.blog">pointing out this change</a>.</p><h3 id="forensic-tools-impact">Forensic Tools Impact</h3><p><strong>Plaso &amp; log2timeline - no impact.</strong> log2timeline parses every file independent of its path, so this change to Chrome has no impact.</p><p><strong>Hindsight - impacted.</strong> Hindsight currently uses file paths to find files to parse, so this change to Chrome caused problems (the Cookies database and TransportSecurity file would not be parsed). 
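</p><p>One simple way for a path-based tool to cope with the move is to check both locations (a hypothetical helper, not Hindsight&apos;s actual code):</p><pre><code class="language-python">from pathlib import Path

def find_cookies_db(profile_dir):
    # Check the Chrome 96+ location first, then fall back to the
    # pre-96 location in the profile root
    profile = Path(profile_dir)
    for candidate in (profile / "Network" / "Cookies", profile / "Cookies"):
        if candidate.is_file():
            return candidate
    return None</code></pre><p>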
<strong>A new Hindsight release (2021.12) is available now that fixes this.</strong></p><h3 id="references">References</h3><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://docs.google.com/document/d/1Q7VwAsrWU45eC3Sl4bj9rj10H0pWdwBwwbKziomDCUc/edit?ref=dfir.blog#"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Migration of Network Data</div><div class="kg-bookmark-description">Migration of Network Data This Document is Public Authors: wfh@chromium.orgSep 2021 One-page overview As part of the larger Network Sandbox work, the files that the network service needs to access will be moved into a folder that the sandbox can be granted access to. This migration does not a...</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://ssl.gstatic.com/docs/documents/images/kix-favicon7.ico" alt="Cookies Database Moving in Chrome 96"><span class="kg-bookmark-author">Google Docs</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://lh6.googleusercontent.com/FDO6vmlxXF412kTdtYof79pabhnCbS1KLZKDsB6ue66IDR0Evn1oghzGABME9wjkxRusZZovHBeDgQ=w1200-h630-p" alt="Cookies Database Moving in Chrome 96"></div></a></figure><figure class="kg-card kg-bookmark-card kg-card-hascaption"><a class="kg-bookmark-container" href="https://bugs.chromium.org/p/chromium/issues/detail?id=1173622&amp;ref=dfir.blog"><div class="kg-bookmark-content"><div class="kg-bookmark-title">1173622 - chromium - An open-source project to help move the web forward. 
- Monorail</div><div class="kg-bookmark-description"></div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://bugs.chromium.org/static/images/monorail.ico" alt="Cookies Database Moving in Chrome 96"></div></div></a><figcaption>Issue 1173622: store files needed by network service in separate directory</figcaption></figure>]]></content:encoded></item><item><title><![CDATA[Metasploit URLs, Hash Lookups, & More in Unfurl v2021.06.15]]></title><description><![CDATA[A new Unfurl release is here! v2021.06.15 adds decoding of some Metasploit URLs, hash identification and API lookups, & more!]]></description><link>https://dfir.blog/metasploit-urls-and-hash-lookups-in-unfurl/</link><guid isPermaLink="false">66579e6f04abfd293590d979</guid><category><![CDATA[Unfurl]]></category><category><![CDATA[Open Source Tools]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Tue, 15 Jun 2021 13:19:00 GMT</pubDate><media:content url="https://dfir.blog/content/images/2021/06/unfurl_metasploit-payload-uuid-url-1.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2021/06/unfurl_metasploit-payload-uuid-url-1.png" alt="Metasploit URLs, Hash Lookups, &amp; More in Unfurl v2021.06.15"><p>A new Unfurl release is here! v2021.06.15 adds decoding of some Metasploit URLs, hash identification and API lookups, more control over remote lookups, better UUID parsing, and a few more shortlink expansions. It also has a number of smaller fixes, code cleanups, and tests. 
</p><p><a href="#get-it">Get it now</a>, or read on for more details about the new features!</p><h2 id="metasploit-urls">Metasploit URLs</h2><p><a href="https://twitter.com/DidierStevens?ref=dfir.blog">Didier Stevens</a> has written about (<a href="https://github.com/DidierStevens/Beta/blob/master/metatool.py?ref=dfir.blog">and made a tool for!</a>) decoding different Metasploit artifacts: <a href="https://isc.sans.edu/forums/diary/Metasploits+Payload+UUID/23555/?ref=dfir.blog">payload UUIDs</a> and <a href="https://isc.sans.edu/forums/diary/Finding+Metasploit+Cobalt+Strike+URLs/27204/?ref=dfir.blog">shellcode URLs</a>. Thanks to his excellent work (which he published as open source), I was able to see how those Metasploit artifacts are constructed and build decoders into Unfurl:</p><figure class="kg-card kg-gallery-card kg-width-wide kg-card-hascaption"><div class="kg-gallery-container"><div class="kg-gallery-row"><div class="kg-gallery-image"><img src="https://dfir.blog/content/images/2021/06/unfurl_metasploit-checksum-url.png" width="1229" height="850" loading="lazy" alt="Metasploit URLs, Hash Lookups, &amp; More in Unfurl v2021.06.15" srcset="https://dfir.blog/content/images/size/w600/2021/06/unfurl_metasploit-checksum-url.png 600w, https://dfir.blog/content/images/size/w1000/2021/06/unfurl_metasploit-checksum-url.png 1000w, https://dfir.blog/content/images/2021/06/unfurl_metasploit-checksum-url.png 1229w" sizes="(min-width: 720px) 720px"></div><div class="kg-gallery-image"><img src="https://dfir.blog/content/images/2021/06/unfurl_metasploit-payload-uuid-url.png" width="1696" height="843" loading="lazy" alt="Metasploit URLs, Hash Lookups, &amp; More in Unfurl v2021.06.15" srcset="https://dfir.blog/content/images/size/w600/2021/06/unfurl_metasploit-payload-uuid-url.png 600w, https://dfir.blog/content/images/size/w1000/2021/06/unfurl_metasploit-payload-uuid-url.png 1000w, 
https://dfir.blog/content/images/size/w1600/2021/06/unfurl_metasploit-payload-uuid-url.png 1600w, https://dfir.blog/content/images/2021/06/unfurl_metasploit-payload-uuid-url.png 1696w" sizes="(min-width: 720px) 720px"></div></div></div><figcaption>Unfurl decoding two different types of URLs generated by Metasploit</figcaption></figure><p>You can read his blog posts for the details on the artifacts (<a href="https://isc.sans.edu/forums/diary/Metasploits+Payload+UUID/23555/?ref=dfir.blog">payload UUIDs</a> and <a href="https://isc.sans.edu/forums/diary/Finding+Metasploit+Cobalt+Strike+URLs/27204/?ref=dfir.blog">shellcode URLs</a>), but the super abbreviated version is that we can often extract at least the platform that was targeted (Windows in both examples above) - and sometimes more! It&apos;s another great example of extracting useful information from the way identifiers are generated.</p><p>Live Unfurl Examples:</p><ul><li><a href="https://dfir.blog/unfurl/?url=https://example.com/4PGoVGYmx8l6F3sVI4Rc8g1wms758YNVXPczHlPobpJENARSuSHb57lFKNndzVSpivRDSi5VH2U-w-pEq_CroLcB--cNbYRroyFuaAgCyMCJDpWbws/">Metasploit payload UUID URL</a></li><li><a href="https://dfir.blog/unfurl/?url=https://example.com/WsJH">Metasploit shellcode URL</a></li></ul><h2 id="hash-identification-and-remote-lookup">Hash Identification and Remote Lookup</h2><p>This release also adds the ability to identify potential hashes: MD5, SHA-1, SHA-256, &amp; SHA-512. The detection is based on characters and length, so it&apos;s not high fidelity (for example, MD5 hashes are the same length as UUIDs, so some nodes will be identified as potentially both). </p><p>To aid with determining what&apos;s an actual hash and what&apos;s not, Unfurl can query remote services to see if they&apos;ve seen that value before. 
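The charset-and-length detection described above can be sketched in a few lines of Python (an illustrative sketch, not Unfurl's actual code; the function and constant names here are made up for the example):

```python
import re

# Hex-digest lengths for the hash types Unfurl flags as candidates
HASH_LENGTHS = {32: 'MD5', 40: 'SHA-1', 64: 'SHA-256', 128: 'SHA-512'}

def possible_hash_types(value: str) -> list:
    """Return the hash types a string could be, judged only by charset and length."""
    if not re.fullmatch(r'[0-9a-fA-F]+', value or ''):
        return []
    label = HASH_LENGTHS.get(len(value))
    return [label] if label else []

print(possible_hash_types('5f4dcc3b5aa765d61d8327deb882cf99'))  # ['MD5']
```

As noted above, this is low fidelity: a 32-character hex string could just as easily be a UUID with the dashes stripped, which is why such nodes get marked as potentially both.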
At present, two services are supported: <a href="https://www.virustotal.com/?ref=dfir.blog">VirusTotal </a>and <a href="https://www.nitrxgen.net/md5db?ref=dfir.blog">Nitrxgen&apos;s MD5 lookup database</a>. </p><p>The VirusTotal integration is fairly basic; if a (free) VirusTotal API key is set in the Unfurl config file, Unfurl will query the VirusTotal API with potential file hash values and add a child node with file type &amp; name (if found). </p><p>Nitrxgen&apos;s MD5 lookup database is a bit different; it&apos;s a dataset of plaintext &#x2192; MD5 hashes with over a trillion values. Unfurl can query it with potential MD5 values to see if it corresponds with a known plaintext string. This is different than the VirusTotal lookup (which queries hashes of file content), as the Nitrxgen lookup is for hashed text strings. However, sometimes both can be true, as in the image below:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/06/unfurl_md5-hash-with-lookups-3.png" class="kg-image" alt="Metasploit URLs, Hash Lookups, &amp; More in Unfurl v2021.06.15" loading="lazy" width="1259" height="1042" srcset="https://dfir.blog/content/images/size/w600/2021/06/unfurl_md5-hash-with-lookups-3.png 600w, https://dfir.blog/content/images/size/w1000/2021/06/unfurl_md5-hash-with-lookups-3.png 1000w, https://dfir.blog/content/images/2021/06/unfurl_md5-hash-with-lookups-3.png 1259w" sizes="(min-width: 720px) 720px"><figcaption>Unfurl identifying an MD5 hash value and looking it up on VirusTotal and Nitrxgen</figcaption></figure><p>These remote lookups can add value to Unfurl, but they also come with risk (as Unfurl is sending out potentially-sensitive hashes to 3rd parties). To give the user control over this, Unfurl has a new <code>remote_lookups</code> setting. Users can change it (from the default, <code>false</code>) in the <code>unfurl.ini</code> file. 
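Reading such a toggle can be sketched with Python's stdlib <code>configparser</code> (the <code>[unfurl]</code> section name below is an assumption for illustration; check your own <code>unfurl.ini</code> for the real layout):

```python
import configparser

def remote_lookups_enabled(ini_text: str) -> bool:
    """Read a remote_lookups-style boolean, defaulting to off when unset."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    # 'unfurl' as the section name is a guess for this example
    return cfg.getboolean('unfurl', 'remote_lookups', fallback=False)

print(remote_lookups_enabled('[unfurl]\nremote_lookups = true'))  # True
```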
There&apos;s also a command line option to allow lookups (<code>-l</code> or <code>--lookups</code>) from <code>unfurl_cli.py</code>. The CLI tool will fall back to the value specified in <code>unfurl.ini</code> if no command line option is set. Users <strong>need to set this option to enable any remote lookups</strong> (it&apos;s disabled by default). Shortlink resolution and MAC address vendor lookups are now also controlled by this option, as they are remote lookups as well. </p><p>Live Unfurl Examples:</p><ul><li><a href="https://dfir.blog/unfurl/?url=https://dfir.blog/?test=5f4dcc3b5aa765d61d8327deb882cf99">MD5 hash detection and lookup in both VirusTotal and Nitrxgen</a></li><li><a href="https://dfir.blog/unfurl/?url=b69049b7576687c0efed9b3cb9fa8f3beb218e31c30d200c1a67ad46bd06fcf0">SHA256 lookup on VirusTotal</a></li></ul><h2 id="uuidv1-random-node-id-detection">UUIDv1 Random Node ID Detection</h2><p>Unfurl has been able to detect and expand UUIDs since its beginning. Version 1 UUIDs have been particularly interesting, with their embedded timestamp and MAC address. This release adds the ability to determine if the Node ID contained in the UUIDv1 is an actual MAC address or a random number. 
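The check rests on a detail of RFC 4122: a UUIDv1 generator that uses a random Node ID instead of a real MAC must set the multicast bit (the least-significant bit of the node's first octet), a bit that is never set in a real unicast MAC address. A minimal sketch with Python's stdlib:

```python
import uuid

def uuidv1_node_is_random(u: uuid.UUID) -> bool:
    """RFC 4122 section 4.5: a random Node ID has the multicast bit set."""
    assert u.version == 1
    first_octet = u.node >> 40          # node is 48 bits; take the first octet
    return bool(first_octet & 0x01)     # multicast/broadcast bit

# The live example from this post: its node ff:60:1e:c8:55:b4 starts with 0xff
u = uuid.UUID('94c73940-6bd1-11e6-899a-ff601ec855b4')
print(uuidv1_node_is_random(u))  # True
```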
</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/06/unfurl_uuid-v1-with-random-node-id.png" class="kg-image" alt="Metasploit URLs, Hash Lookups, &amp; More in Unfurl v2021.06.15" loading="lazy" width="1147" height="955" srcset="https://dfir.blog/content/images/size/w600/2021/06/unfurl_uuid-v1-with-random-node-id.png 600w, https://dfir.blog/content/images/size/w1000/2021/06/unfurl_uuid-v1-with-random-node-id.png 1000w, https://dfir.blog/content/images/2021/06/unfurl_uuid-v1-with-random-node-id.png 1147w" sizes="(min-width: 720px) 720px"><figcaption>Unfurl parsing a UUIDv1 with a random Node ID</figcaption></figure><p>Live Unfurl Example:</p><ul><li><a href="https://dfir.blog/unfurl/?url=94c73940-6bd1-11e6-899a-ff601ec855b4">UUIDv1 with random Node ID</a></li></ul><h2 id="get-it">Get it!</h2><p>To get Unfurl with these latest updates, you can:</p><ul><li>use <a href="https://dfir.blog/unfurl/">dfir.blog/unfurl</a> online</li><li>if using pip, <code>pip install dfir-unfurl -U</code> will upgrade your local Unfurl to the latest</li><li>View the release on <a href="https://github.com/obsidianforensics/unfurl/releases/tag/v2021.06.15?ref=dfir.blog">GitHub</a></li></ul><p>All features work in both the web UI and command line versions (<strong>unfurl_app.py</strong> &amp; <strong>unfurl_cli.py</strong>). </p><p><a href="https://twitter.com/_RyanBenson?ref=dfir.blog">Let me know</a> what you think! </p>]]></content:encoded></item><item><title><![CDATA[Unfurl Plugin and "Site Characteristics" Artifact Added in Hindsight]]></title><description><![CDATA[<p>I&apos;m happy to announce there is a new Hindsight release available! 
<strong>2021.04.26 </strong>has many small improvements and fixes, including adding support for Chrome 88 - 90, but the main new features are an <strong>Unfurl plugin</strong> and parsing of the <strong>Site Characteristics Database</strong>!</p><h2 id="unfurl-plugin">Unfurl Plugin</h2><p>I&apos;m</p>]]></description><link>https://dfir.blog/unfurl-plugin-and-site-characteristics-database-added-to-hindsight/</link><guid isPermaLink="false">66579e6f04abfd293590d978</guid><category><![CDATA[Hindsight]]></category><category><![CDATA[Digital Forensics]]></category><category><![CDATA[Chrome]]></category><category><![CDATA[Unfurl]]></category><category><![CDATA[Python]]></category><category><![CDATA[Open Source Tools]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Wed, 28 Apr 2021 17:34:59 GMT</pubDate><content:encoded><![CDATA[<p>I&apos;m happy to announce there is a new Hindsight release available! <strong>2021.04.26 </strong>has many small improvements and fixes, including adding support for Chrome 88 - 90, but the main new features are an <strong>Unfurl plugin</strong> and parsing of the <strong>Site Characteristics Database</strong>!</p><h2 id="unfurl-plugin">Unfurl Plugin</h2><p>I&apos;m excited that this new Hindsight version has an integration with Unfurl! <a href="https://dfir.blog/unfurl">Unfurl </a>takes a URL and expands (&quot;unfurls&quot;) it into a directed graph, and is useful for exploring data encoded in URLs or other text values. Unfurl typically displays all this in an interactive graph visualization, but that doesn&apos;t fit well into Hindsight&apos;s output. Instead, this new Unfurl plugin stores the &quot;text tree&quot; version of the output (as seen in the Unfurl CLI tool). At this time, the only thing the Unfurl plugin runs on is Local Storage records. 
I chose these for a few reasons:</p><h3 id="timestamp-detection-parsing">Timestamp Detection + Parsing</h3><p>Local Storage records lack explicit timestamps (they&apos;re just a collection of key/value pairs associated with an origin). Unfurl can often translate a value into a human-readable timestamp, potentially adding some hints as to timing on these records. Hindsight had a &quot;Generic Timestamp Converter&quot; plugin that did this previously, but it was rather limited; Unfurl does a much better job and covers a wider variety of timestamps. Example:</p><figure class="kg-card kg-code-card"><pre><code>origin: https://www.reddit.com
key: push-token-last-refresh-ms
value: 1615493428164</code></pre><figcaption>Local Storage key/value pair for reddit.com</figcaption></figure><figure class="kg-card kg-code-card"><pre><code>2021-03-11 20:10:28.164 (Converted as Epoch milliseconds) [Unfurl]</code></pre><figcaption>Unfurl parsing a timestamp from a value in Local Storage</figcaption></figure><p>When Unfurl&apos;s output is rather simple (like just a timestamp conversion), the plugin reformats the &quot;tree&quot; into a single line summary that works better in Hindsight.</p><h3 id="decoding-values">Decoding Values</h3><p>Another reason is that Local Storage values are often encoded. Unfurl&apos;s chaining of multiple simple transforms can sometimes bring clarity to an obscured value. For example:</p><figure class="kg-card kg-code-card"><pre><code>origin: http://www.metacritic.com
key: __ansync3rdp_criteo
value: eyJiSWQiOiJjcml0ZW8iLCJ1Q29kZSI6bnVsbCwidHMiOjE1MzExODAxNDYwOTh9</code></pre><figcaption>Local Storage key/value pair for metacritic.com</figcaption></figure><p>The <code>value</code> from above is parsed by Unfurl (using base64, JSON, and timestamp conversions), and the &quot;text tree&quot; output is saved in the &quot;Interpretation&quot; column (in the same way other Hindsight plugins save their results):</p><figure class="kg-card kg-code-card"><pre><code>[1] eyJiSWQiOiJjcml0ZW8iLCJ1Q29kZSI6bnVsbCwidHMiOjE1MzExODAxNDYwOTh9
 &#x2514;&#x2500;(b64)&#x2500;[2] {&quot;bId&quot;:&quot;criteo&quot;,&quot;uCode&quot;:null,&quot;ts&quot;:1531180146098}
    &#x251C;&#x2500;(JSON)&#x2500;[3] bId: criteo
    &#x251C;&#x2500;(JSON)&#x2500;[4] uCode: None
    &#x2514;&#x2500;(JSON)&#x2500;[5] ts: 1531180146098
       &#x2514;&#x2500;(&#x1F553;)&#x2500;[6] 2018-07-09 23:49:06.098 
</code></pre><figcaption>Unfurl parsing an encoded Local Storage value</figcaption></figure><p>These are just a few examples of how Unfurl can be helpful on Local Storage values. All the parsers from the web version Unfurl are included in the Hindsight plugin, so things like UUIDs, zlib-compressed strings, Twitter Snowflakes, and a whole lot more can be parsed. If this plugin works out well, I&apos;ll evaluate if there are other places in Hindsight that an Unfurl integration would make sense. &#xA0;</p><h2 id="site-characteristics-database">Site Characteristics Database</h2><p>The other new feature is added parsing of the &quot;Site Characteristics Database&quot;. It is a part of Chrome that tracks a few different behaviors on sites, such as if the site changes the favicon or page title in the background. These behaviors aren&apos;t that interesting in and of themselves, but they can provide interesting context. </p><p>Behind the scenes, the &quot;Site Characteristics Database&quot; is stored in a LevelDB as a collection of key/value pairs. The key for each record is the MD5 hash of the origin and the record&apos;s value is a protobuf. Luckily, since Chromium is open source, we can find the <code>.proto</code> file that corresponds to that protobuf, so decoding it is easier:</p><figure class="kg-card kg-code-card"><pre><code>// Copyright 2019 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

syntax = &quot;proto2&quot;;

option optimize_for = LITE_RUNTIME;

// Contains the information that we want to track about a given site feature.
// Next Id: 3
message SiteDataFeatureProto {
  // The cumulative observation time for this feature in seconds, set to 0 once
  // this feature has been observed.
  optional int64 observation_duration = 1;
  // The time at which this feature has been used (set to 0 if it hasn&apos;t been
  // used), in seconds since epoch.
  optional int64 use_timestamp = 2;
}

// Contains decaying average performance measurement estimates.
// Next Id: 4
message SiteDataPerformanceMeasurement {
  // A decaying average of the CPU usage measurements. Units: microseconds.
  optional float avg_cpu_usage_us = 1;
  // A decaying average of the process footprint measurements. Units: kilobytes.
  optional float avg_footprint_kb = 2;
  // A decaying average of the duration from navigation commit to &quot;loaded&quot;.
  // Units: microseconds.
  optional float avg_load_duration_us = 3;
};

// Defines the data that we want to track about a given site.
// Next Id: 7
message SiteDataProto {
  // The last time this site has been in the loaded state, in seconds since
  // epoch.
  optional uint32 last_loaded = 1;

  // List of features that we&apos;re tracking.
  optional SiteDataFeatureProto updates_favicon_in_background = 2;
  optional SiteDataFeatureProto updates_title_in_background = 3;
  optional SiteDataFeatureProto uses_audio_in_background = 4;
  optional SiteDataFeatureProto deprecated_uses_notifications_in_background = 5;

  // Load time performance measurement estimates. This maintains a decaying
  // average of the resource usage of a page until shortly after it becomes
  // idle.
  optional SiteDataPerformanceMeasurement load_time_estimates = 6;
}</code></pre><figcaption><a href="https://source.chromium.org/chromium/chromium/src/+/master:components/performance_manager/persistence/site_data/site_data.proto?ref=dfir.blog">https://source.chromium.org/chromium/chromium/src/+/master:components/performance_manager/persistence/site_data/site_data.proto</a></figcaption></figure><p>To process these records, Hindsight first calculates the MD5 hashes of every origin seen in other artifacts it has already parsed, then compares each Site Characteristic key to them. If a match is found, Hindsight uses that origin in the &quot;URL&quot; field for the record; if not, Hindsight shows something like &quot;MD5 of origin: 99cd2175108d157588c04758296d1cfc&quot;. For the &quot;Value&quot; field, Hindsight parses the <code>site_data</code> protobuf and stores the result (it looks similar to JSON). To order these records by time, Hindsight uses the <code>last_loaded</code> value from the protobuf. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/04/image-1.png" class="kg-image" alt loading="lazy" width="1392" height="362" srcset="https://dfir.blog/content/images/size/w600/2021/04/image-1.png 600w, https://dfir.blog/content/images/size/w1000/2021/04/image-1.png 1000w, https://dfir.blog/content/images/2021/04/image-1.png 1392w" sizes="(min-width: 720px) 720px"><figcaption>Example &quot;Site Characteristic Database&quot; record in Hindsight XLSX output</figcaption></figure><h3 id="deleted-records">Deleted Records</h3><p>Since the datastore is LevelDB, we can recover deleted data from it! For deleted records, we can only get the key (the origin MD5), not the value protobuf, so we lose some information, including any explicit timestamps. However, this recovered data can still be useful. </p><p>One potential use case for this is showing that a user visited a particular site. 
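For reference, the hash-and-match approach described above is easy to reproduce (a sketch; note that exactly which origin serialization Chrome hashes, bare host versus full scheme and host, is worth verifying against your own data):

```python
import hashlib

def origin_md5(origin: str) -> str:
    # Site Characteristics record keys are the MD5 hex digest of the origin
    return hashlib.md5(origin.encode('utf-8')).hexdigest()

def match_keys(known_origins, record_keys):
    """Map record keys back to origins seen in other parsed artifacts."""
    lookup = {origin_md5(o): o for o in known_origins}
    return {key: lookup.get(key) for key in record_keys}
```

A key with no match (a None value here) is exactly the "MD5 of origin: ..." case described above, and can be compared against hashes of any sites of interest to a case.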
Looking through my own browser history, I have over 1200 records where Hindsight couldn&apos;t find the Site Characteristic origin by comparing its key to the rest of my browsing history. This means that these origins don&apos;t appear anywhere else in my Chrome history, yet there is still some (small) indication I visited them in these Site Characteristic records. If you have a site of particular importance to a case, <a href="https://gchq.github.io/CyberChef/?ref=dfir.blog#recipe=MD5()&amp;input=Z2l0aHViLmNvbQ">you could calculate the MD5 of the origin</a> and then search these records for it. Since the timestamp information in deleted records is missing, Hindsight places these records at the beginning of the timeline (at 1970-01-01), but uses a filter to hide them in the Excel output by default to avoid cluttering it.</p><h3 id="future-research">Future Research</h3><p>Things to explore around &quot;Site Characteristic Database&quot; records in the future: </p><ul><li>What effect does clearing different types of browser data have on Site Characteristics Database records? If they persist despite history being cleared, they could be even more useful in showing a particular site was visited.</li><li>The various <code>observation_duration</code> timestamps: they are relative timestamps (count of seconds), but could potentially still be useful.</li><li>More precise meaning of the <code>last_loaded</code> timestamp: In some quick testing, it looks to be updated when the page was closed: page timestamp + page <code>visit_duration</code> ~= <code>last_loaded</code> timestamp. This is interesting, as not all pages have a <code>visit_duration</code> value set, and it could potentially show interesting things about user behavior.</li></ul><h2 id="get-hindsight">Get Hindsight</h2><p>You can get Hindsight, view the code, and see the full change log on <a href="https://github.com/obsidianforensics/hindsight?ref=dfir.blog" rel="noopener">GitHub</a>. 
Both the command line and web UI versions of this release are available as:</p><ul><li>compiled exes attached to the <a href="https://github.com/obsidianforensics/hindsight/releases/latest?ref=dfir.blog" rel="noopener">GitHub release</a> or in the dist/ folder</li><li>.py versions are available by <code>pip install pyhindsight</code> or downloading/cloning the GitHub repo.</li></ul>]]></content:encoded></item><item><title><![CDATA[Keystroke Flow from Chrome Omnibox]]></title><description><![CDATA[I take saved keystrokes from Chrome's Omnibox and graph them in a Sankey flow diagram.]]></description><link>https://dfir.blog/keystroke-flow-from-chrome-omnibox/</link><guid isPermaLink="false">66579e6f04abfd293590d977</guid><category><![CDATA[Visualizations]]></category><category><![CDATA[Chrome]]></category><category><![CDATA[Web Browsers]]></category><category><![CDATA[Open Source Tools]]></category><category><![CDATA[Digital Forensics]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Thu, 18 Feb 2021 13:58:00 GMT</pubDate><media:content url="https://dfir.blog/content/images/2021/02/keystroke-flow-2.gif" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2021/02/keystroke-flow-2.gif" alt="Keystroke Flow from Chrome Omnibox"><p>The &quot;Network Action Predictor&quot; is an SQLite database that&apos;s long been part of Chrome (<a href="https://dfir.blog/chrome-evolution/?ver=17">since Chrome 17</a>) but hasn&apos;t gotten much attention. The (simplified) summary of its function is to help Chrome seem faster to the user by predicting the resources Chrome will need and preloading them. <a href="https://twitter.com/KevinPagano3?ref=dfir.blog">Kevin Pagano</a> wrote a blog post that does a nice job introducing the artifact and covering the basic info about it. 
I won&apos;t cover the same stuff here, so check out his <a href="https://www.stark4n6.com/2021/02/chrome-network-action-predictor.html?ref=dfir.blog">post for an introduction to Chrome&apos;s Network Action Predictor</a>. His post gave me the little kick to dust off and polish (a little) a visualization I had been playing with for this artifact a while ago. </p><p>I&apos;ve been interested in visualizations and applying them to digital forensics for a while now (<a href="https://dfir.blog/tag/visualizations/">some examples on the blog</a>). When I was exploring the Network Action Predictor data the type of chart that came to mind was a <a href="https://en.wikipedia.org/wiki/Sankey_diagram?ref=dfir.blog">Sankey diagram</a>. A Sankey is a type of flow diagram. I think the best way to explain how it works is show an example. I came across this one a few years ago and it has stuck with me as an effective use of the visualization technique:</p><figure class="kg-card kg-embed-card">
    <blockquote class="reddit-card">
      <a href="https://www.reddit.com/r/dataisbeautiful/comments/6a4pb8/how_52_ninthgraders_spell_camouflage_sankey/?ref_source=embed&amp;ref=share">How 52 ninth-graders spell &apos;camouflage&apos;, Sankey diagram [OC]</a> from
      <a href="https://www.reddit.com/r/dataisbeautiful/?ref=dfir.blog">dataisbeautiful</a>
    </blockquote>
    <script async src="https://embed.redditmedia.com/widgets/platform.js" charset="UTF-8"></script>
</figure><p>Each &quot;node&quot; (the colored bars) represents the number of items in that state and the &quot;bands&quot; (or &quot;links&quot;) connect one node to the next. Both the nodes&apos; and bands&apos; sizes are drawn in proportion to their value. It&apos;s easy to see what spelling &quot;paths&quot; were the most common, see where they diverged, and how common each end state was. It&apos;s a ton of interesting information packed in a small area! I find myself tracing different paths, making comparisons, and just generally exploring it: all hallmarks of an effective visualization. </p><p>Below is the &quot;Network Action Predictor&quot; data as Sankey (after a little massaging; read on for the details). There isn&apos;t just one starting node (<em>C </em>in the spelling example above) as there were many different starting letters. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/02/keystroke_flow_network_action_predictor.png" class="kg-image" alt="Keystroke Flow from Chrome Omnibox" loading="lazy"><figcaption>User keystroke &quot;flow&quot; as saved in Network Action Predictor DB</figcaption></figure><p>In the <em>Keystroke Flow</em> Sankey chart you can see a few different things (beyond that I visit Twitter way too much). When I visit <a href="https://dfir.blog/unfurl/">Unfurl</a>, I most often type <em>un</em> and then select the suggestion. I do the same with Twitter; <em>tw</em> then the suggestion. It&apos;s interesting that after seeing this in data, I came to realize I follow this pattern quite often when launching things: whether in the Windows Start Menu, Mac&apos;s Spotlight Search, or the Chrome Omnibox, I hit a shortcut (Windows key, Command+Space, or Ctrl+T, respectively) then the first couple letters of what I&apos;m looking for. </p><p>The chart gets a little more interesting further down with the Github and Hindsight entries. 
I access two different Github repos and two Hindsight-related sites often; these all have some common starting places (<em>g</em> or <em>h</em>) and then diverge. The edges overlap and it&apos;s a bit harder to see (in the screenshot image at least; in the actual graph you can hover, highlight, and move nodes).</p><p>I think there&apos;s a couple things that could be of value in this artifact (or visualization). I find artifacts that show what a user actually typed have value, particularly with regard to user intention. After seeing the chart, it would be hard for me to argue that I only went to Twitter or Unfurl by mistake. Conversely, if you did find a visit to a site of interest in the Network Action Predictor data, found it was only visited once, and could see what the user typed to get there, that might help inform your opinion (for or against) as to if the visit was accidental or not. </p><p>I hadn&apos;t published this before as I couldn&apos;t see a good way to integrate it with existing timeline-centric tools (Hindsight, Plaso, or Timesketch) as there isn&apos;t any timestamp information in it. I&apos;ve put it up in my <a href="https://github.com/obsidianforensics/scripts?ref=dfir.blog">scripts Github repository</a>, kind of a catch-all for one-off scripts. I still consider the visualization to be in the proof-of-concept/prototype phase, but I thought someone might find it interesting or useful. </p><h2 id="how-to-build-the-sankey-diagram">How to Build the Sankey Diagram</h2><p>The data stored in the &quot;Network Action Predictor&quot; isn&apos;t quite in the format needed for a Sankey. The <code>network_action_predictor</code> table has <code>user_text</code> and <code>url</code> columns (among others), but that doesn&apos;t give us the &quot;in-between&quot; states (C &#x2192; Cam &#x2192; Camoflau &#x2192; Camoflauge in the spelling example) and a Sankey without those is much less helpful. 
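To make those &quot;in-between&quot; states concrete, here is a toy sketch (not the script&apos;s actual code) of expanding one row into a chain of one-letter prefix links:

```python
def prefix_chain(user_text: str, url: str):
    """Expand ('you', 'youtube.com') into y -> yo -> you -> youtube.com links."""
    links = [(user_text[:i], user_text[:i + 1]) for i in range(1, len(user_text))]
    links.append((user_text, url))  # final typed text points at the chosen URL
    return links

print(prefix_chain('cam', 'camouflage'))
# [('c', 'ca'), ('ca', 'cam'), ('cam', 'camouflage')]
```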
There are multiple ways to construct the intermediate states (using SQLite&apos;s <code>rowid</code> is one option), but the way I chose to approach it in my script is laid out below. </p><h3 id="filter">Filter</h3><p>First, I wanted to filter out rows that aren&apos;t helpful for the visualization. I removed rows with <code>number_of_hits</code> == 0 (the 0-hit rows are quite numerous and are suggestions that were not correct) and rows where <code>user_text</code> == <code>url</code> (there weren&apos;t any intermediate steps; these are more rare). There are also often a lot of URLs that have only been visited a few times. These can make the graphic &quot;noisy&quot;, so I added the ability to filter out any entries that are below a user-defined &quot;threshold&quot; value (2 by default).</p><h3 id="construct-nodes-links">Construct Nodes &amp; Links</h3><p>Next, I needed to turn the rows of <code>user_text</code>, <code>url</code>, and <code>number_of_hits</code> values into nodes and links. I looped through all the rows and grouped the <code>user_text</code> entries by what URL they point at. This resulted in a dictionary for each URL with keys:values being <code>user_text</code>: <code>number_of_hits</code>. Example:</p><figure class="kg-card kg-code-card"><pre><code class="language-Python">   &apos;https://www.youtube.com/&apos;: {
       &apos;y&apos;: 5.0,
       &apos;yo&apos;: 5.0,
       &apos;you&apos;: 3.0
   }</code></pre><figcaption>Grouping <code>user_text</code>s that point to same URL</figcaption></figure><p>I then need to convert these into &quot;link&quot; form. Doing so on the same YouTube data as the above example yields these links:</p><figure class="kg-card kg-code-card"><pre><code class="language-Python"> &apos;y&apos; -&gt; youtube.com (5) 
 &apos;yo&apos; -&gt; youtube.com (5) 
 &apos;you&apos; -&gt; youtube.com (3)</code></pre><figcaption>This is pretty similar to the raw <code>network_action_predictor</code> rows</figcaption></figure><p>Now there are many <code>user_text</code> entries all pointing to a URL, not to other text items. This would result in a graph that&apos;s only two &quot;levels&quot; deep, not the multi-leveled flow graph desired. I needed to modify the links so that <code>user_text</code> entries that eventually point to the same URL <strong>and </strong>that are subsets point to each other instead, showing the flow (and not lead to overcounting the end result); something like this:</p><figure class="kg-card kg-code-card"><pre><code class="language-Python">  &apos;y&apos; -(5)-&gt; |&apos;yo&apos;| ---(2)-------------------&gt; | youtube.com |
              |&apos;yo&apos;| ---(3)--&gt; |&apos;you&apos;| --(3)--&gt; | youtube.com |</code></pre><figcaption>It&apos;s hard to illustrate in ASCII, but that&apos;s why the final product is a graph ;)</figcaption></figure><blockquote><strong>Important note</strong>: this is assuming something about the data; those 5 hits for <code>y</code> to youtube.com are actually the same 5 hits as for <code>yo</code> to youtube.com. I couldn&apos;t find confirmation that this is the case, but from looking at a bunch of different test data sets I&apos;ve collected I believe it to be true. The alternative is that there actually were 10 hits to youtube.com (5 from <code>y</code> and 5 from <code>yo</code>), not the 5 I&apos;m interpreting it as. </blockquote><p>To transform these nodes and links into the &quot;chained&quot; form I want, I go through each URL&apos;s dictionary and see if any <code>user_text</code> values are the same, but with one letter added at the end. Examples: <strong>y &amp; yo</strong> and <strong>yo &amp; you</strong>. If so, I make a new link (<strong>y &#x2192; yo</strong>) with the &quot;weight&quot; being the overlap (<strong>5</strong>).</p><p>To wrap this part up, I made links for any nodes that didn&apos;t fall into this &quot;subset&quot; pattern, then did a little massaging to save the nodes and links in a JSON file suitable for the graphing library.</p><h3 id="display-the-chart">Display the Chart</h3><p>To build the actual visualization, I used <a href="https://d3js.org/?ref=dfir.blog">d3.js</a> and a <a href="https://github.com/d3/d3-plugins/blob/master/sankey/sankey.js?ref=dfir.blog">Sankey plugin</a>. There are other Sankey options; this one is quite old, but I did start this project a long time ago. You can do incredible things with d3.js, but I am by no means a master with it and this chart is fairly spartan. 
It&apos;s mostly the example code with a few tweaks; most of the work I did was in transforming the Network Action Predictor data into a JSON in the format the library needed.</p><h2 id="run-it-yourself">Run it Yourself</h2><p>In my scripts repository, there is a <a href="https://github.com/obsidianforensics/scripts/tree/master/keystroke-flow?ref=dfir.blog">keystroke-flow directory</a>. Run <code>python3 keystroke-flow.py &quot;/path/to/Network Action Predictor&quot;</code> and it will create a JSON file. If you want to tweak the threshold value mentioned above, pass <code>-t &lt;number&gt;</code> to filter out URLs that have less than <code>&lt;number&gt;</code> incoming links. There&apos;s a <code>keystroke_flow_diagram.html</code> file in that directory that will render the JSON into the <em>Keystroke Flow</em> chart, but you can&apos;t just open it to view the results. If you do, you won&apos;t see the chart, as CORS policy won&apos;t let it load. </p><p>Fortunately, Python can help us out here. Open a command prompt and change directories into <code>keystroke-flow</code>. 
Then run <code>python -m http.server</code>, open <a href="http://localhost:8000/?ref=dfir.blog">http://localhost:8000/</a> in a browser, and click <code>keystroke_flow_diagram.html</code> to view your own <em>Keystroke Flow</em> Sankey!</p>]]></content:encoded></item><item><title><![CDATA[New Hindsight Release: Better LevelDB parsing, New Web UI View, & More!]]></title><description><![CDATA[Latest Hindsight version (2021.01.16) brings exciting new features: improved LevelDB parsing (including deleted!), viewing Hindsight results in the web UI, and more!]]></description><link>https://dfir.blog/hindsight-better-leveldb-and-new-web-ui/</link><guid isPermaLink="false">66579e6f04abfd293590d975</guid><category><![CDATA[Hindsight]]></category><category><![CDATA[Open Source Tools]]></category><category><![CDATA[Python]]></category><category><![CDATA[Web Browsers]]></category><category><![CDATA[Chrome]]></category><dc:creator><![CDATA[Ryan Benson]]></dc:creator><pubDate>Mon, 18 Jan 2021 18:19:40 GMT</pubDate><media:content url="https://dfir.blog/content/images/2021/01/hindsight-2021.01.16-banner.png" medium="image"/><content:encoded><![CDATA[<img src="https://dfir.blog/content/images/2021/01/hindsight-2021.01.16-banner.png" alt="New Hindsight Release: Better LevelDB parsing, New Web UI View, &amp; More!"><p>It&apos;s been a while, but a new Hindsight release is here! This new version (2021.01.16) brings exciting new features: improved LevelDB parsing (including deleted!), viewing Hindsight results in the web UI, and more!</p><h2 id="improved-leveldb-parsing">Improved LevelDB Parsing</h2><p>LevelDB has been used in Chrome for years... and for years I&apos;ve had difficulties parsing it. The Python support for LevelDB hasn&apos;t been great; all the Python packages acted as shims that required LevelDB to already be installed on the system. 
This worked great on Linux systems, as LevelDB was (relatively) easy to install, but proved a challenge on Windows systems.</p><p>Then <a href="https://twitter.com/kviddy?ref=dfir.blog">Alex Caithness</a> from CCL Forensics came out with a couple of fantastic <a href="https://www.cclsolutionsgroup.com/post/hang-on-thats-not-sqlite-chrome-electron-and-leveldb?ref=dfir.blog">blog</a> <a href="https://www.cclsolutionsgroup.com/post/indexeddb-on-chromium?ref=dfir.blog">posts</a> (and code!) exploring Chrome&apos;s IndexedDB. IndexedDB in Chrome is complicated in its own right, but it also uses LevelDB for data storage. In his exploration of IndexedDB, Alex created a <strong>pure Python parser for LevelDB</strong>! This code (which he <a href="https://github.com/cclgroupltd/ccl_chrome_indexeddb?ref=dfir.blog">released as open source</a>) makes reading LevelDB in Python <em>a lot</em> easier. I&apos;ve switched Hindsight over to using <a href="https://github.com/cclgroupltd/ccl_chrome_indexeddb?ref=dfir.blog">ccl_chrome_indexeddb</a> for reading LevelDB and removed the old code and dependencies, which means Hindsight should now parse LevelDB records out of the box on all platforms! </p><p>Right now, FileSystem and LocalStorage records are the only LevelDB-backed artifacts that Hindsight parses, but I&apos;ll be adding more in the coming months. Both record types appear in the &quot;Storage&quot; tab. Thanks to Alex&apos;s code, I was able to add two new columns (<em>Sequence</em> and <em>State</em>), both about LevelDB internals; I&apos;ll expand on them in a later post. The File System records also got a few additional columns, thanks to suggestions from <a href="https://twitter.com/chadtilbury?ref=dfir.blog">Chad Tilbury</a>, that help you see which files still exist on disk and a bit about them (size and type). 
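</p><p>To make the <em>Sequence</em> and <em>State</em> ideas concrete, here is a toy, pure-Python illustration of how LevelDB versioning behaves (conceptual only; this is not Hindsight or CCL code, and the records are made up): every write gets an increasing sequence number, deletes are just tombstone records, and superseded values stick around until compaction.</p>

```python
# Toy records, shaped like what a forensic parser might surface.
records = [
    {"key": "theme", "seq": 1, "state": "live",    "value": "dark"},
    {"key": "theme", "seq": 2, "state": "live",    "value": "light"},
    {"key": "token", "seq": 3, "state": "live",    "value": "abc123"},
    {"key": "token", "seq": 4, "state": "deleted", "value": None},
]

def current_view(recs):
    """What a normal LevelDB reader sees: the highest sequence number
    per key wins, and tombstoned keys disappear."""
    latest = {}
    for r in sorted(recs, key=lambda r: r["seq"]):
        latest[r["key"]] = r
    return {k: r["value"] for k, r in latest.items() if r["state"] == "live"}

def recoverable(recs):
    """What a forensic parser can additionally surface: superseded and
    deleted versions that a live reader never shows."""
    live = current_view(recs)
    return [r for r in recs
            if r["state"] == "deleted" or r["value"] != live.get(r["key"])]

print(current_view(records))      # {'theme': 'light'}
print(len(recoverable(records)))  # 3: old 'dark' value, token value, tombstone
```

<p>This is also why those two columns are useful in practice: sorting by sequence reconstructs the order of writes, and the state flag separates live records from deleted ones.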
</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/01/hindsight_file-system-specific.png" class="kg-image" alt="New Hindsight Release: Better LevelDB parsing, New Web UI View, &amp; More!" loading="lazy"><figcaption>New Backing Database and File System columns in &quot;Storage&quot; tab</figcaption></figure><h3 id="bonus-deleted-records-">Bonus: Deleted Records!</h3><p>One of the things that excited me when I first dug into LevelDB is that the format lends itself to keeping deleted records around for a while. I&apos;ve been using a Go program called <a href="https://github.com/golang/leveldb/tree/master/cmd/ldbdump?ref=dfir.blog">ldbdump</a> to explore deleted records, and you can find a lot of them! Another great thing about switching Hindsight to CCL Forensics&apos; code is that it parses deleted records, so Hindsight now can too! More to come on this in a later post.</p><h2 id="viewing-sqlite-results-in-hindsight-s-web-ui">Viewing SQLite Results in Hindsight&apos;s Web UI</h2><p>Since its beginning, Hindsight has been purely a parsing tool; you had to view the parsed output somewhere else (an XLSX file in Excel, or maybe a JSONL file loaded into Timesketch). Thanks to Ryne Everett, you can now view parsed records in Hindsight too! He&apos;s added the ability to view Hindsight&apos;s SQLite output in the Hindsight web UI. It uses his <a href="https://gitlab.com/ryneeverett/sqlite-view?ref=dfir.blog">sqlite-view</a> project, which is based on <a href="https://github.com/inloop/sqlite-viewer?ref=dfir.blog">sqlite-viewer</a>, to add a SQL-like view and querying interface to Hindsight. </p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://dfir.blog/content/images/2021/01/hindsight-sqlite-view.png" class="kg-image" alt="New Hindsight Release: Better LevelDB parsing, New Web UI View, &amp; More!" 
loading="lazy"><figcaption>Viewing Hindsight&apos;s output in the browser using <code>sqlite-view</code></figcaption></figure><p>After running Hindsight&apos;s web UI and processing some browser history files, there&apos;s a new button (<em>View SQLite DB in Browser</em>). Clicking it brings up a view like the screenshot above. You can select which table to view by clicking the table name at the top, and you can run SQLite queries as if you were in an external SQLite viewer. </p><p>It does require a separate <a href="https://github.com/obsidianforensics/hindsight?ref=dfir.blog#manual-installation">install step</a>, as we didn&apos;t want to bundle all the supporting JavaScript code in the Hindsight repo. If you don&apos;t have the necessary JavaScript code installed, you just won&apos;t be able to use the new functionality (the button will be grayed out); everything else in Hindsight should continue to work as normal. I&apos;ve included these supporting files in the compiled EXE version, so this feature is enabled there out of the box.</p><h2 id="parsing-media-history-artifacts">Parsing &quot;Media History&quot; Artifacts</h2><p>Chrome added a new &quot;Media History&quot; database in version 86, and this version of Hindsight adds support for parsing it. See this <a href="https://dfir.blog/media-history-database-added-to-chrome/">blog post</a> for more info on this new artifact.</p><h2 id="update-minimum-python-version-to-3-8">Update Minimum Python Version to 3.8</h2><p>The switch to using the CCL Forensics LevelDB parsing code necessitated moving Hindsight from Python 3.7 to 3.8. I hope this isn&apos;t too big an issue for anyone, as 3.7 has moved to security fixes only, and 3.8 (and 3.9) bring performance improvements as well. </p><h2 id="get-hindsight">Get Hindsight</h2><p>You can get Hindsight, view the code, and see the full change log on <a href="https://github.com/obsidianforensics/hindsight?ref=dfir.blog">GitHub</a>. 
Both the command line and web UI versions of this release are available as:</p><ul><li>compiled EXEs, attached to the <a href="https://github.com/obsidianforensics/hindsight/releases/latest?ref=dfir.blog">GitHub release</a> or in the dist/ folder</li><li>Python scripts, via <code>pip install pyhindsight</code> or by downloading/cloning the GitHub repo</li></ul><p><em>NOTE: Windows Defender has been flagging the EXEs as malware, presumably because they were packaged with PyInstaller</em>. The Python script versions are not being flagged. If you&apos;d like to build the EXEs from the Python code yourself, all I did was run <code>pyinstaller --distpath .\dist .\spec\hindsight.spec</code> from the root of the repo.</p>]]></content:encoded></item></channel></rss>