Thursday, March 19, 2015

How do I turn on Tracking Protection? Let me count the ways.


I get this question a lot from various people, so it deserves its own post. Here's how to turn on Tracking Protection in Firefox to avoid connecting to known tracking domains from Disconnect's blocklist:
  1. Visit about:config and turn on privacy.trackingprotection.enabled. Because this works in Firefox 35 or later, this is my favorite method. In Firefox 37 and later, it also works on Fennec.
  2. On Fennec Nightly, visit Settings > Privacy and select the checkbox "Tracking Protection".
  3. Install Lightbeam and toggle the "Tracking Protection" button in the top-right corner. Check out the difference in visiting only 2 sites with Tracking Protection on and off!
  4. On Firefox Nightly, visit about:config and turn on browser.polaris.enabled. This will enable privacy.trackingprotection.enabled and also show the checkbox for it in about:preferences#privacy, similar to the Fennec screenshot above. Because this only works in Nightly and also requires visiting about:config, it's my least favorite option.
  5. Do any of the above and sign into Firefox Sync. Tracking Protection will be enabled on all of your desktop profiles!
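
If you would rather script option 1 than click through about:config, the same pref can be set from a user.js file in the profile directory, which Firefox reads at startup. Here is a minimal Python sketch. The helper name and the profile path you pass in are assumptions for illustration; only the pref name comes from this post.

```python
from pathlib import Path

# The pref from option 1, in the syntax user.js expects.
PREF_LINE = 'user_pref("privacy.trackingprotection.enabled", true);'


def enable_tracking_protection(profile_dir):
    """Append the tracking protection pref to user.js in the given profile.

    profile_dir is the path to a Firefox profile directory (hypothetical
    here; about:support shows yours). Returns True if the line was added
    and False if it was already present.
    """
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if PREF_LINE in existing:
        return False
    # Keep whatever is already in user.js and add our line at the end.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    user_js.write_text(existing + separator + PREF_LINE + "\n", encoding="utf-8")
    return True
```

Restart Firefox after running it so the new pref is picked up.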

Wednesday, March 18, 2015

Tracking Protection talk on Air Mozilla

In August 2014, Georgios Kontaxis and I gave a talk on the implementation status of tracking protection in Firefox. At the time the talk was Mozillians only, but now it is public! Please visit Air Mozilla to view the talk, or see the slides below. The implementation status has not changed very much since last August, so most of the information is still pretty accurate.

Monday, November 10, 2014

Tracking Protection in Firefox


On Monday a project that I've been working on was officially announced as part of a larger privacy initiative called Polaris. In case you missed it, there is an experimental tracking protection feature in Firefox Nightly that allows people to avoid being tracked by not communicating with known tracking domains, especially those that do not respect DNT. Our initial blocklist is from Disconnect. As a side effect, blocking resources from tracking domains speeds up page load times on average by 20%. Privacy features rarely coincide with performance benefits, so that's exciting.
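
To make "not communicating with known tracking domains" concrete, here is a rough sketch of a domain blocklist check in Python. It is not how Firefox implements the feature; the blocklist entries and function names are made up for illustration, and the real list from Disconnect is far larger.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entries for illustration only.
BLOCKED_DOMAINS = {"tracker.example", "ads.example"}


def is_blocked(url, blocked_domains=BLOCKED_DOMAINS):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    labels = host.split(".")
    # Check the host itself and each parent domain, e.g. for
    # cdn.tracker.example: cdn.tracker.example, tracker.example, example.
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


def should_load(resource_url):
    """Decide whether to fetch a resource; blocked ones are never requested."""
    return not is_blocked(resource_url)
```

The point of checking before the request is made is that a blocked tracker never sees the connection at all, which is also where the page load savings come from.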

Currently, tracking protection is available by turning on browser.polaris.enabled in about:config. If you care about privacy in Firefox and are running Nightly, please give it a try. Requiring about:config changes is quite onerous, but we need your feedback to improve tracking protection. You can read official instructions on how to turn on tracking protection or see the animated gif below (original slide deck here for people who like to advance manually).

Many thanks to everyone who helped get this landed, especially my awesome intern, Georgios Kontaxis, and the team at Disconnect for open sourcing their blocklist.

Wednesday, September 10, 2014

Making decisions with limited data

It is challenging but possible to make decisions with limited data. For example, take the rollout saga of public key pinning.

The first implementation of public key pinning included enforcing pinning on addons.mozilla.org. In retrospect, this was a bad decision because it broke the Addons Panel and generated pinning warnings 86% of the time. As it turns out, the pinset was missing some Verisign certificates used by services.addons.mozilla.org, and the pinning enforcement on addons.mozilla.org included subdomains. Having more data lets us avoid bad decisions.

To enable safer rollouts, we implemented a test mode for pinning. In test mode, pinning violations are counted but not enforced. With sufficient telemetry, it is possible to measure how badly sites would break without actually breaking the site.
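
As an illustration of the idea (not the actual Firefox code), a test mode check might look like the Python sketch below. The class and field names are assumptions; the only behavior taken from this post is that test mode counts violations without enforcing them.

```python
from dataclasses import dataclass


@dataclass
class PinEntry:
    allowed_keys: frozenset  # fingerprints of acceptable public keys
    test_mode: bool = True   # count violations but let the connection proceed


@dataclass
class PinTelemetry:
    checks: int = 0
    violations: int = 0


def check_pins(entry, chain_keys, telemetry):
    """Return True if the connection may proceed.

    chain_keys holds the key fingerprints from the verified certificate chain.
    """
    telemetry.checks += 1
    if entry.allowed_keys & set(chain_keys):
        return True
    telemetry.violations += 1
    # In test mode the violation is only recorded; in production it is fatal.
    return entry.test_mode
```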

Due to privacy restrictions in telemetry, we do not collect per-organization pinning violations except for Mozilla sites that are operationally critical to Firefox. This means that it is not possible to distinguish pinning violations for Google domains from Twitter domains, for example. I do not believe that collecting the aggregated number of pinning violations for sites on the Alexa top 10 list constitutes a privacy violation, but I look forward to the day when technologies such as RAPPOR make it easier to collect actionable data in a privacy-preserving way.

Fortunately for us, Chrome has already implemented pinning on many high-traffic sites. This is fantastic news, because it means we can import Chrome’s pin list in test mode with relatively high assurance that the pin list won’t break Firefox, since it is already in production in Chrome.

Given sufficient test mode telemetry, we can decide whether to enforce pins instead of just counting violations. If the pinning violation rate is sufficiently low, it is probably safe to promote the pinned domain from test mode to production mode. The screenshot below shows a 3-week period in which we promoted cdn.mozilla.com, media.mozilla.com, and Google domains to production, and expanded coverage on Twitter to include all subdomains.
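
The promotion decision can be sketched as a simple threshold check. The thresholds below are placeholders I picked for the example, not the values we used; "sufficiently low" is a judgment call that depends on the domain.

```python
def violation_rate(violations, checks):
    """Fraction of pin checks that failed, or None if nothing was measured."""
    if checks == 0:
        return None
    return violations / checks


def ready_for_production(violations, checks, max_rate=0.001, min_checks=10_000):
    """Decide whether a pinned domain can move from test mode to production.

    max_rate and min_checks are hypothetical thresholds for illustration.
    """
    rate = violation_rate(violations, checks)
    if rate is None or checks < min_checks:
        # Too little telemetry to trust the rate either way.
        return False
    return rate <= max_rate
```

Requiring a minimum number of checks keeps a domain with almost no traffic from being promoted on the strength of a handful of clean connections.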



Because the current implementation of pinning in Firefox relies on built-in static pinsets and we are unable to count violations per pinset, it is important to track changes to the pinset file in the dashboard. Fortunately, HighStock supports event markers, which somewhat alleviates this problem, and David Keeler also contributed some tooltip code to roughly associate dates with Mercurial revisions. Armed with the timeseries of pinning violation rates and event markers for the dates when we promoted organizations to production mode (or when high-traffic organizations like Dropbox were added in test mode due to a new import from Chromium), we can see whether pinning is working or not.

Telemetry is useful for forensics, but in our case, it is not useful for catching problems as they occur. This limitation is due to several difficulties, which I hope will be overcome by more generalized, comprehensive SSL error-reporting and HPKP:
  • Because pinsets are static and built-in, there is sometimes a 24-hour lag between making a change to a pinset and reaching the next Nightly build.
  • Telemetry information is only sent back once per day, so we are looking at a 2-day delay between making a change and receiving any data back at all.
  • Telemetry dashboards (as accessible from telemetry.js and telemetry.mozilla.org) need about a day to aggregate, which adds another day.
  • Update uptake rates are slow. The median time to update Nightly is around 3 days, and getting to 80% takes 10 days or longer.
Due to these latency issues, pinning violation rates take at least a week to stabilize. Thankfully, telemetry is on by default in all pre-release channels as of Firefox 31, which gives us a lot more confidence that the pinning violation rates are representative.

Despite all the caveats and limitations, using these simple tools we were able to successfully roll out pinning to pretty much all sites that we’ve attempted (including AMO, our unlucky canary) as of Firefox 34, and we look forward to expanding coverage.

Thanks for reading, and don’t forget to update your Nightly if you love Mozilla! :)

Tuesday, August 26, 2014

Firefox 32 supports Public Key Pinning

Public Key Pinning helps ensure that people are connecting to the sites they intend. Pinning allows site operators to specify which certificate authorities (CAs) issue valid certificates for them, rather than accepting any one of the hundreds of built-in root certificates that ship with Firefox. If any certificate in the verified certificate chain corresponds to one of the known good certificates, Firefox displays the lock icon as normal.

Pinning helps protect users from man-in-the-middle attacks and rogue certificate authorities. When the root cert for a pinned site does not match one of the known good CAs, Firefox will reject the connection with a pinning error. This type of error can also occur if a CA mis-issues a certificate.
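
For the curious, the core check can be sketched in a few lines of Python: fingerprint each public key in the verified chain and accept the connection if any fingerprint is in the site's pinset. This is an illustration only. It assumes the keys arrive as DER-encoded SubjectPublicKeyInfo bytes and that pins are base64 SHA-256 fingerprints; the function names are made up.

```python
import base64
import hashlib


def spki_fingerprint(spki_der):
    """Base64 SHA-256 fingerprint of a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def chain_satisfies_pins(chain_spkis, pinset):
    """Return True if any key in the verified chain matches a pinned fingerprint.

    chain_spkis: iterable of DER bytes, one per certificate in the chain.
    pinset: set of base64 fingerprints considered known good for the site.
    """
    return any(spki_fingerprint(spki) in pinset for spki in chain_spkis)
```

Note that a single match anywhere in the chain is enough, which is why pinning to a CA's key keeps working when a site renews its own certificate with the same CA.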

Pinning errors can be transient. For example, if a person is signing into WiFi, they may see an error like the one below when visiting a pinned site. The error should disappear if the person reloads after the WiFi access is set up.



Firefox 32 and above supports built-in pins, which means that the list of acceptable certificate authorities must be set at time of build for each pinned domain. Pinning is enforced by default. Sites may advertise their support for pinning with the Public Key Pinning Extension for HTTP, which we hope to implement soon. Pinned domains include addons.mozilla.org and Twitter in Firefox 32, and Google domains in Firefox 33, with more domains to come. That means that Firefox users can visit Mozilla, Twitter and Google domains more safely. For the full list of pinned domains and rollout status, please see the Public Key Pinning wiki.

Thanks to Camilo Viecco for the initial implementation and David Keeler for many reviews!

Wednesday, July 23, 2014

Download files more safely with Firefox 31


Did you know that the estimated cost of malware is hundreds of billions of dollars per year? Even without data loss or identity theft, the time and annoyance spent dealing with infected machines is a significant cost.

Firefox 31 offers improved malware detection. Firefox has integrated Google’s Safe Browsing API for detecting phishing and malware sites since Firefox 2. In 2012 Google expanded their malware detection to include downloaded files and made it available to other browsers. I am happy to report that improved malware detection has landed in Firefox 31, and will have expanded coverage in Firefox 32.

In preliminary testing, this feature cuts the amount of undetected malware by half. That’s a significant user benefit.

What happens when you download malware? Firefox checks URLs associated with the download against a local Safe Browsing blocklist. If the binary is signed, Firefox checks the verified signature against a local allowlist of known good publishers. If no match is found, Firefox 32 and later queries the Safe Browsing service with download metadata (NB: this happens only on Windows, because signature verification APIs to suppress remote lookups are only available on Windows). In case malware is detected, the Download Manager will block access to the downloaded file and remove it from disk, displaying an error in the Downloads Panel below.
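
The order of checks described above can be sketched as follows. This is a simplification in Python rather than the real Download Manager code; the function, parameter, and enum names are assumptions, and remote_lookup stands in for the Safe Browsing query (pass None on platforms where no remote lookup happens).

```python
from enum import Enum


class Verdict(Enum):
    BLOCK = "block"
    ALLOW = "allow"


def check_download(urls, signer, url_blocklist, signer_allowlist, remote_lookup=None):
    """Walk the checks in order: local blocklist, local allowlist, remote lookup.

    urls: URLs associated with the download.
    signer: verified publisher of a signed binary, or None if unsigned.
    remote_lookup: hypothetical callable returning True if the service
        reports malware for this download's metadata.
    """
    # 1. Any associated URL on the local blocklist blocks the download.
    if any(url in url_blocklist for url in urls):
        return Verdict.BLOCK
    # 2. A verified signature from a known good publisher skips the remote query.
    if signer is not None and signer in signer_allowlist:
        return Verdict.ALLOW
    # 3. No local match: ask the remote service, where that is supported.
    if remote_lookup is not None and remote_lookup(urls, signer):
        return Verdict.BLOCK
    return Verdict.ALLOW
```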


How can I turn this feature off? This feature respects the existing Safe Browsing preference for malware detection, so if you’ve already turned that off, there’s nothing further to do. Below is a screenshot of the new, beautiful in-content preferences (Preferences > Security) with all Safe Browsing integration turned off. I strongly recommend against turning off malware detection, but if you decide to do so, keep in mind that phishing detection also relies on Safe Browsing.
Many thanks to Gian-Carlo Pascutto and Paolo Amadini for reviews, and the Google Safe Browsing team for helping keep Firefox users safe and secure!

Wednesday, October 23, 2013

Cookie counting

Understanding cookies through user studies

A cookie is a small piece of data stored in the browser by websites. Although cookies are mostly invisible, they serve many purposes such as saving items in shopping carts, authenticating to websites, and displaying targeted ads or other personalized content. Understanding more about how websites use cookies allows us to write tools that manage cookies effectively.

In June 2013, the Mozilla User Research team ran a paid study of 573 Firefox users that included data on cookie and browsing events. The user population was census-balanced and included only US users. The study ran for a median of 18.8 days, during which time we observed 12.4 million attempts to set cookies by examining HTTP Set-Cookie headers. Each Set-Cookie header counts as a single event, even though it may contain multiple cookies. Storing multiple pieces of information across separate cookies or combining them into a single cookie are equally powerful. Set-Cookie headers are not the only method for setting cookies, but they are sufficiently prevalent to be representative. We did not observe read events due to volume constraints. We observed 2.84 million pages loaded, measured by counting tab-ready events.

N = 573   Tab-ready events   Set-Cookie events   Tab-ready events/day
Median    3552               12297               189
Total     2842270            12439439

Counting origins

Throughout this post we use top-level domains (from the Public Suffix list) plus one component to count origins. For example, we consider foo.example.com and bar.example.com to represent the same origin. The public suffix mechanism is not perfect, because a single organization may own many origins (e.g., doubleclick.net and google.com both belong to Google). In total, study users visited 40682 unique origins (counted by tab-ready events) and received set-cookie events from 32786 unique origins. Below is the distribution of cookie events per tab event.
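
Here is a small Python sketch of the "public suffix plus one component" rule. The suffix set is a tiny stand-in I made up for the example; the real Public Suffix list is much larger and also has wildcard and exception rules that this sketch ignores.

```python
# A tiny stand-in for the Public Suffix list, for illustration only.
PUBLIC_SUFFIXES = {"com", "net", "org", "co.uk"}


def origin_for(hostname, suffixes=PUBLIC_SUFFIXES):
    """Return the public suffix plus one component, or None if there is none."""
    labels = hostname.lower().rstrip(".").split(".")
    # Scanning from the left finds the longest matching suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the hostname is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known suffix, e.g. an intranet name
```

With this rule, foo.example.com and bar.example.com both map to example.com, so they count as one origin.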

Who uses cookies?

Building effective cookie management tools requires understanding who sets cookies. Cookie activity is difficult to characterize because sites vary highly in both the number of cookies they set and the amount of third-party content (which may set cookies on behalf of the third-party site) that they include. Although each page load event incurs on average around 3.6 Set-Cookie events, many sites incur an order of magnitude more.

The graph below shows the 20 origins responsible for the most set-cookie events. These origins represent 0.05% of unique cookie-setting origins and are responsible for 42.7% of set-cookie events seen in the study data. Set-cookie attempts are either first-party, where the origin of the cookie being set is the same as the one in the location bar, or third-party, where the origins don't match.
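
The first-party versus third-party split is just an origin comparison. A minimal Python sketch, assuming both origins have already been reduced to public suffix plus one component (the function names and event shape are made up for the example):

```python
from collections import Counter


def classify(cookie_origin, location_bar_origin):
    """Label a set-cookie attempt by comparing it with the location bar origin."""
    return "first-party" if cookie_origin == location_bar_origin else "third-party"


def party_shares(events):
    """Return the share of first-party and third-party set-cookie events.

    events: iterable of (cookie_origin, location_bar_origin) pairs.
    """
    counts = Counter(classify(cookie, bar) for cookie, bar in events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total for label in ("first-party", "third-party")}
```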

Who uses third-party cookies?

Third-party cookies have many purposes. For example, social widget implementations usually rely on third-party cookies to display personalized content, and inline ads rely on third-party cookies to provide targeted ads and perform frequency capping. Of the 12.4 million set-cookie events in the study, 50.4% are for third-party cookies (shown in red in the graph above).

The graph below shows the top 20 origins setting third-party cookies, responsible for 41.1% of third-party set-cookie events. adnxs.com belongs to AppNexus, an ad exchange. Facebook sets mostly first-party cookies, but because Facebook's social widgets are included on many sites, Facebook sets many third-party cookies (which may have originally been created in a first-party context). Of the top 20 origins, 18 primarily offer advertising services.

It is interesting to compare this data to Table IV from Eubank et al.'s survey on third-party cookies. In the Eubank survey, the authors used simulated data from crawling Alexa's top 500 websites, included all types of third-party embedded data, and did not canonicalize domains using the public suffix list. Even though the methodology is different, many origins in the top 20 overlap.

How many third-party cookies are from origins the user knows?

One interesting question is whether or not users intentionally accept cookies, especially in the case of third-party cookies. We examine two possible heuristics for estimating whether a user interaction with an origin is intentional:
  1. The user has already accepted cookies from the origin (pre-existing cookie condition)
  2. The user has visited the origin by entering it into the location bar (simulated history condition)
Both of these conditions rely on previous interactions. Any potential changes to the way browsers handle third-party cookies must consider what to do with previous interactions (in this case, existing cookies and location bar history).

Pre-existing cookie condition

We did not ask study participants to clear cookies before beginning the study. Of the third-party set-cookie events, 90.8% of them were sent to users who had already accepted cookies from that origin. The graph below shows this percentage for the top 20 origins that set third-party cookies. In this graph, nearly all origins are above 75% with the exception of doubleclick.net. This dip can be explained by a handful of users who have a particular security addon installed.

Simulated history condition

Another heuristic for evaluating if a user has interacted with a site is whether that origin has appeared in the location bar, as measured by tab-ready events. This lets us count third-party origins that have previously appeared in a first-party context.

For each user, we take the entire set of origins extracted from tab-ready events to simulate that user’s history, then count whether the origins in the Set-Cookie events appear in the simulated history. The graph below shows this percentage for the top 20 origins of third-party cookies.
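
In code, the per-user calculation looks roughly like this Python sketch. The function name and input shapes are assumptions for illustration, not the analysis scripts we ran.

```python
def simulated_history_share(tab_ready_origins, third_party_cookie_origins):
    """Share of one user's third-party set-cookie events from visited origins.

    tab_ready_origins: origins from all of the user's tab-ready events,
        which together form the simulated history.
    third_party_cookie_origins: one origin per third-party Set-Cookie event.
    Returns None if the user saw no third-party set-cookie events.
    """
    history = set(tab_ready_origins)
    events = list(third_party_cookie_origins)
    if not events:
        return None
    in_history = sum(1 for origin in events if origin in history)
    return in_history / len(events)
```

Because the history is built from the entire study period, an origin counts as visited even if the visit happened after the cookie event.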


Overall, 19.6% of third-party cookie events came from origins in users’ simulated history in the course of the study. Not surprisingly, nearly all users had visited facebook.com and youtube.com, which are currently ranked 2nd and 3rd most visited sites according to Alexa. Interestingly, adnxs.com also appeared much of the time in simulated histories, even though the rank of adnxs.com is currently 576 in the US according to Alexa. From looking through tab-ready events, adnxs.com appeared in redirects and popups.

How long do cookies live?

The Set-Cookie HTTP header has an optional expiration time that tells the browser how long to keep the cookie. From the graph below, many cookies are long-lived, possibly longer-lived than the installation of the operating system or browser. 20% of third-party cookie expiration times were one week or less, and 51% of third-party cookie expiration times were longer than 6 months.
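
As a sketch of how lifetimes can be extracted, the Python below reads the Max-Age and Expires attributes from a Set-Cookie value and sorts the result into the buckets used above. It is an illustration under a few assumptions: the function names are made up, "6 months" is approximated as 183 days, and now must be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def cookie_lifetime(set_cookie_value, now):
    """Return the cookie's lifetime as a timedelta, or None for a session cookie."""
    attrs = {}
    for part in set_cookie_value.split(";")[1:]:  # skip the name=value pair
        name, _, value = part.strip().partition("=")
        attrs[name.lower()] = value.strip()
    # Max-Age takes precedence over Expires when both are present.
    if "max-age" in attrs:
        try:
            return timedelta(seconds=int(attrs["max-age"]))
        except ValueError:
            pass  # malformed Max-Age: fall through to Expires
    if "expires" in attrs:
        try:
            expires = parsedate_to_datetime(attrs["expires"])
        except (TypeError, ValueError):
            return None  # unparseable date: treat as a session cookie
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        return expires - now
    return None


def lifetime_bucket(lifetime):
    """Sort a lifetime into the buckets discussed in this post."""
    if lifetime is None:
        return "session"
    if lifetime <= timedelta(weeks=1):
        return "one week or less"
    if lifetime > timedelta(days=183):  # rough stand-in for 6 months
        return "longer than 6 months"
    return "in between"
```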

What's next?

Data from real users is crucial to understanding how websites use cookies and therefore what kind of technical solutions to cookie management make sense (or if indeed we should be concentrating on cookies at all). We hope that this is just the start of using data to shape our technologies and policies. Please join dev-privacy to continue the discussion.

Many thanks to Gregg Lind for deploying the study and to Jonathan Mayer, Alex Fowler, John Jensen, and Chris Karlof for reviewing this post.