When coding a webpage, sometimes you know something is very
likely to be needed, even if it’s not needed yet. You can give the
browser a hint:
<link rel=prefetch href=url>
The browser will take a note, and then when it doesn’t have anything more
important to do it might request url. Later on, if it does turn out to need
url, it will already have it.
For example, I wrote a slideshow where each slide was essentially:
Prefetching the image for slide N+1 when viewing slide N made each
transition practically instant.
This works for images, but also works for CSS, JS, HTML, anything!
Or, at least, it used to.
The browser stores URLs it fetches in a cache. At its
simplest this looks like a big dictionary, from
url to the contents of that url:
    a.test/js    javascript1
    b.test/js    javascript2
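A toy sketch of that dictionary, in Python. The `SimpleCache` class and the `fake_network` function are made up for illustration, and a real browser cache is far more involved (expiry, validation, size limits); this only shows the url-to-contents lookup.

```python
class SimpleCache:
    """Toy model of an unpartitioned cache: url -> contents."""

    def __init__(self, fetch_from_network):
        self._fetch = fetch_from_network  # hypothetical stand-in for the network
        self._entries = {}                # url -> contents
        self.network_requests = 0

    def load(self, url):
        # Only go to the network when the url is not already cached.
        if url not in self._entries:
            self.network_requests += 1
            self._entries[url] = self._fetch(url)
        return self._entries[url]


def fake_network(url):
    # Made-up responses matching the table above.
    return {"a.test/js": "javascript1", "b.test/js": "javascript2"}[url]


cache = SimpleCache(fake_network)
cache.load("a.test/js")
cache.load("a.test/js")  # second load is served from the cache
```

After the two loads, `cache.network_requests` is 1: the second request never leaves the cache, no matter which site asked for it.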
Unfortunately, attackers can abuse this to learn about your browsing on
other sites, and all the major browsers (Safari, Chrome, Firefox) have now
partitioned their caches. This means if you are on a.test and load
a.test/js, that JS will not be reused if you go to b.test and load
a.test/js again. The dictionary’s keys look like (site, url):
    a.test:a.test/js    javascript1
    b.test:a.test/js    javascript1
    b.test:b.test/js    javascript2
Even if the keys a.test:a.test/js and
b.test:a.test/js both have exactly the same JS bytes,
they need to be kept separate to avoid a privacy leak.
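The partitioned version can be sketched the same way, with the key widened to (site, url). As before, `PartitionedCache` and `fake_network` are hypothetical names for illustration, and "site" here is simplified to the hostname of the page the user is on.

```python
class PartitionedCache:
    """Toy model of a partitioned cache: (site, url) -> contents."""

    def __init__(self, fetch_from_network):
        self._fetch = fetch_from_network  # hypothetical stand-in for the network
        self._entries = {}                # (site, url) -> contents

    def load(self, top_level_site, url):
        """Return (contents, was_cached) for a load made while on top_level_site."""
        key = (top_level_site, url)
        was_cached = key in self._entries
        if not was_cached:
            self._entries[key] = self._fetch(url)
        return self._entries[key], was_cached


def fake_network(url):
    # Made-up responses matching the table above.
    return {"a.test/js": "javascript1", "b.test/js": "javascript2"}[url]


cache = PartitionedCache(fake_network)
cache.load("a.test", "a.test/js")                         # cached under a.test
contents, was_cached = cache.load("b.test", "a.test/js")  # different partition
```

Here `was_cached` is `False` even though the same bytes are already stored under a.test, so a page on b.test cannot use cache timing to tell whether you have visited a.test.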
So now imagine you are a modern browser visiting a.test
and you encounter:
<link rel=prefetch href=b.test/index.html>
Where should you store it in your cache? Well, it depends what the
user is going to do. If they are going to click on a link to
b.test/index.html, then when they need the HTML they will
be visiting
b.test and so you want to store it as
b.test:b.test/index.html. On the other hand, if it’s
going to load in an iframe, the user will still be on
a.test and so you want to store it as
a.test:b.test/index.html. You just don’t know. Just
guess?
The guess is a risky one: if you store it under the wrong key then
you’ll have to fetch the same resource again just to store it under the
right key. Users will see double fetching.
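To make the cost of a wrong guess concrete, here is a small Python model that counts network fetches for one prefetch followed by one real use. The function names and the "navigation"/"iframe" labels are assumptions for this sketch, not anything from a browser or the spec, and sites are simplified to hostnames.

```python
def site_of(url):
    # Simplification: treat everything before the first "/" as the site.
    return url.split("/", 1)[0]


def fetches_needed(current_site, url, guess, actual):
    """Count network fetches for a prefetch plus the later real load.

    guess and actual are each "navigation" (the user goes to the url's
    own site) or "iframe" (the url loads inside the current page).
    """
    def top_level_site(use):
        return site_of(url) if use == "navigation" else current_site

    cache = set()
    fetches = 0

    # The prefetch is stored under the key the browser guessed.
    cache.add((top_level_site(guess), url))
    fetches += 1

    # The real load looks under the key for what actually happened.
    key = (top_level_site(actual), url)
    if key not in cache:
        cache.add(key)
        fetches += 1
    return fetches


right = fetches_needed("a.test", "b.test/index.html", "navigation", "navigation")
wrong = fetches_needed("a.test", "b.test/index.html", "navigation", "iframe")
```

With these inputs `right` is 1 and `wrong` is 2: the prefetch only pays off when the guessed partition matches the one the real load uses. For a same-site prefetch, both guesses produce the same key, so the problem only shows up across sites.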
It turns out browsers guess differently here. I made test pages
(iframe, new page) and while Firefox guesses you’ll load it in an
iframe, Safari (with the experimental LinkPrefetch setting
enabled) and Chrome guess you’ll load it in a new page.
Except, I think this implies more of a decision than there probably
actually was. I doubt anyone explicitly considered the probability
that a prefetched resource would be used by an iframe. Instead, my
guess is that when updating an enormous amount of code to add cache keys,
multiple developers just ended up coding different things.
Why Prefetch Is Broken
I’ve filed a spec issue (#6723) proposing:
Here’s hoping browsers are interested in fixing this, and stopping those double fetches.
(Disclosure: I work for Google, but not on Chrome. Speaking only for myself.)